Parallel computing challenge

Writing a program is a very easy thing to do. Writing software that can run in parallel is not so easy. In this post I want to simplify the way of thinking about parallel computing.

The Chef case

Let's take a real-life example: a chef's restaurant with three Michelin stars.

Every evening the restaurant hosts about 30 guest tables.

At the beginning of the evening, the master chef gets the orders for all tables.

The master divides the work among his assistants by working position.

Each position works in its own space, independent of the other working positions. Every single space has all the ingredients needed to make its specific orders; during preparation, a working space needs nothing from outside. And most important, one space never interferes with another.

Once the dishes are ready on the outgoing serving lane, the person responsible for the lane decides when each dish goes out and to which table.

Here, without any more work, we have a parallel computing system with most of the principles in place. No sous-chef ever locks or blocks a colleague.
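The chef analogy maps directly onto code. Here is a minimal sketch (names and data are my own, not from the post): each "working position" receives everything it needs in its input, touches no outside state, and puts its finished dish on the serving lane.

```python
from concurrent.futures import ThreadPoolExecutor

def prepare_dish(order):
    # each "working position" gets all its "ingredients" in the order;
    # it reads no outside state and writes to none
    return {"table": order["table"], "dish": order["dish"].upper()}

orders = [{"table": 1, "dish": "soup"},
          {"table": 2, "dish": "steak"},
          {"table": 3, "dish": "salad"}]

# three independent working positions, cooking in parallel
with ThreadPoolExecutor(max_workers=3) as pool:
    serving_lane = list(pool.map(prepare_dish, orders))
```

Because no position shares mutable state with another, there is nothing to lock – the same property the restaurant gets for free.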

Four Big Concurrency Challenges

When we deal with concurrent programming, we face a few non-functional challenges:

  • Shared State – common applications like to share data across the process. When we do that, we run into these problems:
    • Difficult to maintain
    • Very difficult to parallelize!
    • Locking is a fundamental error:
      • Must guess where parallelism will be needed
      • All consumers need to participate
      • Performance issue
    • In the end, deadlocks arise
  • Inversion of Control
    • We’re used to writing code linearly
    • Parallel computing requires decoupling Begin from End
    • Very difficult to
      • Combine multiple asynchronous operations
      • Deal with exceptions and cancellation
  • I/O Parallelism
    • Software is often I/O-bound
      • Leveraging web services
      • Working with data on disk
      • Working with DB calls
    • Network and disk speeds are improving more slowly than compute
    • I/O resources are inherently parallel
      • A huge opportunity for performance
      • They can be exploited with little contention or locking
  • Scaling to multi-machine
    • To scale out, we must go beyond a single machine
      • Multi-machine resources becoming common
      • Roll-your-own clusters with cheap hardware
      • On-demand cloud compute
    • But shared memory doesn't scale beyond a single machine
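The shared-state problem above has a simple alternative: pass messages instead of sharing memory. This is a sketch of my own (not from the post) showing workers that never touch a shared counter; each sends its result as a message, and only the consumer owns the accumulated state, so no locks are needed.

```python
import queue
import threading

# instead of four threads locking and mutating one shared counter,
# each thread sends its result as a message on a queue
results = queue.Queue()

def worker(n):
    results.put(n * n)  # no shared mutable state, hence no locks

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# the consuming side is the sole owner of the accumulated state
total = 0
while not results.empty():
    total += results.get()
```

The same idea scales beyond one machine by replacing the in-process queue with a message bus, which shared memory cannot do.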

Principles through a real-world example

The principles are basically very simple.

  • The core principle – every component is a standalone service
  • Standalone service – gets all the request data as a single input
  • Request data – includes a unique ID and a data version
  • All reference data – managed by the service itself
  • Response data – includes the request ID
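The principles above can be sketched as message shapes. This is an illustrative sketch of mine (the class and field names are assumptions, not from the post): a request carries a unique ID and a data version, and the response echoes the request ID back.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class Request:
    data: dict
    version: int  # data version, per the principles above
    request_id: str = field(default_factory=lambda: uuid.uuid4().hex)

@dataclass
class Response:
    request_id: str  # always echoes the originating request's ID
    result: dict

def handle(req: Request) -> Response:
    # a standalone service: everything it needs arrives in the request
    return Response(request_id=req.request_id, result={"echo": req.data})
```

The echoed request ID is what later lets a dispatcher correlate parallel answers back to the original request.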

Now that we have defined our principles for building a parallel computing system, let's look at a real-world system that we want to implement with a parallel approach.
We need to build a diagnostic service that analyzes the feasibility of a stock investment.

The following diagram illustrates the high-level design (HLD) for this service.

Now the nice part is to see how every component works as a standalone service.

TellMeIfIsGoodStockSI

This service layer can be implemented in several forms, such as SOAP, REST, and more. It can execute the StockAnalyzesWF in the same process, on the same machine, or on an external machine.
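A minimal sketch of that facade, assuming names of my own invention: the service interface only translates the transport call into a workflow request and does not care where the workflow runs.

```python
import uuid

def tell_me_if_is_good_stock_si(symbol, execute_workflow):
    # the facade's only job: build the request and hand it off
    request = {"request_id": uuid.uuid4().hex, "symbol": symbol}
    # execute_workflow may be an in-process call, an RPC stub, or a
    # message send to another machine; the facade does not care
    return execute_workflow(request)
```

Swapping SOAP for REST (or local for remote execution) changes only how `execute_workflow` is wired, not this layer's logic.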

StockAnalyzesWF

This component manages the business process. It produces the object that contains the data for the stock, including any additional data such as customer information, priority, etc.
Like all components in our system, it manages all of its reference data as part of the component. E.g., customer data is kept in the form best suited to this component; any component that updates customer data raises an update event on the 'system network', and all relevant consumers receive the information and update their business logic accordingly.
After the WF builds all the data needed for the calculation (including the original request ID), the next step is to distribute the work by sending messages to the 'system network'.
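The fan-out step can be sketched like this. It is an illustration under my own assumptions: an in-memory list stands in for the 'system network' (a real system would use a message bus), and one message per analyzer carries the original request ID.

```python
import uuid

# the three analyzer components from the design
ANALYZERS = ("AnalyzesByCrowdsWisdom",
             "AnalyzesByHistory",
             "AnalyzesBySingleStock")

def distribute_work(symbol, customer, network):
    request_id = uuid.uuid4().hex
    payload = {"request_id": request_id,   # travels with every message
               "symbol": symbol,
               "customer": customer}
    for analyzer in ANALYZERS:
        network.append((analyzer, dict(payload)))  # one message each
    return request_id

bus = []
rid = distribute_work("ACME", {"priority": "high"}, bus)
```

Each analyzer gets a complete, self-contained message, so none of them ever needs to call back into the workflow for missing data.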

AnalyzesByCrowdsWisdom , AnalyzesByHistory , AnalyzesBySingleStock

These components perform the work independently. They need not be aware of other instances or versions of the same component, nor of the other algorithm implementations.
After the calculation, each instance distributes its result (including the original request ID) to the next step by sending the data to the 'system network'.
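One analyzer might look like the sketch below, again under my own assumptions (plain dicts as messages, a list as the 'system network', and a placeholder score instead of a real algorithm).

```python
def analyzes_by_history(message, network):
    # placeholder for the real historical analysis of message["symbol"]
    score = 0.5
    network.append({
        "request_id": message["request_id"],  # original ID travels along
        "algorithm": "history",
        "score": score,
    })
```

Because the analyzer only reads its input message and emits a new one, any number of instances or alternative algorithms can run side by side without coordination.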

IntelligentAnswerDispatcher

This component is the conjunction point; in the parallel world it is an important component. All the parallel work is composed at this stage into one consolidated answer.
The response algorithm can be based on different strategies, as the business requires:
such as the first result only, up to a certain number of calculation results, the first negative result, the result from a specific algorithm, and more.
The result of this stage is sent to close the loop – in our case back to StockAnalyzesWF, but it could be sent to another target.
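A sketch of the conjunction step, assuming the "up to a certain number of calculation results" strategy (the class shape and averaging verdict are my own illustration): results are grouped by request ID, and once enough arrive they are consolidated into one answer.

```python
from collections import defaultdict

class IntelligentAnswerDispatcher:
    def __init__(self, needed):
        self.needed = needed            # how many results to wait for
        self.pending = defaultdict(list)  # request_id -> results so far

    def on_result(self, result):
        bucket = self.pending[result["request_id"]]
        bucket.append(result)
        if len(bucket) < self.needed:
            return None  # keep waiting for more parallel results
        scores = [r["score"] for r in bucket]
        return {"request_id": result["request_id"],
                "verdict": "good" if sum(scores) / len(scores) > 0 else "bad"}
```

Swapping the strategy (first only, first negative, specific algorithm) means changing only the condition and consolidation inside `on_result`; the rest of the system is untouched.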

So what do we get from this method?

To summarize this post, I want to shine a light on the important outcomes of combining event-driven architecture (EDA), service-oriented architecture (SOA), and self-organized components:

  • Scalability can be implemented in a simple manner – just add instances of the needed component at the pressure point
  • Scaling can be done on a variety of hardware: a single server, a data center, or even across the global network
  • The system can work simultaneously with different versions of the same algorithm
  • System monitoring can be achieved in a very easy and fast way
  • A new algorithm is transparent to the rest of the system

Great post Lior. A few things to keep in mind that are often overlooked:

1) When dealing with a WF based on async event/message processing, one must take into consideration that an expected event/message may arrive after a long period of time, or even never arrive. Therefore these "timeout" scenarios must be evaluated, and the correct business action for each case must be determined and handled.

2) If you care about scalability, then don't design the system in a way that mandates ordered processing of messages for the system to be correct. Messages will be received out of order... deal with it.

3) The same goes for duplicate messages: design your services to be idempotent.

4) As for reference data, one will benefit from having a clear strategy for how the system handles data augmentation/composition across the discrete components of the system (easier said than done).
