Parallel computing challenge
Writing a program is a very easy thing to do. Writing software that can run in parallel is not so easy. In this post I want to simplify the way of thinking about parallel computing.
The Chef case
Let's take a real-life example: a chef's restaurant with three Michelin stars.
Every evening the restaurant hosts about 30 guest tables.
At the beginning of the evening, the master chef gets the orders for all tables.
The master divides the work among his assistants by working position.
Each assistant works in his own space, regardless of the other working positions. Every space holds all the ingredients needed to make the specific order; during 'making time' the working space needs nothing from outside. And most important: one space never interferes with another.
Once the dishes are ready at the outgoing serving lane, the person responsible for the lane decides when to send each dish out and to which table.
Here, without any more work, we have a parallel computing system that follows most of the principles. And no sous-chef is allowed to lock a colleague.
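As a rough sketch (the orders, dish data, and station count here are illustrative, not from the post), the kitchen maps to a small worker pool: each station works from self-contained orders, touches no shared state while cooking, and puts finished dishes on a shared serving lane:

```python
import queue
import threading

orders = queue.Queue()        # orders handed out by the master chef
serving_lane = queue.Queue()  # the outgoing lane for finished dishes

def cook():
    while True:
        order = orders.get()
        if order is None:  # sentinel: the evening is over
            break
        # Private workspace: everything needed arrives with the order,
        # so stations never interfere with each other.
        serving_lane.put(f"{order['dish']} for table {order['table']}")

for table in range(1, 7):
    orders.put({"table": table, "dish": "tasting menu"})

stations = [threading.Thread(target=cook) for _ in range(3)]
for s in stations:
    s.start()
for _ in stations:
    orders.put(None)  # one sentinel per station
for s in stations:
    s.join()

dishes = [serving_lane.get() for _ in range(serving_lane.qsize())]
```

The stations never coordinate with each other directly; the only shared points are the two queues, which play the role of the order sheet and the serving lane.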
Four Big Concurrency Challenges
When we deal with concurrent programming, we face a few non-functional challenges:
- Shared state - typical applications like to share data across the process. When we do that, we run into these problems:
- Difficult to maintain
- Very difficult to parallelize!
- Locking is fundamentally flawed:
- Must guess where parallelism will be needed
- All consumers need to participate
- Performance issues
- Eventually, deadlock arises
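The lost-update problem behind these bullets can be shown deterministically. This sketch hand-schedules the interleaving of two "threads" performing a read-modify-write on one shared balance (no real threads, so the failure is reproducible every run):

```python
# Shared mutable state: one account balance visible to both writers.
balance = 100

def read():
    return balance

def write(value):
    global balance
    balance = value

# Two concurrent withdrawals of 30 each, with an unlucky interleaving:
a = read()     # thread A reads 100
b = read()     # thread B reads 100 (before A writes back)
write(a - 30)  # A writes 70
write(b - 30)  # B writes 70 -- A's withdrawal is silently lost
# balance is now 70, although the correct result is 40.
```

With real threads the same interleaving happens nondeterministically, which is exactly why shared state is so hard to test and maintain.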
- Inversion of Control
- We’re used to writing code linearly
- Parallel computing requires decoupling Begin from End
- It is very difficult to:
- Combine multiple asynchronous operations
- Deal with exceptions and cancellation
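One way to decouple Begin from End, combine several asynchronous operations, and keep exceptions manageable is an async/await sketch. The `fetch` function and its delays are made up for illustration:

```python
import asyncio

async def fetch(name, delay, fail=False):
    # Stand-in for any asynchronous operation (web call, DB query, ...).
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} failed")
    return f"{name}: ok"

async def main():
    # Begin all three operations at once; handle each End later.
    return await asyncio.gather(
        fetch("quotes", 0.01),
        fetch("history", 0.02, fail=True),
        fetch("news", 0.01),
        return_exceptions=True,  # exceptions arrive as values, not crashes
    )

results = asyncio.run(main())
```

The code still reads almost linearly, but the Begin (starting the operations) and the End (consuming results and exceptions) are cleanly separated.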
- I/O Parallelism
- Software is often I/O-bound
- Leveraging web services
- Working with data on disk
- Working with DB calls
- Network and disk speeds are increasing more slowly than compute
- I/O resources are inherently parallel
- A huge opportunity for performance
- Overlapping requests to these resources is easy to achieve
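Because I/O resources are inherently parallel, even a small thread pool can overlap the waiting time. A sketch, where a 50 ms sleep stands in for a web-service, disk, or DB call:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_io(n):
    time.sleep(0.05)  # stand-in for a web-service, disk, or DB call
    return n * 2

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(slow_io, range(4)))
elapsed = time.perf_counter() - start
# The four 50 ms calls overlap, so the total stays far below the
# 200 ms a sequential loop would take.
```

The speedup comes purely from overlapping waits, not from extra CPU cores, which is why I/O-bound software is such a rich target for parallelism.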
- Scaling to multi-machine
- To scale out, we must go beyond a single machine
- Multi-machine resources are becoming common
- Roll-your-own clusters with cheap hardware
- On-demand cloud compute
- But - Shared memory doesn’t scale
Principles through a real-world sample
The principles are basically very simple:
- The core principle: every component is a standalone service
- A standalone service gets all the request data as a single input
- The request data includes a unique ID and a data version
- All reference data is managed by the service
- The response data includes the request ID
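The principles above can be sketched as a message envelope. The class and field names are assumptions for illustration; the point is that the request carries a unique ID and a data version, and the response echoes the request ID back:

```python
import uuid
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Request:
    data: dict            # everything the service needs, in one input
    data_version: int     # version of the reference data the caller saw
    request_id: str = field(default_factory=lambda: str(uuid.uuid4()))

@dataclass(frozen=True)
class Response:
    request_id: str       # echoes the originating request
    result: dict

def handle(req: Request) -> Response:
    # A standalone service: no hidden shared state is consulted here.
    return Response(request_id=req.request_id, result={"echo": req.data})

req = Request(data={"symbol": "ACME"}, data_version=3)
resp = handle(req)
```

Because the response carries the request ID, any downstream component can correlate answers to requests without shared state.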
After defining our principles for building a parallel computing system, let's look at the real-world system we want to implement with a parallel approach.
We need to build a diagnostic service that analyzes the feasibility of a stock investment.
The following diagram illustrates the high-level design (HLD) for this service.
Now the nice part is to see how every component works as a standalone service.
TellMeIfIsGoodStockSI
This service layer can be implemented in several forms, such as SOAP, REST, and more. This component can execute the StockAnalyzesWF in the same process, on the same machine, or on an external machine.
StockAnalyzesWF
This component manages the business process. It produces the object that contains the stock data, including any additional data such as customer information, priority, etc.
Like every component in our system, this component manages all its reference data as part of the component. E.g., customer data is kept in the form best suited to this component; any component that updates customer data raises an update event to the 'system network', and all relevant consumers receive the information and update their business logic accordingly.
After the WF builds all the data needed for the calculation (including the original request ID), the next step is to allocate the work by distributing messages to the 'system network'.
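The work-allocation step can be sketched as a tiny in-process 'system network': topic-based publish/subscribe, where one publish fans the request out to every subscribed analyzer. The topic names and message shape here are made up:

```python
from collections import defaultdict

# Minimal in-process 'system network': topic -> list of handlers.
subscribers = defaultdict(list)

def subscribe(topic, handler):
    subscribers[topic].append(handler)

def publish(topic, message):
    for handler in subscribers[topic]:
        handler(message)

received = []
subscribe("analyze-request", lambda m: received.append(("history", m)))
subscribe("analyze-request", lambda m: received.append(("crowd", m)))

# The WF fans the work out with one publish; every analyzer gets a copy.
publish("analyze-request", {"request_id": "r-1", "symbol": "ACME"})
```

The WF never names its consumers, so new analyzers can subscribe later without any change to the publisher.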
AnalyzesByCrowdsWisdom , AnalyzesByHistory , AnalyzesBySingleStock
These components perform the work independently. They are not aware of other instances or versions of the same component, nor of other algorithm implementations.
After the calculation, each instance distributes its result (including the original request ID) to the next step by sending the data to the 'system network'.
IntelligentAnswerDispatcher
This component is the join point; in the parallel world this is an important component. All the parallel work is composed at this stage into one consolidated answer.
The response algorithm can be based on different strategies, as the business requires: the first result only, up to a certain number of calculation results, the first negative result, the result from a specific algorithm, and more.
The result of this stage is sent to close the loop - in our case back to StockAnalyzesWF, but it could be sent to another target.
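The consolidation strategies can be sketched as plain functions over the collected results. The result shape, scores, and strategy names below are illustrative assumptions:

```python
# Results collected from the parallel analyzers for one request.
results = [
    {"request_id": "r-1", "algorithm": "history", "score": 0.7},
    {"request_id": "r-1", "algorithm": "crowd",   "score": -0.2},
    {"request_id": "r-1", "algorithm": "single",  "score": 0.4},
]

def first_only(results):
    # Answer as soon as the first result arrives.
    return results[0]

def first_negative(results):
    # Veto strategy: the first negative verdict decides.
    return next((r for r in results if r["score"] < 0), None)

def up_to_n(results, n):
    # Consolidate at most n calculation results.
    return results[:n]

def from_algorithm(results, name):
    # Only trust a specific algorithm's answer.
    return [r for r in results if r["algorithm"] == name]
```

Because every result carries the original request ID, the dispatcher can group answers per request and swap strategies without touching the analyzers.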
So what do we get from this method?
To summarize this post, I want to shine a spotlight on the important points that come out of leveraging event-driven architecture (EDA) + service-oriented architecture (SOA) + self-organized components:
- Scalability can be implemented simply: just add instances of the needed component at the 'pressure point'
- Scalability can be achieved on a variety of hardware: a single server, a data center, or even across the global network
- The system can work simultaneously with different versions of the same algorithm
- System monitoring can be achieved in a very easy and fast way
- A new algorithm is transparent to the rest of the system
Great post Lior. A few things to keep in mind that are often overlooked: 1) When dealing with a WF based on async event/message processing, one must take into consideration that an expected event/message may arrive after a long period of time, or even never arrive; therefore these "timeout" scenarios must be evaluated, and the correct business action for these cases must be determined and handled. 2) If you care about scalability, then don't design the system in a way that mandates ordered processing of messages for the system to be correct. Messages will be received out of order... deal with it. 3) The same goes for duplicate messages... design your services to be idempotent. 4) As for reference data, one will benefit from having a clear strategy for how the system handles data augmentation/composition across discrete components of the system (easier said than done).
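Point 3 in the comment, idempotent services, can be sketched like this: processing the same message twice must have the same effect as processing it once. The in-memory dedup set is an assumption for illustration; a real service would persist it:

```python
# Idempotent consumer sketch: deduplicate by request ID before applying
# the side effect, so duplicate deliveries are harmless.
processed_ids = set()
account_balance = {"value": 0}

def handle_deposit(message):
    if message["request_id"] in processed_ids:
        return False  # already processed: ignore the duplicate
    processed_ids.add(message["request_id"])
    account_balance["value"] += message["amount"]
    return True

msg = {"request_id": "r-1", "amount": 50}
handle_deposit(msg)
handle_deposit(msg)  # duplicate delivery: no double deposit
```

The same request ID that the post uses for correlating answers also serves as the natural deduplication key.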