Why Complex Event Processing is … well … Complex!


In some of my work at e-commerce companies, business stakeholders have asked for real-time capabilities that look at a stream of incoming data and compare it to a slice of historical data covering a certain duration, perhaps the past 5 minutes, 30 minutes, or even a day. They then want to compare the result to other real-time information to ensure it’s relevant, and flag it as either an opportunity or a risk.

A simple example of Complex Event Processing

Let me give you some examples that are relatable and will highlight the challenge.


On a particular payment platform, identify whether an incoming payment request matches prior traffic patterns for that particular user or institution. For example, I typically pay my bills online on Saturday morning. If a payment is initiated from my account between 2 AM and 5 AM, it should be flagged as suspicious. Not because of the time of day itself, but because I don’t pay bills at that time.

What’s needed to fulfill this? The system needs to work out what my payment pattern is. Perhaps over the past 10 payments, in what portion of the week or day do I initiate payments? This is easy enough, you might say, until you consider that it needs to be constantly recalculated so that it only looks at the past 10 payments. Each time I initiate a payment, the oldest one (now the 11th most recent) falls off and the system needs to determine the new set of 10. This needs to be done for everyone, every time they make a payment.

In this seemingly straightforward example, every time a payment is made, the system needs to compare that one payment to a pre-computed payment history, or compute the payment-history histogram on the fly. Either way, all the data needs to be available with a very quick response time.
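To make the "past 10 payments" idea concrete, here is a minimal Python sketch of a per-user rolling histogram. It is only an illustration under my own assumptions: the names `PaymentProfile` and `check_payment`, the choice to bucket payments by weekday and quarter of the day, and the rule "flag if the bucket never appears in the window" are all hypothetical, not how any particular payment platform does it.

```python
from collections import Counter, defaultdict, deque
from datetime import datetime

WINDOW = 10  # assumed: number of most recent payments kept per user


class PaymentProfile:
    """Rolling histogram over a user's last WINDOW payments."""

    def __init__(self, window=WINDOW):
        self.window = window
        self.recent = deque()       # buckets of the most recent payments, oldest first
        self.histogram = Counter()  # bucket -> count within the window

    @staticmethod
    def bucket(ts):
        # Assumed bucketing: (weekday, quarter of the day), e.g. (5, 1) = Saturday 06:00-11:59
        return (ts.weekday(), ts.hour // 6)

    def is_unusual(self, ts):
        # Only judge once the window is full; flag buckets never seen in the window
        return len(self.recent) == self.window and self.histogram[self.bucket(ts)] == 0

    def record(self, ts):
        b = self.bucket(ts)
        self.recent.append(b)
        self.histogram[b] += 1
        if len(self.recent) > self.window:
            # The 11th most recent payment falls off; update the histogram incrementally
            old = self.recent.popleft()
            self.histogram[old] -= 1
            if self.histogram[old] == 0:
                del self.histogram[old]


def check_payment(profiles, user_id, ts):
    """Compare one payment to the user's history, then fold it into the history."""
    profile = profiles[user_id]
    flagged = profile.is_unusual(ts)
    profile.record(ts)
    return flagged


profiles = defaultdict(PaymentProfile)  # one profile per user
```

The point of the sketch is that the histogram is updated incrementally (one bucket added, one removed) rather than recomputed from scratch on every payment, which is what keeps the lookup quick.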

Easy enough.  Here’s another, more complex example that I personally experienced:

A more complicated example of Complex Event Processing


For customers using the mobile application, in real time:

  1. Look at the lat-long coordinates and identify if they are close to a store we have an interest in.
  2. If they are close, send them a message if:

- We haven’t sent them something for that store brand in the past 24 hours AND

- We haven’t sent them something in the past 5 minutes for any store brand

  3. Oh, and also, if this all checks out, check to see if we have inventory for that store

This is indeed very complex.  Why?

For every lat-long coordinate, identify whether the values have changed from the previous coordinates. If you are at home, they are probably not changing. But if you are strolling along 5th Avenue in NYC, driving, etc., each new coordinate has to be compared to the past value (and the stored value updated if it has changed).

Identify whether the lat-long coordinates are associated with a store, say Starbucks, for example.  Again, do this in real time. OK, you say, easy enough.
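The two location checks described so far (has the customer moved, and are they near a store) can be sketched roughly as follows. This is an illustration, not the system I worked on: the function names, the 25 m movement threshold, the 150 m radius, and the store records are all made-up assumptions, and the distance uses the standard haversine great-circle formula.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat-long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def has_moved(prev, cur, min_move_m=25.0):
    """True if there is no previous fix or the customer moved at least min_move_m."""
    if prev is None:
        return True
    return haversine_m(prev[0], prev[1], cur[0], cur[1]) >= min_move_m


def nearby_stores(lat, lon, stores, radius_m=150.0):
    """Return the stores within radius_m of the given point (simple linear scan)."""
    return [s for s in stores if haversine_m(lat, lon, s["lat"], s["lon"]) <= radius_m]


# Hypothetical store records, for illustration only
STORES = [
    {"brand": "CoffeeCo", "lat": 40.7580, "lon": -73.9855},
    {"brand": "BookBarn", "lat": 40.7700, "lon": -73.9855},
]
```

A linear scan over every store is fine for a sketch, but with many stores and many customers you would likely want some kind of spatial index so each coordinate is only compared against stores in its neighborhood.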

Knowing whether you have sent them something for that store in the past day means keeping the last time you sent them something for that store brand.  This information is obviously time sensitive, and for each brand you'll want to store the date and time you reached out to the customer about that brand.

You’ll also want to know when you last sent them any message; only if it was more than 5 minutes ago are they open to being sent another one.
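The two messaging rules (24 hours per store brand, 5 minutes across all brands) boil down to keeping two "last sent" timestamps per customer. Here is a minimal in-memory sketch; the class name `MessageThrottle`, its methods, and the plain dictionaries are assumptions for illustration, and a real system would keep this state in some fast shared store.

```python
from datetime import datetime, timedelta

BRAND_COOLDOWN = timedelta(hours=24)    # at most one message per brand per day
GLOBAL_COOLDOWN = timedelta(minutes=5)  # at most one message of any kind per 5 minutes


class MessageThrottle:
    def __init__(self):
        self.last_by_brand = {}  # (customer_id, brand) -> time of last message for that brand
        self.last_any = {}       # customer_id -> time of last message for any brand

    def can_send(self, customer_id, brand, now):
        """True only if both the per-brand and the any-brand cooldowns have elapsed."""
        last_brand = self.last_by_brand.get((customer_id, brand))
        if last_brand is not None and now - last_brand < BRAND_COOLDOWN:
            return False
        last_any = self.last_any.get(customer_id)
        if last_any is not None and now - last_any < GLOBAL_COOLDOWN:
            return False
        return True

    def record_send(self, customer_id, brand, now):
        """Remember that a message for this brand went out at time `now`."""
        self.last_by_brand[(customer_id, brand)] = now
        self.last_any[customer_id] = now
```

In use, you would call `can_send` once the customer is near a store, and call `record_send` only after the message actually goes out.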

Finally, you’ll need to check inventory in your own system.

And it all needs to be done for every customer who is being tracked like this.  This is heavy-duty processing!

Talking more abstractly

Taking a step back and looking at it more abstractly, these examples highlight the challenges of performing Complex Event Processing:

  1. Looking at information in real time.
  2. Comparing this real-time information to other data across multiple data streams.  Some of that data needs to be pre-computed so that inquiries against it will be very quick.
  3. Each data stream will have its own flow of data, retention requirements, and pre-compute complexities to ensure the proper slice of time is looked at.
  4. Lastly, building this to scale for any number of events that come in.  Not for 1 person, or 10; this needs to scale to an unknown number of events.
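As a small illustration of point 3, here is a sketch of a time-based window that keeps only the "proper slice of time" for one stream and maintains a pre-computed running total as events arrive and expire. The `TimeWindow` class is hypothetical, and it assumes events arrive in timestamp order, which real streams often do not guarantee.

```python
from collections import deque
from datetime import datetime, timedelta


class TimeWindow:
    """Keeps events from the last `retention` period plus a running total of their values."""

    def __init__(self, retention):
        self.retention = retention
        self.events = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0       # pre-computed aggregate over the window

    def add(self, ts, value):
        # Assumes ts is not earlier than any timestamp already in the window
        self.events.append((ts, value))
        self.total += value
        self._evict(ts)

    def _evict(self, now):
        # Drop everything that has aged out of the retention period
        cutoff = now - self.retention
        while self.events and self.events[0][0] <= cutoff:
            _, old_value = self.events.popleft()
            self.total -= old_value

    def summary(self, now):
        """Return (event count, total value) for the window ending at `now`."""
        self._evict(now)
        return len(self.events), self.total
```

Each stream would get its own window with its own retention, for example `TimeWindow(timedelta(minutes=5))` for one and `TimeWindow(timedelta(days=1))` for another, which is where the per-stream storage and compute cost starts to add up.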

It’s complex, and it often requires real engineering and thought to create the data structures, and finally to architect the computing power so that, cost-effectively, you get only what you need.

However, it’s also a really fun challenge.


More articles by Jay Hakim
