Key things to consider for a serverless data-driven solution or data pipeline


During my discussions with customers on data-driven solutions, I often try to squeeze a serverless component into the solution. Partly because there is always a gap that can be bridged swiftly with a serverless implementation, and partly because it is simply awesome when the use case is right. Combined with a microservices-style implementation, it is next level.

Below are my suggestions on a few things (in no specific order) to consider when planning a serverless implementation of any data-driven solution:

-       Data/Payload size limit at each layer of data pipeline

You must define a payload size constraint for each layer while designing a serverless platform; otherwise it is just a matter of time before your data pipeline starts to break or leak.
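A size constraint is easiest to enforce when every stage checks it before handing data on. Below is a minimal sketch of such a guard. The stage names, the limits in `STAGE_LIMITS`, and the `check_payload` helper are all hypothetical and only for illustration; real limits depend on the services in your pipeline.

```python
import json

# Hypothetical per-stage limits in bytes (assumed values, not vendor limits).
STAGE_LIMITS = {
    "ingest": 256 * 1024,
    "transform": 1024 * 1024,
}


class PayloadTooLarge(ValueError):
    """Raised when a payload exceeds the limit agreed for a stage."""


def check_payload(stage, payload):
    """Serialize the payload and reject it if it exceeds the stage limit.

    Returns the encoded bytes so the caller can forward them as-is.
    """
    body = json.dumps(payload).encode("utf-8")
    limit = STAGE_LIMITS[stage]
    if len(body) > limit:
        raise PayloadTooLarge(
            f"{stage}: {len(body)} bytes exceeds limit of {limit} bytes"
        )
    return body
```

Rejecting an oversized payload at the boundary of a stage makes the failure visible where it happens, instead of letting it surface further down the pipeline.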

-       Stateful or stateless

Stateful and serverless just don’t gel well. REST/Stateless/event driven is the way to go.
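One way to picture the stateless style: the handler keeps nothing between invocations, and anything that must survive lives in an external store. The sketch below assumes a hypothetical `handle_event` function, an event with a `device_id` field, and a `store` object with dict-like access standing in for whatever key-value service you use.

```python
def handle_event(event, store):
    """Stateless handler: all state is read from and written back to `store`.

    `store` is an assumption for illustration; any object with dict-like
    get/set access works here.
    """
    key = event["device_id"]
    count = store.get(key, 0) + 1  # read the previous state from outside
    store[key] = count             # write the new state back outside
    return {"device_id": key, "events_seen": count}


# Usage: a plain dict plays the role of the external store.
store = {}
first = handle_event({"device_id": "d1"}, store)
second = handle_event({"device_id": "d1"}, store)
```

Because the function holds no state of its own, any instance can serve any event, which is what lets the platform scale it up and down freely.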

-       Code/Executable size

Cloud vendors already put size constraints on serverless services (such as AWS Lambda and Azure Functions). If the deployment units are Docker images/containers, the same applies: the smaller the size, the shorter the warm-up time. It is always a good idea to break the code into multiple microservices if it is huge or not optimized.

-       Data pipeline length, or how many stages your solution has

I learnt this from experience. In simple terms, three to five components across the pipeline, including the database layer, is typically optimal. If the solution you are planning for serverless has more than that, it is time to restructure it or, preferably, split it into parallel pipelines.

-       Is it event driven

Your solution must be event/trigger/schedule driven. Serverless is on demand and stateless and hence it must be triggered.
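As a rough illustration of the trigger-driven shape, the sketch below routes each incoming event to a handler based on what triggered it. The event types, the `type` and `key` fields, and the handler names are hypothetical; real platforms deliver events in their own formats.

```python
def on_upload(event):
    """Handle a file-arrival trigger (hypothetical event shape)."""
    return f"processing file {event['key']}"


def on_schedule(event):
    """Handle a timer trigger (hypothetical event shape)."""
    return "running scheduled batch"


# Map each trigger type to the function that should run for it.
HANDLERS = {
    "upload": on_upload,
    "schedule": on_schedule,
}


def dispatch(event):
    """Entry point: nothing runs until an event arrives."""
    handler = HANDLERS.get(event.get("type"))
    if handler is None:
        raise ValueError(f"no handler for event type {event.get('type')!r}")
    return handler(event)
```

Every code path starts from an event, so if part of your solution has no natural trigger or schedule, that is a sign it may not fit serverless as it stands.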

-       SQL/NoSQL

Though both work well, the solutions I design from scratch are typically NoSQL-driven. The advantage of NoSQL is on-demand scalability, which goes well with serverless. You can have a hybrid model if your data source or destination is SQL-based.

-       Data sources and data delivery/extraction: continuous stream, batch, or poll

Don't overlook data sources while designing a serverless solution; they play a crucial role. The data extraction/delivery strategy is equally important.

-       Compliance/security, cloud or on-prem

Last but not least, compliance and security. Serverless is tricky when it comes to security.

If I missed anything, don't hesitate to add it in the comments. Thanks for reading.

