Building a Large-Scale Distributed Technology Platform?

●     Should you buy off-the-shelf products?

●     Should you write your own? If so, what architecture and what technology stack should you use?

Today, multiple solutions are available in the market (e.g. Sharetribe, Arcadier, Demandware) that offer end-to-end capabilities for building e-commerce platforms: catalogue, inventory, cart, payments and so on. This has a clear advantage: your site will be up and running in a short time with a relatively small number of engineers.

But with ever-changing business requirements, customizations pile up over time and become increasingly difficult to accommodate. Moreover, in the present Indian e-commerce/marketplace scenario, logistics, seller management, shipping, payment gateways, etc. are changing rapidly. All this dynamism brings new or changed workflows on a regular basis. Above all, it can be very challenging to maintain availability and performance on a lesser-known platform.

My take: readily available platforms let you go to market quickly, but over time the limitations above will reduce your agility. In the end, you need to choose according to your business need; as the saying goes, one size doesn’t fit all.

Architecture

Let us look at how to build it and what the typical components should be.

Consider the needs of a marketplace: user interface, pricing, inventory management, order management, catalogue management, payments, shipping, warehouse management, reviews, ratings, search, recommendations, fraud and risk, and the list goes on. If a single system is built to handle all of today’s and tomorrow’s needs, it will be a nightmare to maintain and very difficult to scale and evolve with changing needs and technology.

To start with, you can think of this platform as consisting of a user interface layer, a middle data-processing layer and a backend database layer. Why? These layers do different work, so they need to be handled differently. The UI layer has many connections to serve; the middle layer handles high concurrency, processing and business logic; the backend layer stores and retrieves data in a defined manner. This way, you can handle each layer’s requirements separately and independently. This 3-tier architecture allows each layer to be tuned on its own and, in turn, scaled horizontally and quickly.

On top of this, microservices should be used to modularize, distribute and decentralize the components within each layer. These isolated subsystems talk to each other through well-defined APIs (e.g. REST), which each one exposes as a service to the others. Isolation here means isolation at the code, deployment and storage level. This modularization gives you development and release agility, lets you scale individual components quickly, and makes it faster to identify the root cause of a problem and fix it.
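To make the idea of isolation concrete, here is a minimal in-process sketch. The service names, the `reserve` method and the status strings are hypothetical, chosen only for illustration; in a real deployment the call would go over the network (for example via REST) rather than being a direct method call. The point is that the order service depends only on a declared contract and never touches the inventory service's storage.

```python
from dataclasses import dataclass, field
from typing import Protocol


class InventoryAPI(Protocol):
    """Contract the order service depends on (hypothetical)."""

    def reserve(self, sku: str, qty: int) -> bool: ...


@dataclass
class InventoryService:
    # Private storage: no other service reads or writes this dict directly.
    _stock: dict[str, int] = field(default_factory=dict)

    def add_stock(self, sku: str, qty: int) -> None:
        self._stock[sku] = self._stock.get(sku, 0) + qty

    def reserve(self, sku: str, qty: int) -> bool:
        """Reserve stock if enough is available; report success or failure."""
        if self._stock.get(sku, 0) < qty:
            return False
        self._stock[sku] -= qty
        return True


@dataclass
class OrderService:
    # Knows only the API contract, not the implementation behind it.
    inventory: InventoryAPI
    _orders: list[tuple[str, int]] = field(default_factory=list)

    def place_order(self, sku: str, qty: int) -> str:
        if not self.inventory.reserve(sku, qty):
            return "REJECTED"
        self._orders.append((sku, qty))
        return "CONFIRMED"
```

Because `OrderService` sees only `InventoryAPI`, the inventory implementation can be redeployed, rewritten or moved to a different data store without changes on the order side, as long as the contract holds.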

Caching and Storage

Marketplace sites are content rich: a large number of images, different kinds of widgets, myriad product categories, etc. A lot of data crunching happens for millions of geographically distributed users in a short span of time. Good problem to solve? What should the solution be? One option is multi-layered, distributed caching. The important factor to consider is that much of the data served is static (images, etc.) and even the dynamic data does not change rapidly. The first level is the browser itself, where data can live until it becomes stale or expires.

The next level of caching is a CDN (content delivery network, the likes of CloudFront or Akamai) for images, JS, etc. CDNs can partner with ISPs and place their servers next to the ISP’s edge routers. This avoids multiple network hops and hence serves consumers faster.

The next level is reverse proxies, for dynamic page content that doesn’t change in real time. This reduces the load on the web servers.
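The browser, CDN and reverse-proxy levels are all commonly steered by the HTTP `Cache-Control` response header: `max-age` applies to any cache, while `s-maxage` overrides it for shared caches such as CDNs and reverse proxies. The sketch below shows one way an application might pick that header. The helper name, the suffix list and the durations are assumptions for illustration, not recommended values.

```python
from urllib.parse import urlsplit

# Assumed values for illustration; tune per asset type and release process.
STATIC_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif", ".webp", ".css", ".js")
STATIC_MAX_AGE = 30 * 24 * 3600  # 30 days, for browsers and shared caches
DYNAMIC_BROWSER_MAX_AGE = 0      # browsers revalidate dynamic pages
DYNAMIC_SHARED_MAX_AGE = 60      # CDN / reverse proxy may reuse for 60 s


def cache_headers(url: str) -> dict[str, str]:
    """Return a Cache-Control header for a request URL (hypothetical helper)."""
    path = urlsplit(url).path.lower()  # drop the query string before matching
    if path.endswith(STATIC_SUFFIXES):
        return {"Cache-Control": f"public, max-age={STATIC_MAX_AGE}"}
    # Only safe for pages that are not personalized per user.
    return {
        "Cache-Control": (
            f"public, max-age={DYNAMIC_BROWSER_MAX_AGE}, "
            f"s-maxage={DYNAMIC_SHARED_MAX_AGE}"
        )
    }
```

Pages that differ per user (cart, account, personalized recommendations) would need `private` or `no-store` instead, which this sketch deliberately leaves out.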

The next level is the application itself; with current hardware we can have seriously big in-process caches. But as they say, with power comes responsibility. In the case of Java, for example, make sure heap memory is managed properly: if the heap is very large, the Java process can go dark (freeze ☺) while executing a full GC.
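One way to keep an application-level cache from growing without bound is to cap the number of entries and expire them after a fixed time. The following is a minimal sketch under those assumptions; the class name, the least-recently-used eviction policy and the injectable clock are illustrative choices, not something the platform described here is known to use.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class TTLCache:
    """Bounded in-process cache with LRU eviction and per-entry expiry."""

    def __init__(self, max_entries: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._data = OrderedDict()   # key -> (expires_at, value)

    def get(self, key: str) -> Optional[Any]:
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._data[key]      # stale entry: drop it and report a miss
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: Any) -> None:
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max:
            self._data.popitem(last=False)  # evict least recently used
```

A cache like this would typically be sized from the memory budget of the process rather than from the size of the data set, which is what keeps garbage-collection pauses predictable.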

The next level of caching can be at the storage level, for example with NoSQL stores. These can be used as another level of cache or even as authoritative data sources. Many of them provide high performance and scalability at the cost of strong consistency and rigid structure (they typically offer eventual consistency). Many open-source distributed caches are now available and can be picked as per the need.

Another option is to use multiple replicas of the master database. Traffic can be split so that all updates go to the master while read queries go to one or more replicas.
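The read/write split can be sketched as a small router. The class name, the connection labels and the rule of treating only `SELECT` statements as reads are assumptions made for illustration; a production setup would usually rely on the database driver or a proxy for this.

```python
import itertools


class ReadWriteRouter:
    """Send reads to replicas in round-robin order and everything else to the master."""

    def __init__(self, master: str, replicas: list[str]) -> None:
        self._master = master
        # With no replicas configured, reads fall back to the master.
        self._replicas = itertools.cycle(replicas or [master])

    def route(self, sql: str) -> str:
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT":
            return next(self._replicas)
        # INSERT/UPDATE/DELETE, DDL and anything unrecognized go to the master.
        return self._master
```

Note that replicas which are updated asynchronously can lag behind the master, so a read that must see a write made just before it may still need to be routed to the master.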

In-depth Monitoring

Monitoring is the lifeline of any top-performing marketplace. The system should be able to analyze the impact of an issue and be robust enough to recover as quickly as possible. Monitoring also helps in forecasting for capacity planning.

To have this at every level, you need to monitor each layer, microservice component, network, IOPS, hardware, service, etc. There also needs to be an alert mechanism for all critical production issues (e.g. failures that will result in business impact). Many off-the-shelf tools are available to serve these purposes.

Kubernetes/Istio - can be used for microservice orchestration and service-to-service traffic management

Prometheus/Logstash/Grafana/Kibana/Pinpoint - can be used for metrics, log processing, data visualization, alerts, APM, etc.
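Independent of the tool chosen, an alert rule usually boils down to a metric, a window and a threshold. The sketch below illustrates that idea with an error-rate check over the most recent requests. The class name, window size and threshold are hypothetical, and this is not how any of the tools above implement their rules.

```python
from collections import deque


class ErrorRateAlert:
    """Fire when the failure ratio over the last `window` requests exceeds `threshold`."""

    def __init__(self, window: int, threshold: float) -> None:
        self._results = deque(maxlen=window)  # oldest outcome drops off automatically
        self._threshold = threshold

    def record(self, ok: bool) -> bool:
        """Record one request outcome; return True if an alert should fire."""
        self._results.append(ok)
        if len(self._results) < self._results.maxlen:
            return False  # not enough data yet to judge
        failures = self._results.count(False)
        return failures / len(self._results) > self._threshold
```

For example, with `ErrorRateAlert(window=4, threshold=0.25)`, one failure in the last four requests does not fire, while two failures do.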

Cloud as Infrastructure as a service (IaaS)

Using the cloud as IaaS for a marketplace/distributed system helps you focus on building features rather than worrying about hardware, CPU, RAM, etc. It also makes it easy to scale up or down according to business needs. The cloud’s on-demand processing capability is an advantage when processing enormous amounts of data for analytics, personalization, etc.
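Scaling up or down according to load can be reduced to a simple proportional rule: keep per-instance utilization near a target. The sketch below assumes CPU utilization as the signal, and the function name, parameter names and bounds are hypothetical. The same ratio is used by the Kubernetes Horizontal Pod Autoscaler, though real autoscalers add tolerances and cool-down periods that are omitted here.

```python
def desired_instances(current: int, cpu_percent: int, target_percent: int,
                      min_instances: int = 1, max_instances: int = 20) -> int:
    """Return the instance count that brings average CPU back to the target."""
    # Integer ceiling of current * cpu_percent / target_percent,
    # written this way to avoid floating-point rounding surprises.
    wanted = -(-current * cpu_percent // target_percent)
    # Clamp to the assumed fleet limits.
    return max(min_instances, min(max_instances, wanted))
```

With four instances at 90% CPU and a 60% target, the rule asks for six instances; at 30% CPU it asks for two.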

Programming Languages

Microservices give the most important advantage here: each component can be written in the language that serves its purpose best. We went ahead with Java because it is a mature platform that is maintainable and scalable, and because plenty of brilliant engineers are available who can solve complex performance problems in it.

As for open source, I feel that the whole marketplace can be built using open source: programming languages, Linux, Spring, Hibernate, NoSQL DBs, etc. I also see open-source community forums becoming more active and helpful.

What are your views?

Disclaimer: The opinions expressed in this post and all other posts are my personal opinions and do not reflect the opinions of any organisation that I'm associated with - either in the past or the present.

A good primer on building large scale architecture. 

In my humble opinion, anyone large enough to set up a marketplace should go for their own platform. On the other hand, while microservices offer a unique set of advantages and have become a fixture in any architecture, it’s equally important to think about deployment at the point of development to ensure production stability.
