Some design considerations when integrating with different systems - Part 1

Interfacing with different systems is a common requirement. To offer or use services, a system has to expose APIs or consume APIs exposed by other systems.

EAI (Enterprise Application Integration) patterns and many frameworks help us address this, but there are some underlying principles that need to be adopted in either case.

Mode of communication - Asynchronous or Synchronous calls

Most systems have traditionally preferred synchronous calls: one system sends a request and waits for the response.

This approach is commonly adopted because it involves minimal coding and does not require a complex queuing mechanism or matching each response to its request. But synchronous calls pose some problems, such as:

  • Performance issues - System X has made a request to System Y, and Y is taking a long time to process it. The thread in X has to wait for it to complete. Okay, multi-threading helps, as the application container takes care of it. But imagine that several hundred requests come from X and Y takes 20 seconds to process each one. Then what? X would start misbehaving because of Y. Of course X should be fault tolerant, but even then the slowdown would reduce the overall TPS (transactions per second) of the system.
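To put rough numbers on this: if every request blocks one worker thread until Y answers, then X can complete at most (number of threads) / (time per request) requests per second. The small sketch below only does that arithmetic. The pool size of 200 threads and the normal latency of 0.5 seconds are assumptions chosen for illustration, only the 20 seconds comes from the scenario above.

```python
def max_throughput_tps(worker_threads: int, latency_seconds: float) -> float:
    """Upper bound on requests completed per second when each request
    occupies one blocking worker thread for latency_seconds."""
    return worker_threads / latency_seconds


# Assumed figures, for illustration only.
ASSUMED_THREADS = 200
normal_tps = max_throughput_tps(ASSUMED_THREADS, 0.5)    # Y healthy
degraded_tps = max_throughput_tps(ASSUMED_THREADS, 20.0)  # Y takes 20 seconds
print(normal_tps, degraded_tps)
```

Under these assumed figures the ceiling drops from 400 to 10 requests per second, and anything arriving faster than that simply queues up at X.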

We faced such an issue in the past. A mobile banking app used to communicate with the Mobile Application Server via a web server. In turn, the Mobile Application Server made requests to the CBS (core banking system) interface. One day the CBS interface was facing some issue, all mobile requests were piling up at the Mobile Application Server, and it was unable to process further requests (a denial of service). Multiple solutions were adopted to resolve this issue, and one of them was making the server fault tolerant by timing out.

  • Handling through timing out - Here X waits for Y for a certain time (let's say 10 seconds), then signals a time-out, throws an exception, and exits gracefully. And the system is stable... :)
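A minimal sketch of this idea in Python, using only the standard library. The names `call_with_timeout` and `DownstreamTimeout` are hypothetical and not from any particular framework. Note that the caller stops waiting, but the call to Y itself is not cancelled.

```python
import concurrent.futures
import time


class DownstreamTimeout(Exception):
    """Raised when the downstream system does not answer in time."""


def call_with_timeout(fn, timeout_seconds, *args):
    """Run fn(*args) in a worker thread and wait at most timeout_seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        # X gives up waiting here; Y may still finish the work later.
        raise DownstreamTimeout(
            f"no response within {timeout_seconds} seconds"
        ) from None
    finally:
        # Do not block the caller while the slow call is still running.
        pool.shutdown(wait=False)


def slow_system_y():
    """Stand-in for a slow call to Y (hypothetical)."""
    time.sleep(0.3)
    return "done"


try:
    call_with_timeout(slow_system_y, 0.05)
except DownstreamTimeout as exc:
    print("timed out:", exc)
```

In a real integration you would more likely set the connect and read timeouts of your HTTP or messaging client. Either way, X stays responsive, but it does not know whether Y completed the work.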

However, it creates other issues that need to be handled. It may not create a problem if the service call is just an inquiry, but if the service call is for transaction processing (like a fund transfer), then time-out scenarios have to be handled properly.

Let's say a user is making a payment from his mobile app, and the request is sent to the Application Server and from there to the CBS or to NPCI for an interbank fund transfer.

If the response is not received in the stipulated time, the Mobile App Server responds to the mobile app that the request has timed out. But for the end user it is a bad experience. Okay, it has timed out, but what has happened to his transaction? Suppose his account has been debited and he has received an SMS for it. Where has the money gone?

The problem starts here. A few things need to be considered while designing the system to address the above issues. These apply whether the communication is async or sync.

  • Each request should have a unique identifier, a RequestID. The best practice is to generate the RequestID at the point of origination. In the above scenario the point of origin is the mobile banking app. The RequestID helps to trace the fate of the request while troubleshooting. The user can also make a request based on the RequestID to check the fate of the transaction.
  • It is quite possible that there is one call from the point of origin but internally the system makes multiple calls to other systems. In the mobile banking case, for example, it calls the CBS multiple times: A) to check whether the account can be debited, B) for PIN verification, and C) for the actual fund transfer (ideally a composite call is required in this case, but sometimes that is not possible because the existing system APIs were developed that way), and then D) sends it to NPCI. In such a scenario each request to another system should generate a new RequestID. So there is a single RequestID from mobile banking and multiple requests A, B, C and D, each having a unique RequestID mapped to the original RequestID.
  • State of request - It is imperative to maintain the state of the request at every stage, as it helps to debug and troubleshoot. For example, in our system we used to maintain the following states: a) request received from the mobile app as "Received Request", b) request sent to the CBS and pending a response as "Pending from CBS", c) response received from the CBS as another state ("Success", "Failure" or "Timeout"), and so on. You can have intermediate states if needed.
  • Logging - The RequestID has to be stored in log files or a DB to trace the fate of the request. Each RequestID should have at least the following details for troubleshooting: 1. Parameters passed and response received 2. System from/to which the request was made 3. Time the request was received 4. Time the request was sent to the other system 5. Time taken to respond to the request 6. State of the request - success, or if failed, the appropriate reason code 7. Response details and, if it is an async call, the corresponding response ID.
  • Polling mechanism - This is an interesting concept for async calls, or for sync calls when a time-out has happened. Each API can have a parameter, a poll flag. When it is true, the API does not create a new request but checks the state of the existing request. Instead of polling, one can implement REST hooks or webhooks.
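The points above (a RequestID from the origin, child RequestIDs mapped to it, a state at every stage, and a poll flag) can be sketched together as follows. This is only an in-memory illustration, not how our system was built: `RequestStore`, `handle_transfer` and `call_cbs` are hypothetical names, and a real system would persist the records in a DB or log files.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple


def new_request_id() -> str:
    """Generate a unique RequestID (a random UUID is one possible choice)."""
    return uuid.uuid4().hex


@dataclass
class RequestRecord:
    request_id: str
    parent_id: Optional[str]  # original RequestID, None at the point of origin
    target: str               # system the request was made to
    state: str
    history: List[Tuple[str, datetime]] = field(default_factory=list)


class RequestStore:
    """In-memory stand-in for the DB or log files holding request state."""

    def __init__(self) -> None:
        self._records: Dict[str, RequestRecord] = {}

    def create(self, request_id, target, parent_id=None,
               state="Received Request") -> RequestRecord:
        record = RequestRecord(request_id, parent_id, target, state)
        record.history.append((state, datetime.now(timezone.utc)))
        self._records[request_id] = record
        return record

    def set_state(self, request_id: str, state: str) -> None:
        record = self._records[request_id]
        record.state = state
        record.history.append((state, datetime.now(timezone.utc)))

    def get(self, request_id: str) -> Optional[RequestRecord]:
        return self._records.get(request_id)

    def children(self, parent_id: str) -> List[RequestRecord]:
        return [r for r in self._records.values() if r.parent_id == parent_id]


def handle_transfer(store, request_id, call_cbs=None, poll=False) -> str:
    """API entry point. With poll=True it only reports the current state."""
    existing = store.get(request_id)
    if poll:
        return existing.state if existing else "Unknown Request"
    if existing is not None:
        # Assumption: a re-submitted RequestID is not processed twice.
        return existing.state
    store.create(request_id, target="Mobile App Server")
    child_id = new_request_id()  # new ID for the internal call to the CBS
    store.create(child_id, target="CBS", parent_id=request_id,
                 state="Pending from CBS")
    store.set_state(request_id, "Pending from CBS")
    try:
        outcome = "Success" if call_cbs(child_id) else "Failure"
    except TimeoutError:
        outcome = "Timeout"
    store.set_state(child_id, outcome)
    store.set_state(request_id, outcome)
    return outcome
```

After a time-out, the mobile app can call the same API again with the same RequestID and `poll=True` to learn the state, instead of creating a second transfer.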

There are various other parameters to be considered as well, and I will cover them in the next article.

A few points to highlight:

  • There are various ways of implementing this and no single correct one, but always keep the basic principles in mind while designing. Even if you cannot implement some of them, make a note of it, stating that you could not implement them due to some constraints (technical or business).
  • There are a lot of sites that describe various frameworks and technologies for implementation. However, think about whether they are really applicable before applying them in your product. The first thing is to think, and only then go for a Google search. Google will help you search for what you are thinking, but it will not help you think.

A few things I would like to cover in my upcoming articles in relation to this topic:

  • Exception Handling
  • Versioning of APIs
  • Security mechanisms - to prevent unauthorized requests from being processed
  • Scalability
  • Handling Load and achieving high throughput
  • Caching
  • Auditing and Traceability
  • Isolation of each service
  • Fault-tolerant systems
  • Troubleshooting and Logging

