Integrated Automated Solutions for Enterprise Document Processing

Many automation solutions on the market today claim to extract data from various document types. They cater either to text-format documents or to image-format documents. Solutions for text-format documents generally yield better accuracy and thus higher straight-through processing (STP) rates. For image-format documents, solutions and products rely heavily on the maturity of OCR/ICR technologies. An augmentation wrapper of business rules and/or machine learning (ML) components enables better recognition results.

Traditional OCR/ICR solutions have matured significantly in reducing noise and improving character recognition accuracy. Beyond printed character recognition, capabilities for hand-printed and cursive handwriting are improving steadily.

RPA products are partnering with these technologies to augment their offerings. With integrated ML libraries and components and configurable low/no-code features, these products are making it easier for business users (non-programmers) to build solutions.

System Integrators (SIs) are leveraging their experience across multiple scenarios to build and package solutions tailored to client-specific needs:

  1. Add-on modules such as ingestion, workflow and reporting
  2. Accelerators such as ML components or look-ups for improving accuracy
  3. Industry-specific business-rules engines



Hundreds or even thousands of unique document types are in use across industries. While certain products or as-a-service providers suit select business segments, their impact is limited and siloed to certain business functions. For example, invoice processing solutions are fairly mature and yield reasonably good benefits.


Large enterprises that process thousands of documents globally every day have to use different solutions, built on a mix of tools and technologies, for each business domain (invoices, mortgages, claims, etc.). This demands complicated integrations with various workflows, document repositories, and legacy and new-age applications, and it is expensive. Adopting automation therefore remains a challenge for such enterprises.

Consider a bank with multiple functions such as cheque processing, customer account creation, merchant POS terminal setup and certification, credit/debit card processing, wire transfers and loan processing. Across these functions, numerous documents are sent and received every day, and different solutions are available for processing them. Typical documents include:

  1. KYC documents (passport, driving license, income statement, etc.) submitted by customers opening a bank account, along with cheques and applications (loan, credit card, etc.)
  2. Invoices submitted by vendors for services or products rendered
  3. Purchase orders (POs) raised by the bank

The different solutions available for each of these document types include: AP solutions for invoices, cheque readers, passport readers, PAN Card readers, etc.

All these documents from various branches are collated at the bank’s centralized processing center (CPC). To process this assortment of documents within the prescribed SLA, the CPC has to rely on an integrated master automation solution that orchestrates the individual child document processing solutions. The outputs from these child processes are then matched against the bank’s internal data systems or external lookups to cross-validate and reconcile the final output to be delivered. Finally, performance at the batch, document and data levels is consolidated and reported for further analysis or downstream use. Broadly, the process steps are:

  1. Classify each document
  2. Route each document to the specific individual solution for data capture

  • In the Accounts Payable process, depending on the availability of the PO, goods receipt and invoice, the solution may use a 2-way or 3-way match
  • The cheque processing solution extracts the different data fields
  • The passport reader extracts the individual’s details (first name, last name, date of birth, address)
  • The electricity bill reader extracts the property address and owner’s name
  • The driving license reader extracts the individual’s name, date of birth and address

  3. Validate the extracted data

  4. Match data from one document with another (if required) to determine the STP outcome:

  • STP Pass – when data fields from 2 different documents match with each other
  • STP Fail – when data fields from 2 different documents do not match

  5. Update the status at every stage of every individual process and for every document
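
The 2-way versus 3-way decision mentioned for Accounts Payable can be illustrated with a small sketch. It assumes the common reading that a 2-way match compares the invoice with the PO and a 3-way match also checks the goods receipt. The `Line` record, the `match_invoice` function, the result strings and the price tolerance are all hypothetical and heavily simplified; real AP solutions match at line-item level with configurable tolerances.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Line:
    """Hypothetical, simplified record: one item with a quantity and unit price."""
    item: str
    quantity: int
    unit_price: float = 0.0


def match_invoice(invoice: Line, po: Optional[Line], receipt: Optional[Line] = None) -> str:
    """Pick a 2-way or 3-way match depending on which documents are available."""
    if po is None:
        return "no match possible"  # nothing to compare the invoice against

    # 2-way match: invoice versus purchase order (assumed price tolerance of half a cent).
    two_way = (
        invoice.item == po.item
        and invoice.quantity == po.quantity
        and abs(invoice.unit_price - po.unit_price) < 0.005
    )
    if receipt is None:
        return "2-way pass" if two_way else "2-way fail"

    # 3-way match: additionally confirm that the goods were actually received.
    three_way = (
        two_way
        and receipt.item == po.item
        and receipt.quantity == invoice.quantity
    )
    return "3-way pass" if three_way else "3-way fail"
```

For example, an invoice and PO for 10 units at the same price pass a 2-way match, but the same pair fails a 3-way match if the goods receipt shows only 8 units delivered.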

A simple workflow depiction is shown below:

[Image: workflow diagram]
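
The five process steps can also be sketched in code. The following is a minimal illustration only, not a production design: the keyword classifier, the `key: value` extractor, the `REQUIRED_FIELDS` tuple and every name in it are assumptions made for this example. In practice, classification would typically use an ML model and each extractor would be a separate child solution working on OCR/ICR output.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    doc_id: str
    text: str  # stand-in for OCR/ICR output
    doc_type: str = "unknown"
    fields: Dict[str, str] = field(default_factory=dict)
    status: List[str] = field(default_factory=list)  # one entry per stage (step 5)


def classify(doc: Document) -> str:
    """Step 1: toy keyword classifier (assumption; real systems use ML models)."""
    text = doc.text.lower()
    if "passport" in text:
        return "passport"
    if "driving license" in text:
        return "driving_license"
    return "unknown"


def extract_key_values(doc: Document) -> Dict[str, str]:
    """Stand-in for a child solution: parses 'key: value' lines."""
    pairs = (line.split(":", 1) for line in doc.text.splitlines() if ":" in line)
    return {key.strip().lower(): value.strip() for key, value in pairs}


# Step 2: routing table from document type to the child solution that handles it.
EXTRACTORS: Dict[str, Callable[[Document], Dict[str, str]]] = {
    "passport": extract_key_values,
    "driving_license": extract_key_values,
}

REQUIRED_FIELDS = ("name", "date of birth")  # hypothetical validation rule


def process(doc: Document) -> Document:
    """Run steps 1 to 3 for one document, recording status at every stage."""
    doc.doc_type = classify(doc)
    doc.status.append(f"classified:{doc.doc_type}")

    extractor = EXTRACTORS.get(doc.doc_type)
    if extractor is None:
        doc.status.append("routed:manual_review")  # no child solution available
        return doc
    doc.fields = extractor(doc)
    doc.status.append("extracted")

    # Step 3: validate that the required fields were captured.
    missing = [name for name in REQUIRED_FIELDS if not doc.fields.get(name)]
    doc.status.append("validated" if not missing else f"invalid:missing {missing}")
    return doc


def stp_match(a: Document, b: Document, keys=REQUIRED_FIELDS) -> str:
    """Step 4: STP Pass only if every compared field is present and equal in both."""
    matched = all(
        a.fields.get(k) and a.fields[k].casefold() == b.fields.get(k, "").casefold()
        for k in keys
    )
    result = "STP Pass" if matched else "STP Fail"
    for doc in (a, b):
        doc.status.append(result)  # step 5: record the outcome on both documents
    return result
```

Running `process` on a passport and a driving license and then calling `stp_match` on the pair yields "STP Pass" when the name and date of birth agree (ignoring case), and "STP Fail" otherwise. Documents of an unrecognised type are routed to manual review rather than to a child solution.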

Delivering all of the above through a single integrated solution can seem a complex challenge. RPA and BPM products and SIs are making rapid strides toward enabling smart enterprise automation solutions. They are doing so through (1) product enhancements and (2) plug-ins that integrate and orchestrate the different third-party siloed solutions.
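
One way to picture the plug-in approach is a common interface that each third-party solution is wrapped behind, so the orchestrator only ever talks to that interface. The sketch below is an assumption about how such plug-ins might look, not a description of any specific RPA or BPM product. `ExtractionPlugin`, `PluginRegistry` and `ChequeReaderAdapter` are hypothetical names, and the adapter parses a toy `payee|amount` payload where a real one would call the vendor's SDK or API.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ExtractionPlugin(ABC):
    """Hypothetical common interface that every wrapped third-party solution exposes."""

    doc_type: str

    @abstractmethod
    def extract(self, raw: bytes) -> Dict[str, str]:
        """Return the extracted fields for one document."""


class PluginRegistry:
    """Lets the orchestrator look up a plug-in by document type."""

    def __init__(self) -> None:
        self._plugins: Dict[str, ExtractionPlugin] = {}

    def register(self, plugin: ExtractionPlugin) -> None:
        self._plugins[plugin.doc_type] = plugin

    def supported_types(self) -> List[str]:
        return sorted(self._plugins)

    def extract(self, doc_type: str, raw: bytes) -> Dict[str, str]:
        try:
            plugin = self._plugins[doc_type]
        except KeyError:
            raise LookupError(f"no plug-in registered for {doc_type!r}") from None
        return plugin.extract(raw)


class ChequeReaderAdapter(ExtractionPlugin):
    """Toy adapter; a real one would delegate to the vendor's cheque reader."""

    doc_type = "cheque"

    def extract(self, raw: bytes) -> Dict[str, str]:
        payee, amount = raw.decode("utf-8").split("|")
        return {"payee": payee, "amount": amount}
```

With this shape, adding support for a new document type means writing one adapter and registering it; the orchestration logic itself does not change.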


More articles by Sivananda S
