Optical Character Recognition (OCR)

Optical Character Recognition (OCR) is the process of electronically extracting text from images. The extracted text can be reused in a variety of ways such as document editing, free-text searches, or compression.

The OCR process is the mechanical or electronic conversion of scanned images of handwritten, typewritten, or printed text into machine-encoded text. It is a common method of digitizing printed texts so that they can be electronically searched, stored more compactly, displayed online, and used in machine processes such as machine translation, text-to-speech, and text mining.

OCR technology can be applied across the entire spectrum of industries, revolutionizing the document management process. OCR has enabled scanned documents to be more than just image files by turning them into fully searchable documents. OCR extracts relevant information from the scanned documents and enters it automatically into a database, making data entry more accurate and information processing more efficient.

There are three essential elements in OCR technology:

  • Scanning
  • Recognition
  • Reading text

Initially, a printed document is scanned by a camera. OCR software converts the image into recognized characters and words. The synthesizer in the OCR system then speaks the recognized text. Finally, the information is stored in an electronic form, either in a personal computer (PC) or the memory of the OCR system itself.
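
The recognition step can be illustrated with matrix matching, a classic technique in which each scanned glyph is compared pixel by pixel against stored character templates. The following is a toy sketch under that assumption: the 3x5 bitmaps and the names TEMPLATES, pixel_distance, and recognize are invented for illustration, and real OCR engines rely on far more robust methods.

```python
# Toy matrix-matching recognizer: each glyph is a 3x5 bitmap written as strings.
# TEMPLATES and recognize() are illustrative names, not part of any OCR library.
TEMPLATES = {
    "I": ("###",
          ".#.",
          ".#.",
          ".#.",
          "###"),
    "L": ("#..",
          "#..",
          "#..",
          "#..",
          "###"),
    "T": ("###",
          ".#.",
          ".#.",
          ".#.",
          ".#."),
}


def pixel_distance(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def recognize(glyph):
    """Return the template character whose bitmap is closest to the glyph."""
    return min(TEMPLATES, key=lambda ch: pixel_distance(glyph, TEMPLATES[ch]))


# A scanned "T" with one noisy pixel in the second row.
scanned = ("###",
           ".##",
           ".#.",
           ".#.",
           ".#.")
print(recognize(scanned))  # prints "T"
```

Picking the closest template rather than requiring an exact match is what lets this sketch tolerate a small amount of scanning noise.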

The recognition process takes account of the logical structure of the language. An OCR system will deduce that the word "tke" at the beginning of a sentence is a mistake and should be read as the word "the." OCR systems also use a lexicon and apply spell-checking techniques similar to those found in many word processors.
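
A lexicon-based correction of the "tke" example might look like the sketch below. It is only one plausible approach, not how any particular OCR product works: it assumes a tiny hypothetical word list ordered from most to least common, and it replaces an unknown word with the nearest lexicon entry by edit distance, breaking ties in favour of the more common word.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                   # delete ch_a
                current[j - 1] + 1,                # insert ch_b
                previous[j - 1] + (ch_a != ch_b),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


# Hypothetical lexicon, assumed to be ordered from most to least common.
LEXICON = ["the", "of", "and", "to", "take", "text", "scanned"]


def correct(word, lexicon=LEXICON, max_distance=1):
    """Return the closest lexicon word, or the input if nothing is close."""
    if word in lexicon:
        return word
    # min() keeps the first of several equally close entries, so the
    # earlier (more common) word wins a tie such as "the" versus "take".
    best = min(lexicon, key=lambda entry: edit_distance(word, entry))
    return best if edit_distance(word, best) <= max_distance else word


print(correct("tke"))  # prints "the"
```

Words that are far from every lexicon entry are returned unchanged, so names and unusual terms are not forced into a wrong "correction".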

All OCR systems create temporary files containing the text's characters and page layout. In some OCR systems, these temporary files can be converted into formats that commonly used software, such as word processors, spreadsheets, and databases, can open.

OCR has many application areas; the most common are listed below:

Banking

OCR is widely used in banking, where it processes cheques without human involvement. A cheque is inserted into a machine, the text on the cheque is scanned instantly, and the correct amount of money is transferred.

This technology has nearly been perfected for printed cheques and is fairly accurate for handwritten cheques as well, though manual involvement is occasionally needed for approvals and confirmation. It has reduced not only waiting times in many banks but also the human effort needed.

Blind and visually impaired persons

One of the major motivations behind OCR research was to devise a system that could read books aloud to blind people. The flatbed scanner, now commonly known as the document scanner, was developed as part of this research.

OCR technology offers blind and visually impaired persons the capacity to scan printed text and then have it spoken back in synthetic speech or saved to a computer. Little technology exists to interpret graphics such as line art, photographs, and graphs into a medium that is easily accessible to blind and visually impaired persons. It is also not yet possible to reliably convert free-form handwriting, whether script or block printing, into an accessible medium.

The blind or visually impaired user can access the scanned text by using adaptive technology devices that magnify the computer screen or provide speech or braille output.

Legal department

There is a significant movement to digitize paper documents in the legal industry. To save space and eliminate the need to sift through boxes of paper files, documents are being scanned and entered into computer databases. OCR further simplifies the process by making documents text-searchable, so that they are easier to locate and work with once in the database. Legal professionals now have fast, easy access to a huge library of documents in electronic format, which they can find simply by typing in a few keywords.
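
Keyword lookup over OCR output can be sketched with an inverted index, which maps each word to the documents that contain it. This is a minimal illustration rather than a description of any specific legal database: the file names and texts are made up, and build_index and search are hypothetical helper names.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents containing every keyword in the query."""
    keywords = re.findall(r"[a-z0-9]+", query.lower())
    if not keywords:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in keywords))
    return sorted(matches)


# Hypothetical OCR output for three scanned legal documents.
ocr_text = {
    "contract_001.pdf": "Lease agreement between the landlord and the tenant.",
    "memo_017.pdf": "Internal memo about the tenant dispute hearing.",
    "deed_203.pdf": "Deed of sale for the property on Elm Street.",
}
index = build_index(ocr_text)
print(search(index, "tenant dispute"))  # prints ['memo_017.pdf']
```

Because the index is built once from the recognized text, each later query only needs a few set lookups instead of a rescan of every document.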

Retail Industry

Barcode recognition technology is closely related to OCR. Because we encounter barcodes on consumer goods every day, this use is already familiar to most of us.

Other Uses

OCR is widely used in many other fields, including education, finance, and government agencies. OCR has made countless texts available online, saving money for students and allowing knowledge to be shared. Invoice imaging applications are used in many businesses to keep track of financial records and prevent a backlog of payments from piling up. In government agencies and independent organizations, OCR simplifies data collection and analysis, among other processes. As the technology continues to develop, more and more applications are found for OCR technology, including increased use of handwriting recognition.

Taking a few minutes to run OCR on our PDF documents is all it takes to turn them from basic images of paper documents into fully digitized documents that we can search, copy, mark up, and otherwise treat like any normal file.

The following tools are often recommended for OCR:

  • gImageReader
  • Capture2Text
  • VueScan
