Azure AI+Machine Learning: Tutorial on Computer Vision API (Text Extraction)

Machine learning tools can be very powerful, often providing information that would be difficult to obtain any other way. However, many people are intimidated by the thought of learning about… machine learning, usually because it can get a bit complex at times.

Fortunately for us, we have Microsoft to help! With Microsoft Azure, we have access to a variety of tools that are simple to implement, with no need to dive deep into machine learning theory. One of these great tools is Azure’s Computer Vision API. With this service, we can do two things:

1.      Analyze an image and automatically generate a brief description of the contents.

2.      Automatically extract typed and handwritten text from an image.

The results can sometimes be quite impressive. With that in mind, I will briefly demonstrate how to implement this service using the example code provided by Microsoft. In this tutorial you will:

1.      Create a Computer Vision Resource

2.      Connect a console app to the Computer Vision Resource

3.      Compare results for typed text vs. handwritten text

Follow these steps:

PART A) CREATE A COMPUTER VISION RESOURCE

1.      Log in to Microsoft Azure and click on “Create a Resource” on the top left corner.


2.      You will notice several Azure services. In the “Create Resource” menu, go to “AI+Cognitive Services” and click on “Computer Vision API”.


3.      You will be redirected to the “Create Computer Vision API” menu. To begin, give this resource a name that is easy to remember and easy to organize; if you end up with multiple resources with cryptic names, it can be hard to find the one you want among the clutter.


4.      Select your subscription/payment method.

5.      Select a region, ideally one close to you. For this example, select West Central US. Free Trial accounts will only be able to choose the West Central US region.

6.      Pricing: You will notice two options (F0 and S1). F0 is free, but you will be limited to 20 calls per minute, with a total of 5,000 calls per month. Not bad. The S1 tier, on the other hand, has different pricing levels: it starts at $1 per 1,000 transactions for up to 1,000,000 total transactions, and drops to $0.65 per 1,000 once you exceed 5 million transactions.


7.      Create a name for your resource group. Choose any name you will easily remember; resource groups are used to help you manage resources, so for now you do not need to worry about them. Just pick a name.

8.      You will have to wait a few seconds while this resource is being created. When completed, you will be redirected to the resource QuickStart menu. Click on “Keys” in the first step.



9.      You will see two keys generated for you. You will use one of these keys to authenticate your app’s calls to the API.


10.  You are done with Part A!!


PART B) CONNECT THE APP TO THE COMPUTER VISION RESOURCE

1.      For this tutorial, we will copy some example code provided by Microsoft. First, open Visual Studio, and create a C# Console App.


2.      Go to the page below and copy the code provided there. Paste it into the Program.cs file of your Visual Studio project.

 https://docs.microsoft.com/en-us/azure/cognitive-services/Computer-vision/QuickStarts/CSharp#handwriting-recognition-c-example

3.      You will notice two variables at the top: SubscriptionKey and uriBase.





a.      Go back to your Vision API resource in the Azure dashboard. Copy one of the keys generated for this resource and substitute it for the SubscriptionKey value in the example. This is what you will need to access the API.

b.      Copy the endpoint link for your resource’s region and substitute it for the uriBase value.

4.      IMPORTANT: The link you pasted needs a bit more information: you must specify which service you will use. Add “recognizeText” to the end of the link.

a.      It should look like this:

"https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/recognizeText";

5.      You are now good to go. Save two example images (one typed, one handwritten) on your computer, and copy the path to one of them.


6.      Run your app and paste the image path when prompted. After a few seconds, the result will be printed.
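Under the hood, the console app is just making an HTTP call against the URL you built in step 4. Here is a rough sketch of what happens when you run it, written in Python for brevity (the Microsoft sample does the same thing in C# with HttpClient). It assumes the asynchronous v1.0 recognizeText flow, where the POST returns an Operation-Location header that you poll for the result; the endpoint, key, and query parameter are based on that API and should be checked against your own resource:

```python
import json
import time
import urllib.request

def build_recognize_request(uri_base, key, image_bytes, handwriting=True):
    # recognizeText accepts a query parameter that switches between
    # printed and handwritten text (assumption based on the v1.0 API)
    url = uri_base + "?handwriting=" + ("true" if handwriting else "false")
    return urllib.request.Request(
        url,
        data=image_bytes,  # the raw image bytes go in the request body
        headers={
            "Ocp-Apim-Subscription-Key": key,  # one of your two keys
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )

def recognize_text(uri_base, key, image_path):
    """Send the image and poll until the service finishes (sketch)."""
    with open(image_path, "rb") as f:
        req = build_recognize_request(uri_base, key, f.read())
    with urllib.request.urlopen(req) as resp:
        # the POST does not return the result directly; it returns a URL
        # to poll while the recognition job runs
        poll_url = resp.headers["Operation-Location"]
    while True:
        time.sleep(1)  # the job takes a few seconds, as noted above
        poll_req = urllib.request.Request(
            poll_url, headers={"Ocp-Apim-Subscription-Key": key})
        with urllib.request.urlopen(poll_req) as resp:
            result = json.load(resp)
        if result.get("status") in ("Succeeded", "Failed"):
            return result
```

The uri_base here is the same string from step 4a, e.g. "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/recognizeText", and the key is whichever of the two keys you copied in Part A.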

RESULTS

The result will come back in JSON format. The results from your app should look like this:

1.      Handwriting Example and Result





Actual result text:

fear and Tom Congratulations on your twenty fifth wedding

There are so feelings anniversary so are as fortunate to have what you have

companionship , friendship , and a loving relationship

Happy anniversary

Very truly

Helen



2.      Typed Text Example and Result

Actual text from result:

"William Henry Perkin From Wikipedia , the free encyclopedia

For William Henry Perkin Jr , the son of William Henry Perkin , see William Henry Perkin JR .

Sir William Henry Perkin , FRS ( 12 March 1838 - 14 July 1907 1 was a British chemist and entrepreneur best known for his serendipitous discovery of the first synthetic organic dye mauveine , made from aniline. Though he failed in trying to synthesise quinine for the treatment of malaria , he became successful in the field of dyes after his first discovery at the age of 18 12 ) "
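If you want your app to print just the recognized text rather than raw JSON, the per-line results are easy to pull out. The sketch below assumes the field names of the v1.0 recognizeText response (a status plus a recognitionResult containing a list of lines); check the actual response from your resource, as the schema may differ:

```python
import json

# A trimmed-down stand-in for the JSON the service returns
# (field names assumed from the v1.0 recognizeText response)
sample = json.loads("""
{
  "status": "Succeeded",
  "recognitionResult": {
    "lines": [
      {"text": "Happy anniversary"},
      {"text": "Very truly"},
      {"text": "Helen"}
    ]
  }
}
""")

def extract_lines(result):
    # flatten the per-line results into a list of plain strings
    lines = result.get("recognitionResult", {}).get("lines", [])
    return [line["text"] for line in lines]

print("\n".join(extract_lines(sample)))
```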


You will also notice that your resource overview shows usage activity:


Generally, the results are accurate for typed text in images. Images with handwriting, however, will produce the occasional error, giving you words that are not in the image.

For the most part, the API works well. It is worth noting that handwriting results should improve every year as larger training sets are fed into the machine learning algorithms.

As you can see, it is not hard to implement. The process and the results are similar for the image content analysis service. Have you used this service before? If you have, feel free to comment below on how you have used it.
