A Code-Based Introduction to AI for .NET Programmers

If you are new to AI but want to get a grasp of the fundamentals and some terminology, going through some code may be the easiest way to do it. Thanks to the ONNX Runtime and Microsoft DirectML, the code is easy to implement, succinct, and digestible for us .NET folks.

We're going to examine a tiny console application that takes in an image and gives its best guess as to what it's an image of.

Terminology

  • Model - An AI model is a mathematical representation of "knowledge". In this case, we are using the ResNet50 model, whose knowledge is based on (that is, trained on) over a million images divided into 1000 classes, the things it knows about. The material a model is trained on limits its abilities: if ResNet50 is presented with something it doesn't know about, for instance an eyedropper, it might think it's a screwdriver.
  • ONNX - The Open Neural Network Exchange file format is a cross-platform way of representing a model. The ONNX Runtime loads and runs models stored in this format.
  • Execution Provider (EP) - The ONNX Runtime supports different plugins that allow for optimum execution on various hardware platforms. We will be using a DirectML execution provider.
  • DirectML - Much like how DirectX enabled programmers to abstract away from hardware-specifics for multimedia, DirectML does the same for AI and machine learning, allowing tasks to be run on different hardware without developers needing to know the ins and outs of each.
  • Tensor - A tensor is a multidimensional numerical array that represents something to be used in processing, whether it be text, image, sound, etc. Much of AI is, at its core, complex math done on these tensors, and that includes large language models like the ones behind ChatGPT.
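
To make the tensor idea a bit more concrete, here is a minimal sketch in plain C# (no ONNX Runtime needed). It assumes a row-major layout, the usual convention for dense tensors, and the TinyTensor class and its members are hypothetical names made up for illustration; they are not part of any library.

```csharp
using System;

// Hypothetical helper for illustration only; not part of ONNX Runtime.
public sealed class TinyTensor
{
    private readonly float[] _data;  // every value lives in one flat array
    private readonly int[] _shape;   // e.g. { 1, 3, 224, 224 }

    public TinyTensor(params int[] shape)
    {
        _shape = (int[])shape.Clone();
        int length = 1;
        foreach (int dim in _shape) length *= dim;
        _data = new float[length];
    }

    public int Length => _data.Length;

    // Convert a multidimensional index into a position in the flat array (row-major).
    public int FlatIndex(params int[] indices)
    {
        if (indices.Length != _shape.Length)
            throw new ArgumentException("Index rank must match tensor rank.");

        int flat = 0;
        for (int i = 0; i < _shape.Length; i++)
        {
            if (indices[i] < 0 || indices[i] >= _shape[i])
                throw new ArgumentOutOfRangeException(nameof(indices));
            flat = flat * _shape[i] + indices[i];
        }
        return flat;
    }

    public float this[params int[] indices]
    {
        get => _data[FlatIndex(indices)];
        set => _data[FlatIndex(indices)] = value;
    }
}
```

For example, new TinyTensor(1, 3, 224, 224) holds 1 x 3 x 224 x 224 = 150,528 floats, and the element at [0, 1, 0, 0] sits at flat position 50,176 (one full 224 x 224 plane in).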

The Console Application

This application requires only two NuGet packages:

  • Microsoft.ML.OnnxRuntime.DirectML (automatically brings in System.Memory, Microsoft.AI.DirectML and Microsoft.ML.OnnxRuntime.Managed)
  • SixLabors.ImageSharp for image manipulation

Program.cs

Stripped of error handling and user messaging, the main program is very basic, mostly just dealing with files. It creates an instance of the ImageClassifier class, makes it do the grunt work, and finally prints the top guess returned by the model.

var imagePath = args[0]; //the image file

var modelPath = "resnet50.onnx"; //the ONNX model
var labelsPath = "labels.txt"; //the list of labels the ONNX model was trained on

var classifier = new ImageClassifier(modelPath, labelsPath);
var result = classifier.Predict(imagePath);

Console.WriteLine($"This is most likely: {result}");        

ImageClassifier.cs

There are three methods in this class: the constructor, the Predict method used in Program.cs, and a method that converts the image to a tensor.

Constructor

        public ImageClassifier(string modelPath, string labelsPath)
        {
            var options = new SessionOptions();
            options.AppendExecutionProvider_DML(0);

            _session = new InferenceSession(modelPath, options);
            
            _classLabels = File.ReadAllLines(labelsPath);
        }        

The constructor creates an ONNX Runtime inference session for the model. Through the session options, we tell it to use the DirectML execution provider. The 0 we pass is the device ID, which selects the primary GPU as the hardware we want to run on.

Remember how AI is basically lots of work with numbers? What better hardware to do that with than a GPU, which excels at crunching numbers? (Well, OK, better would be an NPU, a neural processing unit, now in some machines, which is dedicated hardware for this math geekery, but it is not supported in this particular version of the DirectML EP.)

Finally, it reads in the list of labels that ResNet50 was trained on, which we use later to turn the result into a name.
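
If DirectML isn't available on a machine (no suitable GPU or driver, say), creating the session can fail. Here is a hedged sketch of one way to fall back to the CPU. The SessionFactory class and its Create method are hypothetical names, not part of ONNX Runtime or the article's repo, and the exact exception type raised when DirectML is missing is an assumption. The two extra options follow the DirectML execution provider documentation, which asks for memory pattern to be disabled and execution to be sequential.

```csharp
using System;
using Microsoft.ML.OnnxRuntime;

// Hypothetical helper for illustration only.
public static class SessionFactory
{
    // Tries the DirectML execution provider first, then falls back to the default CPU provider.
    public static InferenceSession Create(string modelPath, int deviceId = 0)
    {
        try
        {
            var gpuOptions = new SessionOptions
            {
                EnableMemoryPattern = false,                  // requested by the DirectML EP docs
                ExecutionMode = ExecutionMode.ORT_SEQUENTIAL  // requested by the DirectML EP docs
            };
            gpuOptions.AppendExecutionProvider_DML(deviceId);
            return new InferenceSession(modelPath, gpuOptions);
        }
        catch (OnnxRuntimeException ex)  // assumption: DirectML failures surface as this type
        {
            Console.Error.WriteLine($"DirectML session failed ({ex.Message}); falling back to CPU.");
            return new InferenceSession(modelPath, new SessionOptions());
        }
    }
}
```

With something like this in place, the constructor could call SessionFactory.Create(modelPath) instead of building the session options itself. The trade-off is that a CPU run will be slower, but the program still works.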

Predict function

        public string Predict(string imagePath)
        {
            var inputTensor = PreprocessImage(imagePath);

            // Run inference; the name must match the input name defined in the model
            var inputs = new List<NamedOnnxValue>
            {
                NamedOnnxValue.CreateFromTensor("image_tensor", inputTensor)
            };
            using IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results = _session.Run(inputs);

            // Get the first output and copy its scores (one per class) into an array
            var output = results.First().AsTensor<float>().ToArray();

            // Find the index of the highest score
            int maxIndex = Array.IndexOf(output, output.Max());

            // Map the index to a label
            return _classLabels[maxIndex];
        }

This is where the magic happens, in three parts:

  1. Call PreprocessImage to convert the image to a tensor
  2. Pass that image tensor to the inference session (which was set up with the model in the constructor) and run it
  3. Return the label string associated with the highest-scoring entry in the inference session's output tensor.
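
Predict only reports the single best guess. If you'd like to see how confident the model is, or its runner-up guesses, a sketch like the following could help. It is plain C# and assumes the model's output may be raw scores (logits) rather than probabilities, which depends on how the model was exported. Applying softmax doesn't change which class wins, so it is safe for picking the top label either way. The Scores class and its method names are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only.
public static class Scores
{
    // Softmax turns raw scores (logits) into probabilities that sum to 1.
    public static float[] Softmax(float[] logits)
    {
        float max = logits.Max();  // subtracting the max avoids overflow in Math.Exp
        double[] exps = logits.Select(v => Math.Exp(v - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => (float)(e / sum)).ToArray();
    }

    // Returns the k most probable (label, probability) pairs, best first.
    public static (string Label, float Probability)[] TopK(float[] probabilities, string[] labels, int k)
    {
        return probabilities
            .Select((p, i) => (Label: labels[i], Probability: p))
            .OrderByDescending(t => t.Probability)
            .Take(k)
            .ToArray();
    }
}
```

Inside Predict, this could be used as Scores.TopK(Scores.Softmax(output), _classLabels, 5) to get the five most likely labels instead of just one.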

PreprocessImage function

  private DenseTensor<float> PreprocessImage(string imagePath)
  {
      const int targetWidth = 224;
      const int targetHeight = 224;
      var tensor = new DenseTensor<float>(new[] { 1, 3, targetHeight, targetWidth });

      using var image = Image.Load<Rgb24>(imagePath);
      image.Mutate(x => x.Resize(targetWidth, targetHeight));

      for (int y = 0; y < targetHeight; y++)
      {
          for (int x = 0; x < targetWidth; x++)
          {
              var pixel = image[x, y];
              tensor[0, 0, y, x] = pixel.R / 255.0f;
              tensor[0, 1, y, x] = pixel.G / 255.0f;
              tensor[0, 2, y, x] = pixel.B / 255.0f;
          }
      }

      return tensor;
  }        

One of the requirements for using the ResNet50 model is that input images need to be 224x224 (yes, that tiny!). This method resizes the input image to that size and then fills a 1x3x224x224 tensor (batch, color channel, height, width) with the R, G, and B values of each pixel, scaled from 0-255 down to the range 0 to 1.

Tensors have always sounded mysterious and inscrutable, but you can clearly see here it's just the image in numerical form.
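
One caveat worth knowing: many ResNet50 exports expect each channel to be normalized with the ImageNet mean and standard deviation after scaling to 0..1. Whether the model used in this article needs that isn't stated, so treat the following as a sketch under that assumption and check the documentation of the model you download. The ImageNetNormalization class is a hypothetical name; the mean and standard deviation values are the commonly published ImageNet statistics.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class ImageNetNormalization
{
    // Commonly used ImageNet per-channel statistics, in R, G, B order.
    // Assumption: the model was trained with these; check the model's documentation.
    private static readonly float[] Mean = { 0.485f, 0.456f, 0.406f };
    private static readonly float[] Std = { 0.229f, 0.224f, 0.225f };

    // value: raw channel byte (0..255); channel: 0 = R, 1 = G, 2 = B
    public static float Normalize(byte value, int channel)
    {
        if (channel < 0 || channel > 2)
            throw new ArgumentOutOfRangeException(nameof(channel));

        float scaled = value / 255.0f;                   // same 0..1 scaling as PreprocessImage
        return (scaled - Mean[channel]) / Std[channel];  // center and rescale per channel
    }
}
```

In PreprocessImage, the red channel line would then become tensor[0, 0, y, x] = ImageNetNormalization.Normalize(pixel.R, 0), and likewise for G and B with channels 1 and 2.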

There you have it

Let's recap.

The application:

  1. Converts the specified image into a tensor representation
  2. Runs an ONNX runtime inference session with the tensor as input (using the ResNet50 ONNX model, the DirectML execution provider, and the GPU)
  3. Prints the label that has the highest probability returned by the model.


By the way, if you want a source for models, check out Hugging Face for downloadable models, especially Microsoft's DirectML AI Hub collection (https://huggingface.co/microsoft/dml-ai-hub-models), and the GitHub Model Catalog (https://github.com/marketplace/models) for models you can run in the cloud via Azure AI.

The code is available at https://github.com/jonathankhootek/ResnetTest
