Deploying a Machine Learning Model to a Serverless API Using Boto3
Fronting an AWS Lambda function with API Gateway creates an HTTP endpoint with no servers to manage (i.e., serverless). The API also scales automatically, and we are charged only for the time the function runs rather than for an always-on server (i.e., for idle time).
This tutorial demonstrates the entire process of deploying a Machine Learning (ML) model to a serverless API using boto3, run from a SageMaker Jupyter notebook. Putting it all in code allows for repeatability, transparency, version control, and explicit knowledge transfer.
Contents:
First, we will create a Docker container containing all model artifacts, code, and dependencies required to accept an event and return a response. Then, we will push the image to AWS ECR.
Next, we will create a lambda function using the image we just pushed to ECR.
Let’s import a sample payload (event) and send it to the function to generate a response.
After successfully generating a response, we will create a REST API and integrate it with the lambda function.
In order to process a POST request, we need to add a POST method.
Now, we need to integrate the lambda function with the REST API.
Let’s create a stage named “dev” and deploy.
Lastly, we will get the endpoint URL and send a POST request to it.
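The invoke URL for a REST API follows a fixed pattern, so we can build it from the API id, region, and stage rather than copying it from the console. A sketch using only the standard library for the request:

```python
import json
import urllib.request

def endpoint_url(api_id, region="us-east-1", stage="dev"):
    """Build the invoke URL from the API id, region, and stage."""
    return f"https://{api_id}.execute-api.{region}.amazonaws.com/{stage}"

def post_payload(url, payload):
    """POST a JSON payload to the endpoint and return the decoded response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```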
Using the code shown in this tutorial allows for serverless API deployment by running a Jupyter notebook rather than navigating the AWS user interface. Thus, it is repeatable, transparent, version-controlled, and makes it easy to get team members up to speed.