Deploying AI with Docker and Cloud Run
In the rapidly advancing world of technology, the combination of Artificial Intelligence (AI) and cloud platforms is reshaping how applications are developed and deployed. This article will take you through the process of building a simple Generative AI project with Google’s Gemini model, containerizing it with Docker, and deploying it to Google Cloud Run. We’ll explore each phase, from local setup to cloud deployment, with a focus on using command-line tools and understanding the underlying architecture.
At a high level, the architecture is simple: a Flask application that calls the Gemini API, packaged into a Docker container, and served by Google Cloud Run.
Prerequisites
Before we dive in, make sure you have Python, Docker, and the Google Cloud CLI (gcloud) installed. Let’s start by checking our installations:
python --version
docker --version
gcloud --version
If any of these commands fail, you’ll need to install the missing components. Refer to the official installation guides for Python, Docker, and the Google Cloud CLI.
Step 1: Local Development
Let’s create a simple AI project using Google’s Gemini model. We’ll build a text summarization API.
First, set up your project directory:
mkdir gemini-summarizer
cd gemini-summarizer
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
Install the required packages:
pip install flask google-generativeai
Now, let’s create our app.py:
import os
from flask import Flask, request, jsonify
import google.generativeai as genai

app = Flask(__name__)

# Configure the Gemini model
genai.configure(api_key=os.environ.get("GEMINI_API_KEY"))
model = genai.GenerativeModel('gemini-pro')

@app.route('/summarize', methods=['POST'])
def summarize():
    data = request.json
    text = data.get('text', '')
    if not text:
        return jsonify({"error": "No text provided"}), 400
    prompt = f"Summarize the following text in one paragraph:\n\n{text}"
    response = model.generate_content(prompt)
    return jsonify({"summary": response.text})

if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))
This script creates a Flask app with a /summarize endpoint that uses the Gemini model to summarize provided text.
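With the server running locally (and GEMINI_API_KEY exported in your shell), you can exercise the endpoint with curl. The sample text below is just a placeholder:

```shell
# In one terminal, start the app (requires a valid API key):
#   export GEMINI_API_KEY=your_api_key
#   python app.py
# In another terminal, send a summarization request:
curl -X POST http://localhost:8080/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "Docker packages applications into containers. Cloud Run runs containers serverlessly."}'
```

A successful call returns a JSON body of the form {"summary": "..."}; a request without a text field returns a 400 error.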
Step 2: Containerization with Docker
Now that we have our application working locally, let’s containerize it using Docker. Docker allows us to package our application and its dependencies into a standardized unit called a container.
Create a Dockerfile in your project directory:
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
Create a requirements.txt file:
flask
google-generativeai
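Unpinned requirements work for a demo, but builds are more reproducible if you pin versions. One way is to capture exactly what your virtual environment has installed:

```shell
# Overwrite requirements.txt with the exact versions currently installed
pip freeze > requirements.txt
```

Rebuilding the image from this pinned file yields the same dependency versions every time.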
Now, let’s build our Docker image:
docker build -t gemini-summarizer .
To run the container locally:
docker run -p 8080:8080 -e GEMINI_API_KEY=your_api_key gemini-summarizer
Step 3: Deploying to Google Cloud Run
Google Cloud Run is a fully managed compute platform that automatically scales your stateless containers. It’s an excellent choice for deploying containerized applications like ours.
First, authenticate with Google Cloud and set your active project:
gcloud auth login
gcloud config set project YOUR_PROJECT_ID
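Before Docker can push to gcr.io, it needs Google credentials. A one-time command registers gcloud as a Docker credential helper:

```shell
# Configure Docker to authenticate to Google registries via gcloud
gcloud auth configure-docker
```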
Tag your Docker image and push it to Google Container Registry:
docker tag gemini-summarizer gcr.io/[YOUR_PROJECT_ID]/gemini-summarizer
docker push gcr.io/[YOUR_PROJECT_ID]/gemini-summarizer
Next, deploy the application to Cloud Run. Pass your Gemini API key as an environment variable so the deployed container can call the model (for production, prefer Secret Manager over plain environment variables):
gcloud run deploy gemini-ai-service \
--image gcr.io/[YOUR_PROJECT_ID]/gemini-summarizer \
--platform managed \
--region us-central1 \
--set-env-vars GEMINI_API_KEY=your_api_key \
--allow-unauthenticated
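Once the deploy finishes, gcloud prints the service URL. You can also look it up afterwards and test the live endpoint (your URL will differ from deployment to deployment):

```shell
# Fetch the deployed service's URL
SERVICE_URL=$(gcloud run services describe gemini-ai-service \
  --region us-central1 --format 'value(status.url)')

# Call the deployed summarizer
curl -X POST "$SERVICE_URL/summarize" \
  -H "Content-Type: application/json" \
  -d '{"text": "Cloud Run scales containers automatically, including down to zero."}'
```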
How Docker and Cloud Run Work Together
Docker and Cloud Run work together seamlessly: Docker packages the application and its dependencies into a portable image, and Cloud Run pulls that image and runs it as a fully managed service. Cloud Run automatically handles the infrastructure, scaling, and load balancing for your application. It only runs containers when there are requests, scaling to zero when there’s no traffic, which can significantly reduce costs.
Conclusion
This article walked through taking a Gemini-powered summarization API from local development to cloud deployment using Docker and Google Cloud Run. By containerizing with Docker, developers ensure consistency across environments, so the application runs the same way wherever it is deployed.
Moreover, Cloud Run offers automatic scalability and cost-effectiveness, charging only for the resources consumed. This means developers can focus on building innovative applications without worrying about infrastructure management. By leveraging these powerful technologies, teams can streamline their development processes and enhance overall productivity.