Deploying AI with Docker and Cloud Run
In the rapidly advancing world of technology, the combination of Artificial Intelligence (AI) and cloud platforms is reshaping how applications are developed and deployed. This article will take you through the process of building a simple Generative AI project with Google’s Gemini model, containerizing it with Docker, and deploying it to Google Cloud Run. We’ll explore each phase, from local setup to cloud deployment, with a focus on using command-line tools and understanding the underlying architecture.
At a high level, the architecture is simple: a Flask application that calls the Gemini API, packaged into a Docker container, and served by Google Cloud Run.
Prerequisites
Before we dive in, make sure you have Python, Docker, and the Google Cloud CLI (gcloud) installed. Let’s start by checking our installations:
python --version
docker --version
gcloud --version
If any of these commands fail, you’ll need to install the missing components. Refer to the official installation guides for Python, Docker, and the Google Cloud CLI.
Step 1: Local Development
Let’s create a simple AI project using Google’s Gemini model. We’ll build a text summarization API.
First, set up your project directory:
mkdir gemini-summarizer
cd gemini-summarizer
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
Install the required packages:
pip install flask google-generativeai
Now, let’s create our app.py:
import os
from flask import Flask, request, jsonify
import google.generativeai as genai

app = Flask(__name__)

# Configure the Gemini model
genai.configure(api_key=os.environ.get("GEMINI_API_KEY"))
model = genai.GenerativeModel('gemini-pro')

@app.route('/summarize', methods=['POST'])
def summarize():
    data = request.json
    text = data.get('text', '')
    if not text:
        return jsonify({"error": "No text provided"}), 400
    prompt = f"Summarize the following text in one paragraph:\n\n{text}"
    response = model.generate_content(prompt)
    return jsonify({"summary": response.text})

if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))
This script creates a Flask app with a /summarize endpoint that uses the Gemini model to summarize provided text.
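With the server running locally (and GEMINI_API_KEY exported in your shell), you can exercise the endpoint with curl. The sample text below is just a placeholder:

```shell
# In one terminal, start the app (requires a valid API key):
#   export GEMINI_API_KEY=your_api_key
#   python app.py
# In another terminal, send a summarization request:
curl -X POST http://localhost:8080/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "Docker packages applications into containers. Cloud Run runs containers serverlessly."}'
```

A successful call returns a JSON body of the form {"summary": "..."}; a request without a text field returns a 400 error.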
Step 2: Containerization with Docker
Now that we have our application working locally, let’s containerize it using Docker. Docker allows us to package our application and its dependencies into a standardized unit called a container.
Create a Dockerfile in your project directory:
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
Create a requirements.txt file:
flask
google-generativeai
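Unpinned requirements work for a demo, but builds are more reproducible if you pin versions. One way is to capture exactly what your virtual environment has installed:

```shell
# Overwrite requirements.txt with the exact versions currently installed
pip freeze > requirements.txt
```

Rebuilding the image from this pinned file yields the same dependency versions every time.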
Now, let’s build our Docker image:
docker build -t gemini-summarizer .
To run the container locally:
docker run -p 8080:8080 -e GEMINI_API_KEY=your_api_key gemini-summarizer
Step 3: Deploying to Google Cloud Run
Google Cloud Run is a fully managed compute platform that automatically scales your stateless containers. It’s an excellent choice for deploying containerized applications like ours.
First, authenticate with Google Cloud and set your active project:
gcloud auth login
gcloud config set project YOUR_PROJECT_ID
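Before Docker can push to gcr.io, it needs Google credentials. A one-time command registers gcloud as a Docker credential helper:

```shell
# Configure Docker to authenticate to Google registries via gcloud
gcloud auth configure-docker
```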
Tag your Docker image and push it to Google Container Registry:
docker tag gemini-summarizer gcr.io/[YOUR_PROJECT_ID]/gemini-summarizer
docker push gcr.io/[YOUR_PROJECT_ID]/gemini-summarizer
Next, deploy the application to Cloud Run. Pass your Gemini API key as an environment variable so the deployed container can call the model (for production, prefer Secret Manager over plain environment variables):
gcloud run deploy gemini-ai-service \
--image gcr.io/[YOUR_PROJECT_ID]/gemini-summarizer \
--platform managed \
--region us-central1 \
--set-env-vars GEMINI_API_KEY=your_api_key \
--allow-unauthenticated
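Once the deploy finishes, gcloud prints the service URL. You can also look it up afterwards and test the live endpoint (your URL will differ from deployment to deployment):

```shell
# Fetch the deployed service's URL
SERVICE_URL=$(gcloud run services describe gemini-ai-service \
  --region us-central1 --format 'value(status.url)')

# Call the deployed summarizer
curl -X POST "$SERVICE_URL/summarize" \
  -H "Content-Type: application/json" \
  -d '{"text": "Cloud Run scales containers automatically, including down to zero."}'
```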
How Docker and Cloud Run Work Together
Docker and Cloud Run work together seamlessly: Docker packages the application and its dependencies into a portable image, and Cloud Run pulls that image and runs it as a fully managed service. Cloud Run automatically handles the infrastructure, scaling, and load balancing for your application. It only runs containers when there are requests, scaling to zero when there’s no traffic, which can significantly reduce costs.
Conclusion
This article walked through taking a Gemini-powered summarization API from local development to cloud deployment using Docker and Google Cloud Run. By containerizing with Docker, developers ensure consistency across environments, so the application runs the same way wherever it is deployed.
Moreover, Cloud Run offers automatic scalability and cost-effectiveness, charging only for the resources consumed. This means developers can focus on building innovative applications without worrying about infrastructure management. By leveraging these powerful technologies, teams can streamline their development processes and enhance overall productivity.