Chapter 7: The Automatic Learner
Our AI assistant is finally working! It has a brain, it has a memory, and it knows how to use them together. It can answer our personal questions based on a single fact we've taught it.
But right now, our assistant is like a student who has only ever studied one flashcard. That's not enough to pass the test. To make our assistant genuinely useful, we need to give it the ability to learn from an entire library of notes, not just one.
What happens when we have ten notes? A hundred? We can't keep running a special script for every single new fact. We need a better way. We need to teach our AI how to study.
The Plan: From One Note to Many
We're going to create a new, powerful script. Its only job is to look inside our knowledge folder, read every single note it finds, and make sure each one is stored safely in its memory bank.
We'll design it to be smart. It won't create duplicate memories. If it sees a note it has already learned, it will simply update it with the latest information. This is how our assistant will stay current.
Think of this as the "study session" script. Whenever we want our AI to learn new things, we simply run this program, and it will absorb all the new knowledge we've provided.
Before We Code: Expand the Library
Our new script won't be very impressive if it only has one file to read. Let's give our AI a few more things to learn.
Go into your knowledge folder and create two new text files, each holding one fact about you. In this chapter we'll use a note that says "My main goal right now is to build a personal AI assistant." and a note that says "I absolutely love eating pho."
Now our library has three "books" in it. It's time to build the librarian who will read them all.
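If you'd rather create the notes from code, a quick sketch like this works too. The filenames here are just examples; use any names you like, as long as they end in .txt:

```python
import os

# Make sure the knowledge folder exists.
os.makedirs("knowledge", exist_ok=True)

# Example notes; the filenames are placeholders for your own.
notes = {
    "my_goal.txt": "My main goal right now is to build a personal AI assistant.",
    "favorite_food.txt": "I absolutely love eating pho.",
}

# Write each fact into its own text file.
for filename, fact in notes.items():
    with open(os.path.join("knowledge", filename), "w", encoding="utf-8") as f:
        f.write(fact)
```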
The Code: The "Study Session" Script
This script will loop through every file in our knowledge folder and add it to our ChromaDB memory bank. We’ll use a clever command called upsert, which means: if this memory ID already exists, update it. If not, insert it. This prevents errors and keeps our memory fresh. Create a file called "load_memory.py":
import google.generativeai as genai
import os
from dotenv import load_dotenv
import chromadb

# --- Setup ---
load_dotenv()
API_KEY = os.getenv("GOOGLE_API_KEY")
if not API_KEY:
    print("Error: GOOGLE_API_KEY not found in .env file.")
    exit()

genai.configure(api_key=API_KEY)
embedding_model = 'models/text-embedding-004'

# --- Connect to ChromaDB ---
client = chromadb.PersistentClient(path="my_chroma_db")
collection = client.get_or_create_collection("personal_facts")

# --- The Learning Process ---
knowledge_folder = "knowledge"
print("Starting study session...")

for filename in os.listdir(knowledge_folder):
    # We only want to read text files
    if filename.endswith(".txt"):
        file_path = os.path.join(knowledge_folder, filename)
        print(f"Reading: {filename}...")

        with open(file_path, 'r') as f:
            knowledge_text = f.read()

        # Create the embedding for the file content
        embedding = genai.embed_content(
            model=embedding_model,
            content=knowledge_text
        )['embedding']

        # Use upsert to add or update the memory.
        # We'll use the filename as the unique ID for each memory.
        collection.upsert(
            ids=[filename],
            embeddings=[embedding],
            documents=[knowledge_text]
        )
        print(f" -> Memory for '{filename}' is stored.")

print("\nStudy session complete! I've learned everything in the knowledge folder.")
Here is the full code in "main.py", our chat script from before, which now benefits from the larger memory:
import google.generativeai as genai
import os
from dotenv import load_dotenv
import chromadb

# --- Setup ---
load_dotenv()
API_KEY = os.getenv("GOOGLE_API_KEY")
if not API_KEY:
    print("Error: GOOGLE_API_KEY not found in .env file.")
    exit()

genai.configure(api_key=API_KEY)
embedding_model = 'models/text-embedding-004'
llm = genai.GenerativeModel('gemini-2.5-flash-preview-05-20')

# --- Connect to ChromaDB ---
client = chromadb.PersistentClient(path="my_chroma_db")
try:
    collection = client.get_collection("personal_facts")
except ValueError:
    print("Error: The 'personal_facts' collection does not exist.")
    print("Please run the script from Chapter 3 to create and store your first memory.")
    exit()

print("Hello! I'm your personal AI assistant. I now have a memory of our facts.")
print("Type 'exit' or 'quit' to end the chat.")

# --- The Main Chat Loop ---
while True:
    user_question = input("\nYou: ")
    if user_question.lower() in ["quit", "exit"]:
        print("Goodbye! It was nice chatting with you.")
        break

    # --- Step 1: Look up relevant facts in our memory ---
    # First, we create an embedding for the user's question
    question_embedding = genai.embed_content(
        model=embedding_model,
        content=user_question
    )['embedding']

    # Then, we query our collection to find the most relevant memory
    results = collection.query(
        query_embeddings=[question_embedding],
        n_results=1  # We only want the single most relevant fact
    )

    # Let's get the text of the most relevant memory
    if results and results['documents'] and results['documents'][0]:
        retrieved_memory = results['documents'][0][0]
    else:
        retrieved_memory = None  # No memory was found

    # --- Step 2: Formulate the answer ---
    if retrieved_memory:
        # We found a relevant memory! Let's use it to augment our prompt.
        prompt_with_context = (
            "You are a helpful personal assistant. "
            "Based on this fact I'm providing you: "
            f"'{retrieved_memory}'"
            "\nPlease answer the following question: "
            f"'{user_question}'"
        )
        print(f"AI (thinking with memory): I found a relevant fact... '{retrieved_memory}'")
    else:
        # We didn't find a relevant memory. We'll just ask the AI the question directly.
        prompt_with_context = user_question
        print("AI (thinking): I don't have a specific memory for this, but I'll answer from my general knowledge.")

    # Now, send the (potentially augmented) prompt to the Gemini LLM
    response = llm.generate_content(prompt_with_context)
    print(f"AI: {response.text}")
Let's See It in Action
Run the new script with python load_memory.py. You should see it process each of your three text files one by one.
Now for the real test. Run your main chatbot script. Your AI's memory is now three times larger. Try asking it some new questions:
You: What is my main goal?
AI (thinking with memory): I found a relevant fact... 'My main goal right now is to build a personal AI assistant.'
AI: Your main goal is to build a personal AI assistant.
You: What kind of food do I love?
AI (thinking with memory): I found a relevant fact... 'I absolutely love eating pho.'
AI: You love eating pho.
It works! Our assistant now has a much richer understanding of us. Right now, though, it is learning from unstructured free-text notes. To make it even easier for the assistant to learn from your data, we need to give that data some structure, and that's exactly what we'll explore in the next chapter.
If you’re enjoying this series and find it helpful, I'd love for you to like and share it! Thank you so much!
First AI Journey:
Table of Contents
Part 1: The First Spark of Intelligence
Part 2: Building the Memory System
Part 3: Creating a True Assistant
Part 4: The Responsible AI Owner
Part 5: The Next Evolution: AI Agents