Chapter 7: The Automatic Learner

Our AI assistant is finally working! It has a brain, it has a memory, and it knows how to use them together. It can answer our personal questions based on a single fact we've taught it.

But right now, our assistant is like a student who has only ever studied one flashcard. That's not enough to pass the test. To make our assistant genuinely useful, we need to give it the ability to learn from an entire library of notes, not just one.

What happens when we have ten notes? A hundred? We can't keep running a special script for every single new fact. We need a better way. We need to teach our AI how to study.

The Plan: From One Note to Many

We're going to create a new, powerful script. Its only job is to look inside our knowledge folder, read every single note it finds, and make sure each one is stored safely in its memory bank.

We'll design it to be smart. It won't create duplicate memories. If it sees a note it has already learned, it will simply update it with the latest information. This is how our assistant will stay current.

Think of this as the "study session" script. Whenever we want our AI to learn new things, we simply run this program, and it will absorb all the new knowledge we've provided.

Before We Code: Expand the Library

Our new script won't be very impressive if it only has one file to read. Let's give our AI a few more things to learn.

Go into your knowledge folder and create two new text files:

  1. "project_goal.txt" - Inside this file, write: My main goal right now is to build a personal AI assistant.
  2. "favorite_food.txt" - Inside this file, write: I absolutely love eating pho.
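If you'd rather create these files from code than by hand, here's a quick sketch. It assumes your folder is named "knowledge", matching the script we're about to write:

```python
from pathlib import Path

# Create the knowledge folder if it doesn't exist yet.
knowledge = Path("knowledge")
knowledge.mkdir(exist_ok=True)

# Write the two new notes. Each file holds one plain-text fact.
(knowledge / "project_goal.txt").write_text(
    "My main goal right now is to build a personal AI assistant."
)
(knowledge / "favorite_food.txt").write_text(
    "I absolutely love eating pho."
)

print(sorted(p.name for p in knowledge.glob("*.txt")))
```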

Now our library has three "books" in it. It's time to build the librarian who will read them all.

The Code: The "Study Session" Script

This script will loop through every file in our knowledge folder and add it to our ChromaDB memory bank. We’ll use a clever command called upsert, which means: if this memory ID already exists, update it. If not, insert it. This prevents errors and keeps our memory fresh. Create a file called "load_memory.py":

import google.generativeai as genai
import os
from dotenv import load_dotenv
import chromadb

# --- Setup ---
load_dotenv()

API_KEY = os.getenv("GOOGLE_API_KEY")
if not API_KEY:
    print("Error: GOOGLE_API_KEY not found in .env file.")
    exit()

genai.configure(api_key=API_KEY)
embedding_model = 'models/text-embedding-004'

# --- Connect to ChromaDB ---
client = chromadb.PersistentClient(path="my_chroma_db")
collection = client.get_or_create_collection("personal_facts")

# --- The Learning Process ---
knowledge_folder = "knowledge"

print("Starting study session...")

for filename in os.listdir(knowledge_folder):
    # We only want to read text files
    if filename.endswith(".txt"):
        file_path = os.path.join(knowledge_folder, filename)
        
        print(f"Reading: {filename}...")
        
        with open(file_path, 'r') as f:
            knowledge_text = f.read()

        # Create the embedding for the file content
        embedding = genai.embed_content(
            model=embedding_model,
            content=knowledge_text
        )['embedding']
        
        # Use upsert to add or update the memory.
        # We'll use the filename as the unique ID for each memory.
        collection.upsert(
            ids=[filename],
            embeddings=[embedding],
            documents=[knowledge_text]
        )
        print(f"  -> Memory for '{filename}' is stored.")

print("\nStudy session complete! I've learned everything in the knowledge folder.")        
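
The key idea behind upsert is "update or insert," which behaves just like assigning into a dictionary keyed by ID. This toy sketch (plain Python, no ChromaDB or embeddings involved; the "name.txt" entry stands in for whatever your first note from Chapter 3 was called) shows why running the study session twice doesn't create duplicates:

```python
# A toy stand-in for the memory bank: a dict keyed by memory ID (the filename).
memory_bank = {}

def upsert(memory_id, document):
    # Same ID -> overwrite the old entry; new ID -> insert a new one.
    memory_bank[memory_id] = document

# First study session: three notes are inserted.
upsert("name.txt", "An example first fact.")
upsert("project_goal.txt", "My main goal right now is to build a personal AI assistant.")
upsert("favorite_food.txt", "I absolutely love eating pho.")
print(len(memory_bank))  # 3

# Second study session: same IDs, so the count stays at 3,
# and an edited note simply replaces the old version.
upsert("favorite_food.txt", "I absolutely love eating pho, especially beef pho.")
print(len(memory_bank))                  # still 3
print(memory_bank["favorite_food.txt"])  # the updated fact
```

This is exactly why the filename makes a good memory ID: editing a note and re-running the study session refreshes that one memory instead of piling up stale copies.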

For reference, here is the full code in "main.py", which reads from the same "personal_facts" collection:

import google.generativeai as genai
import os
from dotenv import load_dotenv
import chromadb

# --- Setup ---
load_dotenv()

API_KEY = os.getenv("GOOGLE_API_KEY")
if not API_KEY:
    print("Error: GOOGLE_API_KEY not found in .env file.")
    exit()

genai.configure(api_key=API_KEY)
embedding_model = 'models/text-embedding-004'
llm = genai.GenerativeModel('gemini-2.5-flash-preview-05-20')

# --- Connect to ChromaDB ---
client = chromadb.PersistentClient(path="my_chroma_db")
try:
    collection = client.get_collection("personal_facts")
except Exception:
    # Different chromadb versions raise different exception types for a
    # missing collection, so we catch broadly here.
    print("Error: The 'personal_facts' collection does not exist.")
    print("Please run load_memory.py first so the assistant has memories to search.")
    exit()

print("Hello! I'm your personal AI assistant. I now have a memory of our facts.")
print("Type 'exit' or 'quit' to end the chat.")

# --- The Main Chat Loop ---
while True:
    user_question = input("\nYou: ")

    if user_question.lower() in ["quit", "exit"]:
        print("Goodbye! It was nice chatting with you.")
        break

    # --- Step 1: Look up relevant facts in our memory ---
    # First, we create an embedding for the user's question
    question_embedding = genai.embed_content(
        model=embedding_model,
        content=user_question
    )['embedding']

    # Then, we query our collection to find the most relevant memory
    results = collection.query(
        query_embeddings=[question_embedding],
        n_results=1 # We only want the single most relevant fact
    )
    
    # Let's get the text of the most relevant memory
    if results and results['documents'] and results['documents'][0]:
        retrieved_memory = results['documents'][0][0]
    else:
        retrieved_memory = None # No memory was found

    # --- Step 2: Formulate the answer ---
    if retrieved_memory:
        # We found a relevant memory! Let's use it to augment our prompt.
        prompt_with_context = (
            "You are a helpful personal assistant. "
            "Based on this fact I'm providing you: "
            f"'{retrieved_memory}'"
            "\nPlease answer the following question: "
            f"'{user_question}'"
        )
        print(f"AI (thinking with memory): I found a relevant fact... '{retrieved_memory}'")
    else:
        # We didn't find a relevant memory. We'll just ask the AI the question directly.
        prompt_with_context = user_question
        print("AI (thinking): I don't have a specific memory for this, but I'll answer from my general knowledge.")


    # Now, send the (potentially augmented) prompt to the Gemini LLM
    response = llm.generate_content(prompt_with_context)
    
    print(f"AI: {response.text}")        

Let's See It in Action

Run the new script with python load_memory.py. You should see it process each of your three text files one by one.

Now for the real test. Run your main chatbot script. Your AI's memory is now three times larger. Try asking it some new questions:

You: What is my main goal?
AI (thinking with memory): I found a relevant fact... 'My main goal right now is to build a personal AI assistant.'
AI: Your main goal is to build a personal AI assistant.

You: What kind of food do I love?
AI (thinking with memory): I found a relevant fact... 'I absolutely love eating pho.'
AI: You love eating pho.        

It works! Our assistant now has a much richer understanding of us. Right now, though, our notes are unstructured free text. Giving that data some structure will make it even easier for the assistant to learn from, and that's exactly what we'll explore in the next chapter.

If you’re enjoying this series and find it helpful, I'd love for you to like and share it! Thank you so much!


First AI Journey:

  1. PromptOps: From YAML to AI
  2. The DevOps AI Advantage
  3. The AIOps Book


Table of Contents

Part 1: The First Spark of Intelligence

  • Chapter 1: Your First Conversation with an AI
  • Chapter 2: Building the Chatbot Shell

Part 2: Building the Memory System

  • Chapter 3: Giving Your AI Its First Memory
  • Chapter 4: The Memory Bank - Your First Vector Database
  • Chapter 5: Retrieval - Finding the Right Memory

Part 3: Creating a True Assistant

  • Chapter 6: The RAG Pipeline - How Your AI Thinks
  • Chapter 7: The Automatic Learner
  • Chapter 8: How to structure your data for training an AI Assistant
  • Chapter 9: A Friendly Face - Building a Simple Web UI

Part 4: The Responsible AI Owner

  • Chapter 10: Privacy, Security, and Future Steps

Part 5: The Next Evolution: AI Agents

  • Chapter 11: From Assistant to Agent: The Power to Act
  • Chapter 12: The Agent's Toolkit: Giving Your AI Superpowers
  • Chapter 13: The Re-Act Loop: How an Agent Thinks
  • Chapter 14: Your First Agent in Action

