Chapter 7: The Automatic Learner
Our AI assistant is finally working! It has a brain, it has a memory, and it knows how to use them together. It can answer our personal questions based on a single fact we've taught it.
But right now, our assistant is like a student who has only ever studied one flashcard. That's not enough to pass the test. To make our assistant genuinely useful, we need to give it the ability to learn from an entire library of notes, not just one.
What happens when we have ten notes? A hundred? We can't keep running a special script for every single new fact. We need a better way. We need to teach our AI how to study.
The Plan: From One Note to Many
We're going to create a new, powerful script. Its only job is to look inside our knowledge folder, read every single note it finds, and make sure each one is stored safely in its memory bank.
We'll design it to be smart. It won't create duplicate memories. If it sees a note it has already learned, it will simply update it with the latest information. This is how our assistant will stay current.
Think of this as the "study session" script. Whenever we want our AI to learn new things, we simply run this program, and it will absorb all the new knowledge we've provided.
Before We Code: Expand the Library
Our new script won't be very impressive if it only has one file to read. Let's give our AI a few more things to learn.
Go into your knowledge folder and create two new text files, each holding one fact about you. In this chapter we'll use a note that says "My main goal right now is to build a personal AI assistant." and a note that says "I absolutely love eating pho."
Now our library has three "books" in it. It's time to build the librarian who will read them all.
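If you'd rather create the notes from code, a quick sketch like this works too. The filenames here are just examples; use any names you like, as long as they end in .txt:

```python
import os

# Make sure the knowledge folder exists.
os.makedirs("knowledge", exist_ok=True)

# Example notes; the filenames are placeholders for your own.
notes = {
    "my_goal.txt": "My main goal right now is to build a personal AI assistant.",
    "favorite_food.txt": "I absolutely love eating pho.",
}

# Write each fact into its own text file.
for filename, fact in notes.items():
    with open(os.path.join("knowledge", filename), "w", encoding="utf-8") as f:
        f.write(fact)
```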
The Code: The "Study Session" Script
This script will loop through every file in our knowledge folder and add it to our ChromaDB memory bank. We’ll use a clever command called upsert, which means: if this memory ID already exists, update it. If not, insert it. This prevents errors and keeps our memory fresh. Create a file called "load_memory.py":
import google.generativeai as genai
import os
from dotenv import load_dotenv
import chromadb

# --- Setup ---
load_dotenv()
API_KEY = os.getenv("GOOGLE_API_KEY")
if not API_KEY:
    print("Error: GOOGLE_API_KEY not found in .env file.")
    exit()

genai.configure(api_key=API_KEY)
embedding_model = 'models/text-embedding-004'

# --- Connect to ChromaDB ---
client = chromadb.PersistentClient(path="my_chroma_db")
collection = client.get_or_create_collection("personal_facts")

# --- The Learning Process ---
knowledge_folder = "knowledge"
print("Starting study session...")

for filename in os.listdir(knowledge_folder):
    # We only want to read text files
    if filename.endswith(".txt"):
        file_path = os.path.join(knowledge_folder, filename)
        print(f"Reading: {filename}...")

        with open(file_path, 'r') as f:
            knowledge_text = f.read()

        # Create the embedding for the file content
        embedding = genai.embed_content(
            model=embedding_model,
            content=knowledge_text
        )['embedding']

        # Use upsert to add or update the memory.
        # We'll use the filename as the unique ID for each memory.
        collection.upsert(
            ids=[filename],
            embeddings=[embedding],
            documents=[knowledge_text]
        )
        print(f" -> Memory for '{filename}' is stored.")

print("\nStudy session complete! I've learned everything in the knowledge folder.")
Here is the full code in "main.py", our chat script from before, which now benefits from the larger memory:
import google.generativeai as genai
import os
from dotenv import load_dotenv
import chromadb

# --- Setup ---
load_dotenv()
API_KEY = os.getenv("GOOGLE_API_KEY")
if not API_KEY:
    print("Error: GOOGLE_API_KEY not found in .env file.")
    exit()

genai.configure(api_key=API_KEY)
embedding_model = 'models/text-embedding-004'
llm = genai.GenerativeModel('gemini-2.5-flash-preview-05-20')

# --- Connect to ChromaDB ---
client = chromadb.PersistentClient(path="my_chroma_db")
try:
    collection = client.get_collection("personal_facts")
except ValueError:
    print("Error: The 'personal_facts' collection does not exist.")
    print("Please run the script from Chapter 3 to create and store your first memory.")
    exit()

print("Hello! I'm your personal AI assistant. I now have a memory of our facts.")
print("Type 'exit' or 'quit' to end the chat.")

# --- The Main Chat Loop ---
while True:
    user_question = input("\nYou: ")
    if user_question.lower() in ["quit", "exit"]:
        print("Goodbye! It was nice chatting with you.")
        break

    # --- Step 1: Look up relevant facts in our memory ---
    # First, we create an embedding for the user's question
    question_embedding = genai.embed_content(
        model=embedding_model,
        content=user_question
    )['embedding']

    # Then, we query our collection to find the most relevant memory
    results = collection.query(
        query_embeddings=[question_embedding],
        n_results=1  # We only want the single most relevant fact
    )

    # Let's get the text of the most relevant memory
    if results and results['documents'] and results['documents'][0]:
        retrieved_memory = results['documents'][0][0]
    else:
        retrieved_memory = None  # No memory was found

    # --- Step 2: Formulate the answer ---
    if retrieved_memory:
        # We found a relevant memory! Let's use it to augment our prompt.
        prompt_with_context = (
            "You are a helpful personal assistant. "
            "Based on this fact I'm providing you: "
            f"'{retrieved_memory}'"
            "\nPlease answer the following question: "
            f"'{user_question}'"
        )
        print(f"AI (thinking with memory): I found a relevant fact... '{retrieved_memory}'")
    else:
        # We didn't find a relevant memory. We'll just ask the AI the question directly.
        prompt_with_context = user_question
        print("AI (thinking): I don't have a specific memory for this, but I'll answer from my general knowledge.")

    # Now, send the (potentially augmented) prompt to the Gemini LLM
    response = llm.generate_content(prompt_with_context)
    print(f"AI: {response.text}")
Let's See It in Action
Run the new script with python load_memory.py. You should see it process each of your three text files one by one.
Now for the real test. Run your main chatbot script. Your AI's memory is now three times larger. Try asking it some new questions:
You: What is my main goal?
AI (thinking with memory): I found a relevant fact... 'My main goal right now is to build a personal AI assistant.'
AI: Your main goal is to build a personal AI assistant.
You: What kind of food do I love?
AI (thinking with memory): I found a relevant fact... 'I absolutely love eating pho.'
AI: You love eating pho.
It works! Our assistant now has a much richer understanding of us. Right now, though, it is learning from unstructured free-text notes. To make it even easier for the assistant to learn from your data, we need to give that data some structure, and that's exactly what we'll explore in the next chapter.
If you’re enjoying this series and find it helpful, I'd love for you to like and share it! Thank you so much!
First AI Journey:
Table of Contents
Part 1: The First Spark of Intelligence
Part 2: Building the Memory System
Part 3: Creating a True Assistant
Part 4: The Responsible AI Owner
Part 5: The Next Evolution: AI Agents