Agents That Remember You - Building Memory-Enabled Agents with Copilot Studio
The Future of AI Personalisation - Agents That Remember You
I’ll be the first to admit it: I’m forgetful. My wife sends me a grocery list on WhatsApp, tells me about it in person, and I still come home with missing items. When I feel I’m forgetting an item, I call her, only for her to happily remind me that the list is in my WhatsApp, just as she already told me. She’s patient with me, God bless her. But here’s the thing: as human beings, we can afford to be patient with forgetful people we love. As users of AI agents? Not so much.
This is a problem I’ve been working on with customers for a while now. Most agents today are fundamentally stateless. That’s by design, to protect the precious context window, but it means every conversation starts from zero. Every session forces users to rebuild context from scratch. And every interaction feels like meeting a stranger.
In this article, I want to share what I’ve learned about adding a memory layer to agents built specifically with Copilot Studio and Dataverse. While the pro-code world has had frameworks like Zep, Mem0, and Graphiti for a while, the low-code world has been left to look on with envy. That changes now. Everything I’m sharing here can be built today with tools you already have.
The Cost of Forgetting
One of the very first agentic scenarios I helped a client with was a market analyst agent. This team used to create monthly summaries for their executive leadership, covering priority areas and what competitors were doing. When we brought in AI, we dramatically improved the frequency and the number of topics covered. But something was lost.
Those executives missed the fact that the human analyst knew them. They knew what level of summarisation each exec preferred. They knew the usual follow-up questions. They knew whether someone wanted their report in PDF or PowerPoint. The agent didn’t know any of this. We had improved the output, but we had delivered a worse user experience.
Here’s what that looks like in practice. Take a simple IT support scenario:
A standard agent takes 15 minutes of back-and-forth; a memory-enabled agent resolves the same issue in under a minute.
On the left, the agent starts asking: what device are you on? What OS version? Which VPN client? Fifteen minutes of interrogation for a problem that recurs monthly. On the right, the agent already knows the user’s MacBook Pro, their macOS version, and their history of VPN issues, and it jumps straight to the solution in under a minute.
Familiarity matters more than raw intelligence. Anyone in sales understands this: you’re more likely to get a good outcome when dealing with someone who knows you, someone you’ve built rapport with. The same goes for tech support. If you deal with a support person you’ve had a pleasant experience with before, you’ll be more forgiving in your interaction. If we can get our agents to move from being chatbots to companions, I think that makes a very big difference.
Three Types of Memory That Matter
If you look at the literature, there are maybe seven or eight memory types. But for practical Copilot Studio implementation, three are most relevant:
Short-Term Memory is handled very well by Copilot Studio already. We’re all aware of topic variables and global variables. Within a single session, we have the tools, and they work great.

Long-Term Memory is where Copilot Studio doesn’t help out of the box. These are the preferences, the context, the past decisions that need to persist across sessions. In the pro-code world, this problem has been solved with frameworks like Zep, Mem0, and Graphiti. But in the low-code world, we don’t have that yet.

Episodic Memory captures time-stamped events: knowing not just what happened but when it happened. An agent that knows you ordered pizza every Thursday until January 5th, then switched to avocado salads after starting a keto diet, is an agent that truly understands you.

The patterns I’ll share are specifically about closing the long-term memory gap, with Pattern 5 addressing the episodic side.
Guidelines: Helpful, Not Creepy
Before jumping into patterns, there are principles I’ve made sure to follow across every implementation. First, transparency: users must know what memories you’re saving about them, and they must be able to easily access, update, or delete those memories. Second, performance: you have to watch the latency implications of adding a memory layer. Third, never save sensitive data: no passwords, no credit card numbers, no health information, even if the user shares them.
You also need intentional forgetfulness: knowing when to invalidate or refresh memories. And finally, confidence scoring: if a preference was explicitly stated by the user, that’s high confidence; if it was inferred from a conversation, that’s lower confidence. This scoring should shape how your agent responds, giving higher weight to memories with higher certainty.
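To make the confidence idea concrete, here is a minimal sketch of how an agent could prioritise memories by certainty. The two-level classification, the score values, and the field names are my assumptions for illustration, not anything Copilot Studio or Dataverse prescribes:

```python
# Illustrative confidence scores: explicitly stated preferences outrank
# inferred ones. Values and field names are assumptions for this sketch.
CONFIDENCE = {"explicit": 0.95, "inferred": 0.60}

def rank_memories(memories):
    """Order memories so the agent applies higher-certainty ones first.

    Each memory is a dict with 'key', 'value', and 'source' fields.
    """
    return sorted(
        memories,
        key=lambda m: CONFIDENCE.get(m["source"], 0.0),
        reverse=True,
    )
```

In practice, the same score can also drive tone: the agent states a high-confidence memory as fact, but phrases a low-confidence one as a question.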
The golden rule: if users feel surveilled instead of supported, you’ve lost trust.
Five Patterns for Adding Memory to Copilot Studio Agents
Here’s the core of what I want to share. These are five practical patterns, ranging from simple to sophisticated, all implementable with Copilot Studio, Dataverse, and Agent Flows. You don’t have to pick just one; in practice, you’ll likely mix them based on your use case.
Pattern 1: Direct Tool Access
Pattern 1: The agent directly calls memory tools (GetUserMemory, UpdateMemory, PurgeMemory) against Dataverse.
This is the simplest pattern, and anyone can do it with low-code. Your Copilot Studio agent has tools (Agent Flows) for reading, writing, and deleting memories, with Dataverse as the governed storage layer. When a user says “From now on, generate the report in PDF format,” the agent picks up that hint and calls UpdateMemory immediately.
The Dataverse model can be very simple: a flat key-value schema with a user ID, a memory key, a memory value, a type classification, and a confidence score. For the market analyst scenario, the canonical memory keys were things like PreferredSummaryStyle, PreferredResponseFormat, PreferredCompetitors, and PrimaryFocusAreas. Being intentional and specific about what you collect simplifies everything.
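As a sketch, one such flat record might look like this. The column names are my illustration of the schema described above, not the actual Dataverse table definition:

```python
import json

# One flat key-value memory record, mirroring the schema described above:
# user ID, memory key, memory value, type classification, confidence score.
# All column names here are illustrative.
memory_record = {
    "UserId": "user-001",
    "MemoryKey": "PreferredResponseFormat",  # a predefined canonical key
    "MemoryValue": "PDF",
    "MemoryType": "Preference",
    "ConfidenceScore": 0.95,                 # explicitly stated by the user
}

# A flat record round-trips cleanly as JSON, which keeps Agent Flow
# payloads simple to build and parse.
payload = json.dumps(memory_record)
```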
In the agent instructions, I’m very specific about how the agent handles memory. Here are actual snippets from the instructions I use. First, memory key governance: the agent must never invent new memory keys. It always maps user language to predefined canonical keys:
Agent instruction snippet: Memory Key Governance - the agent must never invent new keys, only use predefined canonical ones.
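In code terms, that governance rule behaves roughly like this sketch. The canonical keys come from the scenario above; the synonym table and matching logic are my illustration of what the instructions tell the LLM to do:

```python
# Map user language to predefined canonical keys; never invent new ones.
# The synonym table is illustrative.
CANONICAL_KEYS = {
    "summary style": "PreferredSummaryStyle",
    "report format": "PreferredResponseFormat",
    "competitors": "PreferredCompetitors",
    "focus areas": "PrimaryFocusAreas",
}

def to_canonical_key(user_phrase):
    """Return the canonical key a phrase maps to, or None if nothing
    matches; an unmatched phrase must never produce a brand-new key."""
    phrase = user_phrase.lower()
    for synonym, key in CANONICAL_KEYS.items():
        if synonym in phrase:
            return key
    return None
```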
Next, the workflow starts by loading memories. At the start of every conversation, the agent calls GetUserMemory, parses the returned JSON, extracts only the key, value, and memory type fields, ignores all Dataverse metadata, and builds an internal preference map. If no memory exists, it proceeds as a first-time interaction:
Agent instruction snippet: Step 1 - Load User Memory at conversation start.
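The load step can be pictured as a small parsing function. The function and column names here are hypothetical stand-ins for the Agent Flow response, but the filtering logic matches the instruction above:

```python
import json

def build_preference_map(raw_json):
    """Parse a GetUserMemory-style response: keep only key, value, and
    memory type; drop every other Dataverse metadata column."""
    records = json.loads(raw_json)
    return {
        r["MemoryKey"]: {"value": r["MemoryValue"], "type": r["MemoryType"]}
        for r in records
        if "MemoryKey" in r
    }

# Example response carrying metadata columns the agent must ignore.
sample = json.dumps([
    {
        "MemoryKey": "PreferredResponseFormat",
        "MemoryValue": "PDF",
        "MemoryType": "Preference",
        "modifiedon": "2025-01-05",           # Dataverse metadata, ignored
        "_ownerid_value": "guid-placeholder", # Dataverse metadata, ignored
    },
])
prefs = build_preference_map(sample)  # an empty map means a first-time user
```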
And when it’s time to save, the agent uses a strict JSON payload format through UpdateMemory. Being very prescriptive about the payload shape is important: you give the agent exactly the format, and it follows it:
Agent instruction snippet: Step 5 - Save Memory using a strict JSON payload to UpdateMemory.
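One way to picture that strictness: the payload must carry exactly the expected fields, nothing more and nothing less. A minimal validation sketch, with illustrative field names (not the actual Agent Flow contract):

```python
import json

# The exact field set the strict payload must carry (illustrative names).
REQUIRED_FIELDS = {
    "UserId", "MemoryKey", "MemoryValue", "MemoryType", "ConfidenceScore",
}

def is_valid_update_payload(payload):
    """Accept only payloads that parse as JSON and carry exactly the
    required fields; extra or missing fields are rejected."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and set(data) == REQUIRED_FIELDS
```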
Pattern 2: Dedicated Memory Manager Agent
Pattern 2: A specialised Memory Agent handles all read/write logic, freeing the primary agent to focus on business logic.
Here’s the problem with Pattern 1: Copilot Studio has a specific instruction length limit. If you consume most of that space defining how the agent should manage memory, you’re taking space away from what the agent actually needs to do, its core business logic.
So in this pattern, we create a second agent whose sole purpose is memory management. The primary agent (your market analyst, your IT support bot) focuses entirely on its domain. When it needs to retrieve or save memories, it delegates to the Memory Manager Agent. This gives you clean separation of concerns. And the real power? That same Memory Manager Agent can be reused across multiple agents, creating a unified memory knowledge store that serves your entire agent ecosystem.
Pattern 3: Asynchronous Memory Processor
Pattern 3: The agent responds immediately while a background process harvests memories from completed conversations.
Now, what if you’re already struggling with latency before adding a memory layer? Or what if you need to deal with a large volume of memories? In this pattern, the agent doesn’t save memories during the conversation at all. It responds to the user immediately -zero added latency -and fires off a background event.
A separate process (it could be an autonomous agent, a Power Automate flow, whatever you want) picks up completed conversation transcripts from a queue, goes through them with the same memory-extraction logic, and persists those memories asynchronously. This lets you do much heavier analysis, with richer memory extraction, better accuracy, proper type classification, and intentional forgetfulness, without impacting the user experience at all. And as one audience member pointed out during my session, this approach can be significantly cheaper in terms of compute costs too.
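The background loop looks roughly like this sketch. The in-memory queue, the extraction step, and the persistence hook are all stand-ins: in a real build they might be a Dataverse table, an LLM prompt, and a Power Automate flow respectively:

```python
import queue

# Completed conversation transcripts wait here until the background
# processor picks them up; the agent never blocks on this.
transcript_queue = queue.Queue()

def extract_memories(transcript):
    """Placeholder for the memory-extraction logic (an LLM pass in
    practice); here it just reads pre-tagged candidates."""
    return transcript.get("candidate_memories", [])

def drain_queue(persist):
    """Process every queued transcript off the hot path, persisting each
    extracted memory. Returns how many memories were saved."""
    saved = 0
    while not transcript_queue.empty():
        transcript = transcript_queue.get()
        for memory in extract_memories(transcript):
            persist(memory)  # a Dataverse upsert in the real flow
            saved += 1
    return saved
```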
The Hybrid Variant: In practice, it’s rarely a clean either/or choice. Some memories are time-sensitive: imagine a session dropping due to an error or disconnection. You don’t want the user to restart and feel like the agent has forgotten everything. So the hybrid model saves time-critical memories synchronously in the hot path (using the Memory Manager from Pattern 2), while sending everything else to the background processor.
Hybrid variant: Time-critical memories are saved synchronously; deeper analysis happens in the background.
Pattern 4: Proactive Feedback Loop
Pattern 4: The agent identifies new types of memories worth capturing, creating a self-improving system with human-in-the-loop approval.
This is where it gets really interesting. In this pattern, you leverage the intelligence of the large language model itself to identify new memory types that you hadn’t thought to track. At the end of each conversation, or asynchronously across a batch of conversations, the LLM analyses the interaction and suggests: “I think it would be useful to also capture X about this user.”
These suggestions go to a human in the loop, the agent owner or administrator, who approves or rejects them. Approved types get added to the canonical memory key registry, and the system dynamically evolves what it captures. It’s a self-improving system. Now, in some cases this might generate noise, so you might run it on just 1-2% of conversations. But over time, it makes your agent more perceptive about what matters to your users.
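Both halves of the loop, sampling and human approval, can be sketched in a few lines. The sample rate, function names, and decision hook are assumptions for illustration:

```python
import random

def should_analyse(rng, sample_rate=0.02):
    """Send roughly 1-2% of conversations for memory-type discovery,
    keeping suggestion noise manageable."""
    return rng.random() < sample_rate

def review_suggestions(suggestions, approve):
    """Human-in-the-loop gate: only approved memory types make it into
    the canonical key registry."""
    return [s for s in suggestions if approve(s)]
```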
Pattern 5: Graphiti MCP Server Integration
Pattern 5: Copilot Studio connects to a Graphiti MCP Server for temporal-aware, graph-based memory management.
For more advanced memory needs, particularly episodic memory with temporal awareness, the first four patterns start to hit their limits. Implementing time-aware memory tracking in Copilot Studio and Dataverse alone is a lot of work. Fortunately, the creators of Zep have open-sourced Graphiti, a library for temporal-aware context management, and they offer it through an MCP (Model Context Protocol) server.
You can deploy this MCP server and have your Copilot Studio agent push all the memory management logic to it. Similar to Pattern 2 where we had a dedicated agent for memory, here you’re offloading to an external service that handles the sophistication: dynamic knowledge graphs, temporal reasoning, automatic invalidation. Your agent stays focused on its core business. The memories live outside Dataverse in this case, managed entirely by the Graphiti server. It’s the closest thing we have in the low-code world to what pro-code developers get with Zep out of the box.
Progressive Profiling: Ask Once, Remember Forever
Across all these patterns, the UX principle is the same: progressive profiling. When a user says “Give me the executive summary version,” a well-designed agent confirms whether to remember that preference, stores it, and never asks again. Low-stakes inferences can be stored silently. High-stakes ones should be confirmed. And users should always have explicit control: “Forget that” or “Actually, I prefer X instead” should just work.
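The decision logic fits in a few lines. Which keys count as high-stakes is my assumption for illustration; in Copilot Studio this classification would live in the agent instructions:

```python
# High-stakes keys are confirmed with the user before saving; everything
# else is stored silently. The classification below is illustrative.
HIGH_STAKES_KEYS = {"PreferredCompetitors", "PrimaryFocusAreas"}

def memory_action(key, user_said_forget=False):
    """Return the action the agent should take for a candidate memory."""
    if user_said_forget:
        return "delete"              # "Forget that" must always work
    if key in HIGH_STAKES_KEYS:
        return "confirm_then_save"   # ask the user before persisting
    return "save_silently"           # low-stakes inference
```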
With custom domain-specific agents, you have an advantage that generalist tools like ChatGPT or M365 Copilot don’t: you can be intentional about exactly which memories matter for your use case. A market research agent knows to capture competitor lists and summarisation preferences. An IT support agent tracks device types and recurring issues. This specificity is a strength -embrace it.
You’re Not Building Infrastructure, You’re Configuring It
The good news is that everything you need exists today. Copilot Studio gives you memory patterns through Topics and Agent Flows, with tool-based memory access control. Dataverse provides enterprise-grade governed storage with built-in security, compliance, identity handling, and audit logging. You’re not building memory infrastructure from scratch -you’re configuring it.
My advice: start with one memory type; user preferences are the easiest win. Keep your schema flat; key-value pairs scale better than complex relational models. Design memory as a UX feature, not just a data layer. And iterate based on value signals: measure reuse rates, user satisfaction, and task completion times to guide what you remember next.
We don’t have to look with envy at the pro-code world anymore. Memory transforms agents from stateless tools into trusted digital counterparts, and with Copilot Studio and Dataverse, familiarity at scale is achievable without over-engineering. The agents that win adoption won’t be the ones with the highest benchmarks. They’ll be the ones that remember you.
In the coming few weeks, I will detail the technical implementation of each of these patterns, and I will post the solution in a GitHub repo for you to download. If you need earlier access, comment below or send me a DM and I will send you a less polished version as soon as I can.
This article is based on a session I delivered at Microsoft MCAPS Tech Connect (February 2026) on building memory-enabled agents with Copilot Studio.
#AIAgents #CopilotStudio #Dataverse #ArtificialIntelligence #Personalisation #AgenticAI #FutureOfWork #LowCode #MicrosoftAI #AgentMemory