Agentic Memory
We’ve seen how tool calling and iterative searches over a Chroma collection can build context for an agent. While this works well for individual runs, agents start fresh each time: they repeat expensive computations, re-learn user preferences, and rediscover effective strategies they’ve already found. Agentic memory solves this by persisting data from agent runs so it can be leveraged in the future. This reduces LLM costs, personalizes the user experience, and improves agent performance over time.

Memory Records
Context engineering is both an art and a science. Your memory schema will ultimately depend on your application’s needs. In practice, however, three categories lend themselves well to most use cases:

Semantic Memory
Facts about users, processes, or domain knowledge that inform future interactions:
- User preferences: “Prefers concise responses”
- Context: “Works in marketing, needs quarterly reports”
- Domain facts: “Company fiscal year starts in April”
Procedural Memory
Patterns and instructions that guide tool selection and execution:
- “If a user asks about sales data, query the sales_summary table first”
- “For date ranges, always confirm timezone before querying”
- “Use the PDF parser for files from the legal department”
Episodic Memory
Artifacts and results from previous runs that can be reused or referenced:
- Successful query plans
- Expensive computation results
- Search results and their relevance scores
- Previous tool call sequences that worked well
Memory in an Agentic Harness
Agentic memory integrates naturally with the plan-execute-evaluate architecture we discussed in the agentic search guide. During the planning phase, retrieve memories that will help the agent construct better plans, like examples of successful plans for similar queries and facts about the user or process. During the execution phase, retrieve memories that guide tool usage:
- Procedural instructions for tool selection
- Parameter patterns that worked before
- Known edge cases to handle
During the evaluation phase, reflect on the run and decide what is worth remembering:
- Did the plan succeed? What made it work?
- What new facts did we learn?
- Should we update existing procedural knowledge?
Implementation
The best way to implement a memory store for an agent is to dedicate a Chroma collection to memory records. This gives us out-of-the-box search functionality that we can leverage: metadata filtering for types of memories, advanced search over the store, and versioning with collection forking. We can establish a simple interface for interacting with this Chroma collection.
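As a minimal sketch, a memory record can be a document (the text that gets embedded and searched) plus classification metadata, with a thin wrapper over the collection for writing and recalling. The MemoryRecord and MemoryStore names and fields are illustrative; the Collection interface below only mirrors the small subset of Chroma's JS client API that the sketch touches:

```typescript
// Illustrative memory schema; field names are assumptions, not a fixed Chroma API.
type MemoryType = "semantic" | "procedural" | "episodic";

interface MemoryRecord {
  id: string;
  content: string;    // text that gets embedded and searched over
  type: MemoryType;
  scope: "user" | "global";
  confidence: number; // 0-1: how certain the agent is about this memory
  createdAt: string;  // ISO timestamp
}

// Minimal subset of a Chroma collection's JS API (assumed shape).
interface Collection {
  add(args: {
    ids: string[];
    documents: string[];
    metadatas: Record<string, string | number | boolean>[];
  }): Promise<void>;
  query(args: {
    queryTexts: string[];
    nResults: number;
    where?: Record<string, unknown>;
  }): Promise<{ documents: (string | null)[][] }>;
}

class MemoryStore {
  constructor(private collection: Collection) {}

  // Persist a memory: the content is embedded, everything else becomes metadata.
  async write(record: MemoryRecord): Promise<void> {
    await this.collection.add({
      ids: [record.id],
      documents: [record.content],
      metadatas: [{
        type: record.type,
        scope: record.scope,
        confidence: record.confidence,
        createdAt: record.createdAt,
      }],
    });
  }

  // Semantic search over stored memories, optionally filtered by type.
  async recall(query: string, type?: MemoryType, limit = 5): Promise<string[]> {
    const result = await this.collection.query({
      queryTexts: [query],
      nResults: limit,
      where: type ? { type: { $eq: type } } : undefined,
    });
    return (result.documents[0] ?? []).filter((d): d is string => d !== null);
  }
}
```

In a real deployment, Collection would be an actual collection obtained from the Chroma client, and forking would give you versioned copies of the store.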
Memory Writing Strategies
How you write memories should be guided by how the agent will access them. A well-designed writing strategy ensures memories remain useful, accurate, and retrievable over time.

Extraction Timing
End-of-run extraction processes the entire conversation after completion. This gives full context for deciding what’s worth remembering, but delays availability until the run finishes.

Real-time extraction writes memories as the conversation progresses. This makes memories immediately available for the current run, but risks storing information that later turns out to be incorrect or irrelevant.

Async extraction queues memory writing as a background job. This keeps the agent responsive but introduces complexity around consistency: the agent might not have access to memories from very recent runs.

In practice, a hybrid approach often works best: extract high-confidence facts in real-time, and defer nuanced evaluation to end-of-run processing. You can also save memories identified in one step in the agent’s context, so they are available for downstream or long-running parallel steps.

Selectivity
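One way to sketch the hybrid approach is to route each candidate memory the moment it is identified: write high-confidence, user-stated facts immediately, and defer everything else to end-of-run processing where the full conversation is available. The candidate shape and the threshold below are illustrative assumptions:

```typescript
// Illustrative candidate shape; field names and the 0.9 threshold are assumptions.
interface CandidateMemory {
  content: string;
  confidence: number;                    // 0-1 estimate from the extractor
  source: "user" | "tool" | "inference"; // where the candidate came from
}

type ExtractionRoute = "realtime" | "end_of_run";

// User-stated, high-confidence facts are written immediately; the rest
// waits for end-of-run evaluation with full conversational context.
function routeExtraction(candidate: CandidateMemory, threshold = 0.9): ExtractionRoute {
  if (candidate.source === "user" && candidate.confidence >= threshold) {
    return "realtime";
  }
  return "end_of_run";
}
```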
Not everything is worth remembering. Storing too much creates noise that degrades retrieval quality. Consider:
- Signal strength: How confident is the agent that this information is correct? User-stated facts (“I work in marketing”) are higher signal than inferences (“they seem to prefer detailed responses”).
- Reuse potential: Will this information be useful in future runs? A user’s timezone is broadly applicable; the specific query they ran last Tuesday probably isn’t.
- Redundancy: Does this duplicate existing memories? Adding “user works in marketing” when you already have “user is a marketing manager” creates clutter without value.
A useful heuristic: if the agent would need to ask about this information again in a future run, it’s worth storing.
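These criteria can be folded into a single write-time gate. The scoring below is a toy sketch under the assumption that the signals have already been estimated (in practice by an LLM judge or embedding similarity against existing memories); the names and threshold are illustrative:

```typescript
// Toy write-time gate; the inputs would come from an LLM judge or
// embedding similarity in practice. Names and threshold are illustrative.
interface StorageSignals {
  signalStrength: number; // 0-1: user-stated facts score high, inferences low
  reusePotential: number; // 0-1: broadly applicable info scores high
  isRedundant: boolean;   // near-duplicate of an existing memory?
}

function shouldStore(signals: StorageSignals, threshold = 0.5): boolean {
  if (signals.isRedundant) return false; // duplicates add clutter, never store
  // Store only if the agent would otherwise need to re-ask for this
  // (strong signal) and is likely to need it again (reuse potential).
  return signals.signalStrength * signals.reusePotential >= threshold;
}
```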
Classification
Tag memories at write time to enable filtered retrieval. Key dimensions include:
- Type: Is this a fact (semantic), an instruction (procedural), or a past result (episodic)?
- Phase relevance: When should this memory surface—during planning, execution, or evaluation?
- Scope: Is this user-specific, or does it apply globally across all users?
- Confidence: How certain is the agent about this memory’s accuracy?
- Source: Did this come from the user directly, from a tool result, or from agent inference?
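With these dimensions written as metadata, retrieval can filter on them. Here is a sketch of building a metadata filter from the tags above; the $and/$eq/$gte operators follow Chroma's metadata-filtering syntax, while the tag names themselves are the illustrative dimensions from this section:

```typescript
// Build a Chroma-style metadata filter from classification tags.
// $and / $eq / $gte follow Chroma's filter syntax; tag names are illustrative.
interface MemoryQueryTags {
  type?: "semantic" | "procedural" | "episodic";
  phase?: "planning" | "execution" | "evaluation";
  scope?: "user" | "global";
  minConfidence?: number;
}

function buildMemoryFilter(tags: MemoryQueryTags): Record<string, unknown> | undefined {
  const clauses: Record<string, unknown>[] = [];
  if (tags.type) clauses.push({ type: { $eq: tags.type } });
  if (tags.phase) clauses.push({ phase: { $eq: tags.phase } });
  if (tags.scope) clauses.push({ scope: { $eq: tags.scope } });
  if (tags.minConfidence !== undefined) clauses.push({ confidence: { $gte: tags.minConfidence } });
  if (clauses.length === 0) return undefined;  // no filter needed
  if (clauses.length === 1) return clauses[0]; // Chroma's $and requires two or more clauses
  return { $and: clauses };
}
```

The resulting object can be passed as the where argument of a collection query to scope recall to, say, procedural memories relevant to the execution phase.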
Conflicts
New information sometimes contradicts existing memories. Your strategy might:
- Override: Replace the old memory with new information. Simple, but loses historical context.
- Version: Keep both memories with timestamps, surfacing the most recent.
- Merge: Combine old and new into a single updated memory. Requires careful prompting to avoid losing important nuance.
- Flag for review: Mark conflicting memories for human review before resolution.
- Fork: Taking advantage of Chroma’s collection forking, create a branch of the memory collection with the new information, keeping the original intact. This is particularly useful when you’re uncertain which version will perform better: you can run both branches and measure outcomes. Forking also enables rollback if new memories degrade agent performance, and can support A/B testing different memory strategies across user segments.
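As one concrete example, the version strategy can be sketched as keeping every record and surfacing only the most recent version of each fact at read time. The key field grouping versions of the same fact is an illustrative assumption:

```typescript
// Versioned memories: keep all records, surface the newest per fact.
// `key` groups versions of the same fact (e.g. "user.role"); shape is illustrative.
interface VersionedMemory {
  key: string;       // identifies which fact this record is a version of
  content: string;
  createdAt: string; // ISO timestamp; lexicographic order matches time order
}

// Return the latest version of each fact; history stays in `all` for auditing.
function currentMemories(all: VersionedMemory[]): VersionedMemory[] {
  const latest = new Map<string, VersionedMemory>();
  for (const m of all) {
    const existing = latest.get(m.key);
    if (!existing || m.createdAt > existing.createdAt) {
      latest.set(m.key, m);
    }
  }
  return [...latest.values()];
}
```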
Decay and Relevance
Memories don’t stay useful forever. Consider tracking:
- Access patterns: Memories that are frequently retrieved are proving their value. Memories never accessed may be candidates for removal.
- Recency: Recently created or accessed memories are more likely to be relevant than stale ones.
- Time-sensitivity: Some memories have natural expiration. “User is preparing for Q3 review” becomes irrelevant after Q3 ends.
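These signals can be combined into a simple relevance score for ranking or pruning memories. The half-life weighting, the access-count boost, and the field names below are illustrative assumptions, not part of Chroma:

```typescript
// Toy decay score: exponential recency decay, boosted by access count,
// with hard expiration. All weights and field names are illustrative.
interface MemoryUsage {
  lastAccessedAt: number; // epoch ms
  accessCount: number;
  expiresAt?: number;     // epoch ms; absent means no natural expiration
}

function relevanceScore(m: MemoryUsage, now: number, halfLifeDays = 30): number {
  if (m.expiresAt !== undefined && now >= m.expiresAt) return 0; // expired, prune
  const ageDays = (now - m.lastAccessedAt) / (1000 * 60 * 60 * 24);
  const recency = Math.pow(0.5, ageDays / halfLifeDays); // halves every halfLifeDays
  const usage = Math.log1p(m.accessCount);               // diminishing returns on access count
  return recency * (1 + usage);
}
```

Memories whose score falls below some floor can be archived or deleted on a periodic maintenance pass.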
Example: An Inbox Processing Agent
In the Chroma Cookbooks repo, we feature a simple example using agentic memory. The project includes an inbox-processing agent, which fetches unread emails from a user’s inbox and processes each one according to user-defined rules. If the agent does not know how to process a given email, it will prompt the user for instructions. These instructions are then extracted from the run and persisted in the agent’s memory collection as procedural memory records, which can be used in future runs. The project is accompanied by a dataset of mock emails on Chroma Cloud. You can mark an “email” as “unread” by setting a record’s unread metadata field to true.
The project includes an InboxService interface, which defines the actions the agent can take on a user’s inbox, along with an implementation for interacting with the mock dataset on Chroma Cloud. You can extend the agent by providing your own implementation for a real email provider.
The project uses the same generic agentic harness we introduced for the agentic search project. This time, the harness is configured with:
- A planner that simply fetches unread emails, and creates a plan step for processing each one.
- Data shapes and prompts to support the inbox-processing functionality.
- An input-handler to get email-processing instructions from the user.
- A memory implementation that exposes search tools over the memory collection, and memory extraction logic for persisting user-defined rules.
1. Sign up for a Chroma Cloud account.
2. Use the “Create Database” button on the top right of the Chroma Cloud dashboard, and name your DB agentic-memory (or any name of your choice). If you’re a first-time user, you will be greeted with the “Create Database” modal after creating your account.
3. Choose the “Load sample dataset” option, and then choose the “Personal Inbox” dataset. This will copy the data into a collection in your own Chroma DB.
4. Once your collection loads, choose the “Settings” tab. At the bottom of the page, choose the .env tab. Create an API key, and copy the environment variables you will need for running the project: CHROMA_API_KEY, CHROMA_TENANT, and CHROMA_DATABASE.
5. Clone the Chroma Cookbooks repo.
6. Navigate to the agentic-memory directory, and create a .env file at its root with the values you obtained in the previous step.
7. To run this project, you will also need an OpenAI API key. Set it in your .env file.
8. This project uses pnpm workspaces. In the root directory, install the dependencies.
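For reference, the finished .env should contain your Chroma Cloud credentials alongside your OpenAI key; the values below are placeholders to replace with your own:

```shell
# agentic-memory/.env -- placeholder values, replace with your own
CHROMA_API_KEY=your-chroma-api-key
CHROMA_TENANT=your-tenant-id
CHROMA_DATABASE=agentic-memory
OPENAI_API_KEY=your-openai-api-key
```

With the .env in place, running pnpm install from the repo root pulls the dependencies for every workspace package.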
Some example instructions you can try giving the agent:
- Archive all GitHub notifications
- Label all emails from dad with the “family” label