Agent Memory

Learn how ThinkFleet agents remember conversations and user context using memory systems.

5 min readAI Agents

Agent Memory

Memory gives agents the ability to maintain context across conversations, remember user preferences, and recall relevant information from past interactions. ThinkFleet provides several memory mechanisms that work together.

Memory Types

Conversation History

The most basic form of memory. ThinkFleet includes recent messages from the current conversation in the agent's context window.

Configuration:

Setting Description Default
History Length Number of recent message pairs to include 20
Max Tokens Maximum tokens allocated to history 4000

When the conversation exceeds the history length, older messages are dropped from context (but remain stored in the database for recall).

Per-User Memory

Per-user memory stores persistent facts about individual users across sessions. When a user tells the agent their preferences, role, or other personal details, the agent can remember this in future conversations.

How it works:

  1. During a conversation, the agent identifies important user facts
  2. These facts are stored as key-value pairs associated with the user
  3. In future sessions, relevant facts are loaded into context

Examples of stored facts:

user_timezone: "America/New_York"
preferred_language: "English"
role: "Engineering Manager"
team: "Platform Team"
notification_preference: "Slack DM"

Semantic Search (Recall)

Semantic recall uses vector similarity search to find relevant past conversations. When a user asks a question, ThinkFleet searches the conversation history for related exchanges and includes them in context.

How it works:

  1. All conversation messages are embedded as vectors using an embedding model
  2. When a new message arrives, it is embedded and compared against stored vectors
  3. The most semantically similar past exchanges are retrieved
  4. These are included in the agent's context as "relevant history"

This enables the agent to recall details from conversations that happened weeks or months ago, even if they've fallen out of the recent history window.

Memory Architecture

User Message
    │
    ├─── Recent History (last N messages)
    │
    ├─── Per-User Facts (persistent key-value store)
    │
    ├─── Semantic Recall (vector similarity search)
    │
    └─── Knowledge Base Results (document RAG)
           │
           ▼
    Full Context → LLM → Response

All memory sources are assembled into the agent's context before the LLM processes each message.

Configuring Memory

In Agent Settings

Navigate to the Memory tab in agent settings:

  1. Conversation History

    • Toggle on/off
    • Set history length (messages)
    • Set max token budget
  2. Per-User Memory

    • Toggle on/off
    • Set max facts per user
    • Configure fact categories (optional)
  3. Semantic Recall

    • Toggle on/off
    • Set number of results to retrieve
    • Set similarity threshold (0.0 - 1.0)

Customer Support Agent

Conversation History: 20 messages
Per-User Memory: Enabled
  Max Facts: 50
Semantic Recall: Enabled
  Results: 5
  Threshold: 0.7

Good for remembering customer context, past issues, and preferences.

Quick Q&A Chatbot

Conversation History: 10 messages
Per-User Memory: Disabled
Semantic Recall: Disabled

Minimal memory for stateless FAQ-style interactions.

Executive Assistant Agent

Conversation History: 30 messages
Per-User Memory: Enabled
  Max Facts: 100
Semantic Recall: Enabled
  Results: 10
  Threshold: 0.6

Maximum context for an agent that manages schedules, preferences, and ongoing projects.

Memory and Privacy

Data Storage

  • Conversation history is stored in the project database
  • Per-user facts are stored as structured records
  • Vector embeddings are stored in pgvector (PostgreSQL)
  • All data is encrypted at rest

Data Retention

Configure retention policies per agent:

Setting Description
Conversation Retention How long to keep conversation history (default: 90 days)
Memory Retention How long to keep per-user facts (default: indefinite)

User Data Deletion

To comply with privacy regulations (GDPR, CCPA), you can delete a user's memory:

  1. Go to the agent's Memory tab
  2. Search for the user
  3. Click Delete User Memory

This removes:

  • All conversation history for that user
  • All per-user facts
  • All vector embeddings for their messages

API-Based Deletion

curl -X DELETE \
  https://your-instance.thinkfleet.com/api/v1/agents/{agentId}/memory/{userId} \
  -H "Authorization: Bearer {apiKey}"

Memory Best Practices

  1. Set appropriate history lengths — Too short and the agent loses context mid-conversation. Too long and you waste tokens on irrelevant old messages.

  2. Use semantic recall for long-running relationships — If users interact with your agent over weeks or months, semantic recall helps maintain continuity.

  3. Be explicit in system prompts about memory — Tell the agent when and how to use remembered information:

When a user mentions their preferences, remember them for future sessions.
Always check the user's timezone before suggesting meeting times.
  1. Monitor memory usage — Large per-user memory stores can impact response latency. Review and prune stored facts periodically.

  2. Test with returning users — Simulate multi-session conversations during testing to verify memory works as expected.

Troubleshooting

Agent doesn't remember previous sessions

  • Check that Per-User Memory is enabled
  • Verify the user identifier is consistent across sessions
  • Check that the max facts limit hasn't been reached

Agent recalls irrelevant information

  • Increase the similarity threshold for semantic recall
  • Reduce the number of recall results
  • Review and clean up stored facts

Memory adds too much latency

  • Reduce the number of semantic recall results
  • Lower the conversation history length
  • Disable semantic recall for time-sensitive agents

Next Steps