Mindcache Memory Runtime v0 Design Spec
Date: 2026-03-25
Status: Draft v0
Scope: Personal memory infrastructure, local-first
1. Purpose
This document defines the v0 design for Mindcache's memory runtime.
The goal is to establish a memory-native infrastructure for agents and personal AI systems that is:
- local-first
- lightweight
- zero-ops by default
- structured around memory units instead of raw conversation logs
- usable as a foundation for later agent integrations, including import/export with external systems
This is not yet a full production implementation plan for multi-user or cloud-native deployments. It is a design baseline for a robust personal memory system.
2. Problem Statement
Conventional memory patterns built on top of raw document storage, vector databases, and RAG retrieval are not sufficient for the intended Mindcache use case.
The key limitations are:
- raw messages or chunks are poor long-term memory units
- semantic retrieval alone does not model structured relationships well
- large memory graphs become too expensive to search directly for every query
- current systems usually lack explicit consolidation, fading, and multi-stage recall
- future knowledge such as calendar commitments should also be represented, not only past events
Mindcache therefore needs a dedicated memory runtime with:
- structured memory nodes
- graph-based long-term storage
- a smaller high-level topic tree view for fast recall
- nightly digestion and incremental reorganization
- support for faint memory instead of hard deletion
3. Design Principles
3.1 Graph Is Core, Tree Is View
The source of truth should be a graph-like memory structure. The tree is a derived access view optimized for fast search, navigation, and summarization.
3.2 Fact/Event-Centric Memory
The system should not treat raw files, raw chat logs, or generic chunks as the durable long-term memory unit.
The durable unit is a structured memory node that represents one of:
- a fact
- an event
- an entity
- a summary
- a future plan
3.3 Local-First and Zero-Ops
v0 should prioritize:
- local deployment
- no external infrastructure requirement
- SQLite as primary storage
- minimal runtime dependencies
3.4 Retrieval Should Be Budgeted
Not every query should search the full graph.
The runtime should support:
- fast recall over a compact topic tree
- deeper recall over the graph when needed
3.5 Sleep-Time Consolidation Is Required
Memory is not static storage. The system should periodically reorganize and compress memory to keep retrieval usable over time.
3.6 No Heavy Graph Database
v0 explicitly avoids heavy systems such as Neo4j. The architecture should remain light enough for personal local use.
4. v0 Scope
v0 is responsible for:
- importing memory-relevant data
- extracting structured memory nodes
- storing nodes and relations
- supporting exact, semantic, graph, and topic-based search
- maintaining a topic tree view
- running nightly digest and fainting logic
- importing future commitments such as calendar events
- exporting data in portable formats
v0 does not aim to solve:
- multi-user collaborative memory
- cloud synchronization
- complex permissions
- enterprise-scale graph infrastructure
- advanced UI beyond basic inspection tools
5. High-Level Architecture
The runtime has three major layers:
Layer 1: Memory Graph
This is the long-term source-of-truth memory layer.
It stores:
- memory nodes
- typed edges
- source references
- timestamps
- locations
- importance and confidence metadata
Layer 2: Topic Tree View
This is a derived structure built from the graph.
It stores:
- high-level topics
- topic hierarchy
- topic summaries
- representative anchors back into the graph
It is optimized for:
- fast recall
- rough navigation
- small search footprint
Layer 3: Recall Controller
This is the query-time orchestration layer.
It decides how to answer a query using:
- exact match
- semantic search
- graph search
- topic tree recall
The controller may:
- answer from the topic tree only
- use the topic tree to route deeper graph search
- go directly to graph retrieval when precision is needed
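As a sketch, the controller's routing choice can be reduced to a small decision function. The names, signals, and the 0.8 threshold below are illustrative assumptions, not part of the spec:

```typescript
// Hypothetical routing sketch for the recall controller.
type RecallRoute = "topic_only" | "topic_routed_graph" | "direct_graph";

interface QuerySignals {
  hasExactIdentifier: boolean; // query names a known id, date, or entity
  needsPrecision: boolean;     // caller requires detailed evidence
  topicCoverage: number;       // 0..1 estimate of how well the topic tree covers the query
}

function chooseRoute(q: QuerySignals): RecallRoute {
  // Precision requests and exact identifiers bypass the tree entirely.
  if (q.hasExactIdentifier || q.needsPrecision) return "direct_graph";
  // A well-covered query can be answered from topic summaries alone.
  if (q.topicCoverage >= 0.8) return "topic_only";
  // Otherwise use the tree to pick anchors, then search the graph.
  return "topic_routed_graph";
}
```

The real controller would likely combine more signals, but the three-way split mirrors the three behaviors listed above.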
6. Memory Node Model
Each memory unit is represented as a MemoryNode.
Suggested v0 shape:
type MemoryNode = {
  id: string
  type: "fact" | "event" | "entity" | "summary" | "plan"
  content: string
  time_start?: string
  time_end?: string
  location?: string
  entities?: string[]
  related_nodes?: RelatedEdge[]
  source_ref?: SourceRef
  confidence?: number
  importance?: number
  status: "active" | "planned" | "occurred" | "missed" | "canceled" | "faint" | "archived"
  created_at: string
  updated_at: string
}
6.1 Node Types
fact
Stable factual content extracted from source material.
Examples:
- a user preference
- a project decision
- a known historical fact from conversation or notes
event
A concrete occurrence, usually with time and optional location.
Examples:
- a meeting
- a discussion
- a trip
- a change in status
entity
A person, project, place, company, topic, or concept.
Examples:
- Mindcache
- Bohang Li
- Singapore
- sleep-time consolidation
summary
A derived abstraction used to compress lower-level memory.
Examples:
- daily digest output
- topic summary
- weekly synthesis
plan
A future-oriented memory item.
Examples:
- a calendar event
- a scheduled task
- an intended follow-up
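To make the node types concrete, here are two illustrative node values. The type follows the shape in Section 6 (edge and source fields omitted for brevity); all field values are invented examples:

```typescript
// MemoryNode as defined in Section 6 (trimmed); example values are invented.
type MemoryNode = {
  id: string;
  type: "fact" | "event" | "entity" | "summary" | "plan";
  content: string;
  time_start?: string;
  time_end?: string;
  location?: string;
  entities?: string[];
  status: "active" | "planned" | "occurred" | "missed" | "canceled" | "faint" | "archived";
  created_at: string;
  updated_at: string;
};

// A stable fact extracted from this spec's own decisions.
const designDecision: MemoryNode = {
  id: "fact-0001",
  type: "fact",
  content: "SQLite is the primary storage layer for v0.",
  entities: ["Mindcache"],
  status: "active",
  created_at: "2026-03-25T09:00:00Z",
  updated_at: "2026-03-25T09:00:00Z",
};

// A future-oriented plan node, e.g. imported from a calendar.
const upcomingReview: MemoryNode = {
  id: "plan-0001",
  type: "plan",
  content: "Design review for the memory runtime",
  time_start: "2026-04-02T10:00:00Z",
  location: "Singapore",
  status: "planned",
  created_at: "2026-03-25T09:00:00Z",
  updated_at: "2026-03-25T09:00:00Z",
};
```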
7. Edge Model
Nodes are connected by typed edges.
Suggested v0 edge shape:
type RelatedEdge = {
  node_id: string
  relation_type:
    | "about"
    | "involves"
    | "before_after"
    | "same_topic"
    | "derived_from"
    | "planned_for"
    | "occurred_after"
    | "contradicts"
}
7.1 Edge Philosophy
The system should avoid untyped generic linkage wherever possible.
Even a small relation vocabulary is better than storing only undifferentiated adjacency.
For v0, the relation set should stay intentionally small and practical.
8. Source Model
Every memory node should be traceable to an origin.
type SourceRef = {
  source_type: "markdown" | "sqlite" | "calendar" | "chat" | "import"
  source_id?: string
  source_path?: string
  external_ref?: string
}
This is important for:
- re-ingestion
- debugging
- auditability
- conflict resolution
- import/export compatibility
9. Calendar as First-Class Input
Calendar is a core v0 input source.
This matters because memory is not only retrospective.
The system should also know:
- what is planned
- what should happen in the future
- whether something later occurred, was missed, or was canceled
9.1 Calendar Import Rules
When importing calendar data:
- create plan nodes for future items
- populate time_start, time_end, and location when available
- attach related entities if detectable
- set initial status to planned
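The import rules above can be sketched as a small adapter. CalendarEntry and planFromCalendar are illustrative names (the id scheme is also an assumption), not a spec'd API:

```typescript
// Hypothetical adapter sketch: map a raw calendar entry to a "plan" node.
interface CalendarEntry {
  uid: string;
  title: string;
  start: string;      // ISO 8601
  end?: string;
  location?: string;
}

interface PlanNode {
  id: string;
  type: "plan";
  content: string;
  time_start: string;
  time_end?: string;
  location?: string;
  status: "planned";
  source_ref: { source_type: "calendar"; external_ref: string };
}

function planFromCalendar(e: CalendarEntry): PlanNode {
  return {
    id: `plan-${e.uid}`,          // id scheme is an assumption
    type: "plan",
    content: e.title,
    time_start: e.start,
    time_end: e.end,
    location: e.location,
    status: "planned",            // rule: future items start as planned
    source_ref: { source_type: "calendar", external_ref: e.uid },
  };
}
```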
9.2 Calendar Update Rules
A future plan may later transition to:
- occurred
- missed
- canceled
- archived
This allows memory to represent expectations and outcomes as part of the same long-term structure.
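One way to keep these transitions well-defined is a small legality table. The exact legal set below is a design assumption for illustration; the spec only names the states:

```typescript
// Illustrative transition table for plan status.
type PlanStatus = "planned" | "occurred" | "missed" | "canceled" | "archived";

const legalTransitions: Record<PlanStatus, PlanStatus[]> = {
  planned: ["occurred", "missed", "canceled"],
  occurred: ["archived"],
  missed: ["archived"],
  canceled: ["archived"],
  archived: [],
};

function canTransition(from: PlanStatus, to: PlanStatus): boolean {
  return legalTransitions[from].includes(to);
}
```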
10. Storage Model
SQLite is the primary storage layer in v0.
It stores:
- nodes
- edges
- source references
- topic tree nodes
- topic-to-graph anchors
- digest state
Vector search is supportive, not authoritative.
It should be used for:
- semantic similarity
- fuzzy recall
- approximate relevance
It should not define the core organization of memory.
11. Search and Recall Interfaces
v0 should expose four primary retrieval interfaces.
11.1 Exact Match
Purpose:
- exact text lookup
- time/location filtering
- entity lookup
- identifier or known-name retrieval
Example use cases:
- search by exact title
- fetch all nodes for a date
- fetch memory for a named person
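A toy in-memory sketch of exact-match filtering over nodes, by entity and date. The field names follow the MemoryNode shape; the filtering logic and filter shape are illustrative:

```typescript
// Toy exact-match filter (Section 11.1); a real implementation would use
// SQLite indexes rather than a linear scan.
interface IndexedNode {
  id: string;
  content: string;
  entities: string[];
  time_start?: string; // ISO 8601
}

function exactMatch(
  nodes: IndexedNode[],
  filter: { entity?: string; date?: string } // date as YYYY-MM-DD
): IndexedNode[] {
  return nodes.filter((n) => {
    if (filter.entity && !n.entities.includes(filter.entity)) return false;
    if (filter.date && !(n.time_start ?? "").startsWith(filter.date)) return false;
    return true;
  });
}
```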
11.2 Semantic Search
Purpose:
- fuzzy semantic recall
- conceptually similar content
- non-literal retrieval
Example use cases:
- find earlier thoughts related to "memory consolidation"
- retrieve similar past planning discussions
11.3 Graph Search
Purpose:
- traverse relationships
- recover causal or contextual neighborhoods
- inspect local subgraphs
Example use cases:
- retrieve events related to a specific project and person
- find what happened before/after an important decision
11.4 Topic Search
Purpose:
- fast recall over the topic tree
- cheap first-pass routing
- identify likely graph anchor points
Example use cases:
- find major themes from recent weeks
- determine whether a query belongs to an existing long-term topic
12. Recall Modes
The runtime supports at least two recall modes.
12.1 Fast Recall
Fast recall operates on:
- the topic tree
- topic summaries
- high-level hot/warm topics
- a small number of graph anchors
Its purpose is:
- routing
- quick orientation
- low-latency memory access
12.2 Deep Recall
Deep recall operates on:
- graph traversal
- exact match
- semantic search over graph-backed nodes
- time/entity constrained graph extraction
Its purpose is:
- precise evidence gathering
- detailed reconstruction
- deeper reasoning
The LLM or recall controller decides when to escalate from fast recall to deep recall.
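The escalation decision can be sketched as a simple check on the fast-recall result. The coverage signal and the 0.6 threshold are assumptions for illustration:

```typescript
// Sketch of the fast -> deep escalation check.
interface FastRecallResult {
  topicSummaries: string[];
  anchorNodeIds: string[];
  coverage: number; // 0..1 estimate that the tree answered the query
}

function shouldEscalate(r: FastRecallResult, needEvidence: boolean): boolean {
  if (needEvidence) return true;     // evidence gathering always needs the graph
  if (r.coverage < 0.6) return true; // tree answer too weak, go deeper
  // No summaries and no anchors means fast recall found nothing usable.
  return r.anchorNodeIds.length === 0 && r.topicSummaries.length === 0;
}
```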
13. Topic Tree View
The topic tree is a compact, rough, high-level view derived from the memory graph.
It is not the source of truth.
It exists to:
- keep fast search efficient
- reduce the need to scan the graph for every query
- maintain a manageable conceptual overview of memory
13.1 Topic Node Shape
type TopicNode = {
  id: string
  title: string
  summary: string
  parent_topic_id?: string
  child_topic_ids: string[]
  anchor_graph_nodes: string[]
  related_topics: string[]
  status: "hot" | "warm" | "cool" | "faint"
  first_seen: string
  last_updated: string
  importance: number
}
13.2 Topic Tree Properties
The topic tree should remain:
- much smaller than the full graph
- rougher and more abstract
- good enough for fast routing
A topic tree node may represent:
- a project theme
- an ongoing personal concern
- a cluster of recurring design ideas
- a long-running subject area
14. Topic Tree Maintenance Strategy
The maintenance strategy is daily incremental updates combined with periodic small-cycle restructuring.
14.1 Daily Incremental Update
Every day, the runtime should:
- inspect newly added or updated graph nodes
- group them into clusters
- generate topic candidates
- merge them into the existing topic tree
- update topic summaries and anchors
- refresh hot/warm/cool/faint status
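The hot/warm/cool/faint refresh could be driven by time since last update. The spec leaves fainting thresholds open (Section 21), so the day cutoffs below are placeholders:

```typescript
// Illustrative status refresh based on days since last update.
type TopicStatus = "hot" | "warm" | "cool" | "faint";

const DAY_MS = 24 * 60 * 60 * 1000;

function refreshStatus(lastUpdated: string, now: string): TopicStatus {
  const days = (Date.parse(now) - Date.parse(lastUpdated)) / DAY_MS;
  if (days <= 7) return "hot";    // touched within the last week
  if (days <= 30) return "warm";  // active within the last month
  if (days <= 180) return "cool"; // inactive but not forgotten
  return "faint";                 // candidate for faint transition
}
```

A real implementation would likely also weigh importance and access frequency, not recency alone.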
14.2 Periodic Small-Cycle Rebuild
On a weekly or similarly short cycle, the runtime should:
- merge duplicate or near-duplicate topics
- split overly broad topics
- rebalance tree depth and breadth
- faint stale branches
The topic tree should not be fully rebuilt from scratch every day, because that would reduce structural stability and increase cost.
15. Nightly Digest
Nightly digest is a structural maintenance process, not just a plain summary job.
15.1 Goals
- consolidate recent memory
- compress graph growth into more navigable abstractions
- maintain the topic tree
- reduce noise
- support faint memory transitions
15.2 Proposed Pipeline
Step 1: Event Grouping
Group recent nodes based on:
- time proximity
- shared entities
- shared location
- explicit graph relations
- semantic similarity
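A minimal grouping sketch, using only the shared-entity signal: cluster recent nodes that share at least one entity. Real grouping would also weigh time proximity, location, explicit relations, and semantic similarity, and the greedy pass below does not merge clusters transitively:

```typescript
// Minimal digest Step 1 sketch: entity-overlap clustering only.
interface RecentNode { id: string; entities: string[] }

function groupByEntityOverlap(nodes: RecentNode[]): string[][] {
  const clusters: { ids: string[]; entities: Set<string> }[] = [];
  for (const n of nodes) {
    // Join the first cluster sharing any entity; otherwise start a new one.
    const hit = clusters.find((c) => n.entities.some((e) => c.entities.has(e)));
    if (hit) {
      hit.ids.push(n.id);
      n.entities.forEach((e) => hit.entities.add(e));
    } else {
      clusters.push({ ids: [n.id], entities: new Set(n.entities) });
    }
  }
  return clusters.map((c) => c.ids);
}
```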
Step 2: Fact Consolidation
Within each cluster:
- deduplicate similar facts
- resolve obvious overlaps
- identify conflicts
- adjust confidence where needed
Step 3: Topic Induction
Infer high-level topic candidates from the grouped clusters.
Step 4: Topic Merge
Compare new topic candidates against the existing topic tree:
- merge into an existing topic
- create a new branch
- attach as a child topic
Step 5: Fainting and Cooling
Reduce maintenance priority for low-value or long-inactive regions.
Step 6: Fast Index Refresh
Refresh summaries, anchors, and the topic-level search surface used for fast recall.
16. Faint Memory
Nodes should not be deleted aggressively. Instead, old or low-priority memory can transition into a faint state.
16.1 Why Faint Memory Exists
It preserves:
- historical detail
- possible later reactivation
- cheaper long-term storage than full active maintenance
It avoids:
- premature deletion
- loss of latent context
- overloading fast recall with stale material
16.2 Faint Node Behavior
Faint nodes:
- are excluded from most default fast recall paths
- are updated less frequently
- may keep only minimal summary/index participation
- remain reachable in deep recall
- may return to active states if re-accessed
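The behaviors above can be sketched as visibility rules plus an on-access reactivation hook. Names are illustrative:

```typescript
// Sketch of faint-node behavior from Section 16.2.
interface Recallable { id: string; status: "active" | "faint" | "archived" }

function visibleInFastRecall(n: Recallable): boolean {
  return n.status === "active"; // faint and archived stay out of fast paths
}

function visibleInDeepRecall(_n: Recallable): boolean {
  return true; // deep recall can still reach faint (and archived) nodes
}

function onAccess(n: Recallable): Recallable {
  // Re-accessing a faint node may return it to an active state.
  return n.status === "faint" ? { ...n, status: "active" } : n;
}
```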
16.3 Faint Topic Behavior
Topic nodes may also become faint when:
- they have not been updated for a long period
- they are no longer central to recent memory
- they retain archival value but low current relevance
17. Import Model
The runtime should support pluggable input adapters.
The ingestion boundary is not the architectural core, but it must be flexible.
v0 target import sources:
- markdown files
- sqlite exports
- calendar data
- future external systems such as OpenClaw exports
General pipeline:
raw source -> extraction -> memory nodes -> graph edges -> digest integration
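A toy extraction sketch for the markdown adapter, covering only the first two pipeline stages: each top-level bullet becomes a candidate fact node with a source reference. Real extraction would use an LLM or richer parsing; this only illustrates the pipeline shape:

```typescript
// Toy markdown adapter: bullets become candidate "fact" nodes.
interface CandidateNode {
  type: "fact";
  content: string;
  source_ref: { source_type: "markdown"; source_path: string };
}

function extractFromMarkdown(text: string, path: string): CandidateNode[] {
  return text
    .split("\n")
    .filter((line) => line.startsWith("- ")) // keep only top-level bullets
    .map((line) => ({
      type: "fact" as const,
      content: line.slice(2).trim(),
      source_ref: { source_type: "markdown" as const, source_path: path },
    }));
}
```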
18. Export Model
The runtime should support export in portable formats.
v0 targets:
- JSON export
- Markdown export
- SQLite backup/export
Goals:
- portability
- user ownership
- offline inspection
- compatibility with future systems
19. Implementation Direction
Rust is the preferred implementation language.
Reasons:
- strong performance
- excellent SQLite integration
- low deployment footprint
- good support for local CLIs and services
- suitable for a long-lived infrastructure core
20. Suggested v0 Success Criteria
v0 is successful if it can:
- run locally with minimal setup
- import markdown and calendar inputs
- create and persist memory nodes and edges
- support exact, semantic, graph, and topic recall
- maintain a topic tree for fast recall
- run nightly digest incrementally
- support faint memory transitions
- export data cleanly
21. Open Questions for Next Spec
This document intentionally leaves several details for the next layer of technical design:
- exact SQLite schema
- edge indexing strategy
- vector index layout
- topic induction algorithm details
- fainting thresholds
- recall controller API contract
- import adapter interfaces
22. Recommended Next Documents
The next documents to write are:
- Mindcache Memory Runtime v0: Data Model and SQLite Schema
- Mindcache Memory Runtime v0: Nightly Digest and Topic Induction
- Mindcache Memory Runtime v0: Recall Controller and Search APIs
23. Summary
Mindcache Memory Runtime v0 is a local-first memory system built around a simple but durable idea:
- the graph stores the real memory
- the topic tree gives the system a compact rough map of that memory
- nightly digest keeps both layers usable over time
- faint memory preserves long-term detail without overwhelming active recall
This architecture is intentionally lightweight, practical, and suitable for a first personal memory infrastructure before expanding to collaborative or cloud-based versions.