This page explains how a privacy-first, fully offline AI assistant can remember personal facts and use them in conversation without sending any data to the cloud. We use a single question—“Do you know my name?”—to illustrate the core ideas.
User asks: “Do you know my name?”
Assistant queries memory: “Have I stored anything about the user’s name?”
A fact such as “Your name is Enrique.” is found in persistent memory.
Assistant answers: “Your name is Enrique.”
Simple on the surface—powerful underneath.
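As a rough illustration of what "persistent memory" could mean, the sketch below saves facts to a JSON file on disk and reads them back after a restart. The file name facts.json and the helper names remember and recall_all are hypothetical. A real assistant would more likely rely on its vector database's own on-disk storage.

```python
import json
from pathlib import Path
from typing import List


def remember(fact: str, folder: Path) -> None:
    """Append one fact to a JSON file inside the memory folder."""
    folder.mkdir(parents=True, exist_ok=True)
    store = folder / "facts.json"  # hypothetical file name
    facts = json.loads(store.read_text(encoding="utf-8")) if store.exists() else []
    if fact not in facts:  # avoid storing the same fact twice
        facts.append(fact)
    store.write_text(json.dumps(facts, ensure_ascii=False, indent=2), encoding="utf-8")


def recall_all(folder: Path) -> List[str]:
    """Load every stored fact, or an empty list if nothing was saved yet."""
    store = folder / "facts.json"
    return json.loads(store.read_text(encoding="utf-8")) if store.exists() else []
```

Because the facts live in an ordinary file, they survive a restart and never leave the machine.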
Sentence → Numbers
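A lightweight embedding model turns the question into a fixed-length vector of numbers, a “meaning fingerprint.” The length depends on the model: nomic-embed-text produces 768 elements, while smaller models such as e5-small produce 384.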
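To make the idea of turning a sentence into numbers concrete, here is a toy stand-in for an embedding model. It hashes character trigrams into a fixed-length vector and normalizes the result. This is not how a learned model works, and it captures spelling rather than meaning. The name toy_embed and the default length of 768 are illustrative assumptions that only show the shape of the output.

```python
import hashlib
import math
from typing import List


def toy_embed(text: str, dim: int = 768) -> List[float]:
    """Map text to a unit-length vector of `dim` floats (toy stand-in, not a real model)."""
    cleaned = " ".join(text.lower().split())
    vec = [0.0] * dim
    for i in range(len(cleaned) - 2):
        trigram = cleaned[i:i + 3]
        # A stable hash decides which slot this trigram contributes to.
        digest = hashlib.sha256(trigram.encode("utf-8")).digest()
        slot = int.from_bytes(digest[:4], "big") % dim
        vec[slot] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    # Text shorter than three characters yields an all-zero vector.
    return [x / norm for x in vec] if norm else vec


vector = toy_embed("Do you know my name?")
print(len(vector))  # 768
```

In a real assistant, a call to the chosen embedding model would replace toy_embed, and everything downstream would work with its vectors in the same way.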
Vector Search
A local vector database (e.g., ChromaDB, FAISS) compares that fingerprint with the fingerprints of all stored facts and returns the closest match.
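The comparison step can be sketched as a brute-force scan that scores every stored vector by cosine similarity and keeps the best one. The function names and the tiny hand-made vectors (including the second, made-up fact) are assumptions for illustration. Libraries such as ChromaDB and FAISS do the same job with optimized, and often approximate, index structures.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def closest_fact(query_vec: List[float], memory: List[Tuple[str, List[float]]]) -> Tuple[str, float]:
    """Return (fact, score) for the stored vector most similar to the query.

    Assumes `memory` is not empty.
    """
    return max(
        ((fact, cosine(query_vec, vec)) for fact, vec in memory),
        key=lambda pair: pair[1],
    )


# Hand-made 3-element vectors keep the example readable; real ones are much longer.
memory = [
    ("Your name is Enrique.", [0.9, 0.1, 0.0]),
    ("You live in Madrid.", [0.1, 0.9, 0.2]),  # made-up fact for contrast
]
fact, score = closest_fact([0.8, 0.2, 0.1], memory)
print(fact)  # Your name is Enrique.
```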
Prompt Assembly
The retrieved fact plus the new user message are combined into a single prompt.
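A minimal sketch of prompt assembly follows. The template wording and the name build_prompt are hypothetical. Real chat models usually expect their own chat template, so treat this only as an illustration of combining the two pieces of text.

```python
from typing import Optional


def build_prompt(user_message: str, fact: Optional[str] = None) -> str:
    """Combine a retrieved fact (if any) with the new user message."""
    parts = []
    if fact:
        parts.append(f"Known fact about the user: {fact}")
    parts.append(f"User: {user_message}")
    parts.append("Assistant:")
    return "\n".join(parts)


print(build_prompt("Do you know my name?", "Your name is Enrique."))
# Known fact about the user: Your name is Enrique.
# User: Do you know my name?
# Assistant:
```

When no fact is retrieved, the prompt simply omits the first line, so the model answers from the message alone.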
Language Model Generation
A local LLM (Llama, Mistral, etc.) receives the prompt and produces the final text reply.
All computation runs entirely on your device—PC, Mac, or even a Raspberry Pi.
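The four steps above can be wired together as in the sketch below. The embedding model, vector store, and language model are passed in as plain functions, so each can be swapped for a real local component. Every name here is hypothetical, and the three stubs only stand in for those components so that the example runs on its own.

```python
from typing import Callable, List

Vector = List[float]
FACT_PREFIX = "Known fact: "


def answer(
    user_message: str,
    embed: Callable[[str], Vector],
    search: Callable[[Vector], str],
    generate: Callable[[str], str],
) -> str:
    """Run the four steps: embed, search, assemble the prompt, generate."""
    query_vec = embed(user_message)                          # 1. sentence -> numbers
    fact = search(query_vec)                                 # 2. vector search
    prompt = f"{FACT_PREFIX}{fact}\nUser: {user_message}"    # 3. prompt assembly
    return generate(prompt)                                  # 4. language model generation


# Stubs stand in for the real components.
def fake_embed(text: str) -> Vector:
    return [float(len(text))]


def fake_search(vec: Vector) -> str:
    return "Your name is Enrique."


def fake_generate(prompt: str) -> str:
    # A real local LLM would write the reply; this stub echoes the fact line.
    return prompt.splitlines()[0][len(FACT_PREFIX):]


print(answer("Do you know my name?", fake_embed, fake_search, fake_generate))
# Your name is Enrique.
```

Nothing in this flow touches the network, which is what keeps the assistant offline.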
Meaning over wording: “What’s my name?” and “Do you remember what I’m called?” map to nearby vectors, so the assistant finds the correct fact even when the question is phrased differently.
Language-agnostic: With a multilingual embedding model, you can ask in Spanish or English and the meaning vectors remain comparable. An English-only model does not offer this.
Noise tolerant: Typos and synonyms usually shift a vector only slightly, so cosine similarity stays high where exact string matching would fail outright.
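The typo case can be demonstrated without a real model. The sketch below compares two strings by cosine similarity over character-trigram counts, a simple stand-in chosen for illustration. It shows only the noise-tolerance point. Matching synonyms or other languages needs a learned embedding model, which this is not.

```python
import math
from collections import Counter


def trigram_counts(text: str) -> Counter:
    """Count overlapping character trigrams in lowercased text."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


clean = "what is my name"
typo = "what is my nmae"

print(clean == typo)  # False: exact matching fails on a single typo
similarity = cosine(trigram_counts(clean), trigram_counts(typo))
print(round(similarity, 2))  # 0.77: still a strong match
```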
Q: Does the assistant hit the memory database for every message?
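A: Yes. For a personal store of a few thousand facts, the similarity search itself typically takes under 1 ms on a modern CPU, so the overhead is minimal. Embedding the message usually takes longer than the search.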
Q: Can I wipe everything it knows about me?
A: Delete the persistent_memory/ folder and restart the assistant. You’ll start with a clean slate.
Q: Which model creates the embeddings?
A: Any small open-source embedding model works. Popular choices include nomic-embed-text, all-MiniLM-L6-v2, and e5-small, all of which run offline.
Q: Is any data ever uploaded?
A: No. All storage and inference remain on your hardware unless you explicitly enable cloud backup.
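The memory wipe described in the answers above can also be scripted. This is a minimal sketch that assumes persistent_memory/ sits in the assistant's working directory, which the page does not specify, and wipe_memory is a hypothetical helper name. Stop the assistant before running it so no files are in use.

```python
import shutil
from pathlib import Path


def wipe_memory(folder: Path) -> bool:
    """Delete the memory folder and everything in it; return True if it existed."""
    if not folder.is_dir():
        return False
    shutil.rmtree(folder)
    return True


# Example call (commented out so the sketch deletes nothing by accident):
# wipe_memory(Path("persistent_memory"))
```

The deletion is permanent, so copy the folder elsewhere first if you might want the facts back.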
A local AI assistant turns your words into numerical fingerprints, matches them against the stored fingerprints of your personal facts, and responds, all privately, quickly, and without an internet connection.
Updated for the Streamline Core Initiative educational site – June 2025.