Four distinct users: their language, their goals, and the exact moment each of them finds Lyra.
Lyra runs entirely on Alex's own hardware — no cloud dependency for inference, no data leaving the machine. The architecture is ~300 lines of auditable Python, not a black box. He can read every routing decision, every memory write. The TOML config means he can swap models without touching code. Phase 2 gives him full local inference, eliminating Anthropic as a dependency. He installs it on a Saturday afternoon, reads the architecture doc over coffee, and trusts what he reads.
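To make the "swap models without touching code" claim concrete, here is a minimal sketch of what a config-driven model swap could look like. The TOML keys and model identifiers are placeholders, not Lyra's actual schema:

```python
# Sketch of config-driven model selection; key names are hypothetical.
import tomllib

CONFIG_EXAMPLE = """
[models]
chat = "example-cloud-model"        # swap to a local model here, no code change
embeddings = "example-embed-model"  # model backing semantic memory
"""

config = tomllib.loads(CONFIG_EXAMPLE)
print("hub will route chat turns to:", config["models"]["chat"])
```

For Alex, the point is that changing this one line in `config.toml` is the entire migration path when Phase 2's local inference lands.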
Lyra's 5-level memory means Sofia stops re-explaining herself. Session memory keeps context alive during a conversation; episodic memory recalls what she worked on last week; semantic memory searches her vault of saved articles and notes. Her Telegram conversation with Lyra is the unified command surface she doesn't have today. She can /add a client email, /search her notes, and continue a half-finished draft — from her phone, on a train, at 10pm.
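A rough sketch of what layered recall looks like from Sofia's side, using hypothetical class and field names (Lyra's real memory levels and interfaces may differ):

```python
# Hypothetical illustration of layered recall: check the cheapest layer first,
# fall through to deeper ones. Names and lookup logic are simplified.
from dataclasses import dataclass, field


@dataclass
class MemoryStack:
    session: list[str] = field(default_factory=list)   # current conversation
    episodic: list[str] = field(default_factory=list)  # what she worked on last week
    semantic: list[str] = field(default_factory=list)  # saved articles and notes

    def recall(self, query: str) -> list[str]:
        for layer in (self.session, self.episodic, self.semantic):
            hits = [item for item in layer if query.lower() in item.lower()]
            if hits:
                return hits
        return []


memory = MemoryStack(
    session=["Drafting the client proposal intro"],
    episodic=["Last Tuesday: outlined pricing tiers"],
    semantic=["Saved article: scoping fixed-fee projects"],
)
print(memory.recall("pricing"))
```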
Lyra's semantic memory (SQLite + BM25 + fastembed) is exactly what Marcus has been trying to build himself for 3 years. The /add command captures any URL from Telegram — scraped, summarized, indexed in seconds. /search gives him hybrid BM25 + vector retrieval over everything he's saved. His knowledge vault compounds over time and lives on his NAS, not a SaaS product that might pivot. The AI responds from his actual research corpus, not just its training data.
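A simplified sketch of the hybrid retrieval described above, using SQLite FTS5 for BM25 and fastembed for dense vectors. The schema, fusion weights, and corpus are placeholders; Lyra's actual implementation may differ:

```python
# Hybrid BM25 + vector retrieval over a toy corpus (illustrative only).
import sqlite3

import numpy as np
from fastembed import TextEmbedding

docs = [
    "Notes on BM25 ranking and term saturation",
    "Summary: building a personal knowledge vault on a NAS",
    "Hybrid search combines lexical and vector retrieval",
]

# Lexical side: FTS5 virtual table; bm25() returns lower-is-better scores.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE vault USING fts5(content)")
db.executemany("INSERT INTO vault(content) VALUES (?)", [(d,) for d in docs])

# Dense side: embed the corpus once, the query at search time.
embedder = TextEmbedding()  # defaults to a small BGE model
doc_vecs = np.array(list(embedder.embed(docs)))


def search(query: str, k: int = 3) -> list[tuple[str, float]]:
    # BM25 scores, negated so higher is better, keyed by document text.
    rows = db.execute(
        "SELECT content, -bm25(vault) FROM vault WHERE vault MATCH ?", (query,)
    ).fetchall()
    bm25 = dict(rows)

    # Cosine similarity against the query embedding.
    q = np.array(list(embedder.embed([query])))[0]
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))

    # Naive fusion: weighted sum of the two signals (weights are placeholders).
    fused = [(d, 0.5 * bm25.get(d, 0.0) + 0.5 * float(s)) for d, s in zip(docs, sims)]
    return sorted(fused, key=lambda x: x[1], reverse=True)[:k]


print(search("vector search"))
```

This is the shape of what /search does for Marcus: a lexical pass for exact terms, a dense pass for paraphrases, and a merged ranking over everything he has ever /add-ed.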
Lyra is the clean foundation Yuki keeps trying to build himself on weekends. The hub is ~300 lines of readable asyncio Python — he reads the architecture doc in 20 minutes and understands every routing decision. The extension model (adapters, agents, skills) is the separation he needs to add his Discord adapter without breaking Telegram. Phase 2's Ollama support is directly on the roadmap. The 24 ADRs tell him why each decision was made, not just what it is. He forks it on day one.
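The adapter seam Yuki cares about can be sketched as a small contract the hub depends on, so a new surface never touches the Telegram path. Names below are hypothetical, not Lyra's actual interfaces:

```python
# Minimal sketch of an adapter contract and a new surface plugging into it.
import asyncio
from typing import Protocol


class Adapter(Protocol):
    """A chat surface the hub can receive messages from and reply to."""

    async def listen(self) -> str: ...
    async def send(self, text: str) -> None: ...


class DiscordAdapter:
    """Yuki's hypothetical addition: same contract, different transport."""

    async def listen(self) -> str:
        await asyncio.sleep(0)  # stand-in for a real Discord gateway read
        return "hello from discord"

    async def send(self, text: str) -> None:
        print(f"[discord] {text}")


async def hub_loop(adapter: Adapter) -> None:
    # The hub only knows the Adapter contract; adding Discord does not
    # change a line of the Telegram adapter.
    message = await adapter.listen()
    await adapter.send(f"echo: {message}")


asyncio.run(hub_loop(DiscordAdapter()))
```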