Lyra by Roxabi — Product Intelligence

Customer Personas & Voice Guide

Four distinct users. Their language, goals, and the exact moment they find Lyra.

Alex Chen
Senior Backend Engineer · Berlin · 32
Privacy-first · Self-hosted · OSS
"I'm not feeding proprietary client code into someone else's black box. Full stop."
Tech comfort
Expert — runs k8s at home
Self-host?
Yes — prefers it
Current tool
Local Ollama + cobbled shell scripts
Main channel
Telegram (work phone)

Day in the Life

07:30 · Opens Telegram on the commute — wants a brief on overnight CI failures
09:00 · Deep work: reviewing architecture for a client microservice (NDA-level code)
12:30 · Asks AI to compare two approaches — pastes proprietary schema, hesitates, closes the tab
16:00 · Gets pinged about an outage — wants AI help diagnosing logs, but they contain PII
20:00 · Reads HN thread about another AI company's data breach, screenshots it, shares it internally

Pain Points

  • Every useful AI tool requires sending data to external servers he doesn't control
  • Cloud AI retains conversation history by default — he's read the ToS
  • Cobbled Ollama setup has no memory, no channels, restarts constantly
  • Team keeps using ChatGPT for client work — he's the only one uncomfortable with it
  • Can't recommend any cloud AI to the client without a legal review first

Goals

  • A personal AI that runs on his hardware — auditable, inspectable, his
  • Persistent memory of his preferences, codebases, recurring patterns
  • Accessible from Telegram without going through a cloud middleman
  • Easy model swap when a better local LLM ships
  • Confidence that client code never leaves the machine

Objections

  • "How much setup is this actually? I've been burned by self-hosted projects before."
  • "Does it work on my existing RTX 3060 or do I need new hardware?"
  • "If the Anthropic API is the default, that defeats the purpose for me."
  • "Is this maintained? Or will it be abandoned in 6 months?"

Technical Comfort

Linux/Server
Docker/k8s
Python
LLM/AI ops

Why Lyra

Lyra runs entirely on Alex's own hardware — no cloud dependency for inference, no data leaving the machine. The architecture is ~300 lines of auditable Python, not a black box. He can read every routing decision, every memory write. The TOML config means he can swap models without touching code. Phase 2 gives him full local inference, eliminating Anthropic as a dependency. He installs it on a Saturday afternoon, reads the architecture doc over coffee, and trusts what he reads.
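The model swap Alex cares about can be sketched as a config edit rather than a code change. This is a hypothetical example — the actual Lyra config keys are not shown in this document and are assumptions here:

```toml
# lyra.toml — hypothetical sketch; key names are assumptions, not Lyra's real schema
[model]
provider = "ollama"                      # swap "anthropic" for "ollama" without touching code
name     = "llama3.1:8b"                 # any model the local runtime serves
endpoint = "http://localhost:11434"      # stays on his own machine
```

Under this sketch, adopting a better local LLM means editing `name` and restarting — no Python touched, nothing leaves the box.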

Sofia Rossi
Indie Hacker & Consultant · Lisbon · 29
Solo operator · Async work · Multi-context
"I don't need a smarter search engine. I need something that actually knows what I'm working on right now."
Tech comfort
Moderate — can SSH, run scripts
Self-host?
Willing if setup is clear
Current tools
ChatGPT Plus + Notion + Zapier
Main channel
Telegram (always open)

Day in the Life

08:00 · Three projects active simultaneously — client consulting, SaaS build, personal newsletter
09:30 · Starts a ChatGPT conversation — spends 5 minutes re-explaining context from yesterday
11:00 · Gets a client question on WhatsApp — needs to check notes across 3 tools to answer
14:00 · Writing a proposal — ChatGPT doesn't know her tone, her previous proposals, her client
22:00 · Works from the couch — wants AI on Telegram, not opening a laptop just to ask a question

Pain Points

  • Every AI conversation starts from zero — no memory of who she is or what she's building
  • ChatGPT doesn't know her voice, her clients, or her projects
  • Context switching between tools (Notion, email, Slack, ChatGPT) destroys focus
  • Managing 3+ concurrent projects with zero AI continuity is exhausting
  • AI tools are desktop-first; she works everywhere including mobile

Goals

  • An AI that remembers: her active projects, her tone, her recurring clients
  • Message from Telegram without context setup every single time
  • Unified command surface — search notes, get summaries, draft messages
  • Something that feels like a reliable assistant, not a stateless chatbot
  • Runs 24/7 — not dependent on a laptop being open

Objections

  • "I'm not a developer. Will I need to touch Python config files?"
  • "What does 'always-on hardware' mean? I don't have a server at home."
  • "How is this different from just using Claude in Telegram directly?"
  • "I already pay for ChatGPT Plus. Is this worth the added complexity?"

Technical Comfort

Linux/Server
Terminal/CLI
APIs/Webhooks
AI/LLM

Why Lyra

Lyra's 5-level memory means Sofia stops re-explaining herself. Session memory keeps context alive during a conversation; episodic memory recalls what she worked on last week; semantic memory searches her vault of saved articles and notes. Her Telegram conversation with Lyra is the unified command surface she doesn't have today. She can /add a client email, /search her notes, and continue a half-finished draft — from her phone, on a train, at 10pm.

Marcus Webb
Independent Researcher & Analyst · London · 37
Knowledge vault · Deep research · Information overload
"I've saved 6,000 links since 2019. I can't find anything I've read. The knowledge is useless if I can't recall it."
Tech comfort
High — writes Python scripts
Self-host?
Yes — already runs a NAS
Current stack
Obsidian + Readwise + Raindrop + ChatGPT
Main channel
Telegram + Discord

Day in the Life

07:00 · Morning reading: 40+ tabs of research papers, articles, Twitter threads
09:00 · Writing a report — remembers reading something relevant 3 months ago, can't find it
11:00 · Tries Readwise → Raindrop → Obsidian → Google (own bookmarks) — finds it in 25 minutes
15:00 · Sends 3 URLs to himself on Telegram to "read later" — they join 400 others, never read
20:00 · Wants to query his reading history: "what have I saved about central bank policy?" — no tool does this well

Pain Points

  • Knowledge is scattered across 5+ tools — no single queryable store
  • AI doesn't know what he's read — answers from training data, not his actual research
  • Readwise/Raindrop have no semantic search, just tags and folders
  • Context window limits mean ChatGPT can't process his full research corpus
  • Every new AI conversation starts without the 5 years of context he's accumulated

Goals

  • A queryable semantic vault of everything he's read — "what have I saved about X?"
  • AI that reasons over his notes, not just general training data
  • Instant capture from Telegram: drop a URL, get it summarized and indexed
  • Memory that compounds over time — the longer he uses it, the smarter it gets
  • Runs on his own hardware so the knowledge base doesn't disappear with a SaaS pivot

Objections

  • "How do I migrate my existing Readwise/Raindrop data into Lyra?"
  • "Is the semantic search actually good, or just keyword matching?"
  • "What happens if the embeddings model changes and my semantic index breaks?"
  • "Does this replace my Obsidian workflow or add to it?"

Technical Comfort

Linux/Server
Python
Databases
AI/Embeddings

Why Lyra

Lyra's semantic memory (SQLite + BM25 + fastembed) is exactly what Marcus has been trying to build himself for 3 years. The /add command captures any URL from Telegram — scraped, summarized, indexed in seconds. /search gives him hybrid BM25 + vector retrieval over everything he's saved. His knowledge vault compounds over time and lives on his NAS, not a SaaS product that might pivot. The AI responds from his actual research corpus, not just its training data.
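The hybrid retrieval described above can be sketched in pure Python. This is a minimal illustration of blending keyword (BM25) and vector scores — not Lyra's implementation; the corpus, weights, and stand-in similarity values are invented for the example:

```python
import math

# Toy corpus; in Lyra this would live in SQLite (an assumption for this sketch).
docs = {
    "d1": "central bank policy and interest rates",
    "d2": "asyncio concurrency patterns in python",
    "d3": "the european central bank raised rates",
}

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Minimal BM25: score each document against the query terms."""
    tokenized = {d: text.split() for d, text in docs.items()}
    avgdl = sum(len(t) for t in tokenized.values()) / len(tokenized)
    n = len(docs)
    scores = {d: 0.0 for d in docs}
    for term in query.split():
        df = sum(1 for toks in tokenized.values() if term in toks)
        if df == 0:
            continue
        idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
        for d, toks in tokenized.items():
            tf = toks.count(term)
            denom = tf + k1 * (1 - b + b * len(toks) / avgdl)
            scores[d] += idf * tf * (k1 + 1) / denom
    return scores

def hybrid_rank(query, docs, vec_scores, alpha=0.5):
    """Blend normalized BM25 with precomputed vector similarity, best first."""
    bm25 = bm25_scores(query, docs)
    top = max(bm25.values()) or 1.0
    blended = {d: alpha * (bm25[d] / top) + (1 - alpha) * vec_scores[d]
               for d in docs}
    return sorted(blended, key=blended.get, reverse=True)

# Stand-in cosine similarities a real embedding model would produce.
vec = {"d1": 0.82, "d2": 0.05, "d3": 0.77}
print(hybrid_rank("central bank policy", docs, vec))  # → ['d1', 'd3', 'd2']
```

The point of the blend is that exact keyword hits and semantic neighbors both surface: "what have I saved about X?" works even when the saved article never uses the word X verbatim.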

Yuki Tanaka
Hobbyist Maker & ML Engineer · Tokyo · 26
Builder mindset · Local LLMs · Homelab
"I want to understand every layer of my AI stack. If I can't read the source, I don't trust it."
Tech comfort
Expert — trains models, writes CUDA
Self-host?
Only option he'd consider
Current stack
Ollama + raw Python scripts + llama.cpp
Main channel
Discord (community first)

Day in the Life

06:00 · Before work: experimenting with a new 7B model fine-tune on his RTX 4090
12:00 · Lunch: reading the Lyra architecture PR on GitHub — evaluating before trying
18:00 · After-work sprint: wants to add a Discord bot that routes to his local model, writes 200 lines of glue code
21:00 · Realizes his glue code doesn't handle concurrent users or restarts, gets frustrated
23:00 · Posts in r/LocalLLaMA: "anyone have a clean asyncio bot setup that's actually production-ready?"

Pain Points

  • Every open-source bot is either too simple (no concurrency) or too complex (LangChain hell)
  • Concurrency bugs are hard — his bots break under concurrent requests
  • No clean extension model — adding a skill means modifying the core
  • Frameworks abstract too much — he wants to understand what's happening
  • No memory across sessions — every conversation restarts from zero

Goals

  • A production-ready async foundation he can actually read and extend
  • Clean adapter pattern — add Telegram, Discord, Matrix without touching the hub
  • Local LLM support first, cloud as fallback
  • Memory system he can inspect and extend (not a black box vector DB)
  • Weekend project that becomes a real daily tool

Objections

  • "Is this actually minimal or just calling itself minimal? I'll read the source before trusting."
  • "The Anthropic API default makes me skeptical — is local-first a real priority?"
  • "300 lines of hub sounds great but how much is the rest? Show me the total."
  • "Will this support Ollama / llama.cpp endpoints properly?"

Technical Comfort

Python/asyncio
CUDA/GPU
LLM/Inference
MLOps

Why Lyra

Lyra is the clean foundation Yuki keeps trying to build himself on weekends. The hub is ~300 lines of readable asyncio Python — he reads the architecture doc in 20 minutes and understands every routing decision. The extension model (adapters, agents, skills) is the separation he needs to add his Discord adapter without breaking Telegram. Phase 2's Ollama support is directly on the roadmap. The 24 ADRs tell him why each decision was made, not just what was decided. He forks it on day one.
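The hub/adapter split that attracts Yuki can be sketched in a few lines of asyncio. This is a hypothetical illustration — the class names and method signatures are assumptions, not Lyra's actual API:

```python
import asyncio

class Hub:
    """Routes inbound messages to one handler, regardless of channel."""
    def __init__(self, handler):
        self.handler = handler

    async def dispatch(self, channel, user, text):
        # Each message is awaited independently; adapters never block each other.
        reply = await self.handler(text)
        return f"[{channel}:{user}] {reply}"

class EchoAdapter:
    """Stand-in for a Telegram/Discord adapter: it only feeds the hub."""
    def __init__(self, hub, channel):
        self.hub = hub
        self.channel = channel

    async def receive(self, user, text):
        return await self.hub.dispatch(self.channel, user, text)

async def main():
    async def handler(text):
        await asyncio.sleep(0)  # placeholder for an LLM call
        return text.upper()

    hub = Hub(handler)
    tg = EchoAdapter(hub, "telegram")
    dc = EchoAdapter(hub, "discord")
    # Concurrent users on two channels, one hub, no per-channel glue code.
    out = await asyncio.gather(
        tg.receive("alex", "hello"),
        dc.receive("yuki", "ping"),
    )
    print(out)  # prints ['[telegram:alex] HELLO', '[discord:yuki] PING']

asyncio.run(main())
```

The design point is the one the personas keep circling: adding a third channel means writing another small adapter, not modifying the hub.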


Alex Chen

Privacy-Conscious Developer

Pain Language

  • "I don't trust sending client code to OpenAI"
  • "My conversations are training their model"
  • "Every useful tool leaks data by default"
  • "I've read the ToS. It's not good."
  • "My local setup is a mess of bash scripts"
  • "No memory between sessions anyway"

Desire Language

  • "I want full control over my AI stack"
  • "I want to audit what it does with my data"
  • "I want to swap models without breaking everything"
  • "Something I can actually read the source of"
  • "Runs on my hardware, stays on my hardware"

Communities

  • Hacker News (lurker, occasional poster)
  • r/LocalLLaMA
  • r/selfhosted
  • Privacy-focused dev Discord servers
  • GitHub trending — weekly check

Trigger Events

  • Another AI company data breach in the news
  • Client asks: "is our code safe in ChatGPT?"
  • HN thread: "here's what OpenAI stores"
  • Team uses GPT-4 for an NDA project — he notices
  • Weekend with nothing to do — wants to build a proper setup

Sofia Rossi

Solo Operator

Pain Language

  • "I have to re-explain everything every time"
  • "It doesn't know my clients, my tone, my projects"
  • "ChatGPT is a stranger who needs context every morning"
  • "I'm switching between 6 tools just to answer one question"
  • "My AI tools are desktop-only and I'm always on mobile"

Desire Language

  • "I want an AI that actually knows me"
  • "I want it to remember our last conversation"
  • "I want one place for everything — not 5 apps"
  • "I want a reliable AI teammate, not a tool"
  • "Available on Telegram, all day, no setup"
  • "An assistant that compounds knowledge over time"

Communities

  • Twitter/X — indie hacker circles
  • r/Entrepreneur
  • Indie Hackers forum
  • Telegram productivity groups
  • Make / Zapier communities

Trigger Events

  • Spends 20 minutes re-explaining a project to ChatGPT
  • Misses something because it was in a different tool
  • Sees "AI with memory" on Product Hunt — tries it, disappointed
  • Overwhelmed by context switching between projects
  • Colleague recommends "self-hosted AI" — wants to understand

Marcus Webb

Knowledge Worker & Researcher

Pain Language

  • "I have 6,000 bookmarks and can't find anything"
  • "My knowledge is trapped in different silos"
  • "ChatGPT answers from its training data, not my research"
  • "Tags and folders don't scale to 5 years of reading"
  • "The knowledge I've accumulated is basically useless"
  • "I'm paying for 4 tools that don't talk to each other"

Desire Language

  • "I want to query everything I've ever read"
  • "I want an AI that knows my research corpus"
  • "One drop point for URLs, automatically indexed"
  • "Semantic search across my own knowledge base"
  • "A second brain that actually works"
  • "Knowledge that compounds the longer I use it"

Communities

  • r/PKM (Personal Knowledge Management)
  • Obsidian Discord
  • Hacker News — "Ask HN: how do you manage reading?"
  • Twitter/X — PKM researchers
  • r/LocalLLaMA — RAG threads

Trigger Events

  • Spends 30 minutes finding a link he saved 3 months ago
  • Readwise raises prices again — evaluates alternatives
  • HN thread: "building my own RAG pipeline" — considers it
  • Writing a report — knows he read the perfect source, can't find it
  • Tries a "second brain" app — it doesn't have semantic search

Yuki Tanaka

Hobbyist Maker & Tinkerer

Pain Language

  • "Every open-source bot is either too basic or LangChain spaghetti"
  • "My bots break under concurrent requests every time"
  • "No clean extension model — I have to fork the core"
  • "Frameworks abstract too much, I can't debug anything"
  • "I spend the weekend building what should already exist"
  • "No memory = no point, it's just a chatbot"

Desire Language

  • "I want a clean asyncio foundation I can read"
  • "Proper concurrency, not bolted-on threading"
  • "Clean adapter pattern — add channels without touching the hub"
  • "Local LLM first, cloud fallback"
  • "Something I can fork and extend my way"
  • "A weekend project that becomes a real daily tool"

Communities

  • r/LocalLLaMA (active contributor)
  • r/MachineLearning
  • GitHub — stars, watches, forks
  • Discord — ML Japan community
  • Twitter/X — AI researchers, Andrej Karpathy

Trigger Events

  • r/LocalLLaMA post: "clean Python bot that handles concurrency?"
  • His asyncio bot breaks at 3 concurrent users — again
  • Sees the Lyra architecture doc on HN — reads the ADRs, impressed
  • New 7B model drops — wants to wire it into a useful interface
  • Weekend project frustration — builds the same thing for the 4th time
Jobs to Be Done — functional and emotional dimensions
Alex Chen
Privacy-Conscious Developer
Functional
When I need AI assistance on client code, I want to query a local AI without data leaving my machine, so I can get help without violating client data agreements or my own privacy principles.
Emotional
When I use AI tools at work, I want to feel confident that I control my data, so I can stop the background anxiety of "what are they doing with this?"
Sofia Rossi
Solo Operator
Functional
When I start a new AI conversation, I want to continue exactly where I left off — with full context of my projects, clients, and tone, so I can stop wasting 10 minutes per session re-explaining who I am.
Emotional
When I'm working across three projects simultaneously, I want to feel like I have a reliable teammate who knows what I'm doing, so I can stop feeling like I'm managing everything alone.
Marcus Webb
Knowledge Worker
Functional
When I'm writing a report or researching a topic, I want to semantically query everything I've ever saved — articles, papers, notes, so I can find the source I know exists without spending 30 minutes hunting across tools.
Emotional
When I save something to read later, I want to feel confident it will be findable and useful in a year, so I can stop the nagging feeling that I'm collecting knowledge I'll never be able to use.
Yuki Tanaka
Hobbyist Maker
Functional
When I want to build a personal AI with my own models and channels, I want to start from a clean, auditable asyncio foundation with a real extension model, so I can stop rewriting the same concurrency and routing logic every weekend project.
Emotional
When I spend a weekend building AI infrastructure that breaks under load, I want to feel like I'm standing on a solid foundation someone else figured out, so I can focus my energy on the interesting parts — the agents, the skills, the models.