How It Works
This page gives a plain-English overview of the technology behind VoiceOtaku. We won't go into implementation details, but you'll get a clear picture of the pipeline that turns your voice into a personalised anime recommendation.
The Pipeline
When you speak during a call, your voice goes through a four-step process:
Your voice → Transcription → AI thinking → Voice synthesis → Your ears
1. Voice Capture
Your browser records audio from your microphone in short chunks while you're speaking. The recording runs locally in your browser — audio is sent to the server only when you finish speaking a segment.
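The buffering described above can be sketched roughly as follows. This is an illustration only, not VoiceOtaku's code: `SegmentBuffer` and the `upload` callback are hypothetical names, and chunks are assumed to arrive as byte arrays (in a browser they might come from a recorder's data events, but this page does not say which API is used).

```typescript
// Hypothetical sketch: keep audio chunks in the browser and hand them off
// only when a spoken segment ends. Not VoiceOtaku's actual implementation.

type Upload = (segment: Uint8Array) => void;

class SegmentBuffer {
  private chunks: Uint8Array[] = [];
  private upload: Upload;

  // `upload` stands in for whatever sends audio to the server (assumed).
  constructor(upload: Upload) {
    this.upload = upload;
  }

  // Called for each short chunk while the user is speaking.
  // Nothing leaves the browser at this point.
  addChunk(chunk: Uint8Array): void {
    this.chunks.push(chunk);
  }

  // Called when the user finishes a segment: join the chunks and send once.
  endSegment(): void {
    if (this.chunks.length === 0) return; // nothing recorded, nothing sent
    const total = this.chunks.reduce((n, c) => n + c.length, 0);
    const segment = new Uint8Array(total);
    let offset = 0;
    for (const c of this.chunks) {
      segment.set(c, offset);
      offset += c.length;
    }
    this.chunks = [];
    this.upload(segment);
  }
}
```

The point of the design is that `addChunk` never touches the network; only `endSegment` triggers an upload, and an empty segment sends nothing.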
2. Speech-to-Text (Transcription)
The recorded audio is transcribed into text by a speech recognition model. This is the text you see appearing on screen as you speak.
3. AI Recommendation
The transcribed text is passed to a large language model (LLM) that acts as a knowledgeable anime advisor. It reads your full conversation history for context and generates a thoughtful, relevant recommendation.
The AI is designed to:
- Draw on a wide knowledge of anime titles, genres, studios, and themes
- Understand nuanced preferences ("something melancholy but hopeful")
- Ask clarifying questions when your request is ambiguous
- Remember what you said earlier in the same call
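One way to picture "remembering what you said earlier" is a per-call list of turns that is sent to the model in full on every request. The sketch below assumes the common `system` / `user` / `assistant` chat convention; `CallHistory`, `SYSTEM_PROMPT`, and the prompt wording are hypothetical and not taken from VoiceOtaku.

```typescript
// Hypothetical sketch of per-call conversation memory.
type Role = "system" | "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

// Assumed wording; the real advisor prompt is not documented on this page.
const SYSTEM_PROMPT =
  "You are a knowledgeable anime advisor. " +
  "Ask a clarifying question when the request is ambiguous.";

class CallHistory {
  private turns: Message[] = [];

  // Store what the user said (the transcribed text).
  addUserTurn(transcript: string): void {
    this.turns.push({ role: "user", content: transcript });
  }

  // Store what the advisor replied.
  addAdvisorTurn(reply: string): void {
    this.turns.push({ role: "assistant", content: reply });
  }

  // Messages for the next model request: the system prompt first,
  // then every earlier turn of this call in order.
  toMessages(): Message[] {
    return [{ role: "system", content: SYSTEM_PROMPT }, ...this.turns];
  }
}
```

Because the history lives in an object tied to a single call, discarding that object when the call ends is enough to forget the conversation.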
4. Text-to-Speech (Voice Synthesis)
The AI's text response is converted back into natural-sounding speech. The audio is streamed to your browser and played automatically, so you hear the recommendation while its text appears on screen.
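Taken together, the server-side steps form a simple chain: audio in, text, reply text, audio out. The sketch below shows that shape only. The `Stages` interface, `runTurn`, and the stage names are hypothetical, and the actual speech, language, and voice services are not named on this page.

```typescript
// Hypothetical stage signatures; each would wrap a real service in practice.
interface Stages {
  transcribe: (audio: Uint8Array) => Promise<string>; // speech-to-text
  recommend: (transcript: string) => Promise<string>; // LLM advisor
  synthesize: (text: string) => Promise<Uint8Array>;  // text-to-speech
}

interface TurnResult {
  transcript: string; // what you said, shown on screen
  reply: string;      // the advisor's text
  audio: Uint8Array;  // the spoken reply, played in the browser
}

// Run one spoken segment through the chain, strictly in order,
// because each stage consumes the previous stage's output.
async function runTurn(audio: Uint8Array, stages: Stages): Promise<TurnResult> {
  const transcript = await stages.transcribe(audio);
  const reply = await stages.recommend(transcript);
  const speech = await stages.synthesize(reply);
  return { transcript, reply, audio: speech };
}
```

Passing the stages in as functions keeps the chain independent of any particular provider, which also makes it easy to exercise with stub stages.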
The Queue
VoiceOtaku processes one call at a time. All queuing happens server-side using a fast in-memory store. Your place in the queue is tracked by a temporary session token that lives only for the duration of your visit. See The Queue System for user-facing details.
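As a rough model of "one call at a time", the sketch below keeps a single active token and a first-in, first-out waiting list. It is an assumption-laden illustration: `CallQueue` and its methods are hypothetical, a plain in-process array stands in for the in-memory store, and token expiry is left out.

```typescript
// Hypothetical in-memory call queue: one active call, FIFO waiting list.
class CallQueue {
  private active: string | null = null;
  private waiting: string[] = [];

  // Join with a session token. Returns 0 if the call can start now,
  // otherwise the 1-based place in line. Joining twice changes nothing.
  join(token: string): number {
    const existing = this.position(token);
    if (existing !== -1) return existing;
    if (this.active === null) {
      this.active = token;
      return 0;
    }
    this.waiting.push(token);
    return this.waiting.length;
  }

  // 0 = on the call, n >= 1 = place in line, -1 = unknown token.
  position(token: string): number {
    if (this.active === token) return 0;
    const i = this.waiting.indexOf(token);
    return i === -1 ? -1 : i + 1;
  }

  // Remove a token (call ended or visitor left). If the active caller
  // leaves, the next person in line is promoted.
  leave(token: string): void {
    if (this.active === token) {
      this.active = this.waiting.shift() ?? null;
    } else {
      this.waiting = this.waiting.filter((t) => t !== token);
    }
  }
}
```

For example, if tokens `a`, `b`, and `c` join in that order, `a` is on the call and `b` and `c` hold places 1 and 2; when `a` leaves, `b` moves onto the call and `c` moves up to place 1.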
Privacy
- No accounts required. VoiceOtaku does not require you to log in or register.
- No persistent storage of conversations. Your transcript is not saved after the call ends.
- Temporary session tokens. Any tokens used to manage your queue position expire automatically.
- Microphone access is limited. Your browser only grants microphone access while you are actively on a call.
