Huginn
An iOS email client that reads your inbox aloud.
The Idea
Your inbox shouldn't require your eyes. Huginn is a Gmail client for iOS built around a simple idea: you should be able to listen to your emails while commuting, cooking, or doing anything else with your hands and eyes busy.
The name comes from Huginn, one of Odin's two ravens, whose name means "thought".
What It Does
Listen to Your Email
Huginn generates natural-sounding audio from your emails using on-device speech synthesis. No audio is sent to or generated in the cloud — everything happens on your iPhone.
- Chunked streaming playback — audio starts playing before the full email is synthesized, so there's minimal wait time.
- Queue system — add multiple emails to a listening queue and let them play through like a podcast playlist.
- Persistent playback — pause, leave the app, come back later. Your position and queue are saved.
- Lock screen controls — play, pause, skip forward/back without opening the app.
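As a rough illustration of persistent playback, the queue and current position could be captured in a small `Codable` snapshot and written to disk as JSON. This is a minimal sketch under that assumption, not Huginn's actual persistence code; `QueueState`, its properties, and the two helper functions are hypothetical names.

```swift
import Foundation

// Hypothetical snapshot of the listening queue; all names are illustrative.
struct QueueState: Codable, Equatable {
    var emailIDs: [String]        // emails waiting to be played, in order
    var currentIndex: Int         // index of the email currently playing
    var positionSeconds: Double   // playback offset within that email

    // Move to the next email and reset the offset.
    // Returns false when already at the end of the queue.
    @discardableResult
    mutating func skipForward() -> Bool {
        guard currentIndex + 1 < emailIDs.count else { return false }
        currentIndex += 1
        positionSeconds = 0
        return true
    }
}

// Write the snapshot to disk so it survives the app being suspended or terminated.
func save(_ state: QueueState, to url: URL) throws {
    let data = try JSONEncoder().encode(state)
    try data.write(to: url, options: .atomic)
}

// Read the snapshot back on the next launch.
func loadQueueState(from url: URL) throws -> QueueState {
    let data = try Data(contentsOf: url)
    return try JSONDecoder().decode(QueueState.self, from: data)
}
```

Saving after every pause or skip keeps the file in step with what the listener last heard, at the cost of a small write each time.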
Choose How It Sounds
- Multiple voice models — download and switch between different TTS voices.
- Per-sender voices — assign specific voices to specific senders so you can tell who's talking just by listening.
- Multi-speaker model support — models with multiple speakers let you pick the one you prefer.
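Per-sender voices amount to a lookup from sender address to voice, with a default for everyone else. The sketch below shows one way to model that; `VoiceSelection`, `VoiceDirectory`, and the idea of a numeric speaker index are assumptions made for illustration, not names from Huginn's source.

```swift
import Foundation

// Hypothetical voice choice: a model identifier plus a speaker index
// for multi-speaker models.
struct VoiceSelection: Equatable {
    var modelID: String
    var speakerID: Int = 0
}

// Hypothetical registry of per-sender voice assignments.
struct VoiceDirectory {
    var defaultVoice: VoiceSelection
    private var bySender: [String: VoiceSelection] = [:]

    init(defaultVoice: VoiceSelection) {
        self.defaultVoice = defaultVoice
    }

    // Addresses are compared case-insensitively, ignoring surrounding whitespace.
    private func key(for sender: String) -> String {
        sender.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
    }

    mutating func assign(_ voice: VoiceSelection, to sender: String) {
        bySender[key(for: sender)] = voice
    }

    // Senders without an assignment fall back to the default voice.
    func voice(for sender: String) -> VoiceSelection {
        bySender[key(for: sender)] ?? defaultVoice
    }
}
```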
A Proper Email Client
Huginn isn't just an audio player bolted onto Gmail. It's a functional email client:
- Inbox with pull-to-refresh and infinite scroll.
- Full email rendering — HTML emails display correctly via an embedded web view; plain text emails are shown natively.
- Swipe to archive — manage your inbox with familiar gestures.
- Multi-select — select multiple emails and add them all to the queue at once.
- Background queue processing — emails are fetched and synthesized even when the app is in the background.
Privacy-First
- On-device TTS — your email content never leaves your phone for audio generation.
- Minimal API scope — Huginn requests only the permissions it needs from Gmail.
- No tracking, no analytics on email content.
Tech Stack
| Layer | Technology |
|---|---|
| UI | SwiftUI, targeting iOS 26+ |
| Concurrency | Swift structured concurrency (async/await, task groups) |
| TTS Engine | Sherpa ONNX with VITS models and espeak-ng phonemization |
| Audio | AVFoundation (AVAudioPlayer, AVAudioSession) |
| HTML Rendering | WebKit (WKWebView) |
| Networking | URLSession (no third-party HTTP libraries) |
| Auth | Google Sign-In SDK, OAuth 2.0 |
| Backend Services | Firebase |
| Distribution | Fastlane for App Store deployment |
Architecture
MVVM with service injection via SwiftUI's environment system.
- ViewModels use the `@Observable` macro and are annotated `@MainActor`.
- Services are protocol-backed for testability: TTS generation, audio caching, queue persistence, and playback control are all defined behind protocols.
- Navigation is handled by a centralized `Router` observable.
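To show why protocol-backed services help with testing, here is a minimal sketch of a view model that depends on a caching protocol and can be exercised with an in-memory fake. The names `AudioCaching`, `InMemoryAudioCache`, and `EmailAudioViewModel` are hypothetical, and the sketch leaves out `@Observable`, `@MainActor`, and the SwiftUI environment wiring so that it compiles with Foundation alone.

```swift
import Foundation

// Hypothetical service protocol; Huginn's real protocol names are not shown in this README.
protocol AudioCaching {
    func audio(forEmailID id: String) -> Data?
    func store(_ audio: Data, forEmailID id: String)
}

// Test double that keeps audio in memory instead of on disk.
final class InMemoryAudioCache: AudioCaching {
    private var storage: [String: Data] = [:]
    func audio(forEmailID id: String) -> Data? { storage[id] }
    func store(_ audio: Data, forEmailID id: String) { storage[id] = audio }
}

// View model that depends on the protocol, not on a concrete cache.
final class EmailAudioViewModel {
    private let cache: any AudioCaching
    private let synthesize: (String) -> Data
    private(set) var synthesisCount = 0

    init(cache: any AudioCaching, synthesize: @escaping (String) -> Data) {
        self.cache = cache
        self.synthesize = synthesize
    }

    // Return cached audio when available; otherwise synthesize and cache it.
    func audio(forEmailID id: String, body: String) -> Data {
        if let cached = cache.audio(forEmailID: id) { return cached }
        synthesisCount += 1
        let audio = synthesize(body)
        cache.store(audio, forEmailID: id)
        return audio
    }
}
```

In a test, the synthesizer can be replaced by a closure that returns placeholder bytes, so the caching logic is verified without loading a TTS model.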
TTS Pipeline
- Email body is sanitized and split into chunks (roughly 3 sentences each).
- Each chunk is synthesized to audio on-device using Sherpa ONNX.
- Audio chunks are streamed to playback — the first chunk plays while subsequent ones are still being generated.
- Generated audio is cached on disk by email ID to avoid re-synthesis.
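The chunking step can be sketched as follows. This is an illustrative approximation, not Huginn's sanitizer: it splits on sentence-ending punctuation with a deliberately naive rule and groups the result into chunks of up to three sentences. The function names and the `sentencesPerChunk` parameter are assumptions.

```swift
import Foundation

// Split text into sentences on ".", "!" or "?" followed by whitespace or end of text.
// Deliberately naive: abbreviations such as "e.g." are treated as sentence ends.
func splitSentences(_ text: String) -> [String] {
    var sentences: [String] = []
    var current = ""
    let characters = Array(text)
    for (index, character) in characters.enumerated() {
        current.append(character)
        let isTerminator = character == "." || character == "!" || character == "?"
        let nextIsBoundary = index + 1 == characters.count
            || characters[index + 1].isWhitespace
        if isTerminator && nextIsBoundary {
            let trimmed = current.trimmingCharacters(in: .whitespacesAndNewlines)
            if !trimmed.isEmpty { sentences.append(trimmed) }
            current = ""
        }
    }
    // Keep any trailing text that lacks closing punctuation.
    let remainder = current.trimmingCharacters(in: .whitespacesAndNewlines)
    if !remainder.isEmpty { sentences.append(remainder) }
    return sentences
}

// Group sentences into chunks of at most `sentencesPerChunk` sentences.
func chunk(_ text: String, sentencesPerChunk: Int = 3) -> [String] {
    precondition(sentencesPerChunk > 0, "chunk size must be positive")
    let sentences = splitSentences(text)
    return stride(from: 0, to: sentences.count, by: sentencesPerChunk).map { start in
        let end = min(start + sentencesPerChunk, sentences.count)
        return sentences[start..<end].joined(separator: " ")
    }
}
```

Small chunks keep the delay before the first audio short, since only the first few sentences need to be synthesized before playback can begin.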
Generation is intentionally single-threaded — benchmarking showed that parallelizing inference yields negligible speed gains because the bottleneck is memory bandwidth, not compute.
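One way to combine serial generation with streaming playback is an `AsyncStream` fed by a single task, sketched below. The `audioStream` function and its `synthesize` parameter are hypothetical stand-ins; the real synthesizer call into Sherpa ONNX is not shown here.

```swift
import Foundation

// Synthesize chunks one at a time, in order, and hand each to the consumer
// as soon as it is ready. `synthesize` is a placeholder for the real TTS call.
func audioStream(
    for chunks: [String],
    synthesize: @escaping @Sendable (String) -> Data
) -> AsyncStream<Data> {
    AsyncStream { continuation in
        let task = Task {
            for chunk in chunks {
                // Stop early if the listener went away (for example, on skip).
                if Task.isCancelled { break }
                continuation.yield(synthesize(chunk))
            }
            continuation.finish()
        }
        // Cancel the generation task when the consumer stops iterating.
        continuation.onTermination = { _ in task.cancel() }
    }
}
```

A player would consume this with `for await audio in audioStream(...)`, starting playback on the first element while later chunks are still being generated.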