Huginn

The Idea

Your inbox shouldn't require your eyes. Huginn is a Gmail client for iOS built around a simple idea: you should be able to listen to your emails while commuting, cooking, or doing anything else with your hands and eyes busy.

Huginn is named after one of Odin's two ravens, the one that represents thought.

What It Does

Listen to Your Email

Huginn generates natural-sounding audio from your emails using on-device speech synthesis. No audio is sent to or generated in the cloud — everything happens on your iPhone.

Chunked streaming playback — audio starts playing before the full email is synthesized, so there's minimal wait time.
Queue system — add multiple emails to a listening queue and let them play through like a podcast playlist.
Persistent playback — pause, leave the app, come back later. Your position and queue are saved.
Lock screen controls — play, pause, skip forward/back without opening the app.
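
As a rough illustration of persistent playback, the queue and listening position could be snapshotted to a small JSON file and restored on launch. This is a minimal sketch, not the app's actual storage code: `PlaybackState`, `PlaybackStateStore`, and every field name are assumptions made for the example.

```swift
import Foundation

// Hypothetical snapshot of the listening queue; field names are illustrative.
struct PlaybackState: Codable, Equatable {
    var queuedEmailIDs: [String]
    var currentIndex: Int
    var positionSeconds: Double
}

// Hypothetical store that persists the snapshot as JSON at a given file URL.
struct PlaybackStateStore {
    let fileURL: URL

    // Writes atomically so a crash mid-write cannot leave a half-written file.
    func save(_ state: PlaybackState) throws {
        let data = try JSONEncoder().encode(state)
        try data.write(to: fileURL, options: .atomic)
    }

    // Returns nil when nothing has been saved yet or the file cannot be decoded.
    func load() -> PlaybackState? {
        guard let data = try? Data(contentsOf: fileURL) else { return nil }
        return try? JSONDecoder().decode(PlaybackState.self, from: data)
    }
}
```

Saving on every pause and on backgrounding, then calling `load()` at startup, would be one way to get the "come back later" behavior described above.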

Choose How It Sounds

Multiple voice models — download and switch between different TTS voices.
Per-sender voices — assign specific voices to specific senders so you can tell who's talking just by listening.
Multi-speaker model support — models with multiple speakers let you pick the one you prefer.
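
Per-sender voices amount to a lookup from sender address to a voice choice, with a fallback to the default. The sketch below assumes a voice is identified by a model ID plus a speaker index (for multi-speaker models); `VoiceSelection`, `VoiceDirectory`, and the model names in the usage note are hypothetical.

```swift
import Foundation

// Hypothetical voice choice: which model to load and which speaker within it.
struct VoiceSelection: Equatable {
    var modelID: String
    var speakerID: Int
}

// Hypothetical directory mapping sender addresses to voices.
struct VoiceDirectory {
    var defaultVoice: VoiceSelection
    private var bySender: [String: VoiceSelection] = [:]

    init(defaultVoice: VoiceSelection) {
        self.defaultVoice = defaultVoice
    }

    mutating func assign(_ voice: VoiceSelection, to sender: String) {
        bySender[Self.normalize(sender)] = voice
    }

    // Falls back to the default voice for senders without an assignment.
    func voice(for sender: String) -> VoiceSelection {
        bySender[Self.normalize(sender)] ?? defaultVoice
    }

    // Addresses are compared case-insensitively and without surrounding whitespace.
    private static func normalize(_ address: String) -> String {
        address.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
    }
}
```

For example, after `directory.assign(VoiceSelection(modelID: "multi", speakerID: 3), to: "boss@example.com")`, any email from that address resolves to speaker 3 while everyone else keeps the default.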

A Proper Email Client

Huginn isn't just an audio player bolted onto Gmail. It's a functional email client:

Inbox with pull-to-refresh and infinite scroll.
Full email rendering — HTML emails display correctly via an embedded web view; plain text emails are shown natively.
Swipe to archive — manage your inbox with familiar gestures.
Multi-select — select multiple emails and add them all to the queue at once.
Background queue processing — emails are fetched and synthesized even when the app is in the background.

Privacy-First

On-device TTS — your email content never leaves your phone for audio generation.
Minimal API scope — Huginn requests only the permissions it needs from Gmail.
No tracking, no analytics on email content.

Tech Stack

Layer	Technology
UI	SwiftUI, targeting iOS 26+
Concurrency	Swift structured concurrency (async/await, task groups)
TTS Engine	Sherpa ONNX with VITS models and espeak-ng phonemization
Audio	AVFoundation (AVAudioPlayer, AVAudioSession)
HTML Rendering	WebKit (WKWebView)
Networking	URLSession (no third-party HTTP libraries)
Auth	Google Sign-In SDK, OAuth 2.0
Backend Services	Firebase
Distribution	Fastlane for App Store deployment

Architecture

MVVM with service injection via SwiftUI's environment system.

ViewModels use the @Observable macro and are annotated @MainActor.
Services are protocol-backed for testability — TTS generation, audio caching, queue persistence, and playback control are all defined behind protocols.
Navigation is handled by a centralized Router observable.
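
To show why protocol-backed services help with testing, here is a minimal sketch in which a view model depends on a protocol and a test swaps in a fake. All names (`SpeechSynthesizing`, `FakeSynthesizer`, `PlayerViewModel`) are hypothetical. The example is also simplified: it is synchronous and leaves out `@Observable`, `@MainActor`, and the SwiftUI environment so that it runs with Foundation alone.

```swift
import Foundation

// Hypothetical service protocol; the app's real interfaces are not shown here.
protocol SpeechSynthesizing {
    func synthesize(_ text: String) throws -> Data
}

// Test double: returns the UTF-8 bytes of the text instead of real audio.
struct FakeSynthesizer: SpeechSynthesizing {
    func synthesize(_ text: String) throws -> Data {
        Data(text.utf8)
    }
}

// Simplified view model that only knows about the protocol, not the engine.
final class PlayerViewModel {
    private let synthesizer: any SpeechSynthesizing
    private(set) var bytesGenerated = 0

    init(synthesizer: any SpeechSynthesizing) {
        self.synthesizer = synthesizer
    }

    func prepare(_ text: String) throws {
        bytesGenerated += try synthesizer.synthesize(text).count
    }
}
```

Because the view model never names a concrete engine, a unit test can construct it with `FakeSynthesizer()` and check its state without loading a TTS model.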

TTS Pipeline

The email body is sanitized and split into chunks of roughly three sentences each.
Each chunk is synthesized to audio on-device using Sherpa ONNX.
Audio chunks are streamed to playback — the first chunk plays while subsequent ones are still being generated.
Generated audio is cached on disk by email ID to avoid re-synthesis.
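
The chunking step can be sketched as follows. This is an illustration under stated assumptions rather than the app's actual splitter: the sentence rule here is deliberately naive (a sentence ends at `.`, `!` or `?` followed by whitespace or the end of the text), and the function names are made up for the example.

```swift
import Foundation

// Naive sentence splitter: a sentence ends at '.', '!' or '?' when the next
// character is whitespace or the text ends. Abbreviations are not handled.
func splitSentences(_ text: String) -> [String] {
    var sentences: [String] = []
    var current = ""
    let chars = Array(text)
    for (i, ch) in chars.enumerated() {
        current.append(ch)
        let isTerminator = ch == "." || ch == "!" || ch == "?"
        let atBoundary = i + 1 == chars.count || chars[i + 1].isWhitespace
        if isTerminator && atBoundary {
            let trimmed = current.trimmingCharacters(in: .whitespacesAndNewlines)
            if !trimmed.isEmpty { sentences.append(trimmed) }
            current = ""
        }
    }
    // Keep any trailing text that lacks a terminator.
    let rest = current.trimmingCharacters(in: .whitespacesAndNewlines)
    if !rest.isEmpty { sentences.append(rest) }
    return sentences
}

// Groups sentences into chunks of at most `size` sentences each.
func chunkForSynthesis(_ text: String, sentencesPerChunk size: Int = 3) -> [String] {
    precondition(size > 0, "chunk size must be positive")
    let sentences = splitSentences(text)
    return stride(from: 0, to: sentences.count, by: size).map { start in
        sentences[start..<min(start + size, sentences.count)].joined(separator: " ")
    }
}
```

Small chunks keep the time to first audio short, since playback only has to wait for the first chunk rather than the whole email.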

Generation is intentionally single-threaded — benchmarking showed that parallelizing inference yields negligible speed gains because the bottleneck is memory bandwidth, not compute.
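
Caching by email ID and generating one chunk at a time could be combined roughly as below. This is a sketch under assumptions: the cache layout (one directory per email ID, one file per chunk), the `AudioCache` and `generateAudio` names, and the `synthesize` closure standing in for the Sherpa ONNX call are all hypothetical. It also assumes the email ID is safe to use as a directory name.

```swift
import Foundation

// Hypothetical on-disk cache: one directory per email ID, one file per chunk.
struct AudioCache {
    let root: URL

    private func fileURL(emailID: String, chunk: Int) -> URL {
        root.appendingPathComponent(emailID, isDirectory: true)
            .appendingPathComponent("chunk-\(chunk).audio")
    }

    // Returns nil when the chunk has not been cached yet.
    func load(emailID: String, chunk: Int) -> Data? {
        try? Data(contentsOf: fileURL(emailID: emailID, chunk: chunk))
    }

    func store(_ audio: Data, emailID: String, chunk: Int) throws {
        let url = fileURL(emailID: emailID, chunk: chunk)
        try FileManager.default.createDirectory(
            at: url.deletingLastPathComponent(), withIntermediateDirectories: true)
        try audio.write(to: url, options: .atomic)
    }
}

// Processes chunks strictly one after another, reusing cached audio when present.
// `synthesize` stands in for the real engine call; `onChunkReady` hands each
// chunk to playback as soon as it exists, before later chunks are generated.
func generateAudio(
    emailID: String,
    chunks: [String],
    cache: AudioCache,
    synthesize: (String) throws -> Data,
    onChunkReady: (Int, Data) -> Void
) throws {
    for (index, text) in chunks.enumerated() {
        let audio: Data
        if let cached = cache.load(emailID: emailID, chunk: index) {
            audio = cached
        } else {
            audio = try synthesize(text)
            try cache.store(audio, emailID: emailID, chunk: index)
        }
        onChunkReady(index, audio)
    }
}
```

A plain sequential loop is enough here because, as noted above, running several inferences in parallel was not found to be meaningfully faster.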