Didier Fuentes

Huginn

An iOS email client that reads your inbox aloud.

The Idea

Your inbox shouldn't require your eyes. Huginn is a Gmail client for iOS built around a simple idea: you should be able to listen to your emails while commuting, cooking, or doing anything else with your hands and eyes busy.

The name comes from Huginn, one of Odin's two ravens, whose name means "thought".

What It Does

Listen to Your Email

Huginn generates natural-sounding audio from your emails using on-device speech synthesis. No audio is sent to or generated in the cloud — everything happens on your iPhone.

  • Chunked streaming playback — audio starts playing before the full email is synthesized, so there's minimal wait time.
  • Queue system — add multiple emails to a listening queue and let them play through like a podcast playlist.
  • Persistent playback — pause, leave the app, come back later. Your position and queue are saved.
  • Lock screen controls — play, pause, skip forward/back without opening the app.
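As a rough illustration of persistent playback, the queue and the current position could be captured in a small Codable value and written to disk whenever playback state changes. This is a minimal sketch, not Huginn's actual storage format: `QueueSnapshot`, `QueueStore`, and their field names are hypothetical.

```swift
import Foundation

// Hypothetical snapshot of the listening queue; the field names are illustrative.
struct QueueSnapshot: Codable, Equatable {
    var emailIDs: [String]       // queued emails, in play order
    var currentIndex: Int        // index of the email currently playing
    var positionSeconds: Double  // offset into the current email's audio
}

// Hypothetical store that keeps the snapshot as a JSON file.
struct QueueStore {
    let fileURL: URL

    // Writes atomically so a crash mid-save cannot leave a truncated file.
    func save(_ snapshot: QueueSnapshot) throws {
        let data = try JSONEncoder().encode(snapshot)
        try data.write(to: fileURL, options: .atomic)
    }

    // Returns nil when nothing has been saved yet or the file cannot be decoded.
    func load() -> QueueSnapshot? {
        guard let data = try? Data(contentsOf: fileURL) else { return nil }
        return try? JSONDecoder().decode(QueueSnapshot.self, from: data)
    }
}
```

On launch, the app would call `load()` and, if a snapshot exists, rebuild the queue and seek to `positionSeconds` before resuming.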

Choose How It Sounds

  • Multiple voice models — download and switch between different TTS voices.
  • Per-sender voices — assign specific voices to specific senders so you can tell who's talking just by listening.
  • Multi-speaker model support — models with multiple speakers let you pick the one you prefer.
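Per-sender voices amount to a lookup from sender address to voice, with a fallback to the default voice. The sketch below assumes a voice is identified by a model name plus a speaker index (for multi-speaker models); `Voice`, `VoiceSettings`, and the example model names are hypothetical.

```swift
import Foundation

// Hypothetical voice identifier: a model name plus a speaker index
// for models that ship more than one speaker.
struct Voice: Equatable {
    var modelName: String
    var speakerID: Int = 0
}

// Hypothetical settings type mapping sender addresses to voices.
struct VoiceSettings {
    var defaultVoice: Voice
    var voicesBySender: [String: Voice] = [:]

    mutating func assign(_ voice: Voice, to sender: String) {
        voicesBySender[Self.normalize(sender)] = voice
    }

    // Falls back to the default voice when the sender has no assignment.
    func voice(for sender: String) -> Voice {
        voicesBySender[Self.normalize(sender)] ?? defaultVoice
    }

    // Addresses are compared case-insensitively and without surrounding spaces.
    private static func normalize(_ address: String) -> String {
        address.trimmingCharacters(in: .whitespaces).lowercased()
    }
}
```

For example, after `settings.assign(Voice(modelName: "narrator", speakerID: 3), to: "boss@example.com")`, every email from that address resolves to speaker 3 of the "narrator" model, while all other senders keep the default voice.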

A Proper Email Client

Huginn isn't just an audio player bolted onto Gmail. It's a functional email client:

  • Inbox with pull-to-refresh and infinite scroll.
  • Full email rendering — HTML emails display correctly via an embedded web view; plain text emails are shown natively.
  • Swipe to archive — manage your inbox with familiar gestures.
  • Multi-select — select multiple emails and add them all to the queue at once.
  • Background queue processing — emails are fetched and synthesized even when the app is in the background.

Privacy-First

  • On-device TTS — your email content never leaves your phone for audio generation.
  • Minimal API scope — Huginn requests only the permissions it needs from Gmail.
  • No tracking, no analytics on email content.

Tech Stack

| Layer | Technology |
|---|---|
| UI | SwiftUI, targeting iOS 26+ |
| Concurrency | Swift structured concurrency (async/await, task groups) |
| TTS Engine | Sherpa ONNX with VITS models and espeak-ng phonemization |
| Audio | AVFoundation (AVAudioPlayer, AVAudioSession) |
| HTML Rendering | WebKit (WKWebView) |
| Networking | URLSession (no third-party HTTP libraries) |
| Auth | Google Sign-In SDK, OAuth 2.0 |
| Backend Services | Firebase |
| Distribution | Fastlane for App Store deployment |

Architecture

MVVM with service injection via SwiftUI's environment system.

  • ViewModels use the @Observable macro and are annotated @MainActor.
  • Services are protocol-backed for testability — TTS generation, audio caching, queue persistence, and playback control are all defined behind protocols.
  • Navigation is handled by a centralized Router observable.
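The pattern described above might look roughly like the following. It is a sketch under stated assumptions: `SpeechSynthesizing`, `SilentSynthesizer`, and `PlayerViewModel` are hypothetical names, and the real service protocols are not shown in this write-up. The SwiftUI wiring is left out so the snippet stays self-contained.

```swift
import Observation

// Hypothetical service protocol; the view model depends on this, not on a concrete engine.
protocol SpeechSynthesizing {
    func synthesize(_ text: String) -> [Float]
}

// Test double: returns one silent sample per character instead of running a model.
struct SilentSynthesizer: SpeechSynthesizing {
    func synthesize(_ text: String) -> [Float] {
        Array(repeating: 0, count: text.count)
    }
}

// View model in the style described: @Observable for change tracking,
// @MainActor so state is only mutated on the main thread.
@MainActor
@Observable
final class PlayerViewModel {
    private(set) var preparedSampleCount = 0
    private let synthesizer: any SpeechSynthesizing

    init(synthesizer: any SpeechSynthesizing) {
        self.synthesizer = synthesizer
    }

    func prepare(_ text: String) {
        preparedSampleCount = synthesizer.synthesize(text).count
    }
}
```

In a SwiftUI app, an instance could be handed down the view tree with the `environment(_:)` modifier and read back with `@Environment`. Tests construct the view model directly with the test double, so no model files are needed.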

TTS Pipeline

  1. Email body is sanitized and split into chunks (roughly 3 sentences each).
  2. Each chunk is synthesized to audio on-device using Sherpa ONNX.
  3. Audio chunks are streamed to playback — the first chunk plays while subsequent ones are still being generated.
  4. Generated audio is cached on disk by email ID to avoid re-synthesis.
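Step 1 can be sketched as follows. The sentence splitter here is deliberately naive (it breaks on ".", "!" or "?" followed by whitespace) and is an assumption for illustration; Huginn's actual sanitizing and splitting rules are not described. The function names are hypothetical.

```swift
import Foundation

// Naive sentence splitter: a sentence ends at ".", "!" or "?" when that
// character is followed by whitespace or the end of the text.
func splitSentences(_ text: String) -> [String] {
    var sentences: [String] = []
    var current = ""
    let chars = Array(text)
    for (i, ch) in chars.enumerated() {
        current.append(ch)
        let isTerminator = ch == "." || ch == "!" || ch == "?"
        let atBoundary = i + 1 == chars.count || chars[i + 1].isWhitespace
        if isTerminator && atBoundary {
            sentences.append(current.trimmingCharacters(in: .whitespacesAndNewlines))
            current = ""
        }
    }
    // Keep any trailing text that has no closing punctuation.
    let rest = current.trimmingCharacters(in: .whitespacesAndNewlines)
    if !rest.isEmpty { sentences.append(rest) }
    return sentences
}

// Groups sentences into chunks of `sentencesPerChunk`; the last chunk may be shorter.
func chunk(_ text: String, sentencesPerChunk: Int = 3) -> [String] {
    precondition(sentencesPerChunk > 0, "chunk size must be positive")
    let sentences = splitSentences(text)
    return stride(from: 0, to: sentences.count, by: sentencesPerChunk).map { start in
        let end = min(start + sentencesPerChunk, sentences.count)
        return sentences[start..<end].joined(separator: " ")
    }
}
```

With the default size, `chunk("One. Two! Three? Four. Five.")` yields two chunks: the first three sentences, then the remaining two. Small chunks keep the delay before the first audio short, since only one chunk has to be synthesized before playback can begin.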

Generation is intentionally single-threaded — benchmarking showed that parallelizing inference yields negligible speed gains because the bottleneck is memory bandwidth, not compute.
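One way to get streaming playback from a strictly serial generator is to have a single task synthesize chunks in order and publish each one through an `AsyncStream` as soon as it is ready. The sketch below assumes a hypothetical `ChunkSynthesizer` protocol standing in for the on-device engine; the actual Sherpa ONNX calls are not shown.

```swift
// Hypothetical stand-in for the on-device engine: text in, PCM samples out.
protocol ChunkSynthesizer: Sendable {
    func synthesize(_ text: String) -> [Float]
}

// Synthesizes chunks one at a time on a single task and yields each result
// immediately, so a consumer can start playing chunk 0 while chunk 1 is generated.
func audioStream(
    for chunks: [String],
    using synthesizer: any ChunkSynthesizer
) -> AsyncStream<[Float]> {
    AsyncStream { continuation in
        let task = Task {
            for chunk in chunks {
                if Task.isCancelled { break }  // stop early if the listener went away
                continuation.yield(synthesizer.synthesize(chunk))
            }
            continuation.finish()
        }
        // Cancel generation when the consumer stops iterating.
        continuation.onTermination = { _ in task.cancel() }
    }
}
```

A player would consume this with `for await samples in audioStream(for: chunks, using: engine)`, scheduling each buffer for playback as it arrives. Because only one task ever calls `synthesize`, inference stays serial, in line with the benchmark result above.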
