
One-page case study

RAG-powered Podcast Generator

AI podcast engine that turns PDFs into multi-host audio with transcripts. Flask + Ollama for transcript generation, FAISS retrieval, MARS5-TTS on Apple M4 Metal for voices. Live as the "PDF to Podcast" tool in the AI Lab.

Proof Points

  • Multi-voice narration pipeline
  • RAG + OCR workflow
  • One-job concurrency on M4 GPU
  • Public web UI proxied to private Mac Mini backend
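
One way the one-job concurrency limit could be enforced is a non-blocking lock around the generation step. This is a minimal sketch, not the project's actual code: `SingleJobGate`, `try_start`, and `finish` are hypothetical names, and it assumes the backend runs as a single process so an in-memory lock is enough.

```python
import threading


class SingleJobGate:
    """Admit at most one generation job at a time (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.current_job = None

    def try_start(self, job_id):
        # Non-blocking acquire: a second caller is rejected immediately
        # instead of waiting behind the running job.
        if not self._lock.acquire(blocking=False):
            return False
        self.current_job = job_id
        return True

    def finish(self):
        # Free the slot so the next request can be admitted.
        self.current_job = None
        self._lock.release()


gate = SingleJobGate()
first = gate.try_start("job-1")   # slot is free
second = gate.try_start("job-2")  # rejected while job-1 holds the slot
gate.finish()
third = gate.try_start("job-3")   # admitted once job-1 has finished
gate.finish()
```

In a Flask route, a `False` from `try_start` could map to a "busy" response (for example HTTP 429) so the public UI can ask the user to retry later.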

Challenges

  • Extracting clean text from PDFs
  • Voice assignment to characters
  • Bounding generation time on a single-GPU host
  • Safely exposing a Mac Mini service to a public web tool
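
The voice-assignment challenge can be illustrated with a small parser. This is a sketch under stated assumptions: the generated transcript is assumed to use `Speaker: text` lines, the voice IDs in `VOICES` are placeholders (the source does not list the real MARS5-TTS voices), and `assign_voices` is a hypothetical helper. Each new speaker gets the next voice in the pool and keeps it for the rest of the episode.

```python
import re
from itertools import cycle

# Placeholder voice IDs; the real voices are not named in the source.
VOICES = ["voice_a", "voice_b", "voice_c"]

# Matches "Speaker: text" and stops the speaker name at the first colon.
LINE_RE = re.compile(r"^\s*([A-Za-z][\w .'-]*?)\s*:\s*(.+)$")


def assign_voices(transcript, voices=VOICES):
    """Return (speaker, voice, text) tuples with a stable voice per speaker."""
    pool = cycle(voices)  # reuse voices if there are more speakers than voices
    speaker_voice = {}
    segments = []
    for line in transcript.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip blank lines and lines without a speaker label
        speaker, text = match.group(1), match.group(2).strip()
        if speaker not in speaker_voice:
            speaker_voice[speaker] = next(pool)
        segments.append((speaker, speaker_voice[speaker], text))
    return segments


transcript = (
    "Alex: Welcome to the show.\n"
    "Sam: Thanks, Alex.\n"
    "Alex: Let's dig into the paper."
)
segments = assign_voices(transcript)
```

Each tuple could then be passed to the TTS step one segment at a time, with the resulting clips concatenated in order.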

Learnings

  • Text-to-speech synthesis on Metal
  • Prompt engineering for dialogue
  • Single-tenant job queueing across HTTP
  • CORS + Tailscale-fronted private services
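
For the CORS point, a common pattern is to echo back only origins on an explicit allow-list. The sketch below shows that idea in plain Python. It is an assumption, not the project's configuration: `https://example.com` stands in for the public site origin, and `cors_headers` is a hypothetical helper.

```python
# Placeholder for the public site's origin; the real value is not in the source.
ALLOWED_ORIGINS = {"https://example.com"}


def cors_headers(origin, allowed=ALLOWED_ORIGINS):
    """Return CORS response headers for an allowed origin, else an empty dict."""
    if origin not in allowed:
        return {}  # no CORS headers, so the browser blocks the cross-origin read
    return {
        "Access-Control-Allow-Origin": origin,
        "Vary": "Origin",  # caches must key responses on the Origin header
        "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
    }


allowed_result = cors_headers("https://example.com")
blocked_result = cors_headers("https://evil.example")
```

Echoing the matched origin rather than sending `*` keeps the backend usable only from the known public front end.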

Stack

Python, Flask, Ollama, FAISS, MARS5-TTS, PyMuPDF, pydub, Next.js