One-page case study

RAG-powered Podcast Generator

An AI podcast engine that turns PDFs into multi-host audio with transcripts. Flask and Ollama handle transcript generation, FAISS handles retrieval, and MARS5-TTS runs on Apple M4 Metal to synthesize the voices. It is live as the "PDF to Podcast" tool in the AI Lab.

Proof Points

• Multi-voice narration pipeline
• RAG + OCR workflow
• One-job concurrency on M4 GPU
• Public web UI proxied to private Mac Mini backend
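
One way to picture the one-job concurrency point is a gate that admits a single generation job and rejects any other request while it runs. The sketch below is a minimal illustration, not the project's actual code: the `SingleJobGate` class, its method names, and the job ids are all hypothetical, and it assumes the backend runs as a single process.

```python
import threading


class SingleJobGate:
    """Admit at most one job at a time; reject the rest instead of queueing."""

    def __init__(self):
        self._lock = threading.Lock()
        self.current_job_id = None

    def try_start(self, job_id):
        # Non-blocking acquire: returns False immediately if a job is running.
        if not self._lock.acquire(blocking=False):
            return False
        self.current_job_id = job_id
        return True

    def finish(self):
        # Clear the job id and free the slot for the next request.
        self.current_job_id = None
        self._lock.release()


gate = SingleJobGate()
first = gate.try_start("job-1")   # slot is free, so this job is admitted
second = gate.try_start("job-2")  # rejected while job-1 holds the slot
gate.finish()
third = gate.try_start("job-3")   # slot is free again
gate.finish()
```

In a Flask handler, a `False` result from `try_start` could be turned into a "busy" response so the web UI can ask the user to retry later; which status code the real service returns is not stated in the case study.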

Challenges

• Extracting clean text from PDFs
• Voice assignment to characters
• Bounding generation time on a single-GPU host
• Safely exposing a Mac Mini service to a public web tool
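
The voice assignment challenge can be sketched as a stable mapping from each speaker to a voice, so a character keeps the same voice for the whole episode. This is an illustrative assumption about the approach: the `assign_voices` helper, the speaker names, and the voice ids are hypothetical, and voices are simply reused in rotation when there are more speakers than voices.

```python
from itertools import cycle


def assign_voices(speakers, voices):
    """Map each distinct speaker to a voice, in order of first appearance."""
    if not voices:
        raise ValueError("at least one voice is required")
    pool = cycle(voices)  # wraps around when speakers outnumber voices
    assignment = {}
    for speaker in speakers:
        if speaker not in assignment:
            assignment[speaker] = next(pool)
    return assignment


# Speaker labels in the order their turns appear in a transcript.
turns = ["HOST", "GUEST", "HOST", "EXPERT", "GUEST"]
voice_map = assign_voices(turns, ["voice_a", "voice_b"])
```

Because the mapping depends only on the order of first appearance, regenerating audio for the same transcript gives every character the same voice again.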

Learnings

• Text-to-speech synthesis on Metal
• Prompt engineering for dialogue
• Single-tenant job queueing across HTTP
• CORS + Tailscale-fronted private services
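
For the dialogue prompting point, one plausible pattern is to ask the model for one `NAME: text` turn per line and then parse the reply into speaker turns for the TTS stage. The prompt wording, the `PROMPT_TEMPLATE` and `parse_turns` names, and the line format are assumptions made for illustration; the case study does not describe the real prompt.

```python
import re

# Hypothetical prompt: constrain the output format so it is easy to parse.
PROMPT_TEMPLATE = (
    "Write a podcast dialogue between {hosts} about the source text below.\n"
    "Put each turn on its own line as NAME: spoken text.\n\n"
    "{context}"
)

# A speaker label, a colon, then at least one character of spoken text.
TURN_RE = re.compile(r"^\s*([A-Za-z][A-Za-z0-9 _-]*?)\s*:\s*(.+?)\s*$")


def parse_turns(transcript):
    """Split a model reply into (speaker, text) tuples."""
    turns = []
    for line in transcript.splitlines():
        match = TURN_RE.match(line)
        if match:
            turns.append((match.group(1), match.group(2)))
        elif turns and line.strip():
            # Treat an unlabeled line as a continuation of the previous turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, text + " " + line.strip())
    return turns


prompt = PROMPT_TEMPLATE.format(hosts="ALEX and SAM", context="(retrieved passages)")

sample = """Here is the dialogue:
ALEX: Welcome back to the show.
SAM: Today we cover vector search,
and why it matters.
ALEX: Let's begin."""
turns = parse_turns(sample)
```

The parser skips preamble lines that carry no spoken text and folds wrapped lines into the preceding turn. A continuation line that itself contains a colon would be read as a new speaker, so a stricter format or a known speaker list would be safer in practice.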

Stack

Python, Flask, Ollama, FAISS, MARS5-TTS, PyMuPDF, pydub, Next.js
