TalkToPod
TalkToPod combines AI powered podcast transcripts with ChatGPT to enable users to chat with their favorite podcasts.
As a podcast fanatic, I personally am interested in connecting the dots between stories and trying to find specific niche details buried in podcasts. I built TalkToPod so I can simply ask a question and access the relevant information in the form of an AI-generated answer.
My strongest need was semantic search over podcasts, but it's quicker and more fun when the information is extracted into an AI generated answer with citations.
How it works
Podcasts are downloaded and the audio stream is segmented by pseudonymous speaker (diarisation).
Each segment is transcribed using OpenAI's open source Whisper model.
Pseudonymous podcast speakers are then identified one of three ways:
1) The top of the transcript is passed to ChatGPT's API to infer speakers by the text of the transcript and pattern of speakers (determining if speakers introduce each other or refer to each other by name in a pattern that can be used to identify their pseudonymous tags)
2) Unidentified speakers can be matched to previously previously vectorized voice prints
3) Manual identification
The transcripts with the identified speakers are embedded and stored in a vector database.
When a user asks a question, semantic search is used to find relevant sections of transcripts (the question is embedded and cosine similarity is used to find related portions of transcripts).
The question and related portions of transcripts are passed to ChatGPT to determine the answer to the question citing the appropriate transcripts.
Examples: