Audio

Video Tutorial

AI Diarization: Ask Questions on Audio & Video Recordings

AI Diarization with QAnswer: Ask Questions on Audio & Video Recordings

This video explains how QAnswer’s AI diarization feature detects speakers in audio and video files, generates searchable transcripts, and allows users to ask questions using text or voice to get precise answers from recordings.

Meetings, interviews, podcasts, and other recordings often contain valuable information, but finding a specific answer usually means replaying hours of audio.

In this tutorial, we show how AI Diarization in QAnswer lets you talk to your audio and video files instead.

What you’ll see in this video:

Upload any audio or video file
Automatically detect and separate speakers
Get a clean, diarized transcript showing who said what
Ask questions using text or voice
Get precise answers grounded in the actual conversation

Common use cases:

✦ Meetings: Track decisions and commitments without re-listening

✦ Interviews: Automatically separate questions and answers

✦ Call centers: Understand agent vs customer conversations

✦ Podcasts & briefings: Extract insights instantly

QAnswer uses open-source models, so you can run it on your own servers.
Your recordings stay private, secure and fully under your control.

🎥 Watch the video to see diarization in action.

👉 Try it yourself: https://www.app.qanswer.ai/

Audio - Data Source

You can use one or several audio files or recordings as your data source. This section explains how to upload existing files and how to record audio directly.

Click Audio to add audio files as a data source:

Upload

Upload your files (click or drag and drop on the dedicated area):

info

The following file formats are currently supported: .mp3, .wav, .flac, .m4a, .mp4, .mov
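When preparing a large batch of files, it can help to check the extensions locally before uploading. The sketch below is not part of QAnswer: the helper names `is_supported_audio` and `split_by_support` are hypothetical, the extension list is copied from the note above, and the case-insensitive matching is an assumption.

```python
from pathlib import Path

# Extensions copied from the supported-formats note above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a", ".mp4", ".mov"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file extension is in the supported list.

    Matching is case-insensitive here; that is an assumption of this sketch.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames):
    """Partition filenames into (supported, unsupported), preserving order."""
    supported = [name for name in filenames if is_supported_audio(name)]
    unsupported = [name for name in filenames if not is_supported_audio(name)]
    return supported, unsupported
```

For example, `split_by_support(["a.wav", "b.ogg", "c.mov"])` separates the `.ogg` file from the two files that can be uploaded.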

Empty
Files uploaded

Recording

Record your audio (click on the start recording button):

Empty
Audio recorded

Diarization (Detecting Speakers)

To enable diarization for an audio file, click the toggle switch to the right of that file.

Diarization is the process of automatically identifying speakers and dividing the audio into distinct segments, with each segment assigned to a specific speaker.

info

By default, audio is converted to text without separating segments or identifying speakers. This is known as transcription.
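The difference between the two modes can be pictured as a data structure. This is only an illustrative sketch, not QAnswer's internal format: the `Segment` class, its field names, and labels such as `SPEAKER_00` are assumptions. Transcribed output is modeled here as segments with no speaker label, and diarized output as segments that each carry one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    start: float                   # segment start, in seconds
    end: float                     # segment end, in seconds
    text: str                      # recognized text for this interval
    speaker: Optional[str] = None  # None for plain transcription (assumed model)


def is_diarized(segments: List[Segment]) -> bool:
    """True if there is at least one segment and every segment has a speaker."""
    return bool(segments) and all(s.speaker is not None for s in segments)


def format_transcript(segments: List[Segment]) -> str:
    """Render one line per segment, prefixing the speaker when one is known."""
    lines = []
    for seg in segments:
        prefix = f"[{seg.start:.1f}-{seg.end:.1f}]"
        if seg.speaker is not None:
            prefix += f" {seg.speaker}:"
        lines.append(f"{prefix} {seg.text}")
    return "\n".join(lines)
```

With two labeled segments, `format_transcript` produces a "who said what" view; with unlabeled segments it falls back to timestamped text only.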

After you’ve selected all your audio files, click “Next”.

You’ll be redirected to the results page, where the text output is organized into three categories: "All", "Diarized", and "Transcribed". By default, the "All" tab is selected.
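The three tabs behave like a filter over the processed files. The sketch below illustrates that idea under stated assumptions: each result is represented as a dictionary with a boolean `diarized` key, and `filter_by_tab` is a hypothetical helper, not a QAnswer function.

```python
def filter_by_tab(results, tab="All"):
    """Return the results shown under a tab; "All" is the default, as in the UI."""
    if tab == "All":
        return list(results)
    if tab == "Diarized":
        return [r for r in results if r["diarized"]]
    if tab == "Transcribed":
        return [r for r in results if not r["diarized"]]
    raise ValueError(f"unknown tab: {tab!r}")
```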

Update Speaker

If you want to update a speaker’s name for a segment, click the icon to the right of the current name. You can choose to update the name for all segments associated with that speaker or only for the current segment.

Update all
Update current

info

Only applicable to diarized audio.
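The two options differ only in which segments are touched. As a minimal sketch, assuming segments are dictionaries with a `speaker` key (the `rename_speaker` helper and the `SPEAKER_00` style labels are hypothetical, not QAnswer's API):

```python
from typing import Dict, List


def rename_speaker(segments: List[Dict], index: int, new_name: str,
                   update_all: bool) -> List[Dict]:
    """Return a new segment list with the speaker of segment `index` renamed.

    update_all=True  -> rename every segment that shares that speaker label
    update_all=False -> rename only the segment at `index`
    The input list is left unmodified.
    """
    old_name = segments[index]["speaker"]
    renamed = []
    for i, seg in enumerate(segments):
        matches = (seg["speaker"] == old_name) if update_all else (i == index)
        renamed.append({**seg, "speaker": new_name} if matches else dict(seg))
    return renamed
```

Choosing "update current" is useful when a single segment was attributed to the wrong person, while "update all" replaces a generic label with a real name throughout the recording.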

Update segment

If you want to update the text in a segment, click the icon on the right side of the segment.

info

Applicable to both transcribed and diarized audio.

Listen to related audio

If you want to listen to the audio for a specific segment, click the icon to the right of the time interval.

info

Applicable to both transcribed and diarized audio.
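Playing a segment amounts to cutting the recording at the segment's time interval. QAnswer does this for you in the interface; the sketch below only illustrates the idea outside the product, for uncompressed WAV files, using Python's standard `wave` module. The function name `extract_wav_segment` and its parameters are hypothetical.

```python
import wave


def extract_wav_segment(src_path, dst_path, start_s, end_s):
    """Copy the [start_s, end_s) interval of a WAV file to dst_path.

    Times are in seconds and are clamped to the length of the file.
    Returns the number of audio frames written.
    """
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        total = src.getnframes()
        # Convert seconds to frame indices and clamp them to the file bounds.
        first = max(0, min(total, int(start_s * rate)))
        last = max(first, min(total, int(end_s * rate)))
        src.setpos(first)
        frames = src.readframes(last - first)
        with wave.open(dst_path, "wb") as dst:
            # Keep the same channel count, sample width, and sample rate.
            dst.setnchannels(src.getnchannels())
            dst.setsampwidth(src.getsampwidth())
            dst.setframerate(rate)
            dst.writeframes(frames)
    return last - first
```

Compressed formats such as .mp3 or .m4a would need a decoding step first, which this sketch does not cover.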

After verifying all the text results, click “Finish.” You will be redirected to the data source page:
