Audio

Video Tutorial

AI Diarization with QAnswer: Ask Questions on Audio & Video Recordings

This video explains how QAnswer’s AI diarization feature detects speakers in audio and video files, generates searchable transcripts, and allows users to ask questions using text or voice to get precise answers from recordings.


Meetings, interviews, podcasts, and recordings often contain valuable information, but finding a specific answer means replaying hours of audio.


In this tutorial, we show how AI Diarization in QAnswer lets you talk to your audio and video files instead.


What you’ll see in this video:

  1. Upload any audio or video file
  2. Automatically detect and separate speakers
  3. Get a clean, diarized transcript showing who said what
  4. Ask questions using text or voice

Get precise answers grounded in the actual conversation.


Common use cases:


✦ Meetings: Track decisions and commitments without re-listening

✦ Interviews: Automatically separate questions and answers

✦ Call centers: Understand agent vs customer conversations

✦ Podcasts & briefings: Extract insights instantly


QAnswer uses open-source models, so you can run it on your own servers.
Your recordings stay private, secure and fully under your control.


🎥 Watch the video to see diarization in action.


👉 Try it yourself: https://www.app.qanswer.ai/

Audio - Data Source

You can use one or more audio files or recordings as a data source. This section explains how to upload existing files or record new audio.

Click Audio to add audio files as a data source.

Upload

Upload your files by clicking the dedicated area or dragging and dropping them onto it.

info

The following file formats are currently supported: .mp3, .wav, .flac, .m4a, .mp4, .mov
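If you prepare files in bulk, it can help to check their extensions before uploading. The following is a minimal sketch and not part of QAnswer: it only compares file extensions against the list above, and the helper names `is_supported` and `split_by_support` are hypothetical. Matching extensions case-insensitively is an assumption, since the note lists lowercase extensions only.

```python
from pathlib import Path

# Extensions copied from the supported-formats note above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a", ".mp4", ".mov"}


def is_supported(filename: str) -> bool:
    """Check the extension only; the file contents are not inspected."""
    # Assumption: extensions are compared case-insensitively.
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames):
    """Return (supported, unsupported) lists, preserving input order."""
    supported = [name for name in filenames if is_supported(name)]
    unsupported = [name for name in filenames if not is_supported(name)]
    return supported, unsupported


if __name__ == "__main__":
    ok, rejected = split_by_support(["standup.mp3", "interview.MOV", "notes.ogg"])
    print("Ready to upload:", ok)
    print("Convert first:", rejected)
```

Files in the second list would need to be converted to one of the supported formats before they can be added.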

Recording

To record audio directly, click the start recording button.

Diarization (Detecting Speakers)

To enable diarization, click the toggle switch to the right of the audio file you want to diarize.

Diarization is the process of automatically identifying speakers and dividing the audio into distinct segments, with each segment assigned to a specific speaker.

info

By default, audio is converted to text without separating segments or identifying speakers. This is known as transcription.
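To make the difference between transcription and diarization concrete, here is a small illustrative sketch. It does not reflect QAnswer's internal data model or export format: the `Segment` class, its field names, the "Speaker 1" style labels, and the sample sentences are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str   # label such as "Speaker 1" (hypothetical naming)
    start: float   # segment start, in seconds
    end: float     # segment end, in seconds
    text: str


def fmt_time(seconds: float) -> str:
    """Format a number of seconds as MM:SS, dropping fractions of a second."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def as_transcription(segments):
    """Plain transcription: one block of text, no speakers or segments."""
    return " ".join(seg.text for seg in segments)


def as_diarization(segments):
    """Diarized view: one line per segment, labeled with time and speaker."""
    return "\n".join(
        f"[{fmt_time(s.start)} - {fmt_time(s.end)}] {s.speaker}: {s.text}"
        for s in segments
    )


# Made-up sample data for illustration only.
SAMPLE = [
    Segment("Speaker 1", 0.0, 4.2, "Shall we ship on Friday?"),
    Segment("Speaker 2", 4.2, 7.9, "Yes, if the tests pass."),
]

if __name__ == "__main__":
    print(as_transcription(SAMPLE))
    print(as_diarization(SAMPLE))
```

The transcribed form keeps only the words, while the diarized form keeps who said each part and when, which is what makes questions such as "who agreed to the Friday release?" answerable.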

After you’ve selected all your audio files, click “Next”.

You’ll be redirected to the results page, where the text output is organized into three categories: "All", "Diarized", and "Transcribed". By default, the "All" tab is selected.
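The three tabs can be thought of as filters over the same list of results. The sketch below is only a mental model, not QAnswer code: it assumes that "Diarized" shows the files for which the diarization toggle was enabled and "Transcribed" shows the rest, and the `filter_results` function and the `diarized` key are hypothetical names.

```python
def filter_results(results, tab="All"):
    """Return the results a given tab would show, under the assumptions above."""
    if tab == "All":
        return list(results)
    if tab == "Diarized":
        return [r for r in results if r["diarized"]]
    if tab == "Transcribed":
        return [r for r in results if not r["diarized"]]
    raise ValueError(f"Unknown tab: {tab!r}")


# Made-up sample data for illustration only.
RESULTS = [
    {"file": "standup.mp3", "diarized": True},
    {"file": "memo.wav", "diarized": False},
]

if __name__ == "__main__":
    for tab in ("All", "Diarized", "Transcribed"):
        print(tab, [r["file"] for r in filter_results(RESULTS, tab)])
```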

Update Speaker

If you want to update a speaker’s name for a segment, click the icon to the right of the current name. You can choose to update the name for all segments associated with that speaker or only for the current segment.

info

Only applicable to diarized audio.
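The two renaming options described above (all segments of a speaker, or only the current segment) can be sketched as follows. This is an illustration of the behavior, not QAnswer's implementation; the `Segment` class, the `rename_speaker` function, and the `all_segments` parameter are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    speaker: str
    text: str


def rename_speaker(segments, index, new_name, all_segments=True):
    """Rename the speaker of segments[index].

    With all_segments=True, every segment that shares the same current
    speaker name is renamed; otherwise only the selected segment changes.
    A new list is returned and the input list is left untouched.
    """
    old_name = segments[index].speaker
    renamed = []
    for i, seg in enumerate(segments):
        matches = seg.speaker == old_name if all_segments else i == index
        renamed.append(replace(seg, speaker=new_name) if matches else seg)
    return renamed


if __name__ == "__main__":
    segs = [
        Segment("Speaker 1", "Hi"),
        Segment("Speaker 2", "Hello"),
        Segment("Speaker 1", "Bye"),
    ]
    print(rename_speaker(segs, 0, "Alice"))
    print(rename_speaker(segs, 0, "Alice", all_segments=False))
```

Renaming a single segment is useful when one segment was attributed to the wrong speaker, while renaming all segments replaces a generic label with a real name throughout the recording.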

Update Segment

If you want to update the text in a segment, click the icon on the right side of the segment.

info

Applicable to both transcribed and diarized audio.

If you want to listen to the audio for a specific segment, click the icon to the right of the time interval.

info

Applicable to both transcribed and diarized audio.

After verifying all the text results, click “Finish.” You will be redirected to the data source page.

Join Us

We value your feedback and are always here to assist you.
If you need additional help, feel free to join our Discord server. We look forward to hearing from you!

Discord Community Server