Delv
Whisper
Getting Started Guide

How to Use Whisper

A practical guide to get you up and running with Whisper. Written by Delv Editorial, Delv Team.

Getting started with Whisper

After reading this guide, you'll be able to install Whisper on your local machine and transcribe audio files into text efficiently.

Step 1: Install and set up

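Whisper is free and open-source, so you don’t need to sign up for anything. First, make sure Python is installed on your machine (Python 3.8 or later); you can download it from python.org. Whisper also relies on the ffmpeg command-line tool to read audio files, so install it with your system’s package manager if you don’t already have it (for example, brew install ffmpeg on macOS or sudo apt install ffmpeg on Ubuntu/Debian).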

Next, open your terminal (Command Prompt on Windows, Terminal on Mac/Linux) and install Whisper by running:

pip install git+https://github.com/openai/whisper.git

This command downloads and installs Whisper directly from the GitHub repository.
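
To confirm that the installation worked before moving on, a quick check can help. The following is a minimal sketch, not part of Whisper itself: installation_status is a hypothetical helper that looks for the whisper package, the whisper command, and the ffmpeg tool that Whisper calls to decode audio.

```python
import importlib.util
import shutil


def installation_status(package="whisper", commands=("whisper", "ffmpeg")):
    """Report whether a package can be imported and which commands are on PATH.

    Hypothetical helper for illustration. The defaults assume a standard
    Whisper install, which provides a `whisper` package and command and
    relies on the external `ffmpeg` tool to decode audio.
    """
    # find_spec returns None when the package is not installed
    status = {"package": importlib.util.find_spec(package) is not None}
    for command in commands:
        # shutil.which returns None when the command is not on PATH
        status[command] = shutil.which(command) is not None
    return status
```

Calling installation_status() returns a dictionary of True/False values; any False entry points at the piece that still needs installing.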

Step 2: Your first transcription

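To transcribe an audio file, place it (e.g., audio.mp3) in a folder you can easily reach. In your terminal, navigate to that folder using the cd command:

cd path/to/your/folder

Then run the following command to start the transcription:

whisper audio.mp3 --model base

Replace audio.mp3 with the name of your file. The first run downloads the selected model, so it may take a little longer. Whisper prints the transcript in the terminal and, by default, saves it in the current folder in several formats, including a plain-text file (audio.txt) and subtitle files (audio.srt and audio.vtt).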
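
The same transcription can also be driven from Python, since Whisper is a Python package. The sketch below assumes the whisper.load_model and model.transcribe calls shown in the Whisper README; transcribe_file and its load_model parameter are hypothetical names added here for illustration, and the helper saves only a plain-text file.

```python
from pathlib import Path


def transcribe_file(audio_path, model_name="base", load_model=None):
    """Transcribe one audio file and save the text next to it as <name>.txt.

    `load_model` is a hypothetical hook: by default it is Whisper's own
    `whisper.load_model`, but any function returning an object with a
    `transcribe(path)` method can be passed in instead.
    """
    if load_model is None:
        # Imported here so this module still loads if Whisper is missing
        import whisper
        load_model = whisper.load_model

    model = load_model(model_name)
    result = model.transcribe(str(audio_path))

    # The result is a dictionary; the full transcript is under "text"
    text = result["text"].strip()
    out_path = Path(audio_path).with_suffix(".txt")
    out_path.write_text(text + "\n", encoding="utf-8")
    return out_path
```

With Whisper installed, transcribe_file("audio.mp3") should leave audio.txt beside the recording. Loading the model once and reusing it is worth considering if you call this for many files.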

Step 3: Get better results

For better accuracy, pass a larger model to the --model option, such as small, medium, or large:

whisper audio.mp3 --model large

Larger models are generally more accurate, but they are slower and need more memory and processing power. If you don't specify a language, Whisper detects it automatically from the first 30 seconds of the recording; to set it yourself, use the --language flag:

whisper audio.mp3 --model base --language English
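
If you switch between models and languages often, it may be convenient to assemble the command in a script. This is a rough sketch that only uses the --model and --language options described above; build_whisper_command and run_whisper are hypothetical helper names, not part of Whisper.

```python
import shlex
import subprocess


def build_whisper_command(audio_file, model="base", language=None):
    """Build the argument list for the whisper command-line tool."""
    command = ["whisper", str(audio_file), "--model", model]
    if language is not None:
        # Only pass --language when the caller names one
        command += ["--language", language]
    return command


def run_whisper(audio_file, model="base", language=None):
    """Run whisper and raise CalledProcessError if it exits with an error."""
    command = build_whisper_command(audio_file, model, language)
    print("Running:", shlex.join(command))
    subprocess.run(command, check=True)
```

For example, run_whisper("audio.mp3", model="large", language="English") runs the same command you would otherwise type by hand.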

Pro tip

If you have multiple audio files to transcribe, use a loop in your terminal. For example, on Linux or Mac, you can run:

for file in *.mp3; do whisper "$file"; done

This transcribes every MP3 file in the current folder without typing each file name. Because no --model is given, Whisper uses its default model; add --model inside the loop to choose another.
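
The shell loop above does not work in the Windows Command Prompt, so a Python version may be useful there. This is a minimal sketch under a few assumptions: find_audio_files, transcribe_folder, and the AUDIO_EXTENSIONS list are hypothetical names chosen for illustration, and the whisper command is expected to be on your PATH.

```python
import subprocess
from pathlib import Path

# Assumption: adjust this list to match the files you actually have
AUDIO_EXTENSIONS = (".mp3", ".wav", ".flac")


def find_audio_files(folder=".", extensions=AUDIO_EXTENSIONS):
    """Return audio files directly inside `folder`, sorted by name."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in wanted
    )


def transcribe_folder(folder=".", model="base"):
    """Run the whisper command once per audio file found in `folder`."""
    for path in find_audio_files(folder):
        # cwd=folder so whisper writes its output next to the audio
        subprocess.run(
            ["whisper", path.name, "--model", model],
            cwd=folder,
            check=True,
        )
```

Calling transcribe_folder("path/to/your/folder") processes each matching file in turn and stops at the first one that fails.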

Common mistake to avoid

A common mistake is using audio in the wrong format or of poor quality. Whisper supports common formats such as MP3, WAV, and FLAC, so check that your file is in one of them. Format alone is not enough, though: excessively noisy or low-quality recordings lead to poor transcriptions, so use the clearest audio you can.
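
A quick look at a recording's basic properties can catch some quality problems early. The sketch below uses Python's standard wave module, so it only handles uncompressed WAV files and cannot measure noise. describe_wav is a hypothetical helper, and treating anything below 16 kHz (the rate Whisper resamples audio to) as a warning sign is a heuristic assumed here, not a rule from Whisper.

```python
import wave

# Whisper resamples audio to 16 kHz before processing it
WHISPER_SAMPLE_RATE = 16000


def describe_wav(path):
    """Return basic properties of an uncompressed WAV file."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_seconds": frames / rate,
            # Heuristic (assumption): recordings below 16 kHz carry less
            # detail than Whisper's working sample rate
            "below_whisper_rate": rate < WHISPER_SAMPLE_RATE,
        }
```

If below_whisper_rate comes back True, it may be worth finding a higher-quality copy of the recording before transcribing.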