Delv
Play.ht
Getting Started Guide

How to Use Play.ht

A practical guide to get you up and running with Play.ht. Written by Delv Editorial, Delv Team.

Getting started with Play.ht

In this guide, you'll learn how to create realistic text-to-speech audio using Play.ht. By the end, you'll be able to generate voiceovers for podcasts, audiobooks, and more in just a few minutes.

Step 1: Sign up and set up

  1. Go to play.ht.
  2. Click on the "Sign Up" button in the top right corner.
  3. Choose to sign up using your email or your Google account.
  4. After signing up, confirm your email address if prompted.
  5. Once logged in, you'll be taken to your dashboard. You can start with the free plan, which limits how many characters you can convert to speech each month.
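Because the free plan is metered by characters, it can help to check the length of a script before you paste it in. The sketch below is only an illustration: the function name and the limit of 1000 are hypothetical placeholders, not figures from Play.ht, and Play.ht may count characters differently (for example, in how it treats whitespace).

```python
def characters_remaining(text: str, monthly_limit: int, used_so_far: int = 0) -> int:
    """Return how many characters would be left after converting `text`.

    A negative result means the text would exceed the allowance.
    """
    return monthly_limit - used_so_far - len(text)


if __name__ == "__main__":
    script = "Welcome to the show. Today we look at text-to-speech."
    # Placeholder value: replace with the limit shown on your own plan.
    HYPOTHETICAL_LIMIT = 1000
    left = characters_remaining(script, HYPOTHETICAL_LIMIT)
    print(f"Script length: {len(script)} characters, {left} would remain")
```

Check your dashboard for the actual allowance and usage on your account, and treat this count as an estimate.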

Step 2: Your first audio generation

  1. From the dashboard, click on the "Create New Audio" button.
  2. In the text box, type or paste the text you want to convert to speech.
  3. Select a voice from the dropdown menu on the right. You can listen to samples by clicking the play button next to each voice.
  4. Adjust the settings for speed and pitch if desired.
  5. Click the "Generate" button. Once processing is complete, you'll see your audio file listed below.
  6. Click the "Download" button to save your audio file to your device.

Step 3: Get better results

  1. Experiment with different voices and accents to find the best fit for your project.
  2. Open "Advanced Settings" to adjust emphasis and pauses in your text for a more natural sound.
  3. To create a custom voice, click "Voice Cloning" in the sidebar and follow the instructions to upload an audio sample.

Pro tip

Use the "Preview" feature before finalising your audio. It lets you hear the text-to-speech output and make any necessary adjustments without generating a new file each time.

Common mistake to avoid

Avoid pasting long paragraphs without breaks. Punctuation and line breaks tell the AI where to pause, which produces a more natural-sounding voiceover.
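If your script already exists as one long block of text, you can break it up before pasting it into the text box. The following is a minimal sketch, not part of Play.ht: the function names and the 300-character default are assumptions chosen for illustration. It splits the text at sentence-ending punctuation and groups sentences into shorter paragraphs separated by blank lines.

```python
import re


def split_into_chunks(text: str, max_chars: int = 300) -> list[str]:
    """Group sentences into chunks of at most `max_chars` characters.

    A single sentence longer than `max_chars` is kept whole as its own chunk.
    """
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so start a new chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def with_line_breaks(text: str, max_chars: int = 300) -> str:
    """Join the chunks with blank lines, ready to paste into a text box."""
    return "\n\n".join(split_into_chunks(text, max_chars))


if __name__ == "__main__":
    sample = "This is the intro. It sets the scene! Does it flow? Then the story begins."
    print(with_line_breaks(sample, max_chars=40))
```

The simple pattern here will also split after abbreviations such as "Dr.", so read through the result before generating audio.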