
Getting Started Guide
How to Use Microsoft Azure Speech
A practical guide to get you up and running with Microsoft Azure Speech. Written by Delv Editorial, Delv Team.
Getting started with Microsoft Azure Speech
In this guide, you will learn how to quickly set up Microsoft Azure Speech and start using its text-to-speech and speech-to-text features. By the end, you’ll be able to convert text into natural-sounding speech and transcribe audio with high accuracy.
Step 1: Sign up and set up
- Go to the Microsoft Azure Speech website.
- Click on the "Get started" button.
- If you don’t have a Microsoft account, click "Create one!" and follow the prompts to sign up.
- Once signed in, navigate to the Azure portal by clicking on "Portal" in the top right corner.
- In the Azure portal, select "Create a resource" from the left-hand menu.
- Search for "Speech" and select "Speech" from the list.
- Click "Create" and fill in the required fields (Subscription, Resource group, Region, Name, and Pricing tier).
- Click "Review + create" and then "Create" to provision your Speech resource.
Step 2: Your first text-to-speech task
- In the Azure portal, navigate to your Speech resource.
- Click on "Keys and Endpoint" in the left menu to find your API keys, region, and endpoint URL.
- Open a new browser tab and go to the Azure Speech Studio.
- Sign in with your Microsoft account.
- Click on "Try Speech" in the top menu and select "Text to Speech."
- In the text box, enter the text you want to convert to speech.
- Choose a voice from the dropdown menu and adjust the settings like pitch and speed if desired.
- Click the "Play" button to listen to the generated speech.
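The same task can also be done outside the browser with the key and region from your Speech resource. The following is a minimal sketch, not an official sample: it assumes the text-to-speech REST endpoint for your region, and the function names, the `en-US-JennyNeural` voice choice, and the `User-Agent` value are illustrative assumptions. It uses only the Python standard library.

```python
import urllib.request
from xml.sax.saxutils import escape


def build_tts_request(text, key, region, voice="en-US-JennyNeural"):
    """Build (but do not send) a text-to-speech REST request.

    `key` and `region` come from the "Keys and Endpoint" page.
    The voice name is an assumed default; pick any voice you like.
    """
    # Minimal SSML body. The text is escaped so that characters
    # such as & and < do not break the XML.
    ssml = (
        "<speak version='1.0' xml:lang='en-US'>"
        f"<voice name='{voice}'>{escape(text)}</voice>"
        "</speak>"
    )
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/ssml+xml",
        # Ask for a WAV (RIFF) file back.
        "X-Microsoft-OutputFormat": "riff-24khz-16bit-mono-pcm",
        "User-Agent": "speech-guide-sketch",  # hypothetical app name
    }
    return urllib.request.Request(
        url, data=ssml.encode("utf-8"), headers=headers, method="POST"
    )


def synthesize_to_file(text, key, region, path="hello.wav"):
    """Send the request and write the returned audio bytes to `path`."""
    request = build_tts_request(text, key, region)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(path, "wb") as out:
        out.write(audio)
    return path
```

To try it, call `synthesize_to_file("Hello from Azure Speech.", key, region)` with your own key and region. Keeping the key in an environment variable rather than in the script is a safer habit.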
Step 3: Get better results
- Explore different voices and languages by selecting them from the dropdown menu to see which fits your content best.
- Use the "SSML" option for more control over pronunciation, pitch, and pauses by entering Speech Synthesis Markup Language (SSML) tags in your text.
- For speech-to-text, check that your audio file is in a supported format (like WAV or MP3), then upload it in the "Speech to Text" section.
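To make the SSML tip concrete, here is a small sketch that builds an SSML document with a pause, a speaking rate, and a pitch setting. The helper name `build_ssml`, its parameters, and the default voice are assumptions made for illustration; the tags themselves (`speak`, `voice`, `prosody`, `break`) are standard SSML elements.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice="en-US-JennyNeural", lang="en-US",
               rate="default", pitch="default", pause_ms=0):
    """Return an SSML string for `text`.

    rate and pitch accept SSML prosody values such as "+10%",
    "slow", or "-2st". pause_ms adds a leading pause when > 0.
    """
    # Optional pause before the spoken text.
    pause = f'<break time="{int(pause_ms)}ms"/>' if pause_ms > 0 else ""
    # quoteattr adds the surrounding quotes and escapes the value;
    # escape protects characters like & and < in the spoken text.
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>"
        f"{pause}"
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{escape(text)}"
        "</prosody></voice></speak>"
    )


if __name__ == "__main__":
    print(build_ssml("Welcome back.", rate="+10%", pitch="-2st", pause_ms=500))
```

The resulting string can be pasted into the SSML editor in Speech Studio, which makes it easy to compare a few rate and pitch values side by side.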