Generating text-to-speech using Audition

The Generate Speech tool lets you paste or type text and generate a realistic voice-over or narration track. The tool uses the speech synthesis libraries available in your operating system. Use it to create synthesized voices for videos, games, and audio productions.

Speech generation on macOS uses a different speech synthesis engine than on Windows. Each engine is provided by its operating system, and the two are not cross-platform compatible. As a result, the XML tags that the Windows engine supports do not work on macOS, and the tag format that macOS supports does not work on Windows.


Voices have license restrictions for commercial or public use. Check whether you have the rights to distribute any work that contains these voices.

Generate speech

  1. Generate speech in either Waveform view or Multitrack view:

    Waveform view:

    • Choose File > New > Audio File and create a mono audio file.
    • Choose Effects > Generate > Speech.

    Multitrack view:

    • Position the playhead and select the track to insert the speech.
    • Choose Effects > Generate > Speech.
  2. In the Generate Speech dialog box, select the language, gender, and voice for the synthesized speech. On macOS and Windows, you can find additional voices in the following ways:

    • macOS: In the dialog box, click Settings. Choose System Voice > Customize. You can install voices and languages directly from Apple.

      You can also use embedded speech commands to create speech. See the Apple developer documentation on using embedded speech commands.

    Click OK.

    • Windows: You can download new voices and languages from Cepstral or NeoSpeech.

  3. Enter the text in the text entry field. Click Preview to hear the speech in the current voice.

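    You can control pronunciation and other parameters by using tags in the text. For example, on macOS, type [[emph +]] before a word or phrase to speak it with emphasis. Because tags are engine-specific, text tagged for macOS does not work unchanged on Windows.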
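
If you prepare narration scripts outside Audition, a small helper can insert the emphasis tag for you before you paste the text into the text entry field. The following Python sketch is only an illustration: the function name `add_emphasis` and the idea of marking words from a list are assumptions, not part of Audition, and the output is meant for the macOS tag format only.

```python
import re
from typing import Iterable

# macOS embedded speech command quoted in the step above.
EMPHASIS_TAG = "[[emph +]]"


def add_emphasis(script: str, words: Iterable[str]) -> str:
    """Insert the emphasis tag before each whole-word match (case-insensitive).

    Hypothetical helper for preparing text outside Audition.
    """
    targets = list(words)
    if not targets:
        return script
    # Try longer words first so the alternation prefers the longest match.
    targets.sort(key=len, reverse=True)
    alternatives = "|".join(re.escape(word) for word in targets)
    pattern = re.compile(rf"\b({alternatives})\b", re.IGNORECASE)
    # Keep the matched text as typed and put the tag in front of it.
    return pattern.sub(lambda m: f"{EMPHASIS_TAG} {m.group(1)}", script)


if __name__ == "__main__":
    print(add_emphasis("Save your session before you close Audition.", ["before"]))
```

Paste the resulting text into the Generate Speech dialog box and click Preview to check how the selected voice renders the tagged words.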