Speech engines with Python (tutorial)


A computer system used to create artificial speech is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech. How can we use speech synthesis in Python?

Pyttsx
Pyttsx is a cross-platform speech library for Mac OS X, Windows and Linux. Each voice carries metadata such as age, gender, id, language and name, and you pick a voice by setting it on the engine. The speech engine comes with a large number of voices. Sadly, the default voice sounds very robotic.

Create a file named speech1.py.
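A minimal sketch of what speech1.py could look like, assuming the pyttsx3 package (the maintained Python 3 fork of pyttsx, installable with pip install pyttsx3) rather than the original pyttsx. The helper names make_engine, speak and describe_voices are illustrative and not part of the library; init, say, runAndWait, getProperty and setProperty are the library calls it relies on.

```python
# speech1.py: minimal text-to-speech sketch (assumes `pip install pyttsx3`)


def make_engine():
    """Create a speech engine using the platform's default driver."""
    import pyttsx3  # imported lazily so this module loads without it
    return pyttsx3.init()


def speak(engine, text, rate=None):
    """Speak `text` aloud and block until playback has finished."""
    if rate is not None:
        engine.setProperty("rate", rate)  # speaking rate in words per minute
    engine.say(text)      # queue the utterance
    engine.runAndWait()   # process the queue and wait for it to drain


def describe_voices(engine):
    """Return (id, name) pairs for the voices the engine reports."""
    return [(voice.id, voice.name) for voice in engine.getProperty("voices")]


if __name__ == "__main__":
    try:
        engine = make_engine()
        speak(engine, "The quick brown fox jumped over the lazy dog.")
        for voice_id, name in describe_voices(engine):
            print(voice_id, name)
    except (ImportError, OSError, RuntimeError) as exc:
        print("Speech engine unavailable:", exc)
```

To switch voices, pass one of the printed ids to engine.setProperty("voice", voice_id) before calling speak.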

Then run it with Python (python speech1.py).

Espeak
eSpeak is a compact open source software speech synthesizer for English and other languages, for Linux and Windows. On Debian or Ubuntu, for example, it can be installed with sudo apt-get install espeak.

Create a file named speech2.py.
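A minimal sketch of speech2.py, assuming the espeak program is on the PATH and calling it through subprocess. The helper names espeak_command and speak, and the default voice and speed, are illustrative choices.

```python
# speech2.py: drive the espeak command-line program from Python
import shutil
import subprocess


def espeak_command(text, voice="en", speed=150):
    """Build the argument list for one espeak invocation."""
    # -v selects the voice, -s sets the speed in words per minute.
    return ["espeak", "-v", voice, "-s", str(speed), text]


def speak(text, **options):
    """Speak `text` with espeak; return False if it is missing or fails."""
    if shutil.which("espeak") is None:
        return False
    # Passing a list (no shell) avoids quoting problems in `text`.
    completed = subprocess.run(espeak_command(text, **options))
    return completed.returncode == 0


if __name__ == "__main__":
    if not speak("The quick brown fox jumped over the lazy dog."):
        print("espeak did not run; is it installed and on PATH?")
```

Building the command as a list keeps apostrophes and quotes in the text from being interpreted by a shell.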

It is very easy to use, but like pyttsx it sounds very robotic.

GoogleTTS
I found a script on GitHub that uses the Google speech engine. The script comes with many options and does not speak directly; instead, it saves the speech to an mp3 file. We added a command to play the mp3 automatically.
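The same save-then-play idea can be sketched with the gTTS package in place of the GitHub script. This is an assumption, and the script's own options may differ. The file name speech3.py, the helper names and the mpg123 player are likewise hypothetical choices; any command-line mp3 player would do.

```python
# speech3.py (hypothetical name): save Google TTS output to an mp3, then play it
import shutil
import subprocess
import sys


def save_speech(text, path="speech.mp3", lang="en"):
    """Fetch speech for `text` from Google and write it to `path` (needs network)."""
    from gtts import gTTS  # assumed dependency: pip install gTTS
    gTTS(text=text, lang=lang).save(path)
    return path


def player_command(path, player="mpg123"):
    """Build the command that plays `path`; the player program is an assumption."""
    return [player, path]


if __name__ == "__main__":
    words = sys.argv[1:]
    if not words:
        print("usage: python speech3.py TEXT...")
    else:
        mp3_path = save_speech(" ".join(words))
        command = player_command(mp3_path)
        if shutil.which(command[0]):
            subprocess.run(command)
        else:
            print("saved", mp3_path, "(no mp3 player found)")
```

The sketch takes the text to speak from its command-line arguments and falls back to only saving the file when no player is installed.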

Run with:

The voice is extremely natural. The only disadvantage is that you need to be connected to the Internet when running this script.

Conclusion
GoogleTTS is the most natural-sounding speech synthesis engine that we found, although the number of voices it offers is limited. The other TTS engines are simple to use, but they do not match its sound quality.
