
The IMS Toucan TTS project claims to support speech synthesis in over 7,000 languages. I downloaded and tried it: it works, but the quality is only average. If your requirements are modest, it is usable.

Unlike edge-tts, this project does not offer several fixed voices to choose from. Instead, each language has one fixed voice, which you can adjust (random voice, seed, gender, and so on) through parameters such as prosody_creativity, duration_scaling_factor, voice_seed, and emb1.

Project address: https://github.com/DigitalPhonetics/IMS-Toucan

Local Deployment Method

You can deploy from source by following the instructions in the project's GitHub repository: https://github.com/DigitalPhonetics/IMS-Toucan

I've also created a Windows integration package for those who don't want to go through the trouble.

Download the integration package from Baidu Netdisk and extract it to a directory, such as D:/python/IMS-Toucan.

Integration package download address: https://pan.baidu.com/s/1om62tz-fmq4o5sijmHmnMQ?pwd=dck6

After extraction, you will find an espeak-ng-X64.msi file. Installing it is optional, but it improves the audio quality. To install it, double-click the file and accept the default settings.

image.png

The directory contains three .bat files. Double-click the one you need to run it.

image.png

Start api and simple webpage.bat:

Double-clicking this file starts an API service and opens a simple web page. The API can be connected to the custom TTS interface of the video translation software. It supports only 24 commonly used languages.

image.png

The API address is http://127.0.0.1:5020/api. Enter it in the custom TTS interface of the video translation software.
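Before entering the address anywhere else, it can help to confirm that the service is actually listening. The following is a minimal sketch that only checks whether something accepts TCP connections on 127.0.0.1:5020. It does not call the API itself, because the request format is not described here. The helper name is_port_open is made up for this illustration.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host and port are taken from the API address http://127.0.0.1:5020/api.
    if is_port_open("127.0.0.1", 5020):
        print("Something is listening on port 5020.")
    else:
        print("Nothing is listening on port 5020 yet; is the bat file still starting?")
```

A True result only means the port is open. It does not prove that the program behind it is the TTS service.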

Start complete webpage ui.bat:

Double-clicking this file starts the web interface that ships with IMS Toucan, which supports speech synthesis in all languages. Feel free to explore it yourself.

image.png

If the browser does not open the page automatically, wait until the terminal shows the output in the following image, then copy the address and open it in the browser manually.

image.png

Start advanced QT-ui.bat:

Double-clicking this file starts the built-in desktop interface. This interface has not been localized into Chinese. If you are interested, you can explore it.

image.png

Notes

  1. At startup, the terminal window may print a large amount of output, as shown in the figure below. This is not an error and can be ignored.

image.png

  2. After startup, the API and the complete web UI automatically open the corresponding page in the browser, and the advanced QT UI automatically opens the desktop application.

  3. Sometimes a series of errors appears that mentions the Microsoft website https://docs.microsoft.com. In that case, close the window and run the .bat file again as administrator.

  4. The integration package includes its own models, but at startup it may check for model updates, which requires a connection to https://huggingface.co. If you cannot reach that site from your network, you need to set up your own proxy. If the word HTTPSConnect appears in the error, you need to use a global or system proxy.
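If an HTTPSConnect error appears, a small diagnostic script can show which proxy settings Python sees and whether huggingface.co is reachable at all. This is only a sketch using the standard library. The helper names describe_proxies and can_reach are hypothetical, and the script is not part of the integration package.

```python
import urllib.error
import urllib.request


def describe_proxies(proxies):
    """Return a one-line summary of a {scheme: proxy_url} mapping."""
    if not proxies:
        return "no system proxy detected"
    return ", ".join(f"{scheme} -> {url}" for scheme, url in sorted(proxies.items()))


def can_reach(url, timeout=10.0):
    """Return True if a request to `url` gets any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even though the status was an error.
        return True
    except OSError:
        # URLError, timeouts and refused connections are all OSError subclasses.
        return False
```

Calling describe_proxies(urllib.request.getproxies()) summarizes the proxy settings Python picks up from the environment or the system, and can_reach("https://huggingface.co") reports whether the site responds through them. If the second call returns False, the proxy is probably not applied system-wide.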

Using in Video Translation Software

First, update the video translation software with the latest patch package. Download address: https://pyvideotrans.com

After starting the software, click Menu-TTS Settings-Custom TTS Interface and enter http://127.0.0.1:5020/api as the API address. The role list can be filled with any letters you like, such as a,b,c.

image.png

image.png

Once the test succeeds, you can start using it.

image.png