pyVideoTrans Open Source Video Translation

The Microsoft Edge browser has a powerful read-aloud feature that supports dozens of languages, each with several voices to choose from, and the results are excellent.

Building on this, a developer created a Python package called edge-tts, which lets programs use Microsoft's TTS service to dub text or subtitles. For example, the video translation software pyVideoTrans integrates edge-tts, and users can select it directly as a dubbing channel.

Unfortunately, Microsoft TTS has been heavily abused by users in China, and some people even resell the dubbing commercially. As a result, Microsoft has restricted access from China. If you call the service too frequently, a 403 error may occur, and the only way to continue is to switch IPs or connect through a stable overseas VPN.

So, is it possible to set up a simple relay service on an overseas server for your own use? Doing so would not only improve stability but also allow the interface to be made compatible with OpenAI TTS, so that it can be called directly from the OpenAI SDK.

The answer is yes. I recently took the time to create a Docker image that can be easily pulled and started on the server.

Once started, the service exposes an interface that is fully compatible with OpenAI TTS. You only need to change the API address to http://deployment_server_ip:7899/v1 to replace OpenAI TTS seamlessly. It can also be used directly in the video translation software.

The following sections explain in detail how to deploy and use it:
Step 1: Purchase and open a US server
Step 2: Allow port 7899 in the firewall
Step 3: Connect to the terminal and log in to the server
Step 4: Install Docker
Step 5: Pull the edge-tts-api image and start the API service

If you already have a server with Docker installed, you can skip to Step 5 and pull the image.

Step 1: Purchase and open a US server

A server in the US region is recommended, because there are few or no restrictions there. Choose a Linux operating system for the server. The following uses Debian 12 as an example, running on Yecaoyun, the provider I use personally. The reason for choosing it is simple: it is cheap and relatively stable, which is enough for a dubbing relay.

If you already have a European or American Linux server, you can skip this section and read the next section directly. If not, please continue reading.

Open this link to the Yecaoyun website, and select Product Services -> American AMD VPS in the top navigation bar.

Then, select any of the top four configurations, which should be sufficient.

I personally use the 29 yuan/month configuration.

Click the "Buy Now" button to enter the configuration page. Here, select the server operating system as Debian 12, set the server password, and keep the other settings as default.

After the payment is completed, wait a few minutes for the server to be created and started. Next, you need to open port 7899 in the firewall. You can only connect to the dubbing service once this port is open.

Step 2: Allow port 7899 in the firewall

If you plan to use a domain name with an Nginx reverse proxy, you do not need to open the port. If you are not familiar with these, it is simpler to open the port directly.

The firewall settings interface varies depending on the server and panel. The following uses my Yecaoyun panel as an example; other panels work in a similar way. If you already know how to open the port, you can skip this section and read the next section directly.

First, in "My Products and Services", click the product you just opened to enter the product information and management page.

On this page, you can find the server's IP address and password information.

Find "Firewall" under "Additional Tools" and click to open it.

Then release port 7899, as shown in the figure below:

Step 3: Connect to the terminal and log in to the server

If you already know how to connect to the terminal, or have other SSH terminals such as Xshell, you can skip this step and read the next section directly.

On the product information page, find Xterm.js Console and click it. Then operate as shown in the figure below:

When the above figure appears, press Enter a few times.

When Login: is displayed, enter root after it, and then press Enter.

Then Password: will appear, and you need to paste the password you copied (if you forget it, you can find it on the product information page).

Note: Do not paste the password with Ctrl+V or right-click, as this may insert extra spaces or line breaks and make the password incorrect.

Instead, press Shift+Insert to paste the password, which avoids a failed login even though the password itself is correct, and then press Enter.

The login is successful as shown in the figure below.

Step 4: Install Docker

If Docker is already installed on your server, or you know how to install it, you can skip this step.

Execute the following 5 commands in sequence, running each one only after the previous command has completed successfully. These commands apply only to Debian 12 servers.

After [root@xxxxxx~]#, right-click to paste the following command, and press Enter to execute after pasting.

Command 1: sudo apt update && sudo apt install -y apt-transport-https ca-certificates curl gnupg lsb-release

Command 2: curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

Command 3: echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

Command 4: sudo apt update && sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin

Command 5 (start the Docker service and enable it at boot): sudo systemctl start docker && sudo systemctl enable docker && sudo usermod -aG docker $USER

As before, right-click to paste this command, then press Enter.
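
To confirm that the installation worked before moving on, a small check like the following may help. It is only a sketch: the helper name tool_version is made up for illustration, and it assumes Python 3 is available on the server.

```python
import shutil
import subprocess


def tool_version(command):
    """Return the output of `<command> --version`, or None if the command is missing or fails."""
    # shutil.which returns None when the command is not on PATH.
    if shutil.which(command) is None:
        return None
    try:
        result = subprocess.run(
            [command, "--version"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        # Non-zero exit status, timeout, or a failure to launch the command.
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    version = tool_version("docker")
    print(version if version else "docker was not found or did not run")
```

If a version string is printed, Docker is installed and you can continue with Step 5.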

Step 5: Pull the edge-tts-api image and start the API service

Enter the following command to automatically pull the image and start the service. After the startup is successful, you can use it in video translation software or other tools that support OpenAI TTS.

docker run -p 7899:7899 jianchang512/edge-tts-api:latest

To stop the service, press Ctrl+C repeatedly until it exits.

Note that this command will run in the foreground. If you close the terminal window, the service will stop.
You can use the following command instead, which will start the service in the background. You can safely close the terminal after execution.
docker run -d -p 7899:7899 jianchang512/edge-tts-api:latest

If no error is reported, the service has started. To verify, open http://your_ip:7899/v1/audio/speech in your browser. A result similar to the figure below means the startup succeeded.
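
If the page does not load, it can be useful to find out whether port 7899 is reachable at all, which separates firewall problems from service problems. The following is a minimal sketch using only the Python standard library; the function name port_is_open is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run from your own computer, port_is_open("your_ip", 7899), with your_ip replaced by the server's address, should return True once the container is running and the firewall rule is in place. A False result suggests the port is blocked or the service is not running.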

Use in video translation software

Please upgrade the software to v3.40 before using this feature. Upgrade download address: https://pyvideotrans.com/downpackage

Open the menu and go to TTS Settings -> OpenAI TTS, then change the interface address to http://your_ip:7899/v1

The SK (API key) can be any value, as long as it is not empty. In the role list, enter the roles you want to use, separated by commas.
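
If you keep your preferred roles in a script or a text file, the comma-separated value for the role list can be generated rather than typed by hand. This is a small sketch; the helper name roles_field is hypothetical and not part of the software.

```python
def roles_field(roles):
    """Join role names into a single comma-separated string."""
    # Strip stray whitespace and skip empty entries.
    cleaned = [r.strip() for r in roles if r.strip()]
    # Drop duplicates while keeping the original order.
    unique = list(dict.fromkeys(cleaned))
    return ",".join(unique)


print(roles_field(["zh-CN-YunxiNeural", " en-US-AriaNeural", "zh-CN-YunxiNeural"]))
# prints: zh-CN-YunxiNeural,en-US-AriaNeural
```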

Available Roles

The following is a list of available roles. Please note that the text language and role must match.

Chinese roles:
    zh-HK-HiuGaaiNeural
    zh-HK-HiuMaanNeural
    zh-HK-WanLungNeural
    zh-CN-XiaoxiaoNeural
    zh-CN-XiaoyiNeural
    zh-CN-YunjianNeural
    zh-CN-YunxiNeural
    zh-CN-YunxiaNeural
    zh-CN-YunyangNeural
    zh-CN-liaoning-XiaobeiNeural
    zh-TW-HsiaoChenNeural
    zh-TW-YunJheNeural
    zh-TW-HsiaoYuNeural
    zh-CN-shaanxi-XiaoniNeural

English roles:
    en-AU-NatashaNeural
    en-AU-WilliamNeural
    en-CA-ClaraNeural
    en-CA-LiamNeural
    en-HK-SamNeural
    en-HK-YanNeural
    en-IN-NeerjaExpressiveNeural
    en-IN-NeerjaNeural
    en-IN-PrabhatNeural
    en-IE-ConnorNeural
    en-IE-EmilyNeural
    en-KE-AsiliaNeural
    en-KE-ChilembaNeural
    en-NZ-MitchellNeural
    en-NZ-MollyNeural
    en-NG-AbeoNeural
    en-NG-EzinneNeural
    en-PH-JamesNeural
    en-PH-RosaNeural
    en-SG-LunaNeural
    en-SG-WayneNeural
    en-ZA-LeahNeural
    en-ZA-LukeNeural
    en-TZ-ElimuNeural
    en-TZ-ImaniNeural
    en-GB-LibbyNeural
    en-GB-MaisieNeural
    en-GB-RyanNeural
    en-GB-SoniaNeural
    en-GB-ThomasNeural
    en-US-AvaMultilingualNeural
    en-US-AndrewMultilingualNeural
    en-US-EmmaMultilingualNeural
    en-US-BrianMultilingualNeural
    en-US-AvaNeural
    en-US-AndrewNeural
    en-US-EmmaNeural
    en-US-BrianNeural
    en-US-AnaNeural
    en-US-AriaNeural
    en-US-ChristopherNeural
    en-US-EricNeural
    en-US-GuyNeural
    en-US-JennyNeural
    en-US-MichelleNeural
    en-US-RogerNeural
    en-US-SteffanNeural

Japanese roles:
    ja-JP-KeitaNeural
    ja-JP-NanamiNeural

Korean roles:
    ko-KR-HyunsuNeural
    ko-KR-InJoonNeural
    ko-KR-SunHiNeural

French roles:
    fr-BE-CharlineNeural
    fr-BE-GerardNeural
    fr-CA-ThierryNeural
    fr-CA-AntoineNeural
    fr-CA-JeanNeural
    fr-CA-SylvieNeural
    fr-FR-VivienneMultilingualNeural
    fr-FR-RemyMultilingualNeural
    fr-FR-DeniseNeural
    fr-FR-EloiseNeural
    fr-FR-HenriNeural
    fr-CH-ArianeNeural
    fr-CH-FabriceNeural

German roles:
    de-AT-IngridNeural
    de-AT-JonasNeural
    de-DE-SeraphinaMultilingualNeural
    de-DE-FlorianMultilingualNeural
    de-DE-AmalaNeural
    de-DE-ConradNeural
    de-DE-KatjaNeural
    de-DE-KillianNeural
    de-CH-JanNeural
    de-CH-LeniNeural

Spanish roles:
    es-AR-ElenaNeural
    es-AR-TomasNeural
    es-BO-MarceloNeural
    es-BO-SofiaNeural
    es-CL-CatalinaNeural
    es-CL-LorenzoNeural
    es-ES-XimenaNeural
    es-CO-GonzaloNeural
    es-CO-SalomeNeural
    es-CR-JuanNeural
    es-CR-MariaNeural
    es-CU-BelkysNeural
    es-CU-ManuelNeural
    es-DO-EmilioNeural
    es-DO-RamonaNeural
    es-EC-AndreaNeural
    es-EC-LuisNeural
    es-SV-LorenaNeural
    es-SV-RodrigoNeural
    es-GQ-JavierNeural
    es-GQ-TeresaNeural
    es-GT-AndresNeural
    es-GT-MartaNeural
    es-HN-CarlosNeural
    es-HN-KarlaNeural
    es-MX-DaliaNeural
    es-MX-JorgeNeural
    es-NI-FedericoNeural
    es-NI-YolandaNeural
    es-PA-MargaritaNeural
    es-PA-RobertoNeural
    es-PY-MarioNeural
    es-PY-TaniaNeural
    es-PE-AlexNeural
    es-PE-CamilaNeural
    es-PR-KarinaNeural
    es-PR-VictorNeural
    es-ES-AlvaroNeural
    es-ES-ElviraNeural
    es-US-AlonsoNeural
    es-US-PalomaNeural
    es-UY-MateoNeural
    es-UY-ValentinaNeural
    es-VE-PaolaNeural
    es-VE-SebastianNeural

Arabic roles:
    ar-DZ-AminaNeural
    ar-DZ-IsmaelNeural
    ar-BH-AliNeural
    ar-BH-LailaNeural
    ar-EG-SalmaNeural
    ar-EG-ShakirNeural
    ar-IQ-BasselNeural
    ar-IQ-RanaNeural
    ar-JO-SanaNeural
    ar-JO-TaimNeural
    ar-KW-FahedNeural
    ar-KW-NouraNeural
    ar-LB-LaylaNeural
    ar-LB-RamiNeural
    ar-LY-ImanNeural
    ar-LY-OmarNeural
    ar-MA-JamalNeural
    ar-MA-MounaNeural
    ar-OM-AbdullahNeural
    ar-OM-AyshaNeural
    ar-QA-AmalNeural
    ar-QA-MoazNeural
    ar-SA-HamedNeural
    ar-SA-ZariyahNeural
    ar-SY-AmanyNeural
    ar-SY-LaithNeural
    ar-TN-HediNeural
    ar-TN-ReemNeural
    ar-AE-FatimaNeural
    ar-AE-HamdanNeural
    ar-YE-MaryamNeural
    ar-YE-SalehNeural

Bengali roles:
    bn-BD-NabanitaNeural
    bn-BD-PradeepNeural
    bn-IN-BashkarNeural
    bn-IN-TanishaaNeural

Czech roles:
    cs-CZ-AntoninNeural
    cs-CZ-VlastaNeural

Dutch roles:
    nl-BE-ArnaudNeural
    nl-BE-DenaNeural
    nl-NL-ColetteNeural
    nl-NL-FennaNeural
    nl-NL-MaartenNeural

Hebrew roles:
    he-IL-AvriNeural
    he-IL-HilaNeural

Hindi roles:
    hi-IN-MadhurNeural
    hi-IN-SwaraNeural

Hungarian roles:
    hu-HU-NoemiNeural
    hu-HU-TamasNeural

Indonesian roles:
    id-ID-ArdiNeural
    id-ID-GadisNeural

Italian roles:
    it-IT-GiuseppeNeural
    it-IT-DiegoNeural
    it-IT-ElsaNeural
    it-IT-IsabellaNeural

Kazakh roles:
    kk-KZ-AigulNeural
    kk-KZ-DauletNeural

Malay roles:
    ms-MY-OsmanNeural
    ms-MY-YasminNeural

Polish roles:
    pl-PL-MarekNeural
    pl-PL-ZofiaNeural

Portuguese roles:
    pt-BR-ThalitaNeural
    pt-BR-AntonioNeural
    pt-BR-FranciscaNeural
    pt-PT-DuarteNeural
    pt-PT-RaquelNeural

Russian roles:
    ru-RU-DmitryNeural
    ru-RU-SvetlanaNeural

Swahili roles:
    sw-KE-RafikiNeural
    sw-KE-ZuriNeural
    sw-TZ-DaudiNeural
    sw-TZ-RehemaNeural

Thai roles:
    th-TH-NiwatNeural
    th-TH-PremwadeeNeural

Turkish roles:
    tr-TR-AhmetNeural
    tr-TR-EmelNeural

Ukrainian roles:
    uk-UA-OstapNeural
    uk-UA-PolinaNeural

Vietnamese roles:
    vi-VN-HoaiMyNeural
    vi-VN-NamMinhNeural
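
Because the text language and the role must match, a script that picks roles automatically may want to check this first. Each role name in the list above begins with a language code (for example zh, en, ja), so a simple comparison is possible. The sketch below assumes that naming pattern holds; the function names are hypothetical, and roles with Multilingual in their name may be more flexible than this check suggests.

```python
def role_language(role):
    """Return the language code at the start of a role name, e.g. 'zh' for 'zh-CN-YunxiNeural'."""
    return role.split("-", 1)[0].lower()


def role_matches_text_language(role, text_language):
    """Check whether a role's language code equals the language code of the text.

    text_language may be a bare code ('en') or a locale ('en-US'); only the
    part before the first hyphen is compared.
    """
    return role_language(role) == text_language.split("-", 1)[0].lower()


print(role_matches_text_language("zh-CN-liaoning-XiaobeiNeural", "zh"))  # True
print(role_matches_text_language("en-US-AriaNeural", "ja"))              # False
```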

Use in OpenAI SDK

First install the openai library: pip install openai

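from openai import OpenAI

# The API key can be any non-empty string; replace your_ip with the server's IP address.
client = OpenAI(api_key='12314', base_url='http://your_ip:7899/v1')
with client.audio.speech.with_streaming_response.create(
    model='tts-1',
    voice='en-US-AriaNeural',  # the role must match the language of the input text
    input='Hello, dear friends',
    speed=1.0
) as response:
    # Stream the returned audio into a local MP3 file.
    with open('./test.mp3', 'wb') as f:
        for chunk in response.iter_bytes():
            f.write(chunk)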

Call directly using requests

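import requests

# Replace your_ip with the server's IP address; the body is sent as JSON.
res = requests.post(
    'http://your_ip:7899/v1/audio/speech',
    json={"voice": "en-US-AriaNeural", "input": "Hello, dear friends", "speed": 1.0},
)
res.raise_for_status()  # fail early if the service returned an error
with open('./test.mp3', 'wb') as f:
    f.write(res.content)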
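
If you would rather not install any third-party library, the same call can be sketched with the Python standard library alone. This is an illustration under assumptions, not the service's documented API: it assumes the endpoint lives at /audio/speech under the base URL (the address used for the browser check above) and accepts a JSON body with the same fields as the examples above. The function names build_speech_request and save_speech are hypothetical.

```python
import json
import urllib.request


def build_speech_request(base_url, text, voice, speed=1.0):
    """Build a POST request for the /audio/speech endpoint under base_url."""
    payload = {"input": text, "voice": voice, "speed": speed}
    return urllib.request.Request(
        base_url.rstrip("/") + "/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(base_url, text, voice, out_path, speed=1.0, timeout=60):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(base_url, text, voice, speed)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)


# Example (replace your_ip with the server's IP address before running):
# save_speech("http://your_ip:7899/v1", "Hello, dear friends", "en-US-AriaNeural", "./test.mp3")
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and body before anything is sent.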
