pyVideoTrans Open Source Video Translation

The Help/About menu in the menu bar contains many useful links, such as model download addresses and CUDA configuration guides. Consult them when you run into problems.

1. Cannot Open After Double-Clicking sp.exe

The software is built on PySide6. The main interface uses many Qt components, so loading can be slow, taking anywhere from 5 seconds to 2 minutes. Please be patient.

If the interface still has not fully displayed after a few minutes, and you see only a black window instead of the startup screen, the program has probably hit an error. Check the console for error messages. If you are using the pre-packaged version, check whether you downloaded only the upgrade package; if so, download the complete package.

If you have tried everything above and the software still will not open, open the latest log file in the logs folder and read the error information. You can also attach that file to a GitHub Issue or to a question on bbs.pyvideotrans.com.
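To locate the latest log file quickly, a small script along these lines may help. It is only a sketch: it assumes the logs folder sits next to sp.exe and that the newest file is the one with the most recent modification time. The function names latest_log and tail are made up for this example and are not part of the software.

```python
from pathlib import Path
from typing import Optional


def latest_log(logs_dir: str = "logs") -> Optional[Path]:
    """Return the most recently modified file in logs_dir, or None if there is none."""
    folder = Path(logs_dir)
    if not folder.is_dir():
        return None
    files = [p for p in folder.iterdir() if p.is_file()]
    if not files:
        return None
    # Assumption: "latest" means the newest modification time.
    return max(files, key=lambda p: p.stat().st_mtime)


def tail(path: Path, n: int = 30) -> str:
    """Return the last n lines of a text file, tolerating odd encodings."""
    lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
    return "\n".join(lines[-n:])
```

Calling tail(latest_log("logs")) from the software's directory would print the end of the newest log, which is usually where the error appears.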

2. Video Clarity Reduced

Some steps in the translation process involve transcoding, and transcoding inevitably causes some quality loss. To minimize the loss, you can do the following:

Use an original video that is an mp4 encoded with libx264
In Menu--Tools--Advanced Settings, make the following settings

3. Error in Translation Stage

If a red error message appears during translation, after the subtitles have been recognized, the cause is usually a network connection error or an account problem with the translation channel in use.

If you are using a channel such as Google, Microsoft, or Gemini, the cause is most likely a network connection problem. You need a working proxy (a VPN or similar tool for reaching blocked sites), and you must enter the proxy address that tool provides into the network proxy address text box.

If you are already using such a tool but still get a network connection error, the proxy address you entered may be wrong or unavailable. Enter the HTTP proxy IP and port number exactly as the tool provides them.
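To check whether a proxy address actually works outside the browser, something like the following sketch can be used. It is not how the software itself connects. The address 127.0.0.1:7890 in the usage note is only a placeholder, and the helper names normalize_proxy and proxy_works are invented for this example.

```python
import http.client
import urllib.request


def normalize_proxy(address: str) -> str:
    """Add the http:// scheme if the user typed only ip:port."""
    address = address.strip()
    if "://" not in address:
        address = "http://" + address
    return address


def proxy_works(proxy: str, test_url: str = "https://www.google.com",
                timeout: float = 10.0) -> bool:
    """Return True if test_url can be fetched through the given proxy."""
    proxy = normalize_proxy(proxy)
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    )
    try:
        with opener.open(test_url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, http.client.HTTPException):
        # URLError, timeouts and refused connections are all OSError subclasses.
        return False
```

For example, proxy_works("127.0.0.1:7890") returning False suggests the IP or port entered in the software is wrong, or the proxy tool is not running.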

If you have confirmed that the proxy works but the error persists, the account is probably the problem. For example, Gemini is not available in every country; try switching the proxy node to another country.

The Gemini, ChatGPT, and AzureGPT channels generally limit request frequency. If you have ruled out network and account problems, you may have exceeded that limit. In that case, open Menu--Tools--Advanced Settings and set "Pause time after translation/s" to 30 or more.
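The idea behind a pause setting like this can be illustrated with a short sketch. This is not the software's actual code: translate stands for any function that sends one batch of subtitles to a translation channel, and translate_with_pause is a hypothetical name.

```python
import time
from typing import Callable, List, Sequence


def translate_with_pause(batches: Sequence[str],
                         translate: Callable[[str], str],
                         pause_seconds: float = 30.0,
                         sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Translate batches one by one, waiting between requests to stay under a rate limit."""
    results = []
    for index, batch in enumerate(batches):
        if index > 0:
            # Wait before every request except the first one.
            sleep(pause_seconds)
        results.append(translate(batch))
    return results
```

A longer pause makes the whole translation slower, but each request is less likely to be rejected for exceeding the frequency limit.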

3. Voice Recognition Accuracy Too Low

In faster mode and openai mode, a larger model improves accuracy. tiny is a small model with poor recognition, and large-v3 is the largest model with the best recognition. All models can be downloaded from https://pyvideotrans.com/model
If the original video is spoken in Chinese, try zh_recogn, which works better for Chinese. Instructions for use: https://pyvideotrans.com/zh_recogn.html
Select "Keep background sound". This denoises the audio before recognition, which improves the result. Do not select it if the video is very large.

4. Model Download Address

Model download address https://pyvideotrans.com/model

5. Is it Available on Win7

Windows 7 is not supported

6. Prompt Missing python310.dll

You probably downloaded only the upgrade patch package, which cannot be used on its own. Download the complete 1.9 GB package first and decompress it; you can then download the patch package and extract it over the complete package.

7. Error in Merging Stage `ffprobe {}`

The most likely cause is spaces or single or double quotes in the original video's name. Try renaming the original video. For example, suppose the original name is D:/UNSW/2024 T2/BIOS 2061/Week 5 Amphibians, reptiles, and birds/_video_out/BIOS2061-5246_00069- Lecture 13 - Birds 1 'Origin of Birds' - Prof. Richard Kingsford- Part 2 - UNSW##BIOS2061-5246_00069- Lecture 13 - Birds 1 'Origin of Birds' - Prof. Richard Kingsford- Part 2 - UNSW.mp4. This name is very complicated and contains spaces, single quotes, and other symbols, which easily cause errors during processing. Remove the spaces and single quotes.
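Renaming can also be done with a script. The sketch below replaces every run of characters other than English letters, digits, dots, and underscores with a single underscore. The function name safe_name and the fallback name "video" are choices made for this example only.

```python
import re
from pathlib import Path


def safe_name(filename: str) -> str:
    """Return a file name containing only letters, digits, dots and underscores."""
    path = Path(filename)
    # Replace each run of disallowed characters (spaces, quotes, #, etc.) with "_".
    stem = re.sub(r"[^A-Za-z0-9._]+", "_", path.stem).strip("_")
    # Fall back to a generic name if nothing usable is left.
    return (stem or "video") + path.suffix


def rename_safely(video: str) -> Path:
    """Rename the file in place and return its new path."""
    src = Path(video)
    dst = src.with_name(safe_name(src.name))
    if dst != src:
        src.rename(dst)
    return dst
```

Note that this cleans only the file name. If the folders in the path also contain spaces or quotes, as in the example above, it is simplest to move the video to a short path first.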

Does it Support Docker Deployment

Not supported

Can it Recognize Subtitle Text in Videos, i.e. OCR Recognition

This software works by recognizing human speech in the video and converting it into text subtitles. It does not support recognizing subtitles by OCR

Can it be Called Through the http api Interface

Not currently, but this function may be added later

Can New Languages be Added

No. Subtitle recognition depends on the whisper model, which supports a limited set of languages. Unsupported languages cannot be recognized

Where to Download the Software

https://pyvideotrans.com/downpackage.html

Where to Download the Model

https://pyvideotrans.com/model.html

CUDA has been Installed, but it Still Cannot be Used

Possible reasons:

1: The built-in CUDA support requires version 11.8 or above. Check whether your CUDA version is too low
2: The graphics card driver is too old and needs to be updated
3: cuDNN is not installed
4: The graphics card is not an NVIDIA card or is incompatible
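The first two points can be checked roughly with the nvidia-smi tool that ships with the NVIDIA driver. The sketch below is an illustration, not part of the software. It relies on the "CUDA Version" field in the nvidia-smi header, which reports the highest CUDA version the installed driver supports rather than the installed toolkit version. The function names are made up for this example.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_cuda_version(smi_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the 'CUDA Version: X.Y' field of nvidia-smi output."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_cuda_version() -> Optional[Tuple[int, int]]:
    """Run nvidia-smi and return the CUDA version the driver supports, if any."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # No NVIDIA driver on PATH: not an NVIDIA card, or the driver is missing.
        return None
    result = subprocess.run([exe], capture_output=True, text=True, timeout=30)
    return parse_cuda_version(result.stdout)


def meets_minimum(version: Optional[Tuple[int, int]],
                  minimum: Tuple[int, int] = (11, 8)) -> bool:
    """True if the version is known and at least the required minimum."""
    return version is not None and version >= minimum
```

If meets_minimum(driver_cuda_version()) is False, updating the graphics card driver is a reasonable first step. This check does not tell you whether cuDNN is installed.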

Does it Support Multi-Role Recognition and Dubbing

Not supported. The recognized subtitles do not distinguish between speakers or roles. You can assign roles manually through "Set Line Role"

CLI Command Line Mode Always Has Problems

CLI mode is updated later than the graphical interface. If it fails, please use an older version

Error in Translation Stage

Please change the translation channel or fill in the network proxy

The Software Freezes and Does Not Move After Double-Clicking, Stuck on the Startup Screen

The software is large, so please be patient. If it still has not opened after a long time, try the following:

Close anti-virus software, security software, etc.
Confirm that the software's path contains only English letters and digits, with no spaces, Chinese characters, or special symbols.

If the software fails to start after you extracted an upgrade package over it, download the complete package.

If you already have the complete package, please be patient. If it still has not started after more than 2 minutes, force close it and open it again.
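The path rule above can be checked with a few lines of Python. This is a sketch under one assumption: besides English letters and digits, it also accepts path separators, the drive colon, underscores, hyphens, and dots. The name offending_chars is invented for this example.

```python
import re
from typing import List

# Assumption: these characters are treated as safe in an install path.
ALLOWED = re.compile(r"[A-Za-z0-9_.\-/\\:]")


def offending_chars(path: str) -> List[str]:
    """Return the distinct characters in path that may cause startup problems."""
    found: List[str] = []
    for ch in path:
        if not ALLOWED.fullmatch(ch) and ch not in found:
            found.append(ch)
    return found
```

An empty list means the path looks safe. Otherwise, move the software to a simpler location such as D:/pyvideotrans and try again.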

What Translations are Supported

Currently supported: Microsoft Translator, Google Translate, Baidu Translate, Tencent Translate, DeepL Translate, ChatGPT Translate, AzureGPT Translate, Gemini Pro Translate, DeepLx Translate, OTT Offline Translate, FreeGoogle Translate, FreeChatGPT Translate

Connection error

The error "Connection error" means the network connection failed. If you have not entered a proxy in the software interface, enter one. The ChatGPT, Gemini, and Google APIs cannot be reached directly from China, so a proxy is required. Note that being able to open the corresponding website in a browser does not mean the software can reach it. Enter the correct proxy address in the network proxy input box

Whole all out of memory

The error "Whole all out of memory" means there is not enough video memory. Use a smaller model, such as tiny or small.

Requested float16 compute type, but the target device or backend do not support efficient float 16 computation

The error means that the current graphics card does not support this data type. To fix it, open Menu--Tools--Advanced Settings and find the line

CUDA data type

Change its value to int8_float16

Then restart the software and run the task again. If the error still occurs, change the value to

float32
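The same fallback order can be expressed in code. The sketch below is not the software's implementation: load is a hypothetical callable that builds a model for a given compute type and is assumed to raise ValueError when the device does not support that type.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def load_with_fallback(load: Callable[[str], T],
                       compute_types: Sequence[str] = ("float16", "int8_float16", "float32"),
                       ) -> Tuple[str, T]:
    """Try each compute type in order and return the first one that loads."""
    last_error: Optional[Exception] = None
    for compute_type in compute_types:
        try:
            return compute_type, load(compute_type)
        except ValueError as err:
            # Assumed failure mode: unsupported compute type on this device.
            last_error = err
    raise RuntimeError("no supported compute type found") from last_error
```

With faster-whisper, load could be something like lambda ct: WhisperModel("small", device="cuda", compute_type=ct). float32 is tried last because it uses the most video memory.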

How to Install

No installation is required. Download the complete package, unzip it, and double-click sp.exe to use it

Why is it Reported as a Virus or Blocked

This software is packaged with PyInstaller. It is not digitally signed or certified by anti-virus vendors, so it may be flagged as a false positive. Add it to your trusted whitelist or close the security software. Alternatively, deploy from source code

What TTS Dubbing is Supported

edgeTTS / Azure AI / GPT-SoVITS / clone-voice / elevenlabs

Source Code Deployment Instructions

By default, ctranslate2 version 4.x is used, which supports only CUDA 12.x. If your CUDA version is lower than 12 and you cannot upgrade to 12.x, run the following commands to uninstall ctranslate2 and install an older version

pip uninstall -y ctranslate2

pip install ctranslate2==3.24.0

You may encounter errors such as "xx module not found". In that case, open requirements.txt, search for the xx module, and remove the == and the version number that follow its name
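The manual edit to requirements.txt can also be scripted. This is a minimal sketch. The function name unpin is made up, and it assumes the module is pinned in the plain name==version form.

```python
import re


def unpin(requirements_text: str, module: str) -> str:
    """Remove the '==version' pin from one module, leaving other lines untouched."""
    pattern = re.compile(
        rf"^({re.escape(module)})\s*==\s*[^\s;]+",
        re.IGNORECASE | re.MULTILINE,
    )
    return pattern.sub(r"\1", requirements_text)
```

To apply it, read requirements.txt, pass its text and the module name from the error message to unpin, write the result back, and run pip install -r requirements.txt again.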

Is There Human Customer Service

No. This is free software with no income and no profit, so human customer service cannot be provided

Is it Free

This is a free and open source project. No fees are charged and it is free to use. Fees for the translation and TTS interfaces are charged by the respective API providers and have nothing to do with this project

Can it be Used Commercially

Individuals and companies can use it freely. However, if you want to integrate it into a commercial project, you must comply with the GPL-v3 open source license
