pyVideoTrans Open Source Video Translation

Using deepgram.com Speech Recognition API

As of v2.92, pyVideoTrans supports the deepgram.com speech recognition API. Deepgram is an overseas AI service that gives each new account a free $200 credit on registration, which is enough for a good amount of use.

Open the website https://deepgram.com/, register, and log in to the console: https://console.deepgram.com/

After logging in, click the big green "Create API Key" button in the console.

A dialog window then pops up.

Type a few English letters in the first text box as the key name, then click the button at the bottom. The secret key (SK) is displayed next. Remember to copy it.
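For readers who want to see how the copied key is used outside the software, here is a minimal sketch that builds (but does not send) a request to Deepgram's pre-recorded transcription endpoint. It is an illustration, not pyVideoTrans's own code. The function name `build_deepgram_request`, the `nova-2` model choice, the `audio/wav` content type, and the chosen query parameters are assumptions made for this example; check them against the Deepgram API reference before relying on them.

```python
import urllib.parse
import urllib.request

# Deepgram's REST endpoint for pre-recorded audio.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_deepgram_request(api_key, audio_bytes, language="en", model="nova-2"):
    """Build, without sending, a transcription request for raw audio bytes.

    Hypothetical helper for illustration; model and parameters are assumptions.
    """
    query = urllib.parse.urlencode({
        "model": model,          # assumed model name
        "language": language,    # language code of the audio
        "punctuate": "true",     # ask Deepgram to add punctuation
        "utterances": "true",    # ask for utterance-level segments
    })
    return urllib.request.Request(
        f"{DEEPGRAM_LISTEN_URL}?{query}",
        data=audio_bytes,
        headers={
            # The copied key goes here, prefixed with the word "Token".
            "Authorization": f"Token {api_key}",
            "Content-Type": "audio/wav",  # assumes WAV input
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` and decoding the JSON response would perform the actual transcription and consume account credit. A 401 response at that point usually means the key was copied incorrectly.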

In pyVideoTrans, open the Menu--Speech Recognition Settings--Deepgram window and fill in the following settings:

API Key: Paste the key copied in the previous step.
Silence Duration: The default value of 200 (200 ms) is usually fine. If the speech in the video is fast, reduce it to about 150. If the speech is slow and contains many pauses, increase it to 500 or 800.
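To make the effect of the Silence Duration value concrete, the sketch below splits a list of word-level timestamps into subtitle segments wherever the gap between two consecutive words reaches the threshold. This is only an illustration of the general idea, not the actual pyVideoTrans implementation. The `Word` class and the `split_on_silence` function are hypothetical names, and the timestamps are assumed to be in seconds.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # start time in seconds
    end: float    # end time in seconds


def split_on_silence(words, silence_ms=200):
    """Group words into (start, end, text) segments, cutting at long pauses."""
    threshold = silence_ms / 1000.0  # convert milliseconds to seconds
    segments = []
    current = []
    for word in words:
        # Start a new segment when the pause since the previous word
        # is at least as long as the silence threshold.
        if current and word.start - current[-1].end >= threshold:
            segments.append(current)
            current = []
        current.append(word)
    if current:
        segments.append(current)
    return [
        (seg[0].start, seg[-1].end, " ".join(w.text for w in seg))
        for seg in segments
    ]
```

With pauses of roughly 0.1 s and 0.4 s between three words, a threshold of 200 cuts only at the longer pause, while 500 keeps all three words in one segment. This is why a smaller value suits fast speech and a larger value suits slow speech with long pauses.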

Note: Deepgram's support for Chinese is limited. Both the subtitles that Deepgram segments and returns directly and the sentences re-segmented from word-level timestamps lack punctuation, which leads to unsatisfactory subtitle segmentation. To improve this, pyVideoTrans adds an Alibaba Cloud Chinese punctuation restoration model to re-segment the sentences. For Chinese audio, select "Re-segment Chinese Sentences" in the software interface.
