
This project provides a web UI and an API for the kokoro TTS project. It supports dubbing in 8 languages: Chinese, English, Japanese, French, Italian, Portuguese, Spanish, and Hindi.

Project address: https://github.com/jianchang512/kokoro-uiapi

Web Interface

Default UI address after startup: http://127.0.0.1:5066

  • Supports dubbing of text and SRT subtitles
  • Supports online preview and download of the generated audio
  • Supports aligning subtitles

Installation Method

Windows

On Windows 10/11, download the integrated package and double-click start.bat to start. GPU acceleration requires an NVIDIA graphics card with CUDA 12 installed.

Baidu Netdisk download address: https://pan.baidu.com/s/1jTB84E3-gaLqFrl32f4sDw?pwd=xnwp

GitHub download (models not included; downloading them online requires a VPN): https://github.com/jianchang512/kokoro-uiapi/releases/download/v0.1/kokoro-uiapi-noModels-v0.2.7z

Linux/macOS

First make sure that Python 3.8 or later is installed; 3.10 or 3.11 is recommended.

On Linux, install ffmpeg beforehand with apt install ffmpeg or yum install ffmpeg.

On macOS, install ffmpeg with brew install ffmpeg.

  1. Pull the source code: git clone https://github.com/jianchang512/kokoro-uiapi
  2. Create and activate a virtual environment:
     cd kokoro-uiapi
     python3 -m venv venv
     . venv/bin/activate
  3. Install dependencies: pip3 install -r requirements.txt
  4. Start the service: python3 app.py
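
After starting the service, it can be handy to confirm that something is actually listening on the default port before opening the browser. The following is a minimal sketch using only the Python standard library; the function name is_listening is a hypothetical helper and not part of this project, and the host and port are the defaults stated on this page.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 5066, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling is_listening() with no arguments checks 127.0.0.1:5066. A True result only shows that the port accepts connections, not that the models have finished loading.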

Use in pyVideoTrans

  1. Start this project first: double-click start.bat if you use the Windows integrated package, or run python3 app.py if you installed from source.

  2. Upgrade pyVideoTrans to v3.48 or later, open Menu -> TTS Settings -> Kokoro TTS, and enter the HTTP address http://127.0.0.1:5066

Compatible with OpenAI API

The API is compatible with the OpenAI TTS API.

Default API address after startup: http://127.0.0.1:5066/v1/audio/speech

Request method: POST
Request content type: application/json

{
  "input": "text to be dubbed",
  "voice": "dubbing role",
  "speed": 1.0
}

speed defaults to 1.0 if omitted.

On success, the response body is MP3 audio data.
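
The request described above can also be sent without any SDK. The following is a minimal sketch using only the Python standard library. The helper names build_speech_request and synthesize are hypothetical, and the sketch assumes the server accepts exactly the three JSON fields listed above and returns raw MP3 bytes in the response body.

```python
import json
import urllib.request

# Default API address stated above.
API_URL = "http://127.0.0.1:5066/v1/audio/speech"


def build_speech_request(text: str, voice: str, speed: float = 1.0) -> urllib.request.Request:
    """Build a POST request carrying the input, voice, and speed fields as JSON."""
    payload = json.dumps({"input": text, "voice": voice, "speed": speed}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, voice: str, out_path: str, speed: float = 1.0) -> None:
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, voice, speed)
    # The timeout value is an arbitrary choice for this sketch.
    with urllib.request.urlopen(request, timeout=120) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
```

With the server running, a call such as synthesize("Hello, dear friends", "af_bella", "test_http.mp3") would be expected to save the dubbed audio to test_http.mp3.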

OpenAI SDK Usage Example

from openai import OpenAI

# Point the SDK at the local kokoro-uiapi server instead of api.openai.com.
client = OpenAI(
    api_key='123456',
    base_url='http://127.0.0.1:5066/v1'
)

try:
    response = client.audio.speech.create(
        model='tts-1',
        input='Hello, dear friends',
        voice='zf_xiaobei',  # any role from the Role List
        response_format='mp3',
        speed=1.0
    )
    # Write the returned MP3 bytes to disk.
    with open('./test_openai.mp3', 'wb') as f:
        f.write(response.content)
    print("MP3 file saved successfully to test_openai.mp3")
except Exception as e:
    print(f"An error occurred: {e}")

Role List

English dubbing roles:

af_alloy
af_aoede
af_bella
af_jessica
af_kore
af_nicole
af_nova
af_river
af_sarah
af_sky
am_adam
am_echo
am_eric
am_fenrir
am_liam
am_michael
am_onyx
am_puck
am_santa
bf_alice
bf_emma
bf_isabella
bf_lily
bm_daniel
bm_fable
bm_george
bm_lewis

Chinese roles:

zf_xiaobei
zf_xiaoni
zf_xiaoxiao
zf_xiaoyi
zm_yunjian
zm_yunxi
zm_yunxia
zm_yunyang

Japanese roles:

jf_alpha
jf_gongitsune
jf_nezumi
jf_tebukuro
jm_kumo

French roles: ff_siwis

Italian roles: if_sara, im_nicola

Hindi roles: hf_alpha, hf_beta, hm_omega, hm_psi

Spanish roles: ef_dora, em_alex, em_santa

Portuguese roles: pf_dora, pm_alex, pm_santa
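
Judging from the lists above, each role name appears to start with a two-letter prefix: the first letter matches the language group and the second looks like a gender marker (f or m). The sketch below encodes that pattern so a script can pick a role by language. The mapping is inferred from this page only and is an assumption, not an official specification; describe_voice is a hypothetical helper.

```python
from typing import Tuple

# First letter -> language, inferred from the role lists above (assumption).
LANGUAGE_BY_PREFIX = {
    "a": "English",
    "b": "English",
    "z": "Chinese",
    "j": "Japanese",
    "f": "French",
    "i": "Italian",
    "h": "Hindi",
    "e": "Spanish",
    "p": "Portuguese",
}

# Second letter -> gender, also an inference from the naming pattern.
GENDER_BY_PREFIX = {"f": "female", "m": "male"}


def describe_voice(voice: str) -> Tuple[str, str]:
    """Return (language, gender) guessed from a role name such as 'zf_xiaobei'."""
    prefix, sep, name = voice.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"unexpected role name: {voice!r}")
    try:
        return LANGUAGE_BY_PREFIX[prefix[0]], GENDER_BY_PREFIX[prefix[1]]
    except KeyError:
        raise ValueError(f"unknown prefix in role name: {voice!r}") from None
```

For example, describe_voice("zf_xiaobei") yields ("Chinese", "female") under these assumptions, while a name with an unrecognized prefix raises ValueError.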

Proxy VPN

When deployed from source, the project needs to download the voice (timbre) .pt files from huggingface.co. Set up a global or system proxy in advance so that the site is reachable.

Alternatively, download the models in advance and extract them into the directory that contains app.py.

Model download address: https://github.com/jianchang512/kokoro-uiapi/releases/download/v0.1/moxing--jieya--dao--app.py--mulu.7z

Credit