This repository contains a Python script that transcribes and translates live audio from a Twitch stream. The script uses OpenAI's Whisper model for transcription and Hugging Face's MarianMT models for translation. The source and target languages are chosen at run time. It has been tested with 20 languages so far and should work with any language pair that has a Helsinki-NLP model on Hugging Face.
In short, if you speak X and want to understand a stream where the streamer speaks Y, this will do that.
If you like this repo, star it to help others find it!
- Features
- Requirements
- Installation
- Setting Up a Virtual Environment
- Configuration
- Usage
- Adding a Language
- How It Works
- CUDA vs CPU
- Troubleshooting
- Exceptions Handling
- Contributing
- Contact
- License
- Transcribes live audio from a specified Twitch channel.
- Translates transcribed text from a source language to a target language.
- Supports dynamic language selection for both source and target languages.
- Uses the Whisper model for transcription and MarianMT models for translation.
- Logs system messages and errors for easy debugging.
- Python 3.7 or higher
- ffmpeg
- pip
- Clone the Repository
git clone https://github.com/gorgarp/TwitchTranslate.git
cd TwitchTranslate
- Install FFmpeg
- Windows: Download and install from the official FFmpeg website.
- macOS: Use Homebrew
brew install ffmpeg
- Linux: Use your package manager
sudo apt-get install ffmpeg
Using a virtual environment is one approach to running the script. This method keeps your dependencies isolated from your system Python environment.
- Create a Virtual Environment
python -m venv myenv
- Activate the Virtual Environment
- Windows
myenv\Scripts\activate
- macOS/Linux
source myenv/bin/activate
- Install the Required Python Packages
pip install -r requirements.txt
- Set Up Twitch API Token
- Go to the Twitch Developer Portal.
- Register your application to get the CLIENT_ID and CLIENT_SECRET.
- Replace YOUR_TWITCH_CLIENT_ID and YOUR_TWITCH_CLIENT_SECRET in the script with your actual Twitch API credentials.
- Configure the Twitch Channel
- Replace YOUR_TWITCH_CHANNEL_NAME with the name of the Twitch channel you want to transcribe and translate.
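A forgotten placeholder is an easy mistake to make during configuration. The following is a minimal sketch of a check for it, not part of the script itself: the helper name `unconfigured_settings` and the setting name `CHANNEL_NAME` are assumptions made for illustration, while the placeholder strings are the ones named above.

```python
# Placeholder strings named in the configuration steps.
# The key CHANNEL_NAME is an assumed name; the script may call it something else.
PLACEHOLDERS = {
    "CLIENT_ID": "YOUR_TWITCH_CLIENT_ID",
    "CLIENT_SECRET": "YOUR_TWITCH_CLIENT_SECRET",
    "CHANNEL_NAME": "YOUR_TWITCH_CHANNEL_NAME",
}


def unconfigured_settings(settings):
    """Return the names of settings that are empty or still hold a placeholder."""
    return [
        name
        for name, placeholder in PLACEHOLDERS.items()
        if settings.get(name, "") in ("", placeholder)
    ]


# Example: the client secret was never filled in.
example = {
    "CLIENT_ID": "abc123",
    "CLIENT_SECRET": "YOUR_TWITCH_CLIENT_SECRET",
    "CHANNEL_NAME": "some_channel",
}
print(unconfigured_settings(example))  # ['CLIENT_SECRET']
```

Running a check like this before authenticating gives a clearer message than a failed API request would.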
- Run the Script
python transcribe_translate.py <source_lang> <target_lang>
- Example:
python transcribe_translate.py es en # Translates Spanish to English
python transcribe_translate.py pl en # Translates Polish to English
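For readers curious how the two positional arguments might be read, here is a minimal sketch using the standard library's `argparse`. It is an illustration only: the function name `parse_args` and the lower-casing of the codes are assumptions, and the actual script may parse its arguments differently.

```python
import argparse


def parse_args(argv=None):
    """Parse <source_lang> and <target_lang> (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="transcribe_translate.py",
        description="Transcribe a Twitch stream and translate the text.",
    )
    parser.add_argument("source_lang", help="language spoken on the stream, e.g. es")
    parser.add_argument("target_lang", help="language to translate into, e.g. en")
    args = parser.parse_args(argv)
    # Assumption: normalise case so "ES" and "es" are treated the same.
    return args.source_lang.lower(), args.target_lang.lower()


# Passing the arguments explicitly keeps the example independent of sys.argv.
source_lang, target_lang = parse_args(["es", "en"])
print(f"Translating {source_lang} -> {target_lang}")
```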
- Confirmed Languages
- The script has been confirmed to work with the following languages:
"en", "fr", "de", "es", "it", "nl", "sv", "pl", "pt", "ru", "zh", "ja", "ko", "ar", "tr", "da", "fi", "no", "cs", "el"
Note: These have been tested, but the script should work with any language pair published by Helsinki-NLP on Hugging Face.
- Add Language Pair to Exceptions List
- If you find a language pair on Hugging Face that does not follow the standard format Helsinki-NLP/opus-mt-{source_lang}-{target_lang}, you need to add an exception.
- Update the exceptions dictionary in the script with the new language pair and the corresponding model name. (See Exceptions Handling)
- Transcription
- The script captures live audio from a specified Twitch channel using FFmpeg.
- It uses the Whisper model to transcribe the audio into text.
- Translation
- The detected language of the transcribed text is checked against the specified source language.
- If it matches, the text is translated into the target language using MarianMT.
- The translated text is printed to the console.
- System Messages and Error Handling
- The script logs system messages such as model loading and errors for easy debugging and monitoring.
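The capture, transcription, and translation steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the script's actual code: the helper names, the 5-second chunk length, and the particular FFmpeg flags are choices made for illustration. The Whisper and MarianMT calls are passed in as plain callables so that the control flow can be read, and tested, without loading any model.

```python
import subprocess

SAMPLE_RATE = 16000    # assumption: 16 kHz mono PCM, the rate Whisper operates on
BYTES_PER_SAMPLE = 2   # 16-bit signed samples (FFmpeg format "s16le")


def build_ffmpeg_command(stream_url):
    """Build an FFmpeg command that writes the stream's audio as raw PCM to stdout."""
    return [
        "ffmpeg",
        "-loglevel", "error",      # only report errors
        "-i", stream_url,          # input: the stream URL
        "-vn",                     # drop the video track
        "-ac", "1",                # downmix to mono
        "-ar", str(SAMPLE_RATE),   # resample
        "-f", "s16le",             # raw 16-bit little-endian PCM
        "pipe:1",                  # write to stdout
    ]


def read_audio_chunks(stream_url, seconds=5):
    """Yield fixed-length chunks of raw audio until FFmpeg stops producing data."""
    chunk_bytes = seconds * SAMPLE_RATE * BYTES_PER_SAMPLE
    process = subprocess.Popen(build_ffmpeg_command(stream_url), stdout=subprocess.PIPE)
    try:
        while True:
            chunk = process.stdout.read(chunk_bytes)
            if not chunk:
                break
            yield chunk
    finally:
        process.kill()
        process.wait()


def process_chunk(chunk, source_lang, transcribe, translate):
    """Translate one chunk only if its detected language matches source_lang.

    transcribe(chunk) must return (text, detected_lang); translate(text) returns a string.
    Both are stand-ins for the Whisper and MarianMT calls.
    """
    text, detected_lang = transcribe(chunk)
    if not text.strip() or detected_lang != source_lang:
        return None
    return translate(text)


# Stand-in callables show the gating without any model or network access.
fake_transcribe = lambda chunk: ("hola", "es")
fake_translate = lambda text: "hello" if text == "hola" else text
print(process_chunk(b"", "es", fake_transcribe, fake_translate))  # hello
print(process_chunk(b"", "pl", fake_transcribe, fake_translate))  # None
```

Skipping chunks whose detected language differs from the source language means speech in another language is left untranslated rather than sent to a model for the wrong language pair.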
The script can run on either CUDA (GPU) or CPU. Using CUDA significantly improves the performance and speed of both transcription and translation.
- Checking CUDA Availability
- The script automatically checks if CUDA is available and uses it if possible:
import logging
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
logging.info(f"Using device: {device}")
- Installing CUDA (if needed)
- Windows:
- Download and install the NVIDIA CUDA Toolkit.
- Add CUDA to your PATH (PowerShell; the $env:Path change applies only to the current session):
[Environment]::SetEnvironmentVariable("CUDA_PATH", "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.5", "User")
$env:Path += ";C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.5\bin"
- Reboot your system to ensure the changes take effect.
- Verify the installation by running:
nvcc --version
Note: The above commands reference CUDA version 12.5. If you install a different version, adjust the paths accordingly.
- macOS: CUDA is not supported on macOS.
- Linux:
- Download and install the NVIDIA CUDA Toolkit.
- Add CUDA to your PATH:
export PATH=/usr/local/cuda-12.5/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.5/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
- Verify the installation by running:
nvcc --version
Note: The above commands reference CUDA version 12.5. If you install a different version, adjust the paths accordingly.
- Installing PyTorch with CUDA Support
- Install PyTorch with CUDA support:
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117
Note: The above command installs a PyTorch build for CUDA 11.7 (cu117), which differs from the 12.5 toolkit paths referenced earlier. Ensure the build you choose is compatible with your CUDA installation and GPU driver.
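After installing, it can help to confirm what PyTorch actually sees. The following is a minimal sketch of such a check; the function name `describe_torch_device` is an assumption made for illustration, and it only uses `torch.cuda.is_available()`, `torch.cuda.get_device_name()`, and `torch.version.cuda`.

```python
def describe_torch_device():
    """Return a short description of the device PyTorch would use (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if torch.cuda.is_available():
        name = torch.cuda.get_device_name(0)
        # torch.version.cuda is the CUDA version the installed build was compiled against.
        return f"cuda ({name}, PyTorch built for CUDA {torch.version.cuda})"
    return "cpu (no usable CUDA device found)"


print(describe_torch_device())
```

If this reports `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build or the GPU driver is the first thing to check.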
- Authentication: The script authenticates with the Twitch API using the provided client ID and secret. It obtains an access token required for making API requests.
- Fetching Stream Metadata: The script fetches metadata for the specified Twitch channel to check if the channel is live.
- Getting Stream URL: The script uses Streamlink to get the best quality stream URL from the Twitch channel.
- Capturing Audio: The script uses FFmpeg to capture audio from the Twitch stream.
- Transcribing Audio: The Whisper model is used to transcribe the captured audio into text.
- Translating Text: The script detects the language of the transcribed text and translates it into the target language using MarianMT if the language matches the specified source language.
- Output: The translated text is printed to the console.
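The authentication and live-check steps above can be sketched with the standard library alone. This is a minimal sketch, assuming the Twitch client-credentials token endpoint and the Helix `streams` endpoint; the function names are hypothetical and the script itself may use a different HTTP library or structure. Fetching the stream URL with Streamlink is not shown.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://id.twitch.tv/oauth2/token"
STREAMS_URL = "https://api.twitch.tv/helix/streams"


def token_request_params(client_id, client_secret):
    """Form fields for an app access token (client credentials grant)."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "grant_type": "client_credentials",
    }


def helix_headers(client_id, access_token):
    """Headers sent with Helix API requests."""
    return {"Client-Id": client_id, "Authorization": f"Bearer {access_token}"}


def is_live(streams_response):
    """The streams endpoint returns an empty "data" list when the channel is offline."""
    return len(streams_response.get("data", [])) > 0


def get_access_token(client_id, client_secret):
    """POST the credentials and return the access token (performs a network call)."""
    body = urllib.parse.urlencode(token_request_params(client_id, client_secret)).encode()
    with urllib.request.urlopen(urllib.request.Request(TOKEN_URL, data=body)) as resp:
        return json.load(resp)["access_token"]


def channel_is_live(channel, client_id, access_token):
    """Query the streams endpoint for one channel (performs a network call)."""
    url = STREAMS_URL + "?" + urllib.parse.urlencode({"user_login": channel})
    request = urllib.request.Request(url, headers=helix_headers(client_id, access_token))
    with urllib.request.urlopen(request) as resp:
        return is_live(json.load(resp))


# The parsing logic can be exercised without any network access.
print(is_live({"data": [{"type": "live"}]}))  # True
print(is_live({"data": []}))                  # False
```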
- Common Issues:
- Ensure FFmpeg is installed and added to your system's PATH.
- Ensure you have the correct client ID, client secret, and Twitch channel name in the script.
- Verify CUDA installation if using GPU for better performance.
- Logs and Debugging:
- Check the logs for any error messages or system messages to identify issues.
- The script logs system messages and errors for easy debugging and monitoring.
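The first troubleshooting item can be automated with a start-up check. The following is a minimal sketch that looks for FFmpeg on the PATH and logs the result; the logger name `twitch_translate`, the log format, and the function names are assumptions made for illustration rather than what the script uses.

```python
import logging
import shutil

logger = logging.getLogger("twitch_translate")  # hypothetical logger name


def configure_logging(level=logging.INFO):
    """Send timestamped messages to the console."""
    logging.basicConfig(level=level, format="%(asctime)s %(levelname)s %(message)s")
    logger.setLevel(level)
    return logger


def check_ffmpeg(which=shutil.which):
    """Return True if an ffmpeg executable is on the PATH, logging either way.

    The `which` parameter exists only so the lookup can be replaced in tests.
    """
    path = which("ffmpeg")
    if path is None:
        logger.error("FFmpeg was not found on PATH; install it or add it to PATH.")
        return False
    logger.info("Using FFmpeg at %s", path)
    return True


configure_logging()
check_ffmpeg()
```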
- The script uses a default format Helsinki-NLP/opus-mt-{source_lang}-{target_lang} for loading models.
- Some language pairs do not follow this format or may not exist under this naming convention. For these, specific exceptions are added.
- Example: For Portuguese to English (pt-en), the script uses Helsinki-NLP/opus-mt-mul-en as the model name.
- The exceptions are defined in a dictionary and checked during model loading.
Note: The available language pairs are listed under Helsinki-NLP on Hugging Face.
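The lookup described above might look like the following. This is a minimal sketch: the dictionary name `EXCEPTIONS`, its tuple keys, and the helper `resolve_model_name` are assumptions, and the script's own dictionary may be keyed differently. Only the pt-en entry comes from the text above.

```python
# Language pairs whose model does not follow the default naming format.
# Assumption: keys are (source_lang, target_lang) tuples.
EXCEPTIONS = {
    ("pt", "en"): "Helsinki-NLP/opus-mt-mul-en",
}


def resolve_model_name(source_lang, target_lang):
    """Return the exception entry if one exists, else the default model name."""
    default = f"Helsinki-NLP/opus-mt-{source_lang}-{target_lang}"
    return EXCEPTIONS.get((source_lang, target_lang), default)


print(resolve_model_name("es", "en"))  # Helsinki-NLP/opus-mt-es-en
print(resolve_model_name("pt", "en"))  # Helsinki-NLP/opus-mt-mul-en
```

Adding a language pair then amounts to adding one entry to the dictionary, with no change to the loading code.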
- Fork the Repository
- Create a Feature Branch
git checkout -b feature-branch
- Commit Your Changes
git commit -m "Add some feature"
- Push to the Branch
git push origin feature-branch
- Open a Pull Request
For any questions or issues, please open an issue in the GitHub repository.
This project is licensed under the MIT License.