Spark-TTS is a text-to-speech system built on large language models (LLMs) that produces accurate, natural-sounding speech. It is designed to be efficient and flexible enough for both research and production use.

Spark-TTS Installation (Windows Guide)
1. Install Conda (if you haven’t already)
- Download Miniconda and install it.
- Make sure to check “Add Conda to PATH” during installation.
Download Spark-TTS
You have two options to get the files:
Option 1 (Recommended for Windows): Download ZIP manually
- Go to Spark-TTS GitHub
- Click “Code” > “Download ZIP”, then extract it.
Option 2: Use Git (Optional)
If you prefer using Git, install Git and run:
git clone https://github.com/SparkAudio/Spark-TTS.git
2. Create a Conda Environment
Open Command Prompt (cmd) and run:
conda create -n sparktts python=3.12 -y
conda activate sparktts
This creates and activates a Python 3.12 environment for Spark-TTS.
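If you want to confirm that the new environment is really the one in use, a small check like the following can help. It is only a sketch: the function name describe_interpreter is made up for this guide, and it assumes conda sets the CONDA_DEFAULT_ENV variable when an environment is activated.

```python
import os
import sys

# The version requested in the "conda create" command above.
EXPECTED = (3, 12)


def describe_interpreter(expected=EXPECTED):
    """Return (matches, message) for the Python interpreter running this script."""
    found = sys.version_info[:2]
    # conda normally exports this variable after "conda activate".
    env_name = os.environ.get("CONDA_DEFAULT_ENV", "<none>")
    matches = found == tuple(expected)
    message = (
        f"Python {found[0]}.{found[1]} at {sys.executable} "
        f"(conda env: {env_name})"
    )
    return matches, message


if __name__ == "__main__":
    ok, msg = describe_interpreter()
    print(msg)
    if ok:
        print("Interpreter version matches.")
    else:
        print(f"Expected Python {EXPECTED[0]}.{EXPECTED[1]}. Is sparktts activated?")
```

Save it under any name you like and run it with python from the same Command Prompt window. If the reported environment is not sparktts, run conda activate sparktts again.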
3. Install Dependencies
Inside the Spark-TTS folder (whether from ZIP or Git), run:
pip install -r requirements.txt
4. Install PyTorch (CPU or CUDA)
pip does not detect your GPU on its own, so pick the command that matches your hardware. The --index-url option must point to a package index, not to a page on pytorch.org.
# Default build from PyPI (typically CPU-only on Windows)
pip install torch torchvision torchaudio
# OR install a specific CUDA build (NVIDIA GPUs)
# pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 # Older GPUs
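To see which PyTorch build you ended up with, you can run a short check like this one. It is a minimal sketch, not part of Spark-TTS: the helper name torch_status is hypothetical, and it relies only on torch.__version__ and torch.cuda.is_available().

```python
def torch_status():
    """Report whether torch is importable and, if so, whether CUDA is usable."""
    try:
        import torch  # imported here so the script still runs without PyTorch
    except ImportError:
        return {"installed": False, "version": None, "cuda": False}
    return {
        "installed": True,
        "version": torch.__version__,
        "cuda": bool(torch.cuda.is_available()),
    }


if __name__ == "__main__":
    status = torch_status()
    if not status["installed"]:
        print("PyTorch is not installed in this environment.")
    else:
        device = "CUDA GPU" if status["cuda"] else "CPU only"
        print(f"PyTorch {status['version']} ({device})")
```

If it reports CPU only on a machine with an NVIDIA GPU, the usual cause is that a CPU build was installed; reinstalling with one of the CUDA index URLs above is the first thing to try.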
5. Download the Model
There are two ways to get the model files. Pick one:
Option 1 (Recommended): Using Python
Create a new file in the Spark-TTS folder called download_model.py, paste this inside, and run it:
from huggingface_hub import snapshot_download
import os
# Set download path
model_dir = "pretrained_models/Spark-TTS-0.5B"
# Check if model already exists
if os.path.exists(model_dir) and len(os.listdir(model_dir)) > 0:
    print("Model files already exist. Skipping download.")
else:
    print("Downloading model files...")
    # Recent huggingface_hub releases resume interrupted downloads on their own,
    # so the deprecated resume_download flag is omitted here.
    snapshot_download(
        repo_id="SparkAudio/Spark-TTS-0.5B",
        local_dir=model_dir,
    )
    print("Download complete!")
Run it with:
python download_model.py
Option 2: Using Git (If You Installed It)
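The large model files on Hugging Face are stored with Git LFS, so enable it before cloning. Otherwise you only get small placeholder files.
git lfs install
mkdir pretrained_models
git clone https://huggingface.co/SparkAudio/Spark-TTS-0.5B pretrained_models/Spark-TTS-0.5B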
Either method works; choose whichever is easier for you.
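Whichever option you used, it can be worth checking that the model folder holds real files before starting the app. The sketch below counts the files and flags any that are still Git LFS pointer stubs (small text files that begin with a fixed header). The function name summarize_model_dir and the choice to skip hidden folders such as .git are assumptions made for this example.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def summarize_model_dir(model_dir):
    """Count files under model_dir and list any that are Git LFS pointer stubs."""
    root = Path(model_dir)
    summary = {"files": 0, "bytes": 0, "lfs_pointers": []}
    if not root.is_dir():
        return summary
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        # Skip hidden files and folders such as .git or .cache.
        if not path.is_file() or any(part.startswith(".") for part in rel.parts):
            continue
        summary["files"] += 1
        summary["bytes"] += path.stat().st_size
        with path.open("rb") as fh:
            if fh.read(len(LFS_HEADER)) == LFS_HEADER:
                summary["lfs_pointers"].append(rel.as_posix())
    summary["lfs_pointers"].sort()
    return summary


if __name__ == "__main__":
    info = summarize_model_dir("pretrained_models/Spark-TTS-0.5B")
    print(f"{info['files']} files, {info['bytes'] / 1e6:.1f} MB")
    for name in info["lfs_pointers"]:
        print(f"Still an LFS pointer (not downloaded): {name}")
```

A result of zero files means the download did not happen or went to a different folder. Any file listed as an LFS pointer means the Git clone ran without Git LFS; running git lfs pull inside the model folder, or using the Python download instead, should fetch the real files.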
6. Run Spark-TTS
Web UI (Recommended)
For an interactive browser-based interface, run:
python webui.py
This launches a local web server where you can enter text and generate speech or clone a voice.
7. Troubleshooting & Common Questions
Before Asking for Help
Many common issues are already covered in existing discussions, documentation, or online resources. Please:
- Search GitHub issues first
- Check the documentation
- Google or use AI tools (ChatGPT, DeepSeek, etc.)
If you still need help, please explain what you’ve already tried so we can assist you better!
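When you do ask for help, including a few details about your setup saves a round of questions. Here is one way to collect them, using only the Python standard library. The function name environment_report and the package list are just suggestions; adjust the list to whatever seems relevant to your problem.

```python
import platform
import sys
from importlib import metadata

# Illustrative list; add or remove packages as needed.
PACKAGES = ("torch", "torchaudio", "huggingface-hub")


def environment_report(packages=PACKAGES):
    """Build a short text report: OS, Python version, and package versions."""
    lines = [
        f"OS: {platform.platform()}",
        f"Python: {sys.version.split()[0]}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

Run it inside the activated sparktts environment and paste the output into your question along with the full error message.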