Qwen3-TTS-WebUI - A versatile text-to-speech web application with voice customization features.

Pitch

Qwen3-TTS WebUI is a text-to-speech web application built on the Qwen3-TTS model. It offers custom voices, voice design from natural-language descriptions, and voice cloning, with support for multiple languages.

Description

Qwen3-TTS WebUI is a text-to-speech web application built on the 1.7-billion-parameter Qwen3-TTS model. In addition to customizable voice options, it offers voice design and voice cloning.

Key Features

Custom Voice: Generates speech using predefined speaker voices.
Voice Design: Creates new voices from natural-language descriptions.
Voice Cloning: Clones a voice from uploaded audio.
Dual Backend Support: Switches between a locally hosted model and the Aliyun TTS API.
Multi-language Support: Supports English, Simplified Chinese, Traditional Chinese, Japanese, and Korean.
Advanced Features: Includes JWT authentication, asynchronous task handling, voice caching, and a dark mode.
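
The dual-backend switch listed above could be organized roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual code: `TTSBackend`, `LocalBackend`, `AliyunBackend`, and `select_backend` are hypothetical names, and both `synthesize` methods return placeholder bytes instead of running a model or calling a remote API.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class TTSBackend(Protocol):
    """Hypothetical interface that both backends would share."""

    def synthesize(self, text: str, speaker: str) -> bytes: ...


@dataclass
class LocalBackend:
    """Stand-in for the locally hosted Qwen3-TTS model."""

    name: str = "local"

    def synthesize(self, text: str, speaker: str) -> bytes:
        # A real implementation would run model inference here.
        return f"[local:{speaker}] {text}".encode("utf-8")


@dataclass
class AliyunBackend:
    """Stand-in for a client of the Aliyun TTS API."""

    api_key: str
    name: str = "aliyun"

    def synthesize(self, text: str, speaker: str) -> bytes:
        # A real implementation would send a request to the hosted API here.
        return f"[aliyun:{speaker}] {text}".encode("utf-8")


def select_backend(kind: str, api_key: Optional[str] = None) -> TTSBackend:
    """Return the backend named by a configuration value."""
    if kind == "local":
        return LocalBackend()
    if kind == "aliyun":
        if not api_key:
            raise ValueError("the aliyun backend needs an API key")
        return AliyunBackend(api_key=api_key)
    raise ValueError(f"unknown backend: {kind!r}")


backend = select_backend("local")
audio = backend.synthesize("Hello", speaker="demo")
```

Because callers only depend on `synthesize`, switching deployments in this sketch is a one-value configuration change.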

Interface Preview

Screenshots show the interface in the following views:

Desktop Light Mode
Desktop Dark Mode
Desktop Voice Design List
Desktop Save Voice Design Dialog
Desktop Voice Cloning
Mobile Light & Dark Mode
Mobile Settings & History

Technology Stack

Backend: FastAPI, SQLAlchemy, PyTorch, JSON Web Tokens (JWT)
Frontend: React 19, TypeScript, Vite, Tailwind CSS, Shadcn/ui

API Overview

The application provides a RESTful API with endpoints for:

User registration and login
Creating custom voices
Designing voices
Cloning voices
Managing jobs and results

Main endpoints:

POST /auth/register          - Register new users
POST /auth/token             - User login
POST /tts/custom-voice       - Create a custom voice
POST /tts/voice-design       - Design a new voice
POST /tts/voice-clone        - Clone a specified voice
GET  /jobs                   - List all jobs
GET  /jobs/{id}/download     - Download results of a specific job
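
A client for the endpoints listed above might look like the sketch below. It only builds requests and never sends them. Several details are assumptions that the source does not confirm: the base URL, JSON request bodies, the payload field names (`text`, `description`), the job ID `42`, and the job status names `completed` and `failed`. Sending the JWT as a Bearer token follows common practice. Since TTS tasks run asynchronously, the sketch also includes a small polling helper that takes any status-fetching function.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Optional

BASE_URL = "http://localhost:8000"  # assumption: host and port are not given in the source


def build_request(
    method: str,
    path: str,
    token: Optional[str] = None,
    payload: Optional[Dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a request for one of the listed endpoints."""
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        # Assumption: the endpoint accepts a JSON body; it may expect form data instead.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token is not None:
        # JWTs are conventionally sent as a Bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def wait_for_job(
    fetch_status: Callable[[], str],
    interval: float = 1.0,
    max_polls: int = 60,
) -> str:
    """Poll until the job reaches a final state or the poll budget runs out."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in ("completed", "failed"):  # hypothetical status names
            return status
        time.sleep(interval)
    raise TimeoutError("job did not finish in time")


# Hypothetical payload fields and job ID, for illustration only.
submit = build_request(
    "POST",
    "/tts/voice-design",
    token="<jwt>",
    payload={"text": "Hello", "description": "a calm, low voice"},
)
download = build_request("GET", "/jobs/42/download", token="<jwt>")
```

In practice, `fetch_status` would wrap a call to the jobs API and extract the status field from the response, and the result of `build_request` would be passed to `urllib.request.urlopen`.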

Qwen3-TTS WebUI combines three voice modes (custom voice, voice design, and voice cloning) with a choice of local or Aliyun backends and support for five languages, which makes it a practical option for developers and businesses that need text-to-speech.
