GitHub - AIGC-Audio/AudioGPT: AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

🔊🎵 Dive into the world of speech, music, and more with AudioGPT from the AIGC-Audio/AudioGPT repository! 🤖🎶 Generate speech, music, and even talking heads with this AI tool. Explore Text-to-Speech, Style Transfer, Text-to-Sing, and more! 🎙️🎨

AIGC-Audio/AudioGPT is an open-source repository that provides implementations and pretrained models for understanding and generating speech, music, sound, and talking heads.
Capabilities of AudioGPT include speech tasks like Text-to-Speech, Style Transfer, Speech Recognition, Speech Enhancement, Speech Separation, and Speech Translation.
Additionally, AudioGPT supports Sing tasks such as Text-to-Sing.
For Audio tasks, capabilities include Text-to-Audio, Audio Inpainting, Image-to-Audio, Sound Detection, and Sound Extraction.
The Talking Head category covers Talking Head Synthesis.
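
The task categories listed above can be captured as a small lookup table. What follows is a minimal sketch in Python, not code from the AudioGPT repository: the names `TASKS_BY_CATEGORY` and `category_of` are hypothetical, and the table contains only the categories and tasks named in this summary.

```python
from typing import Optional

# Hypothetical lookup table built only from the task names in this summary;
# it is not part of the AudioGPT code base.
TASKS_BY_CATEGORY = {
    "Speech": [
        "Text-to-Speech",
        "Style Transfer",
        "Speech Recognition",
        "Speech Enhancement",
        "Speech Separation",
        "Speech Translation",
    ],
    "Sing": ["Text-to-Sing"],
    "Audio": [
        "Text-to-Audio",
        "Audio Inpainting",
        "Image-to-Audio",
        "Sound Detection",
        "Sound Extraction",
    ],
    "Talking Head": ["Talking Head Synthesis"],
}


def category_of(task: str) -> Optional[str]:
    """Return the category that lists `task`, or None if it is not listed."""
    for category, tasks in TASKS_BY_CATEGORY.items():
        if task in tasks:
            return category
    return None


if __name__ == "__main__":
    # Look up which category a task belongs to.
    print(category_of("Audio Inpainting"))
```

A flat dictionary like this is enough to check whether a requested task falls under one of the listed categories before looking for the matching model in the repository's README.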
The repository acknowledges the open-source projects ESPNet, NATSpeech, Visual ChatGPT, Hugging Face, and LangChain.
The repository contains models such as FastSpeech, SyntaSpeech, VITS, GenerSpeech, Whisper, Conformer, ConvTasNet, TF-GridNet, Multi-decoder, NeuralWarp, DiffSinger, VISinger, LASSNet, and GeneFace.
More supported models and tasks are expected to be added in the future.
Detailed information can be found in the README and run.md files of the repository.