GitHub - AIGC-Audio/AudioGPT: AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
AudioGPT, from the AIGC-Audio/AudioGPT repository, is an AI tool for generating speech, music, and talking-head video. It covers Text-to-Speech, Style Transfer, Text-to-Sing, and more.
- AIGC-Audio/AudioGPT is an open-source repository that provides implementations and pretrained models for understanding and generating speech, music, sound, and talking heads.
- Capabilities of AudioGPT include speech tasks like Text-to-Speech, Style Transfer, Speech Recognition, Speech Enhancement, Speech Separation, and Speech Translation.
- AudioGPT also supports singing tasks such as Text-to-Sing.
- For Audio tasks, capabilities include Text-to-Audio, Audio Inpainting, Image-to-Audio, Sound Detection, and Sound Extraction.
- For talking-head tasks, it supports Talking Head Synthesis.
- The project acknowledges ESPNet, NATSpeech, Visual ChatGPT, Hugging Face, and LangChain for their open-source contributions.
- The repository contains models like FastSpeech, SyntaSpeech, VITS, GenerSpeech, whisper, Conformer, ConvTasNet, TF-GridNet, Multi-decoder, NeuralWarp, DiffSinger, VISinger, LASSNet, and GeneFace.
- More supported models and tasks are expected to be added in the future.
- Detailed information can be found in the README and run.md files of the repository.
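
The task categories listed above can be written down as a small lookup table. This is only a sketch for orientation: the names `TASKS_BY_CATEGORY` and `category_of` are hypothetical and are not part of the AudioGPT codebase; the entries are transcribed from the bullets in this summary.

```python
from typing import Optional

# Task taxonomy transcribed from the summary above.
# Hypothetical helper for illustration; not an AudioGPT API.
TASKS_BY_CATEGORY = {
    "speech": [
        "Text-to-Speech",
        "Style Transfer",
        "Speech Recognition",
        "Speech Enhancement",
        "Speech Separation",
        "Speech Translation",
    ],
    "sing": ["Text-to-Sing"],
    "audio": [
        "Text-to-Audio",
        "Audio Inpainting",
        "Image-to-Audio",
        "Sound Detection",
        "Sound Extraction",
    ],
    "talking_head": ["Talking Head Synthesis"],
}


def category_of(task: str) -> Optional[str]:
    """Return the category a task belongs to, or None if it is not listed."""
    wanted = task.strip().lower()
    for category, tasks in TASKS_BY_CATEGORY.items():
        if any(wanted == name.lower() for name in tasks):
            return category
    return None
```

For example, `category_of("Sound Detection")` looks the task up case-insensitively and returns its category key, while a task that is not in the summary returns `None`.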