GitHub - AIGC-Audio/AudioGPT: AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

AudioGPT, from the AIGC-Audio/AudioGPT repository, is an AI tool for understanding and generating speech, music, sound, and talking heads. Its tasks include Text-to-Speech, Style Transfer, Text-to-Sing, and more.

  • AIGC-Audio/AudioGPT is an open-source repository that provides implementations and pretrained models for understanding and generating speech, music, sound, and talking heads.
  • Capabilities of AudioGPT include speech tasks like Text-to-Speech, Style Transfer, Speech Recognition, Speech Enhancement, Speech Separation, and Speech Translation.
  • Additionally, AudioGPT supports Sing tasks such as Text-to-Sing.
  • For Audio tasks, capabilities include Text-to-Audio, Audio Inpainting, Image-to-Audio, Sound Detection, and Sound Extraction.
  • The Talking Head category covers Talking Head Synthesis.
  • The repository acknowledges ESPNet, NATSpeech, Visual ChatGPT, Hugging Face, and LangChain for their open-source contributions.
  • Supported models include FastSpeech, SyntaSpeech, VITS, GenerSpeech, Whisper, Conformer, ConvTasNet, TF-GridNet, Multi-decoder, NeuralWarp, DiffSinger, VISinger, LASSNet, and GeneFace.
  • More supported models and tasks are expected to be added in the future.
  • Detailed information can be found in the README and run.md files of the repository.
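
The task categories listed above can be captured in a small lookup table. The following is only an illustrative sketch: the names `TASKS_BY_MODALITY` and `modality_of` are hypothetical and do not come from the AudioGPT codebase, and the table contains only the tasks named in this summary.

```python
from typing import Optional

# Hypothetical table built from the task names in this summary;
# it is not part of the AudioGPT repository.
TASKS_BY_MODALITY = {
    "speech": [
        "Text-to-Speech",
        "Style Transfer",
        "Speech Recognition",
        "Speech Enhancement",
        "Speech Separation",
        "Speech Translation",
    ],
    "sing": ["Text-to-Sing"],
    "audio": [
        "Text-to-Audio",
        "Audio Inpainting",
        "Image-to-Audio",
        "Sound Detection",
        "Sound Extraction",
    ],
    "talking head": ["Talking Head Synthesis"],
}


def modality_of(task: str) -> Optional[str]:
    """Return the modality whose list contains `task` (case-insensitive), or None."""
    wanted = task.casefold()
    for modality, tasks in TASKS_BY_MODALITY.items():
        if any(name.casefold() == wanted for name in tasks):
            return modality
    return None


if __name__ == "__main__":
    for name in ("Text-to-Sing", "Sound Detection", "Video Captioning"):
        print(name, "->", modality_of(name))
```

For the actual model wiring and launch steps, the repository's README and run.md remain the authoritative source.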