GitHub - openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision

🚀🗣️ Dive into the world of robust speech recognition with OpenAI Whisper! 🎙️🤖 This multitasking model not only recognizes speech but also translates it into English and identifies the spoken language. 🌐🔊 Try the pretrained models today! 🔥 #AI #SpeechRecognition #OpenAIWhisper

  • **Repository Name**: openai/whisper (OpenAI Whisper)
  • **Description**: General-purpose speech recognition model for multilingual tasks
  • **Model Features**: Multitasking abilities include speech recognition, translation, and language identification
  • **Training Approach**: Transformer sequence-to-sequence model trained on various speech processing tasks
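  • **Dependencies**: Python 3.8-3.11 and a recent PyTorch release (the models were trained and tested with PyTorch 1.10.1), OpenAI's tiktoken for fast tokenization, and the `ffmpeg` command-line tool for reading audio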
  • **Installation**: `pip install -U openai-whisper`
  • **Model Sizes**: Tiny, Base, Small, Medium, and Large; the first four come in both English-only (`.en`) and multilingual variants, while Large is multilingual only
  • **Command-line Usage**: Transcribing audio files, specifying models and languages
  • **Python Usage**: Transcribing within Python, detecting language, and decoding audio
  • **Model Comparison**: Approximate memory requirements and inference speed relative to the large model are listed for each size
  • **Performance**: Accuracy varies by language; the README provides a per-language breakdown of error rates
  • **License**: Released under the MIT License
  • **Resources**: Links to README, license, and additional model information
  • **Contributors**: 68 contributors to the project
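
The command-line and Python usage summarized above can be sketched roughly as follows. This is a minimal illustration, not the project's official example: the helper names `build_whisper_command`, `transcribe_file`, and `detect_language` and the default model choices are assumptions made here, while the `whisper` calls and CLI flags follow the usage documented in the README. It assumes `openai-whisper` and `ffmpeg` are installed before the transcription helpers are called.

```python
from typing import List, Optional


def build_whisper_command(
    audio_path: str,
    model: str = "small",
    language: Optional[str] = None,
    task: str = "transcribe",
) -> List[str]:
    """Build the argument list for the `whisper` CLI (hypothetical helper).

    Uses the CLI's --model, --task, and --language flags.
    """
    cmd = ["whisper", audio_path, "--model", model, "--task", task]
    if language is not None:
        cmd += ["--language", language]
    return cmd


def transcribe_file(audio_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file and return the recognized text."""
    import whisper  # imported lazily so the helper above works without it

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return result["text"]


def detect_language(audio_path: str, model_name: str = "base") -> str:
    """Return the most probable language code for the first 30 seconds."""
    import whisper

    model = whisper.load_model(model_name)
    audio = whisper.load_audio(audio_path)
    audio = whisper.pad_or_trim(audio)  # pad or cut to a 30-second window
    mel = whisper.log_mel_spectrogram(
        audio, n_mels=model.dims.n_mels
    ).to(model.device)
    _, probs = model.detect_language(mel)
    return max(probs, key=probs.get)
```

For example, `build_whisper_command("japanese.wav", language="Japanese", task="translate")` yields an argument list that can be passed to `subprocess.run`, while `transcribe_file("audio.mp3")` downloads the chosen model on first use and returns the transcript as a string.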