OpenChatKit - The first open-source ChatGPT

🤖⚡️ Discover OpenChatKit, the first open-source ChatGPT! 🌐🚀 Create specialized or general-purpose chatbots with a fine-tuned 20B model. Exciting features like live-updating sources, moderation, and seamless integration await you. Join the AI conversation today! #OpenChatKit #ChatGPT #AItools

  • OpenChatKit is an open-source project providing a powerful base for creating specialized and general-purpose chatbots, utilizing a 20 billion-parameter model fine-tuned for chat tasks.
  • The project collaborated with LAION and Ontocord to create the OIG-43M dataset used in training, focusing on various natural language tasks like dialogue, question answering, classification, extraction, and summarization.
  • It features an extensible retrieval system that augments responses with information from live-updating sources to provide up-to-date context.
  • OpenChatKit supports a moderation model to filter inappropriate or out-of-domain questions and offers sample code for integrating with APIs and web search.
  • Collaborators behind OpenChatKit are Together, LAION, and Ontocord, known for open-source foundation models, data annotation services, and machine learning solutions respectively.
  • The base model is GPT-NeoX-T-Chat-Base-20B, fine-tuned with the OIG-43M dataset, focusing on tasks like multi-turn dialogue, question answering, and more.
  • OpenChatKit excels in question answering, extraction, classification tasks, and performs well in few-shot prompts but requires improvements in knowledge-based closed question answering, coding tasks, and creative writing.
  • OpenChatKit's license is Apache License 2.0, allowing free usage, modification, and distribution, with access to source code, model weights, and datasets on GitHub and Hugging Face.
  • Feedback can be given through the OpenChatKit feedback app, and collaboration is encouraged on GitHub, Discord, Twitter, and Medium for sharing ideas and suggestions.