OpenCompass

🧭 Navigate the world of large language models with OpenCompass! 🌐 This AI tool offers objective ratings, evaluation datasets, and diverse evaluation methods for over 100 language models and 50 multimodal models. Dive into the open-source community and join the model challenge today! 🚀 #AI

  • CompassHub is an open platform for sharing and publishing evaluation datasets and leaderboards within the community.
  • CompassRank provides objective scores and rankings for top-tier large language models and multimodal models.
  • OpenCompass performs in-depth evaluations of large language models across eight key capabilities and 29 core tasks using over 100 evaluation datasets.
  • The evaluation toolkit, CompassKit, ships with a rich set of evaluation datasets and ready-made model templates.
  • Over 100 major language models and more than 50 multimodal models have been evaluated on CompassRank.
  • OpenCompass supports multiple evaluation methods, including zero-shot, few-shot, and chain-of-thought evaluation, across 40+ HuggingFace and API models (see the config sketch after this list).
  • The OpenCompass platform supports efficient distributed evaluation of models with hundreds of billions of parameters (see the distributed-run sketch after this list).
  • OpenCompass emphasizes open source, reproducibility, broad model support, distributed evaluation, and diverse evaluation methods.
  • The OpenCompass Partner Program allows datasets to become part of official leaderboards or specialized leaderboards recognized in specific industries.
  • Individuals and organizations can contribute datasets to CompassHub and participate in the model challenge on CompassRank.
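For readers who want a concrete sense of how CompassKit wires datasets and models together, below is a minimal sketch of what an OpenCompass-style Python config might look like: prebuilt dataset configs are pulled in with read_base(), and a HuggingFace model is declared as a plain dictionary. The module paths, dataset names, and parameter set here are assumptions modelled on typical OpenCompass demo configs, not an authoritative reference; check the official documentation before relying on them.

```python
# eval_demo.py -- a minimal sketch of an OpenCompass-style evaluation config.
# Module paths, dataset names, and parameters are illustrative assumptions;
# consult the official OpenCompass docs for the exact, current API.
from mmengine.config import read_base
from opencompass.models import HuggingFaceCausalLM

with read_base():
    # Prebuilt dataset configs shipped with the toolkit ("_gen" variants are
    # generation-based, "_ppl" variants perplexity-based; names assumed).
    from .datasets.siqa.siqa_gen import siqa_datasets
    from .datasets.winograd.winograd_ppl import winograd_datasets

datasets = [*siqa_datasets, *winograd_datasets]

models = [
    dict(
        type=HuggingFaceCausalLM,
        abbr='opt-125m',                  # short name used in result tables
        path='facebook/opt-125m',         # HuggingFace hub id (illustrative)
        tokenizer_path='facebook/opt-125m',
        max_seq_len=2048,
        max_out_len=100,
        batch_size=64,
        run_cfg=dict(num_gpus=1),         # resources requested for this model
    ),
]

# A typical launch might look like:
#   python run.py configs/eval_demo.py -w outputs/demo
```

Whether a dataset is evaluated zero-shot or few-shot is normally decided inside its dataset config (for example, via the in-context retriever it specifies), so switching methods usually means importing a different prebuilt dataset variant rather than changing the model entry.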
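The same config format is, as far as I understand it, also how multi-GPU resources are requested for very large models: the per-model run_cfg declares how many GPUs (and worker processes) a model needs, and the launcher schedules the resulting tasks locally or on a cluster. The field names and CLI flags below are assumptions drawn from common OpenCompass usage and may differ between versions.

```python
# Sketch: requesting several GPUs for a large model (fields are assumptions).
from opencompass.models import HuggingFaceCausalLM

models = [
    dict(
        type=HuggingFaceCausalLM,
        abbr='large-model-70b',                  # hypothetical placeholder name
        path='org/large-model-70b',              # hypothetical placeholder hub id
        max_seq_len=2048,
        max_out_len=100,
        batch_size=8,
        run_cfg=dict(num_gpus=4, num_procs=4),   # assumed fields: GPUs and worker processes
    ),
]

# Launching against a Slurm cluster might look like (flags are assumptions):
#   python run.py configs/eval_large.py --slurm -p <partition> --max-num-workers 32
```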