THUDM/GLM-130B: GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
🚀 Introducing GLM-130B: A revolutionary bilingual pre-trained model with 130 billion parameters! 🤖📚 Outperforming GPT-3 175B in multiple tasks, supporting fast inference, hardware flexibility, and reproducible results. Get ready for an AI powerhouse! 🔥 #AI #GLM130B #ICLR2023
- GLM-130B is a bilingual model with 130 billion parameters trained on over 400 billion text tokens in English and Chinese.
- It outperforms GPT-3 175B on several benchmarks and supports fast inference on a single A100 (40G × 8) server.
- Reproducibility of the reported results and cross-platform support (NVIDIA GPUs, Hygon DCU, Ascend 910, and Sunway) are highlighted features.
- GLM-130B can be run on a range of hardware setups, and INT8/INT4 quantization further reduces the memory required (see the generic quantization sketch after this list).
- The model code is built on SAT (SwissArmyTransformer) and requires a specific environment configuration (recent Python and PyTorch releases plus the repository's pinned dependencies) for optimal performance.
- Model weights are distributed as split archives that must be downloaded and merged before use, with recommendations for efficient storage and loading (a merge sketch follows this list).
- Task evaluation is configured through YAML files, and a sample dataset is provided for testing (a toy evaluation sketch also follows this list).
- Multi-node evaluation and task customization are supported.
- The model is optimized for up to 2.5X faster inference using FasterTransformer.
- Licensing information is provided, and citation is encouraged.
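
Below is a minimal sketch of reassembling a checkpoint that ships as split archive parts, as referenced in the weights bullet above. The directory layout and file-name pattern (`checkpoints/`, `glm-130b-sat.tar.part_*`) are assumptions for illustration; the repository's download instructions give the exact names and commands.

```python
# Sketch: reassemble split checkpoint archives and unpack them.
# File and directory names here are assumptions; follow the repository's
# download instructions for the actual layout.
import shutil
import tarfile
from pathlib import Path

PARTS_DIR = Path("checkpoints")                 # where the downloaded parts live (assumed)
MERGED_TAR = PARTS_DIR / "glm-130b-sat.tar"     # reassembled archive
EXTRACT_DIR = PARTS_DIR / "glm-130b-sat"        # directory the inference scripts would point at

def merge_and_extract() -> None:
    # Split archives only reassemble correctly when every part is present
    # and concatenated in lexical order.
    parts = sorted(PARTS_DIR.glob("glm-130b-sat.tar.part_*"))
    if not parts:
        raise FileNotFoundError(f"no checkpoint parts found in {PARTS_DIR}")
    with MERGED_TAR.open("wb") as out:
        for part in parts:
            with part.open("rb") as src:
                shutil.copyfileobj(src, out)    # stream in chunks; each part is multi-GB
    with tarfile.open(MERGED_TAR) as tar:
        tar.extractall(EXTRACT_DIR)             # unpack the merged weights

if __name__ == "__main__":
    merge_and_extract()
```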
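The INT8/INT4 support mentioned above quantizes weights to save GPU memory. The snippet below is a generic illustration of symmetric (absmax) weight quantization with per-row scales, not GLM-130B's actual implementation; it only shows the idea behind storing low-bit weights alongside a scale factor.

```python
# Generic absmax (symmetric) INT8 weight quantization, for illustration only.
# GLM-130B's own quantization code lives in the repository; this sketch just
# shows the per-row scale idea behind low-bit weight storage.
import torch

def quantize_int8(weight: torch.Tensor):
    # One scale per output row: map each row's largest magnitude to 127.
    scale = weight.abs().max(dim=1, keepdim=True).values.clamp_min(1e-8) / 127.0
    q = torch.round(weight / scale).clamp(-127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Recover an approximate FP32 weight for use in matmuls at inference time.
    return q.to(torch.float32) * scale

if __name__ == "__main__":
    w = torch.randn(4, 8)                       # stand-in for a linear layer's weight
    q, s = quantize_int8(w)
    print("max abs error:", (w - dequantize(q, s)).abs().max().item())
```

Quantizing only the weights in this fashion (while keeping activations in higher precision) is what allows the reduced hardware requirements noted in the list above.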
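Finally, a toy sketch of the YAML-driven evaluation pattern referenced above. The field names (`name`, `file-pattern`, `metric`) and the JSONL record layout are hypothetical, chosen only to show the shape of a task description plus a scoring loop; the repository documents its actual task schema.

```python
# Toy sketch of a YAML-described evaluation task scored over JSONL records.
# Field names and record layout are hypothetical; see the repository's
# evaluation docs for the real task schema.
import json
import yaml  # PyYAML

TASK_YAML = """
name: demo-task           # hypothetical task name
file-pattern: demo.jsonl  # hypothetical pointer to the evaluation records
metric: accuracy
"""

def evaluate(records, predict):
    # Accuracy of a predict(text) callable over {"text": ..., "label": ...} records.
    correct = sum(predict(r["text"]) == r["label"] for r in records)
    return correct / max(len(records), 1)

if __name__ == "__main__":
    task = yaml.safe_load(TASK_YAML)
    # Inline stand-ins for the dataset and the model; in practice these come
    # from the task's file pattern and the loaded GLM-130B checkpoint.
    records = [json.loads(line) for line in (
        '{"text": "2 + 2 =", "label": "4"}',
        '{"text": "Capital of France?", "label": "Paris"}',
    )]

    def predict(text: str) -> str:
        # Trivial stand-in model for the demo records above.
        return "4" if "2 + 2" in text else "Paris"

    print(task["name"], "accuracy:", evaluate(records, predict))
```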