πŸ¦… Eagle 7B : Soaring past Transformers with 1 Trillion Tokens Across 100+ Languages (RWKV-v5)


πŸš€ Introducing Eagle 7B by RWKV: trained on 1.1 trillion tokens across 100+ languages, this multi-lingual model outperforms all 7B-class models on multi-lingual benchmarks. Available now on Huggingface for download, a game-changer for AI enthusiasts! πŸ¦…πŸŒ #AI #RWKV #Eagle7B #MultiLingual

  • Introducing RWKV's latest innovation: the Eagle 7B model, built on the RWKV-v5 architecture, trained on 1.1 Trillion Tokens across 100+ languages, and outperforming all 7B class models in multi-lingual benchmarks.
  • RWKV-v5 Eagle 7B is an "Attention-Free Transformer" and a foundation model that requires further fine-tuning for specific use cases; it is released as an Apache 2.0 licensed model under the Linux Foundation.
  • Multi-lingual performance tests, covering commonsense reasoning benchmarks in 23 languages, show a marked improvement from RWKV-v4 to v5.
  • Public availability of RWKV-v5 Eagle 7B on Huggingface for download and use, along with a reference pip inference package and other community inference options (see the inference sketch after this list).
  • English performance shows substantial progress, competing with top models on several benchmarks, with the aim of closing the remaining gap to Mistral and its rumored 2~7 trillion token training.
  • RWKV's focus on building inclusive AI for the world, supporting the top languages spoken by around 4 billion people, with plans to gradually expand coverage to 100% of the world over time.
  • Future plans include further updates to RWKV-v5, an additional 1T token training run for direct comparisons, an MoE model, and the upcoming RWKV-v6 β€œFinch” 1.5B and 3B world models.
  • Acknowledgments extend to StabilityAI, EleutherAI, Linux Foundation AI & Data group, and the diverse developers contributing to RWKV-related projects.
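As a concrete starting point, here is a minimal sketch of how the downloaded checkpoint could be run with the reference `rwkv` pip inference package mentioned above. The checkpoint filename, strategy string, prompt, and sampling settings are illustrative assumptions, not the only supported configuration.

```python
# pip install rwkv torch   (reference inference package plus PyTorch)
from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Illustrative path: point this at the Eagle 7B .pth checkpoint downloaded
# from Huggingface (the exact filename below is an assumption).
model = RWKV(model="RWKV-v5-Eagle-7B.pth", strategy="cuda fp16")

# The "world" vocabulary is the multi-lingual tokenizer used by the Eagle / world models.
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")

prompt = "The following is a short introduction to the RWKV architecture:\n"
output = pipeline.generate(
    prompt,
    token_count=100,
    args=PIPELINE_ARGS(temperature=1.0, top_p=0.7),
)
print(output)
```

Swapping the strategy string (for example to `"cpu fp32"`) runs the same checkpoint on CPU only, trading speed for hardware flexibility.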