https://github.com/deepseek-ai/DeepSeek-LLM

GitHub - deepseek-ai/DeepSeek-LLM: DeepSeek LLM: Let there be answers

Unlock the power of language with DeepSeek LLM! 🚀 Trained on a massive dataset of 2 trillion tokens, this advanced model excels in reasoning, coding, math, and Chinese comprehension. Available as open source for research. #AI #DeepLearning #GitHub

DeepSeek LLM is an advanced language model trained from scratch on a dataset of 2 trillion tokens in English and Chinese.
It is available as open source DeepSeek LLM 7B/67B Base and Chat models for research.
DeepSeek LLM 67B Base excels in reasoning, coding, math, and Chinese comprehension compared to Llama2 70B Base.
DeepSeek LLM 67B Chat demonstrates exceptional performance in coding and mathematics.
The model has been evaluated across various benchmarks and outperforms existing models in multiple areas.
The DeepSeek LLM models were pre-trained on a diverse dataset and use specific attention mechanisms for optimal performance.
Inferences can be made using Huggingface's Transformers or vLLM for text and chat completion tasks.
Limitations of DeepSeek LLM models include issues like bias, hallucination, and repetition in generated responses.
The use of DeepSeek LLM models is subject to the MIT License and commercial usage is permitted.
For more details, refer to the official DeepSeek LLM repository and contact the developers for inquiries.