
Nitro: Fast inference engine
⚡️ Introducing Nitro - The high-efficiency Large Language Model inference engine for edge computing! 🚀 Lightweight, open source, and lightning-fast inference for local AI models in apps. 🔥 #AI #EdgeComputing #OpenSource
- Nitro v0.3.14 is now live on GitHub.
- Nitro is a lightweight (~3 MB) inference server for local AI in apps.
- Nitro is a drop-in replacement for OpenAI's REST API (see the first sketch after this list).
- Nitro runs on both CPU and GPU architectures.
- Nitro runs open source AI models such as Llama 2 and Mistral.
- Nitro can go from setup to serving a local AI model in an app in roughly 10 seconds.
- Nitro is open source under the AGPLv3 license and builds upon llama.cpp and Drogon.
- Nitro supports multi-threading, model management (loading and unloading models; see the second sketch after this list), and additional inference backends such as NVIDIA's TensorRT-LLM.
- Nitro's roadmap includes upcoming support for vision and speech tasks.
- Nitro offers developer documentation, API reference, and community support.
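
Because the API is OpenAI-compatible, existing client code typically only needs its base URL changed. Below is a minimal TypeScript sketch that queries a local Nitro server; the port (3928) and the `/v1/chat/completions` route follow the project docs at the time of writing, so verify both against the current API reference.

```typescript
// Minimal sketch: chat completion against a local Nitro server.
// Assumptions: Nitro is running on its documented default port 3928
// and a model has already been loaded (see the next sketch).
async function chat(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:3928/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      messages: [{ role: "user", content: prompt }],
    }),
  });
  if (!res.ok) throw new Error(`Nitro returned HTTP ${res.status}`);
  const data: any = await res.json();
  // The response mirrors OpenAI's shape: choices[0].message.content.
  return data.choices[0].message.content;
}

chat("Hello from Nitro!").then(console.log).catch(console.error);
```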
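
Model management happens over the same HTTP interface. The sketch below loads and later unloads a model; the endpoint paths (`/inferences/llamacpp/loadmodel`, `/inferences/llamacpp/unloadmodel`), the body fields (`llama_model_path`, `ctx_len`, `ngl`), and the example model filename are assumptions drawn from one version of the docs, so check the API reference for your release.

```typescript
// Minimal sketch of Nitro's model management endpoints. Endpoint
// paths and parameters are assumptions drawn from the project docs;
// check the API reference for your Nitro release.
const NITRO = "http://localhost:3928";

async function loadModel(modelPath: string): Promise<void> {
  const res = await fetch(`${NITRO}/inferences/llamacpp/loadmodel`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      llama_model_path: modelPath, // path to a local GGUF file
      ctx_len: 2048,               // context window size
      ngl: 32,                     // layers to offload to GPU (0 = CPU only)
    }),
  });
  if (!res.ok) throw new Error(`loadmodel failed: HTTP ${res.status}`);
}

async function unloadModel(): Promise<void> {
  // Frees the model's memory without stopping the server.
  await fetch(`${NITRO}/inferences/llamacpp/unloadmodel`, { method: "POST" });
}

// Hypothetical model path, for illustration only.
loadModel("/models/llama-2-7b-chat.Q4_K_M.gguf").catch(console.error);
```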