GitHub - apple/ml-ferret
Ferret is an Apple research project hosted on GitHub: a multimodal large language model (MLLM) built for fine-grained referring and grounding. The repository documents the model's key features, its dataset, and its training process for research use, and includes Ferret-Bench evaluation and a demo with a Gradio web UI.
- **Project Name:** Ferret
- **Description:** End-to-end multimodal large language model (MLLM) for fine-grained, open-vocabulary referring and grounding.
- **Key Features:** Hybrid Region Representation, Spatial-aware Visual Sampler.
- **Dataset:** GRIT dataset (~1.1M samples) for ground-and-refer instruction tuning.
- **Evaluation:** Ferret-Bench, a multimodal benchmark that jointly requires Referring/Grounding, Semantics, Knowledge, and Reasoning.
- **License:** Intended and licensed strictly for research use.
- **Installation:** Clone the repository, create an environment, and install the required packages, following the instructions provided.
- **Training:** Trained on A100 GPUs; hyperparameters are listed for the FERRET-7B and FERRET-13B models.
- **Prepare Model:** Requires base model Vicuna weights and LLaVA's first-stage pre-trained projector weights.
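- **Checkpoints:** Released as weight deltas (offsets) between the pre-trained Ferret model and Vicuna. Download the Vicuna weights first, then download the offsets and apply them to recover the Ferret checkpoints.
- **Demo:** Setup instructions for running the model through a Gradio web UI.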
- **Citation:** Reference format provided for citing Ferret.
- **Acknowledgment:** Credits to LLaVA and Vicuna codebases.
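
The checkpoint scheme described above (publish only the difference from Vicuna, then add it back to the base weights) can be illustrated with a minimal sketch. It is not the repository's own tooling: the function names `extract_delta` and `apply_delta`, the `Weights` type, and the toy parameter values are all hypothetical, and plain Python lists stand in for real model tensors.

```python
from typing import Dict, List

# Hypothetical stand-in for a model state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def extract_delta(finetuned: Weights, base: Weights) -> Weights:
    """Compute per-parameter offsets (finetuned - base) for release."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def apply_delta(base: Weights, delta: Weights) -> Weights:
    """Recover the fine-tuned weights by adding the offsets to the base weights."""
    if base.keys() != delta.keys():
        raise KeyError("base and delta must contain the same parameter names")
    restored: Weights = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        restored[name] = [b + d for b, d in zip(base_values, delta_values)]
    return restored


# Toy values only; a real checkpoint holds large tensors per parameter.
base_weights = {"proj.weight": [1.0, 2.0], "proj.bias": [0.0]}
finetuned_weights = {"proj.weight": [1.5, 1.0], "proj.bias": [0.25]}

delta_weights = extract_delta(finetuned_weights, base_weights)
restored_weights = apply_delta(base_weights, delta_weights)
```

Distributing offsets rather than full weights means the base model still has to be obtained separately before the offsets are applied, which is why the Vicuna download comes first.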