GitHub - apple/ml-ferret

Ferret is a multimodal large language model (MLLM) from Apple for fine-grained referring and grounding. The repository describes its key features, dataset, and training process, and is intended for research use. It also covers Ferret-Bench evaluation and a demo with a Gradio web UI.

  • **Project Name:** Ferret
  • **Description:** End-to-end MLLM for fine-grained, open-vocabulary referring and grounding.
  • **Key Features:** Hybrid Region Representation, Spatial-aware Visual Sampler.
  • **Dataset:** GRIT, a ground-and-refer instruction-tuning dataset of about 1.1M samples.
  • **Evaluation:** Ferret-Bench for Referring/Grounding, Semantics, Knowledge, and Reasoning.
  • **License:** Intended and licensed strictly for research use.
  • **Installation:** The README covers cloning the repository, creating an environment, and installing the required packages.
  • **Training:** Trained on A100 GPUs; hyperparameters are listed for the FERRET-7B and FERRET-13B models.
  • **Prepare Model:** Requires the base Vicuna weights and LLaVA's first-stage pre-trained projector weights.
  • **Checkpoints:** Released as deltas between the pre-trained Ferret model and Vicuna; download the delta weights and apply them to the Vicuna base to obtain the Ferret checkpoint.
  • **Demo:** Demo setup instructions with Gradio web UI.
  • **Citation:** Reference format provided for citing Ferret.
  • **Acknowledgment:** Credits to LLaVA and Vicuna codebases.
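
The checkpoint step in the list above relies on weight deltas: the difference between the fine-tuned weights and the base Vicuna weights is published, and users add it back onto their own copy of the base. The sketch below illustrates only that arithmetic. It is not the repository's script: the `Weights` type, the function names `extract_delta` and `apply_delta`, and the parameter names are all hypothetical, and plain Python lists stand in for real tensors. Copying parameters that are absent from the base unchanged is also an assumption.

```python
from typing import Dict, List

# Hypothetical stand-in for a model state dict: parameter name -> flat values.
Weights = Dict[str, List[float]]


def extract_delta(tuned: Weights, base: Weights) -> Weights:
    """Return tuned minus base for shared parameters.

    Parameters that exist only in the tuned model are copied as-is
    (an assumption made for this sketch).
    """
    delta: Weights = {}
    for name, values in tuned.items():
        if name in base:
            delta[name] = [t - b for t, b in zip(values, base[name])]
        else:
            delta[name] = list(values)
    return delta


def apply_delta(base: Weights, delta: Weights) -> Weights:
    """Rebuild the tuned weights by adding the delta onto the base."""
    tuned: Weights = {}
    for name, values in delta.items():
        if name in base:
            tuned[name] = [b + d for b, d in zip(base[name], values)]
        else:
            tuned[name] = list(values)
    return tuned


# Toy example with made-up parameter names and values.
base = {"layer.weight": [1.0, 2.0]}
tuned = {"layer.weight": [1.5, 1.0], "projector.weight": [0.25]}

delta = extract_delta(tuned, base)
restored = apply_delta(base, delta)
```

Distributing only the delta lets a fine-tuned model be shared without redistributing the base weights themselves; anyone who already holds the base can reconstruct the full checkpoint.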