https://ai.meta.com/blog/generative-ai-text-images-cm3leon/

Introducing CM3leon, a more efficient, state-of-the-art generative model for text and images

🌟 Exciting news! Meet CM3leon 🦎, a single model for both text-to-image and image-to-text generation. 🔮✨ Trained on a blend of tasks, this AI model sets a new standard for efficiency and performance in generative tasks. 🚀🖼️📝 #AI #TextToImage #ImageToText #CM3leon

CM3leon is a state-of-the-art multimodal generative model for text and images.
It excels in text-to-image and image-to-text generation tasks with high efficiency.
CM3leon follows a recipe including large-scale retrieval-augmented pre-training and multitask supervised fine-tuning.
Despite being trained with five times less compute than previous transformer-based methods, CM3leon achieves state-of-the-art performance in text-to-image generation.
It combines the versatility of autoregressive models with low training costs and high inference efficiency.
CM3leon is a causal masked mixed-modal (CM3) model: it generates sequences of text and images conditioned on arbitrary sequences of other text and image content, whereas previous models were either only text-to-image or only image-to-text.
The model undergoes large-scale multitask instruction tuning for improved performance in image captioning, visual question answering, text-based editing, and conditional image generation.
CM3leon achieves a new state of the art in text-to-image generation, with a zero-shot MS-COCO FID score of 4.88, surpassing Google's Parti model.
It excels in generating complex compositional objects and performs well in vision-language tasks.
CM3leon's training involves retrieval augmentation and instruction fine-tuning on a variety of tasks.
The model demonstrates strong capabilities in image editing, object-to-image generation, super-resolution, and more.
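
To make the "causal masked" idea more concrete, here is a minimal sketch of how a masked span might be moved to the end of a token sequence, so that a left-to-right model sees the context on both sides of the gap before generating the missing piece. This illustrates the general causal masking idea behind the CM3 name and is not Meta's implementation. The function name `causal_mask`, the `<mask:0>` sentinel, and the `<img>`/`i17`-style tokens are all hypothetical, chosen only for illustration.

```python
def causal_mask(tokens, span_start, span_len, sentinel="<mask:0>"):
    """Move one contiguous span to the end of the sequence.

    The span is replaced in place by a sentinel token, then appended after a
    second copy of the sentinel. A left-to-right model therefore reads the
    context on both sides of the gap before it has to produce the span.
    """
    if span_len < 1 or span_start < 0 or span_start + span_len > len(tokens):
        raise ValueError("span must lie inside the sequence")
    span = tokens[span_start:span_start + span_len]
    prefix = tokens[:span_start]
    suffix = tokens[span_start + span_len:]
    return prefix + [sentinel] + suffix + [sentinel] + span


# Hypothetical mixed-modal sequence: text tokens followed by image tokens.
sequence = ["a", "photo", "of", "a", "chameleon",
            "<img>", "i17", "i42", "i08", "</img>"]

# Mask the two text tokens "photo of"; they reappear after the final sentinel.
print(causal_mask(sequence, span_start=1, span_len=2))
```

Because the span can cover text tokens, image tokens, or both, the same transformation would let one model practice infilling either modality given the other, which is consistent with the conditioning on arbitrary text and image content described above.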