Meta announces new image generation AI model 'CM3leon'

Interest and research in generative AI models has accelerated in recent months with advancements in natural language processing.

Follow : Google News Icon  
CM3Leon
Meta uses a data set of millions of images that are licensed | Image Credit: Pixabay | Image: self

Meta has announced a new generative AI model that aims at achieving a state-of-the-art performance for text-to-image generation, a company statement stated on Friday. The unveiling happens at a time when AI-powered image generators have become very popular and accessible. At present, every major company and emerging startups use it for their daily operations.

Meta

What does CM3leon do?

According to media reports, the company expects that the introduction of the new AI model could produce more coherent imagery that better analyses the input prompts given to it. 

At present, AI-based image generators like DALL-E2, Google's Imagen and Stable Diffusion rely on a process called diffusion for art creation, where the respective AI model learns by gradually subtracting noise from an image, thereby working efficiently on the prompts given.

Meta

Source: Meta

Advertisement

However, the process of diffusion is intensive, expensive and time-taking, whereas CM3leon relies on a mechanism called attention that takes more importance to the input prompt, which can be a text or an image. In comparison, CM3Leon is expected to be more efficient and requires less computation and a smaller dataset compared to other models.

For training purposes, Meta uses a data set of millions of images that are licensed. This is amid all the legal lawsuits that the company has faced for the misuse of information.

Advertisement

General image generators often struggle with complex objects and sometimes find it difficult to understand the prompts. Some of the images generated through the model also reveal that the new model is trained on working with complex designs.

Other features include the editing capability. Meta claims that text-guided image editing (e.g. “change the colour of the sky to bright blue”) is challenging because it requires the model to simultaneously understand both textual instructions and visual content, but CM3leon excels in all of the cases.

The new model could stage strong performance across a variety of tasks and provide high-fidelity image generation and understanding. It is also aimed at enhancing the creativity of the company's metaverse prospects.

Published By :
Mannu Mathew
Published On: