Updated May 24th, 2022 at 20:25 IST

Google develops AI text-to-image generator offering 'unprecedented degree of photorealism'

Google has developed an image generator named 'Imagen' which has a better understanding of language, thanks to larger language models.

Reported by: Harsh Vardhan
Image: Google

Tech giant Google has introduced a new Artificial Intelligence (AI) based text-to-image generator which, according to the company, produces an "unprecedented degree of photorealism". Similar to the DALL-E 2 image generator, the new tool, named Google Imagen, generates precise images from text descriptions and can produce oil paintings, photographs, drawings and CGI renders based on the command it is given. Notably, Imagen is a text-to-image diffusion model and is not available to the public for general use.

However, the technology is impressive, as it generates images that strikingly match the text descriptions. For instance, the text command "A photo of a raccoon wearing an astronaut helmet, looking out of the window at night" produces an image like the one shared below. See for yourself.

(Image: Google Research)

Several other images were generated solely from text commands, and the results shown appear to match their descriptions closely. The first image below was produced from the text "A cute corgi lives in a house made out of sushi", whereas the second was generated from the text "A dog looking curiously in the mirror, seeing a cat". It is worth noting, however, that images from such AI tools frequently come out unfinished or blurry. Besides, the results published for such tools are often hand-picked from the best outputs and do not give an idea of average performance.

(Image: Google Research)

(Image: Google Research)

How does Google Imagen work?

Google has developed this image generator with a better understanding of language, thanks to larger language models. The AI tool is built on diffusion techniques, which start from a noisy image and refine it step by step. Imagen first generates a 64 x 64-pixel image and then, after two "super-resolution" steps, scales it up to a 1024 x 1024-pixel picture. According to Google, this is not ordinary upscaling, because the detail at each higher resolution is newly generated using the smaller image as a basis.
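
For readers who want a feel for the numbers, here is a rough Python sketch of the two ideas described above. It is a toy illustration and not Google's code. The function names `stage_sizes` and `refine` are made up for this example, the sketch assumes both super-resolution steps enlarge the image by the same factor, and the "refinement" simply nudges random values toward a known target, whereas a real diffusion model uses a trained network and has no target to look at.

```python
import random


def stage_sizes(base=64, final=1024, stages=2):
    """Side length after each stage, assuming every stage uses the same factor."""
    factor = round((final / base) ** (1 / stages))
    sizes = [base]
    for _ in range(stages):
        sizes.append(sizes[-1] * factor)
    if sizes[-1] != final:
        raise ValueError("final size is not reachable with equal integer factors")
    return sizes


def refine(noisy, target, steps=50, rate=0.2):
    """Toy stand-in for step-by-step denoising.

    Each step moves every value a little closer to `target`.
    A real diffusion model instead predicts the noise to remove
    with a neural network guided by the text prompt.
    """
    current = list(noisy)
    for _ in range(steps):
        current = [c + rate * (t - c) for c, t in zip(current, target)]
    return current


if __name__ == "__main__":
    # 64 -> 1024 in two equal steps implies a 4x enlargement per step.
    print(stage_sizes())  # [64, 256, 1024]

    random.seed(0)
    target = [0.5] * 8                              # pretend "clean" pixel values
    noisy = [random.gauss(0, 1) for _ in target]    # start from pure noise
    cleaned = refine(noisy, target)
    print(max(abs(c - t) for c, t in zip(cleaned, target)))  # very small remaining error
```

Under the equal-factor assumption, each super-resolution step quadruples the side length, so the image passes through 256 x 256 pixels on its way from 64 x 64 to 1024 x 1024.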

Published May 24th, 2022 at 20:25 IST