Updated March 10th, 2024 at 15:04 IST

From text to images: What lies ahead for AI tools?

Google’s image generation and text prompts app drew ire for biased responses

Reported by: Business Desk
A screengrab from Sora, OpenAI's generative video model | Image: Sora

GenAI troubles: Gemini, search engine giant Google’s foray into generative artificial intelligence, has come under scrutiny for its image generation tool and for a biased response to a query pertaining to Prime Minister Narendra Modi.

Generative artificial intelligence came to the forefront after OpenAI made it mainstream in 2022, with Microsoft and Google following with their own Large Language Models (LLMs).

Elon Musk’s Grok, Sarvam AI in India and Ola founder Bhavish Aggarwal’s Krutrim are some of the other players that have ventured into generative responses.

The technology has now expanded into image and video generation, with Google’s Gemini generating image responses for users. OpenAI also launched Sora, which generates videos from text prompts.

While Sora is available to creators, Gemini was rolled out to the public in February this year. The former drew awe and appreciation from users, while the latter drew ire for generating historically inaccurate images.

AI expert Jaspreet Bindra, who is also the founder of UK-based consultancy The Tech Whisperer Ltd, said the problem was likely due to overly politically correct feedback during the reinforcement learning from human feedback (RLHF) process.

Given that Google is perhaps the leader in AI and has the best AI talent in the world, what happened with its generative model was “extremely surprising,” he said.

RLHF is a machine learning (ML) technique that uses human feedback to fine-tune ML models so that their outputs better match human preferences.

In the case of Gemini, Bindra said the people giving the feedback were “too politically correct, or too much concerned about the damage of any nature, or a controversy of any nature,” which resulted in the over-tuning of the model.

The lapse could also be because of some “basic planning issues,” he added.

Google has, as of now, paused the image generation tool.

On Sora, Bindra said OpenAI has come out with a different level of quality in terms of grassroots clarity and lifelike visual graphics.

“I think it definitely has the potential to shake up a few creative industries like moviemaking studios, visual design, advertising, and many other video creative industries like gaming or videos on social media or YouTube,” he said.

While it is a potential threat to employment, he added, the technology will make operations more efficient and productive for companies that embrace it.

But it will also lead to the closure of organisations that cannot compete with new technologies like these.

“A professional visual designer who uses these technologies can become better, and someone who doesn't, might lose the job to the person who has used them,” he said.

While image and video generation present opportunities for creativity and for optimising operations in the industry, image-based AI can also pose serious challenges to society.

Bindra said these issues include deepfakes, which can potentially disrupt elections and democracy or threaten reputations through pornographic content, with the technology allowing deepfakes to be produced at scale, and better, faster and cheaper than before.

“I think (BigTechs) are teaming now to figure what these problems can be, and will only release (versions) after the major elections, or fixing (issues),” he said. 

Google and Meta were among the 20 firms that signed an accord in February to fight misinformation during elections, with the two companies also setting out policies for flagging AI-manipulated images and videos in political advertising on their platforms.

Published March 10th, 2024 at 15:04 IST