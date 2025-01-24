Not long ago, OpenAI showed the world the prowess of its new generative artificial intelligence-based technology that can browse the internet and perform tasks on a user’s behalf. The ChatGPT owner has now released that technology as part of its new agent called Operator. Operator uses its own browser to carry out tasks on the web for users, paving the way for the company’s future AI-based automation tools.

“Operator is one of our first agents, which are AIs capable of doing work for you independently—you give it a task and it will execute it,” said OpenAI in a blog post.

Citing its efforts to “ensure a safe and iterative rollout,” OpenAI also said that the current version of Operator is available as a research preview only to Pro users in the US. The research preview, the company said, will allow it to “learn from our users and the broader ecosystem,” and apply what it learns to expand Operator to Plus, Team, and Enterprise users, and even integrate these capabilities into ChatGPT “in the future.”

Operator is powered by OpenAI’s new model called Computer-Using Agent (CUA), which combines GPT-4o’s vision capabilities with “advanced reasoning through reinforcement learning” to interact with the graphical user interfaces of a website or platform. In other words, Operator uses GPT-4o’s vision capabilities to analyse the content on the screen, including actionable elements such as buttons, menus, and text fields. As OpenAI explains, Operator can “see” through screenshots and “interact” using all the actions a mouse and keyboard allow on a browser. Based on user commands, it then takes action on the web without requiring custom API integrations.
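
To make the “see” and “interact” cycle described above more concrete, here is a minimal Python sketch of such a loop. It is an illustration under stated assumptions, not OpenAI’s implementation: `FakeBrowser`, `choose_action`, and `run_agent` are hypothetical names, the “screenshot” is a small dictionary rather than pixels, and a hand-written rule stands in for the model’s decision.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # label of the on-screen element to act on
    text: str = ""     # text to enter, used only by "type" actions


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a browser page with a search box and a submit button."""
    fields: dict = field(default_factory=lambda: {"search": ""})
    submitted: bool = False

    def screenshot(self) -> dict:
        # A real agent would receive pixels; this toy returns a structured summary.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        # Simulate keyboard and mouse input on the page.
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def choose_action(task: str, screen: dict) -> Action:
    """Hypothetical rule-based policy standing in for the model's decision."""
    if screen["submitted"]:
        return Action("done")
    if screen["fields"]["search"] != task:
        return Action("type", target="search", text=task)
    return Action("click", target="submit")


def run_agent(task: str, browser: FakeBrowser, max_steps: int = 10) -> list:
    """Observe the screen, pick an action, apply it, and repeat until done."""
    history = []
    for _ in range(max_steps):
        screen = browser.screenshot()
        action = choose_action(task, screen)
        history.append(action.kind)
        if action.kind == "done":
            break
        browser.apply(action)
    return history
```

Calling `run_agent("book a table", FakeBrowser())` walks through typing the query, clicking submit, and stopping. The `max_steps` cap is a design choice in this sketch so that a confused policy cannot loop forever.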