OpenAI to Launch AI Agent for Automating Computer Tasks
OpenAI, the Microsoft-backed AI research firm, is reportedly preparing to launch a new AI agent capable of automating various tasks on a user's computer. According to a Bloomberg report, the agent, internally codenamed "Operator," is designed to handle tasks such as writing code, booking travel, and other routine activities. OpenAI reportedly plans to release the agent in January 2025 as a research preview, accessible to developers through its application programming interface (API).
The report further suggests that OpenAI is working on several AI agent-related initiatives, with “Operator” being the closest to release. The agent is described as a general-purpose tool that performs tasks through a web browser, streamlining interactions and removing the need for manual input in many activities.
OpenAI CEO Sam Altman recently discussed the evolution of AI agents during a Reddit "Ask Me Anything" session. He hinted that while the company will continue to improve its models, AI agents could represent the next major leap in AI capabilities. "We will have better and better models, but I think the thing that will feel like the next giant breakthrough will be agents," Altman said.
This development follows similar moves by other tech giants. Last month, The Information reported that Google is also working on an AI-driven tool, codenamed “Project Jarvis,” aimed at automating web-based tasks. Powered by the upcoming Google Gemini model, “Project Jarvis” is expected to integrate with Google Chrome to assist with tasks such as interpreting screenshots, clicking buttons, and entering text.
What Are AI Agents?
AI agents are advanced software tools powered by artificial intelligence that can carry out complex, multi-step tasks with minimal human intervention. Unlike traditional chatbots, which are designed to respond based on pre-existing data, AI agents can perform actions autonomously by making decisions, solving problems, and interacting with their environment. These agents have memory capabilities, allowing them to retain information from past interactions and plan future actions accordingly. They are typically driven by large language models but extend their functionality beyond simple question-and-answer tasks.
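To make the plan, act, and remember cycle described above concrete, here is a minimal Python sketch. It is an illustration only and does not reflect how "Operator" or any other product is built. The tool functions, the `Agent` class, and the `plan_next_step` method are all hypothetical names invented for this example, and a simple rule stands in for the large language model that would normally choose the next action.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tools: stand-ins for real actions such as booking travel.
def search_flights(destination: str) -> str:
    return f"found flight to {destination}"

def book_flight(destination: str) -> str:
    return f"booked flight to {destination}"

TOOLS: Dict[str, Callable[[str], str]] = {
    "search_flights": search_flights,
    "book_flight": book_flight,
}

@dataclass
class Agent:
    goal: str
    # Memory: a record of (tool, result) pairs from earlier steps.
    memory: List[Tuple[str, str]] = field(default_factory=list)

    def plan_next_step(self) -> Optional[str]:
        """Assumed stand-in for a language model: choose the next tool
        based on which steps memory shows are already done."""
        done = {tool for tool, _ in self.memory}
        for tool in ("search_flights", "book_flight"):
            if tool not in done:
                return tool
        return None  # nothing left to do

    def run(self, max_steps: int = 5) -> List[Tuple[str, str]]:
        # The agent loop: plan, act, record the result, repeat.
        for _ in range(max_steps):
            tool = self.plan_next_step()
            if tool is None:
                break
            result = TOOLS[tool](self.goal)
            self.memory.append((tool, result))
        return self.memory

agent = Agent(goal="Lisbon")
history = agent.run()
for tool, result in history:
    print(tool, "->", result)
```

The step cap (`max_steps`) is a design choice in this sketch: it keeps the loop from running indefinitely if the planner never decides the task is finished.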
At their core, AI agents promise to free users from repetitive, time-consuming tasks by automating processes that traditionally required manual input, offering the potential for significant productivity gains across various industries.