ByteDance's Agent Outperforms OpenAI, Reaches 60% of Human-Level Gaming
ByteDance's UI-TARS-2 AI agent automates tasks such as news search, coding, and lesson prep, outperforming OpenAI on GUI benchmarks and reaching 60% of human-level gaming performance.
It completes news searches and webpage deployment in one click, and it can also help teachers prepare lessons.
On September 4th, ByteDance Seed released the native GUI agent UI-TARS-2, which can autonomously operate computers and mobile phones to complete various tasks, including searching, creating web pages, collecting news, creating query tools, and playing mini-games. The related paper was published on the arXiv preprint platform on September 2nd.
UI-TARS-2 outperformed OpenAI's agent and the Claude agent on multiple GUI benchmarks, and across 15 mini-games its performance reached 60% of human level.
In the demo released by ByteDance, UI-TARS-2 completed the task of searching for ByteDance Seed 1.6 news and deploying a webpage in one go. The prompt was "Search for news about ByteDance Seed 1.6 model, then write a webpage in modern style and deploy it."
UI-TARS-2 first breaks this request down into three tasks: searching for news about the model, writing a modern-style webpage, and deploying the webpage. It begins by searching for news through LinkReader to understand the model's core features. It then creates a project directory for the webpage, selects an appropriate design approach, and plans the page structure. Once the page has been created successfully, it autonomously checks whether all functions run properly.
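The decompose-then-execute flow described in the demo can be illustrated with a minimal Python sketch. Everything here is hypothetical: the `Subtask` class, the hard-coded `decompose` function, and the `run` loop are illustrative names, not part of UI-TARS-2, whose internals the demo does not show. The sketch only models a plan as ordered subtasks whose steps run in sequence and stop at the first failure.

```python
from dataclasses import dataclass


@dataclass
class Subtask:
    name: str
    steps: list[str]
    done: bool = False


def decompose(prompt: str) -> list[Subtask]:
    # Assumption: the plan is hard-coded to mirror the demo; a real agent
    # would derive it from the prompt.
    return [
        Subtask("search", ["read news about the model", "note core features"]),
        Subtask("write", [
            "create project directory",
            "choose design approach",
            "plan page structure",
            "check that all functions run",
        ]),
        Subtask("deploy", ["deploy the webpage"]),
    ]


def run(plan, execute):
    # Run each step in order; stop at the first step that reports failure.
    log = []
    for task in plan:
        for step in task.steps:
            log.append(f"{task.name}: {step}")
            if not execute(task.name, step):
                return log
        task.done = True
    return log


if __name__ == "__main__":
    plan = decompose("Search for news, write a webpage, deploy it.")
    # A stand-in executor that treats every step as successful.
    for line in run(plan, lambda name, step: True):
        print(line)
```

Here `execute` is a stand-in for whatever actually performs a step (a browser action, a shell command, and so on); passing it as a parameter keeps the planning logic separate from the actions.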