Benchmark-maxxing is cool, I guess - but that doesn’t win markets. Product does.
The next lap in the AI race is Agentic Intelligence: agents that reason and act on your behalf. So you can do things like, “Brief me on upcoming client meetings based on recent news,” or “Analyze three competitors and create a slide deck.”
OpenAI just launched ChatGPT Agent, a system designed to do exactly this. But here’s the problem. The product experience lags behind the hype.
Early users report that ChatGPT Agent is sluggish, clunky, and underwhelming. Competitors like Manus or Genspark are already delivering results that are equally good, if not better. Even Perplexity’s Comet feels much smoother… partly because ChatGPT Agent runs on a virtual browser hosted on OpenAI servers, which slows everything down.
Investor Deedy Das posted a side-by-side comparison of the Slide feature, and the difference is stark.
This gap isn’t about model quality. It’s a product problem.
I was vibe coding a similar Slides product a few months ago, and OpenAI’s o3 reasoned through the content beautifully. I definitely got further on slide design than the current Agent demo, which looked sparse… and honestly, half-baked.
Earlier this week, Anthropic quietly showed what the future could look like. Their Claude for Finance launch bundled the Claude model, UX, and ecosystem integrations into a cohesive stack. The platform is not just a smarter chatbot or a blank-slate API; it is designed to be more like an operating system for investors. That’s the kind of product coherence missing from OpenAI’s Agent debut.
The lack of polish is surprising, especially for a company with OpenAI’s resources. Then again, maybe it’s structural. A recent blog post from a former OpenAI employee might hint at the root of the problem.
“Chat runs really deep. Since ChatGPT took off, a lot of the codebase is structured around the idea of chat messages and conversations. These primitives are so baked at this point, you should probably ignore them at your own peril. We did deviate from them a bit in Codex (leaning more into learnings from the responses API), but we leveraged a lot of prior art.”
It’s a sobering thought. The very interface that made GPT go mainstream - chat - may now be its Achilles’ heel.
I’ve said this before and I’ll say it again: every research lab should have a design team embedded inside it. R&D should stand for Research & Design: cutting-edge research combined with well-loved products that bring it to life. The frontier isn’t just better models. It’s bringing new capabilities to users through better products.
Let’s see who figures that out first.
- TTYL
The Download —
News that mattered this week
Mira Murati’s Thinking Machines Lab is now worth $12B after its seed round: Thinking Machines Lab, the AI startup founded by OpenAI’s former chief technology officer Mira Murati, officially closed a $2 billion seed round led by Andreessen Horowitz. The deal, which includes participation from Nvidia, Accel, ServiceNow, Cisco, AMD, and Jane Street, values the startup at $12 billion.
Claude Code revenue jumps 5.5x as Anthropic launches analytics dashboard: Anthropic is rolling out a comprehensive analytics dashboard for its Claude Code AI programming assistant. The platform has seen active user base growth of 300% and run-rate revenue expansion of more than 5.5 times, according to company data.
Anthropic valuation could hit $100 billion in new investment round: According to a report by Bloomberg News, the AI startup is planning a new investment round that could bring it to a $100 billion valuation.
Sweden’s Lovable becomes $1.8B ‘vibe coding’ unicorn after record $200M Series: The raise brings Lovable’s valuation to $1.8 billion, marking one of Europe’s most significant tech fundings of 2025. Over 500,000 users, 30,000 paid subscribers, and nearly 25,000 new apps launched daily have driven annual recurring revenue from less than $20 million to a staggering $75 million in just a few months.
Jack Dorsey pumps $10 million into a nonprofit focused on open-source social media: Twitter co-founder and Block CEO Jack Dorsey has invested $10 million in an effort to fund experimental open-source projects and other tools that could ultimately transform the social media landscape. These efforts are funneled through an online collective called “and Other Stuff,” formed in May.
Runway introduced Act-Two, its next-gen motion capture model. It translates single performance videos into fully animated characters with head, face, body, and hand tracking across artistic styles and outputs.
China now surpasses the West in AI research and talent. A new report from research analytics firm Digital Science shows that China now produces as much AI research as the US, UK, and EU-27 combined. In terms of research attention, China captured over 40% of global citations in 2024, four times the share of the US or the EU individually, and 20 times that of the UK.
Double Click —
Links to reads we found interesting
Top AI researchers concerned they’re losing the ability to understand what they’ve created (Futurism)
AI firms ‘unprepared’ for dangers of building human-level systems, report warns (The Guardian)
Study ranks AI models by how well they summarize medical research. Google’s Gemini models performed the best overall (Tech Brew)
For the first time, astronomers have captured the birth of a new solar system (X)
A former OpenAI engineer describes what it’s really like to work there (Blog)
China is spending billions to become an AI superpower (NYTimes)