Happy Friday. I’m back from vacation and still getting caught up on everything I missed. AI researchers moving jobs is getting covered like NBA trades now, apparently.
Before I get into this week’s issue, I want to make sure you check out my interview with Perplexity CEO Aravind Srinivas on Decoder this week. It’s a good deep dive on the main topic of today’s newsletter. Keep reading for a scoop on Substack and more from this week in AI news.
From chatbots to browsers
So far, when most people think of the modern AI boom, they think of a chatbot like ChatGPT. Now, it’s becoming increasingly clear that the web browser is where the next phase of AI is taking shape.
The reason is simple: the chatbots of today don’t have access to your online life like your browser does. That level of context — read and write access to your email, your bank account, etc. — is required if AI is going to become a tool that actually goes off and does things for you.
Two recent product releases point to this trend. The first is OpenAI’s ChatGPT Agent, which uses a basic browser to surf the web on your behalf. The second is Comet, a desktop browser from Perplexity that takes it a step further by allowing large language models to access logged-in sites and complete tasks on your behalf. (OpenAI is rumored to be planning its own full-fledged browser.)
Neither ChatGPT Agent nor Comet works reliably at the moment, and access to both is currently gated to expensive subscription tiers because the reasoning models they rely on are costly to run. Perhaps most frustratingly, both products claim to do things they can’t, not just in marketing materials, but in the actual product experience.
ChatGPT Agent is a read-only browser experience: unlike Comet, it can’t access logged-in sites, and that severely limits its usefulness. It’s also very slow. My colleague Hayden Field asked it to find a particular kind of lamp on Etsy, and ChatGPT Agent took 50 minutes to come back with a response. It also failed to add items to her Etsy cart, despite claiming it had done so.
While Comet is nowhere near as slow, I’ve had numerous experiences with it claiming it has completed tasks it hasn’t, or stating it can do something, only to immediately tell me it can’t after I make a request. Its sidecar interface, which places the AI assistant to the right of a webpage, is excellent for read-only tasks, such as summarizing a webpage or researching something specific I’m looking at. But as I told Perplexity CEO Aravind Srinivas on Decoder this week, the overall experience feels quite brittle.
It’s easy to be a cynic and think the current state of products like Comet is the best AI can do at completing tasks on the web. Or, you can look at the last few years of progress in the industry and make the bet that the same trend line will continue.
During our chat this week, Srinivas told me he’s “betting on progress in reasoning models to get us there.” OpenAI built a custom reasoning model specifically for ChatGPT Agent that was trained on more complex, multi-step tasks. (The model has no public name and isn’t available via an API.)
Even with the many limitations and bugs that exist today, using Comet for just a few days has convinced me that the mainstream chatbot interface will merge with the browser. It already feels like taking a step back to merely prompt a chatbot versus interacting with a ChatGPT-like experience that can see whatever website I’m looking at. Standalone chatbots certainly aren’t going away, especially on smartphones, but the browser is what will unlock AI that actually feels like an agent.
Some noteworthy career moves
If you haven’t already, don’t forget to subscribe to The Verge, which includes unlimited access to Command Line and all of our reporting.
As always, I welcome your feedback, especially if you have thoughts on this issue or a story idea to share. You can respond here or ping me securely on Signal.
