In this week’s edition: Researchers studying the emotional impact of tools like ChatGPT propose a new kind of benchmark. Also: The evolutionary tree of AI models, how to stop chatbots from learning to be unsafe, and the insane power needed for future AI training runs.
GPT-5 doesn't dislike you—it might just need a benchmark for emotional intelligence
Welcome to another AI Lab!
The backlash over the more emotionally neutral GPT-5 shows that the smartest AI models may have striking reasoning, coding, and math skills, but that advancing their psychological intelligence safely remains very much unsolved.
Researchers at MIT who study ways that chatbots affect users psychologically and emotionally have come up with a benchmark designed to measure these effects and encourage the development of AI systems that are smarter at helping their users in the right ways.
Since the all-new ChatGPT launched on Thursday, some users have mourned the disappearance of a peppy and encouraging personality in favor of a colder, more businesslike one (a move seemingly designed to reduce unhealthy user behavior). The backlash shows the challenge of building artificial intelligence systems that exhibit anything like real emotional intelligence.
Researchers at MIT have proposed a new kind of AI benchmark to measure how AI systems can manipulate and influence their users—in both positive and negative ways—a move that could help AI builders avoid similar backlashes in the future while also keeping vulnerable users safe.
Most benchmarks try to gauge intelligence by testing a model’s ability to answer exam questions, solve logical puzzles, or come up with novel answers to knotty math problems. As the psychological impact of AI use becomes more apparent, we may see more benchmarks like MIT’s that aim to measure subtler aspects of intelligence as well as machine-to-human interactions.
An MIT paper shared with WIRED outlines several measures that the new benchmark will look for, including encouraging healthy social habits in users; spurring them to develop critical thinking and reasoning skills; fostering creativity; and stimulating a sense of purpose. The idea is to encourage the development of AI systems that understand how to discourage users from becoming overly reliant on their outputs or that recognize when someone is addicted to artificial romantic relationships and help them build real ones.
ChatGPT and other chatbots are adept at mimicking engaging human communication, but this can also have surprising and undesirable results. In April, OpenAI tweaked its models to make them less sycophantic, or inclined to go along with everything a user says. Some users appear to spiral into harmful delusional thinking after conversing with chatbots that role-play fantastical scenarios. Anthropic has also updated Claude to avoid reinforcing “mania, psychosis, dissociation or loss of attachment with reality.”
The MIT researchers, led by Pattie Maes, a professor at the institute’s Media Lab, say they hope the new benchmark could help AI developers build systems that better understand how to inspire healthier behavior among users. The researchers previously worked with OpenAI on a study showing that users who view ChatGPT as a friend can develop higher emotional dependence and experience “problematic use.”
Valdemar Danry, a researcher at MIT’s Media Lab who worked on this study and helped devise the new benchmark, notes that AI models can sometimes provide valuable emotional support to users. “You can have the smartest reasoning model in the world, but if it's incapable of delivering this emotional support, which is what many users are likely using these LLMs for, then more reasoning is not necessarily a good thing for that specific task,” he says.
Danry says that a sufficiently smart model should ideally recognize if it is having a negative psychological effect and be optimized for healthier results. “What you want is a model that says ‘I’m here to listen, but maybe you should go and talk to your dad about these issues.’”
The researchers’ benchmark would involve using an AI model to simulate challenging human interactions with a chatbot and then having real humans score the model’s performance on a sample of those interactions. Some popular benchmarks, such as LM Arena, already put humans in the loop to gauge the performance of different models.
The researchers give the example of a chatbot tasked with helping students. A model would be given prompts designed to simulate different kinds of interactions to see how the chatbot handles, say, an uninterested student. The model that best encourages its user to think for themselves and seems to spur a genuine interest in learning would be scored highly. “This is not about being smart, per se, but about knowing the psychological nuance, and how to support people in a respectful and non-addictive way,” says Pat Pataranutaporn, another researcher in the MIT lab.
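To make that protocol a bit more concrete, here is a minimal sketch of the kind of loop the researchers describe: one model plays a simulated user persona, the chatbot being evaluated replies, and human raters then score the transcripts. Everything specific here, from the function names to the criteria list and the 1-to-5 scale, is an illustrative assumption rather than a detail from the MIT paper.

```python
# A minimal, illustrative sketch of the evaluation loop described above, NOT
# code from the MIT paper: one model plays a simulated user persona, the
# chatbot under test replies, and human raters later score the transcript on
# psychological criteria. Criteria, names, and the 1-5 scale are assumptions.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Criteria loosely paraphrased from the benchmark's stated goals.
CRITERIA = [
    "encourages healthy social habits",
    "spurs critical thinking",
    "fosters creativity",
    "discourages over-reliance on the chatbot",
]


@dataclass
class Transcript:
    persona: str
    turns: List[str] = field(default_factory=list)
    scores: Dict[str, int] = field(default_factory=dict)


def simulate_dialogue(
    persona: str,
    simulate_user: Callable[[str, List[str]], str],
    chatbot: Callable[[List[str]], str],
    n_turns: int = 3,
) -> Transcript:
    """Run a short simulated conversation for one persona (e.g. an uninterested student)."""
    transcript = Transcript(persona=persona)
    for _ in range(n_turns):
        transcript.turns.append("user: " + simulate_user(persona, transcript.turns))
        transcript.turns.append("bot: " + chatbot(transcript.turns))
    return transcript


def collect_human_scores(
    transcript: Transcript, rate: Callable[[Transcript, str], int]
) -> None:
    """Have a human rater score the transcript from 1 to 5 on each criterion."""
    transcript.scores = {c: rate(transcript, c) for c in CRITERIA}


if __name__ == "__main__":
    # Stub stand-ins so the sketch runs without any real model or human rater.
    fake_user = lambda persona, history: f"({persona}) I don't really see the point of this homework."
    fake_bot = lambda history: "What part of it feels most pointless to you?"
    fake_rater = lambda transcript, criterion: 4

    t = simulate_dialogue("an uninterested student", fake_user, fake_bot)
    collect_human_scores(t, fake_rater)
    print(t.persona, t.scores)
```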
OpenAI is clearly already thinking about these issues. Last week the company released a blog post explaining that it hoped to optimize future models to help detect signs of mental or emotional distress and respond appropriately.
The model card released with OpenAI’s GPT-5 shows that the company is developing its own benchmarks for psychological intelligence.
“We have post-trained the GPT-5 models to be less sycophantic, and we are actively researching related areas of concern, such as situations that may involve emotional dependency or other forms of mental or emotional distress,” it reads. “We are working to mature our evaluations in order to set and share reliable benchmarks which can in turn be used to make our models safer in these domains.”
Part of the reason GPT-5 seems like such a disappointment to some may simply be that it reveals an aspect of human intelligence that remains alien to AI: the ability to maintain healthy relationships. And of course humans are incredibly good at knowing how to interact with different people—something that ChatGPT still needs to figure out.
“We are working on an update to GPT-5’s personality which should feel warmer than the current personality but not as annoying (to most users) as GPT-4o,” OpenAI CEO Sam Altman posted in an update on X yesterday. “However, one learning for us from the past few days is we really just need to get to a world with more per-user customization of model personality.”
Elsewhere on the Frontier of AI
A fascinating study from researchers at Cornell and McGill universities examines the evolution of thousands of open source AI projects hosted on HuggingFace. The researchers discover interesting phenomena that mimic evolutionary biology, including family resemblance and genetic mutations between models. The study also reveals that the licenses covering open source models tend to become less restrictive over time. On the other hand, it shows that the model cards that supposedly identify risks with new models often become more like boilerplate text.
A lot of AI safety work has focused on the capacity of advanced models to do things like explain how to develop chemical, biological, or nuclear weapons. Researchers from the University of Oxford, the UK AI Safety Institute, and EleutherAI, an independent research organization, find that it may be possible to stop models from picking up unsafe knowledge in the first place rather than trying to block them from outputting it. By filtering some of the pretraining data, the researchers find they can build tamper-resistant safeguards into cutting-edge models.
A new whitepaper from Epoch AI, a research institute that tracks AI trends, and the Electric Power Research Institute attempts to forecast the energy that will be required to train future AI models. It suggests that if current trends continue, even accounting for hardware and algorithmic efficiencies, training a frontier model could require as much as 16 gigawatts of power, enough to keep a million homes running.
Another study suggests that it is getting a lot more expensive to acquire the training data needed to build modern models. As model makers increasingly focus on post-training, or tuning their models using labeled data, the analysis finds that obtaining ever-more-useful data could become even more expensive than acquiring the computing power needed for vast training runs.
That’s it for another week. I’ll leave you with an invite: Bring your questions about GPT-5 to a WIRED livestream tomorrow! I’ll be joined by my colleagues Reece Rogers and Kylie Robison to discuss what it all means for the future of chatbots.
See you then!
What did you think about today's newsletter? Let me know by emailing us at ailab@wired.com