In this week’s edition: The quest to give AI agents memory. Also: Subliminal model communications, a way for AI models to think in images, and a radically transparent new AI model from Nvidia.
For AI models, knowing what to remember might be as important as knowing what to forget. Welcome to the era of “sleeptime compute.”
Hi, and welcome to another AI Lab.
This week, I look at the quest to give large language models a more advanced ability to remember (and forget)—including one startup that takes inspiration from the way the human brain strengthens memories during sleep.
Do Large Language Models Dream of AI Agents?
During sleep, the human brain sorts through different memories, consolidating important ones while discarding those that don’t matter. What if AI could do the same?
Bilt, a company that offers local shopping and restaurant deals to renters, recently deployed several million agents in the hope of doing just that.
Bilt uses technology from a startup called Letta that allows agents to learn from previous conversations and share memories with one another. Using a process called “sleeptime compute,” the agents decide what information to store in their long-term memory vault and what might be needed for faster recall.
“We can make a single update to a [memory] block and have the behavior of hundreds of thousands of agents change,” says Andrew Fitz, an AI engineer at Bilt. “This is useful in any scenario where you want fine-grained control over agents’ context,” he adds, referring to the text prompt fed to the model at inference time.
Large language models can typically only “recall” information that is included in the context window. If you want a chatbot to remember your most recent conversation, you need to paste it into the chat.
Most AI systems can only handle a limited amount of information in the context window before their ability to use the data falters and they hallucinate or become confused. The human brain, by contrast, is able to file away useful information and recall it later.
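To make the idea concrete, here is a minimal sketch, in plain Python, of how an agent might juggle a limited context window: core memory blocks stay pinned in context, recent messages accumulate, and a background “sleeptime” pass pushes the overflow into a long-term archive. The class and method names are hypothetical illustrations, not Letta’s actual API.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryBlock:
    label: str
    content: str

@dataclass
class Agent:
    context_limit: int = 4000                        # rough word budget for the context window
    recent_messages: list = field(default_factory=list)
    core_blocks: list = field(default_factory=list)  # always kept in context
    archive: list = field(default_factory=list)      # long-term vault, searched on demand

    def receive(self, message: str) -> None:
        self.recent_messages.append(message)

    def sleeptime_pass(self) -> None:
        """Run between conversations: decide what stays in context and what
        gets pushed to the long-term archive."""
        while self._context_size() > self.context_limit and self.recent_messages:
            oldest = self.recent_messages.pop(0)
            # A real system would have an LLM summarize or score this message;
            # here we simply archive it verbatim.
            self.archive.append(MemoryBlock(label="episode", content=oldest))

    def _context_size(self) -> int:
        text = " ".join(b.content for b in self.core_blocks) + " " + " ".join(self.recent_messages)
        return len(text.split())

agent = Agent(context_limit=50)
agent.core_blocks.append(MemoryBlock("user_profile", "Renter in Chicago, likes Thai food."))
for i in range(40):
    agent.receive(f"message {i}: another turn of conversation")
agent.sleeptime_pass()
print(len(agent.recent_messages), "messages still in context;", len(agent.archive), "archived")
```

The point of running the consolidation step offline, between conversations, is that the agent never has to pause mid-chat to decide what to keep.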
“Your brain is continuously improving, adding more information like a sponge,” says Charles Packer, Letta’s CEO. “With language models, it's like the exact opposite. You run these language models in a loop for long enough and the context becomes poisoned; they get derailed and you just want to reset.”
Packer and his cofounder Sarah Wooders previously developed MemGPT, an open source project that aimed to help LLMs decide what information should be stored in short-term versus long-term memory. With Letta, the duo has expanded their approach to let agents learn in the background.
Bilt’s collaboration with Letta is part of a broader push to give AI the ability to store and recall useful information, which could make chatbots smarter and agents less error-prone. Memory remains underdeveloped in modern AI, which undermines the intelligence and reliability of AI tools, according to experts I spoke to.
Harrison Chase, cofounder and CEO of LangChain, another company that has developed a method for improving memory in AI agents, says he sees memory as a vital part of context engineering—wherein a user or engineer decides what information to feed into the context window. LangChain offers companies several different kinds of memory storage for agents, from long-term facts about users to memories of recent experiences. “Memory, I would argue, is a form of context,” Chase says. “A big portion of an AI engineer’s job is basically getting the model the right context [information].”
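As a rough illustration of what “getting the model the right context” can look like, here is a toy function that assembles a prompt from separate memory stores under a word budget. The store names and structure are assumptions for the sketch, not LangChain’s actual interfaces.

```python
def build_context(user_facts: list[str], recent_episodes: list[str], query: str, budget: int = 200) -> str:
    """Assemble the context window from long-term facts, recent experiences,
    and the current request, stopping once the word budget is spent."""
    sections = [
        "Long-term facts about the user:",
        *user_facts,
        "Recent experiences:",
        *recent_episodes,
        "Current request:",
        query,
    ]
    context, used = [], 0
    for line in sections:
        words = len(line.split())
        if used + words > budget:
            break                                  # stay under the budget
        context.append(line)
        used += words
    return "\n".join(context)

prompt = build_context(
    user_facts=["Prefers concise answers.", "Works as an AI engineer."],
    recent_episodes=["Yesterday we debugged a retrieval pipeline together."],
    query="Summarize what we decided about chunk sizes.",
)
print(prompt)
```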
Consumer AI tools are gradually becoming less forgetful too. This February, OpenAI announced that ChatGPT will store relevant information in order to provide a more personalized experience for users—although the company did not disclose how this works.
Letta and LangChain make the process of recall more transparent to engineers building AI systems.
“I think it’s super important not only for the models to be open but also for the memory systems to be open,” says Clem Delangue, CEO of the AI hosting platform Hugging Face and an investor in Letta.
Intriguingly, Packer hints that it might also be important for AI models to learn what to forget.
“If a user says, ‘That one project we were working on, wipe it out from your memory,’ then the agent should be able to go back and retroactively rewrite every single memory.”
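A toy version of that kind of retroactive forgetting might look like the following, assuming memories are stored as plain text records; the helper name and record format are purely illustrative, and a real system would also have to rewrite any summaries derived from the deleted records.

```python
memories = [
    {"id": 1, "text": "User is planning Project Falcon with a Q3 deadline."},
    {"id": 2, "text": "User prefers meetings after 10 am."},
    {"id": 3, "text": "Follow up on the Project Falcon budget next week."},
]

def forget(records: list[dict], topic: str) -> list[dict]:
    """Drop every memory record that mentions the topic the user asked to wipe."""
    return [m for m in records if topic.lower() not in m["text"].lower()]

memories = forget(memories, "Project Falcon")
print(memories)  # only the meeting-time preference survives
```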
Elsewhere on the Frontier of AI
A few weeks back, researchers at Anthropic and other institutions announced a surprising discovery: models can pass preferences to one another through a kind of subliminal learning. The researchers found that when one model is used to train another by interacting with it—a process known as distillation—the tutor model can sometimes transmit certain preferences through seemingly unrelated information. They give the example of a model that likes owls passing along its ornithological interest—and they worry that models could transmit harmful intentions the same way.
Nvidia has released an efficient new open source reasoning model called Nemotron Nano 2. More importantly, the company disclosed details of almost all the data used to train it, an unprecedented level of transparency that should help researchers understand how the model’s reasoning abilities arise.
Researchers from the Chinese Academy of Sciences, Nanjing University, and Tsinghua University have developed a way for open source AI models to use images in their chain of thought, similar to the way OpenAI’s o3 model is able to “think with images.” The Chinese approach involves having the model generate code to manipulate and make sense of visual information in different images.
The Chinese startup Z.ai has also released an impressive open source multimodal AI model called GLM-4.5V. The model achieves state-of-the-art performance on benchmark tests covering image, video, and graphical user interface understanding, something that should be useful for developing computer-using agents.
Scientists at Shanghai Jiao Tong University have devised a new way for an AI agent to perform web research that seems to outperform other techniques. The new method involves having a pair of agents work together to plan and execute research.
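The division of labor between a planning agent and an executing agent can be sketched roughly as below, with the model and search calls stubbed out; this illustrates the general planner/executor pattern, not the researchers’ actual code.

```python
def planner(question: str) -> list[str]:
    # A real planner would prompt an LLM to break the question into research steps.
    return [
        f"Find background on: {question}",
        f"Find recent developments on: {question}",
    ]

def executor(step: str) -> str:
    # A real executor would run web searches and read the resulting pages.
    return f"(summary of search results for '{step}')"

def research(question: str) -> str:
    notes = [executor(step) for step in planner(question)]
    # A final model pass would normally synthesize these notes into an answer.
    return "\n".join(notes)

print(research("How do AI agents store long-term memories?"))
```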
Caption from an embedded video: The robot was asked to place the teal gummies and orange peach gummies in the fruit basket, and give the tester his favorite grape gummies. None of the objects had appeared in the training data.
What did you think about today's newsletter? Let me know by emailing me at ailab@wired.com |
- Teachers Are Trying to Make AI Work for Them: Since the start of the AI boom, teachers have been tasked with figuring out if LLMs are helpful tools or a cheat code. This is how they’re bringing AI to their curricula.
- What Do Kids Actually Think About AI? Parents, teachers, and experts have big opinions about the impacts of AI on young people and education. But what do the students themselves say?
- Developers Say GPT-5 Is a Mixed Bag: Software engineers are finding that OpenAI’s new GPT-5 model is helping them think through coding problems—but isn’t much better at actual coding.
Sign Up to Our Other Subscriber Newsletters

- Dispatches from the heart of the AI scene by senior correspondent Kylie Robison
- A clear-eyed view of the tech news coming out of China by Zeyi Yang and Louise Matsakis
- WIRED editor at large Steven Levy puts the week’s biggest tech news in perspective