Cracking Open the Elusive AI Black Box with Daniel Balsam

Right now there’s no way to go in and just correct an LLM. At least not yet. They’re still black boxes. The only way to modify AI behavior is to train it with more data.

But the future might be different.

A recent paper from MIT on the Platonic Representation Hypothesis suggests that different AI learning systems converge on a similar world model given similar training data. This corroborates a growing body of similar research: It seems you can train an MLP, a transformer, or a Mamba model using different data sets and they'll end up learning basically the same thing. The authors hypothesize that this represents a convergence “driving toward a shared statistical model of reality, akin to Plato’s concept of an ideal reality.”

This sounds like a big deal. And it’s tempting to get carried away and speculate: Maybe AI models perceive the world with greater accuracy than we do! Maybe we’re closer to AGI than we know!

To get answers we checked in with Daniel Balsam, who sits on our Build Mode Editorial Board and is the co-founder of a startup working on mechanistic interpretability, i.e., trying to see inside the AI black box. His take is that it’s less about discovering platonic reality and more about the nature of learning. Just like there are fundamental laws of physics, there are fundamental laws of learning, such that even if two children are taught math in radically different settings (one by running a lemonade stand, the other by betting on commodity futures), they’ll both end up discovering the underlying patterns of addition, multiplication, and compound interest.

That would suggest that knowledge is something real out there in the world, waiting to be discovered. Wait a minute, what is knowledge? What is learning? What does it all mean?!

Balsam explained it beautifully: Learning is about finding the most efficient way to express something. The smallest number of neurons. The shortest circuit. That’s the underlying pattern that different models are converging on: the easiest, simplest way to say the thing.

Scaling the model size, as well as task and data diversity, is how models reach towards this phenomenon, which is also called universality.

“This fundamental, surprising observation is why we seem likely to achieve something like AGI sooner than many previously thought,” Balsam explained.

Boom!

But what does that mean for companies?

Balsam is optimistic that soon software will be available that allows you to go in and identify where the model is thinking about, say, gender, and edit the model directly to reduce bias.

A new study from Anthropic mapped out millions of combinations of neurons in its Claude 3 Sonnet model and found that by turning certain combinations up or down they could change how the AI behaved, not unlike Matthew McConaughey adjusting the humor setting on TARS in Interstellar.

The question then is whether the humans doing the editing are smart enough—and unbiased enough—to do it better than the AI.


CHART OF THE WEEK

AI spending grew 293% last year

According to Ramp, a fintech platform, B2B AI-related transaction volume has surged by 293%, dwarfing the 6% increase in overall software spending.

Ramp’s insights are especially valuable as they track spending from numerous companies. This perspective reveals the growing reliance on AI tools, with over a third of Ramp customers now paying for at least one AI tool, up from 21% a year ago. The average business spent $1.5k on AI tools in Q1, an increase of 138% year over year and evidence that companies using AI are seeing clear benefits and doubling down.

It is also fascinating to look at which industries are spending the most on AI: Tech's first-place finish is no surprise, but finance and consulting are close behind. In other words, AI is booming in knowledge work hotspots.

WATERCOOLER

Former OpenAI board member finally speaks on firing Sam Altman

On November 17th, 2023, Sam Altman was abruptly fired by the OpenAI board. Helen Toner, a former board member, opened up about this unexpected move during an interview on The TED AI Show podcast. Toner revealed that the decision stemmed from Altman's "outright lying," which eroded the board's trust in his leadership. She pointed out specific instances, including Altman's failure to disclose his ownership of the OpenAI Startup Fund and his providing inaccurate information about the company's safety processes.

Despite the board's action, pressure from employees and major stakeholders like Microsoft led to Altman’s reinstatement. Toner attributed this to a combination of fear and Altman's persuasive track record.

To pile on, Kevin Roose wrote about a new group of OpenAI whistleblowers concerned about “a culture of recklessness and secrecy” inside the company. The whistleblowers claim that OpenAI has used “hardball tactics to prevent workers from voicing their concerns about the technology.”

“OpenAI is really excited about building A.G.I., and they are recklessly racing to be the first there,” said Daniel Kokotajlo, a former OpenAI researcher.

The group published an open letter calling for greater transparency across the big AI companies—which sounds like a good thing to us. Yoshua Bengio and Geoffrey Hinton agree.

DISCOVERY ZONE

Rebind is a revolutionary e-reading experience that transforms your book journey with insightful commentary and expert guidance. It's like having the world's most interesting people as your reading companions, enhancing your understanding and engagement with every page.

EVENTS

Gen AI Salon: Consumer Tech in a Sci-Fi World

On June 25th, A.Team and J.P. Morgan are bringing together an exclusive group of industry founders, VCs, researchers, and decision-makers in NYC for the Gen AI Salon: Consumer Tech in a Sci-Fi World.

Join to hear three lightning talks from some of the world’s top AI innovators, an expert panel of consumer tech leaders to address your biggest challenges, and a generative AI hackathon competition where you’ll decide the winner. Don’t miss out — apply now to secure your spot, either in person or virtually!
