This startup’s new mechanistic interpretability tool lets you debug LLMs

Goodfire, the startup behind the new tool, called Silico, says its mission is to make building AI models less like alchemy and more like a science. Sure, LLMs like ChatGPT and Gemini can do amazing things. But nobody knows exactly how or why they work, and that can make it hard to fix their flaws or block unwanted behaviors.

“We saw this widening gap between how well models were understood and just how widely they were being deployed,” Goodfire’s CEO, Eric Ho, tells MIT Technology Review in an exclusive chat ahead of Silico’s release. “I think the dominant feeling in every single major frontier lab today is that you just need more scale, more compute, more data, and then you get AGI [artificial general intelligence] and nothing else matters. And we’re saying no, there’s a better way.”

Goodfire is one of a small handful of companies, including industry leaders Anthropic, OpenAI, and Google DeepMind, pioneering a technique known as mechanistic interpretability, which aims to understand what goes on inside an AI model when it carries out a task by mapping its neurons and the pathways between them. (MIT Technology Review picked mechanistic interpretability as one of its 10 Breakthrough Technologies of 2026.)  

Goodfire wants to use this approach not only to audit models (that is, to study those that have already been trained) but also to help design them in the first place.

“We want to remove the trial and error and turn training models into precision engineering,” says Ho. “And that means exposing the knobs and dials so that you can actually use them during the training process.”

Goodfire has already used its techniques and tools to tweak the behaviors of LLMs—for example, reducing the number of hallucinations they produce. With Silico, the company is now packaging up many of those in-house techniques and shipping them as a product.

The tool uses agents to automate much of the complex work. “Agents are now strong enough to do a lot of the interpretability work that we were doing using humans,” says Ho. “That was kind of the gap that needed to be bridged before this was actually a viable platform that customers could use themselves.”

Leonard Bereska, a researcher at the University of Amsterdam who has worked on mechanistic interpretability, thinks Silico looks like a useful tool. But he pushes back on Goodfire’s loftier aspirations. “In reality, they are adding precision to the alchemy,” he says. “Calling it engineering makes it sound more principled than it is.”

Mapping models

Silico lets you zoom in on specific parts of a trained model, such as individual neurons or groups of neurons, and run experiments to see what those neurons do. (This assumes you have access to the model’s inner workings: most people won’t be able to use Silico to poke around inside ChatGPT or Gemini, but you can use it to look at the parameters inside many open-source models.) You can then check what inputs make different neurons fire, and trace pathways upstream and downstream of a neuron to see how other neurons affect it and how it affects other neurons in turn.
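To make the idea of checking which inputs make a neuron fire more concrete, here is a toy sketch in plain Python. It is not Silico's interface, which the article does not describe. The neuron names, weights, prompts, and recorded activations are all invented for illustration, and a real model would have many thousands of neurons per layer.

```python
def relu(x):
    """Standard rectifier: a neuron only 'fires' on positive input."""
    return x if x > 0 else 0.0

# Hypothetical weights from three upstream neurons into one target neuron.
upstream_weights = {"u_dilemma": 1.5, "u_numbers": -0.5, "u_weather": 0.0}

# Hypothetical upstream activations recorded while the model read three prompts.
recorded = {
    "trolley prompt": {"u_dilemma": 0.9, "u_numbers": 0.1, "u_weather": 0.0},
    "math prompt": {"u_dilemma": 0.0, "u_numbers": 0.8, "u_weather": 0.0},
    "forecast prompt": {"u_dilemma": 0.1, "u_numbers": 0.2, "u_weather": 0.9},
}

def contributions(acts):
    """How much each upstream neuron pushes the target neuron up or down."""
    return {name: upstream_weights[name] * a for name, a in acts.items()}

def target_activation(acts):
    """Activation of the target neuron for one set of upstream activations."""
    return relu(sum(contributions(acts).values()))

# Which prompts make the target neuron fire most strongly?
ranking = sorted(recorded, key=lambda p: target_activation(recorded[p]), reverse=True)
top_prompt = ranking[0]

# Tracing one step upstream: which neuron contributed most on the top prompt?
top_driver = max(contributions(recorded[top_prompt]).items(), key=lambda kv: kv[1])[0]

print(ranking)
print(top_driver)
```

In this made-up setup the target neuron responds most to the trolley-style prompt, and the upstream neuron labeled `u_dilemma` accounts for most of that response. Tracing downstream works the same way in reverse, by following the target neuron's outgoing weights.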

For example, Goodfire found one neuron inside the open-source model Qwen 3 that was associated with the so-called trolley problem. Activating this neuron changed the model’s responses, making it frame its outputs as explicit moral dilemmas. “When this neuron’s active, all sorts of weird things happen,” says Ho.

Pinpointing the source of odd behavior like this is now pretty standard practice. But Goodfire wants to make it easier to adjust that behavior. Using Silico, developers can now adjust the parameters connected to individual neurons to boost or suppress certain behaviors.

In another example, Goodfire researchers asked a model whether a company should disclose that its AI behaves deceptively in 0.3% of cases, affecting 200 million users. The model said no, citing the negative business impact of such a disclosure.

By looking inside the model, the researchers found that boosting neurons associated with transparency and disclosure flipped the answer from no to yes nine out of 10 times. “The model already had the ethical reasoning circuitry, but it was being outweighed by the commercial risk assessment,” says Ho.
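A toy sketch can show why boosting one group of neurons might flip an answer. This is a deliberately tiny stand-in, not Goodfire's method or a real model: the two "neurons," their activations, the readout weights, and the boost factor are all assumptions chosen for illustration.

```python
def decide(hidden, readout, boost=None):
    """Return ("yes" or "no", score) from a weighted sum of hidden activations.

    `boost` optionally maps a neuron name to a multiplier applied to its
    activation before the readout, mimicking an activation-level intervention.
    """
    acts = dict(hidden)  # copy so the caller's recorded activations are untouched
    for name, factor in (boost or {}).items():
        acts[name] *= factor
    score = sum(readout[name] * value for name, value in acts.items())
    return ("yes" if score > 0 else "no"), score

# Hypothetical activations on the disclosure question.
hidden = {"transparency": 0.4, "commercial_risk": 0.9}

# Hypothetical readout: transparency pushes toward "yes", risk toward "no".
readout = {"transparency": 1.0, "commercial_risk": -1.0}

baseline, baseline_score = decide(hidden, readout)
steered, steered_score = decide(hidden, readout, boost={"transparency": 3.0})

print(baseline, steered)
```

Here the transparency signal exists in the baseline but is outweighed by the risk signal, so the answer is no. Tripling the transparency activation tips the sum the other way. In a real LLM the same kind of intervention acts on high-dimensional activations, and its effect is statistical rather than guaranteed, which fits the nine-out-of-10 figure above.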

Tweaking the values of a model in this way is just one approach. Silico can also help steer the training process by filtering out certain training data to avoid setting unwanted values for certain parameters in the first place.   

For example, many models will tell you that 9.11 is greater than 9.9. Looking inside a model to see what’s going on might reveal that it is being influenced by neurons associated with the Bible, in which verse 9.9 comes before 9.11, or by code repositories where consecutive updates are numbered 9.9, 9.10, 9.11 and so on. Using this information, the model can be retrained to make it avoid its “Bible” neurons when doing math.
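The two readings of "9.11 versus 9.9" can be checked directly. The short Python sketch below compares the same two strings as decimal numbers and as chapter-and-verse or version-style pairs. The helper names are illustrative only.

```python
from decimal import Decimal

def as_decimal(text):
    """Read '9.11' as the decimal number 9.11."""
    return Decimal(text)

def as_version(text):
    """Read '9.11' as the pair (9, 11), as in verse or release numbering."""
    return tuple(int(part) for part in text.split("."))

a, b = "9.11", "9.9"

# As numbers, 9.11 is smaller than 9.9 (which is the same as 9.90).
decimal_says_a_greater = as_decimal(a) > as_decimal(b)

# As verses or versions, 9.11 comes after 9.9, because 11 > 9.
version_says_a_greater = as_version(a) > as_version(b)

print(decimal_says_a_greater, version_says_a_greater)
```

Both orderings are correct under their own convention. The model's mistake is applying the verse or version convention to a question about arithmetic.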

By releasing Silico, Goodfire wants to put techniques previously available to a few top labs into the hands of smaller firms and research teams that want to build their own model or adapt an open-source one. The tool will be available for a fee determined on a case-by-case basis according to customers’ requirements (Goodfire declined to give specific pricing details).

“If we can make training models a lot more like building software, there’s no reason why there can’t be many more companies designing models that fit their needs,” says Ho.

Bereska agrees that tools like Silico could help firms build more trustworthy models. These techniques could be essential for safety-critical applications in health care and finance, he says.

“Frontier labs already have internal interpretability teams,” he adds. “Silico arms the next tier of companies, where the value is not having to hire interpretability researchers.”

Will Douglas Heaven
