Gen AI Insights

Until generative AI can give us facts and not probabilities, it’s simply not going to be good enough for a wide swath of use cases, no matter how much the next DeepSeek speeds up its calculations.
Fact-checking AI


In January, DeepSeek seemingly changed everything in AI: mind-blowing speed at dramatically lower costs. As Lucas Mearian writes, DeepSeek sent “shock waves” through the AI community, but its impact likely won’t last. Soon there will be something faster and cheaper. But will there be something that provides what we most need, namely more accuracy and truth? We can’t solve that problem by making AI more open. It’s deeper than that.

“Every week there’s a better AI model that gives better answers,” analyst Benedict Evans notes. “But a lot of questions don’t have better answers, only right answers, and these models can’t do that.” This isn’t to say performance and cost improvements aren’t needed. DeepSeek, for example, makes genAI models more affordable for enterprises that want to build them into applications. And, as investor Martin Casado and former Microsoft executive Steven Sinofsky suggest, the application layer, not infrastructure, is the most interesting and important area for genAI development.

The problem, however, is that many applications depend on right-or-wrong answers, not “probabilistic … outputs based on patterns they have observed in the training data,” as I’ve covered before. As Evans expresses it, “There are some tasks where a better model produces better, more accurate results, but other tasks where there’s no such thing as a better result and no such thing as more accurate, only right or wrong.”
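The contrast between probabilistic and deterministic answers can be sketched in a few lines of Python. This is a toy illustration, not any real model: the candidate answers and their probabilities are made up, and `CENSUS` is a stand-in for an authoritative source. The point is that a sampler over a learned distribution can return different answers to the same question, while a lookup against a fact table cannot.

```python
import random

# Hypothetical distribution a model might assign to candidate answers for
# "How many elevator operators were there in the U.S. in 1980?"
# These numbers are invented for illustration only.
CANDIDATES = {
    "21,000": 0.45,
    "18,500": 0.30,
    "90,000": 0.25,
}

def sample_answer(rng: random.Random) -> str:
    """Probabilistic generation: draw an answer according to the model's
    learned distribution, as temperature sampling does."""
    answers, weights = zip(*CANDIDATES.items())
    return rng.choices(answers, weights=weights, k=1)[0]

def lookup_answer(table: dict, key: str) -> str:
    """Deterministic retrieval: the same key always returns the same fact."""
    return table[key]

# Placeholder fact table standing in for the Census PDF.
CENSUS = {"elevator_operators_1980": "21,000"}

rng = random.Random()
# Over many draws, the sampler can emit any candidate answer,
# including the wrong ones; the lookup never varies.
samples = {sample_answer(rng) for _ in range(200)}
```

Every answer the sampler produces is "plausible" under its distribution, which is exactly why confidence tells you nothing about correctness; only the lookup is right by construction.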

In the absence of the ability to speak truth rather than probabilities, the models may be worse than useless for many tasks. The problem is that these models can be exceptionally confident and wrong at the same time. It’s worth quoting an Evans example at length. In trying to find the number of elevator operators in the United States in 1980 (a number clearly identified in a U.S. Census report), he gets a range of answers:
 
First, I try [the question] cold, and I get an answer that’s specific, unsourced, and wrong. Then I try helping it with the primary source, and I get a different wrong answer with a list of sources, that are indeed the U.S. Census, and the first link goes to the correct PDF… but the number is still wrong. Hmm. Let’s try giving it the actual PDF? Nope. Explaining exactly where in the PDF to look? Nope. Asking it to browse the web? Nope, nope, nope…. I don’t need an answer that’s perhaps more likely to be right, especially if I can’t tell. I need an answer that is right.
 

Just wrong enough


But what about questions that don’t require a single right answer? For the particular task Evans attempted, the system will always be just wrong enough to never give the right answer. Maybe, just maybe, better models will fix this over time and become consistently correct in their output. Maybe.

The more interesting question Evans poses is whether there are “places where [generative AI’s] error rate is a feature, not a bug.” It’s hard to think of how being wrong could be an asset, but as an industry (and as humans) we tend to be really bad at predicting the future. Today we’re trying to retrofit genAI’s non-deterministic approach to deterministic systems, and we’re getting hallucinating machines in response.


About the Author: Matt Asay runs developer relations at MongoDB. Previously, Asay was a Principal at Amazon Web Services and Head of Developer Ecosystem for Adobe. Asay is an emeritus board member of the Open Source Initiative (OSI) and holds a J.D. from Stanford, where he focused on open source and other IP licensing issues.

©2025 IDG Communications, Inc.