Gen AI Insights

Many vendors in the dev tools arena have argued that this can be solved by using AI apps to manage AI coding apps. Cue train wreck No. 2. Even financial giant Morgan Stanley is toying with using AI to manage AI.
 
As a practical matter, the only safe and remotely viable approach is to train programming managers to understand the nature of generative AI coding errors. In fact, given that the nature of AI coding errors is so vastly different, it might be better to train new people to manage AI coding efforts — people who are not already steeped in finding human coding mistakes.

Part of the problem is human nature. People tend to magnify and misinterpret differences. If managers see an entity — be it human or AI — making mistakes they themselves would never make, they tend to assume the entity is inferior on coding matters.

But consider that assumption in light of autonomous vehicles. Statistically, those vehicles are far safer than human-operated cars. The automated systems are never tired, never drunk, never deliberately reckless.

But automated vehicles are not perfect. And the kinds of mistakes they make — such as smashing full-speed into a truck stopped for traffic — prompt humans to argue, “I never would have done something so stupid. I don’t trust them.” (The Waymo parked car disaster is a must-see video.)

Just because automated vehicles make weird mistakes doesn’t mean they’re less safe than human drivers. Yet human nature struggles to reconcile those differences.

It’s the same situation with managing coding. Generative AI coding models can be quite efficient, but when they go off the rails, they go way off.

Insane alien programmers


Dev Nag, CEO of SaaS firm QueryPal, has been working with generative AI coding efforts and feels many enterprise IT executives are not prepared for how different the new technology is.

“It made tons of weird mistakes, like an alien from another planet,” Nag said. “The code misbehaves in a way that human developers don’t do. It’s like an alien intelligence that does not think like we do, and it goes in weird directions. AI will find a pathological way to game the system.”

Just ask Tom Taulli, who’s authored multiple AI programming books, including this year’s AI-Assisted Programming: Better Planning, Coding, Testing, and Deployment.

“For example, you can ask these LLMs [large language models] to create code and they sometimes make up a framework, or an imaginary library or module, to do what you want it to do,” Taulli said. (He explained that the LLMs were not actually creating a new framework as much as pretending to do so.)

That’s not something a human programmer would even consider doing, Taulli noted: “unless [the human coder] is insane, they are not going to make up, create out of thin air, an imaginary library or module.”

When that happens, it can be easy to detect — if someone looks for it. “If I try to pip install it, you can find that there’s nothing there. If it hallucinates, the IDE and compiler give you an error,” Taulli said.
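That pip-install check can be automated. As a cheap first guardrail, a review pipeline could verify that every library an AI-generated snippet imports actually resolves in the current environment before anyone runs the code. Below is a minimal sketch; the function name and the hallucinated package name are invented for illustration, not taken from any real tool:

```python
import ast
import importlib.util

def find_unresolvable_imports(source: str) -> list[str]:
    """Return top-level module names imported by `source` that cannot
    be found in the current environment -- a cheap first check for
    hallucinated dependencies in AI-generated code."""
    missing = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]  # only the top-level package matters here
            if importlib.util.find_spec(top) is None:
                missing.append(top)
    return missing

# 'totally_real_framework' stands in for a hallucinated library.
snippet = "import json\nimport totally_real_framework\n"
print(find_unresolvable_imports(snippet))  # -> ['totally_real_framework']
```

This only catches imports of packages that don’t exist at all; a hallucinated function inside a real library would still slip through until the IDE, compiler, or a test flags it, as Taulli describes.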

The idea of turning over full coding of an application — including creative control of the executable — to a system that periodically hallucinates seems to me a dreadful approach.

A much better way to leverage the efficiency of generative AI coding is by using it as a tool to help programmers get more done. Taking humans out of the loop, as AWS’s Garman suggested might happen, would be suicidal.

What if a generative AI coding tool lets its mind wander and creates some back doors so it can later do fixes without having to bother a human — back doors that attackers could also use? 

Enterprises tend to be quite effective at testing apps — especially homegrown apps — for functionality, to make sure the app does what it is supposed to do. Where app testing tends to fall apart is in checking whether the app can do anything it should not do. That is a penetration testing mentality.

But in a generative AI coding reality, that pen testing approach has to become the default. It also needs to be managed by supervisors well schooled in the wacky world of generative AI mistakes.
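To make the two mindsets concrete, here is a toy sketch; the file-serving helper and the paths are hypothetical, not drawn from any real application. The first assertion is the usual functional test (does it do what it should?); the second is the pen-test-style negative test (does it refuse what it should not do?):

```python
import os

def resolve_user_path(base_dir: str, requested: str) -> str:
    """Hypothetical file-serving helper: map a user-supplied path to a
    file under base_dir, refusing anything that escapes that directory."""
    candidate = os.path.normpath(os.path.join(base_dir, requested))
    if not candidate.startswith(os.path.normpath(base_dir) + os.sep):
        raise PermissionError(f"path escapes {base_dir}: {requested}")
    return candidate

# Functional test: the app does what it is supposed to do.
assert resolve_user_path("/srv/app", "reports/q3.txt") == "/srv/app/reports/q3.txt"

# Negative (pen-test-style) test: the app refuses what it should not do.
try:
    resolve_user_path("/srv/app", "../../etc/passwd")
    raise AssertionError("path traversal was allowed")
except PermissionError:
    pass
```

A functionality-only test suite would include the first check and ship; the generative-AI-era default the column argues for is that the second kind of check becomes routine.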

Enterprise IT is certainly looking at a more efficient coding future, with programmers assuming more strategic roles where they focus more on what the apps should do and why and devote less time to laboriously coding every line. 

But that efficiency and those strategic gains will come at a hefty price: paying for better, differently trained humans to make sure AI-generated code stays on track.


About the Author: Evan Schuman has covered IT issues for a lot longer than he'll ever admit. The founding editor of retail technology site StorefrontBacktalk, he's been a columnist for CBSNews.com, RetailWeek, Computerworld and eWeek and his byline has appeared in titles ranging from BusinessWeek, VentureBeat and Fortune to The New York Times, USA Today, Reuters, The Philadelphia Inquirer, The Baltimore Sun, The Detroit News and The Atlanta Journal-Constitution. 

©2024 IDG Communications, Inc.