What’s going on here?
AI companies and tech titans have been banking on a new way to drive big revenue: small language models.
What does this mean?
Tech giants Apple, Microsoft, Meta, and Google have been releasing “small language models” lately, eager to woo businesses that are wary of the massive costs and power demands of beasts like OpenAI’s ChatGPT. See, GPT-4 and Google’s Gemini 1.5 Pro are both large language models, the smartest type on the market. Problem is, they’re expensive to train and run – and they come with copyright and data headaches on top. Smaller models, meanwhile, are a winning combination: cheaper, easier to customize, and more energy-efficient, since they need far less computing power. Not to mention, they can process tasks locally on a device – perfect for firms that want to keep data in-house rather than in the cloud.
Why should I care?
Zooming out: The ins and outs.
Currently, when you ask models like ChatGPT to pick an outfit or write a poem, the request zips off to OpenAI’s servers in the cloud, gets processed, and bounces back with an answer. That requires an internet connection, and it means sharing your data with the model’s maker. So “edge computing”, where everything happens right on your device, is creating a buzz – and driving demand for speedy “edge” chips that can run models without spilling your data. Those chips need to be small, cheap, and energy-efficient enough to fit into a device like a smartphone. Enter Apple, which is scrambling to develop its own AI chips to upgrade the iPhone.
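To make that contrast concrete, here’s a rough sketch (ours, not the article’s) of what on-device inference with a small model can look like, using the open-source Hugging Face transformers library. The specific model named below, Microsoft’s Phi-3-mini, is just an illustrative pick, and you’d need a device with enough memory to hold it:

```python
# A minimal sketch of on-device ("edge") inference, assuming the Hugging Face
# transformers library and a small, openly released model. The model below
# (Microsoft's Phi-3-mini) is an illustrative choice, not one the article names.
from transformers import pipeline

# The weights are downloaded once; after that, generation runs on this device,
# so the prompt itself never leaves the machine at inference time.
generator = pipeline(
    "text-generation",
    model="microsoft/Phi-3-mini-4k-instruct",
)

result = generator(
    "Suggest an outfit for a rainy spring day.",
    max_new_tokens=60,
)
print(result[0]["generated_text"])
```

The only step that touches the network is the one-time model download – after that, prompts like the outfit question stay on the device, which is the whole appeal for privacy-minded users.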
The bigger picture: Small but mighty.
If edge chips get small and cheap enough, AI-driven smart gadgets could be everywhere – from homes to offices. Nowadays, cloud-based AI models are mostly confined to data centers, which demand a ton of power and space. So if AI could run directly on folks’ devices, those challenges might become a thing of the past.