Small language models (SLMs) are giving CIOs greater opportunities to develop specialized, business-specific AI applications that are less expensive to run than those reliant on general-purpose large language models (LLMs).

By 2027, smaller, context-specific models will outpace LLMs, with usage volumes at least three times higher, according to a recent report from Gartner, which also claims LLM response accuracy declines for tasks requiring specific business context.

“The variety of tasks in business workflows and the need for greater accuracy are driving the shift towards specialized models fine-tuned on specific functions or domain data,” says Sumit Agarwal, an analyst at Gartner who helped author the report. “These smaller, task-specific models provide quicker responses and use less computational power, reducing operational and maintenance costs.”

Dr. Magesh Kasthuri, a member of the technical staff at Wipro in India, says he doesn’t think LLMs are more error-prone than SLMs but agrees that LLM hallucinations can be a concern.

“For domain-centric solutions such as in the banking or energy sector, SLM is the way to go for agility, cost-effective resources, rapid prototyping and development, security, and privacy of organizational data,” Kasthuri says.

Despite the spotlight on general-purpose LLMs that perform a broad array of functions, such as OpenAI’s GPT models, Google’s Gemini, Anthropic’s Claude, and xAI’s Grok, a growing fleet of small, specialized models is emerging as a cost-effective alternative for task-specific applications, including Meta’s Llama 3.1, Microsoft’s Phi, and Google’s Gemma. For example, Google claims its recently introduced Gemma 3 SLM can run on just one Nvidia GPU.