Going big was key to the start of the artificial intelligence arms race: massive models, trained on massive amounts of data, were built to approximate human intelligence.
Now, in an effort to cut costs, increase speed, and specialize their AI software, tech giants and startups alike are thinking smaller. Small and medium-sized language models are trained on less data and are often built for narrow, specialized purposes.
The largest models, such as OpenAI’s GPT-4, use more than one trillion parameters and cost more than $100 million to develop. Large models generally use more than 10 billion parameters and can cost upward of $10 million to train. Smaller models, by contrast, are often trained on narrower data sets focused on a single subject, such as legal issues. They are also cheaper to run per query because they require less computing power.
Microsoft has touted its Phi family of small models, which CEO Satya Nadella says perform many tasks nearly as well as the free model behind OpenAI’s ChatGPT despite being just 1/100th its size.
“I think we increasingly believe it’s going to be a world of different models,” said Yusuf Mehdi, Microsoft’s chief commercial officer.
Mehdi stated that Microsoft, one of the first major tech firms to invest billions of dollars in generative AI, soon discovered the technology was more expensive to run than it had first projected.
The company has also introduced AI laptops that use dozens of small AI models to conduct searches and generate images. Unlike ChatGPT, those models don’t need access to giant cloud-based supercomputers; they are small enough to run directly on the device.
Google and the AI firms Mistral, Anthropic, and Cohere have also released smaller models this year. In June, Apple revealed its own AI road map, which relies on small models to run the software entirely on its phones, making it faster and safer.
Even OpenAI, which has been leading the large-model trend, recently released a version of its flagship model that it says is cheaper to run. A spokesperson said the company is also considering introducing smaller models in the future.
For many jobs, such as summarizing documents or generating images, large models are overkill: it’s like driving a tank to pick up groceries.
Computing 2 + 2 shouldn’t require quadrillions of operations, said Illia Polosukhin, one of the authors of a groundbreaking 2017 Google paper that laid the groundwork for the current generative AI boom and who now works on blockchain technology.
With the returns on generative AI still uncertain, companies and consumers have also been looking for cheaper ways to run the technology.
Small models can answer queries for as little as one-sixth the cost of large language models because they need less computing power, according to Yoav Shoham, co-founder of AI21 Labs, a Tel Aviv-based AI startup. When hundreds of thousands or millions of answers have to be processed, he said, using a large model isn’t economically feasible.
The key is fine-tuning these smaller models to perform specific tasks, such as drafting emails, by training them on a focused set of data, such as sales figures, legal documents, or internal communications. With this approach, small models can do much of the same work as large models at a fraction of the cost.
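As a rough illustration of that approach, the sketch below fine-tunes a small open model on a company’s own text using the Hugging Face transformers library. The model name, file path, and training settings are illustrative assumptions made for the example, not details of any pipeline described in this article.

```python
# Minimal sketch: fine-tuning a small open model on domain-specific text.
# The model, data file, and hyperparameters below are assumptions for
# illustration only, not any company's actual setup.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "microsoft/phi-2"  # a small (~2.7B-parameter) open model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # causal LMs often lack a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Assumed local file of domain text (e.g., internal documents), one example per line.
dataset = load_dataset("text", data_files={"train": "internal_docs.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="phi2-domain-tuned",
        per_device_train_batch_size=2,
        num_train_epochs=1,
        learning_rate=2e-5,
    ),
    train_dataset=tokenized,
    # mlm=False configures the collator for causal (next-token) language modeling.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

In practice, teams would typically add evaluation data and parameter-efficient techniques, but the basic recipe of pointing a small model at a narrow body of text is the same.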
Getting these smaller, specialized models to work well in such mundane but crucial areas is the current frontier of artificial intelligence, according to Alex Ratner, co-founder of Snorkel AI, a startup that helps businesses customize AI models.
Experian, the credit-reporting firm, switched from large to small models for the AI chatbots it uses for customer support and financial advice. Once trained on the company’s internal data, the smaller models outperformed the larger ones at a much lower cost, said Ali Khan, Experian’s chief data officer.
The smaller models are also faster, said Clara Shih, Salesforce’s head of AI. Large models, she said, lead to overspending and latency problems.
The shift toward smaller models coincides with a slowdown in the development of large, publicly available models. Since OpenAI released GPT-4 last year, a notable step forward from its predecessor GPT-3.5, no new model has represented a comparable advance. Researchers attribute this to several factors, including a shortage of fresh, high-quality training data. That lull has helped push smaller models into the spotlight.
“There is this little moment of lull where everybody is waiting,” said Sébastien Bubeck, the Microsoft executive in charge of the Phi model project. “It makes sense that you start asking yourself, ‘Well, can you really make this stuff more efficient?’”
It’s unclear at this point whether the slowdown is a one-time occurrence or a broader technical plateau. Either way, the small-model trend illustrates how AI has moved from science-fiction-like demonstrations to the less thrilling reality of commercializing the technology.
Still, businesses aren’t giving up on large models. Apple announced that ChatGPT will be integrated into Siri to help with more complex tasks, such as writing emails, and Microsoft said the latest OpenAI model will be built into its newest version of Windows.
Even so, the OpenAI integrations were only a small part of each company’s overall AI pitch. Apple devoted just two minutes to it in a presentation lasting nearly two hours.