Deep learning has yielded some fantastic results for basic natural language processing (NLP) functions such as named entity recognition (NER), document classification and sentiment analysis, not to mention the ability to generate everything from believable short stories to HTML code from minimal text prompts. Deep learning can also have a dramatic impact on F1 scores, which combine precision and recall into a single performance measure (their harmonic mean), and so vendors have started to throw much of their weight and resources behind what they see as a game-changing technology.
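For readers less familiar with the metric, F1 is the harmonic mean of precision and recall. Here is a minimal sketch in Python; the function name and the counts are made up purely for illustration and do not come from any real system:

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    predicted = true_positives + false_positives   # everything the model flagged
    actual = true_positives + false_negatives      # everything it should have flagged
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical NER run: 80 entities found correctly, 20 spurious, 40 missed.
example_f1 = f1_score(true_positives=80, false_positives=20, false_negatives=40)
print(round(example_f1, 3))  # 0.727
```

Because it is a harmonic mean, F1 stays low unless precision and recall are both high, which is why a jump in F1 is treated as a meaningful improvement.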
But as the CEO of a company that’s been doing NLP for well over 15 years, I don’t believe that deep learning is always the answer — especially from an economic standpoint. I’ve watched as many new players have stepped up to the plate with NLP solutions underpinned by deep learning. But what I’m not seeing is evidence of big commercial wins, and I suspect that the cost of using deep learning-backed NLP is wiping out significant dollar gains. Deep learning tools like BERT can deliver results, but sometimes at a much greater cost than taking a traditional machine learning approach, depending on the size of your project.
Deep Learning: The Big Guns’ Big Gun
Deep learning is a powerful tool; there’s no denying it. It’s great for detecting patterns and identifying non-linear relationships. It’s the technology that underpins tools we use every day, including Google’s and Apple’s voice and image recognition algorithms, Baidu’s predictive advertising platform, which precisely targets and serves up ads, and the recommendation engines that surface relevant content on Amazon, Netflix, Spotify and Google News. It’s also hard at work at PayPal, which uses the H2O predictive analytics platform to identify and prevent fraudulent purchases and payments.
Less publicly, but no less significantly, its use also extends to areas as diverse as medical imaging analysis, futures trading, autonomous vehicle development, intelligence gathering, satellite data analysis, drug discovery and actuarial analysis. If a problem involves predictive analytics and there’s enough data available, deep learning is a viable solution. But viable doesn’t necessarily mean the best or the most cost-effective, especially if you’re working on a relatively simple, small-scale project.
Deep Learning: Big On Data — But Also Big On Price
Deep learning has street cred and name recognition, but it’s enormously and increasingly computationally expensive. That’s a feature but also a bug. Estimating the costs of training different BERT models on Wikipedia and Google Book corpora, Israeli company AI21 found that an 11-billion-parameter variant of one model may cost $1.3 million for a single run. That’s because companies investing in deep learning aren’t just paying for model training over a huge corpus, but also for factors like GCP or AWS storage, along with hardware and personnel needs. The bigger the project, the bigger those costs.
And now that BERT, GPT-3 and other deep learning models are part of many NLP companies’ offerings, those ever-rising base costs have to be covered, and they get passed on to the customer.
Deep Learning-Based NLP: Viable In This Economy?
For many relatively simple NLP tasks, deep learning is neither the most efficient nor the most effective solution. It’s like using a tractor to mow an apartment lawn. Simple saved searches and more basic model types like maximum entropy (MaxEnt) classifiers and conditional random fields (CRFs) are better suited to the class of problem we usually see. They also offer explainability at a fraction of the cost of a deep learning solution. But now that deep learning is so heavily embedded in ML companies’ product lines, customers have little choice in the matter.
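To make the contrast concrete, a saved search can be as simple as a stored pattern applied to incoming text. The sketch below is a hypothetical illustration, not a description of any vendor’s product: the search names, patterns and sample sentence are all invented.

```python
import re
from typing import Dict, List

# Hypothetical saved searches: a readable name mapped to a compiled pattern.
SAVED_SEARCHES = {
    "refund_request": re.compile(r"\b(refund|money back|chargeback)\b", re.IGNORECASE),
    "shipping_delay": re.compile(r"\b(late|delayed|still waiting)\b", re.IGNORECASE),
}


def tag_document(text: str) -> Dict[str, List[str]]:
    """Return each matching saved search with the exact phrases that triggered it."""
    hits: Dict[str, List[str]] = {}
    for name, pattern in SAVED_SEARCHES.items():
        matches = [m.group(0) for m in pattern.finditer(text)]
        if matches:
            hits[name] = matches
    return hits


sample = "My order is still waiting at the depot and I want a refund."
print(tag_document(sample))
```

Every tag traces back to a named rule and the phrase that fired it, which is where the explainability comes from, and nothing here needs a GPU or a training run.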
With businesses having spent the past year tightening their belts, it remains to be seen whether companies looking for new NLP solutions can justify the costs associated with deep learning. I suspect that, given the business problems NLP is typically used to solve, these prices may be too high, and vendors may need to find a way to adapt.