It’s clear that 2020 has been a year of massive growth for applied NLP, but which practices are actually driving this uptick in use and budgets? While there are many contributing factors, three key trends are shaping the NLP industry and the open-source ecosystem today.
1. Models Need Better Zookeepers
The number of publicly available NLP models has exploded over the last few years – think TensorFlow Hub, PyTorch Hub, Hugging Face, and the list goes on. While putting models at the fingertips of eager users is great, the more saturated the field becomes, the harder it is to find the one you should actually use for your next project. Take Hugging Face, for example. Anyone from the community can upload models for free, so you now have more than 3,000 models to choose from but no way to tell which best meets your criteria.
At the end of the day, many users want someone to curate the most accurate model – one that is actually supported – for their project. This is one of the advantages of using an open-source library such as Spark NLP, which provides both accuracy and support. Licensed users get the library, models and support, which helps them find exactly what they need. That said, even TensorFlow Hub, which allows anyone to upload models, now helps users sort through handpicked models. New model hubs are adding better search, discovery and curation, which will continue to help with both adoption and ease of use.
2. Multilingual Models
According to the aforementioned NLP survey, technical leaders cited language support as one of their biggest challenges with the technology. The number of languages supported varies across NLP libraries. For example, Stanford CoreNLP lists six, and Spark NLP ships with models in 46 languages. It has recently become much easier, faster and more economical to support dozens of languages. Thanks to recent advances like language-agnostic sentence embeddings, zero-shot learning and the public availability of multilingual embeddings, open-source libraries that support dozens of languages out of the box are becoming the norm for the first time.
Historically, the highest-quality NLP software was available only for English or Mandarin Chinese. It’s exciting and encouraging to see companies like Google and Facebook publishing pretrained embeddings for more than 150 languages – something that was unheard of just a few years ago. Now, we can expect state-of-the-art open-source models to become available in all these languages. This is a huge step for inclusion and diversity, putting NLP in the hands of users all over the globe.
3. State-Of-The-Art Models Are One-Liners
Using deep learning models once required formal education in the field and hands-on access to the core NLP libraries. Take sentiment, for example: Inferring that “a beautiful day” is a positive statement required a model that a data scientist had to train. Those days are past. Now, running many of the most accurate and complex deep learning models in history has been reduced to a single line of Python code.
This lowers the bar of entry significantly for those just getting started, and that’s exactly the point. By reducing the requirements to one line of code, people who know nothing about NLP can get started. This isn’t just helpful for NLP novices, however. Even for a data scientist who knows how to train models, this ease of use enables a level of automation that gives them time for more complex projects. It’s a win for everyone.
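To make the "single line" idea concrete, here is a minimal sketch of the sentiment example. It assumes the Hugging Face `transformers` package, which the article mentions only as a model hub and does not name as the tool behind the one-liner; other libraries expose comparable calls. The `sentiment` helper and its `classifier` parameter are hypothetical names introduced for illustration, so that a different model can be swapped in.

```python
from typing import Callable, Optional


def sentiment(text: str, classifier: Optional[Callable] = None) -> str:
    """Return the predicted sentiment label for `text`.

    `classifier` is a hypothetical hook for this sketch: any callable that
    returns a list of {"label": ..., "score": ...} dicts, which is the shape
    produced by a transformers text-classification pipeline.
    """
    if classifier is None:
        # Assumption: `transformers` is installed. The first call downloads
        # a default pretrained sentiment model, so it needs network access.
        from transformers import pipeline

        classifier = pipeline("sentiment-analysis")
    # Pipelines return one result dict per input text; take the first.
    return classifier(text)[0]["label"]
```

With the package installed, the whole task comes down to one call, `sentiment("a beautiful day")`, with no training step on the user's side. The model that call loads is whatever default the library ships, so it is a starting point rather than a curated choice.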
There have been few times since the inception of NLP that this technology has proven to be so valuable. The present is one of those times. Companies have leveraged NLP for everything: analyzing resumes, making investment decisions, providing customer service, diagnosing and triaging patients, improving sales engagement, summarizing legal documents, and developing new medications. These are all use cases presented at the recent NLP Summit, which my company participated in as a sponsor. Between the growing set of applications and the democratization of the technology, it will be exciting to see what’s ahead for NLP as it becomes more accessible. But one thing is for certain: NLP is poised for even greater growth in 2021.
This article has been published from the source link without modifications to the text. Only the headline has been changed.