Companies face issues with training data quality and labeling when launching AI and machine learning initiatives, according to a Dimensional Research report.
Worldwide spending on artificial intelligence (AI) systems is predicted to reach $35.8 billion in 2019, according to IDC. This increased spending is no surprise: With digital transformation initiatives critical for business survival, companies are making large investments in advanced technologies.
However, nearly eight out of 10 organizations engaged in AI and machine learning said their projects have stalled, according to a Dimensional Research report. Nearly all (96%) of these organizations said they have run into problems with data quality, the data labeling necessary to train AI, and building model confidence.
The report, conducted by Dimensional Research on behalf of Alegion, surveyed 227 tech professionals involved in active AI and machine learning projects. AI and machine learning systems struggle to keep up with the large amounts of data they must process, the report found.
“The single largest obstacle to implementing machine learning models into production is the volume and quality of the training data,” Nathaniel Gates, CEO and co-founder of Alegion, said in a press release. “This research reinforces our own experience, that data science teams new to building ROI-driven systems try to tackle training data preparation in house, and get overwhelmed.”
Systems can have trouble processing large amounts of data, yet to get AI systems off the ground, they paradoxically need a lot of data, the report said. Data science teams must walk a tightrope to deliver successful projects: they need large volumes of training data, while also ensuring their systems can actually handle that volume.
To combat these challenges, some 76% of respondents said they sometimes try to label and annotate training data on their own. More than half (63%) said they even try building their own labeling and annotation automation technology. Ultimately, 71% of teams said they outsource training data and other machine learning project activities.
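To illustrate what in-house labeling and annotation work can involve, here is a minimal, hypothetical sketch of a data-quality check a team might build. The record format and label set are illustrative assumptions, not details from the report.

```python
# Hypothetical sketch: validating hand-labeled training examples before
# they are used to train a model. Label set and fields are assumptions.

ALLOWED_LABELS = {"positive", "negative", "neutral"}

def validate_examples(examples):
    """Split examples into usable and rejected sets.

    An example is usable only if it has non-empty text and a label
    drawn from the allowed label set.
    """
    valid, rejected = [], []
    for ex in examples:
        text = ex.get("text", "").strip()
        label = ex.get("label")
        if text and label in ALLOWED_LABELS:
            valid.append(ex)
        else:
            rejected.append(ex)
    return valid, rejected

examples = [
    {"text": "Great product", "label": "positive"},
    {"text": "", "label": "negative"},         # empty text -> rejected
    {"text": "It works", "label": "unknown"},  # unknown label -> rejected
]
valid, rejected = validate_examples(examples)
print(len(valid), len(rejected))  # 1 valid, 2 rejected
```

Even a simple gate like this hints at why teams get overwhelmed: every rejected record has to be re-labeled or discarded, and the work scales with data volume.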