Increasing the Success Rate of Data Science Projects

Most data science initiatives fail to deliver value due to hurdles like poor data quality, but Bryce Macher, CNN’s senior product analytics leader, has a plan for success.

The vast majority of data science projects are doomed, but they don’t have to be, says Bryce Macher, CNN’s senior director of product analysis.

In 2016, Gartner estimated that 60% of big data projects fail to make it into production and generate value. A year later, Gartner analyst Nick Heudecker said 60% was far too conservative an estimate and that the real number is closer to 85%.

Since then, despite advances in augmented intelligence and machine learning, Gartner hasn’t revised that estimate. Meanwhile, a 2018 IDG study showed that even with the application of artificial intelligence capabilities, only about one in three data science projects succeeds.

The many obstacles that keep companies from succeeding with data science projects include siloed data, a lack of data science skills within the organization, poor data quality, starting a project with no clear goal, and the lack of a data culture within the organization. Those obstacles can be overcome, however, Macher said. On August 17th, the opening day of Ai4, a virtual conference on artificial intelligence, he presented a plan to bring down the overwhelming failure rate of data science projects and give them a chance to succeed.

“The focus, whether we talk about individual tactics or overall strategies, is really on data products,” Macher said. “The data product approach is about developing in-house products that provide data to decision makers, whether they are business decision makers or customer decision makers. It’s the best way to prevent some of the failures of data science projects.”

He added that the four strategies companies should employ to develop data products, meaning the applications and tools that drive business processes and decisions, are:

  • understanding that data science starts at data strategy;
  • building a data science infrastructure for applications and not experiments;
  • focusing on early growth applications of data science; and
  • hiring data scientists who can actually do the data science.

Data science starts at strategy

One strategic move companies can take to improve the chances of data science projects delivering value is to hire data scientists well in advance of the project. Organizations, however, often make the mistake of gathering their data, building up their data infrastructure and developing data operations strategies before bringing in data scientists, according to Macher.

“Because data science is a little more intense than analytics, for example, organizations end up with data strategies that don’t account for the needs of data science,” he said.

Data acquisition is vital for a data scientist, and having a data scientist on board when data is first collected and selected makes it easier to work with that data when the time comes for data science. “Placing a data scientist at the key moment of data strategy is incredibly crucial,” Macher said. Data quality is also critical, he continued. Including a data scientist on the data team to ensure data quality enables companies to avoid repeating data science projects that were built on incorrect data, to document their data, and to easily find the data they need for different purposes.

“Data science has to be part of an organization’s data DNA,” Macher said. “Rather than hiring it last, it should be part of the data governance process. Putting a data scientist as a key stakeholder is going to supercharge the ability to build data science applications based on good data. It’s also going to give teams a culture of thinking about data strategy.”

Building a data science infrastructure

According to Macher, a data science infrastructure has to be part of the company’s entire digital ecosystem rather than localized. A data science project cannot begin with models that are developed and trained on an individual’s laptop: “You will almost always fail,” Macher said. When someone starts a project on their own laptop, they think of a model on a computer rather than a model that is part of a complete data infrastructure, he explained. Data science projects must therefore start in the same cloud that drives the applications.

Additionally, when starting a data science project, organizations need to think about how the resulting application will be deployed, both at the micro level, with a single deployment point, and at the macro level, with applications that can be scaled across multiple deployment points.

“Building infrastructure for both easy, lightweight deployments and big application deployments puts deployment and productization at the core of every data science project,” Macher said.

Finally, when building a data science infrastructure, ensuring a culture of experimentation is important, he said.

While it’s important to reduce the failure rate of data science projects, it is still acceptable to have ideas that don’t work. Spending a lot of time on projects that are doomed to fail because of poor data quality is one thing. But when projects fail early because an idea wasn’t quite right, that fosters a culture of experimentation.

“That feeling of fail-fast is very crucial to success,” Macher said. “In building a good data science culture that focuses on building infrastructure for applications, it is very important to make sure that your infrastructure supports the full spectrum of potential failures, be it on the model or implementation side.”

Focusing on early growth

According to Macher, growth and knowledge must have priority, and this manifests itself in two ways. First, business growth must be a priority for data science teams, with projects aimed at reducing risks and optimizing critical opportunities. Second, when developing a data science strategy and implementing data science projects for the first time, companies have to start small and then grow. The first projects shouldn’t be massive, broad-scale efforts. While the ultimate value gained from early projects may be minimal, small projects that succeed eventually lead to larger projects with the potential for bigger results.

“That’s going to set your data science team’s culture in the right direction,” Macher said. “The first project that lands will guide your data science team. Making sure there is an initial, growth-focused victory not only will steer the team in the right direction, it will also propel what we call the flywheel.”

Data scientists will create models, and DevOps teams will make them available to product managers. They will get used to driving growth, resulting in an influx of more data that feeds the data scientists, who can then build more and better models.

“And then that flywheel keeps spinning,” Macher said.

Meanwhile, one mistake that companies often make occurs at the hiring level: they hire someone with a deep understanding of data science as an executive and then fill the positions below that top level.

Instead, they should start with what Macher calls mid-senior-level people who have room to grow. These people are senior enough to have seen projects through and to know what infrastructure they need, but not so senior that they are no longer available. “Organizations can then promote, grow and scale from there,” said Macher.

Hiring a data scientist

Finally, if companies want to benefit from data science projects, they have to hire data scientists, according to Macher. Data engineers and data analysts are not the same as data scientists, and they’re trained to think differently. Data engineers are often experts in machine learning, but they lack the advanced math and statistical skills needed to solve business problems, Macher said. Data analysts, by contrast, have statistical skills but no technical skills. Data scientists have both.

“We have to hire data scientists,” said Macher. “They are not data engineers or data analysts.”

Aside from the different way data scientists approach projects, they are needed to mentor other data scientists and to provide the expertise that gets put into action.

“Providing opportunity requires providing expertise,” Macher said. “Even if we want to hire these promising engineers or incredibly talented analysts, if we don’t have the data scientist with the necessary data science expertise on the team, then the engineers and analysts will have excellent engineering practices and analytical thinking, but not many data science solutions will emerge.”
