For overwhelmed IT teams, AIOps promises to automatically avoid potential business disruptions, but some IT executives are skeptical that it can actually deliver results.
Rodrigo de la Parra, AIOps Domain Leader at IBM Automation, recently addressed that skepticism at a virtual roundtable discussion in Canada. “It’s more than a buzzword,” said de la Parra. “AIOps is leading IT to a more agile, software-driven approach.”
AIOps is the application of artificial intelligence to improve IT operations, explained de la Parra. It detects problems by using machine learning to analyze large amounts of data generated by tools in a company’s infrastructure. Automation and natural language processing can be used for troubleshooting in real time.
“It’s not a product or a unique solution,” said de la Parra. “It’s a journey.” To add value, AIOps must be aligned with business needs so that it improves efficiency and customer service.
De la Parra made a distinction between “domain-specific” and “domain-agnostic” tools. Domain-specific tools have great value within their specific silo, he said. But the real value, according to de la Parra, is in adding a domain-agnostic approach, because it can draw on sources from all of the siloed tools and create a single data source. This becomes the single source of truth for analysis and for showing those involved the evidence behind the causes of a problem, said de la Parra.
How to set up AIOps for success
A successful implementation begins with an operational assessment to identify current issues related to the company’s business needs. From there, Key Performance Indicators (KPIs) should be created to measure progress. Benchmarking where you are today, looking for real problems and developing measurable KPIs are central to finding and proving the value of AIOps.
For example, de la Parra suggested that organizations could review their efficiency by tracking the volume of major incidents relative to their applications, or the average time to detect, acknowledge and resolve incidents. The value could be measured by examining how much manual labor is eliminated or by how much the number of problems reported by users is reduced.
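As an illustration of the KPIs mentioned above, the following minimal Python sketch computes the average time to detect, acknowledge and resolve incidents from a list of incident records. The `Incident` fields and the choice of intervals (detection measured from occurrence, acknowledgement and resolution measured from detection) are assumptions made for this example; they do not come from IBM or from any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    # Hypothetical record layout; real tools expose their own fields.
    occurred: datetime
    detected: datetime
    acknowledged: datetime
    resolved: datetime


def mean_minutes(deltas):
    """Average a sequence of timedeltas and return the result in minutes."""
    return mean(d.total_seconds() for d in deltas) / 60


def incident_kpis(incidents):
    """Return mean time to detect, acknowledge and resolve, in minutes."""
    if not incidents:
        raise ValueError("at least one incident is required")
    return {
        # Assumption: detection is measured from when the fault occurred.
        "mean_time_to_detect": mean_minutes(
            [i.detected - i.occurred for i in incidents]
        ),
        # Assumption: acknowledgement and resolution are measured from detection.
        "mean_time_to_acknowledge": mean_minutes(
            [i.acknowledged - i.detected for i in incidents]
        ),
        "mean_time_to_resolve": mean_minutes(
            [i.resolved - i.detected for i in incidents]
        ),
    }
```

Computing these figures once before a pilot and again afterward gives a simple before-and-after comparison of the kind a benchmark needs.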
One participant asked how long it would take to set up the platform. In many cases this can be completed in a few weeks, according to de la Parra, who recommended starting with a manageable pilot in order to get significant results quickly. Once baseline data has been entered into the model, it begins to detect deviations in real time. Additionally, de la Parra noted that the IBM Watson AIOps solution comes with pre-built algorithms that build models to accelerate implementation and return on investment (ROI). “This approach eliminates the need for data scientists to normalize data, build a data lake, build models and integrate interfaces to work with the solution, like ChatOps,” he said.
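The idea of learning a baseline and then flagging deviations can be sketched in a few lines of Python. This example uses a plain z-score test, which is a stand-in chosen for simplicity and not the algorithm used by IBM Watson AIOps; the function names and the threshold of three standard deviations are assumptions made for this illustration.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize historical metric values as (mean, standard deviation)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to build a baseline")
    return mean(samples), stdev(samples)


def is_deviation(value, baseline, threshold=3.0):
    """Return True if value lies more than `threshold` std devs from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat baseline: any change at all counts as a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Example: response times in milliseconds collected during a pilot.
baseline = build_baseline([100, 102, 98, 101, 99])
for reading in (101, 120):
    print(reading, "deviation" if is_deviation(reading, baseline) else "normal")
```

A production system would also have to account for trends and seasonality, which a fixed mean and standard deviation do not capture.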
Driving business benefits
During the discussion, it became clear that many participants were skeptical as to whether AIOps could achieve a measurable return on investment. In addition, questions were asked about the reliability of the data and about whether domain-specific tools, such as security monitoring, are enough.
The main advantage of domain-agnostic AIOps over domain-specific tools is that it provides complete transparency, de la Parra said. That makes it a reliable AI, he said. Decisions are based on insights gained from analyzing various data sources, grouping entities and locating problems, which are shown in topology views to provide context, the likely cause and the next best action to resolve incidents. This is all done within the limits of guidelines and compliance requirements.
“It’s understandable to be skeptical about the effectiveness of AIOps, given a widespread misconception about biased AI in general and efforts to implement robust AI models,” said de la Parra. “However, when we talk about AIOps at IBM, we mean a specific set of capabilities that provide concrete models to aid in log anomaly detection, blast radius, seasonal event grouping, next best action, and more.”
Another concern raised by the group was the problem of false positives in possible incidents. De la Parra noted that AIOps can analyze whether a problem is affecting business systems, and if there is no impact, no alerts are sent. “Reducing noise is important so that employees can dedicate their time to higher-value tasks,” said de la Parra. A 2021 study by Forrester looked at the overall economic impact of IBM Watson AIOps. It showed a 50 percent reduction in mean time to resolve (MTTR) and an 80 percent saving in the time spent resolving false positives, resulting in savings of $623,000, along with other benefits such as proactive incident prevention.
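The noise-reduction rule described above, in which an alert is sent only if the problem affects business systems, can be illustrated with a short Python sketch. The `Alert` type and the `business_impact` mapping from resources to affected business services are hypothetical; a real AIOps platform would derive impact from its topology data and not from a hand-written dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    resource: str  # hypothetical identifier of the component raising the alert
    message: str


def alerts_to_send(alerts, business_impact):
    """Keep only alerts whose resource maps to at least one business service.

    `business_impact` maps a resource name to the set of business services
    that depend on it. Resources that are missing, or mapped to an empty
    set, are treated as having no business impact and are suppressed.
    """
    return [a for a in alerts if business_impact.get(a.resource)]


# Example: only the database alert reaches the operations team.
impact = {"db-01": {"online-orders"}, "test-vm-07": set()}
raw = [
    Alert("db-01", "replication lag high"),
    Alert("test-vm-07", "disk 90% full"),
    Alert("old-printer", "toner low"),
]
for alert in alerts_to_send(raw, impact):
    print(alert.resource, "-", alert.message)
```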
According to de la Parra, AIOps leads to better overall management of IT services. Not only does it reduce response time and downtime, but it can also be used to analyze the appropriate resource allocation for cloud workloads.
Organizations already have the data, said de la Parra. AIOps enables the IT team to be more proactive and become a trusted partner that drives the business.