For most people, the idea of using artificial intelligence tools in daily life, or even just tinkering with them, has only recently gone mainstream, thanks to new releases of generative AI tools from a slew of major tech companies and startups, including OpenAI’s ChatGPT and Google’s Bard. But behind the scenes, the technology has been advancing quietly for years, along with questions about how best to evaluate and secure these new AI systems. On Monday, Microsoft will share details about the team within the company that since 2018 has been tasked with figuring out how to attack AI platforms in order to expose their weaknesses.
In the five years since it was founded, Microsoft’s AI red team has grown from what was essentially an experiment into a full interdisciplinary team of machine learning experts, cybersecurity researchers, and even social engineers. The group works to communicate its findings within Microsoft and across the tech industry using the traditional language of digital security, so the ideas are accessible without specialized AI knowledge that many people and organizations don’t yet have. But in practice, the team has concluded that AI security differs conceptually from traditional digital defense in significant ways, and those differences require changes in how the AI red team approaches its work.
What are you essentially going to do that’s different? Why do we need an AI red team? Those were the questions the group was asked at the start, says Ram Shankar Siva Kumar, the founder of Microsoft’s AI red team. But if you approach AI red teaming as just traditional red teaming, taking only the security perspective, that may not be sufficient, he says. The team now has to account for the responsible AI side as well, which includes accountability for AI system failures such as generating offensive content or generating ungrounded content. That, he says, is the pinnacle of AI red teaming: looking not only at security failures, but at responsible AI failures as well.
It took some time, according to Shankar Siva Kumar, to draw this distinction and make the case that the AI red team’s mission would have this dual focus. Early work largely centered on releasing more conventional security tools, such as the 2020 Adversarial Machine Learning Threat Matrix, a collaboration between Microsoft, the nonprofit R&D organization MITRE, and other researchers. That year, the group also released Microsoft Counterfit, a set of open source automation tools for testing the security of AI systems. And in 2021, the red team published an additional framework for assessing AI security risks.
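To give a sense of what automated AI security testing of this kind looks like in practice, here is a minimal sketch of an evasion test run against a small image classifier. It uses the open source Adversarial Robustness Toolbox and scikit-learn purely as stand-ins; it is not Counterfit or Microsoft’s own tooling, and the model, dataset, and attack parameters are arbitrary choices made for the example.

```python
# A toy evasion test: perturb inputs slightly and see whether the target
# model's predictions collapse. Stand-in libraries, not Microsoft's tooling.
# Requires: pip install adversarial-robustness-toolbox scikit-learn
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import FastGradientMethod

# Train a small digit classifier to serve as the "target" AI system.
X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixel values to [0, 1]
model = LogisticRegression(max_iter=1000).fit(X, y)

# Wrap the model so the attack library can query it.
target = SklearnClassifier(model=model, clip_values=(0.0, 1.0))

# Generate adversarial examples: inputs nudged just enough to flip predictions.
attack = FastGradientMethod(estimator=target, eps=0.2)
X_adv = attack.generate(x=X)

clean_acc = (model.predict(X) == y).mean()
adv_acc = (model.predict(X_adv) == y).mean()
print(f"accuracy on clean inputs:       {clean_acc:.2%}")
print(f"accuracy on adversarial inputs: {adv_acc:.2%}")
```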
But the AI red team has been able to expand over time as addressing machine learning flaws and failures has become increasingly urgent.
In one of its early operations, the red team assessed a Microsoft cloud deployment service that included a machine learning component. By exploiting a flaw that allowed them to craft malicious requests to abuse the machine learning components and strategically create virtual machines, the emulated computer systems used in the cloud, the team devised a way to launch a denial of service attack against other users of the cloud service. By carefully placing virtual machines in key positions, the red team could launch “noisy neighbor” attacks on other cloud customers, in which one customer’s activity degrades another customer’s performance.
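The “noisy neighbor” effect itself is easy to demonstrate in miniature. The toy sketch below simulates contention on a single machine using Python threads rather than real co-located virtual machines: a victim’s latency degrades once hostile neighbors saturate a shared resource. It is only an illustration of the concept, not the cloud service or exploit described above.

```python
# Toy "noisy neighbor" demo: measure how a victim's latency changes when
# co-located workloads (here, threads) saturate a shared resource (the CPU).
import time
import threading
import statistics

def victim_latencies(n: int) -> list[float]:
    """Time a small, fixed unit of work, n times."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        sum(i * i for i in range(50_000))  # the victim's workload
        samples.append(time.perf_counter() - start)
    return samples

def noisy_neighbor(stop: threading.Event) -> None:
    """Burn CPU, standing in for a hostile co-located tenant."""
    while not stop.is_set():
        sum(i * i for i in range(50_000))

# Baseline: the victim runs alone.
alone = victim_latencies(200)

# Contended: the victim runs while "neighbors" hog the shared CPU.
stop = threading.Event()
neighbors = [threading.Thread(target=noisy_neighbor, args=(stop,)) for _ in range(8)]
for t in neighbors:
    t.start()
contended = victim_latencies(200)
stop.set()
for t in neighbors:
    t.join()

print(f"median latency alone:     {statistics.median(alone) * 1e3:.2f} ms")
print(f"median latency contended: {statistics.median(contended) * 1e3:.2f} ms")
```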
Rather than risk affecting real Microsoft customers, the red team ultimately built and attacked an offline version of the system to prove that the vulnerabilities existed. Shankar Siva Kumar says these early findings dispelled any doubts or skepticism about the value of an AI red team. For many people, he says, it was an eye-opener: if attackers can do this, that’s bad for the business.
Importantly, because AI systems are so dynamic and varied, Microsoft sees attackers of all skill levels targeting AI platforms. Some of the novel attacks on large language models, Shankar Siva Kumar says, require only a teenager or a casual user with a browser, and that shouldn’t be discounted. There are advanced persistent threats, he says, but there is also a new class of people who can take down LLMs and emulate them.
Unlike a conventional red team, though, Microsoft’s AI red team does more than study threats already being exploited in the wild. The group focuses on anticipating where attack trends may go next, according to Shankar Siva Kumar, and that often means emphasizing the red team’s newer responsible AI component. When the group finds a traditional vulnerability in an application or software system, it often collaborates with other teams inside Microsoft to get it fixed rather than taking the time to fully develop and propose a fix on its own.
Microsoft has other red teams, as well as Windows infrastructure experts and whatever other specialists are needed, Shankar Siva Kumar says. The insight for him is that AI red teaming now encompasses not just security failures, but responsible AI failures as well.