Experts in artificial intelligence (AI) and machine learning are warning about the possibility of data-poisoning attacks against the massive datasets that are frequently used to train the deep-learning models behind many AI applications.
Data poisoning refers to attackers tampering with the training data needed to build deep-learning models, which makes it possible to influence a model’s behavior in ways that are difficult to trace.
Data-poisoning attacks can be very powerful: if the source data used to train machine-learning algorithms is secretly altered, the AI learns from inaccurate data and may reach ‘wrong’ conclusions that have substantial repercussions.
There is as yet no evidence that web-scale datasets have been contaminated in real-world attacks. However, a team of AI and machine-learning researchers from Google, ETH Zurich, NVIDIA, and Robust Intelligence now claims to have demonstrated poisoning attacks that “ensure” malicious samples will appear in the web-scale datasets used to train the largest machine-learning models.
While large deep-learning models can withstand random noise, the researchers caution that even minute amounts of adversarial noise in a training set (such as that introduced by a poisoning attack) are enough to induce targeted errors in model behavior.
The researchers claim that, with little time and expense, they could have poisoned 0.01% of several well-known deep-learning datasets by exploiting the way those datasets are assembled. Although 0.01% may not sound like much, they warn that it is “sufficient to poison a model”.
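To put that fraction in perspective, here is a back-of-the-envelope calculation. The dataset size below is an assumption for illustration (roughly the scale of a LAION-400M-sized image-text collection), not a figure from the study:

```python
# Hypothetical illustration: 0.01% of a web-scale dataset is still a lot of samples.
dataset_size = 400_000_000       # assumed size, roughly LAION-400M scale
poison_fraction = 0.0001         # 0.01%

poisoned_samples = int(dataset_size * poison_fraction)
print(poisoned_samples)          # 40000
```

Tens of thousands of adversarially crafted samples is more than enough to teach a model a targeted behavior.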
The first of these techniques is known as split-view poisoning. If an attacker takes over a website that a dataset indexes, they can taint the data that is later collected from it, rendering it erroneous and potentially corrupting any model trained on it.
Attackers can do this simply by purchasing expired domain names. Domains expire regularly and can change hands, which presents a perfect opportunity for data poisoners.
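To illustrate why expired domains matter, the sketch below (a hypothetical audit script, not the researchers’ tooling) scans a dataset’s URL index and flags domains that no longer resolve; those are exactly the domains anyone could re-register and use to serve arbitrary content to future downloaders:

```python
import socket
from urllib.parse import urlparse

def unresolvable_domains(urls):
    """Flag domains in a dataset's URL index that no longer resolve.

    Failing DNS resolution is only a heuristic, but a domain that does not
    resolve may have expired and could be re-registered by an attacker.
    """
    flagged = set()
    for url in urls:
        domain = urlparse(url).netloc
        if not domain or domain in flagged:
            continue
        try:
            socket.gethostbyname(domain)
        except socket.gaierror:
            flagged.add(domain)
    return flagged

# Example usage with a hypothetical URL index:
urls = [
    "http://example.com/images/cat.jpg",
    "http://no-longer-registered.example/dog.png",
]
print(unresolvable_domains(urls))
```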
According to the researchers, the attacker does not need to know exactly when clients will download the material: by controlling the domain, the attacker ensures that any future download collects tainted data.
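The underlying reason is that many web-scale datasets are distributed only as an index of URLs, and each consumer re-downloads the content themselves; whatever the domain serves at download time is what ends up in the training set. A minimal, hypothetical sketch of a hash-based integrity check, assuming the index also recorded a content digest at curation time, would look like this:

```python
import hashlib
import urllib.request

def verify_download(url, expected_sha256):
    """Re-download a dataset entry and compare it against the digest
    recorded when the dataset was curated.

    If an expired domain has since been re-registered and now serves
    different content, the hashes will not match.
    """
    with urllib.request.urlopen(url, timeout=10) as resp:
        content = resp.read()
    actual = hashlib.sha256(content).hexdigest()
    return actual == expected_sha256
```

A consumer could simply drop any entry that fails such a check rather than train on it.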
The researchers note that buying an expired domain and using it to propagate malware is nothing new; cybercriminals have long done so. Attackers with other objectives, however, could use the same trick to contaminate an extensive dataset.
The researchers also describe a second kind of attack, which they call front-running poisoning.
In this case, the attacker does not have persistent control over the targeted content, but can accurately predict when a web resource will be accessed for inclusion in a dataset snapshot. With this knowledge, the attacker can contaminate the content just before collection begins.
Even if the content reverts to its original, unmodified state within only a few minutes, the snapshot taken while the attack was active will still contain the poisoned data.
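The timing logic is easy to illustrate. The toy simulation below (a hypothetical model of the attack, not the researchers’ code) shows that a snapshot taken inside even a brief malicious window retains the poisoned content, no matter how quickly the live page is reverted:

```python
# Toy timeline: the content of a page as a function of time (in minutes).
def page_content(t, attack_start, attack_end):
    """Return the live content at time t; poisoned only inside the window."""
    if attack_start <= t < attack_end:
        return "poisoned content"
    return "original content"

attack_start, attack_end = 100, 103   # a 3-minute malicious edit, then reverted
snapshot_time = 101                   # the crawler happens to snapshot inside it

snapshot = page_content(snapshot_time, attack_start, attack_end)
live_now = page_content(200, attack_start, attack_end)

print(snapshot)   # 'poisoned content' -- frozen into the dataset
print(live_now)   # 'original content' -- the live page was long since fixed
```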
Wikipedia is one site frequently used as a source of training data for machine-learning systems. Because anyone can edit Wikipedia, however, an attacker “can taint a training set sourced from Wikipedia by adding malicious edits,” according to the researchers.
Because Wikipedia datasets rely not on live pages but on snapshots taken at particular times, attackers who time their edits well can alter a page so that false data is collected and permanently preserved in the dataset.
“If an attacker knows when a Wikipedia page will be scraped for the next snapshot, they can perform poisoning just before scraping. The snapshot will always have the harmful material, even if the edit is soon undone on the live page,” the researchers noted.
Because Wikipedia produces snapshots according to a well-documented process, the snapshot times of individual articles can be forecast with a high degree of accuracy. The researchers report that this approach can poison Wikipedia pages with a 6.5% success rate.
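Because the dump walks articles in a largely deterministic order, a rough prediction of when a given page will be reached only needs its position in that order and the crawler’s observed throughput. The sketch below is a hypothetical simplification of such an estimate; the prediction method used in the study is more involved, and the rate and article index shown are illustrative assumptions:

```python
from datetime import datetime, timedelta

def estimated_scrape_time(dump_start, article_index, pages_per_second):
    """Rough estimate of when article number `article_index` will be reached,
    assuming pages are processed in a fixed order at a roughly constant rate
    (both assumptions, for illustration only)."""
    seconds_until = article_index / pages_per_second
    return dump_start + timedelta(seconds=seconds_until)

# Example: a dump starting at midnight that processes ~300 pages per second
dump_start = datetime(2023, 2, 1, 0, 0, 0)
print(estimated_scrape_time(dump_start, article_index=5_000_000,
                            pages_per_second=300))
```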
Although a 6.5% success rate may not seem significant, the sheer volume of Wikipedia pages and the way they are used as training data make it feasible to feed false information into machine-learning models.
The researchers point out that they did not alter any live Wikipedia pages; instead, they informed Wikipedia of the attacks and potential countermeasures as part of a responsible disclosure process.
They add that the report was published to encourage other security experts to conduct their own research into protecting AI and machine-learning systems from hostile attacks.
“Our study is merely a beginning point for the community to build a better understanding of the dangers associated with generating models from web-scale data,” the researchers stated.