Big Tech has improved significantly over the past ten years in areas including language, prediction, personalization, archiving, text parsing, and data processing. Yet it still does a shockingly poor job of identifying, labeling, and removing harmful content. To grasp the real-world harm this causes, one only has to recall the recent surge of conspiracy theories in the US about vaccines and elections.
That discrepancy raises some questions. Why hasn't content moderation at tech companies improved? Can they be made to improve it? And will new advances in AI strengthen our ability to catch misinformation?
When tech companies are called before Congress to answer for spreading hate and misinformation, they typically blame the inherent difficulty of language for their shortcomings. Executives at Meta, Twitter, and Google say it is hard to interpret context-dependent hate speech at scale and across many languages. Mark Zuckerberg often repeats the refrain that tech companies should not be responsible for solving all of the world's political problems.
At the moment, most companies combine technology with human content moderators, whose work is undervalued, as reflected in their meager pay.
AI currently detects 97% of the content removed from Facebook, for example.
Renee DiResta, research manager at the Stanford Internet Observatory, explains that AI is not very good at interpreting nuance and context, so it cannot fully replace human content moderators, even if humans are not always great at interpreting those things either.
Cultural context and language can also pose challenges, because most automated content moderation systems were trained on English data and perform poorly in other languages.
Hany Farid, a professor at the University of California, Berkeley School of Information, offers a more straightforward explanation: content moderation has not kept pace with the threats because it is not in tech companies' financial interest. "This is all driven by greed," says Farid. "Stop trying to make this about something other than money."
And the absence of federal regulation in the US makes it extremely difficult for victims of online abuse to hold platforms financially liable.
The battle over content moderation between tech companies and bad actors never seems to end. Companies roll out rules and filters to police content; bad actors find ways to dodge them, posting with emoji or deliberately misspelling words to evade detection. The companies then scramble to close those loopholes while they are being exploited, the offenders find new ones, and so on.
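To make that cat-and-mouse dynamic concrete, here is a minimal sketch, assuming a toy blocklist filter of the kind platforms have long relied on. It is not any company's actual system; the blocklist, the character-substitution table, and the sample posts are illustrative assumptions.

```python
# A toy keyword filter and the kind of obfuscation that defeats it.
# The blocklist and substitution table are illustrative assumptions only.
import re

BLOCKLIST = {"scam", "hoax"}  # hypothetical banned terms

# Common character swaps used to evade exact-match filters.
LEET_MAP = str.maketrans({"0": "o", "1": "i", "3": "e", "@": "a", "$": "s"})

def naive_flag(post: str) -> bool:
    """Exact word matching: easy to evade with misspellings or symbols."""
    words = re.findall(r"\w+", post.lower())
    return any(word in BLOCKLIST for word in words)

def normalized_flag(post: str) -> bool:
    """Normalize look-alike characters first, catching some obfuscation."""
    cleaned = post.lower().translate(LEET_MAP)
    words = re.findall(r"\w+", cleaned)
    return any(word in BLOCKLIST for word in words)

posts = ["This vaccine is a scam", "This vaccine is a $c@m", "Totally fine post"]
for p in posts:
    print(f"{p!r}: naive={naive_flag(p)}, normalized={normalized_flag(p)}")
```

Exact matching misses the obfuscated post; normalizing look-alike characters catches it, at least until the next workaround appears.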
DiResta and Farid agree that it is too soon to say how things will play out, but both are wary. Although many of the larger systems, such as GPT-4 and Bard, have built-in content moderation filters, they can still be manipulated into producing undesirable output, such as hate speech or instructions for building a bomb.
Generative AI could allow bad actors to run disinformation campaigns at far greater speed and scale. That is a disturbing prospect, especially given how inadequate current techniques are for identifying and labeling AI-generated content.
On the other hand, modern large language models are far better at interpreting text than earlier AI systems, and in theory they could be used to improve automated content moderation.
To make that work, though, tech companies would have to invest in retooling large language models for that specific purpose. And although some companies, such as Microsoft, have started researching this, there has been little noteworthy movement so far.
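As a rough illustration of what repurposing a language model for moderation might look like at its simplest, here is a hedged sketch using an off-the-shelf zero-shot classifier from the Hugging Face transformers library. The model choice, label set, and example post are assumptions made for illustration, not a description of any company's deployed system.

```python
# A hedged sketch of using a language model as a moderation classifier via
# zero-shot classification. The model, labels, and example post are
# illustrative assumptions, not a production moderation pipeline.
from transformers import pipeline

# Zero-shot classification scores a text against arbitrary labels without
# task-specific retraining.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

post = "Vaccines contain microchips that track you."
labels = ["health misinformation", "hate speech", "harmless"]

result = classifier(post, candidate_labels=labels)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.2f}")
```

Getting from a demo like this to something deployable is exactly the retooling problem: accuracy, context, and scale are the hard parts, not the basic plumbing.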
Despite many technological advances that should already have improved content moderation, Farid is doubtful that we will see real progress in this area.
Large language models still struggle with context, so they will probably be unable to interpret the nuance of posts and images as well as human moderators can. Scalability and specialization across cultures also raise questions. "Do you deploy one model for any particular type of niche? Do you do it by country? Do you do it by community?… It's not a one-size-fits-all problem," says DiResta.
New tools for new tech
Whether generative AI ends up doing more damage or more good to the online information sphere will likely depend on whether tech companies can build reliable, widely adopted tools to tell us whether content is AI-generated.
That is a significant technical challenge, and DiResta says detecting synthetic media is likely to be a top priority. It includes methods like digital watermarking, which embeds a small piece of code that acts as a permanent marker flagging the attached piece of content as made by AI. Automated tools for detecting posts generated or manipulated by AI are appealing because, unlike watermarking, they do not require the creator of the AI-generated content to proactively label it as such. That said, the current tools that try to do this have not been particularly good at identifying machine-made content.
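For a flavor of how detection can work when the generator cooperates, here is a toy sketch of a statistical "green list" text watermark, loosely inspired by academic proposals for watermarking language-model output. The word-level tokens, the hash-based list assignment, and the 0.5 baseline are simplifying assumptions, not any vendor's actual scheme.

```python
# Toy illustration of statistical text watermarking: a cooperating generator
# prefers words from a pseudorandom "green list" keyed by the previous word,
# and a detector checks whether green words appear more often than chance.
# Every detail here is a simplifying assumption for illustration.
import hashlib

def is_green(prev_word: str, word: str) -> bool:
    # Deterministically assign about half of all words to the green list
    # for any given preceding word.
    digest = hashlib.sha256(f"{prev_word}|{word}".encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(text: str) -> float:
    words = text.lower().split()
    if len(words) < 2:
        return 0.0
    hits = sum(is_green(a, b) for a, b in zip(words, words[1:]))
    return hits / (len(words) - 1)

# Ordinary text should score near 0.5; text from a generator that favored
# green words would score noticeably higher, which is the detection signal.
print(f"green fraction: {green_fraction('the quick brown fox jumps over the lazy dog'):.2f}")
```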
Any such labeling, however, would rely on voluntary disclosure strategies like watermarking. Some companies have even proposed cryptographic signatures that use math to securely log information such as how a piece of content originated.
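A minimal sketch of that cryptographic-signature idea, assuming a simple JSON provenance record and an Ed25519 key pair via the Python cryptography package; the record fields and key handling are illustrative assumptions rather than any real provenance standard.

```python
# A sketch of signing a provenance record for a piece of generated content.
# The record format and "generator" field are hypothetical.
import hashlib
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The generating tool holds a private key; its public key is published.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

content = b"An image or text blob produced by a generative model."
record = json.dumps({
    "sha256": hashlib.sha256(content).hexdigest(),  # fingerprint of the content
    "generator": "example-model-v1",                # hypothetical provenance field
}).encode()

signature = private_key.sign(record)  # sign the provenance record

# A platform or reader verifies the record against the published public key.
try:
    public_key.verify(signature, record)
    print("Provenance record is intact.")
except InvalidSignature:
    print("Record was altered or not signed by this generator.")
```

The point is that anyone holding the published public key can check that the provenance record was not altered after signing, without having to trust whoever relayed the content.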
The latest version of the European Union's AI Act, proposed just this week, requires companies that use generative AI to inform users when content is indeed machine generated. As demand for transparency around AI-generated content grows, we are sure to hear much more about these kinds of emerging solutions in the coming months.
- The EU may soon ban both facial recognition in public spaces and predictive policing algorithms. If it passes, the ban would be a significant victory for the anti-facial-recognition movement, which has recently lost momentum in the US.
- OpenAI CEO Sam Altman will appear before the US Congress on Tuesday as part of a hearing on AI oversight, following a bipartisan dinner the previous evening.
- Chinese police detained a man over the weekend for using ChatGPT to spread false information. ChatGPT was outlawed in China in February as part of a series of tougher regulations governing the use of generative AI. This appears to be the first arrest to result from those rules.
The audience for misinformation, however, appears to be much smaller than you might expect. Researchers at the Oxford Internet Institute who examined more than 200,000 Telegram posts found that although false material is frequently posted there, most users do not seem to spread it further.
Contrary to popular belief, the researchers conclude that the target audience for misinformation is not the general public but a small, active community of users. Even though Telegram is largely unmoderated, the study raises the possibility that a natural, demand-driven mechanism at least partly limits the spread of false information.