How AI is creating a safer online world


From cyberbullying on social media to attacks in the metaverse, the internet can be a dangerous place. Online content moderation is one of the most important ways companies can make their platforms safer for users.

However, moderating content is not an easy task. The volume of content online is staggering. Moderators must deal with everything from hate speech and terrorist propaganda to nudity and gore. The digital world’s “data overload” is only compounded by the fact that much of the content is user-generated, which makes it harder to identify and categorize.

Using AI to automatically detect harmful speech

That’s where AI comes in. By using machine learning algorithms to identify and categorize content, companies can catch unsafe content as soon as it is created, instead of waiting hours or days for human review, reducing the number of people exposed to it.
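To make the idea concrete, here is a minimal, purely illustrative sketch of this kind of classifier: a tiny naive Bayes model trained on a handful of invented labeled examples. The training data, labels, and word-count features are all toy assumptions; real moderation systems rely on far larger datasets and neural models.

```python
import math
from collections import Counter, defaultdict

# Toy labeled examples -- invented for illustration, not real platform data.
TRAIN = [
    ("i will hurt you", "unsafe"),
    ("you are trash and everyone hates you", "unsafe"),
    ("go away or else", "unsafe"),
    ("have a great day", "safe"),
    ("thanks for sharing this article", "safe"),
    ("see you at the meetup", "safe"),
]

def tokenize(text):
    return text.lower().split()

def train(examples):
    """Count word frequencies per class for a naive Bayes classifier."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, class_counts

def classify(text, word_counts, class_counts):
    """Pick the most likely class using log-probabilities with add-one smoothing."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Start from the class prior, then add each word's smoothed likelihood.
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

word_counts, class_counts = train(TRAIN)
print(classify("i will hurt you badly", word_counts, class_counts))  # unsafe
```

Because classification is just counting and arithmetic, a model like this can score new posts the instant they are submitted, which is exactly the speed advantage over a human review queue.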

For example, Twitter uses AI to identify and remove terrorist propaganda from its platform. Its AI flags more than half of the tweets that violate its terms of service, and CEO Parag Agrawal has made it a focus to use AI to identify hate speech and misinformation. Even so, more needs to be done, as toxicity is still rampant on the platform.

Similarly, Facebook’s AI detects about 90% of the hate speech the platform removes, and it also screens for nudity, violence, and other potentially offensive content. But like Twitter, Facebook still has a long way to go.

Where AI goes wrong

Despite its promise, AI-based content moderation faces many challenges. One is that these systems often mistakenly flag safe content as unsafe, which can have serious consequences. For example, Facebook marked legitimate news articles about the coronavirus as spam at the outset of the pandemic. It mistakenly banned a Republican Party Facebook page for more than two months. And it flagged posts and comments about Plymouth Hoe, a public landmark in England, as offensive.

However, the problem cuts both ways. Failing to flag content can have even more dangerous effects. The shooters in both the El Paso and Gilroy shootings broadcast their violent intentions on 8chan and Instagram beforehand. Robert Bowers, convicted of the Pittsburgh synagogue massacre, was active on Gab, a Twitter-like site used by white supremacists. Misinformation about the war in Ukraine has received millions of views and likes across Facebook, Twitter, YouTube and TikTok.

Another issue is that many AI-based moderation systems reflect racial biases that need to be addressed to create a safe and useful environment for everyone.

Improving AI for moderation

To fix these problems, AI moderation systems need high-quality training data. Today, many companies outsource the labeling of their AI training data to low-paid, poorly trained workers at call centers in developing countries. These labelers often lack the language skills and cultural context to make accurate moderation decisions. For example, unless you are familiar with U.S. politics, you may not know what messages referring to “Jan 6” or “Rudy and Hunter” imply, despite their importance for content moderation. If you are not a native English speaker, you may over-index on profane terms regardless of context, mistakenly flagging references to Plymouth Hoe or a phrase like “she is a bad bitch” as offensive.
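The Plymouth Hoe failure mode can be sketched in a few lines. The blocklist and “known benign phrase” list below are hypothetical toys: a filter that matches words with no context produces a false positive on the place name, while even a crude context check avoids it.

```python
PROFANITY = {"hoe"}                # toy blocklist -- illustrative only
SAFE_PHRASES = {"plymouth hoe"}    # known benign contexts (a place name)

def naive_flag(text):
    """Flag if any blocklisted word appears -- ignores context entirely."""
    return any(word in PROFANITY for word in text.lower().split())

def context_aware_flag(text):
    """Suppress the flag when the word appears inside a known benign phrase."""
    lowered = text.lower()
    if any(phrase in lowered for phrase in SAFE_PHRASES):
        return False
    return naive_flag(text)

print(naive_flag("Lovely sunset over Plymouth Hoe"))          # True (false positive)
print(context_aware_flag("Lovely sunset over Plymouth Hoe"))  # False
```

A hand-maintained exception list obviously does not scale; the point of high-quality, culturally informed labeling is to teach models this kind of context automatically rather than patching it case by case.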

One company addressing this challenge is Surge AI, a data labeling platform designed to train AI in the nuances of language. It was founded by a team of engineers and researchers who built the trust and safety platforms at Facebook, YouTube and Twitter.

For example, Facebook has faced major challenges in gathering high-quality data to train its moderation systems in key languages. Despite the company’s size and its scope as a worldwide communication platform, it barely had enough data to train and maintain a model for Modern Standard Arabic, let alone its dozens of dialects. The company’s lack of a comprehensive list of toxic slurs in the languages spoken in Afghanistan meant it was likely missing many violating posts. It lacked an Assamese hate speech model, even though employees flagged hate speech as a major risk in Assam, given the growing violence against ethnic groups there. These are the issues Surge AI helps solve through its focus on languages and its toxicity and profanity datasets.

In short, with larger, higher-quality datasets, social media platforms can train more accurate content moderation algorithms to detect harmful content, helping to keep their platforms safe and free of abuse. Just as large datasets have fueled today’s state-of-the-art language generation models, such as OpenAI’s GPT-3, they can also fuel better AI for moderation. With enough data, machine learning models can learn to detect toxicity with greater accuracy, and without the biases found in low-quality datasets.
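One simple way to see how dataset quality relates to bias is to measure a model’s false positive rate separately for each language or dialect group. The evaluation records below are invented for illustration; a model trained on low-quality labels for dialect “B” might wrongly flag benign posts in that dialect far more often.

```python
# Hypothetical evaluation records: (model_flagged, truly_toxic, dialect).
# These values are invented to illustrate the metric, not real results.
RESULTS = [
    (True,  True,  "A"), (False, False, "A"), (False, False, "A"), (True, False, "A"),
    (True,  True,  "B"), (True,  False, "B"), (True,  False, "B"), (False, False, "B"),
]

def false_positive_rate(records):
    """Share of genuinely benign posts that the model wrongly flagged."""
    false_positives = sum(1 for flagged, toxic, _ in records if flagged and not toxic)
    negatives = sum(1 for _, toxic, _ in records if not toxic)
    return false_positives / negatives if negatives else 0.0

for dialect in ("A", "B"):
    subset = [r for r in RESULTS if r[2] == dialect]
    print(dialect, round(false_positive_rate(subset), 2))  # A 0.33, B 0.67
```

Auditing per-group error rates like this is a common first step in detecting the kind of bias the paragraph above describes, before investing in better labels for the underperforming group.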

AI-assisted content moderation is not the perfect solution, but it is a valuable tool that can help companies keep their platforms safe and secure. With the increasing use of AI, we can look forward to a future where the online world is a safer place for all.

Valerias Bangert is a strategy and innovation consultant, founder of three profitable media outlets and a published author.


