ChatGPT is easily exploited for political messaging despite OpenAI’s policies

OpenAI is using GPT-4 to build an AI-powered content moderation system

The company says humans should still be involved in the moderation process.

ASSOCIATED PRESS

Content moderation has been one of the thorniest issues on the internet for decades. It’s a difficult subject for anyone to tackle, considering the subjectivity that goes hand-in-hand with figuring out what content should be permissible on a given platform. ChatGPT maker OpenAI thinks it can help, and it has been putting GPT-4’s content moderation skills to the test. It’s using the large multimodal model “to build a content moderation system that is scalable, consistent and customizable.”

The company wrote in a blog post that GPT-4 can not only help make content moderation decisions, but aid in developing policies and swiftly iterating on policy changes, “reducing the cycle from months to hours.” It claims the model can parse the various regulations and nuances in content policies and instantly adapt to any updates. This, OpenAI claims, results in more consistent labeling of content.

“We believe this offers a more positive vision of the future of digital platforms, where AI can help moderate online traffic according to platform-specific policy and relieve the mental burden of a large number of human moderators,” OpenAI’s Lilian Weng, Vik Goel and Andrea Vallone wrote. “Anyone with OpenAI API access can implement this approach to create their own AI-assisted moderation system.” OpenAI claims GPT-4 moderation tools can help companies carry out around six months of work in about a day.
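To make the idea concrete, here is a minimal Python sketch of what policy-driven labeling might look like. It is not OpenAI's implementation: the label names, the prompt wording and the `stub_model` function are all assumptions made for illustration, and the model call is left as a plain callable rather than tied to any real API. Replies that do not match a known label fall back to human review, in line with the human-in-the-loop approach OpenAI describes.

```python
from typing import Callable, Sequence

# Hypothetical fallback label: anything the model answers ambiguously goes to a person.
HUMAN_REVIEW = "NEEDS_HUMAN_REVIEW"


def build_prompt(policy: str, labels: Sequence[str], content: str) -> str:
    """Combine the platform policy and the content into one instruction."""
    return (
        "You are a content moderator. Apply the policy below.\n\n"
        f"POLICY:\n{policy}\n\n"
        f"CONTENT:\n{content}\n\n"
        f"Answer with exactly one label from: {', '.join(labels)}"
    )


def parse_label(reply: str, labels: Sequence[str]) -> str:
    """Accept the reply only if it is exactly one of the allowed labels."""
    cleaned = reply.strip().upper()
    return cleaned if cleaned in labels else HUMAN_REVIEW


def moderate(
    content: str,
    policy: str,
    labels: Sequence[str],
    ask_model: Callable[[str], str],
) -> str:
    """Label content under a policy; ask_model is any function mapping prompt to reply."""
    return parse_label(ask_model(build_prompt(policy, labels, content)), labels)


# Illustrative policy and labels (assumptions, not taken from OpenAI's blog post).
POLICY = "Content that threatens violence is not permitted."
LABELS = ("ALLOWED", "VIOLATES")


def stub_model(prompt: str) -> str:
    """Stand-in for a real language model so the sketch runs offline."""
    content = prompt.split("CONTENT:\n")[1].split("\n\nAnswer")[0]
    return "VIOLATES" if "attack" in content.lower() else "ALLOWED"


if __name__ == "__main__":
    for text in ("I will attack you tomorrow.", "Nice weather today."):
        print(text, "->", moderate(text, POLICY, LABELS, stub_model))
```

Because the policy is just text passed in at call time, changing the rules means editing a string rather than retraining a model, which is presumably the kind of fast iteration OpenAI has in mind. In practice, `stub_model` would be replaced by a call to whichever language model the platform uses.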

It’s been well-documented that manually reviewing traumatic content can have a significant impact on human moderators’ mental health, particularly when it comes to graphic material. In 2020, Meta agreed to pay more than 11,000 moderators at least $1,000 each in compensation for mental health issues that may have stemmed from reviewing material that was posted on Facebook.

Using AI to lift some of the burden from human reviewers could be greatly beneficial. Meta, for one, has been employing AI to help moderators for several years. Yet OpenAI says that, until now, human moderators have received help from “smaller vertical-specific machine learning models. The process is inherently slow and can lead to mental stress on human moderators.”

AI models are far from perfect. Major companies have long been using AI in their moderation processes and, with or without the aid of the technology, still get big content decisions wrong. It remains to be seen whether OpenAI’s system can avoid many of the major moderation traps we’ve seen other companies fall into over the years.

In any case, OpenAI agrees that humans still need to be involved in the process. “We’ve continued to have human review to verify some of the model judgements,” Vallone, who works on OpenAI’s policy team, told Bloomberg.

“Judgments by language models are vulnerable to undesired biases that might have been introduced into the model during training. As with any AI application, results and output will need to be carefully monitored, validated and refined by maintaining humans in the loop,” OpenAI’s blog post reads. “By reducing human involvement in some parts of the moderation process that can be handled by language models, human resources can be more focused on addressing the complex edge cases most needed for policy refinement.”

Engadget is a web magazine with obsessive daily coverage of everything new in gadgets and consumer electronics
