Hey Kaitlyn,

Today I have the kind of story I hope to write more of in the future: humans who are working to fortify the internet against the onslaught of low-quality and often misleading AI-generated content. It makes sense that Wikipedia, which has always been a collaborative, not profit-driven effort to provide people with useful information, would be a place where we'd see an effort like this. I think there's a lot we can learn from how and why these editors do it.

A group of Wikipedia editors has formed WikiProject AI Cleanup, “a collaboration to combat the increasing problem of unsourced, poorly-written AI-generated content on Wikipedia.” The group’s goal is to protect one of the world’s largest repositories of information from the same kind of misleading AI-generated information that has plagued Google search results, books sold on Amazon, and academic journals.

“A few of us had noticed the prevalence of unnatural writing that showed clear signs of being AI-generated, and we managed to replicate similar ‘styles’ using ChatGPT,” Ilyas Lebleu, a founding member of WikiProject AI Cleanup, told me in an email. “Discovering some common AI catchphrases allowed us to quickly spot some of the most egregious examples of generated articles, which we quickly wanted to formalize into an organized project to compile our findings and techniques.”
In many cases, WikiProject AI Cleanup finds AI-generated content on Wikipedia with the same methods others have used to find it in scientific journals and Google Books: searching for phrases commonly used by ChatGPT. One egregious example is the Wikipedia article about the Chester Mental Health Center, which in November 2023 included the phrase “As of my last knowledge update in January 2022,” referring to the last time the large language model was updated.

Other instances are harder to detect. Lebleu and another WikiProject AI Cleanup founding member who goes by Queen of Hearts told me that the most “impressive” example of AI-generated content they have found on Wikipedia so far is an article about the Ottoman fortress of Amberlihisar:

“Amberlihisar fortress was built in 1466 by Mehmed the Conqueror in Trabzon, Turkey. The fortress was designed by Armenian architect, Ostad Krikor Baghsarajian.[7] Construction of the fortress was completed using a combination of stone and brick materials, with craftsmen and builders being brought in from the Rumelia region to work on the project. The timbery for the fortress was sourced from the forests along the coast of the Black Sea. The duration of construction is not specified, but it is known that the fortress was completed in 1466. It is likely that construction took several years to complete.[7]”

The more than 2,000-word article is filled with cogent paragraphs like the ones above, divided into sections about the fortress’s name, construction, the various sieges it faced, and even restoration efforts after it “sustained significant damages as a result of bombardment by Russian forces” during World War I.

“One small detail, the fortress never existed,” Lebleu said. Aside from a few tangential facts mentioned in the article, like that Mehmed the Conqueror, or Mehmed II, was a real person, everything else in the article is fake.
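The phrase-matching approach the editors describe can be sketched in a few lines of code. This is a hypothetical illustration, not the project's actual tooling; the list of catchphrases here is illustrative, seeded with the one phrase quoted in this story.

```python
# Hypothetical sketch of catchphrase-based detection: flag text that
# contains stock phrases commonly produced by chatbots. The phrase list
# is illustrative, not WikiProject AI Cleanup's actual list.
AI_TELLS = [
    "as of my last knowledge update",
    "as an ai language model",
    "it is important to note that",
]

def flag_suspect_text(text: str) -> list[str]:
    """Return the catchphrases found in `text` (case-insensitive)."""
    lowered = text.lower()
    return [phrase for phrase in AI_TELLS if phrase in lowered]

# The Chester Mental Health Center example would be flagged:
print(flag_suspect_text("As of my last knowledge update in January 2022..."))
```

A heuristic like this only catches the most egregious cases; as the editors note below, subtler fabrications such as fake citations require human review.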
“The entire thing was an AI-generated hoax, with well-formatted citations referencing completely nonexistent works.”

Fake citations, Lebleu said, are a more “pernicious” issue because they can stay undetected for months. Even if someone were using an LLM trained on a corpus of data relevant to the Wikipedia article, and it generated text that reads well, with correctly formatted citations of real sources, it still wouldn’t be able to correctly match a citation to a specific claim made in a specific body of work. As an example, Lebleu pointed to a Wikipedia article about an obscure species of beetle that cited a real journal article in French. “Only thing, that article was about a completely unrelated species of crab, and made no mention of the beetle at all,” Lebleu said. “This adds a layer of complications if the sources are not in English, as it makes it harder for most readers and editors to notice the issue.”

Other examples of AI-generated content that WikiProject AI Cleanup has removed from Wikipedia are more subtle, but can cause just as much confusion. For example, an article about Darul Uloom Deoband, a real Islamic seminary in India, at one point featured this image, which, like the images in many Wikipedia articles, looks like a period-appropriate painting related to the subject of the article. Upon closer examination, however, you can see the telltale signs of poorly AI-generated people: mangled hands and a foot with seven toes.
According to a page where WikiProject AI Cleanup tracks the removal of AI-generated images on Wikipedia (previously highlighted by The Signpost), this image was removed because “the image contributes little to the article, could be mistaken for a contemporary artwork, and is anatomically incorrect.”

The same page clarifies that the project doesn’t remove AI-generated images from Wikipedia just because they are AI-generated; in some cases, AI-generated images are appropriate. If an article is about or mentions an AI-generated image, it makes sense to include it. For example, the article about the viral, botched Willy's Chocolate Experience, which was advertised with an AI-generated image, includes that image. The article about the baseless claims promoted by Donald Trump that Haitian immigrants were eating pets in Springfield, Ohio, includes an AI-generated image, tweeted by the Republican-controlled United States House Committee on the Judiciary, of Trump holding a goose and a kitten. The article about pastoral science fiction also features an image generated with Stable Diffusion, which the team is not trying to remove because “the image is a high quality rendition of the ideas in its section.”

In some ways, it seems, Wikipedia is, at least for now, better than other major internet services at detecting and filtering out misleading AI-generated content, because the site has always relied on human volunteers to review new articles and track down any claims they make to reliable sources. That’s in contrast to Facebook, Google, Amazon, and other large platforms, which have human moderators but, as we’ve repeatedly reported, have failed to catch misleading AI-generated content, and usually only remove it in response to our reporting or complaints from users.
“Wikipedia articles have a more specific format (not only in terms of presentation, but also of content) than Google results, and a LLM that isn't familiar with it is likely to produce something that is much more easy to spot,” Lebleu said. “Things like verifying references also help: as Wikipedia aims to be a tertiary source (synthesizing other sources without adding original research itself), it should theoretically be possible to verify if the written content matches the sources.”

“While I'd like to think Wikipedians are decent at detecting and removing AI content, there is undoubtedly a lot that slips through the cracks, and we're all volunteers,” Queen of Hearts said. “While major companies' failure to detect and remove AI slop is concerning, I believe they could do better than us with properly allocated resources.”

Lebleu said the editors have discussed using AI to detect AI, with tools like GPTZero, but have so far found them to have “varying levels of success.”

“There is ultimately no ‘oracle machine’ that could perfectly distinguish AI text from non-AI text,” Lebleu said. “These AI-detecting tools are often imprecise, and only effective on older models like GPT-2. Also, like LLMs themselves, LLM detectors haven't been trained specifically on Wikipedia articles, a corpus that is much more homogenous than a much larger training set, and thus easier to distinguish from the outputs of models trained on this larger set. Because of this, humans familiar with both Wikipedia writing guidelines and common LLM keywords are often better at spotting AI content in this specific context.”