Earlier this week, Conservative Leader Andrew Scheer tweeted a screenshot of some curious Google search results. A search for the term “Canadian soldiers” returned a photo of former Guantanamo Bay detainee Omar Khadr, who was accused of killing a U.S. soldier in 2002.
Scheer asked that Google take action, and it didn’t take long for another user to suggest the whole thing was the work of a Russian troll.
Omar Khadr is a convicted terrorist who murdered a medic and blinded another. He is not a victim, nor should he be portrayed in this way alongside real Canadian heroes. @googlecanada: fix this. pic.twitter.com/qywUGQihVb
The truth? It’s far more mundane.
Still, the episode is yet another reminder of how even algorithms with the best of intentions can unwittingly fuel the spread of misinformation online. And with Canada’s federal election just months away, the stakes are even higher when politics are involved.
How did Khadr get in there?
Khadr’s name appeared in what Google calls its Knowledge Graph results. These sometimes appear above or beside Google’s usual search engine results when the user asks a question, seeks out a piece of general knowledge, or searches for a well-known place or public figure.
Knowledge Graph pulls its data from a variety of sources — one of them being Wikidata, an open repository of information that’s hosted by the same organization that hosts Wikipedia. Think of Wikipedia like a finished report, and Wikidata the raw data that’s used to write it. Like Wikipedia, anyone can contribute to Wikidata, for better and for worse.
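To make the "raw data" idea concrete, here is a minimal sketch of how Wikidata-style facts can be modelled as item, property and value statements, and how a downstream consumer might stitch two of them into a short label. Google has not described how Knowledge Graph actually does this, so the `short_description` function, the `DEMONYMS` lookup and the citizenship statement are all assumptions made for illustration. Plain labels stand in for real Wikidata identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One Wikidata-style fact: an item, a property, and a value."""
    item: str
    prop: str
    value: str


# Illustrative statements using plain labels instead of real Wikidata IDs.
# The citizenship statement is assumed here for the sake of the example.
STATEMENTS = [
    Statement("Omar Khadr", "country of citizenship", "Canada"),
    Statement("Omar Khadr", "occupation", "soldier"),
]

# Hypothetical lookup from country name to adjective.
DEMONYMS = {"Canada": "Canadian"}


def values_for(item, prop, statements):
    """Return every value recorded for one property of one item."""
    return [s.value for s in statements if s.item == item and s.prop == prop]


def short_description(item, statements):
    """Hypothetical consumer: combine citizenship and occupation into a label."""
    countries = values_for(item, "country of citizenship", statements)
    occupations = values_for(item, "occupation", statements)
    if not countries or not occupations:
        return None
    adjective = DEMONYMS.get(countries[0], countries[0])
    return f"{adjective} {occupations[0]}"


print(short_description("Omar Khadr", STATEMENTS))
```

With both statements present the sketch produces the label "Canadian soldier". Take the occupation statement away and it produces no label at all, which is why a single automated edit to the raw data can change what a consumer displays.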
Twitter user Stephen Punwasi pointed out that the data Knowledge Graph used to put Omar Khadr among Canadian soldiers appears to have been pulled from Omar Khadr’s Wikidata page — and that a “Russian troll” was the one who did it.
This is embarrassing. Google’s knowledge graph draws data from various source, so I checked for the change that put Omar Kadh under “Canadian soldier.” Changes from a Russian account a month ago. 🤦♂️ TL;DR Andrew Scheer was manipulated by a Russian troll. #cdnpoli @Mikeggibbs pic.twitter.com/WYchQ9b1H9
So was this the work of a Russian troll?
That doesn’t appear to be the case.
The modifications to Khadr’s Wikidata page were made by a user named Ghuron. The user appears to be an active contributor of data to the site, and according to their GitHub account, does happen to live in St. Petersburg, Russia.
But Ghuron’s activity doesn’t appear to be targeted at data related to any particular person, ideology, country, or political topic. Rather, it resembles an automated cleanup job intended to improve the quality of Wikidata at a rate far faster than any one person could do by hand.
According to discussions between Ghuron and other Wikidata members, Ghuron runs a script which uses machine learning to automatically add and modify large volumes of Wikidata data (for example, a person’s occupation). Basically, it’s designed to put data into buckets.
His script makes sure that Street Fighter is properly classified as a video game, that the Faroe Islands get lumped in under the larger “islands” category, or that the right Renaissance artists are properly classified as painters.
From time to time, his script also appears to get things wrong, and other Wikidata users haven’t been shy about letting him know.
Is that what happened to Khadr?
Yup! Using Khadr’s Wikidata page edit history as a guide, here’s a brief timeline that goes back even farther than Scheer’s tweet:
On July 26, 2018, Ghuron’s script categorizes Omar Khadr’s occupation on Wikidata as “soldier” — part of a larger, automated effort to assign occupations to everyone from the Zodiac Killer to Danish priests.
On Sept. 24, 2018, users begin posting to the discussion section of Omar Khadr’s Wikipedia page, asking why Google search results for his name describe him as a Canadian soldier. However, the phrase “Canadian soldier” has never appeared on his Wikipedia page, meaning the phrase was likely pulled from his Wikidata page instead. Google has yet to explicitly confirm this.
Later that day, the data is removed from Khadr’s Wikidata page. A user on the discussion section of Khadr’s Wikipedia page wrote: “It was a Google error, associated only with their search engine and was not fed by Wikipedia. Google rectified the error tonight and Omar Khadr is no longer shown as a Canadian soldier.”
On Sept. 30, 2018, Ghuron’s script categorizes Omar Khadr’s occupation on Wikidata as “soldier” once again. It’s not clear if Google’s Knowledge Graph ignored the change this time.
Either way, on Dec. 8, 2018, Ghuron’s script then categorizes Khadr’s military rank on Wikidata as “soldier” — data that’s just different enough that it likely found its way back into Knowledge Graph results. Before long, people noticed Khadr among the results for “Canadian Soldier” once again.
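The timeline above can be replayed as a small sketch to show why a second, slightly different statement might slip past a fix aimed at the first. The edit list follows the dates in the timeline. The `BLOCKED` set is purely hypothetical: Google has not said how it suppressed the original label, so this illustrates the "just different enough" idea and does not describe Knowledge Graph's actual behaviour.

```python
from datetime import date

# Edits to the Wikidata page as described in the timeline:
# (date, action, property, value).
EDITS = [
    (date(2018, 7, 26), "add", "occupation", "soldier"),
    (date(2018, 9, 24), "remove", "occupation", "soldier"),
    (date(2018, 9, 30), "add", "occupation", "soldier"),
    (date(2018, 12, 8), "add", "military rank", "soldier"),
]

# Hypothetical downstream rule that ignores one specific property/value pair.
BLOCKED = {("occupation", "soldier")}


def replay(edits):
    """Apply the edits in order and return the statements left on the page."""
    current = set()
    for _, action, prop, value in edits:
        if action == "add":
            current.add((prop, value))
        else:
            current.discard((prop, value))
    return current


def visible(statements, blocked):
    """Return the statements a consumer would still see after the block rule."""
    return sorted(statements - blocked)


page = replay(EDITS)
print(visible(page, BLOCKED))
```

Under these assumptions the occupation statement is filtered out, but the military rank statement added on Dec. 8 carries the same value under a different property, so it remains visible.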
What did Google do about it?
Danny Sullivan, the closest thing Google has to a search engine ombudsman, replied to Canadaland’s Jesse Brown on Twitter.
I’m not familiar with who is claiming what. As said, we simply saw concern being raised widely. We reviewed, and because it was an issue with the Knowledge Graph, we took action there (we do not take action on search listings). And yes, on the last because it is accurate.
“We reviewed, and because it was an issue with the Knowledge Graph, we took action there,” Sullivan wrote.
It doesn’t appear that Khadr’s Wikidata page has been changed — just Knowledge Graph’s handling of the data it contains.
CBC News has reached out to Google, and will update this story if we hear more.
Is this normal?
As Sullivan also points out on Twitter, Google doesn’t modify search results, at least not unless it’s compelled to remove information from its index. Rather, Google is changing its Knowledge Graph results.
Whether that distinction is obvious to most users is less clear, given where Knowledge Graph results appear on the page, especially in situations where political tensions run high.