Wikipedia:Wikipedia Signpost/2023-02-04/Recent research
Wikipedia's "moderate yet systematic" liberal citation bias
A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
English Wikipedia's news citations found to have "moderate yet systematic" liberal bias

A preprint titled "Polarization and reliability of news sources in Wikipedia" finds
The study is based on a dataset of 30 million citations extracted in 2020, which the second author and others have already examined from different angles in other research publications (cf. our previous coverage: "6.7% of Wikipedia articles cite at least one academic journal article with DOI", "How Wikipedia keeps up with COVID-19 research", "A Map of Science in Wikipedia").
As with research examining other kinds of bias (like gender, language or geography), studying political bias involves the non-trivial problem of defining a "neutral" baseline against which to compare Wikipedia's content. For example, in a series of earlier papers that (among other results) found Wikipedia to be "more slanted towards Democratic views" than Britannica, although its "bias was moving from left to right", Greenstein and Zhu used the United States Congressional Record as a kind of gold standard of unbiased language. (Of course, this opened them up to the question of whether the spectrum of opinions present among US federal lawmakers is an appropriate baseline for an international encyclopedia, even if their analysis was focused on articles related to US politics.) A 2017 paper studied both political and gender bias by comparing Wikipedia's coverage of topics to that of "political periodicals geared toward either liberal or conservative ideologies" (e.g. Mother Jones vs. National Review), and women's vs. men's magazines, respectively (see our earlier coverage: "English Wikipedia biased against conservative and female topics, at least when compared to US magazines").
The present study relies on a different source that has since become available:
Matching domain names between MBM and the "Wikipedia Citations" dataset, the study finds that

Breaking down polarization ratings by ORES article topic areas, "we cannot see differences among macro topics". This "general trend" was also found for the top 10 (sub-)topic areas and the top 10 WikiProjects, although with "minor shifts [...]. For example, the topic sports has a higher conservative-leaning fraction of citations, all the while maintaining a liberal-leaning skew. The WikiProjects Politics and India are more liberal-leaning than the average, instead. Taken together, these results confirm that the overall trend towards liberal political polarization is not specific to some areas of Wikipedia, but seems to be widespread across topics and WikiProjects."
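To make the general shape of such an analysis concrete, here is a minimal Python sketch of matching citation URLs to outlet-level scores by domain name and then averaging the scores per topic. It is not the authors' code: the score table, its numeric scale, the example domains and the helper names `domain_of` and `mean_score_by_topic` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical outlet scores (negative = liberal-leaning, positive =
# conservative-leaning). Invented values, not taken from MBM.
OUTLET_SCORES = {
    "example-left.com": -1.0,
    "example-right.com": 1.0,
}

def domain_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

def mean_score_by_topic(citations, scores):
    """Average the outlet score of matched citations within each topic.

    `citations` is an iterable of (topic, url) pairs; citations whose
    domain has no entry in `scores` are skipped.
    """
    totals = defaultdict(lambda: [0.0, 0])  # topic -> [score sum, count]
    for topic, url in citations:
        score = scores.get(domain_of(url))
        if score is None:
            continue
        totals[topic][0] += score
        totals[topic][1] += 1
    return {topic: s / n for topic, (s, n) in totals.items()}

# Made-up citations: (topic, cited URL)
citations = [
    ("Sports", "https://www.example-left.com/a"),
    ("Sports", "https://example-right.com/b"),
    ("Politics", "http://example-left.com/c"),
    ("Politics", "https://unrated.example.org/d"),
]
print(mean_score_by_topic(citations, OUTLET_SCORES))
# → {'Sports': 0.0, 'Politics': -1.0}
```

Note that citations to unrated domains silently drop out of the averages, so any result of this kind only describes the subset of citations that could be matched.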

Motivating their second research question, the authors "speculate that editors may introduce political polarization in their sources in order to prioritise reliable ones" (which might remind one of Stephen Colbert's dictum "Reality has a well-known liberal bias"). To test this hypothesis, they use the reliability ratings of Media Bias/Fact Check (but not that site's bias ratings). They note in passing "that, while there are only 1467 citations rated as 'VERY LOW' [reliability], there remains a sizable fraction of citations to low or mixed reliability outlets" on English Wikipedia, as of 2020. (It might have been interesting to conduct the same analysis with the English Wikipedia's own reliability ratings that the community has compiled for numerous news sources at WP:RSP – where, ironically, "Media Bias/Fact Check" is itself currently rated as "generally unreliable, as it is self published", somewhat in contrast to the present paper and the peer-reviewed publication that it cites in justification of using MBFC.)
However, in a linear regression analysis (which also takes article topic and WikiProjects into account), the authors "cannot see a clear pattern emerge. While high reliability shows a liberal skew, very high reliability shows a conservative skew in turn. Mixed sources tend to be more liberal, while low and very low reliability ones tend to be more conservative." Overall, they conclude that "the case for a possible association between low reliability and conservative news outlets disappear[s]" in the end.
Briefly
- See the page of the monthly Wikimedia Research Showcase for videos and slides of past presentations.
- The Wikimedia Foundation's Research team is soliciting nominations for the "Wikimedia Foundation Research Award of the Year" 2022, to be submitted until February 6.
Other recent publications
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.
"Political representation bias in DBpedia and Wikidata as a challenge for downstream processing"
From the abstract:
From the "Method" section:
From the "Results and interpretation" section:
"Assuming Good Faith Online"
In this legal essay, US legal scholar Eric Goldman (whom some Wikipedians might recall for his – later retracted – 2005 prediction of Wikipedia's demise due to volunteer burnout) contrasts Wikipedia's "Assume Good Faith" principle with current attempts by Internet regulators to rein in user-generated content websites and Section 230 (see also this issue's "In the media").
Simulation of article disputes finds that "it is more important not to have intolerant editors than to have very tolerant ones"
From the abstract:
From the "Conclusion" section:
See also our review of a related earlier paper involving one of the authors: "More newbies mean more conflict, but extreme tolerance can still achieve eternal peace".
"The Role of Local Content in Wikipedia: A Study on Reader and Editor Engagement"
From the abstract:
(cf. by some of the same authors: "The Wikipedia Diversity Observatory: A Project to Identify and Bridge Content Gaps in Wikipedia")
This paper is part of a 2021 monograph published on the occasion of Wikipedia's 20th anniversary ("Wikipedia, veinte años de conocimiento libre"), which comprises various other research papers, most of which are in Spanish with an English abstract.
"Discussing the Past: The Production of Historical Knowledge on Wikipedia"
From the abstract:
This dissertation includes detailed examinations of the history of discussions at Talk:Atomic bombings of Hiroshima and Nagasaki, Talk:Vietnam War and Talk:September 11 attacks.
"Producing Historical Knowledge on Wikipedia"
This is an earlier paper by the dissertation's author. From the "Conclusion" section:
InternetArchiveBot found to be over-eager in declaring links as "permanently dead" but late in archiving them
From the abstract:
Discuss this story
Ech. So they compared an international encyclopedia to American publications and claim it's bad we don't perfectly align with them? Most of the Anglosphere is left of America. Adam Cuerden (talk) Has about 8.2% of all FPs. Currently celebrating his 600th FP! 16:23, 7 February 2023 (UTC)
- Most of the Anglosphere is the United States of America. Somers-all-the-time (talk) 04:36, 8 February 2023 (UTC)
- 2/3rds of the Anglosphere is a majority, but not so much of one that Wikipedia would be expected to match American biases. Frankly, the premise of the study is that American political biases are some objective standard that Wikipedia should be trying to emulate. Since that's not and has never been Wikipedia's goal, and given the weirdness of the dataset (Facebook user data and Media Bias Monitor?), it's questionable. Oddly enough, though, Media Bias Monitor itself rates Wikipedia as "least biased". Adam Cuerden (talk) Has about 8.2% of all FPs. Currently celebrating his 600th FP! 18:50, 9 February 2023 (UTC)
That InternetArchiveBot blurb is quite concerning. Any attempts to fix this issue? DFlhb (talk) 09:23, 9 February 2023 (UTC)