Wikipedia:Wikipedia Signpost/2021-06-27/Recent research
Feminist critique of Wikipedia's epistemology, Black Americans vastly underrepresented among editors, Wiki Workshop report
A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
"WP:NOT, WP:NPOV, and Other Stories Wikipedia Tells Us: A Feminist Critique of Wikipedia’s Epistemology"
- Reviewed by Markworthen
This paper by Menking and Rosenberg, published in the journal Science, Technology, & Human Values, is a recondite article. Its depth is both a strength (diligent study of the article will likely enhance Wikipedians' understanding of potential problems, such as our assumptions about what constitutes a reliable source or our epistemological presumptions), and a weakness (most Wikipedians will not read it because it is so dense).
I tried (several times) but cannot improve on the authors' summary. Here, then, is an excerpt from the article abstract:
Context
Background reading that will enhance understanding of this Menking & Rosenberg (2021) article:
- Benjakob, Omer and Stephen Harrison. From Anarchy to Wikiality, Glaring Bias to Good Cop: Press Coverage of Wikipedia's First Two Decades. The Signpost 17, no. 1 (31 January 2021).
- Epistemology
- Feminist epistemology
- Feminist method
- Lorraine Code and Helen Longino – Feminist philosophers whose ideas helped guide Menking & Rosenberg's analysis.
Talk page discussions about the article
The article has generated some engaging discussions on Wikipedia talk pages, for example:
- "Helping new editors: Learn Wikipedia's unique culture" – Discussion of the article begins after I posted the citation on 28 December 2020 @ 17:45 UTC (in that thread).
- "WP:5P sidetrack (part II)" on Iridescent's talk page.
- User talk:Markworthen/sandbox/Feminist critique of Wikipedia's epistemology
African Americans are vastly underrepresented among US Wikimedians, but contribute motivated by "black altruism"
- Reviewed by Tilman Bayer
Last month, the Wikimedia Foundation published the results of its annual "Community Insights", a global survey of 2,500 Wikimedians (including active editors and program leaders) conducted in September/October 2020.
For the first time, the survey asked about race and ethnicity, confined to two countries where such categories are widely used and accepted: the US (195 responses) and the UK (67). Among US contributors, the findings show striking gaps for Black/African American editors (0.5% compared to 13% among the general population) and American Indian/Alaska Native editors (0.1% vs. 0.9%). Hispanic/Latino/a/x editors show a lesser but still large gap (5.2% vs. 18%). White/Caucasian (89% vs. 72%) and especially Asian Americans (8.8% vs. 5.7% among the general population) are over-represented among contributors in the US.
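To make the size of these gaps easier to compare, the following minimal sketch divides each group's share among US editors by its share of the general population, using only the percentages quoted above. The name `representation_ratio` and the dictionary layout are illustrative choices, not part of the survey report, and the calculation makes no assumption about how the survey weighted its responses.

```python
# Shares in percent, as quoted above: (share among US editors, share of general population)
us_shares = {
    "Black/African American": (0.5, 13.0),
    "American Indian/Alaska Native": (0.1, 0.9),
    "Hispanic/Latino/a/x": (5.2, 18.0),
    "White/Caucasian": (89.0, 72.0),
    "Asian American": (8.8, 5.7),
}


def representation_ratio(editor_pct, population_pct):
    """Editor share divided by population share; 1.0 would mean parity."""
    return editor_pct / population_pct


ratios = {
    group: representation_ratio(editor_pct, population_pct)
    for group, (editor_pct, population_pct) in us_shares.items()
}

# Print groups from most under-represented to most over-represented.
for group, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{group}: {ratio:.2f}")
```

A ratio below 1.0 indicates under-representation relative to the general population, and a ratio above 1.0 indicates over-representation.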

In the UK, the survey similarly found "significant underrepresentation" of Black or Black British editors (0.0% vs. 3.0% in the general population), whereas the percentage of white editors was close to the general population.
A racial or ethnic gap among Wikipedians in the US has long been anecdotally observed or conjectured (see e.g. this 2010 thread, which also contained some informed speculation about possible reasons), but this marks the first time that it is backed by empirical survey data. The earlier lack of such data is related to the fact that the Wikimedia Foundation's annual surveys are global in nature and that there are no internationally accepted definitions of race and ethnicity (worse, survey questions of this nature would be considered offensive in many countries) [1][2].

Correspondingly, there has been little research about possible reasons for such gaps. An exception is the 2018 paper "The Pipeline of Online Participation Inequalities: The Case of Wikipedia Editing", which we previously reviewed here with a more general focus, but which also contains some insights into why African Americans contribute at a lower rate.
While this study did not find a significant racial disparity among the earlier parts of the pipeline (measuring whether survey respondents had heard of Wikipedia or had visited Wikipedia), when it comes to "know[ing] that Wikipedia can be edited [...] age, gender, and several racial/ethnic identity categories (Black, Hispanic, Other) emerge as salient explanatory factors where they did not before. Income no longer explains the outcome. Education level associates strongly with knowing Wikipedia can be edited." However, racial and ethnic background factors "do not associate with who contributes content" (i.e. the last part of the pipeline). This points to raising awareness of Wikipedia's editability as a potential strategy for reducing these gaps, although this would not address "the importance of education and Internet skills" gaps for closing knowledge gaps that the authors highlight in their overall conclusions.
Conversely, a 2020 paper answered the question "What drives Black contributions to Wikipedia?" with the following conclusions, based on a survey of 318 Black Wikipedia editors in the US:
Survey respondents were recruited in 2017 via Qualtrics "based on predefined characteristics such as individuals who identified as Black/African American, resided in the United States, and had made at least one edit/contribution to Wikipedia's English edition over the last three years". Interestingly, the resulting sample of 318 Black Wikipedia contributors was much larger than that of the WMF Community Insights survey, which (barring some extreme downward adjustments during the weighting process) appears to have included only a single Black respondent, considering the stated percentage of 0.5% among 195 US-based respondents.
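The arithmetic behind that estimate can be checked with a short sketch. It assumes an unweighted sample, which is a simplification: the survey's actual weighting procedure is not described here, and the helper names are hypothetical.

```python
US_RESPONSES = 195  # US responses to the race/ethnicity question, as stated above


def implied_respondents(share_pct, sample_size):
    """Head count implied by a percentage of an unweighted sample."""
    return share_pct / 100 * sample_size


def share_of_one(sample_size):
    """Percentage that a single respondent represents in an unweighted sample."""
    return 100 / sample_size


black_count = implied_respondents(0.5, US_RESPONSES)
native_count = implied_respondents(0.1, US_RESPONSES)
one_respondent_pct = share_of_one(US_RESPONSES)

print(f"0.5% of {US_RESPONSES} responses: about {black_count:.2f} respondents")
print(f"0.1% of {US_RESPONSES} responses: about {native_count:.2f} respondents")
print(f"One respondent out of {US_RESPONSES}: about {one_respondent_pct:.2f}%")
```

Under this assumption, 0.5% corresponds to roughly one respondent, while the 0.1% reported for American Indian/Alaska Native editors corresponds to less than one person, which is only possible if responses were weighted.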
Wiki Workshop 2021
- Report by Tilman Bayer
The annual Wiki Workshop, part of The Web Conference, took place as an online event on April 14, 2021, featuring the papers listed below. The organizers reported that 78% of attendees were non-native English speakers, 66% attended Wiki Workshop for the first time, 53% were academic researchers and 34% were students.
"References in Wikipedia: The Editors' Perspective"
"Do I Trust this Stranger? Generalized Trust and the Governance of Online Communities"
"Negative Knowledge for Open-world Wikidata"
"A Brief Analysis of Bengali Wikipedia's Journey to 100,000 Articles"
"WikiShark: An Online Tool for Analyzing Wikipedia Traffic and Trends"
"Tracing the Factoids: the Anatomy of Information Re-organization in Wikipedia Articles"
"Wikidata Logical Rules and Where to Find Them"
"Simple Wikidata Analysis for Tracking and Improving Biographies in Catalan Wikipedia"
Related code: https://github.com/toniher/wikidata-pylisting
"Structural Analysis of Wikigraph to Investigate Quality Grades of Wikipedia Articles"
"Towards Open-domain Vision and Language Understanding with Wikimedia"
"Language-agnostic Topic Classification for Wikipedia"
See also: online demo, data dumps, model details
"Fast Linking of Mathematical Wikidata Entities in Wikipedia Articles Using Annotation Recommendation"
"ShExStatements: Simplifying Shape Expressions for Wikidata"
"Inferring Sociodemographic Attributes of Wikipedia Editors: State-of-the-art and Implications for Editor Privacy"
"The Language of Liberty: A preliminary study"
"Information flow on COVID-19 over Wikipedia: A case study of 11 languages"
"Towards Ongoing Detection of Linguistic Bias on Wikipedia"
Analysis of two million AfD (Articles for Deletion) discussions
Among the findings are that "Editors who joined before 2007 tend to overwhelmingly belong to the more central parts of the network" and that "user preferences [for keep or delete] are relatively stable over time for ... more central editors. However, despite the overall stability of trajectories, we also observe a substantial narrowing of opinions in the early period of an AfD reviewer tenure. ... Strong deletionists exhibit the least amount of change, suggesting the possibility of lower susceptibility, or higher resistance, to opinion change in this group." Overall though, the authors conclude that "differences between inclusionists and deletionists are more nuanced than previously thought."
"Assessing the quality of health-related Wikipedia articles with generic and specific metrics"
"Wikipedia Editor Drop-Off: A Framework to Characterize Editors' Inactivity"

The paper is part of an ongoing research project funded by a €83,400 project grant from the Wikimedia Foundation. Some related code can be found at https://github.com/WikiCommunityHealth/ .
Wikimedia Foundation Research Award of the Year
Besides presentations about the papers listed above, the Wiki Workshop event also saw the announcement of the first "Wikimedia Foundation Research Award of the Year" ("WMF-RAY", cf. call for nominations), with the following two awardees:
"Content Growth and Attention Contagion in Information Networks: Addressing Information Poverty on Wikipedia" (also presented at last year's Wiki Workshop), a paper which according to the laudators
"Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages" and Masakhane (which describes itself as "A grassroots NLP community for Africa, by Africans"). The paper
While the research does not seem to have concerned Wikipedia directly, the laudators find it an "inspiring example of work towards Knowledge Equity, one of the two main pillars of the 2030 Wikimedia Movement Strategy" and expect the project's success
Consistent with its title, the paper features an impressive list of no fewer than 48 authors (with the cited eprint having been submitted to arXiv by Julia Kreutzer of Google Research).
Briefly
- See the page of the monthly Wikimedia Research Showcase for videos and slides of past presentations.
References
- Supplementary references and notes:
Discuss this story
Initial discussion
These papers would be more convincing if they were written in something resembling the English language. "An alternative set of pillars developed through the lens of feminist epistemology" is about as meaningful as "Colorless green ideas sleep furiously". The Blade of the Northern Lights (話して下さい) 21:25, 27 June 2021 (UTC)
I'm not a huge fan of the conclusions in the paper about why black people participate in Wikipedia less. Considering that they participate at, what, 1/20th of the rate of white people, we could say that we're only engaging 1/20th of the black people who would potentially be interested in editing. Of the ones we do engage, they're interested in black altruism - but what about the ones we don't? I think that's the more important question. The fact that the ones we do engage tend to not cite "entertainment" as a reason is somewhat interesting I think - perhaps we should look at that among non-editors? Since I'd assume there is a large base of black people who would potentially find Wikipedia fun to edit, yet for some reason have avoided it. I do appreciate the effort to look more into demographics, I'm only concerned about a form of survivorship bias interfering with getting useful information. Elli (talk | contribs) 22:37, 27 June 2021 (UTC)
Yeah this reeks of bad conclusions based on poor evidence. Maybe consider systemic and socioeconomic factors before torpedoing the 5 Pillars and how wiki has functioned decently for a long time. ~Gwennie🐈⦅💬 📋⦆ 22:41, 27 June 2021 (UTC)
WhatamIdoing's discernment
What if they were right?
I'm inclined to reject the argument, but I think it is good intellectual discipline to seriously consider the possibility that they are basically right about the 5P being responsible for the diversity gap in the editor population. I think, though, even if they were right, it would be a mistake to attempt a radical reengineering of core principles: to use nautical terms, the Wikipedia community, at least for en.WP, is the analog of an oil tanker with a turning circle that takes hours to execute. I doubt the enterprise could survive such an effort and remain fruitful. If we did want to remake WP on new principles, I think it could only work in the context of a new project. — Charles Stewart (talk) 17:24, 1 July 2021 (UTC)
So what if knowers are situated? Why can't knowers reflect on what they are doing and put their personal beliefs aside when writing an encyclopedia? Is it so hard to just describe the debates rather than engaging in the debates themselves? Is it really unrealistic to expect that editors can look after each other's edits on controversial articles and make sure that only descriptions of the debates are given? We aren't trying to write using a view from nowhere (as Thomas Nagel would put it). What we are doing is presenting every angle in proportion to the weight it has among the views held by expert researchers. Also, feminist epistemology is originally concerned with researchers, not with the summarizers (i.e. encyclopedia writers) of the findings of those researchers. So the concepts in that area aren't automatically applicable to Wikipedia. So why are the authors referring to 'the truth' when Wikipedia doesn't lead, it only follows?
VarunSoon (talk) 03:39, 2 July 2021 (UTC)
Survey percentages
I don't understand how if there were 195 respondents reporting their race/ethnicity in the US, First Nation people can make up 0.1%. Even if there was just one such person in the sample, that would be 0.5%. What am I missing? --Andreas JN466 09:00, 29 June 2021 (UTC)
Publicly available version of Menking and Rosenberg paper
The article links to a paywalled version of the paper, which is available OA at https://journals.sagepub.com/doi/pdf/10.1177/0162243920924783?casa_token=EfdSjisfZf8AAAAA:EB-0LLFClccB0CVNc8io5W46u4DoBWAx9gX-bBDf3PHbsRq3xDMbs1Fh_uePmIJ4RpxXh1WGZg9j
The link should be updated. — Charles Stewart (talk) 10:22, 1 July 2021 (UTC)
What inevitably mathematically dominated the study
With our sports SNG "did it for a living for one day" criteria to bypass GNG, we have an immense amount of articles (many permastubs) in this numerically male dominated (and even more so collectively over history) field which heavily influence overall numbers in such studies. I hit "random article" a few hundred times and 43% of ALL of the articles about men were about sports figures. This mathematically dwarfs any other category, with politicians being a distant second at 11%. So sports figures would have mathematically dominated that study. North8000 (talk) 00:05, 16 July 2021 (UTC)
This is premature but I wanted to post something. I did a more careful sample (so far 200 articles). Of the articles about individual people (59), I divided them into recent (active in the last 15 years) and not recent. Here was the breakdown of articles on individual people:
- Articles on individual sports people: 29% All other articles on individual people 71%
- Non-recent sports: Male 100% Female 0%
- Recent sports: Male 90% Female 10%
- Non sports, non recent: Male: 81% Female 19%
- Non sports, recent: Male 45% Female 55%
North8000 (talk) 21:12, 16 July 2021 (UTC)