Wikipedia:Wikipedia Signpost/2021-06-27/Recent research

Recent research

Feminist critique of Wikipedia's epistemology, Black Americans vastly underrepresented among editors, Wiki Workshop report

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

Reviewed by Markworthen

This paper by Menking and Rosenberg, published in the journal Science, Technology, & Human Values, is a recondite article. Its depth is both a strength (diligent study of the article will likely enhance Wikipedians' understanding of potential problems, such as our assumptions about what constitutes a reliable source or our epistemological presumptions), and a weakness (most Wikipedians will not read it because it is so dense).

I tried (several times) but cannot improve on the authors' summary. Here, then, is an excerpt from the article abstract:

Context

Background reading that will enhance understanding of this Menking & Rosenberg (2021) article:

Talk page discussions about the article

The article has generated some engaging discussions on Wikipedia talk pages, for example:


African Americans are vastly underrepresented among US Wikimedians, but contribute motivated by "black altruism"

Reviewed by Tilman Bayer

Last month, the Wikimedia Foundation published the results of its annual "Community Insights", a global survey of 2,500 Wikimedians (including active editors and program leaders) conducted in September/October 2020.

For the first time, the survey asked about race and ethnicity, confined to two countries where such categories are widely used and accepted: the US (195 responses) and the UK (67). Among US contributors, the findings show striking gaps among Black/African American editors (0.5% compared to 13% among the general population) and American Indian/Alaska Native editors (0.1% vs. 0.9%). Hispanic/Latino/a/x editors show a lesser but still large gap (5.2% vs. 18%). White/Caucasian (89% vs. 72%) and especially Asian Americans (8.8% vs. 5.7% among the general population) are over-represented among contributors in the US.
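To make the size of these gaps easier to compare, the percentages quoted above can be turned into a simple representation ratio (share among editors divided by share in the general population). This is only an illustrative sketch: the ratio is not a measure used in the survey report, and the names `shares` and `representation_ratio` are made up for this example.

```python
# Percentages as quoted above: (share among US editors, share in US population).
shares = {
    "Black/African American": (0.5, 13.0),
    "American Indian/Alaska Native": (0.1, 0.9),
    "Hispanic/Latino/a/x": (5.2, 18.0),
    "White/Caucasian": (89.0, 72.0),
    "Asian American": (8.8, 5.7),
}


def representation_ratio(editor_pct, population_pct):
    """Return editor share divided by population share.

    A value below 1 means under-representation, above 1 over-representation.
    """
    return editor_pct / population_pct


ratios = {
    group: representation_ratio(editor_pct, population_pct)
    for group, (editor_pct, population_pct) in shares.items()
}

# Print groups from most under-represented to most over-represented.
for group, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{group}: {ratio:.2f}")
```

Read this way, a ratio of 1 would mean parity with the general population. Because the survey categories overlap and the US sample is small, the ratios should be taken as rough indications only.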

Survey findings about the race of contributors in the US, compared to the overall population (categories are overlapping and thus sum to more than 100%)

In the UK, the survey similarly found "significant underrepresentation" of Black or Black British editors (0.0% vs. 3.0% in the general population), whereas the percentage of white editors was close to the general population.

A racial or ethnic gap among Wikipedians in the US has long been anecdotally observed or conjectured (see e.g. this 2010 thread which also contained some informed speculation about possible reasons), but this marks the first time that it is backed by empirical survey data. The lack of earlier data is related to the fact that the Wikimedia Foundation's annual surveys are global in nature and there are no internationally accepted definitions of race and ethnicity (or worse, survey questions of this nature would be considered offensive in many countries) [1][2].

Illustration from the 2018 "Pipeline of Online Participation Inequalities" paper

Correspondingly, there has been little research about possible reasons for such gaps. An exception is the 2018 paper "The Pipeline of Online Participation Inequalities: The Case of Wikipedia Editing", which we previously reviewed here with a more general focus, but which also contains some insights about reasons why African Americans contribute at a lower rate.

While this study did not find a significant racial disparity among the earlier parts of the pipeline (measuring whether survey respondents had heard of Wikipedia or had visited Wikipedia), when it comes to "know[ing] that Wikipedia can be edited [...] age, gender, and several racial/ethnic identity categories (Black, Hispanic, Other) emerge as salient explanatory factors where they did not before. Income no longer explains the outcome. Education level associates strongly with knowing Wikipedia can be edited." However, racial and ethnic background factors "do not associate with who contributes content" (i.e. the last part of the pipeline). This points to raising awareness of Wikipedia's editability as a potential strategy for reducing these gaps, although this would not address what the authors highlight in their overall conclusions as "the importance of education and Internet skills" for closing knowledge gaps.

Conversely, a 2020 paper answered the question "What drives Black contributions to Wikipedia?" with the following conclusions, based on a survey of 318 Black Wikipedia editors in the US:

Survey respondents were recruited in 2017 via Qualtrics "based on predefined characteristics such as individuals who identified as Black/African American, resided in the United States, and had made at least one edit/contribution to Wikipedia's English edition over the last three years". Interestingly, the resulting sample of 318 Black Wikipedia contributors was much larger than that of the WMF Community Insights survey, which (barring some extreme downward adjustments during the weighting process) appears to have consisted of a single Black respondent in the sample, considering the stated percentage of 0.5% among 195 US-based respondents.
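The "single Black respondent" inference can be checked with a few lines of arithmetic. The sketch below assumes unweighted percentages rounded to one decimal place, which is exactly the caveat about weighting noted above; the function name `counts_matching_share` is invented for this illustration.

```python
def counts_matching_share(sample_size, reported_pct, decimals=1):
    """List the respondent counts whose unweighted share of the sample
    rounds to the reported percentage."""
    return [
        count
        for count in range(sample_size + 1)
        if round(100 * count / sample_size, decimals) == reported_pct
    ]


us_respondents = 195   # US-based responses, as stated in the survey summary
reported_pct = 0.5     # reported share of Black/African American editors

# Direct estimate: 0.5% of 195 is just under one person.
estimated_count = us_respondents * reported_pct / 100

# Which whole-number counts are compatible with the rounded figure?
matching_counts = counts_matching_share(us_respondents, reported_pct)

print(estimated_count)
print(matching_counts)
```

Under these assumptions, one respondent is the only whole-number count compatible with the reported 0.5%. If the published figure was weighted, the underlying count could differ.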


Wiki Workshop 2021

Report by Tilman Bayer

The annual Wiki Workshop, part of The Web Conference, took place as an online event on April 14, 2021, featuring the papers listed below. The organizers reported that 78% of attendees were non-native English speakers, 66% attended Wiki Workshop for the first time, 53% were academic researchers and 34% students.

"References in Wikipedia: The Editors' Perspective"

From the abstract:

"Do I Trust this Stranger? Generalized Trust and the Governance of Online Communities"

From the abstract:

"Negative Knowledge for Open-world Wikidata"

From the abstract:

"A Brief Analysis of Bengali Wikipedia's Journey to 100,000 Articles"

From the abstract:

"WikiShark: An Online Tool for Analyzing Wikipedia Traffic and Trends"

From the abstract:

"Tracing the Factoids: the Anatomy of Information Re-organization in Wikipedia Articles"

From the abstract:

"Wikidata Logical Rules and Where to Find Them"

From the paper (an extended abstract):

"Simple Wikidata Analysis for Tracking and Improving Biographies in Catalan Wikipedia"

From the abstract:

Related code: https://github.com/toniher/wikidata-pylisting

"Structural Analysis of Wikigraph to Investigate Quality Grades of Wikipedia Articles"

From the abstract:

"Towards Open-domain Vision and Language Understanding with Wikimedia"

From the abstract:

"Language-agnostic Topic Classification for Wikipedia"

From the abstract:

See also: online demo, data dumps, model details

"Fast Linking of Mathematical Wikidata Entities in Wikipedia Articles Using Annotation Recommendation"

From the abstract:

"ShExStatements: Simplifying Shape Expressions for Wikidata"

From the abstract:

"Inferring Sociodemographic Attributes of Wikipedia Editors: State-of-the-art and Implications for Editor Privacy"

From the abstract:

"The Language of Liberty: A preliminary study"

From the paper:

"Information flow on COVID-19 over Wikipedia: A case study of 11 languages"

From the abstract:

"Towards Ongoing Detection of Linguistic Bias on Wikipedia"

From the abstract:

Analysis of two million AfD (Articles for Deletion) discussions

From the abstract:

Among the findings are that "Editors who joined before 2007 tend to overwhelmingly belong to the more central parts of the network" and that "user preferences [for keep or delete] are relatively stable over time for ... more central editors. However, despite the overall stability of trajectories, we also observe a substantial narrowing of opinions in the early period of an AfD reviewer tenure. ... Strong deletionists exhibit the least amount of change, suggesting the possibility of lower susceptibility, or higher resistance, to opinion change in this group." Overall though, the authors conclude that "differences between inclusionists and deletionists are more nuanced than previously thought."

From the abstract:

"Wikipedia Editor Drop-Off: A Framework to Characterize Editors' Inactivity"

A figure from the paper, showing "The different states of drop-off related to activity and their possible transitions"

From the abstract:

The paper is part of an ongoing research project funded by an €83,400 project grant from the Wikimedia Foundation. Some related code can be found at https://github.com/WikiCommunityHealth/.

Wikimedia Foundation Research Award of the Year

Besides presentations about the papers listed above, the Wiki Workshop event also saw the announcement of the first "Wikimedia Foundation Research Award of the Year" ("WMF-RAY", cf. call for nominations), with the following two awardees:

"Content Growth and Attention Contagion in Information Networks: Addressing Information Poverty on Wikipedia" (also presented at last year's Wiki Workshop), a paper which according to the laudators

"Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages" and Masakhane (which describes itself as "A grassroots NLP community for Africa, by Africans"). The paper

While the research does not seem to have concerned Wikipedia directly, the laudators find it an "inspiring example of work towards Knowledge Equity, one of the two main pillars of the 2030 Wikimedia Movement Strategy" and expect the project's success

Consistent with its title, the paper features an impressive list of no fewer than 48 authors (with the cited eprint having been submitted to arXiv by Julia Kreutzer of Google Research).

Briefly

References

Supplementary references and notes:


Uses material from the Wikipedia article Wikipedia:Wikipedia Signpost/2021-06-27/Recent research, released under the CC BY-SA 4.0 license.