Wikipedia:Wikipedia Signpost/2021-07-25/Recent research
Gender bias and statistical fallacies, disinformation and mutual intelligibility
A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
New study claims to have found quantitative proof of gender bias in Wikipedia's deletion processes – but has it?
Almost half a century ago, officials at the University of California, Berkeley became concerned about apparent gender bias against women at their institution's graduate division: 44% of male applicants had been admitted for the fall 1973 term, but only 35% of female applicants – a large and statistically significant difference in success rates. The university asked statisticians to look into the matter. Their findings, published with the memorable subtitle
became famous for showing that such a disparity not only failed to provide evidence for the suspected gender bias, but that, on closer examination, the data in that case even showed a "small but statistically significant bias in favor of women" (to quote from the Wikipedia article about the underlying paradox). The Berkeley admissions case has since been taught to generations of statistics students as a caution against the fallacy it illustrates.
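The paradox at work here can be sketched in a few lines of Python. The numbers below are illustrative only (not the actual 1973 Berkeley figures), chosen so that women are admitted at a higher rate within each department yet a lower rate in the aggregate, because they applied disproportionately to the more competitive department:

```python
# Hypothetical admissions data demonstrating Simpson's paradox.
# department: {gender: (admitted, applied)}
applications = {
    "easy": {"men": (240, 400), "women": (65, 100)},
    "hard": {"men": (10, 100), "women": (60, 400)},
}

def rate(admitted, applied):
    return admitted / applied

# Within each department, women are admitted at a HIGHER rate ...
for dept, by_gender in applications.items():
    assert rate(*by_gender["women"]) > rate(*by_gender["men"])

# ... yet the aggregate rate for women is LOWER, because women applied
# disproportionately to the more competitive ("hard") department.
def overall(gender):
    admitted = sum(v[gender][0] for v in applications.values())
    applied = sum(v[gender][1] for v in applications.values())
    return admitted / applied

print(overall("men"), overall("women"))  # 0.5 vs 0.25
```

The aggregate comparison (50% vs. 25%) points in the opposite direction from every within-department comparison, which is why the raw disparity alone proves nothing about bias.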
But not, apparently, to Francesca Tripodi, a sociology researcher at the UNC School of Information and Library Science, who received a lot of attention on social media over the past month (and was interviewed on NPR by Mary Louise Kelly) about a paper published in New Media & Society, titled "Ms. Categorized: Gender, notability, and inequality on Wikipedia". Her summary of one of the two main quantitative results mirrors the same statistical fallacy that had tripped up the UC Berkeley officials back in 1973:
And while Tripodi correctly points out that this overall discrepancy between articles about male and female subjects is statistically significant (just like the one in the Berkeley case), further arguments in the paper veer towards p-hacking (a term for a kind of data misuse that consists of repeating an experiment or measurement multiple times, cherry-picking those outcomes that resulted in a significant result in the expected direction, and dismissing those that did not):
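The multiple-comparisons mechanism underlying p-hacking can also be demonstrated with a short simulation (a sketch with made-up parameters, not a re-analysis of the paper's data): under the null hypothesis a p-value is uniformly distributed, so testing the same question on many subgroups and reporting only the "hits" manufactures significance.

```python
import random

random.seed(0)

# If twenty truly null subgroup comparisons are each tested at the
# conventional 0.05 level, how often does at least one come out
# "significant" purely by chance?
def experiment(n_subgroups=20, alpha=0.05):
    # one uniform p-value per subgroup comparison (all truly null)
    return any(random.random() < alpha for _ in range(n_subgroups))

trials = 10_000
false_alarm_rate = sum(experiment() for _ in range(trials)) / trials
print(f"chance of at least one 'significant' subgroup: {false_alarm_rate:.2f}")
# analytically: 1 - 0.95**20 ≈ 0.64
```

With a roughly 64% chance of a spurious hit, a handful of significant subgroup results is exactly what one would expect even in the complete absence of any real effect.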
Does this mean that disparities such as the one found by Tripodi here can never be evidence of gender bias? Of course not. But (again quoting from the aforementioned Wikipedia article), it requires that "confounding variables and causal relations are appropriately addressed in the statistical modeling" (with several methods being used for this purpose in bias and discrimination research) – something that is entirely lacking from Tripodi's paper. And it is easy to think of several possible confounders that might have a large effect on her analysis.
- For example, the ratio of female article subjects among biographies of living people, or those born within, say, the last 50–70 years, is much larger than the ratio among English Wikipedia's biographies as a whole (as would be expected from historical considerations) – and at the same time, it is very plausible that issues of notability are less likely to be settled for living subjects.
- Another plausible confounder is the age of the article itself: More recently created articles are presumably more likely to be scrutinized for notability (for example as part of the New pages patrol) than those that have survived for many years already. And as Tripodi points out herself, "the number of biographies about women on English-language Wikipedia rose from 16.83% to 18.25%" in the timespan analyzed (plausibly at least in part thanks to "activists [who] host 'edit-a-thons' to increase the visibility of notable women" and increase this ratio, as highlighted in the paper's abstract). But this indicates that the female ratio of newly created articles was much higher during that time than in the existing article corpus that forms the reference point of Tripodi's headline comparison.
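To see how a confounder like article age could generate the headline disparity on its own, consider a toy model (all numbers hypothetical, not drawn from Tripodi's data) in which new and old articles are nominated for deletion at identical rates regardless of gender, but women's biographies are concentrated among newer articles:

```python
# Hypothetical corpus: women's biographies skew newer, matching the
# rising share of women's biographies noted in the paper.
# stratum: {gender: article count}
corpus = {
    "new": {"men": 2_000, "women": 1_000},
    "old": {"men": 20_000, "women": 4_000},
}
# Nomination rates depend ONLY on article age, not on gender.
nomination_rate = {"new": 0.10, "old": 0.02}

def aggregate_rate(gender):
    nominated = sum(corpus[s][gender] * nomination_rate[s] for s in corpus)
    total = sum(corpus[s][gender] for s in corpus)
    return nominated / total

print(f"men:   {aggregate_rate('men'):.3f}")    # ~0.027
print(f"women: {aggregate_rate('women'):.3f}")  # 0.036
```

Despite identical within-stratum nomination rates, the aggregate rate for women comes out a third higher than for men – exactly the kind of Simpson-style artifact that stratified statistical modeling is meant to rule out before inferring bias.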
It is also noteworthy that several previous research publications that started from concerns similar to Tripodi's (e.g. that the gender gap among editors – which is very well documented across many languages and Wikimedia projects, see e.g. this reviewer's overview from some years ago – would cause a gender bias in content too) but applied more diligent methods, e.g. by attempting to use external reference points as a "ground truth" against which to compare Wikipedia's coverage, ended up with unexpected results:
- Consider, for example, the paper we previously reviewed here: "Notable women "slightly overrepresented" (not underrepresented) on Wikipedia, but the Smurfette principle still holds".
- Or a paper titled "Exploring Systematic Bias through Article Deletions on Wikipedia from a Behavioral Perspective", which started out from the observation that "Malicious forms of bias towards women on Wikipedia has been well-documented in numerous accounts of online harassment" but found contrary to the authors' expectations "that content of supposed interest to men is more likely to be nominated for CSD [which] runs contrary to common ideas regarding biases in content [...] Bluntly, there does not appear to be significant qualitative differences in the rates of AfD or CSD for articles of supposed interest to women compared to articles of supposed interest to men."
To be sure, other papers found evidence for bias in expected directions, for example in the frequency of words used in articles about women. But overall, this shows that Tripodi's conclusions should be regarded with great skepticism.
Tripodi's second quantitative result, the "miscategorization" concept highlighted in the paper's title, is likewise more open to interpretation than the paper would like one to believe. The author found that once nominated for deletion, articles about women have a higher chance of surviving than articles about men. She interprets this as evidence of sexist bias against women (apparently taking the eventual AfD outcome as a baseline, i.e. postulating the English Wikipedia community as a whole as a non-sexist neutral authority against which to evaluate the individual AfD nominator's action). Other researchers have taken the exact opposite approach, under which a finding that pages about women were more likely to be deleted than pages about men would have counted as evidence of bias against women – e.g. Julia Adams, Hannah Brückner and Cambria Naslund in the paper reviewed here (which also, as Tripodi acknowledges, "found that women academics were not more likely to be deleted" in a sample of 6,323 AfD discussions – in contrast to Tripodi's sample, where women in general were deleted less often than men).
The quantitative results only form part of this mixed methods paper though. In its qualitative part, Tripodi draws from extensive field research, namely
Tripodi's report about the impressions and frustrations shared by these participants is well worth reading. For example:
Still, even the validity of some of the paper's qualitative observations has been questioned by Wikipedians. For example, Tripodi opens her paper with a misleading summary of the Strickland case:
However, this deletion within minutes did not at all rely on examining "evidence of Dr. Strickland’s professional endeavors" – rather, it was done based on the "Unambiguous copyright infringement" speedy deletion criterion, as can be readily inferred from the revision history that Tripodi cites here.
It is worth noting that the author of this deeply flawed paper has twice testified before the U.S. Senate Judiciary Committee in the past, on different but somewhat related matters (in particular, bias in search engine results).
Briefly
- See the page of the monthly Wikimedia Research Showcase for videos and slides of past presentations.
- This edition of Recent research/the Wikimedia Research Newsletter marks the tenth anniversary of our inaugural issue. Thank you for reading and contributing over this decade, and consider following the WikiResearch feeds on Twitter, Facebook or Mastodon for more frequent updates – the Twitter account celebrated the milestone of 15,000 followers earlier this month.
Other recent publications
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.
"Wikipedia successfully fended off disinformation" on COVID-19
From the abstract:
"The Influence of Multilingualism and Mutual Intelligibility on Wikipedia Reading Behaviour – A Research Proposal"
From the abstract:
Using Wikidata to help organize the COVID-19 research literature
From the abstract:
"Unveiling the veiled: Wikipedia collaborating with academic libraries in Africa in creating visibility for African women through Art+Feminism Wikipedia edit-a-thon"
From the abstract:
How much does Wikipedia really diverge from traditional, "authoritative" encyclopedias?
From the abstract:
The author is an experienced editor on the English Wikipedia (as User:Rhododendrites) and former longtime employee of the Wiki Education Foundation.
References
- Supplementary references and notes:
Discuss this story
I wrote to Ms Tripodi on 28 June, pointing out factual errors in her paper (different to those detailed above), regarding her analysis of the biography of Lois K. Alexander Lane, saying, in part:
At the time of writing I have not had a reply (other than an automated out-of-office acknowledgement saying she would return on 6 July). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:42, 26 July 2021 (UTC)[reply]
I did / am doing a survey of (so far) 350 articles of all types from the "random article" button. Including that was exploring the mix of male vs. female, recent (active in the last 15 years) vs non-recent, and also, because sports bios are by far the most prevalent category, sports vs. non-sports. The breakdowns are:
IMO the last split best dials out the realities of history and sports and best addresses any Wikipedia systemic bias question regarding article topics. North8000 (talk) 11:51, 27 July 2021 (UTC)[reply]
Quoting:
My own conclusions from the limited work I did are that
BTW IMHO the fact that Wikipedia is such a mean and vicious battleground environment for editors does introduce a systemic bias against female editors. But that's a different question. North8000 (talk) 14:24, 28 July 2021 (UTC)[reply]
No doubt others have commented on this elsewhere, and it is alluded to in some of the comments above, but to what extent does Wikipedia replicate systemic gender bias versus to what extent does it exacerbate that bias? I suspect for many (most?) editors the first is a sort of natural, shrug of the shoulders, that's obvious, response. However, to my mind, there are ways in which the nature of contributing to Wikipedia in a long term, consistent manner, provides far more opportunity for men, in particular older, professionally educated men, the opportunity to contribute. Our culture/principle of volunteerism (which is venerated and defended with as close to complete consensus of any principle here) per se provides more opportunity for men; every single study shows a gender inequality with regard to access to free time. Access to technology, wages, income in retirement; all these mean men are more likely to have time and means to contribute. The more one moves away from the Euro-American world, the more stark these differences become. So, I find this response somewhat missing the forest for the trees; I'm not saying there's a simple solution, but I think we should welcome attempts which try to understand how Wikipedia processes exacerbate gender inequality, rather than simply dismiss the problem as beyond our capacities to confront (or worse, deny there is a problem). Regards, --Goldsztajn (talk) 03:49, 29 July 2021 (UTC)[reply]
Update: I did / am doing a survey of (so far) 500 articles of all types from the "random article" button. Including that was exploring the mix of male vs. female, recent (active in the last 15 years) vs non-recent, and also, because sports bios are by far the most prevalent category, sports vs. non-sports. The breakdowns are:
IMO the last split best dials out the realities of history and sports and best addresses any Wikipedia systemic bias question regarding article topics. North8000 (talk) 15:10, 29 July 2021 (UTC)[reply]