Wikipedia:Wikipedia Signpost/2022-06-26/Recent research

Recent research

Wikipedia versus academia (again), tables' "immortality" probed

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

"The Secret Life of Wikipedia Tables"

This paper presents an analysis of "the entire history of all 3.5 M tables on the English Wikipedia for a total of 53.8 M table versions." In an accompanying conference poster, the researchers summarize their findings as follows:

The paper itself presents various interesting results in slightly more scholarly detail.

"Number of tables and pages created per month" (from the paper)

The authors note Wikipedia contained "almost no tables" in its first three years, after which:

As an aside, the paper does not mention Wikidata (a sister project of Wikipedia, launched in 2012, that aims to provide structured machine-readable data), nor the more recent efforts to store tabular data on Wikimedia Commons for use on Wikipedia and elsewhere. While there are tools that generate Wikipedia tables automatically from the structured data available on Wikidata, they are not yet widely used.


"Histogram of the maximum table count per page" (from the paper, omitting pages without any tables)

A "histogram of the maximum number of tables that ever existed simultaneously on a Wikipedia article" demonstrates that

These results appear to provide the empirical foundation for the party emoji in the conference poster (above).

The racecar emoji refers to various results on how often tables are changed. Writing from the perspective of reusing information from tables outside of Wikipedia, the authors stress that "in a one-month-old snapshot, already 4.4% of tables are outdated."

"Table freshness over time" (violin plot from the paper)

A violin plot of table "freshness" (i.e. time since the table's last update) over table age (i.e. time since the table's creation) shows that

The authors note that the distribution of the number of updates per table has "a large skew", with one outlier being "a table on social networking websites that was updated more than 10,000 times during its lifetime. At least 1,310 tables were each updated more than 1,000 times during their lifetimes."
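The two quantities behind the violin plot, and the "outdated" share for a snapshot, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: each table is represented as a plain list of revision timestamps, the function names `age_and_freshness` and `share_outdated` are hypothetical, and "outdated" is assumed to mean that a table present in the snapshot was edited at least once after the snapshot was taken.

```python
from datetime import datetime, timedelta


def age_and_freshness(revision_times, now):
    """Return (age, freshness) of one table as timedeltas.

    age       = time since the table's creation (its first revision)
    freshness = time since the table's last update (its latest revision)
    """
    created = min(revision_times)
    last_updated = max(revision_times)
    return now - created, now - last_updated


def share_outdated(tables, snapshot_time, now):
    """Fraction of tables in a snapshot that were edited after it was taken.

    Assumption: a table counts as "outdated" if it existed at snapshot_time
    and has at least one revision in the interval (snapshot_time, now].
    """
    existing = [t for t in tables if min(t) <= snapshot_time]
    if not existing:
        return 0.0
    changed = [t for t in existing if any(snapshot_time < r <= now for r in t)]
    return len(changed) / len(existing)


# Made-up example data: two tables, one edited after the snapshot.
now = datetime(2022, 6, 26)
snapshot = now - timedelta(days=31)
tables = [
    [datetime(2020, 1, 1), datetime(2022, 6, 20)],  # edited 6 days before `now`
    [datetime(2021, 1, 1)],                         # never edited after creation
]
print(age_and_freshness(tables[0], now))
print(share_outdated(tables, snapshot, now))
```

With real revision histories in place of the made-up timestamps, the same calculation over a one-month-old snapshot would give a figure comparable to the 4.4% quoted above, provided the paper's definition of "outdated" matches the assumption made here.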

The paper also examines schema changes of existing tables (e.g. the addition, removal or renaming of columns). It finds, for instance, that "about half of all tables never change their schema", and that schemata can evolve into various specializations, as in this example visualizing "genes" shared by around 500 football-related tables:

"Example of schemata evolving over time" (from the paper): "This particular plot shows a cluster of schemata that all contain information about league results of football teams. There are almost 500 tables for which at least one of the snapshots had one of the Schemata 2–7."

Lastly, the conference poster's "immortality" claim is quantified as follows:

See also our earlier coverage of related research: "Neural Relation Extraction on Wikipedia Tables for Augmenting Knowledge Graphs", "TableNet: An Approach for Determining Fine-grained Relations for Wikipedia Tables", and "Methods for Exploring and Mining Tables on Wikipedia".

Papers further explore dynamic between Wikipedia and academia

The June 2021 issue of "She Ji: The Journal of Design, Economics, and Innovation" featured several articles examining Wikipedia with a focus on its relation to academia, including contributions by longtime Wikipedians Piotr Konieczny (User:Piotrus) and Dariusz Jemielniak (User:Pundit).

Konieczny's first contribution, titled "From Adversaries to Allies? The Uneasy Relationship between Experts and the Wikipedia Community", provides a historical overview and literature review, concluding that "Collaborating with Wikipedia is increasingly common in academia, though barriers remain" and that "Wikipedia’s anti-elitist culture and academia’s anti-amateur culture are still at odds." Konieczny commiserates with his "fellow experts" who try to contribute to Wikipedia, but holds up a mirror:

Furthermore, Konieczny reminds academics who complain about hostile Wikipedians about their own power structures:

In a short commentary, Jemielniak agrees with Konieczny's analysis of these two polarized stances as "the underlying cultural problem", and calls for "institutional support [for Wikipedia] from beyond the Wikimedia Foundation or Wiki Education Foundation", e.g. by "counting [Wikipedia editing] towards tenure reviews at universities."

In another response, titled "Wikipedians among Us: From Allies to Reformers", Kara Kennedy also largely agrees with Konieczny's observations, but "sheds light on some of [his] oversights, including the still-present issues of bias and gaps in content and quality due to a lack of diversity in editorship".

In a third response, the journal's editor-in-chief Ken Friedman (User:Kenfriedman0) argues that Wikipedia "suffers from the internally-focused cultural patterns among Wikipedians that prevent the improvements needed for a high quality reference work". Among other observations, he focuses on the Wikimedia Foundation's statement (in its fundraising messages) that 98% of Wikipedia readers do not donate, claiming that "This admission contains a message that the Wikimedia Foundation doesn’t seem to understand. When only 2% of the audience for a widely used not-for-profit project is willing to support the project they use, this suggests that the project might not survive as a commercial venture."

In the concluding piece, Konieczny responds to the three comments, joining Jemielniak and Kennedy in making "The Case for Institutional Support: It’s High Time for Governments and University Administration to Actively Support Wikipedia". He devotes some space to Friedman's recollections of his own negative experiences of trying to contribute to Wikipedia. Examining the on-wiki record, Konieczny notes that the only dispute appears to have been about "whether to insert several names on the list of Fluxus members—an art movement Friedman was involved in both as artist and later, scholar—or not," whereas Friedman's larger contributions all appear to have been accepted. Konieczny argues that "[t]his illustrates the classic notion of negativity bias: we are much more likely to remember the bad experiences than the good ones, even if the latter are more common".


Briefly

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.

"The Wikipedia Global Consciousness Index: A Measurement of the Awareness and Meaning of the World-as-a-Whole"

From the abstract:


"Wikipedia in the anti-SOPA protests as a case study of direct, deliberative democracy in cyberspace"

From the abstract:

See also our review of an earlier paper by the same author, "Wikipedia’s SOPA Strike considered as international political movement", and his own review of a 2012 paper, "SOPA blackout decision analyzed".

References


Uses material from the Wikipedia article Wikipedia:Wikipedia Signpost/2022-06-26/Recent research, released under the CC BY-SA 4.0 license.