Wikipedia:Village pump (miscellaneous)
For questions about a wiki that is not the English Wikipedia, please post at m:Wikimedia Forum instead.
Discussions are automatically archived after remaining inactive for 8 days.
Conflicting(?) dates
Hello. I am currently working on editing an article, and the sources are giving me a bit of a headache. For context, the article is Juana Belén Gutiérrez de Mendoza. At some point between 1913 and 1916, Gutiérrez was imprisoned for 10 months. Half of my sources say that such an imprisonment happened in 1913 (2-3 sources: specifically, one implies a 1913 date but does not state it explicitly). The other half (3 sources) say that such an imprisonment happened in 1916. I believe that these are referring to the same incident, since the sources that mention the 1913 date do not refer to a 1916 imprisonment and vice versa. The amount of time spent in prison is also the same between the alleged 1913 imprisonment and the 1916 imprisonment: 10 months. The difference between 1913 and 1916 is consequential, as different individuals held power during these periods. To be more specific, about half of the sources claim that it was Victoriano Huerta that imprisoned her, which is consistent with the 1913 date. The other half claim that it was Venustiano Carranza who imprisoned her, which is consistent with the 1916 date. It's also possible that I'm mistaken, and these were actually two different instances.
Right now, I have adopted the latter date, since there is technically one more source that fully supports it. Here's my current approach:
What do people think? This is driving me nuts. Spookyaki (talk) 22:16, 23 April 2025 (UTC)
- Do any of your sources cite each other or another identified source for this point of information? CMD (talk) 01:09, 24 April 2025 (UTC)
- Okay, so looked into it. Here's the rough breakdown:
- 1913
- Villaneda (1994, actually pretty clear)—Citing primary sources, excerpt included in text
- "For this reason, I had to be in Mexico City on August 25, 1913. I left for the capital, and what we had suspected was beginning to be confirmed. Mr. Palacios had learned the route, the itinerary we followed on our excursions, and when I tried to return by the same route, in Joquizingo I found out that the pass was under surveillance and that I was expected. It was almost necessary to return to camp, but I had to be in Mexico City by August 25. 'I arrived in Mexico City on August 25, at ten in the morning... Among the people helping me was Mrs. Manuela Peláez, who told me about an individual, a friend of hers, a schoolmate, who ran a newspaper called Anáhuac, and who wanted to help the Southern Revolution...' Manuela Peláez invited me to meet her at her house on September 4 at five in the afternoon to speak once more with her friend... I was punctual for the meeting; But instead of Manuela's friend, Francisco Chávez showed up with his entire entourage of reserved seats..."
- "The police carried out a new raid on agitators, obeying the instructions of the Ministry of the Interior. The head of the Security Commissions, Francisco Chávez, accompanied by several secret agents, arrested Mrs. Juana Gutiérrez de Mendoza yesterday morning. She was engaged in propaganda for the Zapatista movement. When her house was searched, several safe-conduct passes signed by Emiliano Zapata, the Zapatista anthem, and other documents were found."
- Javien (2005)—Citing a source that I don't have, published in 1983
- Rubio (2020)—Citing Javien
- 1916
- Porter (2003)—Not directly cited
- Devereaux Ramírez (2015)—Weirdly citing Villaneda, which seems to contradict the date
- Valles Salas (2015)—Not directly cited
- Spookyaki (talk) 01:51, 24 April 2025 (UTC)
- If mainly reliable sources don't agree about something and can't be reconciled then we should be honest and tell the reader that sources disagree, so we don't know. Phil Bridger (talk) 07:37, 24 April 2025 (UTC)
- Spookyaki, no need to go nuts. Totally agree with Phil Bridger. It goes to our basic role as an encyclopedia, that is, we are a WP:TERTIARY source, which reflects the state of WP:SECONDARY sources. If the secondary sources do not agree, then we reflect that, and summarize the majority and minority views. See WP:DUEWEIGHT. Mathglot (talk) 09:01, 24 April 2025 (UTC)
- If you want to reflect that the sources disagree but not derail the article with a discussion of sourcing, this is the perfect time to use a footnote (not a reference). See MOS:NOTES and Wikipedia:When sources are wrong#Approach_3:_Get_it_right_and_add_a_footnote for some examples. SnowFire (talk) 20:09, 25 April 2025 (UTC)
- Thanks everyone for responding! I think I have it worked out in this particular case. However, perhaps I should get a bit more specific about what is causing me problems, in case anyone has any thoughts about how I should approach instances like this in the future.
- My main issue is that I'm not sure where it would be best to place the information so that the order of events is clear—a writing issue, primarily. For example, let's say there's a paragraph that includes the following events:
- 1. Something that happened in 1911.
- 2. Something that happened in 1912.
- 3. Something that happened in 1915.
- 4. Something that happened in 1920.
- And then something that could have happened anytime between 1912 and 1930. The evidence is not stronger or weaker for any particular date, and to complicate things even further, let's say it could have been caused by event 1, 2, 3, 4, or none of them. Where should this information go? How would you approach writing a convoluted timeline like this in a way that is as clear as possible? Spookyaki (talk) 20:28, 25 April 2025 (UTC)
- FropFrop, didn't you have a similar situation at Daisy Bates (author) recently? Maybe you'd have some advice. WhatamIdoing (talk) 03:23, 1 May 2025 (UTC)
- I did indeed. Normally I'd recommend that both dates are given if the sources are of similar quality, with an explanation that different writers give different dates. If the situation is similar to the one with the Daisy Bates article, where the disagreement in dates was due to some authors following Bates's semi-autobiographical work, then I'd recommend just presenting the better researched dates.
- FropFrop (talk) 02:07, 3 May 2025 (UTC)
Meaningful intervals for edit size histogram
With T236087 XTools is going to get a histogram of a user's edit sizes soon. This will be a bar chart. For screen real estate reasons, it's max ~12 bars. The idea is that each bar gives the number of edits in a certain size interval. My question is: which intervals do you think we should use? The current code uses 200-width intervals (0-200, 200-400, &c), up to 1800-2000, and lumps the rest into >2000.
The issue with fixed-width intervals is they don't allow much granularity for smaller edits (e.g., separating the +1 typo fix from the +120 paragraph addition). I was thinking also of perhaps something exponential like 0-20, 20-40, 40-80, 80-160, 160-320, 320-640, 640-1280, 1280-2560, >2560. What do you think could be more meaningful to users, and why? Welcoming suggestions. Thanks, — Alien333 16:33, 28 April 2025 (UTC)
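For anyone who wants to try candidate intervals on their own edit sizes, here is a minimal sketch of the exponential binning described above. It is not the XTools code: the names `EDGES`, `bin_label` and `histogram` are hypothetical, sizes are treated as absolute values, and a size that sits exactly on an edge is counted in the lower bin. All three choices are assumptions.

```python
from bisect import bisect_left
from collections import Counter

# Bin edges from the exponential proposal: 0-20, 20-40, 40-80, ..., >2560.
EDGES = [0, 20, 40, 80, 160, 320, 640, 1280, 2560]


def bin_label(index: int) -> str:
    """Return the display label for bin number `index`."""
    if index == len(EDGES) - 1:
        return f">{EDGES[-1]}"
    return f"{EDGES[index]}-{EDGES[index + 1]}"


def histogram(edit_sizes):
    """Count edits per bin, using the absolute size of each edit (assumption)."""
    counts = Counter()
    for size in edit_sizes:
        # bisect_left puts a size equal to an edge into the lower bin,
        # so 2560 lands in "1280-2560" and only larger sizes reach ">2560".
        index = max(bisect_left(EDGES, abs(size)) - 1, 0)
        counts[index] += 1
    return {bin_label(i): counts.get(i, 0) for i in range(len(EDGES))}
```

Swapping in a different `EDGES` list (for example the fixed 200-wide steps) makes it easy to compare how the same edits spread across the two schemes.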
- Just looking at my most recent mainspace contributions, the <10 typo fix or minor c/e shows up, then from 10-100 there's larger copyedits, adding categories, and formatting tweaks. The adding text+adding source seems to start from perhaps 200. I have a small number of +2000 edits which seem meaningfully distinct from say reverting page blanking vandalism, so I'd put the final bin a bit higher. CMD (talk) 02:32, 29 April 2025 (UTC)
- Thanks for the answer! When you say "higher", where would that be? 3K? 4K? 10K? Just asking for a general order of magnitude. — Alien333 09:09, 29 April 2025 (UTC)
- Probably something like 5K or 10K? Maybe someone has an existing histogram this could be based on. CMD (talk) 12:15, 29 April 2025 (UTC)
- What about negatives? A few years ago I looked at my edits (in mainspace) and found that my median change was −3 bytes. —Tamfang (talk) 23:33, 29 April 2025 (UTC)
- This would be in absolute value, i.e. putting -1 with +1. Else it takes twice as much width. We could do both positive and negative, but then we'd have pretty low granularity (could only have about 6 bars on either side). — Alien333 05:43, 30 April 2025 (UTC)
- Could you split the bars in two? Top colour is positive and bottom colour is negative. 80.76.122.163 (talk) 08:45, 1 May 2025 (UTC)
- We could, I think. Question would be, what do we do with 0? is it positive or negative? — Alien333 09:25, 1 May 2025 (UTC)
- Centered/split? I agree that positive/negative above/below the horizontal axis was also where my mind went immediately. -- Avocado (talk) 22:27, 4 May 2025 (UTC)
- Yup, that's done (see discussion below). Currently the zero is put between the additions and the x-axis in the 0-10 interval, in a separate colour.
- Splitting the zero bar (as in half-above and half-below) is not doable with our library without some meh hacks I'd really like to avoid. — Alien333 09:49, 5 May 2025 (UTC)
- I like the exponential (or semi-log?) better than a straight division. Most of our edits are actually small.
- What I really wish is that we could get numbers for changes to readable prose (e.g., not fiddling with whitespace and template formatting). WhatamIdoing (talk) 03:25, 1 May 2025 (UTC)
- Sadly, that's just not doable on a statistical scale. The best possible in reasonable time would be a bit below 100 edits, which is not a lot.
- If you're ready to wait something like at least 30 seconds for it, we could make a separate tool that does this.
Update: now looks like this. Other suggestions? — Alien333 13:29, 1 May 2025 (UTC)
- The link doesn't work.
- Instead of a separate tool (I greedily want all the tools, but would I use it often enough to justify your efforts? I'm not sure, in this case), I wonder if it would be possible to add Special:Tags to non-prose changes. Something like the "Undo" tag, which is calculated later? WhatamIdoing (talk) 03:53, 2 May 2025 (UTC)
- Well, my bad for the link. This one should work.
- Adding tags is beyond our capacity (should ask the mw people), but I get the use of it. I'm wondering, though: is a non-prose change a change that changes no prose, or that also changes something that isn't prose? — Alien333 05:36, 2 May 2025 (UTC)
- The red/green color choice in the diagram probably needs to be checked for Wikipedia:Manual of Style/Accessibility purposes. Could the red/minus items hang down below the 0 line?
- About non-prose changes: I don't want to be bothered with edits like these: [1][2][3][4][5][6]. I do want to see edits like this one: [7] WhatamIdoing (talk) 20:27, 2 May 2025 (UTC)
- Current histogram, after some color tweaking and putting the neg below the 0 line. (Actually, it was the grey that was really problematic for accessibility). — Alien333 08:16, 3 May 2025 (UTC)
- That shape is a little easier for me to understand at a glance.
- Does the new color scheme work for someone with Red–green color blindness? WhatamIdoing (talk) 22:41, 3 May 2025 (UTC)
- Yes; I checked. Still clearly distinguishable. — Alien333 22:55, 3 May 2025 (UTC)
- Thanks. WhatamIdoing (talk) 17:08, 8 May 2025 (UTC)
Many thanks to everyone for all the input! Will probably go out in the next deployment or two. — Alien333 12:29, 9 May 2025 (UTC)
File:Syrian Petroleum Company Logo.png
Hi, how can this logo (File:Syrian Petroleum Company Logo.png) be deleted? It is not the official logo: on the company's website (https://spc.sy/) the official logo is the blue one at the top. (Written with Google Translate.) AbchyZa22 (talk) 08:42, 30 April 2025 (UTC)
- @~Berilo Linea~ and Yedaman54, it looks like the logo at the top of Syrian Petroleum Company might be outdated (or maybe they use different colors for their website vs other places?). Could you look into it? WhatamIdoing (talk) 03:30, 1 May 2025 (UTC)
- @Freedoxm and @Abo Yemen any opinion?? AbchyZa22 (talk) 20:19, 3 May 2025 (UTC)
- Not as of right now. Freedoxm (talk · contribs) 23:00, 3 May 2025 (UTC)
AI tool to fact-check articles (proof of concept)
I have created a proof of concept tool for automating fact-checking of articles against sources using AI. GitHub repository. An OpenAI API key or compatible provider is required (I use BotHub). It is cost-effective; when using gpt-4.1-nano, verification of one 100-word block against a single source (approximately 12,000 characters) costs about 0.1 cent. Functionality:
- The program loads the article text from file and all available sources (text files: source1.txt, source2.txt, etc.).
- It divides the article into blocks of approximately 100 words, preserving sentences.
- For each block and each source:
- Sends a request to the OpenAI API for correspondence analysis
- Receives credibility probabilities for each word
- Combines results for all blocks and sources
- Visualizes the text with color coding based on the obtained probabilities (either a text mode with all sources combined, or a GUI that allows selecting individual sources)
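To make the block-splitting step concrete, here is a minimal sketch of dividing text into blocks of roughly 100 words without cutting sentences. It is an illustration only and not the code from the repository, which may split differently; the function name `split_into_blocks` and the simple punctuation-based sentence splitter are assumptions.

```python
import re


def split_into_blocks(text: str, target_words: int = 100) -> list[str]:
    """Group whole sentences into blocks of about `target_words` words."""
    # Naive sentence split (assumption): break after ., ! or ? plus whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    blocks, current, count = [], [], 0
    for sentence in sentences:
        words = len(sentence.split())
        # Close the current block if this sentence would push it past the target.
        if current and count + words > target_words:
            blocks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += words
    if current:
        blocks.append(" ".join(current))
    return blocks
```

A single sentence longer than the target becomes its own oversized block, since sentences are never cut. Abbreviations such as "e.g." will confuse this splitter, so a real tool would likely need something more robust.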
Installation and usage instructions, along with example screenshots, are available in the README. Bugs are certainly present (almost all code was generated using Anthropic Claude 3.7).
It is also possible to use locally hosted models by installing an OpenAI API compatible LLM server (such as LLaMA.cpp HTTP Server) and pointing the script at it with the --base_url and --model parameters.
Suggestions and proposals are welcome, but unless submitted as pull requests, they will be reviewed at an indeterminate time. The creation of new tools based on this idea and code is strongly encouraged. Kotik Polosatij (talk) 13:40, 5 May 2025 (UTC)
- Interesting, thanks! -- GreenC 00:56, 9 May 2025 (UTC)
Papal traffic - one of our busiest hours?
In case anyone is curious, I did a bit of digging on yesterday's traffic:
- On 8 May, the Pope Leo XIV article here was read 13.2 million times; the Spanish, Italian, German, French and Portuguese articles made up another 10.9 million. This was 4.5% of all pageviews in the day for English, and as high as 12.9% for the Spanish Wikipedia. (These figures include all traffic from redirect pages.)
- Absolute totals for all Wikipedias are a little trickier. The count for pageviews of the "main article title" was around 15 million on all 93 Wikipedias with articles; the six biggest ones above made up 88.5% of that. So assuming the breakdown between main articles + redirects is in proportion, maybe something like 27 million pageviews overall, including redirects.
- We went from 23 WPs having an article on him before the announcement, to 93 by midnight UTC, and 113 now. 20 Wikipedias managed to rename their article in the first three minutes (17:14 to 17:17 UTC) and two other projects had created new articles on him by that time.
- In the hour after the announcement (17:00 to 18:00 UTC), English Wikipedia had around 8.4 million hits on Pope Leo XIV and the redirect titles - around half of those were to Robert Francis Prevost - which represented one third of all pageviews during the hour.
- It probably represented over 40% of all pageviews, over 3000/second, from 17:14 to 18:00 (assuming that the other traffic was evenly distributed) and while the public data doesn't go lower than hourly, I would be happy betting money that in the first fifteen minutes, it was well over half of our traffic.
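The back-of-envelope estimates in the list above can be reproduced with a few lines of arithmetic. This is a sketch using only the rounded figures quoted here, so the results are approximate; the variable names are made up for illustration, and it assumes that all of the hour's article hits fell after 17:14 and that other traffic was spread evenly across the hour.

```python
# Figures quoted in the list above (rounded, in pageviews).
six_largest_with_redirects = 13.2e6 + 10.9e6  # English plus five other large wikis
six_largest_share = 0.885                     # their share of main-title views

# Scale up, assuming redirects are distributed the same way on every wiki.
global_estimate = six_largest_with_redirects / six_largest_share

hour_hits = 8.4e6           # 17:00-18:00 UTC hits on the article and its redirects
hour_total = hour_hits * 3  # those hits were about one third of the hour's traffic
window_seconds = 46 * 60    # 17:14 to 18:00 UTC

# Assumptions: every article hit came after 17:14; other traffic was even.
hits_per_second = hour_hits / window_seconds
other_in_window = (hour_total - hour_hits) * 46 / 60
share_in_window = hour_hits / (hour_hits + other_in_window)

print(f"global estimate: {global_estimate / 1e6:.1f} million")
print(f"hits per second: {hits_per_second:.0f}")
print(f"share of traffic in window: {share_in_window:.1%}")
```

With these rounded inputs the global figure comes out at roughly 27 million, the rate at a little over 3,000 hits per second, and the share in the 46-minute window at close to 40%.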
I don't know if this was our one-time traffic record, but it must certainly be well up there. Congratulations to everyone who worked on it. Andrew Gray (talk) 21:12, 9 May 2025 (UTC)
- Other contenders: Death and funeral of Pope John Paul II; Death of Michael Jackson. I think the Michael Jackson one maxed out our servers. --Redrose64 🌹 (talk) 22:26, 9 May 2025 (UTC)
- Looks like the death of Michael Jackson in 2009 and the views it generated caused wikitech:Michael Jackson effect, which was solved by our software engineers writing the software mw:PoolCounter, which is now installed on our servers to prevent it from happening again. An interesting bit of technical history. –Novem Linguae (talk) 22:40, 9 May 2025 (UTC)
- Interesting, thank you - I had somehow forgotten the Jackson case!
- That page points to Wikipedia:Article traffic jumps which identifies a handful pushing towards 10m in a day (Kobe Bryant, Matthew Perry, Elizabeth II). Some of these do not include redirects in the count and so are ahead of Leo XIV on purely "single title" data, but I think none are likely to beat the one-day (or one-hour) figure for Leo once redirects are included (and IMO they should be).
- I'll see if I can work out what any of these were like as a percentage of traffic - in particular it seems plausible that Steve Jobs might be higher than Leo XIV, with 7.4m views in 2011. Andrew Gray (talk) 22:54, 9 May 2025 (UTC)
Looking at some recent high-traffic deaths, with a little rounding up added to the global data for redirects (which are relatively rare for stable articles like these ones):
- Matthew Perry got ~8.8m enwiki hits on 29/10/23, and ~11.8m globally, which would put him at 3.7% of enwiki traffic and 2.1% of global traffic. (Death was reported about midnight UTC)
- Kobe Bryant got ~9.5m enwiki hits on 26/01/20, and ~15.1m globally, which would put him at 3.4% of enwiki traffic and 2.6% of global traffic. (Death was reported about 1930 UTC)
- Elizabeth II got ~8.5m enwiki hits on 8/9/22, and ~20m globally, which would put her on 3.2% of enwiki traffic and 3.5% of global traffic. (Death was reported about 1730 UTC)
My rough estimate for the Pope had 4.5% of enwiki and (more tentatively) 4.4% of global traffic in the day, so I think that puts him ahead of all three. Interesting to see, though, the difference between Elizabeth/Leo and Perry/Bryant in terms of English vs global traffic. Peak hour was I think around 3.5m/21% for Bryant, 2.2m/13% for Elizabeth II, and 1.3m/11% for Perry, so again all a bit behind what we saw this week.
- For Jobs in 2011, we have the problem that a new and more reliable pagecount system came in about a month after his death. From what we do have (which may have errors/omissions), I get ~7.8m enwiki hits over the full day 6/10/11 (counting Steve Jobs & the main redirect at Steve jobs). Total hits for the day were 231.5m for enwiki, so this suggests Jobs was ~3.3% of English Wikipedia traffic that day, maybe a shade higher to account for the other redirects. Jobs's death seems to have been announced about midnight UTC so the affected period covers the full day; for the peak hour (1-2am) it was 10% of all traffic.
- For Jackson in 2009, with the same caveats, there were ~1.5m hits over the full day 25/6/09 (Michael Jackson + Michael jackson), or 0.6% of total enwiki traffic, but his death was announced only in the last couple of hours of the day so it's not a great comparison. The last two hours of the day had ~7.1% of all enwiki traffic go to the two Jackson page titles, and the last hour had ~12%.
Again, I think the data for the Pope this time around is ahead of both in terms of the share of traffic and the one-hour spike.
In terms of overall sitewide impact, 8 May was a relatively normal day for English Wikipedia in absolute traffic terms - it was busier than usual, especially for a Thursday, but only the fifth busiest this year. However, for Wikimedia as a whole, it was quite a leap, with 613m pageviews - this is the most it has been since 28/1/2024, and the sixth highest since the start of 2021. — Preceding unsigned comment added by Andrew Gray (talk • contribs)
How many left?
At this writing, there were 6,991,903 articles in the encyclopedia, and as you are reading, there are now 6,992,097. There are 7903 left to go to hit the big 7M! Who will be the lucky one to make the seven millionth edit article?? Mathglot (talk) 07:09, 10 May 2025 (UTC)

- P.S. If you are sitting here hitting reload to see the number change, you might need to purge the page instead. While you do that, you can listen to the calming sound of Wikipedia being edited. Mathglot (talk) 08:41, 10 May 2025 (UTC)
- Surely we've hit our 7th million edit! I have a list of notable article topics and I might get to some of them, so I'll try and chip away at a quarter of a percent. CMD (talk) 09:03, 10 May 2025 (UTC)
- Yes, we're up into the region of 1.2 thousand million edits now (specifically, 1,285,046,009). I suspect that Mathglot meant "seven millionth article" when they wrote "seven millionth edit". --Redrose64 🌹 (talk) 13:31, 10 May 2025 (UTC)
- Big 'oops!' on my part. Of course I meant article, thanks for the correction. Someone trout me! Mathglot (talk) 18:36, 10 May 2025 (UTC)
- I wonder what % of those articles don't meet the WP:Notability guidelines... Some1 (talk) 14:17, 10 May 2025 (UTC)
- Probably a smaller number than the number of articles that could meet the notability guidelines that don't yet exist, so it should all balance out in some way. CMD (talk) 17:35, 10 May 2025 (UTC)