Sample Page

This page is for requesting modifications to URLs, such as marking them dead or changing them to a new domain. Some bots are designed to fix link rot; they can be notified here. These bots include InternetArchiveBot and WaybackMedic. Bot operators from other language wikis can monitor this page, since URL changes are universally applicable.

US agencies

90+ agencies identified as having web pages deleted during the Trump admin: https://asia.nikkei.com/static/vdata/infographics/deleted-website/

ω Awaiting further developments and time to go through them — GreenC 16:46, 1 April 2025 (UTC)[reply]

  • White House
  • Department of Health and Human Services
  • Department of Agriculture
  • USAID
  • National Park Service
  • worker.gov
  • Department of Labor
  • U.S. Agency for Global Media – usagm.gov
  • Federal Mediation and Conciliation Service (United States) – fmcs.gov
  • Woodrow Wilson International Center for Scholars – wilsoncenter.org
  • Institute of Museum and Library Services – imls.gov
  • Community Development Financial Institutions Fund – cdfifund.gov
  • Minority Business Development Agency – mbda.gov
  • Department of Transportation – dot.gov – includes 11 agencies: FAA, FHWA, FMCSA, FRA, FTA, GLS, MARAD, NHTSA, OIG, OST, PHMSA
  • Environmental Protection Agency – epa.gov
  • Department of Housing and Urban Development – hud.gov
  • Centers for Disease Control and Prevention
  • Federal Emergency Management Agency
  • National Institutes of Health
  • General Services Administration
  • Department of Homeland Security
  • Department of Commerce
  • employer.gov
  • Office of the Assistant Secretary for Health
  • sftool.gov
  • Department of Energy
  • Department of the Interior
  • Department of Education
  • NOAA
  • Substance Abuse and Mental Health Services Administration
  • climate.gov
  • Department of Defense
  • Health Resources & Services Administration
  • AbilityOne Commission
  • Department of State
  • United States Patent and Trademark Office
  • BOEM
  • The Census Bureau
  • CISA
  • HUD User
  • MILLENNIUM CHALLENGE CORPORATION
  • performance.gov
  • National Archives and Records Administration
  • Bureau of Safety and Environmental Enforcement
  • Federal Aviation Administration
  • Food and Drug Administration
  • House of Representatives
  • Department of Justice
  • National Endowment for the Humanities
  • Department of the Treasury
  • youth.gov
  • American Climate Corps
  • Federal Trade Commission
  • Global Change Research Program
  • NASA
  • Administration for Community Living
  • National Endowment for the Arts
  • ATF
  • Bureau of Indian Affairs
  • Customs and Border Protection
  • Consumer Financial Protection Bureau
  • Consumer Product Safety Commission
  • Office of the Director of National Intelligence
  • Economic Development Administration
  • Equal Employment Opportunity Commission
  • Export-Import Bank of the United States
  • FBI
  • Federal Committee on Statistical Methodology
  • Federal Housing Finance Agency
  • geoplatform.gov
  • Assistant Secretary for Technology Policy
  • IRS
  • National Labor Relations Board
  • Office of Personnel Management
  • Department of Veterans Affairs
  • American Battle Monuments Commission
  • Agency for Healthcare Research and Quality
  • americorps.gov
  • Advanced Research Projects Agency for Health
  • Bonneville Power Administration
  • cms.gov
  • congress.gov
  • digital.gov
  • ENERGY STAR
  • ej.gov
  • farmers.gov
  • medicalcountermeasures.gov
  • peacecorps.gov
  • Securities and Exchange Commission
  • Social Security Administration
  • stopbullying.gov
  • Citizenship and Immigration Services
  • United States Interagency Council on Homelessness
  • workcenter.gov
  • Air Force
  • Army
  • Navy
  • Marine Corps

State government websites in the United States

I don’t know whether I am supposed to add this here; if not, let me know. To go along with whoever listed US federal government websites on this page, I also think something needs to be done about sources citing the governor’s office websites in several states since the 2024 election. This probably needs IABOT or something similar to go in and archive all of them. Those include the office of governor websites in:

  • Delaware
  • Indiana
  • Missouri
  • New Hampshire
  • New Jersey (since 2025)
  • North Carolina
  • North Dakota
  • Possibly South Dakota (since Kristi Noem’s resignation to become DHS secretary)
  • Virginia (since 2025)
  • Washington
  • West Virginia

More states might need to be added. The reason I am requesting that these states’ websites (and really any state’s, if you think about it) be archived by IABOT is that I’ve been noticing a couple of dead links on governor.wv.gov that were created during Jim Justice’s tenure and deleted when Patrick Morrisey took office. I tried to summon InternetArchiveBot (IABOT) but had problems with OAuth.

In addition, for ANY state government website that deals with a particular administration, an archive link needs to be added if it’s used in a citation. That goes for the office of governor, the legislative sites, and anything else that may be deleted when the next administration is elected. Hurricane Clyde 🌀my talk page! 02:46, 8 January 2026 (UTC)

User:Hurricane Clyde absolutely right. I started going through all 50 and completed 10: Wikipedia:Link_rot/URL_change_requests/Archives/2025/June#Five_USA and Wikipedia:Link_rot/URL_change_requests/Archives/2025/July#Five_USA_set_2. I gave up on CA; it’s too complicated. I pinned this thread so it is not archived and serves as a reminder to keep going. All these government sites are very difficult: they have been in existence since the 1990s and have gone through continuous changes and updates with new administrations and technology. They are fragmented, with sub-domains under different departments and issues. Every one I did required a lot of updates; they all need work. — GreenC 21:43, 31 January 2026 (UTC)

ift.org.mx and cofece.mx

These Mexican government agencies will likely be dissolved this month, and I’m not sure what will happen to references and other materials used within. Sammi Brie (she/her · t · c) 19:00, 10 October 2025 (UTC)[reply]

On hold pending dissolution. — GreenC 04:21, 18 October 2025 (UTC)[reply]
It took place on October 17 for the former and presumably a similar time for the latter. Sites are still up for now (but with the replacing agency’s logo). Unclear whether they will use it or another page for their own business. I suspect the domain rpc.ift.org.mx (full of PDFs containing broadcasting technical information) will be retained intact at some other domain at some point. Sammi Brie (she/her · t · c) 06:32, 21 October 2025 (UTC)[reply]
If it’s a new agency at the same domain, what does this mean we should do in terms of archiving URLs? The options are to do nothing, or to treat all URLs as dead and add archive URLs. — GreenC 16:06, 21 October 2025 (UTC)
With the IFT and Cofece, the sites seem to be up as an archive. I am expecting the domain https://rpc.ift.org.mx/ —which contains most of our IFT citations—to move at some point, so put a pin in this thought for now. I will let you know when that happens. Sammi Brie (she/her · t · c) 18:01, 14 January 2026 (UTC)[reply]

irish-charts.com

Seems to be broken since December 2025. Not sure if it’s temporary. ~2k. Thank you! MrLinkinPark333 (talk) 04:00, 15 January 2026 (UTC)[reply]

The domain was re-registered on Dec 8 2025 to a Chinese name server. Hard to say, but it looks like a usurpation of an abandoned domain that has not yet been spamified. The original was active for 20 years and is linked from 2,000 pages on Wiki, so it must have a high SEO ranking, a very good find for spammers. I’ll treat it as a dead site, and if it is ever usurped, send it over to JUDI. If the original site is ever reactivated I can “makelive” again. — GreenC 01:46, 6 February 2026 (UTC)
The website is now working again as of last month. MrLinkinPark333 (talk) 20:28, 25 April 2026 (UTC)[reply]

Enwiki

  • Checked 2,091 pages and edited 1,891 pages. Added 174 {{dead link}}. Switched 350 |url-status=live to dead. Added 1,619 archive URLs (1,604 Wayback).

IABot DB

  • Checked and updated 3,310 links

 Done – GreenC 07:10, 6 February 2026 (UTC)

Migration away from archive.today

Submitting a formal request to, where possible, replace links to archive.today with archive.org (or other suitable archive). Is this something that can be done with WP:WAYBACKMEDIC? I know that it is not always possible, but can we do what can be done automatically? I also agree with the advice to remove WP:EARLYARCHIVEs, where possible. Best, HouseBlaster (talk • he/they) 21:35, 20 February 2026 (UTC)[reply]

Just here to add a breadcrumb to Wikipedia:Bot requests/Archive 88#archive.today cleanup for anyone finding this and interested in discussing. Dreamyshade (talk) 23:39, 22 February 2026 (UTC)[reply]

 Not done yet. A handful of editors are picking away at it manually; they should be given the opportunity over the next few months to see how they progress. Also, bot work is tricky; it will take some time to develop soft-404 filters. — GreenC 18:48, 27 February 2026 (UTC)

uib.no

  • Old domain: https://folk.uib.no/hnohf
  • New domain: https://ardalambion.net
  • Reason: The host for “Ardalambion” (a significant resource for Tolkien’s constructed languages) has moved. The subfolder structure appears to remain identical, only the host domain has changed.
  • Number of instances: 21 according to Special:LinkSearch/https://folk.uib.no/hnohf. I’ve gone ahead and manually verified that the replacement works for all of the affected pages.

Can a bot operator please perform a global find-and-replace for these links? Thank you! CosmicDefect (talk) 02:08, 22 February 2026 (UTC)[reply]
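
The requested find-and-replace can be sketched as a small URL rewrite. This is only an illustration of the mapping described above, not the bot's actual code: the function name is hypothetical, the example page `quenya.htm` is a made-up path, and forcing `https` on the new host is an assumption based on the new domain as given in the request.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "folk.uib.no"
OLD_PREFIX = "/hnohf"          # old location of the site on the university host
NEW_HOST = "ardalambion.net"   # new host; the old prefix maps to its root


def migrate_ardalambion(url: str) -> str:
    """Rewrite folk.uib.no/hnohf/... to ardalambion.net/... (hypothetical helper).

    URLs outside the old prefix are returned unchanged.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path
    if host != OLD_HOST:
        return url
    # Require an exact prefix match so that e.g. /hnohfx/ is not rewritten.
    if path != OLD_PREFIX and not path.startswith(OLD_PREFIX + "/"):
        return url
    new_path = path[len(OLD_PREFIX):] or "/"
    # Assumption: the new site is served over https, as in the request.
    return urlunsplit(("https", NEW_HOST, new_path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(migrate_ardalambion("https://folk.uib.no/hnohf/quenya.htm"))
```

Anything else on folk.uib.no is left alone, which matters here because the rest of the uib.no domain needs separate dead-link handling.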

Will do. — GreenC 18:50, 27 February 2026 (UTC)[reply]
2,500 pages – the entire uib.no domain has dead links throughout. Will migrate folk.uib.no/hnohf -> ardalambion.net at the same time. — GreenC 21:56, 28 February 2026 (UTC)[reply]
Hello GreenC, thank you for initiating the changes. However, it appears the old domain is still in use per LinkSearch. Did the migration not complete correctly? CosmicDefect (talk) 15:33, 6 April 2026 (UTC)
I had to stop part way through due to other work intervening. Working to finish now. — GreenC 04:07, 14 April 2026 (UTC)[reply]
Also:

Enwiki

IABot DB

  • Over 11,000 links updated.

 Done – GreenC 15:27, 15 April 2026 (UTC)

I was fixing some MCU-related pages and noticed that some of these stopped working. Not sure how common this is or what the fix is other than marking as dead.

Gonnym (talk) 08:43, 4 March 2026 (UTC)[reply]

Some of these are still available under different URLs. For example, the denofgeek source was moved here, the nerdist source was moved to their archive, and the CBR article was moved here. ARandomName123 (talk)Ping me! 05:51, 14 March 2026 (UTC)[reply]

This request is kind of a can of worms because it is 5 different domains, each should have a separate request. And ARandomName123 has identified some excellent rules for moving links, there are likely more rules to be discovered. Looking at the numbers:

  • comicbookresources.com 3,451 pages some convertible to cbr.com
  • nerdist.com 2,182 pages some convertible to archive.nerdist.com
  • denofgeek.com 6,799 pages maybe some convertible
  • observationdeck.io9.com 31 pages, all dead
  • hitfix.com 2,687 pages. This site ceased to exist in 2016, content was purchased by Uproxx Media Group in 2017, in 2018 Warner Music Group acquired Uproxx, in 2024 Uproxx re-acquired it from Warner, and today it is doing business as Uproxx Studios. I don’t see the original HitFix content available at https://uproxx.com/ unless someone can find it, the domain is dead.
I feel the need to note some nerdist sources that have working links at archive.nerdist don’t have working content there. (Or more accurately, I had to switch a cite back to dead (actually… I’m going to go switch it to deviated because that’s more accurate) because it’s an audio interview whose file doesn’t exist on its live archive.nerdist page but does exist on its Wayback Machine link.) This, of course, is not the bot’s fault; I imagine it has no way to tell the difference between a working link with a working audio file on it and a working link with no audio file on it (and then some nerdist cites (like the Marvel one mentioned above) are text, so…). – Purplewowies (talk) 04:40, 17 April 2026 (UTC)[reply]
“an audio interview whose file doesn’t exist on its live archive.nerdist page but does exist on its Wayback Machine link.” That’s unusual! Usually with media files the Wayback link is broken and the live link works. Had I known this I could have scraped the page for audio content and added a rule to treat it as a dead link (add archive). If you want to show an example I can look into it. — GreenC 22:56, 17 April 2026 (UTC)[reply]
Amazing work with the bot. Thank you! Gonnym (talk) 08:56, 18 April 2026 (UTC)[reply]
@GreenC: The one the bot changed in this edit, https://archive.nerdist.com/the-mutant-season-34-dylan-sprouse/, has an audio player on it that tries to request https://traffic.libsyn.com/themutantseason/TMS34_Dylan_Sprouse.mp3 (which is a dead link–the download and the play button both try to get this file/URL). The archive.org copy, on the other hand, did successfully archive the file (despite its attempt to load the player itself failing) if one goes through the download link on the archived page (something I was also fully expecting to not work). The specific archived link’s download button was doing a 302 response at the time, but archive.org does redirect one to the archived audio file in this instance (here, for reference). I don’t know if archive.org managed to properly archive everything like that (since my experience is the same as you regarding the vast majority of media files), but since Dylan and Cole Sprouse is on my watchlist I went down that rabbithole just to make sure the archive it already had was actually useful and all… (Seconding that you do do amazing work with this bot.) – Purplewowies (talk) 04:10, 19 April 2026 (UTC)[reply]
wow surprised it worked. I’m not sure how to approach this because most editors will see the broken player and stop there, and not follow it through to the end, as you say a rabbit hole. Also the archive.org libsyn.com link didn’t play for me in Firefox, but it worked in Chrome. If we go back to https://archive.nerdist.com/the-mutant-season-34-dylan-sprouse/ and look at the Download button it points to http://traffic.libsyn.com/themutantseason/TMS34_Dylan_Sprouse.mp3 which is pretty clear it was an MP3 file. The Wayback Machine has an archive page for that, it redirects back to the libsyn.com archive page here. Notice the “&expiration=1529532070” .. that is a self-destruct command, the URL works for a short period then stops working. The WaybackMachine captured it before it expired. This is sometimes done by sites to protect from web scrapers and piracy. So they are on record they don’t want the content forever available. A messy problem on a couple levels. — GreenC 06:06, 20 April 2026 (UTC)[reply]
The bit about Firefox not playing it is interesting, as I did all that checking in Firefox. Huh! – Purplewowies (talk) 17:10, 20 April 2026 (UTC)[reply]
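
The expiring-URL behaviour discussed above can be checked mechanically. The sketch below assumes the `expiration` value is a Unix timestamp in seconds, which the thread does not confirm; the function name and the example URL are hypothetical, and only the value 1529532070 comes from the discussion.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def expiration_time(url: str, param: str = "expiration") -> Optional[datetime]:
    """Return the expiry encoded in a URL's query string, if any.

    Assumption: the parameter holds a Unix timestamp in seconds.
    """
    values = parse_qs(urlsplit(url).query).get(param)
    if not values:
        return None
    try:
        return datetime.fromtimestamp(int(values[0]), tz=timezone.utc)
    except (ValueError, OverflowError, OSError):
        # Not a number, or not representable as a date.
        return None


if __name__ == "__main__":
    # Hypothetical URL; only the expiration value is taken from the thread.
    example = "https://media.example.com/file.mp3?id=1&expiration=1529532070"
    print(expiration_time(example))
```

Under that assumption the value decodes to 20 June 2018 (UTC). A bot could use a check like this to flag media links whose live copy has almost certainly expired, though it cannot tell whether a given archive copy actually plays.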

comicbookresources.com

Enwiki
IABot DB
  • Updated about 7,000 links
 Done – GreenC 01:15, 17 April 2026 (UTC)

nerdist.com

Enwiki
IABot DB
Updated about 1,400 links
 Done – GreenC 06:08, 17 April 2026 (UTC)

denofgeek.com

Enwiki
IABot DB
Checked about 10,000 and updated 3,200
 Done – GreenC 23:49, 17 April 2026 (UTC)

observationdeck.io9.com

Domain is dead and in 31 pages. Invoked IABot; it should get most of them. — GreenC 23:13, 17 April 2026 (UTC)

hitfix.com

Enwiki
  • Checked 1,818 pages and edited 1,136 pages. Added 9 {{dead link}}. Switched 770 |url-status=live to dead. Added 865 archive URLs (865 Wayback).
IABot DB
Updated about 1,000 links
 Done – GreenC 16:45, 18 April 2026 (UTC)

networkdvd.net

(Originally posted at EL/N) This is something I’ll work on tomorrow when I’m on a computer rather than my phone, but since there are ~40 links to it that I can see with an insource: search, I thought I’d draw further attention to this news story about the hijacking of this domain. • a frantic turtle 🐢 22:09, 4 March 2026 (UTC)[reply]

(Crossposted from EL/N) I’ve been through and manually changed most of them to url-status=usurped, adding an archive link. The rest were either links to the front page or to searches on the old website, so I’ve just removed those entirely. (A couple were naked adverts, so they’re gone too now). A lot of the references were in rubbish formats that I don’t think a bot could get to grips with, which is fairly typical for our articles on TV shows, alas.
The hijacked site has taken a copy of the last working version of the real site, so people may not realise it’s actually now a scam and add new links to it in future. I don’t think there’s a way of preventing this? I’d assume that vigilance will have to suffice. • a frantic turtle 🐢 11:49, 5 March 2026 (UTC)[reply]

vesselregister.dnvgl.com

Noticed that some pages use an outdated version of the DNV vessel register (admittedly this is mostly ship launch articles).

Old/deprecated: https://vesselregister.dnvgl.com/VesselRegister/vesseldetails.html?vesselid=(VESSEL_ID_HERE)

New: https://vesselregister.dnv.com/vesselregister/details/(VESSEL_ID_HERE)

Where (VESSEL_ID_HERE) is replaced with the actual id. CastleOfLight (talk) 05:57, 5 March 2026 (UTC)[reply]
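
The old-to-new mapping above can be expressed as a single pattern rewrite. A minimal sketch, assuming the vessel ID is the only part that needs to carry over; the function name is hypothetical and the ID `12345` in the usage line is a placeholder, not a real vessel.

```python
import re
from typing import Optional

# Old (deprecated) form: .../VesselRegister/vesseldetails.html?vesselid=<ID>
OLD_PATTERN = re.compile(
    r"^https?://vesselregister\.dnvgl\.com/VesselRegister/vesseldetails\.html"
    r"\?vesselid=([^&#\s]+)",
    re.IGNORECASE,
)

NEW_TEMPLATE = "https://vesselregister.dnv.com/vesselregister/details/{vessel_id}"


def convert_vessel_url(url: str) -> Optional[str]:
    """Map an old DNV GL vessel-register URL to the new DNV form.

    Returns None when the URL does not match the old pattern, so callers
    can leave such links untouched.
    """
    match = OLD_PATTERN.match(url.strip())
    if match is None:
        return None
    return NEW_TEMPLATE.format(vessel_id=match.group(1))


if __name__ == "__main__":
    old = "https://vesselregister.dnvgl.com/VesselRegister/vesseldetails.html?vesselid=12345"
    print(convert_vessel_url(old))
```

Each converted link would still need a liveness check, since an ID that existed in the old register is not guaranteed to exist in the new one.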

45 pages. — GreenC 04:26, 18 April 2026 (UTC)[reply]

Enwiki

  • Checked 45 pages and edited 45 pages. Moved 512 links to a new URL: 512 ruled mapped redirects, Removed 22 {{dead link}}. Switched 9 |url-status=dead to live. Added 1 archive URLs (1 Wayback).

 Done – GreenC 17:09, 18 April 2026 (UTC)

washingtonindependent.com

There are a number of legitimate citations to washingtonindependent.com, which was a news source until about 2013 (The Washington Independent). The URL has been usurped by a spam site, and I see there was a discussion a couple years ago at Wikipedia:Link rot/URL change requests/Archives/2023/November#washingtonindependent.com about marking these usurped. But I see a bunch of citations, for example on National Security Agency and Tea Party movement, marked url-status=dead instead of url-status=usurped. I’d suggest updating these to usurped. Thank you! Dreamyshade (talk) 02:31, 6 March 2026 (UTC)[reply]

ω Awaiting next WP:JUDI batch. — GreenC 04:35, 18 April 2026 (UTC)[reply]

archive.today and Idolator

After the thread Wikipedia:Link rot/URL change requests/Archives/2025/September#idolator.com was opened concerning Idolator archives, it doesn’t look like any alternatives have since come up for the instances archive.today was used for preservation. This is worrisome given how archive.today got deprecated last month per Wikipedia:Requests for comment/Archive.is RFC 5. How can we handle the remaining Idolator URLs that have expired without other backups? SNUGGUMS (talk / edits) 18:12, 7 March 2026 (UTC)[reply]

Yeah, it sucks. My suggestion is not to rely too heavily on public web archive services; self-hosted tools such as https://archivebox.io/ are an alternative. In the past 10 years, 7 web archive services have been deprecated. With AI upheavals changing the nature of the web itself, that number will increase in the future. Wikipedia is unusually highly dependent on web archiving, and the WMF and community members have done nothing for 25 years, and the pain of doing nothing will eventually outweigh the pain of doing something. — GreenC 05:43, 8 March 2026 (UTC)
It baffles me how such changes haven’t already been implemented. SNUGGUMS (talk / edits) 03:37, 17 March 2026 (UTC)[reply]
$$$ — GreenC 17:34, 16 April 2026 (UTC)[reply]

geekgirlauthority.com

I noticed this link https://www.geekgirlauthority.com/the-flash-recap-s05e02-blocked/ leads to a casino website. So it seems the domain https://www.geekgirlauthority.com was usurped. Gonnym (talk) 13:42, 8 March 2026 (UTC)

ω Awaiting next WP:JUDI batch. — GreenC 04:35, 18 April 2026 (UTC)[reply]

allaboutjazz.com/php/

These links redirect to new URLs. Example: This goes to here for Pink (singer). The article titles will need to be updated as well. ~2k. Thanks! MrLinkinPark333 (talk) 23:21, 9 March 2026 (UTC)

Not sure about changing the title. Like in Pink example, the title tag in the HTML is “The Imagine Project – Herbie Hancock Review”. The title displayed in the article is “Herbie Hancock: The Imagine Project”, and title in the citation is “The Imagine Project”. I think just keeping the citation title is the safest and easiest option. — GreenC 18:12, 18 April 2026 (UTC)[reply]
Found this at B. B. King. It changed from “The gentleman can play” to “the quiet giant”. A full citation ref is needed as well. Would it be easier to change only the ones that have a lot of title changes like at B.B. King? If it’s small like the Pink example, that step could be skipped. If it’s not easy, I don’t mind if you just update the URLs. MrLinkinPark333 (talk) 19:27, 18 April 2026 (UTC)
The bot just finished processing for URLs. Titles can get messy. With the URLs now locked in, someone in the future could use AWB to parse and replace titles. Or revisit during a slower period. — GreenC 20:53, 18 April 2026 (UTC)[reply]
No worries! MrLinkinPark333 (talk) 23:07, 18 April 2026 (UTC)[reply]

Enwiki

IABot

  • Skipped because 97% of the URLs are live and IABot will pick up the dead stragglers.

 Done – GreenC 20:53, 18 April 2026 (UTC)

vazcomics.org

Former website for MamEnd usurped by squatters. Does not seem to have a successor domain. Go D. Usopp (talk) 11:24, 11 March 2026 (UTC)[reply]

ω Awaiting next WP:JUDI batch. — GreenC 04:38, 18 April 2026 (UTC)[reply]

Iranian websites

Waiting for current world events to resolve. Unclear if the sites will go back online. — GreenC 04:57, 18 April 2026 (UTC)[reply]

aftabnews.ir

Dead when I visit it. Around 200 links GrapesRock (talk) 10:08, 12 March 2026 (UTC)[reply]

mashreghnews.ir

Around 500 links. GrapesRock (talk) 10:52, 12 March 2026 (UTC)[reply]

farsnews.com

Both the “www.” and the “english.” version seem broken. Around 1K links.

Edit: looks like it’s blocked outside of Iran. GrapesRock (talk) 11:07, 12 March 2026 (UTC)[reply]

rferl.org

Looking at “rferl.org/featuresarticle”: some are dead (such as this from Iraq, though removing the “2006” from the end of the url allows it to redirect) and some redirect (e.g. this to this from Belarus). There are 452 links of this form.

Also this redirects to this suggesting a mapping of rferl.org/content/article/ to rferl.org/a/. There are 438 links of this form.

More broadly, there are around 7500 links in the rferl.org domain. GrapesRock (talk) 11:16, 12 March 2026 (UTC)[reply]

Enwiki

IABot DB

  • Updated 240 links

 Done – GreenC 05:41, 19 April 2026 (UTC)

pri.org

They now redirect to theworld.org. For instance this redirects to this. I’m intrigued that the publish date changes (especially since the archive has the August 4 date).

The date change is not universal (e.g. this doesn’t change), so it might just be for that specific article.

(Also, I just want to double check that reports of redirects that are currently live are welcome? My current impression is that they are useful to stay ahead of link rot, but fixing dead links is just prioritized) GrapesRock (talk) 11:39, 12 March 2026 (UTC)[reply]

Yes, updating redirects has advantages: 1. When the new URL is added to Wikipedia, it is picked up by the Wayback Machine and an archive is created there, which is useful to have on record. 2. Redirects are fragile, often the first thing lost to link rot. Once the redirect is gone, it can sometimes be nearly impossible to reverse engineer, like the above example of the date change. With that said, there is so much of this sort of thing that it’s impossible to keep up with; it is triage. — GreenC 05:51, 19 April 2026 (UTC)

Enwiki

IABot DB

  • Updated about 600 links

 Done – GreenC 18:46, 19 April 2026 (UTC)

defenselink.mil

Seems dead. There might be ghost redirects to defense.gov, though I’m not sure any of those pages are still alive. I think this because this redirects to this, but http://www.defense.gov/Speeches/Speech.aspx?SpeechID=1123 isn’t a live link.

Around 1000 articles. GrapesRock (talk) 12:20, 12 March 2026 (UTC)[reply]

The site went offline in December 2009, nearly 16.5 years ago. It will be interesting to see how much the bot repairs. — GreenC 16:40, 19 April 2026 (UTC)[reply]
Over 900 links dead. Possibly some of those had been working as redirects prior to the recent change from defense.gov to war.gov, which eliminated the old redirects. This is typical of government sites. They are around forever, change with administration revolving doors, and usually lose redirect information with generation shifts. They are among the hardest sites to maintain. This one was pretty easy compared to some others. — GreenC 03:31, 20 April 2026 (UTC)
I’m simultaneously surprised that nearly everything was able to be salvaged while also there were no live redirects. Thanks! GrapesRock (talk) 11:49, 20 April 2026 (UTC)[reply]

Enwiki

  • Checked 1,031 pages and edited 687 pages. Resolved 1,316 soft-404s. Added 12 {{dead link}}. Switched 139 |url-status=live to dead. Added 802 archive URLs (802 Wayback).

IABot DB

  • Checked and updated 1,626 links

 Done – GreenC 02:32, 20 April 2026 (UTC)

tvrain.ru

Links now redirect to tvrain.tv (simple as changing .ru to .tv). Example. Haven’t found any that don’t work with this redirect (i.e. they’re all currently live links, just with a redirect). Around 250 articles. GrapesRock (talk) 12:49, 12 March 2026 (UTC)[reply]

Enwiki

IABot DB

  • Checked about 5,000 URLs and updated about 800

 Done – GreenC 17:14, 20 April 2026 (UTC)

tut.by

Dead. Around 450 articles.

Shut down per Tut.By. Can’t find corresponding links on its “successor” Zerkalo.io GrapesRock (talk) 13:24, 12 March 2026 (UTC)[reply]

Enwiki

  • Checked 465 pages and edited 146 pages. Added 9 {{dead link}}. Switched 148 |url-status=live to dead. Added 119 archive URLs (119 Wayback).

IABot DB

  • Checking and updating about 12,000 URLs

 Done – GreenC 19:08, 20 April 2026 (UTC)

segodnya.ua

Dead site. Website stopped updating in 2022 and it’s now down. Around 600 articles.

Example dead link from 2014 Russian annexation of Crimea GrapesRock (talk) 14:09, 12 March 2026 (UTC)[reply]

Enwiki

  • Checked 602 pages and edited 460 pages. Added 9 {{dead link}}. Switched 84 |url-status=live to dead. Added 511 archive URLs (511 Wayback).

IABot DB

  • Updated 2,649 links

 Done – GreenC 06:25, 25 April 2026 (UTC)

therafoundation.org

Usurped website that uses a boilerplate privacy policy and is selling affiliate travel services. FluffedSylas (talk) 17:16, 12 March 2026 (UTC)

ω Awaiting next WP:JUDI batch. — GreenC 00:20, 21 April 2026 (UTC)[reply]

canoe.ca

I came across this site, which has been usurped by a gambling website that has blocked archiving by archive.org. I see this was discussed in the archives before, and archive.today snapshots were added as replacements for them. However, archive.today is now blocked, and the usurped URL is now present again. It seems this can be solved by changing the .ca to a .com, which archive.org is still able to archive, and is still owned by the original canoe.ca owners. Thanks, ARandomName123 (talk)Ping me! 05:47, 14 March 2026 (UTC)

ARandomName123, hey that is great news! this (usurped) becomes this (dead) becomes this (archive). It’s in over 3,000 pages, many still as archive.today, e.g. Hulk Hogan. Not sure yet how the bot’s structure can unwind and rewind, but it should be possible. — GreenC 23:09, 17 April 2026 (UTC)
Lol, forgot that in Jan 2024 I wrote a section about this domain’s usurpation: Special:Diff/1191901478/1200150093 — GreenC 00:41, 21 April 2026 (UTC)[reply]

This needs to be done in two passes: PASS 1 will convert canoe.ca -> canoe.com, remove the archive.today link, and remove any {{dead link}} or {{webarchive}} templates. It will only target canoe.ca citations that use archive.today. PASS 2 will add an archive.org URL for the new canoe.com URL, or add a {{dead link}}. — GreenC 02:31, 21 April 2026 (UTC)[reply]
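
The two passes described above can be sketched on a simplified citation model. This is an illustration under stated assumptions, not the bot's implementation: a citation is modelled as a plain dict of CS1-style parameters, the `dead-link` key is a hypothetical stand-in for a {{dead link}} template, `find_wayback` is an injected lookup whose real counterpart is not shown in this thread, and the set of archive.today mirror hosts is an assumption. Setting `|url-status=` in pass 2 is left out.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit, urlunsplit

# Assumed list of archive.today mirror hosts.
ARCHIVE_TODAY_HOSTS = {
    "archive.today", "archive.is", "archive.ph", "archive.li",
    "archive.fo", "archive.md", "archive.vn",
}


def _host(url: str) -> str:
    return (urlsplit(url).hostname or "").lower()


def _is_domain(host: str, domain: str) -> bool:
    return host == domain or host.endswith("." + domain)


def pass1(cite: dict) -> dict:
    """canoe.ca -> canoe.com; drop archive.today and dead-link markers.

    Only citations whose archive-url points at archive.today are touched.
    """
    url = cite.get("url", "")
    host = _host(url)
    if not _is_domain(host, "canoe.ca"):
        return cite
    if _host(cite.get("archive-url", "")) not in ARCHIVE_TODAY_HOSTS:
        return cite
    parts = urlsplit(url)
    new_host = host[: -len(".ca")] + ".com"
    out = dict(cite)
    out["url"] = urlunsplit(
        (parts.scheme, new_host, parts.path, parts.query, parts.fragment)
    )
    for key in ("archive-url", "archive-date", "url-status", "dead-link"):
        out.pop(key, None)
    return out


def pass2(cite: dict, find_wayback: Callable[[str], Optional[str]]) -> dict:
    """Add an archive.org snapshot for the canoe.com URL, else flag it dead."""
    url = cite.get("url", "")
    if not _is_domain(_host(url), "canoe.com") or "archive-url" in cite:
        return cite
    out = dict(cite)
    snapshot = find_wayback(url)
    if snapshot:
        out["archive-url"] = snapshot
    else:
        out["dead-link"] = True  # stand-in for adding {{dead link}}
    return out
```

Splitting the work this way means a page is briefly left without an archive link between the two edits, so reverting pass 1 before pass 2 runs leaves the citation in an inconsistent state.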

Enwiki – stage 1

Pass 1 (0001-0050). Checked 50 pages and edited 49 pages. Removed 44 archive.today
Pass 2 (0001-0050). Checked 50 pages and edited 34 pages. Resolved 60 soft-404s. Added 7 {{dead link}}. Added 38 archive URLs (38 Wayback).

Pass 1 (0051-0500). Checked 450 pages and edited 358 pages. Removed 776 archive.today
Pass 2 (0051-0500). Checked 450 pages and edited 326 pages. Resolved 834 soft-404s. Added 74 {{dead link}}. Added 725 archive URLs (725 Wayback).

Pass 1 (0501-6289). Checked 5,789 pages and edited 4,606 pages. Removed 18 {{dead link}}. Removed 9,496 archive.today
Pass 2 (0501-6289). Checked 5,789 pages and edited 4,256 pages. Moved 1 links to a new URL: 1 ruled mapped redirects, Resolved 10,063 soft-404s. Added 1,190 {{dead link}}. Switched 2 |url-status=live to dead. Added 8,575 archive URLs (8,575 Wayback).

Enwiki – stage 2 This stage resets cases where |url= was already canoe.com but |archive-url= still has an archive.today

Pass 1. edited 66 pages. Removed 105 archive.today
Pass 2. Checked 61 pages and edited 60 pages. Resolved 307 soft-404s. Added 17 {{dead link}}. Added 88 archive URLs (88 Wayback).

Enwiki – stage 3 This stage targets archive.today links in {{usurped}} templates for conversion to canoe.com and archive.org

Pass 1. Checked 2,338 pages and edited 1,738 pages. Changed 806 citation metadata. Unwound 1,422 {{usurped}} templates.
Pass 2. Checked 1,161 pages and edited 1,160 pages. Resolved 1,469 soft-404s. Added 537 {{dead link}}. Added 928 archive URLs (928 Wayback).

Enwiki – stage 4 Targets citations with a primary URL of slamwrestling.net and an archive.today URL of canoe.ca

Checked 698 pages and edited 40 pages. Made live 57 URLs. Per WP:ATODAY the archive.today URLs are deleted – because the primary URLs are live no replacement archive.org is added.

Enwiki – stage 5 Manual fix edge cases

Repaired about 10 citations

 Done – Eliminated over 10,000 archive.today links: about 90% were converted to archive.org and the rest replaced with {{dead link}}. Includes citations that are {{usurped}}, CS1|2 templates, {{webarchive}}, and bare links. This domain represented 1.5% of archive.today links on enwiki. — GreenC 16:27, 25 April 2026 (UTC)

Comments

I just noticed this via my watchlist. I’m pretty sure most, if not all these links are to Slam Wrestling. That site moved to slamwrestling.net quite some time ago. The regulars at WP:PW should have identified and/or fixed this problem long ago, since Slam is their top go-to source. Don’t know why that didn’t happen. Also don’t know what advantage there is to preserving ancient archive links when the same content is found and can be archived at the current stable address. RadioKAOS / Talk to me, Billy / Transmissions 02:35, 22 April 2026 (UTC)[reply]

Thanks for the feedback. Do you have an example of an old URL and new URL at slamwrestling.net ? I tried searching for some old article titles at slamwrestling.net without luck. Canoe had a lot of material including music, hockey etc.. and yes a lot of wrestling. — GreenC 03:11, 22 April 2026 (UTC)[reply]
Yes, my guess is that only about 5% were wrestling articles that can be found at slamwrestling.net, but it could be as high as 20%. Examples:
  1. Chris_Chetti#cite_note-23 at https://slamwrestling.net/report/overbooking-convicts-guilty-as-charged/ The original url was http://www.canoe.ca/SlamWrestlingArchive/jan10_guiltyascharged.html but GreenC bot removed it.
  2. Chris_Chetti#cite_note-31 at https://slamwrestling.net/report/confusion-reigns-at-guilty-as-charged/ Jahalive (talk) 19:30, 22 April 2026 (UTC)[reply]
I don’t see a way to automate conversion, there is no information in the old URL that points to the new. — GreenC 02:23, 23 April 2026 (UTC)[reply]
The new URLs seem to be https://slamwrestling.net/report/ followed by the title, all lowercase, with a dash between words. That conversion could be automated.–Jahalive (talk) 20:19, 13 May 2026 (UTC)[reply]
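A minimal sketch of that title-to-URL conversion, assuming the citation carries the article title. The function name and the handling of punctuation (any run of characters other than letters and digits becomes a word break) are assumptions, not something confirmed against the site, so each generated URL would still need a live check before use.

```python
import re

BASE = "https://slamwrestling.net/report/"

def title_to_report_url(title: str) -> str:
    """Build a candidate slamwrestling.net URL from an article title."""
    # Lowercase the title and keep only runs of letters and digits as words.
    # Assumption: punctuation is dropped and acts as a word separator.
    words = re.findall(r"[a-z0-9]+", title.lower())
    # Join the words with single dashes and add the trailing slash seen
    # in the example URLs above.
    return BASE + "-".join(words) + "/"

print(title_to_report_url("Confusion reigns at Guilty as Charged"))
```

The result is only a candidate: titles with apostrophes or accented characters may be slugged differently by the site.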

I’m getting whiplash watching your bots change things (related to canoe.ca/canoe.com), add archives that don’t work, remove archives that do work, add dead-link and cbignore, and then remove them. It was bad enough dealing with the first wave of mistakes the other day, but this second wave is ridiculous.

These 3 just came in on my watchlist:

  • [1] removed a perfectly good archive-url to wayback.
  • [2] removed a perfectly good archive-url to wayback
  • [3] Removed {{Dead link|date=April 2026|fix-attempted=yes}}{{cbignore|bot=medic}} that some bot put there yesterday. This article had a series of edits made to it 22 April, multiple bots, most messing up, and I spent a lot of time fixing it. And now this that makes no sense.

What cbignore or bots-deny can I use that will stop your bots from continuing their mayhem?   ▶ I am Grorp ◀ 20:41, 23 April 2026 (UTC)[reply]

As explained above, it is a two step process. Step 1. Step 2. Final Diff. Your reverts of step 1 are actually creating a problem. I have manually repaired the problem your reverts created. — GreenC 21:11, 23 April 2026 (UTC)[reply]
Also, I just eliminated over 10,000 archive.today links for canoe.ca by replacing them with archive.org links to canoe.com .. this is a complex operation internally for a bot not designed for changing both the domain name and archive provider during a single edit. Thus it required making two edits. I can’t tell where you got involved in the process, but it looks like you have been undoing and reverting the bot, which caused unexpected results. — GreenC 21:20, 23 April 2026 (UTC)[reply]
Huh, that’s a ton of archive.today links actually. Looks like slam/jam are #6/#24 on the most archived domains at Wikipedia:Archive.today_guidance#Stats. Also thanks for writing the explanation up at Special:Diff/1191901478/1200150093, that’s actually where I got this idea from. ARandomName123 (talk)Ping me! 23:04, 23 April 2026 (UTC)[reply]
@GreenC: Your bot run was running at the same time as this bot run, and this edit messed up royally. How could I know that these two bots weren’t running in tandem and I was supposed to wait a period of time for two runs of yours? All I saw was mess up after mess up with valid archive links being removed, or replaced with ones which didn’t lead to valid source articles. I didn’t click on the edit summary link that led to this discussion until today when these 3 articles showed up again in my watchlist. From this point, I’ll stay hands-off for 48 hours for these 3 articles and then I’ll check their results.   ▶ I am Grorp ◀ 00:42, 24 April 2026 (UTC)[reply]
Ok fair enough. My edit summary could have said 2-steps. The other problem is there are so many links, the lag between step 1 and step 2 was too long and other processes got involved. Should have had smaller batches. Lesson learned next time 2-step is needed for a large domain. — GreenC 02:04, 24 April 2026 (UTC)[reply]
The shorter-batch idea sounds good. If I’d seen two back to back edits, I’d have checked them together, fixed what was broken, and that would have been the end of that. Yes, the other bot did introduce a random edit that mucked up the double-run (and made me micro-check your other bot edits).   ▶ I am Grorp ◀ 02:58, 24 April 2026 (UTC) Another suggestion would be to use two different edit summaries, where the first one mentions “1st of 2 bot runs” (or something similar).   ▶ I am Grorp ◀ 03:00, 24 April 2026 (UTC)[reply]

washingtonian.com

Links where it goes domain/year/Mo/day/article seem fine, but ones of the form domain/*/*/number.html do not.

Looks like ghost redirects exist for at least some of them. For instance, this from Joe Biden, which for some superset of time between February 12 2012 and January 2016 redirected to the new url.

Not sure how many are of the broken form, but there are around 1600 articles on the domain in total. GrapesRock (talk) 15:59, 16 March 2026 (UTC)[reply]

Many (not all) of the 1,438 are simply www.washingtonian.com redirecting to washingtonian.com .. I would have preferred to avoid those but it was too difficult to manage without missing more substantial redirects, and they are legit redirects anyway so no harm. It was able to convert 218 of the number.html to new form. — GreenC 20:10, 25 April 2026 (UTC)[reply]

Enwiki


 Done GreenC 22:03, 25 April 2026 (UTC)[reply]

themaritimepost.com

themaritimepost.com, usurped GA-RT-22 (talk) 18:49, 16 March 2026 (UTC)[reply]

ω Awaiting next WP:JUDI batch. — GreenC 20:14, 25 April 2026 (UTC)[reply]

hotnewhiphop.com

In (as far as I can tell) 2022, HotNewHipHop changed the format of their urls from “https://www.hotnewhiphop.com/<slug>.<number>.html” to “https://www.hotnewhiphop.com/<number>-<slug>” without redirection. Unfortunately it appears they also changed their numbering system.

Some examples I found:

https://www.hotnewhiphop.com/wu-tang-clan-announce-reunion-album-title-news.5221.html/ —> https://www.hotnewhiphop.com/10992-wu-tang-clan-announce-reunion-album-title-news

https://www.hotnewhiphop.com/sean-kingston-talks-plotting-his-comeback-new-album-deliverance-and-jay-zs-advice-news.132267.html —> https://www.hotnewhiphop.com/354145-sean-kingston-talks-plotting-his-comeback-new-album-deliverance-and-jay-zs-advice-news

https://www.hotnewhiphop.com/cardi-b-launches-her-fashion-nova-line-in-los-angeles-news.64593.html —> https://www.hotnewhiphop.com/150407-cardi-b-launches-her-fashion-nova-line-in-los-angeles-news

मल्ल (talk) 00:24, 22 March 2026 (UTC)[reply]

Thank you! मल्ल (talk) 12:02, 26 April 2026 (UTC)[reply]
The conversion works pretty well: download the complete CDX dataset for the entire domain, grep on the slug string, and filter for a URL of the form number-slug. The CDX download ran into a problem (IA aborted early), so it is taking a bit longer and I need to rerun it. — GreenC 16:57, 26 April 2026 (UTC)[reply]
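The slug-matching step can be sketched roughly as follows. This is an illustration only, not the bot's code: it assumes the CDX dataset has already been reduced to a plain list of URL strings, and the function name is hypothetical.

```python
import re
from typing import Iterable, Optional

# Old form: https://www.hotnewhiphop.com/<slug>.<number>.html (optional trailing slash)
OLD_RE = re.compile(
    r"^https?://www\.hotnewhiphop\.com/(?P<slug>[^/]+?)\.\d+\.html/?$"
)

def find_new_url(old_url: str, cdx_urls: Iterable[str]) -> Optional[str]:
    """Return the first archived URL of the form <number>-<slug>, if any."""
    m = OLD_RE.match(old_url)
    if m is None:
        return None  # not an old-style URL
    slug = m.group("slug")
    # New form: https://www.hotnewhiphop.com/<number>-<slug>
    new_re = re.compile(
        r"^https?://www\.hotnewhiphop\.com/\d+-" + re.escape(slug) + r"/?$"
    )
    for url in cdx_urls:
        if new_re.match(url):
            return url
    return None

cdx = ["https://www.hotnewhiphop.com/10992-wu-tang-clan-announce-reunion-album-title-news"]
old = "https://www.hotnewhiphop.com/wu-tang-clan-announce-reunion-album-title-news.5221.html/"
print(find_new_url(old, cdx))
```

Because the numbering changed, the number in the old URL is ignored and only the slug is used as the key.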

Enwiki

IABot DB

  • Checked about 11,000 links and updated about 8,500

 Done GreenC 05:04, 28 April 2026 (UTC)[reply]

LSE Research Online

The London School of Economics’s research repository has a new URL and I’ve been working on replacing the old ones (eg. http://eprints.lse.ac.uk/63814) for the new ones (https://researchonline.lse.ac.uk/id/eprint/63814/). IDs are the same, only the URL changed. I found 771 citations in Wikipedia. I managed to partially automate this on Wikidata but on Wikipedia I don’t know how. Could any bot owners help me with this? A factor that makes it more difficult is that some of the old links in Wikipedia use the direct link to the PDF (eg. http://etheses.lse.ac.uk/572/1/Balthasar_State-Making_Somalia_Somaliland_2013.pdf) but simply pointing to the entry would be better, if possible. Thanks in advance! Adam Harangozó (LSE WiR) (talk) 15:40, 23 March 2026 (UTC)[reply]

Hi Adam Harangozó (LSE WiR), just working my way to here. Looks like none left but one. I’ll close this unless there is more to be done. — GreenC 05:42, 27 April 2026 (UTC)[reply]
Thanks, I managed to fix them with AutoWikiBrowser since. Adam Harangozó (LSE WiR) (talk) 07:09, 27 April 2026 (UTC)[reply]
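For reference, the ID-preserving mapping described in the request can be sketched as below. It is a hedged illustration covering only the plain eprints.lse.ac.uk/<id> form quoted above; the function name is hypothetical, and direct PDF links or other hosts such as etheses.lse.ac.uk are deliberately left alone because the request does not give a rule for them.

```python
import re
from typing import Optional

# Matches only the simple form http(s)://eprints.lse.ac.uk/<numeric id>
OLD_EPRINT = re.compile(r"^https?://eprints\.lse\.ac\.uk/(\d+)/?$")

def to_research_online(url: str) -> Optional[str]:
    """Map an old eprints URL to the new repository URL, keeping the ID."""
    m = OLD_EPRINT.match(url)
    if m is None:
        return None  # PDF links and other hosts need manual review
    return f"https://researchonline.lse.ac.uk/id/eprint/{m.group(1)}/"

print(to_research_online("http://eprints.lse.ac.uk/63814"))
```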

tamuc.edu

East Texas A&M University has just rebranded from Texas A&M University-Commerce. The domain has changed from tamuc.edu to etamu.edu. Redirects are in place, but the old domain will expire in a few months, which will result in broken citations on a number of pages. Can any bot owners help me with this? Here’s a list of Wikipedia articles with citations linking back to old domain links:

https://en.wikipedia.org/wiki/East_Texas_A%26M_University https://en.wikipedia.org/wiki/2023_NCAA_Division_I_softball_season https://en.wikipedia.org/wiki/2024_NCAA_Division_I_FCS_football_season https://en.wikipedia.org/wiki/2024%E2%80%9325_East_Texas_A%26M_Lions_men%27s_basketball_team https://en.wikipedia.org/wiki/2024%E2%80%9325_NCAA_Division_I_women%27s_basketball_season https://en.wikipedia.org/wiki/East_Texas_A%26M_Lions https://en.wikipedia.org/wiki/East_Texas_A%26M_Lions_men%27s_basketball https://en.wikipedia.org/wiki/East_Texas_A%26M_University

Thank you in advance! ~2026-18676-92 (talk) 18:48, 25 March 2026 (UTC)[reply]

73 pages. A lot of them do not redirect and have no equivalent at the new domain. I’ll do the standard bot run: if there are redirects use those, if the old domain still works keep it, or add an archive URL for dead links. — GreenC 05:52, 27 April 2026 (UTC)[reply]
Update The bulk of them successfully moved from tamuc.edu to etamu.edu including reviving 19 from dead to live. There are 31 that don’t work at either domain; they are dead. I can provide a list if you want to manually track them down, they might exist at a different URL. — GreenC 15:57, 28 April 2026 (UTC)[reply]

Enwiki

IABot DB

  • Checked 243 links and updated 186.

 Done GreenC 17:44, 28 April 2026 (UTC)[reply]

irishexaminer.com/archives/

Some of these redirect to new links, but some don’t. This redirect works for 2012 Waterford Crystal Cup. This should redirect to here for Cork Premier Senior Hurling Championship, but it’s broken. If ghost redirects can be found, that’d be great. ~600 Thank you! MrLinkinPark333 (talk) 02:43, 26 March 2026 (UTC)[reply]

Enwiki

IABot

  • Checked about 800 URLs and updated 249

 Done GreenC 01:55, 29 April 2026 (UTC)[reply]

dawn.com

Various links redirect to new URLs. Others are broken or working.

Redirects

  • beta.dawn.com
  • archives.dawn.com

Broken

  • dawn.com/year/month/day/

Working

  • herald.dawn.com

I think it’d be easier to check the entire website. 14k Thank you! MrLinkinPark333 (talk) 23:33, 29 March 2026 (UTC)[reply]

Enwiki

Batch 1: 00001-03000: Checked 3,000 pages and edited 1,182 pages. Moved 2,049 links to a new URL: 173 normal redirects, 1,824 ruled mapped redirects, 52 ghost mapped redirects, Resolved 14 soft-404s. Added 61 {{dead link}}. Switched 78 |url-status=dead to live. Switched 19 |url-status=live to dead. Added 92 archive URLs (92 Wayback).
Batch 2: 03001-15090: Checked 12,090 pages and edited 4,820 pages. Moved 8,287 links to a new URL: 650 normal redirects, 7,442 ruled mapped redirects, 195 ghost mapped redirects, Resolved 53 soft-404s. Removed 1 {{dead link}}. Added 257 {{dead link}}. Switched 339 |url-status=dead to live. Switched 84 |url-status=live to dead. Added 499 archive URLs (499 Wayback).

 Done GreenC 01:38, 30 April 2026 (UTC)[reply]

tvtonight.com.au

The link http://www.tvtonight.com.au/2013/11/timeshifted-wednesday-20-november-2013.html no longer leads to the actual rankings that https://web.archive.org/web/20141104212522/http://www.tvtonight.com.au/2013/11/timeshifted-wednesday-20-november-2013.html shows. https://tvtonight.com.au/2021/06/timeshifted-ratings-have-moved.html#search-open says that searching for “Timeshifted: Wednesday 20 November 2013” would work, but it doesn’t, so I’m not sure if these are still available on the site. Gonnym (talk) 09:00, 18 April 2026 (UTC)[reply]

There are 233 pages with timeshifted URLs. Spot checks show anything with a date of November 2019 or later works. Prior dates do not work. I can add archives for the older dates. — GreenC 01:53, 30 April 2026 (UTC)[reply]

Enwiki

  • Checked 223 pages and edited 175 pages. Added 148 {{dead link}}. Switched 226 |url-status=live to dead. Added 1,516 archive URLs (1,516 Wayback).

IABot

  • Checked about 2,000 and updated about 1,300

 Done GreenC 23:24, 30 April 2026 (UTC)[reply]

chesterchronicle.co.uk

Most of them seem to just redirect normally to cheshire-live.co.uk (e.g. this to this from Diana, Princess of Wales). Some don’t such as this from here where I found the new link by Googling the title (and there don’t seem to be any helpful ghost redirects).

Around 450 articles in the domain. GrapesRock (talk) 15:15, 20 April 2026 (UTC)[reply]

It looks like the Wayback Machine has not crawled the new site, or it’s being blocked. I could match most of those dead link and archive URLs to a new live URL via a CDX inference check, but that is not possible without Wayback captures. For example this is not in the Wayback Machine; if it was, I could match the slug of the new URL (“following-in-brynleys-footsteps”) with the original URL here (and in this case making adjustment for the ‘s = -s). — GreenC 19:40, 2 May 2026 (UTC)[reply]

Enwiki

IABot DB

  • Updated about 350 URLs

 Done GreenC 02:29, 4 May 2026 (UTC)[reply]

informador.com.mx

These URLs need a slight adjustment, from informador.com.mx to informador.mx, in order to redirect to their new url. For example, changing this to that redirects to the new URL here for Follow the Leader (Wisin & Yandel song). 730 articles. Thanks! MrLinkinPark333 (talk) 01:56, 22 April 2026 (UTC)[reply]

Enwiki

IABot DB

  • Checked and updated about 2,500 URLs

 Done GreenC 15:03, 4 May 2026 (UTC)[reply]

bleacherreport.com

This has its content here. (changing “/amp/” to “/articles/” and removing the “.amp”). Some are on “syndication.bleacherreport.com”, and for those also removing the “syndication.” fixes it such as this to this.

There’s 140 (with the /amp) (but some of them just have their archives with the /amp). Overall, there are 10.4K articles on the domain. GrapesRock (talk) 11:52, 22 April 2026 (UTC)[reply]

I’ll do the whole thing including the amp rules. — GreenC 15:24, 4 May 2026 (UTC)[reply]
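The amp rules from the request could be expressed along these lines. This is a sketch, not the bot's actual rule set: the function name and the example URL are hypothetical, and treating “.amp” as removable only at the end of the path or directly before “.html” is an assumption about where it appears.

```python
import re
from urllib.parse import urlsplit, urlunsplit

def normalize_br_url(url: str) -> str:
    """Apply the /amp/ and syndication rules to a bleacherreport.com URL."""
    parts = urlsplit(url)
    host = parts.netloc
    # Rule: drop the "syndication." subdomain.
    if host.startswith("syndication."):
        host = host[len("syndication."):]
    path = parts.path
    # Rule: "/amp/" becomes "/articles/".
    if path.startswith("/amp/"):
        path = "/articles/" + path[len("/amp/"):]
    # Rule: remove ".amp" (assumed to sit at the end or just before ".html").
    path = re.sub(r"\.amp(?=\.html$|$)", "", path)
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))

# Hypothetical URL, for illustration only.
print(normalize_br_url("https://syndication.bleacherreport.com/amp/123-example-story.amp.html"))
```

URLs that match none of the rules come back unchanged, so the function can be run over the whole domain.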

Enwiki

Batch 00001-03000: Checked 3,000 pages and edited 1,131 pages. Moved 1,111 links to a new URL: 16 normal redirects, 1,070 ruled mapped redirects, 25 ghost mapped redirects, Resolved 21 soft-404s. Removed 23 {{dead link}}. Added 23 {{dead link}}. Switched 5 |url-status=dead to live. Switched 42 |url-status=live to dead. Added 249 archive URLs (249 Wayback).
Batch 03001-10413: Checked 7,413 pages and edited 2,856 pages. Moved 3,036 links to a new URL: 142 normal redirects, 2,891 ruled mapped redirects, 3 ghost mapped redirects, Resolved 41 soft-404s. Removed 42 {{dead link}}. Added 44 {{dead link}}. Switched 58 |url-status=dead to live. Switched 88 |url-status=live to dead. Added 388 archive URLs (388 Wayback).


IABot DB

  • Updated about 1,410 links

 Done GreenC 21:30, 5 May 2026 (UTC)[reply]

indyweek.com/api

All dead, but there do exist missing redirects for seemingly all of them (as long as we have the title). I manually went through and changed all the ones that were tagged as dead to the correct links (e.g. this edit). I don’t know whether your bot can actually find the correct urls, but if it tags all the ones that don’t have archives, I’d be happy to manually redirect those.

90 or so articles. GrapesRock (talk) 12:44, 22 April 2026 (UTC)[reply]

GrapesRock: The bot tagged 40 URLs with {{dead link}}. — GreenC 01:38, 6 May 2026 (UTC)[reply]

Thanks, I’ve dealt with them all! Having the list of articles made it more convenient 🙂 GrapesRock (talk) 11:18, 6 May 2026 (UTC)[reply]
Great, and thanks also. — GreenC 23:17, 6 May 2026 (UTC)[reply]

Enwiki

  • Checked 91 pages and edited 89 pages. Moved 85 links to a new URL: 85 ghost mapped redirects, Removed 5 {{dead link}}. Added 40 {{dead link}}. Switched 9 |url-status=dead to live. Added 8 archive URLs (8 Wayback).

IABot DB

  • Updated about 190 URLs

 Done GreenC 02:20, 6 May 2026 (UTC)[reply]

jmripl.com

Usurped. Legoktm (talk) 22:57, 24 April 2026 (UTC)[reply]

jmripl.com –> repository.law.uic.edu for the John Marshall Review of Intellectual Property Law. Added to WP:JUDI. ClumsyOwlet (talk) 14:00, 27 April 2026 (UTC)[reply]
ω Awaiting next judi batch. — GreenC 01:46, 6 May 2026 (UTC)[reply]

louisville.com

Some are dead like this. Adding “archive.” to the start of the url like this gives a live version of the link. There’s also live links on the www. sub-domain, but I haven’t been able to find any examples on Wikipedia.

Only around 100 articles, and some of them already have the “archive.” at the start GrapesRock (talk) 16:11, 28 April 2026 (UTC)[reply]

Enwiki

IABot DB

  • Checked and updated 140 URLs

 Done GreenC 23:15, 6 May 2026 (UTC)[reply]

tribuna.com

This link from Jens Lehmann has its content here when the title is “La Flop XI degli ultimi 30 anni: chi di loro non avresti mai voluto vedere al Milan?” and the date is 24 April 2020. Seems plausible that some inferred mapped redirects could be found? Thanks!

There’s 1K or so articles on the domain. GrapesRock (talk) 16:37, 5 May 2026 (UTC)[reply]

This might take a couple of days to retrieve the domain’s CDX data because the API is very slow and the domain is very large. If the target slug (“la-flop-xi-degli-ultimi-30..”) were positioned /slug-name-date instead of /name-date-slug, it wouldn’t require a CDX download and could be done in hours instead of days. That is just a limitation of how the CDX API works. The inferred mapped redirect could work for some, but I am seeing inconsistencies, like this article dated 26 May 2019 that has a URL date of 2020-03-06. The CDX method is grounded only in the slug, so it will be more precise. Also, this domain appears to be a mess of different types of problems; will see how it goes. — GreenC 03:42, 7 May 2026 (UTC)[reply]
The site has a CloudFlare security layer enabled which made it difficult. It’s oddly maintained with inconsistencies which made programming rules difficult. I was able to get some but not as much as expected. — GreenC 15:23, 10 May 2026 (UTC)[reply]

Enwiki

Batch 001-500: Checked 500 pages and edited 172 pages. Moved 114 links to a new URL: 98 normal redirects, 11 ruled mapped redirects, 5 ruled inferred mapped redirects, Removed 1 {{dead link}}. Added 7 {{dead link}}. Switched 4 |url-status=dead to live. Switched 4 |url-status=live to dead. Added 78 archive URLs (78 Wayback).
Batch 501-972: Checked 472 pages and edited 170 pages. Moved 125 links to a new URL: 107 normal redirects, 13 ruled mapped redirects, 5 ruled inferred mapped redirects, Removed 3 {{dead link}}. Added 8 {{dead link}}. Switched 5 |url-status=dead to live. Switched 7 |url-status=live to dead. Added 53 archive URLs (53 Wayback).

IABot DB

  • Updated about 750 links

 Done GreenC 15:23, 10 May 2026 (UTC)[reply]

chemicalland21.com

This website which is used as a reference in ~150 articles has apparently been usurped or hijacked and now goes to msbscakery.hk. All external links to chemicalland21.com should be archived or removed, I think. Marbletan (talk) 12:35, 6 May 2026 (UTC)[reply]

ω Awaiting next WP:JUDI batch — GreenC 15:25, 10 May 2026 (UTC)[reply]

dtnext.in

Take this url from Anna University. If you lowercase-ify everything, remove the numbers and the .vpf you get this which is where the content is. You also get a working link with identical content, if you just remove the .vpf (e.g. this). The simplest rule seems to be just remove the .vpf and lowercase-ify everything (since that’s where it will 301 to).

Edit to add this paragraph: I don’t think I realized that the url in the previous paragraph already redirects to a working url (the url where you remove the numbers and the .vpf and lowercase-ify everything).

Unfortunately some urls such as this from Bruno Mars don’t have the full article title in the url. In that case, doing instead this url works. That being said, from the smattering I’ve looked at, all the incomplete urls end with -.vpf. Thank you :-)!

There are 400 or so articles with the .vpf (though this overcounts since not all results have the dtnext.in url with the .vpf) and 1170 total articles on the domain. GrapesRock (talk) 12:43, 7 May 2026 (UTC)[reply]

Will do, the whole domain. I did it before in Nov 2022, but the bot and techniques back then were more primitive. The Bruno Mars link has four paths, 1 dead and 3 live: 1, 2, 3, 4. I’ll randomly choose a working link based on what is available in the CDX records. — GreenC 16:59, 10 May 2026 (UTC)[reply]
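The “simplest rule” from the request (remove the .vpf and lowercase everything) might look like the sketch below. The function name and the example URL are hypothetical. URLs ending in “-.vpf” are skipped, since the request notes those have truncated titles and need a different approach.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

def simplify_dtnext_url(url: str) -> Optional[str]:
    """Remove the .vpf suffix and lowercase host and path of a dtnext.in URL."""
    parts = urlsplit(url)
    path = parts.path
    if not path.lower().endswith(".vpf"):
        return None  # rule does not apply
    stem = path[: -len(".vpf")]
    if stem.endswith("-"):
        return None  # truncated title ("-.vpf"); handle separately
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), stem.lower(), parts.query, parts.fragment)
    )

# Hypothetical URL, for illustration only.
print(simplify_dtnext_url("https://www.dtnext.in/News/City/2019/01/02/Example-Headline-Here.vpf"))
```

The output is the URL that the site is reported to 301 onward from, so it still has to be followed to reach the final address.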

Enwiki

IABot DB

  • Checked 590 links and updated 68

 Done GreenC 20:54, 10 May 2026 (UTC)[reply]

americanradiohistory.com

They all seem to 301 to links on worldradiohistory.com. I thought it was a simple replace, but this redirects to this which turns “Archive-Billboard” into “Archive-All-Music/Billboard”. However, the smattering that I’ve checked all seem to successfully redirect.

Over 8500 articles. GrapesRock (talk) 16:45, 7 May 2026 (UTC)[reply]

3500 dead links was a lot more than I was expecting! I would’ve examined the domain closer had I known. I have now done this closer examination :-).
It seems like ones with OCR-Page near the end of the url cause problems. But some of them certainly are available. For instance this is dead, but its content seems to be here. It is also available here. Not all OCR-Page links are dead such as this one. All of these are from this edit.
This link from Betty Clooney exists here. The only thing that prevented it from redirecting was the lowercase b in billboard, with this url successfully redirecting.
This link from Bess Johnson is available here.
This link from Best I Ever Had (Grey Sky Morning) is available here (“12-02a.pdf” -> “12-02.pdf”). I edited this one directly since I feel like that a was probably accidentally inserted at some point.
This link from Charles E. Apgar exists here (this is the only wireless age one so I just changed it directly)
This url from Charles & Diana: A Royal Love Story exists here.
This exists here.
This from Denise Darcel exists here. (I can fix these manually, there’s only 18 articles)
This exists here.
This exists here.
Here are some rules I notice from above that worked on the first relevant link I found:
  • americanradiohistory.com/hd2/IDX-Business/Music/Archive-Billboard-IDX/IDX/ into www.worldradiohistory.com/Archive-All-Music/Billboard/ with the “OCR-Page-XXXX.pdf” turned into “.pdf”
  • americanradiohistory.com/Archive-BC/ into www.worldradiohistory.com/Archive-All-BC/Broadcasting-Magazine
  • americanradiohistory.com/Archive-TV-Radio-Age/Issues/ to worldradiohistory.com/Archive-TV-Radio-Age/
  • Removing “-Page-XXXX” sometimes fixes things.
  • lowercase-ifying .PDF to .pdf sometimes fixes things
  • sometimes “.pdf” -> “-N.pdf” fixes things (between July 31 1993 and December 17, 1994 every other issue of Billboard Magazine was a newspaper I think?)
  • sometimes “.o.pdf” -> “.pdf” fixes things
I don’t know how hard it would be, but it would be nice if the page information was added to the cite with page=whatever.
It seems like this domain is simply a huge mess; I think ignoring the page shenanigans, the last bit stays constant and it is just the path that changes. I don’t know how useful knowing that is though. GrapesRock (talk) 17:46, 11 May 2026 (UTC)[reply]
18531+3500 = 22031 then 3500/22031 = 16% or an 84% conversion rate (80/20 Rule) on first pass. I’ll try the new rules next. — GreenC 19:28, 11 May 2026 (UTC)[reply]
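As an illustration of how some of the rules listed above could be encoded, here is a rough sketch that generates candidate worldradiohistory.com URLs to test. It is not the bot's rule code: the function name, the example path, and the choice of https are assumptions. The Archive-BC rule is left out because the list does not show how the rest of the path attaches, and the date-dependent “-N.pdf” rule is also omitted.

```python
import re
from typing import List

# Path-prefix rewrites taken from the list above (host included).
PREFIX_RULES = [
    ("americanradiohistory.com/hd2/IDX-Business/Music/Archive-Billboard-IDX/IDX/",
     "www.worldradiohistory.com/Archive-All-Music/Billboard/"),
    ("americanradiohistory.com/Archive-TV-Radio-Age/Issues/",
     "worldradiohistory.com/Archive-TV-Radio-Age/"),
]

# Filename fixes, applied cumulatively in this order.
FIXES = [
    lambda p: re.sub(r"(?i)-?(OCR-)?Page-[0-9]{4}", "", p),  # drop page suffix
    lambda p: re.sub(r"\.PDF$", ".pdf", p),                   # lowercase extension
    lambda p: re.sub(r"\.o\.pdf$", ".pdf", p),                # ".o.pdf" -> ".pdf"
]

def candidate_urls(url: str) -> List[str]:
    """Return candidate new URLs, most conservative first."""
    rest = re.sub(r"^https?://(www\.)?", "", url)
    for old, new in PREFIX_RULES:
        if rest.startswith(old):
            rest = new + rest[len(old):]
            break
    else:
        # No special prefix: plain host swap.
        rest = rest.replace("americanradiohistory.com", "www.worldradiohistory.com", 1)
    candidates = [rest]
    current = rest
    for fix in FIXES:
        current = fix(current)  # each fix builds on the previous result
        if current not in candidates:
            candidates.append(current)
    return ["https://" + c for c in candidates]

# Hypothetical path, for illustration only.
for c in candidate_urls("http://www.americanradiohistory.com/hd2/IDX-Business/Music/"
                        "Archive-Billboard-IDX/IDX/1980/1980-01-05-OCR-Page-0012.pdf"):
    print(c)
```

Each candidate would still need a status check against the live site; the first one that returns a working page is the one to keep.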

The results for Pass 2 (pages with {{dead link}}) are disappointingly low. Possibly due to my rules, and I think the site is just lots of edge cases. If you want to keep trying, below is the code I used for the rules. You can post the code to AI. Ask it to add new rules; send the new rules code block to me, and I’ll re-run it. — GreenC 05:38, 12 May 2026 (UTC)[reply]

@GreenC: My AI is telling me that the code was bugged since the nested sub is modifying the original rather than the modified copy and it had me change a line like “sub(“(?i)-?(OCR-)?Page-[0-9]{4}”, “”, d)” into “sub(“(?i)-?(OCR-)?Page-[0-9]{4}”, “”, dd)”. I’ve also added a couple new rules:

I believe the new rules that I’ve added are

  • a more general handling of hd2 (which indicates specific pages cited)
  • A new rule for Archive-BC domains
  • Try replacing BB- with Billboard-
  • Try adding -N.pdf when it is between the relevant dates for Billboard Magazine.

Also, one thing that it was unsure about is whether Nim/Awk used \1 or $1. This was only used in 2 places (and it said “Note:” before those instances).

Could you try it out when you get the chance? Thanks! GrapesRock (talk) 10:50, 14 May 2026 (UTC)[reply]

GrapesRock: This is great, thank you! I wrote the awk.nim library and sub() does not support capture groups but gsub() does. I need to fix sub(). Also this rules code is not my best day, it was a quick fix and that d vs. dd is a clear bug. Thanks for running it through AI and for the new updates. I’ll make a Pass 3. — GreenC 21:24, 14 May 2026 (UTC)[reply]
Pass 3 done. Not sure how much further you want to go. Could be many edge cases spread thinly. Or legit dead links. — GreenC 05:42, 15 May 2026 (UTC)[reply]

Enwiki

Pass 1 Checked 8,721 pages and edited 8,652 pages. Moved 18,531 links to a new URL: 11,647 normal redirects, 6,884 ruled mapped redirects, Resolved 6 soft-404s. Removed 177 {{dead link}}. Added 3,501 {{dead link}}. Switched 25 |url-status=dead to live. Switched 13 |url-status=live to dead.
I’ll make an additional pass or passes on the 3,501 {{dead link}} – some might be fixable by looking for worldradiohistory.com versions at the Wayback Machine.
Pass 2 Checked 1,718 pages and edited 93 pages. Moved 127 links to a new URL: 127 ruled mapped redirects, Resolved 4 soft-404s. Removed 85 {{dead link}}. Added 3,399 {{dead link}}.
Pass 3 Checked 1,718 pages and edited 359 pages. Moved 689 links to a new URL: 689 ruled mapped redirects, Resolved 4 soft-404s. Removed 625 {{dead link}}. Added 2,713 {{dead link}}.

IABot DB

  • Checked about 18,000 links and updated about 3,200

rappler.com

At some point they switched from Day-Month-Year to -Month-Day-Year in their urls. For instance this is now present here. The final word may need to be pluralized such as this to this (this could be somebody accidentally deleted “s-” though? not sure). Seems like they sometimes change the sections too (nation -> philippines on one of them), but from what I’ve checked there’s 301’s available after switching to MDY.

Around 9.3K articles on the domain. GrapesRock (talk) 20:53, 7 May 2026 (UTC)[reply]

Enwiki

IABot DB

  • Not done, due to pct of dead links vs. number working via redirects.

 Done GreenC 02:01, 13 May 2026 (UTC)[reply]

money.cnn.com

Fortune magazine links were converted in 2024 to money.cnn.com. However, this subdomain is now broken. I did not find any new URLs. ~7600. Thank you! MrLinkinPark333 (talk) 00:27, 10 May 2026 (UTC)[reply]

Major loss. According to AI: Natively, the content can only be found in two places: The Fortune Digital Archive: Available exclusively to paid subscribers who log into fortune.com and navigate to the “E-Magazine” viewer (which is powered by PressReader). Institutional Databases: Licensed out cover-to-cover to academic and library repositories, primarily EBSCO’s Fortune Magazine Archive. That leaves Archives as the last option. The archives might be under archive.fortune.com or money.cnn.com .. the path portion of the URL remains the same. — GreenC 20:52, 10 May 2026 (UTC)[reply]
archive.fortune.com leads to a paywall, so I don’t know if they’re accessible there. Money.cnn.com redirects to CNN Business. You might get lucky with ghost archives if they were moved to either site. MrLinkinPark333 (talk) 21:58, 10 May 2026 (UTC)[reply]
Treated as a dead site, everything archived. Did a sampling test run, no ghost redirs detected. — GreenC 19:04, 14 May 2026 (UTC)[reply]

Enwiki

  • Checked 7,656 pages and edited 7,013 pages. Added 357 {{dead link}}. Switched 2,194 |url-status=live to dead. Added 6,765 archive URLs (6,765 Wayback).

IABot DB

  • Checked 12,022 links and updated 12,016

 Done GreenC 01:35, 15 May 2026 (UTC)[reply]

bt.com.bn

Redirects to a gambling site. Most of them already have archives, but a few like Telisai–Lumut Highway don’t. I request adding this to the JUDI list. 460. Thank you! MrLinkinPark333 (talk) 20:24, 10 May 2026 (UTC)[reply]

ω Awaiting next WP:JUDI batch. — GreenC 16:30, 15 May 2026 (UTC)[reply]

archaeology.org

Articles before December 10, 2012 are broken. Luckily, they can be changed from archaeology.org to archive.archaeology.org. Example is this to that for Archaeology of Israel. This also works for their magazine archives like this to that for Rome. I think it’d be easier to check the entire site and change the urls to archive.archaeology.org to see if it’ll fix the ones that are broken. Some of these already have archived copies. ~1270. Thanks! MrLinkinPark333 (talk) 21:52, 10 May 2026 (UTC)[reply]

Enwiki

IABot DB

  • Checked about 1,000 URLs and updated about 250

 Done GreenC 00:45, 16 May 2026 (UTC)[reply]

shtetlinks.jewishgen.org

They redirect to kehilalinks.jewishgen.org. If there is www. in the subdomain then it doesn’t automatically redirect, and you must remove the www. such as this into this which now redirects here. Thanks!

One thing I noticed about the broader jewishgen.org domain is that there was at least one cite to a mere Wikipedia mirror (namely this which was on Olivia Newton-John). I don’t think anything particularly needs to be done about that fact, it just felt remiss not to mention it.

Only around 160 on the shtetlinks sub-domain. There are also around 1700 articles on the broader domain. GrapesRock (talk) 11:41, 11 May 2026 (UTC)[reply]

Enwiki

  • Checked 162 pages and edited 152 pages. Moved 178 links to a new URL: 178 ruled mapped redirects, Removed 1 {{dead link}}. Switched 13 |url-status=dead to live. Added 4 archive URLs (4 Wayback).

IABot DB

  • Checked and updated 124 URLs

 Done GreenC 02:03, 16 May 2026 (UTC)[reply]

thisislondon.co.uk

The Standard (London newspaper) has rebranded from thisislondon.co.uk to standard.co.uk.

this has moved to here

this has moved to here

DiophantineEquation (talk) 16:50, 12 May 2026 (UTC)[reply]

This was fairly difficult because there are multiple layers of changes made over time. I was able to make a significant portion live at the Standard site. For those that remain, there is no clear way to find the new link, or the content no longer exists. — GreenC 19:42, 16 May 2026 (UTC)[reply]

Enwiki

Pass 1: Checked 1,922 pages and edited 523 pages. Moved 180 links to a new URL: 136 ruled mapped redirects, 44 ruled inferred mapped redirects, Resolved 1,950 soft-404s. Removed 1 {{dead link}}. Added 60 {{dead link}}. Switched 34 |url-status=dead to live. Switched 25 |url-status=live to dead. Added 293 archive URLs (293 Wayback).
Need two passes to search Wayback CDX records for slug segments eg. tackle-the-nhs-morphine-crisis which could be available in one domain (as a 301) or the other (as a 200).
Pass 2: Checked 1,922 pages and edited 712 pages. Moved 771 links to a new URL: 1 normal redirects, 770 ruled inferred mapped redirects, Resolved 1,786 soft-404s. Removed 3 {{dead link}}. Added 3 {{dead link}}. Switched 595 |url-status=dead to live. Switched 1 |url-status=live to dead. Added 4 archive URLs (4 Wayback).
Pass 3: Checked 1,922 pages and edited 400 pages. Moved 434 links to a new URL: 434 ruled inferred mapped redirects, Resolved 1,080 soft-404s. Removed 5 {{dead link}}. Switched 309 |url-status=dead to live. Added 3 archive URLs (3 Wayback).
Pass 3 w/ additional conversion rules.

IABot DB

  • Updated about 1,400 URLs

 Done GreenC 20:55, 16 May 2026 (UTC)[reply]

pe.com

If you change “pe.com” to “pressenterprise.com” then for some they successfully 301, some they are at the url already, and some don’t seem to exist.

  • For already being at the url: this to this is just the new url.
  • For not existing this.
  • For 301-ing: this turned into this redirects here.

Around 1250 articles. GrapesRock (talk) 11:21, 14 May 2026 (UTC)[reply]

Enwiki

  • Checked 1,157 pages and edited 853 pages. Moved 943 links to a new URL: 943 ruled mapped redirects, Resolved 3 soft-404s. Removed 18 {{dead link}}. Added 88 {{dead link}}. Switched 212 |url-status=dead to live. Switched 18 |url-status=live to dead. Added 184 archive URLs (184 Wayback).

IABot DB

  • Checked about 2,000 URLs

 Done GreenC 00:27, 17 May 2026 (UTC)[reply]

marvel.com

I’ve come across a lot of marvel.com pages which are dead such as this. Not sure there is a new URL for these. Gonnym (talk) 11:33, 14 May 2026 (UTC)[reply]

Enwiki

IABot DB

  • Checked 12,615 and updated 6,553

 Done GreenC 16:35, 17 May 2026 (UTC)[reply]

abc.com

I’ve come across a lot of abc.com pages which are dead such as this. Not sure there is a new URL for these. Gonnym (talk) 11:33, 14 May 2026 (UTC)[reply]

For better or worse, ABC changed its URL structure to a 36-char UUID: this is now this. ABC is part of a large holding company (Disney) with many properties. They do this for a number of good technical reasons, but it creates problems for us at Wikipedia. The URL is no longer humanly associated with the content. They can change the content of the page without needing to change the URL itself, ie. renaming the show, the date of publication or where the page routes. This makes fixing content drift and link rot more difficult since the URL is not decipherable. For example, I can no longer search the WaybackMachine index for “good-morning-america” to find a new URL location after it moved. They may be doing the same thing at Disney+, Hulu, ESPN+, FX. — GreenC 17:40, 17 May 2026 (UTC)[reply]
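
To illustrate why UUID-style URLs defeat slug searches, here is a small sketch that flags URLs whose last path segment is a 36-character UUID. The function name and the sample URLs in the usage note are hypothetical; ABC's real path layout is not assumed.

```python
import re
from urllib.parse import urlsplit

# 8-4-4-4-12 hexadecimal groups: 36 characters including the hyphens.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def last_segment_is_uuid(url):
    """True when the final path segment is an opaque UUID, not a slug."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return bool(segments) and UUID_RE.match(segments[-1]) is not None
```

A URL ending in `good-morning-america` returns False and can still be searched by slug; one ending in something like `123e4567-e89b-12d3-a456-426614174000` returns True and carries no words to search for.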

Enwiki

IABot DB

  • Checked 741 URLs and updated 314

 Done GreenC 21:04, 17 May 2026 (UTC)[reply]

insidetv.ew.com

Pages using insidetv.ew.com such as this are completely dead. Gonnym (talk) 11:34, 14 May 2026 (UTC)[reply]

Enwiki

  • Checked 1,909 pages and edited 1,524 pages. Added 15 {{dead link}}. Switched 534 |url-status=live to dead. Added 1,637 archive URLs (1,637 Wayback).

IABot DB

  • Checked and updated 3,335 URLs

 Done GreenC 02:08, 18 May 2026 (UTC)[reply]

rawstory.com

If you change “rawstory.com/rs” to just “rawstory.com/” then from what I’ve checked it successfully redirects. For instance this to this (it also gets rid of the day, but I haven’t found a case where it doesn’t successfully redirect from Y/M/D to Y/M so it’s probably unnecessary to code that in?). Thanks!

Around 200 articles on the /rs path and around 700 articles on rawstory.com total. GrapesRock (talk) 11:36, 14 May 2026 (UTC)[reply]
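
The path change requested here could be sketched as below. It only drops the leading `/rs` segment and leaves the date portion alone, on the request's own observation that the site redirects Y/M/D to Y/M by itself. The function name and the sample URL in the usage note are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

RAWSTORY_HOSTS = {"rawstory.com", "www.rawstory.com"}


def drop_rs_prefix(url):
    """Remove a leading '/rs' path segment from rawstory.com URLs."""
    parts = urlsplit(url)
    if parts.hostname not in RAWSTORY_HOSTS:
        return url
    if parts.path == "/rs":
        new_path = "/"
    elif parts.path.startswith("/rs/"):
        new_path = parts.path[len("/rs"):]
    else:
        return url  # e.g. '/rsvp/...' must not be touched
    return urlunsplit((parts.scheme, parts.netloc, new_path, parts.query, parts.fragment))
```

For instance, `http://www.rawstory.com/rs/2011/05/04/example-story/` would become `http://www.rawstory.com/2011/05/04/example-story/`.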

Enwiki

IABot DB

  • Checked 1,380 URLs and updated about 800

 Done GreenC 15:42, 18 May 2026 (UTC)[reply]

fivethirtyeight.com

  • fivethirtyeight.com: 1,927
  • abcnews.go.com/538: 82

A former senior editor at FiveThirtyEight just reported that “ABC News has now taken all FiveThirtyEight articles completely offline. They now redirect to abcnews dot com.politics”. Here are a few examples of this redirect in Wiki sources:

Thanks! Sariel Xilo (talk) 17:18, 15 May 2026 (UTC)[reply]

Note: ABC previously did a partial takedown of articles & those links were fixed (Wikipedia:Link rot/URL change requests/Archives/2025/March#fivethirtyeight.com) Sariel Xilo (talk) 17:22, 15 May 2026 (UTC)[reply]

This is what AI had to say about the shutdown:

GreenC 22:45, 15 May 2026 (UTC)[reply]

These articles were under two domains (links to IA listings): fivethirtyeight.com from the blog’s start in 2008 until September 2023, and then abcnews.go.com/538 from then until it shut down. There are also some third-level domains like projects.fivethirtyeight.com and data.fivethirtyeight.com, and probably others. Antony–22 (talk • contribs) 04:50, 17 May 2026 (UTC)[reply]

Enwiki

  • Checked 1,928 pages and edited 1,017 pages. Added 14 {{dead link}}. Switched 341 |url-status=live to dead. Added 1,133 archive URLs (1,133 Wayback).

IABot DB

  • Updated 2,700 URLs

abcnews.go.com/538

Enwiki
  • Checked 82 pages and edited 80 pages. Added 2 {{dead link}}. Switched 23 |url-status=live to dead. Added 74 archive URLs (74 Wayback).
IABot DB
  • Updated 58 URLs

 Done GreenC 00:40, 19 May 2026 (UTC)[reply]

londongardensonline.org.uk

This was a charity directory that changed domain. The domain is now owned by a company. URLs in the format http://www.londongardensonline.org.uk/gardens-online-record.asp?ID=THM001 should be changed to https://londongardenstrust.org/conservation/inventory/site-record/?ID=THM001 as here. MRSC (talk) 18:22, 17 May 2026 (UTC)[reply]
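
The mapping described in this request can be sketched as a small function. The two URL formats are taken from the request itself; the function name and the choice to return None for anything else on the old domain are assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

OLD_HOSTS = {"londongardensonline.org.uk", "www.londongardensonline.org.uk"}
NEW_BASE = "https://londongardenstrust.org/conservation/inventory/site-record/"


def map_record_url(url):
    """Map an old gardens-online-record.asp?ID=... URL to the new site.

    Returns None when the URL is not a record page with an ID, so other
    pages on the old (now company-owned) domain are not rewritten.
    """
    parts = urlsplit(url)
    if parts.hostname not in OLD_HOSTS:
        return None
    if parts.path.lower() != "/gardens-online-record.asp":
        return None
    ids = parse_qs(parts.query).get("ID")
    if not ids:
        return None
    return NEW_BASE + "?" + urlencode({"ID": ids[0]})
```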

Enwiki

  • Checked 388 pages and edited 385 pages. Moved 443 links to a new URL: 443 ruled mapped redirects, Removed 15 {{dead link}}. Switched 127 |url-status=dead to live. Switched 1 |url-status=live to dead. Added 2 archive URLs (2 Wayback).

IABot DB

  • Checked and updated 385 URLs

 Done GreenC 18:35, 19 May 2026 (UTC)[reply]

bbm.ca usurped

Links to pages like https://www.bbm.ca/_documents/top_30_tv_programs_english/2013-14/2013-14_09_23_TV_ME_NationalTop30.pdf should be marked as usurped as the live bbm.ca is a completely different website (see archive https://web.archive.org/web/20131005081516/https://www.bbm.ca/_documents/top_30_tv_programs_english/2013-14/2013-14_09_23_TV_ME_NationalTop30.pdf). BBM.ca changed to Numeris. Gonnym (talk) 06:26, 18 May 2026 (UTC)[reply]

ω Awaiting next WP:JUDI batch — GreenC 02:37, 19 May 2026 (UTC)[reply]

www.newsday.com

I found that http://www.newsday.com/entertainment/tv/arrow-is-off-to-a-fast-start-1.4089299 is now located at https://www.newsday.com/entertainment/tv/arrow-is-off-to-a-fast-start-d33660. Gonnym (talk) 07:07, 18 May 2026 (UTC)[reply]

5,985 pages + download Wayback CDX records (apx 12 hrs)
This will have two passes. The first will add archive URLs to dead links. The second will attempt to move dead links to live where possible. This process is partly probabilistic rather than deterministic, so the bot errs on the side of false negatives (ie. skipping conversions) to avoid false positives (converting to the wrong URL). The former has the backstop of an archive URL; the latter has no backstop, it is simply a wrong URL. End result: it can’t make as many conversions as would otherwise be possible. — GreenC 03:03, 27 May 2026 (UTC)[reply]
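
The trade-off described above, preferring a skipped conversion over a wrong one, can be sketched like this. It is not the bot's method: the similarity measure (`difflib` ratio on slugs), the 0.9 threshold, and the names `slug_similarity` and `decide` are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def slug_similarity(a, b):
    """Similarity of two slugs in [0, 1]; 1.0 means identical."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def decide(old_slug, candidates, threshold=0.9):
    """Move the link only when exactly one candidate is a confident match.

    candidates is a list of (slug, url) pairs. Zero or several confident
    matches both mean "skip": the link keeps its archive URL instead.
    """
    confident = [
        url for slug, url in candidates
        if slug_similarity(old_slug, slug) >= threshold
    ]
    if len(confident) == 1:
        return ("move", confident[0])
    return ("archive", None)
```

Raising the threshold trades more skipped links (still covered by an archive URL) for fewer wrong destinations.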

Enwiki

Pass 1: Checked 5,998 pages and edited 4,021 pages. Moved 27 links to a new URL: 7 normal redirects, 20 ruled mapped redirects, Resolved 1,052 soft-404s. Removed 1 {{dead link}}. Added 408 {{dead link}}. Switched 2 |url-status=dead to live. Switched 926 |url-status=live to dead. Added 3,941 archive URLs (3,941 Wayback).
Pass 2: Checked 5,998 pages and edited 2,217 pages. Moved 2,878 links to a new URL: 1 normal redirects, 3 ruled mapped redirects, 2,874 ruled inferred mapped redirects, Resolved 74 soft-404s. Removed 17 {{dead link}}. Added 387 {{dead link}}. Switched 2,657 |url-status=dead to live. Switched 8 |url-status=live to dead.


IABot DB

  • Updated 8,112 URLs

 Done GreenC 19:04, 27 May 2026 (UTC)[reply]

www.boston.com

https://www.boston.com/ae/tv/2012/10/09/arrow-falls-short-bullseye/r1BsuzTM8jDY923XdlOJFI/story.html is now at https://www.boston.com/culture/tv/2012/10/09/arrow-falls-short-of-a-bullseye/ Gonnym (talk) 07:08, 18 May 2026 (UTC)[reply]

14,929 pages + download Wayback CDX records (12+ hrs)

Enwiki

IABot DB

  • Checked 28,000 URLs and updated 20,000

 Done – this was a mountain of URLs, most needed fixing, and about two-thirds could be moved to a new URL via CDX inference mapping. — GreenC 16:02, 29 May 2026 (UTC)[reply]

Per the TFD this template will need to be “unwound” with archive URLs added. Primefac (talk) 12:14, 18 May 2026 (UTC)[reply]

  • Checked 3,310 pages and edited 3,310 pages. Converted 3,339 templates. Added 3,125 archive URLs (3,125 Wayback).

 Done GreenC 01:52, 30 May 2026 (UTC)[reply]

Awesome, thanks. Primefac (talk) 10:21, 30 May 2026 (UTC)[reply]

mahasz.hu

These Hungarian charts no longer work but most are at slagerlistak.hu. Most can be converted over to slagerlistak.hu/chart-name/year/week or slagerlistak.hu/archivum/eves-osszesitett-slagerlistak/chart-name/year with the following:

Weekly charts (year/week)

Year-end

Broken:

  • DVD charts: These two are broken. They can’t be converted as they are no longer on slagerlistak.hu.

I didn’t include ~60 of them because they need manual adjustments. If I can’t replace them, I’ll make a new request. 270 Thanks! MrLinkinPark333 (talk) 20:09, 18 May 2026 (UTC)[reply]


Enwiki

  • Checked 390 pages and edited 334 pages. Moved 380 links to a new URL: 380 ruled mapped redirects, Removed 7 {{dead link}}. Switched 60 |url-status=dead to live. Added 3 archive URLs (3 Wayback).

IABot DB

  • Checked and updated 2,200 URLs

 Done GreenC 03:49, 30 May 2026 (UTC)[reply]

Which three didn’t work? MrLinkinPark333 (talk) 03:56, 30 May 2026 (UTC)[reply]

From the logs:

This Boy's Fire----https://web.archive.org/web/20120219203200/http://www.mahasz.hu/ ---- fixbadstatus1.1 (old logbadstatus1)
No Quarter: Jimmy Page and Robert Plant Unledded----https://web.archive.org/web/20100826043127/http://www.mahasz.hu/ ---- fixbadstatus1.1 (old logbadstatus1)
Édes méreg----https://web.archive.org/web/20051018090219/http://www.mahasz.hu/m/hu/arany_keres.php?EV=2005 ---- fixbadstatus3.3 (old barelink-modify)

GreenC 02:38, 31 May 2026 (UTC)[reply]

archive.org/details

This is a bot process to remove or move links at archive.org/details/ID when the ID is no longer available.

For example this:

Seligman, M. E. P. (2011). Flourish: A Visionary New Understanding of Happiness and Well-Being. New York: Free Press. ISBN 978-1-4391-9076-0.

Becomes this:

Seligman, M. E. P. (2011). Flourish: A Visionary New Understanding of Happiness and Well-Being. New York: Free Press. ISBN 978-1-4391-9076-0.

Because this:

https://archive.org/details/flourishvisionar0000seli

No longer works.

Notes
  • Estimate about 5% of archive.org/details need removal or modification.
  • This is a 1-time batch job.
  • Books are unavailable for many reasons, both technical and policy-related.
  • IDs sometimes move – same book, new ID.
  • Google Books has the same problem, but nobody maintains those links systematically; the rate is higher than 5%, and the quantity of GB links exceeds IA by at least 2x.

GreenC 00:30, 19 May 2026 (UTC)[reply]

haaretz.com

Many links seem to redirect to other links and some of them may have been dead at some point. For instance this from 2011 in Israel was marked dead way back in September 2011. It now 301’s here. Thanks!

Just over 12000 articles. GrapesRock (talk) 11:18, 19 May 2026 (UTC)[reply]

Perhaps the example was marked dead because the live redirect is a paywall, while the dead has a working archive. — GreenC 00:39, 28 May 2026 (UTC)[reply]
I did a dry run of the entire set, and almost all the links are of the type this, ie. ending in 1.xxxxx. These links were at one time not behind a paywall, and so are currently available at archive.org in full, but most do not have archive.org links on Wikipedia. So if I moved the primary link to the new redirect here, we would no longer know what the archive.org link was for the old 1.xxx, and the page would become locked behind a paywall at the new link. I will instead treat all the 1.xxx as dead, so archive URLs are added, and the content remains visible. In this case upgrading to a live link is a step backwards. — GreenC 17:45, 30 May 2026 (UTC)[reply]
A common exception: when the url contains “.premium” (example), meaning the 1.xxx url was always paywalled and the archive.org version will also be paywalled. In these cases the url will be moved to the new live URL here, and no archive added. — GreenC 18:44, 30 May 2026 (UTC)[reply]
This solution had about a 2:1 ratio: for every 2 links that have a working archive, 1 link was moved to a live (paywalled) URL. — GreenC 17:52, 31 May 2026 (UTC)[reply]
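
The rule described in the comments above can be sketched as a small decision function. The regular expression for the old “1.xxx” ending is an assumption based on the description, as are the function name and the action labels; the URLs in the usage note are made up.

```python
import re

# Old-style article IDs end in "1." followed by digits, preceded by "/" or "-".
# The exact shape is an assumption drawn from the discussion.
OLD_ID = re.compile(r"[/-]1\.\d+$")


def haaretz_action(old_url):
    """Choose how to treat an old haaretz.com link."""
    path = old_url.split("?", 1)[0].rstrip("/")
    if ".premium" in path:
        # Always paywalled, so the archive would be paywalled too.
        return "move-to-live"
    if OLD_ID.search(path):
        # Once freely readable: keep the old URL and add its archive.
        return "mark-dead-add-archive"
    return "check-normally"
```

A link such as `https://www.haaretz.com/news/example-story-1.123456` would be marked dead with an archive added, while `https://www.haaretz.com/.premium-1.654321` would be moved to its live redirect target.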

Enwiki

Batch 00001-00100: Checked 100 pages and edited 67 pages. Moved 4 links to a new URL: 4 normal redirects, Added 1 {{dead link}}. Switched 18 |url-status=live to dead. Added 110 archive URLs (110 Wayback).
Batch 00101-00200: Checked 100 pages and edited 82 pages. Moved 48 links to a new URL: 48 normal redirects, Added 1 {{dead link}}. Switched 6 |url-status=dead to live. Switched 21 |url-status=live to dead. Added 81 archive URLs (81 Wayback).
Batch 00201-01000: Checked 800 pages and edited 609 pages. Moved 355 links to a new URL: 347 normal redirects, 2 ruled mapped redirects, 6 ghost mapped redirects, Resolved 3 soft-404s. Added 22 {{dead link}}. Switched 4 |url-status=dead to live. Switched 151 |url-status=live to dead. Added 683 archive URLs (683 Wayback).
Batch 01001-11968: Checked 10,968 pages and edited 8,393 pages. Moved 5,219 links to a new URL: 5,115 normal redirects, 10 ruled mapped redirects, 94 ghost mapped redirects, Resolved 35 soft-404s. Removed 10 {{dead link}}. Added 149 {{dead link}}. Switched 123 |url-status=dead to live. Switched 1,597 |url-status=live to dead. Added 8,745 archive URLs (8,745 Wayback).

IABot DB

  • Checked 36,119 URLs and updated 23,100

 Done GreenC 01:21, 2 June 2026 (UTC)[reply]

Dear GreenC, I am confused. I have recently encountered a few of these bot edits (citing the present talk page discussion) and in all of them, the redirect on the haaretz website is live and the correct article. (and indeed: “paywalled” now: you need to make a free account to continue reading). I think something must have gone wrong. I have reverted one or more of these automated edits. If I should not have, please inform me. Slomo666 (talk) 14:01, 4 June 2026 (UTC)[reply]
Despite the alleged exception for the premium articles, I have seen at least one .premium article that the bot did set to dead. This saga baffles me, and I just hope that articles accidentally being set to dead is the worst of it. Slomo666 (talk) 14:25, 4 June 2026 (UTC)[reply]
Right it was intentional. Discussed above. Freewall, Paywall, etc.. they all have the same effect: The Wayback Machine can’t save them. Had I switched it over to the live walled link, what happens in a few years when the link dies (it will): there will be no way to view the content again. There is no archive URL, because the Wayback Machine can’t save walled pages. The way it was done, the old links are still available at the old archive URLs, they were saved at Wayback before Haaretz put up walls. We are lucky the Wayback captures exist. When the live walled links die in a few years, there will be nothing to save them from becoming permanent 404. — GreenC 08:07, 5 June 2026 (UTC)[reply]
No, what I am saying is that your bot set a bunch of already archived citation URL’s from “live” to “dead” after the respective archive URL’s (which were not inaccessible!) were already added to the citation template.
I understand the bot cannot retroactively archive things that are no longer live, (or in this case: behind a registration-wall) but that is not what I am saying it should have done or what it did.
If you want examples, look at this: I reverted your bot’s edit, because all of those URL’s, while they redirect to the new haaretz website, actually do link to a live version of the same articles, and there was thus no need to reset the “url-status=live” to “url-status=dead”.
I think this is problematic, because we generally do not want to set url-status to dead unless it actually is. I think the bot should not be doing what I described, and instead, only set, for those refs where the archive-link is already filled, but the new url is behind the wall, the ‘url-access’ to something. (in these examples, I set it to ‘registration’, because that was what it was, but I can imagine it may be difficult for the bot to tell the difference between ‘limited’, ‘ registration’ and ‘subscription’)
Slomo666 (talk) 15:51, 5 June 2026 (UTC)[reply]
I had not really thought through all the permutations at play; this was a complicated site with a lot of variation and edge cases. But I am also not too worried about it, because I know from experience that redirects are fragile and the first to go; those redirects to the live site will disappear sooner rather than later. Still, in the example you provided: would you prefer to go here or here? The first takes 99% of readers to a 0% readable page (signup required). The second takes 100% of readers to a 100% readable page (no signup required). Setting to dead will be problematic for haaretz owners, and for those who have accounts on the website. It will be pragmatic for most everyone else. Had I done it the other way, it would have been problematic for the majority, as they would be redirected to walled-off content. You could say none of that matters, only that dead is dead and live is live, but I am not sure I agree that’s always in the best interest, and time will resolve it anyway. — GreenC 06:57, 6 June 2026 (UTC)[reply]
Personally, I prefer to go to a source that is available, of course. However, I assumed (and maybe I am wrong about this) that marking URL’s that are not dead as dead was against one or more of our guidelines. I recall there being some resistance to expansion of the ‘url-access-level’ parameter, (I might be confusing it, and maybe it was a discussion on adding a new option to the url-status param) based on the idea that it is not up to wikipedia to censure sources for not being accessible freely. I do think we should have the ability to work into the template that a source link now redirects, but I think the current options (‘deviated’ and ‘unfit’) are really only for when it redirects to a harmful URL or a website that does not support the same content anymore. Is there a venue where I could ask about how this policy works? Teahouse maybe? Slomo666 (talk) 12:24, 8 June 2026 (UTC)[reply]

archive.*.co.uk

A bunch of British newspaper archives changed their url schemes and now none of them link directly to the source and instead redirect to a page with all the articles from the relevant day. Since they all have small numbers of pages, I’ve been manually replacing marked dead links with alternative links. However, the corresponding IABot entries should be updated to mark these domains as dead I think? Thanks!

These are the ones that I’ve found so far; there’s probably others, but these are the ones I could find with “insource:"co.uk" insource:"archive" insource:/archive\.[a-z0-9-]+\.co\.uk\/[0-9]{4}/”

GrapesRock (talk) 11:22, 20 May 2026 (UTC)[reply]
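
For anyone checking wikitext offline, the same pattern from the search above can be applied in Python. This is only a sketch: the function name is hypothetical, and the sample domain in the usage note is invented rather than one of the affected newspapers.

```python
import re

# Same expression as the insource: regex; "/" needs no escaping in Python.
ARCHIVE_CO_UK = re.compile(r"archive\.[a-z0-9-]+\.co\.uk/[0-9]{4}")


def find_archive_links(wikitext):
    """Return every archive.<name>.co.uk/<4 digits> match in the text."""
    return ARCHIVE_CO_UK.findall(wikitext)
```

Given text containing `http://archive.examplepaper.co.uk/2009/5/1/123.html`, the function returns `["archive.examplepaper.co.uk/2009"]`; a link to `archive.org` does not match.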

GrapesRock, unlike the WP:JUDI process, the WaybackMedic process can not do batches of domains. Each domain has to be individually configured and processed. (There is bespoke work to catch soft-404s and redirect rules by monitoring logs, it’s impossible to automate certain things safely.) I checked a couple and some redirect properly to the original article. I’m also not sure marking them dead in IABot is right. So, I’m not sure how to approach this. It’s a bit much in quantity of domains, and low link count for each domain. — GreenC 04:45, 28 May 2026 (UTC)[reply]

uk-sport-web.prod.oceanusorigin.com

Replace that unwieldy domain with skysports.com. For example this to this.

Around 70 pages. GrapesRock (talk) 17:14, 21 May 2026 (UTC)[reply]

news.yahoo.co.jp/articles

sigh… this widely used news aggregator website loves to kill its links in just a few days. (1 Links Spamcheck). Basically any source older than 7 days can be dead, and it will almost certainly be dead within one month.

Note: I proposed at WP:IDEALAB a bot that automatically archives this sort of website when links are added. See Wikipedia:Village pump (idea lab)#Bot for automatic source rescue Warm Regards, Miminity (Talk?) (me contribs) 06:16, 23 May 2026 (UTC)[reply]

post-gazette.com

I requested an archive check on some of these links a few months ago. Recently, found a broken magazine article by them for Midnight Ramble (film). I think all articles by this newspaper should be checked now. ~8700. Thank you! MrLinkinPark333 (talk) 19:08, 23 May 2026 (UTC)[reply]

mixonline.com

This broken link is now here for Flow (Terence Blanchard album). Unfortunately, it does not redirect. I would like to request a full check. 1290. Thank you! MrLinkinPark333 (talk) 19:13, 23 May 2026 (UTC)[reply]

Links to individual Catholic News Agency articles now all redirect to the home page https://www.ewtnnews.com/ (example here), following EWTN’s decision to redirect the Catholic News Agency URLs to the new EWTN News website.

Could you please fix this? Veverve (talk) 09:33, 25 May 2026 (UTC)[reply]

justice.gov/usao-dc/

Department of Justice Press Releases deleted by the Trump Administration related to January 6 riots.[4] GreenC 20:47, 25 May 2026 (UTC)[reply]

wtvq.com

Website of WTVQ-DT, a local TV station in Kentucky. Domain is parked and the content on it is not loading, likely because the station has been sold. About 160 pages. Sammi Brie (she/her · t · c) 02:32, 26 May 2026 (UTC)[reply]

If I pick a few at random they seem to work. Perhaps it was temporary? Neils51 (talk) 21:09, 27 May 2026 (UTC)[reply]

philrobson.net

Website of Phil Robson, a jazz musician, now loads a Korean blog. While I don’t read Korean, it seems to have nothing to do with Phil, jazz, or even music. He has another website instead: https://www.philrobsonmusic.com/ — Preceding unsigned comment added by Bevande (talk • contribs) 06:45, 30 May 2026 (UTC)[reply]

observationdeck.io9.com

http://observationdeck.io9.com/ is completely dead (see link). Gonnym (talk) 16:49, 1 June 2026 (UTC)[reply]

tv.com

http://www.tv.com is completely dead. (see link). Gonnym (talk) 16:59, 1 June 2026 (UTC)[reply]

The archives also aren’t loading for some reason (see link). Gonnym (talk) 17:07, 1 June 2026 (UTC)[reply]

brokenpencil.com

  • brokenpencil.com: 48

Broken Pencil shut down in 2024; all of the website’s links prompt a sign-in that, if declined, goes to a 401 Authorization Required. Haven’t done an in-depth check, but ProQuest seems to have some of the magazine saved if the Wayback doesn’t (or if the Wayback has saved a subscription-needed page). For example, https://brokenpencil.com/article/journeys-on-paper-jeeyon-shim-and-the-personal-resonance-of-tabletop-rpgs/ can be found at ProQuest 3108820147. Sariel Xilo (talk) 17:44, 2 June 2026 (UTC)[reply]

monumentaustralia.org.au

Has been usurped by a gaming website.

It appears that http://monumentaustralia.org still works, and the pages I tested follow the same url format, so having the bot remove the .au should fix things. In solidarity, nil nz 04:35, 4 June 2026 (UTC)[reply]

thenortheasttoday.com

Found this soft 404 to the main page at Lai Haraoba. Looking around their website, only some of the old articles are there. I think a full check is needed. ~240. Thanks! MrLinkinPark333 (talk) 23:21, 5 June 2026 (UTC)[reply]

Massachusetts Election pages

* Pppery * (alt) in solidarity 23:28, 10 June 2026 (UTC)[reply]

tvguide.com

Links lead to page not found, see this example. For that example I’ve found this:

but couldn’t find one for

Gonnym (talk) 10:47, 15 June 2026 (UTC)[reply]

tvline.com

Gonnym (talk) 10:53, 15 June 2026 (UTC)[reply]