Nearly 40% of web pages from 10 years ago are no longer accessible

Every young person is undoubtedly given the advice: Be careful what you put on the internet because the internet is forever.

This advice is pretty good. Posting online can still have grave consequences, from getting suspended from school to losing your job. But life online might not be quite as eternal as we think.

According to new research from the Pew Research Center, 38 percent of web pages that existed in 2013 are no longer accessible, and a quarter of all web pages that existed at some point between 2013 and 2023 are no longer available. The trend is more pronounced for older content, which, I suppose, does make sense. By comparison, just eight percent of pages that existed in 2023 are no longer available.

This phenomenon is called “digital decay”: links to content across the internet, on government and news websites, in the “references” sections of Wikipedia articles, and even on X (then known as Twitter), no longer work. The 404 message is becoming all too common.

For instance, about a fifth of all tweets are no longer visible on the site a few months after being posted, because the account was made private, suspended, or deleted. Tweets written in Turkish or Arabic were more likely to vanish than tweets written in other languages.

As the Columbia Journalism Review wrote, “The fragility of the Web poses an issue for any area of work or interest that relies on written records. Loss of reference material, negative SEO impacts, and malicious hijacking of valuable outlinks are among the adverse effects of a broken URL. More fundamentally, it leaves articles from decades past as shells of their former selves, cut off from their original sourcing and context. And the problem goes beyond journalism. In a 2014 study, for example, researchers (including some on this team) found that nearly half of all hyperlinks in Supreme Court opinions led to content that had either changed since its original publication or disappeared from the internet.”

Link rot and digital decay can make some parts of the internet virtually unusable. Have you ever clicked on a news story and found that most of the tweets embedded in the post are blank and the hyperlinks are no longer active? It’s frustrating, and it can hurt our ability to understand subjects and issues in context.