What Types of Deep Web Data are There?

By Sydney Butler / February 1, 2019

These days more than half of the world uses the internet. However, most people don’t really think about all the invisible stuff going on just behind the pretty facade of a website or app. It’s easy to think of the internet as only the stuff that you can find on Google, but the truth is actually the opposite of that. Most of the content that’s part of the global internet will never show up in Google. You literally have to dig deeper in one way or another before getting to it. In essence, that's the Deep Web.

The idea of the Deep Web is one that fascinates people whenever they first hear of it. Humans are fascinated with the unknown after all. It's been an especially hot topic ever since the media began reporting on the so-called "Dark Web", with its illegal drug trade and other tawdry stories. However, we're not talking about the Dark Web here. Although that anonymized network of sites is certainly a part of the Deep Web as a whole, the overwhelming majority of the Deep Web is something else entirely. Here are the delicious(ish) details.

Scholarly Databases

If you aren't familiar with academia then you might be surprised to know that there are vast databases of academic articles stored in the Deep Web.

The vast majority of scientific and scholarly articles are not going to show up in a Google search with their full texts. Most of the time you'll get a title and a paywall. There are a lot of shenanigans when it comes to public access to scientific and scholarly writing. You'll often have to cough up more than $30 for just one article, which is an article the original author would most likely have sent you for free. That's the same author who won't see any of the money you pay to read their article!

The good news is that students and independent researchers are catching on to methods that will grab full scholarly texts from the Deep Web. If you're interested in finding the article behind the artificial barriers, a good place to start is the Unpaywall browser extension. Open-access journals are also on the rise, which is a very good thing.

Website Back-ends

When you visit a website you are really accessing a custom interface for the content on a server. While the website and some of the information it openly links to will get indexed by search engine web crawlers, much of it will not. That includes the software that the server is running and any information not directly linked from the website. The bottom line is that unless you have the admin passwords you can't get access to the back-end of the site, which is a good thing because a stranger in your site's innards can wreak havoc.
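
To make the "not directly linked" idea concrete, here is a minimal sketch of how a crawler discovers pages, assuming it starts at the home page and only follows links. The site map, page names and the crawl function are all made up for illustration; real search engine crawlers are far more involved.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
# "/admin" and "/db-backup.sql" exist on the server, but no public page links to them.
SITE = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/blog"],
    "/admin": ["/admin/users"],
    "/admin/users": [],
    "/db-backup.sql": [],
}


def crawl(site, start="/"):
    """Return every page reachable by following links from the start page."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in site.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


indexed = crawl(SITE)
hidden = set(SITE) - indexed

print(sorted(indexed))  # pages the crawler found
print(sorted(hidden))   # pages that exist but were never discovered
```

In this toy example the crawler ends up with the home page, the about page and the blog, while the admin area and the backup file never make it into the index even though they sit on the same server.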

Temporary Web Pages

There are basically two kinds of web pages. Static web pages don't change. A common example is a blog article like this one. The article stays the same every time you reload it.

Dynamic web pages, on the other hand, change in response to something. Often thanks to input from you! Web forms that are dynamically generated are one example.

As you might imagine, these dynamic pages that pop into existence and then go away after serving their purpose don't get indexed by search engines. Therefore they make up part of the Deep Web.
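
The difference can be sketched in a few lines of code. This is only an illustration under simple assumptions: the catalogue, the function names and the HTML are invented, and a real site would sit behind a web framework rather than plain functions.

```python
from html import escape

# A static page: the same content every time it is requested.
STATIC_PAGE = "<h1>What Types of Deep Web Data are There?</h1>"

# Hypothetical data that a search form might query.
CATALOGUE = ["Train to Cape Town", "Flight to Durban", "Train to Pretoria"]


def static_page():
    """Always returns the same stored page."""
    return STATIC_PAGE


def search_results_page(query, catalogue):
    """Build a results page on the fly from whatever the visitor typed."""
    matches = [item for item in catalogue if query.lower() in item.lower()]
    rows = "".join(f"<li>{escape(item)}</li>" for item in matches)
    return f"<h1>Results for {escape(query)}</h1><ul>{rows}</ul>"


print(static_page())
print(search_results_page("train", CATALOGUE))
print(search_results_page("flight", CATALOGUE))
```

The results page only exists for the moment someone submits the form, and a crawler that never types anything into that form never sees it.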

Anything Behind a Password

The web may make us feel like the internet is this big, public space. However, much of the content that is physically connected to the internet is artificially fenced off. You have to pay for a subscription, work for the company or otherwise be the right sort of user to see what lies behind that password screen.

We all do this every day. As soon as you've logged in to your email account or Netflix subscription, you've wandered into the Deep Web. Yes, that's all it takes to wander into the unknowable web. In a way, hackers exist because not all information on the net is openly available. The lure of cracking that password is largely thanks to the craving for mystery and discovery.
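
Here is a rough sketch of why a search engine never sees your inbox, assuming a server that hands out a login prompt unless the right password is supplied. The user name, password and fetch_inbox function are hypothetical, and real services use sessions, rate limiting and much more.

```python
import hashlib
import hmac
import os


def hash_password(password, salt):
    """Derive a hash so the server never stores the plain password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical user store for this example only.
SALT = os.urandom(16)
USERS = {"alice": hash_password("correct horse", SALT)}


def fetch_inbox(username=None, password=None):
    """Return (status, body) roughly the way a server might for a protected page."""
    stored = USERS.get(username)
    if stored is None or password is None:
        return 401, "Please log in"
    if not hmac.compare_digest(stored, hash_password(password, SALT)):
        return 401, "Please log in"
    return 200, f"Inbox for {username}"


print(fetch_inbox())                          # what an anonymous crawler gets
print(fetch_inbox("alice", "wrong"))          # a bad guess gets the same prompt
print(fetch_inbox("alice", "correct horse"))  # the account owner gets the content
```

A crawler arrives without credentials, so all it can ever index is the login prompt. Everything behind it stays in the Deep Web.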

Legal, Medical, Military, and other Confidential Information

Following on from the last point, there's plenty of confidential data on the Deep Web. Understandably, institutions all over the world are going digital. Old paper records are being scanned, and new records are digital from the start or very quickly converted.

The various world governments are generating billions of records. Military, medical, legal and more. These are all kept behind some pretty impressive security technologies. That makes for a pretty juicy target. Not just for small hacker groups, but for governments themselves. In some ways, the push for cyberwarfare is really a battle for the Deep Web.

No Real Mystery

It's almost a shame, but the truth of the Deep Web is actually a bit of a letdown. Unlike its much smaller cousin, the Dark Web, the Deep Web is just the unseen web. Pulling back the curtain just reveals the mundane details of how the magic is made. Still, it's better to know the truth than make up your own facts.

Have you ever searched the Deep Web? Are there other types of content we didn't list? If you liked this article please share it with your friends. Don't forget that you can follow us on Facebook and Twitter. Thanks for reading!


