Most people think of the World Wide Web as being synonymous with the internet. That’s understandable and to a lot of people, the WWW is the only internet-related place they ever see. The truth is, the Web is only one relatively small part of the internet as a whole.
The internet is, in fact, the total physical network infrastructure on which the Web exists. In other words, the Web is just one application that the internet can be used for. There are plenty of other network systems that are a part of the internet, but not the Web.
In this article, I’m going to talk about two of these non-Web internet systems. One is often referred to as the dark web and the other as the deep web. Unfortunately, people tend to use these two terms interchangeably, but they actually refer to two distinct things.
So let’s look at what the deep web and the dark web are, what they are used for and how people access them.
What is the Deep Web?
When you put a search term into Google, they don’t actually search the whole internet to find what you are looking for. Rather they keep and maintain an index of web pages. That’s actually why Google turned into the top dog. They figured out an innovative way to build, search and rank their indexes. This improved their search results and in turn, it meant more people preferred their engine. The rest, as they say, is history.
That doesn’t mean every single resource on the internet is indexed by a search engine. There are many servers or sites on servers that do not get included in search engine indexes. So regardless of what search terms you use those sites will never come up on Google or other search engines. The sum total of those non-indexed web resources is what’s referred to as the deep web.
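To make the idea of an index concrete, here is a minimal sketch in Python. The page URLs, their text and the `crawlable` set are all made up for illustration, and real search engines are vastly more sophisticated. The point is only that a page the crawler never fetched can’t come up in results, no matter what you search for.

```python
from collections import defaultdict

# Hypothetical pages; only the ones in `crawlable` are visible to the crawler.
pages = {
    "example.com/news": "deep web explained for beginners",
    "example.com/blog": "how search engines rank web pages",
    "example.com/private/report": "internal deep web traffic report",
}
crawlable = {"example.com/news", "example.com/blog"}


def build_index(pages, crawlable):
    """Map each word to the set of crawlable URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        if url not in crawlable:
            continue  # never fetched, so never indexed
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(pages, crawlable)
print(sorted(search(index, "deep web")))  # the private report is not listed
print(sorted(search(index, "report")))    # nothing found at all
```

The private report contains the words “deep web” and “report”, but since it was never indexed it stays invisible to the search. That is the deep web in miniature.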
What is the Deep Web Used For?
Sites and resources that are on the deep web are not necessarily weird or nefarious. A lot of it is just technical housekeeping stuff that no one would want to visit anyway: duplicate content, pages that are only meant to be accessed directly, database files and so on.
Another common type of content that’s on the deep web is stuff behind a paywall. Obviously, if someone is selling content, they don’t want a search engine just making it available to the public. The software that search engines use to index the web can’t index content they get blocked from viewing.
More Familiar Than You Know
Believe it or not, you use deep web content just about every day of your life. After all, your cloud emails and Dropbox files don’t show up on Google when you search for them. At least I hope not!
Similarly, if you work at a company that has an intranet (a sort of internal internet with web pages) you also won’t find any of that content by using Google.
Academic journals that aren’t part of open publishing agreements are also included in the deep web. University libraries or other content hosts that have portals may also only allow users who have the right credentials to search their databases.
While no one knows exactly how much information is in the deep web (for obvious reasons), most of the web is deep. It’s almost a weird sort of analog to dark matter, that hypothetical material that props up the universe. Alternatively, you can think of the deep web as the 90% of an iceberg you can’t see.
How Big is the Deep Web?
Despite the fact that we have no real way of knowing the true extent of the deep web, some people have still made very educated guesses.
These estimates are astounding, and much more dramatic than a mere 10/90 split as in the iceberg analogy (which, by the way, should be credited to one Denis Shestakov).
Educated guesses at the size of the deep web put it somewhere between 400 and 550 times the size of the surface web we all know. The web is always growing and the rate of expansion is mind-boggling. For instance, in 2011 Facebook by itself had as many users as the entire internet did in 2004.
How is Search Engine Indexing Blocked?
It’s easy enough to understand that the deep web consists of web content and resources that have not been indexed by the big search engines, but how exactly do you stop someone like Google from noticing you?
The deep web might actually shrink a little in future thanks to the fact that the world’s best search engines are constantly improving. This means more and more content will become indexable, until those who don’t want indexing devise new ways to prevent it. Sometimes content is left out of search engine indexes simply because of how the engines work. At other times, blocking methods are used deliberately. Here are some of the types of content and techniques that mainstream search engines have difficulty dealing with.
The Contextual Web
Contextual web searches are a modern approach to internet searching that doesn’t just look blindly at keywords, but also at context. When the engine indexes pages, it will note things like the purpose and intention of the site, what sort of content it holds and so on.
If the site doesn’t have the sort of contextual cues this indexing technology needs to classify it, then it won’t be indexed.
There are web pages out there that don’t exist in a permanent form. In other words, they are created and deleted as needed. A common example of this is a web form that generates a custom result when you submit it.
When you set up a website, you can also set up a file known as robots.txt. This is a file that tells any machine visitors (such as a search engine crawler) how you would like them to conduct themselves.
Robots.txt can contain rules that tell engines like Google that you don’t want them to crawl a particular page or the whole domain. Of course, a crawler doesn’t have to respect those wishes, but well-behaved ones like Google’s will stay out. There’s often good reason to mark some pages for skipping, and mostly it has to do with keeping a site’s ranking high by excluding things like pages that are still under construction.
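Here is a small sketch of how a polite crawler might consult those rules, using Python’s standard `urllib.robotparser` module. The robots.txt content, the `example.com` address and the `ExampleBot` name are all invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site.
robots_txt = """\
User-agent: *
Disallow: /drafts/
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def may_crawl(path, agent="ExampleBot"):
    """Ask whether the rules allow this (made-up) bot to fetch the path."""
    return parser.can_fetch(agent, "https://example.com" + path)


print(may_crawl("/index.html"))            # allowed by the rules above
print(may_crawl("/drafts/new-post.html"))  # disallowed by the rules above
```

Notice that nothing here actually stops a crawler from fetching the page. The file is a request, and the check only matters if the bot bothers to perform it.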
Obviously, the technology designed to keep bots of all stripes out will also defeat search engine bots. Think of things like CAPTCHA, which requires that you perform a humanity test before letting you through.
Unsupported File Formats
If the engine can’t read text, then it can’t index it. So if you put text in a JPEG image on your site, it probably won’t get indexed. That is, until search engines can justify the computational costs of optical character recognition at scale. The same goes for video content or audio files. Unless they are given transcripts, they won’t get indexed. That is, after all, what media metadata is for.
The Private Web – Members Only
Just as CAPTCHA anti-bot measures will stop a typical search engine cold, sites that require you to register for an account and log in with a username and password can also be part of the deep web.
Software and Scripts
Since software packages stored in servers are usually zipped binaries or other formats not readily interpreted, they don’t get indexed. The pages that link to them would, of course, be indexed as normal, but the actual software is ignored.
The crawlers used by search engines are usually very systematic in their exploration of the Web. When they discover a page, the links on that page lead them to more pages, and so on.
If a page exists that has no other pages linking to it, then you can see it would be difficult for a search engine to notice its existence.
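A toy crawler shows why. The sketch below walks a made-up link graph breadth-first, starting from the homepage. The page names are hypothetical, and a real crawler would fetch and parse HTML rather than read a dictionary, but the discovery logic is the same idea.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
links = {
    "/": ["/about", "/products"],
    "/about": ["/"],
    "/products": ["/products/widget"],
    "/products/widget": [],
    "/secret-landing-page": ["/"],  # links out, but nothing links in
}


def crawl(links, start="/"):
    """Breadth-first walk over links, returning every page reached."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


discovered = crawl(links)
print(sorted(set(links) - discovered))  # pages the crawler never found
```

The secret landing page exists and works perfectly well if you know its address, but the crawler has no path to it.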
The web is always in flux and websites are constantly changing or disappearing. Luckily, several organizations have realized that web content has historical value, and so we have internet archives that let you look up old pages as if they were still up.
However, it would be pretty confusing if search engines indexed these pages as if they were still current. So usually search engines will ignore archival material, other than the actual homepage or portal.
How is the Deep Web Accessed?
I’ve already alluded to some of the ways that deep web content is accessed. Logging into your Gmail account or your intranet are two examples. Sometimes you need special custom software, because the fact that something is on the internet does not mean it’s stored in common web formats.
The deep web plays host to vast networks of custom technology. Some of it is governmental in nature and some of it is part of criminal or underground networks. That’s where the dark web comes into play.
What is the Dark Web?
All of the dark web is part of the deep web, but not all of the deep web is the dark web. Confused? Well, as we just discussed, plenty of perfectly innocent or entirely legal stuff is not indexed by search engines. These resources are generally not hidden for suspicious reasons, but for practical purposes that make sense.
The dark web, on the other hand, is a tiny subsection of the internet that doesn’t want to be found. These are networks that exist only for the people who helped create and sustain them. They usually require special software for access. While the technology that underpins the dark web can certainly be used for innocent purposes, no one who talks about the dark web thinks of it that way. They think specifically about the ultra-secretive and seedy underbelly of the internet. The dark web feels more like something from The X-Files than a real place, but rest assured it’s really there.
What is the Dark Web Used For?
There is a long list of truly nasty purposes that the dark web is put to. This is where hackers come to hang out and confer. Many cyber warfare attacks are built from bases situated in the dark web. We literally do not know all the different purposes for which the dark web is put to use, but some of the more prominent aspects of it have bubbled to the surface from time to time.
If you thought street markets in some parts of the world were chaotic and filled with dodgy goods, wait till you see illegal Darknet markets.
In these places, you can buy just about anything: firearms, drugs, custom computer viruses. Just about anything the average anarchist needs to raise a little hell.
The most famous example that the general public is aware of is likely the Silk Road, a massive anonymous market that relied on sophisticated protective measures and cryptocurrency to operate. It’s probably the biggest reason the public associates Bitcoin with illegal trade, even though dark web trade in Bitcoin is a tiny fraction of all Bitcoin trade. Speaking of which…
All Sorts of Bitcoin-Related Stuff
Although the dark web is very small in the greater scheme of things, its users sure do love all things Bitcoin and crypto. Bitcoin is a revolutionary digital currency that doesn’t require a central bank or centralized ledgers. This means that Bitcoin can enable trade that is close to anonymous, without leaving a trace in the mainstream financial system.
For one thing, you’ll find cryptocurrency mixers on the dark web. These services help obscure the trail that particular bitcoins have taken, to make transactions harder to trace. No prizes for guessing why that sort of thing is popular on the dark web.
Hackers generally come in three main flavors: white hats, gray hats, and black hats.
White hat hackers are the good guys. They make a job of working with organizations to find security holes and plug them.
Gray hats also tend to do this, but might not tell someone their system has a weakness until after hacking it, which is technically breaking the law.
Black hats are the type of hacker the media refer to when they just say “hacker”. Criminal computer wizards who break in, steal and destroy.
If you want to hire a hacker for any reason, the dark web is where you go to find them. Hacker groups call the dark web home and will trade services and information there.
Illegal Adult Content
This is one of the more sickening aspects of the dark web: illegal videos and images are kept here for distribution through criminal rings. I won’t go into detail, but content showing death, torture and sexual abuse often comes from the dark web, including, disgracefully, content that involves children.
Ironically, many of the hacker groups on the dark web actively hunt down the perpetrators behind such content.
Scams, Hoaxes, and Fraud
In an ironic twist, there are plenty of scams, hoaxes and fraudulent sites aimed at dark web users on this corner of the net.
For example, “legitimate” darknet markets might be spoofed or cloned, with the cloners making off with Bitcoin without ever providing anything. There isn’t much anyone can do about this. It’s not like you can go to the police and complain that you were cheated by an online drug dealer.
There are always persistent rumors that you can hire hitmen on the dark web, but there is no evidence that this is true.
If you want to buy stolen credit card information the dark web is the place to go, but just as often those sites are scams selling fake numbers. It really is like the Wild West.
Home of the Botnets
A botnet is a mass of computers that have been infected with malware. The owner of the malware can summon these zombie bot computers to do all sorts of things. Often they are used to orchestrate large DDoS or Distributed Denial of Service attacks, where sites are flooded with access requests. This blocks real users from getting to the service.
A botnet needs a central point of control, however. The commands have to come from somewhere. The dark web is the perfect place to hide that control center, where it’s unlikely to be found.
One group of people who really hate being found are terrorists. Yet they need to communicate with each other. So the dark web ends up being perfect for them. Security researchers say that terrorists use the surface web, such as social media, to recruit people. Their actual serious planning communication happens out of sight on the dark web.
Believe it or not, there are social media sites like Facebook on the dark web. In fact, it’s not just dark web clones of the service; Facebook itself runs an official site on Tor.
Although Facebook knows who is logging in, their argument is that it allows users to circumvent censorship.
How is the Dark Web Accessed?
You can’t just open up Chrome or Firefox and hop straight into a dark web server. These sites don’t use the same user-friendly web technology we’re all familiar with. You usually need to make use of special software and access methods. Some of these you’re probably familiar with if you’re a regular reader, others are a bit more niche than usual.
I’ve written about Tor quite a bit on TechNadu, but usually in the context of accessing normal web pages while maintaining your anonymity through a vast network of volunteer peers. In a nutshell, Tor is a special browser along with a global volunteer network of peer computers. When you access web content through Tor, the packets are randomly routed through that network, making them very difficult to trace.
Tor is also one of the main ways to access the dark web. Apart from normal websites with addresses that end in domains like “.com”, you can also access special Tor sites with addresses that end in “.onion”. Why? Because Tor is short for The Onion Router, which refers to the complex layering method it uses to obscure identity on the web. It’s also the reason that the Tor network is called onionland in deep web parlance.
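The layering idea is easier to picture with a toy example. The sketch below is emphatically not Tor’s real cryptography: the XOR “cipher”, the three relay keys and the function names are all made up for illustration, and the scheme is not secure. It only shows the shape of the idea, where the sender wraps a message in one layer per relay and each relay can peel off just its own layer.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with SHA-256(key + counter) blocks.

    Illustration only. Applying it twice with the same key undoes it.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


# Hypothetical keys the sender shares with three relays on the path.
relay_keys = [b"entry-key", b"middle-key", b"exit-key"]


def wrap(message: bytes, keys) -> bytes:
    """Sender side: add one layer per relay."""
    for key in reversed(keys):
        message = keystream_xor(key, message)
    return message


def peel(onion: bytes, key: bytes) -> bytes:
    """Relay side: remove the single layer this relay holds the key for."""
    return keystream_xor(key, onion)


message = b"GET /index.html"
onion = wrap(message, relay_keys)

# Each relay in turn peels its layer; only after the last one is the
# original message readable again.
after_entry = peel(onion, relay_keys[0])
after_middle = peel(after_entry, relay_keys[1])
after_exit = peel(after_middle, relay_keys[2])
print(after_exit)
```

In this toy, the entry and middle relays only ever see scrambled bytes, which is the intuition behind the onion name.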
VPNs or virtual private networks are an absolute requirement for anyone who wants to access the dark web. A VPN creates a network “tunnel” that hides what you are doing from everyone except the VPN provider. If you’re using Tor to access the dark web, then it is very difficult for anyone to trace that activity back to you from that side of the transaction. However, your internet service provider can still see that you are using Tor.
Although the rule is generally “innocent until proven guilty” just the act of using Tor might put you on a watch list. That’s not the sort of attention anyone would want, so dark web users will hide their Tor use within a VPN tunnel.
The Freenet is a peer-to-peer communication network built to resist attempts at censorship. It forms its own little web with sites, forums and more. In order to use Freenet, you need to use their custom software. The whole deal with the Freenet is that the information is spread across many computers. There’s no feasible way to shut down the content here.
You can also use third-party applications to do things like chat with other people over Freenet anonymously.
I2P – The Invisible Internet Project
I2P is another way content can be hosted on the dark web without getting removed, deleted or shut down. It’s a rather complex technology that uses garlic routing, a cousin to the onion routing that Tor uses.
It’s what’s known as an overlay network, which is one way of saying that it’s a network which exists within another network.
The I2P network plays host to blogs, websites, email servers, chat systems and file sharing services. It was also the technology used to host Silk Road Reloaded, a successor to the original Silk Road (which itself ran as a Tor hidden service).
Protecting Yourself in the Depths
While much of what happens on the dark web is intentionally hidden from the authorities, that does not mean the denizens are themselves your friends. Visiting the dark web comes with real risks. On the surface web, your identity and information can be collected as a matter of course by your ISP, big companies, and the government, and we know this as a fact. That’s one of the main reasons we recommend VPNs for regular web users.
When it comes to these open attempts to collect data, however, at least they are governed by laws. If a company like Facebook collects your information in contravention of the law then at least we can hold them accountable.
The dark web is different. No one knows anyone else. There is no accountability. No rules govern who may take what information. That means your attitude to online safety has to be quite a bit different than usual.
Trust No One
Perhaps it doesn’t need to be said, but don’t trust anything or anyone on the dark web. Since this is a place of almost total anonymity and there are plenty of people with questionable motivations there, you should be very skeptical.
It can be an interesting and eye-opening experience to explore the depths of the hidden web. At the same time, you need to develop the right level of suspicion and cynicism.
In particular, be very careful about giving your personal information. That’s especially important because some details may seem trivial or unimportant. However, small comments such as the name or type of your pet could help others on the deep web build a profile of you. There are after all other ways of tracing someone than using their IP.
There are plenty of urban legends about people who have visited the dark web only to receive mysterious phone calls or weird letters in the mail. Chances are, if true, that these people simply let slip too much on a forum or other site and another user wanted to give them a wake-up call. Let that not be you.
Disconnect Cameras and Microphones
With applications like Skype now widely in use, the chances are that your computer has a microphone and webcam connected to it. Nothing weird about that, right? Except, there’s a long tradition of webcam hacking by internet criminals.
By using malware or exploits to take over your webcam and mic, the hacker can record you as you work on your computer. While good anti-malware software and a working firewall should prevent this from happening, there’s always a chance that it won’t be detected.
If you can’t unplug your camera and mic because they are built into your device, then you can follow the advice of Zuckerberg and Snowden: tape up the camera and temporarily block the mic holes. Better safe than sorry!
Absolutely use Tor, VPNs, and VMs
Maintaining your anonymity on the dark web is more important than at any other time. While having good habits when it comes to keeping your personal details to yourself is a start, the right privacy tools are a must.
Tor is sort of a given, but combining it with a VPN and even virtual machine technology can insulate you from many attempts to uncover who you really are. Virtual machines are entire computers simulated in software. If someone hacks your VM, the damage is largely contained, since it’s sealed off from the rest of your system.
I’ve already done an article that will teach you how these three technologies work together. Let’s call it the Tor, VPN and VM combo.
How to Find Stuff on the Deep Web
Getting Tor and setting up a VPN are both pretty easy things to do. So if you’ve done them, congratulations! Technically you are now at the front door of the deep web. Except, how do you actually find anything here? Let’s look at some ways.
Deep Web Search Engines
What? Didn’t we just say that the deep web is not searchable by definition? Well, that’s not entirely true. It’s just that the methods of discovery for hidden sites are a little different.
You can find the actual onion links for these search engines by Googling for them the normal way. From there they can help you find other sites. Some well-regarded deep web search engines include:
- Onion URL Repository
Reddit is King
The number one go-to for people looking for stuff on the dark web is certainly Reddit. This massive message board plays host to just about any topic and that includes deep web information.
Reddit also has the benefit of showing you what other people have said about the information. Letting other people follow instructions to deep web locations first is a good idea. Let them report what they find. It’s always better to be the second one through the door.
The Hidden Wiki
The last place I want to talk about is called the Hidden Wiki, which I guess is like the deep web version of the Necronomicon: a tome of power if used correctly, but one that can bite you if you aren’t careful.
If you find the true Hidden Wiki on the deep web itself you need to know that anything goes as to the sort of content it can point you to. Be careful not to access illegal content that you didn’t mean to.
The Good Side of the Dark Web
Despite its reputation, the dark web represents the same mixed bag as humanity in general. Some of it is pretty neutral, some of it is very bad and some of it is quite good. It’s in the nature of the media to focus on the negative. After all, that’s what gets the ratings up.
The dark web is a fantastic place for people who are fighting against real-world evils to get the news out. If you live under a despotic government, you can’t say anything about its wrongdoings out loud. Get access to Tor for five minutes and you can leak everything, and it will be very hard for anyone to prove it was you.
Even in democratic countries, it can be a way to uncover corruption and level the playing field when confronting those with power.
The dark web is also a safe haven for information and content that might otherwise be destroyed by censorship. Who knows how many books we’ve lost to the “greater good” over the centuries. The size and nature of the dark web mean that you are unlikely to ever eradicate something from it completely.
Not So Scary After All
We fear what we don’t know. When it comes to the deep and dark web, most of us know very little. It’s easy for our imaginations to come up with vast conspiracy theories and sinister plots unfolding on the invisible parts of the web.
The truth is that most of both these phenomena come across as a little mundane. It’s a place for freaks and geeks. For anarchists and people who live on the wrong side of the law. As Obi-Wan says, it’s a “hive of scum and villainy”. At least some of the time.
No one can deny that the hidden parts of the internet are fascinating. There’s always something romantic or morbidly curious about anything shrouded in such mystery. I hope that I haven’t spoiled too much of the magic, urban legends or hax0r coolness. I think they are still plenty enigmatic. In a world filled with security and privacy threats, it’s better to be aware than unaware.