The Deep, Dark Web

a look at the whole thing

Mar 10, 2021

Anybody can set up a computer server on the internet. All you need is two willing partners, i.e. two gateway servers that are already on the internet. Each of these computers will add yours to one of its networks. Once your server is connected and a little bookkeeping is done, it is ready to serve all the other computers on the internet.

My first experience with the internet was when it was still known as the Arpanet. Some colleagues of mine in an Alberta computer science department were playing a game that was running on a server in California. UC Berkeley in 1979, I think it was. The game was text only, played through some kind of chat. The server would tell us something like “You are in a room with two doors. One of them is ajar and sounds of music are coming from it. What do you do? Open closed door (A), go through open door (B), or sit down and enjoy the music (C).” We would answer with “A”, “B”, or “C”. The server’s response would depend on our choice and the game would go on.

Don’t laugh, but that game was pretty exciting back then. Fun and games have always been a part of the internet, from the text-only games of the past to the almost realistic video games of today. But Arpanet was not created by the U.S. military for games. The idea was to have a bomb-proof network. Things change and today the army uses online video games to help recruit future bomb launchers.

The Arpanet is a between-network construct that has a clever way of sending messages from computer A in Alberta to computer B in California. The communication path is resistant to bombing attacks because no central control computer is required and any individual computer is expendable. This resistance comes from properties like these:

The connections between gateway computers are arranged so that there are always at least two ways to send a message from one computer to another.
Each gateway computer has a routing table that provides different ways of forwarding messages; routing information is tentative, incomplete, and updated when messages are successfully routed.
There are at least two computers that know how to find any one computer, namely the two which are directly connected to it.
If a message cannot get through using one route another is tried.

It might take a while to find a route between computer A and computer B but once it is found, messages can flow back and forth with ease.
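Here is a minimal sketch of the "if one route fails, try another" idea. The gateway names and the map are made up for illustration, and the sketch cheats by giving one function a map of the whole network. Real gateways only have their own tentative routing tables, so treat this as a picture of the redundancy rather than of how Arpanet routing actually worked.

```python
from collections import deque

# Hypothetical gateway network: each gateway lists its direct neighbours.
# Every gateway has at least two neighbours, as the properties above require.
LINKS = {
    "Alberta": ["Seattle", "Chicago"],
    "Seattle": ["Alberta", "California", "Chicago"],
    "Chicago": ["Alberta", "Seattle", "Denver"],
    "Denver": ["Chicago", "California"],
    "California": ["Seattle", "Denver"],
}


def find_route(links, start, goal, destroyed=frozenset()):
    """Breadth-first search for a route that avoids destroyed gateways.

    Returns the list of gateways on the route, or None if no route exists.
    """
    if start in destroyed or goal in destroyed:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        here = path[-1]
        if here == goal:
            return path
        for neighbour in links[here]:
            if neighbour not in visited and neighbour not in destroyed:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


# With every gateway working, the short route through Seattle is found.
print(find_route(LINKS, "Alberta", "California"))
# With Seattle bombed, the message still gets through by a longer route.
print(find_route(LINKS, "Alberta", "California", destroyed={"Seattle"}))
```

Knocking out both of Alberta's neighbours at once is the only way to cut it off in this little map, which is the point of having at least two connections.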

A message could be a text message or a sound track or anything that is convertible to 0s and 1s. The Arpanet was restricted to the military, a few tech corporations or government agencies, and a few computer science departments. But the system clearly had a lot more potential than that. A version of the Arpanet was made public and called the internet. Research papers started being passed around.

But there was no Google to help find anything. Reading a research paper and looking up its references involved writing or emailing the authors. From such an annoyance the world-wide web was born. Its inventor Tim Berners-Lee, now Sir Tim, created a piece of software called a browser that could display a research paper much like it would look in print (as opposed to how it would look written on a typewriter). In order to allow authors to specify the desired look, Sir Tim created a markup language. As an example, in this language you would type <i>emphasis here!</i> to have the words “emphasis here!” appear in italics; the <i> and </i> are the marks that tell the browser to do so.

In the past editors would mark up papers with a red pen to tell typesetters what to do to get the desired effects. Sir Tim’s markup language, known as HTML, was a cool replacement for that process. It ran on a NeXT computer, built by the company Steve Jobs founded during his absence from Apple. Another cool feature was the ability to link one paper to another with HTML’s <a> notation. For example, this markup

<a href="https://webfoundation.org/about/vision/history-of-the-web/">web history</a>

produces this hyperlink: web history. You can click on it to learn more about the history of the world-wide web than is explained here.
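To give a feel for what a browser does with these marks, here is a small sketch that uses Python's standard html.parser module to pick the links out of a bit of markup. The class and function names are my own inventions for illustration. Standard HTML puts a link's address in the href attribute of the <a> mark, so that is where the sketch looks. A real browser does far more than this.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (address, link text) pairs from <a> marks."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # address of the link we are inside, if any
        self._text = []     # pieces of text seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None


def collect_links(markup):
    """Return a list of (address, text) pairs for the links in markup."""
    parser = LinkCollector()
    parser.feed(markup)
    parser.close()
    return parser.links


sample = (
    'Read some <a href="https://webfoundation.org/about/vision/'
    'history-of-the-web/">web history</a> with <i>emphasis here!</i>'
)
print(collect_links(sample))
```

The <i> marks are ignored here because the sketch only cares about links; a browser would use them to switch italics on and off.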

Sir Tim’s world-wide web had a lot of potential. From it grew the likes of Google, Facebook, Amazon, and Substack. Today’s browsers are much more sophisticated than Sir Tim’s. For one thing they can accept and run computer programs. What I am writing now is being simultaneously translated into HTML by such a program. What I see is what I want to see. In the background my browser is running a program that creates an HTML file. This file is saved at cogitamus.substack.com. Later it is sent to your browser so you can see what I see. Neither one of us needs to know anything about HTML.

One other invention was necessary to make the internet really useful: the search engine. Google was the first company to combine a powerful search engine with a way of making money. Instead of showing a list of search results in some arbitrary order, they let the owners of web sites bid for a position at the top of the list.

Once Google became so popular that the word became a verb, the internet was divided into three parts:

The surface or open web, which consists of those web servers that are indexed by Google.
The rest of the web servers. Estimates are that 90% of the web servers fit in this category, which is called the deep web.
Computers like our personal computers which have no web server. If these have a browser, as almost all do, we think of them as being connected to the web. They are not part of the deep web.

Although the web and the internet seem like the same thing today, they are not. It is possible to connect a computer that knows nothing about HTML to the internet. Such a computer is of no use to the world-wide web.

My web site is located on a server whose services I rent. My site happens to exist on the surface web. (You could find it by googling “J Adrian Zimmer”.) However not all the pages I put on my web site are on the surface web. Sometimes I post a page that I intend only a few people to see. Instead of linking to it from a page that Google can find, I will merely send out some emails with the necessary link. People who visit that page from my emails are visiting a page on the deep web.
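The difference comes down to what a search engine can reach by following links. Here is a toy sketch of that idea. The page names and the little site are hypothetical, and a real crawler like Google's is of course far more elaborate, but the principle that an unlinked page never gets found is the same.

```python
# Hypothetical web site: each page lists the pages it links to.
SITE = {
    "home": ["articles", "about"],
    "articles": ["home", "essay"],
    "about": ["home"],
    "essay": ["articles"],
    "private-note": ["home"],  # no page links TO this one; its address went out by email
}


def crawl(pages, start):
    """Return the set of pages reachable from start by following links."""
    found = set()
    to_visit = [start]
    while to_visit:
        page = to_visit.pop()
        if page in found:
            continue
        found.add(page)
        to_visit.extend(pages.get(page, []))
    return found


surface = crawl(SITE, "home")   # what a crawler starting at the home page can index
deep = set(SITE) - surface      # pages that exist but were never reached

print(sorted(surface))
print(sorted(deep))
```

The private page links back to the home page, but that does not help the crawler. Links only work in one direction, so the page stays on the deep web until somebody links to it.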

Articles on Substack such as this one can be found on the surface web but the comments made to those articles cannot. They are on the deep web.

The deepest part of the deep web is called the dark web. Dark web servers are not only missing from Google searches; they will also only respond to browsers that use the right encryption. The best known of these browsers is the Tor browser. Tor was created to support a spy network but now is available to all of us. The Tor browser sends your request through three servers before forwarding it to some computer on the deep or surface web. Your browser’s request is encrypted and Tor’s encryption holds until the third computer forwards the request.

Suppose you send a request for a news article from the New York Times through a Tor browser. That news article will have an address that begins with “https://”, which tells a browser that the request is to be encrypted in a certain way. The Tor browser does that and then wraps the result in three more layers of its own encryption, one for each Tor intermediary. The request then goes through the three Tor intermediaries, each of which removes one layer. The third intermediary removes the last Tor layer and sends what remains, still protected by the encryption that the New York Times expects, to the New York Times in a normal way. The response from the New York Times follows a reverse path.

The result of this is that the New York Times server thinks it is getting the request from that third Tor server. It doesn’t know where you are or anything about you—unless you log in.

Currently my web site does not require encryption. Addresses to my web pages begin with “http://”. Browsers will send unencrypted requests for my web pages. So of course that is what that third Tor server will do as well. However, Tor’s encryption from its browser to its third server will still be in full use.

This may seem like overkill but the designers of the Tor system were asking themselves “what if one of our servers gets corrupted?” If the first server is corrupted, then all it can learn about your web traffic is which computer it is coming from; it will not be able to decrypt your outgoing or incoming messages. If the third server is corrupted, it will be able to see where your messages are going, and their content too when the web site uses plain “http://”, but it will not know what computer is sending and receiving them. The second Tor server makes it difficult to connect the dots between the first and third servers.
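The layering is easier to see in a sketch. What follows is a toy, not Tor: the cipher is a homemade stand-in built from SHA-256 that is NOT secure, the keys are made-up strings (real Tor negotiates a separate key with each server), and the function names are mine. It only shows that each server can remove its own layer and nothing more.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a keystream derived from SHA-256.

    Applying it twice with the same key restores the original.
    NOT secure; it only stands in for a real cipher.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        stream.extend(block)
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(request: bytes, relay_keys: list) -> bytes:
    """Browser side: add one layer per server, ending with the first server's."""
    wrapped = request
    for key in reversed(relay_keys):
        wrapped = keystream_xor(key, wrapped)
    return wrapped


def peel(wrapped: bytes, key: bytes) -> bytes:
    """Server side: remove the single layer that this server's key opens."""
    return keystream_xor(key, wrapped)


# Hypothetical keys, one shared between the browser and each of the three servers.
keys = [b"first server key", b"second server key", b"third server key"]
request = b"GET http://example.org/some-page"

cell = wrap(request, keys)
for number, key in enumerate(keys, start=1):
    cell = peel(cell, key)
    print(f"readable after server {number}:", cell == request)
```

Only the last line should report True. The first and second servers handle the request but see gibberish, and the third sees the request without knowing who sent it.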

So far I’ve been talking about using Tor to visit the surface and deep webs. Communication between that third Tor server and the web sites you are visiting is normal. You are not visiting the dark web this way. Dark web servers have a special relationship with Tor: communication between the Tor browser and them is never handed off in the normal way by a third Tor server. It remains inside Tor’s encryption until it reaches the dark web server. Only you and that dark web server can read the communications between you and the server.

To accomplish this the Tor browser and the dark web server have to have some kind of working relationship. Only servers in the .onion domain have this relationship with the Tor browser. (You have surely seen a lot of addresses that end with .com, .org, and .edu. The .onion ending is the same kind of thing.) When you link to something in the .onion domain your information, sending or receiving, is encrypted all the way. Nobody but your computer and the computer you are talking to knows what is going on.

Some of the .onion servers have chat rooms. You can find a list of them here. If you download a Tor browser and go to one of those chat rooms remember they are full of people who are doing their best to keep anyone from knowing who they really are. This encourages all kinds of chicanery. You and I are not smart enough to anticipate all the kinds of fraud that may be out there to draw us in. So never, never reveal anything about yourself. Lie if you must. It’s expected and it’s the only safe way. Unless you yourself are a criminal or a law enforcement officer, you are not there to do business.

You might be there to exchange information with people you know. For example, someone inside China might be interested in a private conversation with someone in another city or even outside the country. There are ways to communicate that do not involve public chat rooms. There are ways to avoid the Great Firewall of China when connecting to a Tor server. The Tor browser explains them. Of course these workarounds must change fairly often because holes in the Great Firewall of China are frequently closed.

One final note about the dark web. Tor is not all there is. There are other systems. Some of them may not even be on the world-wide web because browsers and web servers are not the only way to communicate. Remember that anyone can connect to the internet so long as they find two cooperative gateway computers. When you connect you can run any software you like on your computer. You and a friend could put your computers on the internet and create specialized software for letting them talk to each other.

Or you could put your computer on a likely route for communication between two people you want to spy on. Depending on traffic loads, some of the communication would go through your computer and you could read anything that isn’t encrypted.

Be safe out there.

https://cogitamus.substack.com/p/the-deep-dark-web