Ask Professor Puzzler
I heard someone talking about the "Diameter of the Internet" recently. I understand finding the diameter of a circle; it's just the distance across its widest point. But what is the diameter of the internet, and how do you measure it?
Note: this item was originally posted on the "Ask Doug" blog* back in 2005. We've updated it and republished it with additional commentary at the bottom.
Have you ever played the game "Six Degrees Of Separation"? You start out with a person (often a movie star), and then see if you can relate them to another movie star in six or fewer connections. For example, how would you connect Julia Roberts to Sean Connery? Well, Julia Roberts starred with Richard Gere in "Pretty Woman", and Richard Gere starred with Sean Connery in "First Knight". That's two degrees right there.
The internet diameter is similar. How many degrees of separation are there between any two randomly selected web pages? How many clicks does it take to get from one web page to another? Researchers have done studies and calculations to try to determine the answer to this question, and the average "distance" in clicks between any two web pages is known as the diameter of the web. As of the turn of the millennium, the diameter of the internet was approximately 19 clicks.
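To make "average distance in clicks" concrete, here is a minimal sketch in Python. The four-page `links` map is a made-up toy web, and the function names are my own; the researchers' actual method worked on samples of the real web and isn't reproduced here. The sketch uses a breadth-first search to count the fewest clicks from each page to every page it can reach, then averages over all of those pairs.

```python
from collections import deque

# Hypothetical toy web: each page maps to the pages it links to.
links = {
    "A": ["B"],
    "B": ["C"],
    "C": ["A", "D"],
    "D": ["A"],
}

def click_distances(links, start):
    """Breadth-first search: fewest clicks from start to each reachable page."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist

def average_click_distance(links):
    """Average click distance over all ordered (source, target) pairs
    where the target can be reached from the source."""
    total = 0
    pairs = 0
    for source in links:
        for target, d in click_distances(links, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else float("nan")

print(average_click_distance(links))  # 1.75
```

In this toy web the longest trip (A to D, or D to C) takes 3 clicks, while the average is only 1.75, so "average distance" and "maximum distance" are different numbers even on four pages.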
And here we are in 2016, and you'd think we'd have updated statistics for you. We don't.
We have some wrong statistics, though. Because, like everything else on the internet, people read what they want to read, and interpret things how they want to interpret them, with no regard for reality.
For example, in 2013, Smithsonian Magazine's website presented this information as though it were new, and reworded it to say that the maximum distance in clicks was 19. (Yes, you read that right: I'm accusing Smithsonian of publishing bogus statistics, although, to be fair, they did partially retract their article.) Go back and read the original post: 19 was the average distance, not the maximum distance.
Others have made even more remarkable claims, such as "The diameter of the internet will never exceed 19."
But the reality is, nobody seems interested in continuing this line of study, and there doesn't appear to be any current statistical analysis on the diameter of the internet. So I guess we might as well keep saying the diameter is 19, even though that statistic is now about 16 years old!
There are some questions that would be fascinating to explore (if only I had more time!). Questions like:
- How do you deal with a website that is completely isolated (no inbound or outbound links)? Surely its distance to other websites isn't defined to be infinite? Because that would be problematic for averaging purposes! I suspect sites like that are dismissed from the equation.
- Did you realize that "distance" on the web is not symmetric (or "commutative," if you like)? In the real world, if we say the distance from A to B is five miles, that means the distance from B to A is also five miles. But on the web, between any two points there are two distances - the number of clicks by outbound links from A to B, and the number of clicks by outbound links from B to A. These two numbers can be significantly different. To see why this is so, consider that there are thousands of websites around the world that link to The Problem Site. From each of those sites, the distance to The Problem Site is one click. But The Problem Site doesn't link to most of those sites, which means the distance in the other direction has to be more than one click. In fact, we should really think of it like distances between two cities when all the roads are one-way streets, so you have to take a different route from A to B than you would from B to A. So how do you calculate the distance? Do you take the smaller number? The larger one? Or the average?
- How are search engines included in the mix? Technically, Google links to virtually every website. But the pages that link to various sites are not static, because they are dependent on user input to display the links. On the other hand, if I search for "The Problem Site" on Google, and post the link here, I've now provided an indexable link for people to use when calculating the internet diameter. So have I changed my click distance to Google by doing that?
Those are the questions that keep me up at night when I'm pondering 16-year-old statistics that no one else seems to care about anymore!
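The one-way-street idea and the isolated-site problem from the list above are easy to see in a small sketch. The page names and the `links` map here are invented for illustration, and returning `None` for "you can't get there from here" is just one possible convention, not an established rule for how the diameter is computed.

```python
from collections import deque

# Hypothetical link graph: "fan_site" links straight to "hub", but "hub"
# only gets back to "fan_site" by way of two intermediate pages.
# "island" has no inbound or outbound links at all.
links = {
    "fan_site": ["hub"],
    "hub": ["directory"],
    "directory": ["blog"],
    "blog": ["fan_site"],
    "island": [],
}

def clicks(links, source, target):
    """Fewest clicks from source to target following outbound links,
    or None if target cannot be reached from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        page = queue.popleft()
        if page == target:
            return dist[page]
        for nxt in links.get(page, []):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return None

print(clicks(links, "fan_site", "hub"))  # 1
print(clicks(links, "hub", "fan_site"))  # 3
print(clicks(links, "hub", "island"))    # None
```

Whether a real study would average the 1 and the 3, keep them as two separate pairs, or throw out the `None` cases entirely is exactly the kind of choice the questions above are asking about.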
* The "Ask Doug" blog was the precursor to the Ask Professor Puzzler blog. We are gradually moving the content from that blog to this location. Most items will be kept to their original publication dates, but occasionally, when we find an item of special interest, we'll re-post it with the current date, so our visitors will be more likely to find it!