Majority of The Internet Is Squatted And What You Can Do About It
The Great Internet Illusion
After learning that domains are expensive mainly due to squatting, we decided to investigate how many domains were in fact squatted.
We analyzed 12 million domains and found that about half of all .COM domains are unused, and that around 72-96% of domains on alternative TLDs are basically in lounge mode too. Domain squatters are spending 2 billion dollars a year on hoarding, and that is just for .COMs.
To get to this conclusion, we picked .COM and a small sample of alternative domain extensions, and we looked at each one of them. You can see our results in the graph below, and a bit more on the process to get there.
We found a few peculiar things, and there is lots more to be learned. If you have any questions or suggestions about this, please email me.
A cumulative graph of content availability
| TLD | DNS | HTTP | HTTP content | Overall | Estimated Domains Currently In Use |
The outcome you see is cumulative. We first got the DNS status. For all domains that passed DNS, we got their HTTP status. Only domains with a ✅ HTTP status were then counted as "used", measured against the total number of domains in the sample.
✅ - HTTP status returns something that isn't one of the things below
🚙 - obviously parked as indicated by page content
☠️ - expired
💲 - listed for sale
🛠️ - under construction
Trolling Through The Belly of The Beast
First, a quick explanation for anyone not in the domain space. There are a handful of companies out there which own the right to resell extensions. For example, Verisign owns .COM, Radix owns .TECH, Matt from WordPress owns .blog, and so forth. Every time you buy a domain, those guys get paid. Every time you renew a domain, those guys get paid again. It's a great business.
Every now and again these companies compete in a size-measuring contest, claiming to have millions of domains under management. We wanted to know if the numbers reported for various TLDs were real, or if these companies padded their numbers to look good. It is probably both, but you can decide.
We got the data from icann.org, where anyone can request a bulk file for each internet extension, containing every domain currently registered on it. The service is called the "Centralized Zone Data Service." The files we got are dated December 2022.
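For anyone who wants to repeat this, here is a minimal sketch of pulling unique domain names out of a zone file. It assumes the file follows the standard DNS master-file layout (owner name, TTL, class, record type, data, separated by whitespace), where one domain usually appears on several rows, one per record. The function name `unique_domains` and the sample rows are made up for illustration and are not part of any ICANN tooling.

```python
from typing import Iterable, Set


def unique_domains(lines: Iterable[str], tld: str) -> Set[str]:
    """Collect the distinct registered names found in zone-file rows.

    Assumes whitespace-separated rows whose first field is the owner name,
    e.g. "example.com. 172800 in ns ns1.example.net."
    """
    suffix = "." + tld.lower().strip(".")
    domains = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";"):  # skip blank lines and comments
            continue
        owner = line.split()[0].lower().rstrip(".")
        if not owner.endswith(suffix):
            continue  # e.g. the rows for the TLD itself ("com.")
        # Keep only the label right before the TLD, so that a row for
        # ns1.example.com still counts toward example.com.
        head = owner[: -len(suffix)]
        domains.add(head.split(".")[-1] + suffix)
    return domains


# Made-up rows in the assumed layout.
sample_rows = [
    "com.\t172800\tin\tns\ta.gtld-servers.net.",
    "example.com.\t172800\tin\tns\tns1.example.net.",
    "example.com.\t172800\tin\tns\tns2.example.net.",
    "another-example.com.\t172800\tin\tns\tns1.example.org.",
]
print(sorted(unique_domains(sample_rows, "com")))
```

Four rows collapse to two domains here, which is why row counts and domain counts should not be compared directly.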
Because each file is massive and contains ALL of the domains registered on that extension as of December 2022, we picked a random subset of those domains. We then got the DNS status for each domain, and for all domains that passed DNS, we got their HTTP status. We wrote a classifier to look for domains that were parked, expired, under construction, or for sale. It was not very sophisticated, but good enough to get started.
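The sampling and classification steps could look roughly like the sketch below. This is not the code we ran: reservoir sampling is just one common way to draw a uniform random subset from a file too large to load, and the keyword lists in `RULES` are hypothetical stand-ins for whatever phrases a real classifier would match. The names `reservoir_sample` and `classify_page` are illustrative.

```python
import random
from typing import Iterable, List

# Hypothetical phrases per category; a real classifier would need many more.
RULES = [
    ("expired", ("this domain has expired", "domain expired")),
    ("for_sale", ("domain is for sale", "buy this domain", "make an offer")),
    ("parked", ("domain is parked", "parked free", "parked by")),
    ("under_construction", ("under construction", "coming soon")),
]


def reservoir_sample(items: Iterable[str], k: int, seed: int = 0) -> List[str]:
    """Uniform random sample of k items from a stream of unknown length."""
    rng = random.Random(seed)
    sample: List[str] = []
    for i, item in enumerate(items):
        if i < k:
            sample.append(item)      # fill the reservoir first
        else:
            j = rng.randint(0, i)    # inclusive on both ends
            if j < k:
                sample[j] = item     # replace with probability k / (i + 1)
    return sample


def classify_page(html: str) -> str:
    """Return the first matching category, or "used" when nothing matches."""
    text = html.lower()
    for label, phrases in RULES:
        if any(phrase in text for phrase in phrases):
            return label
    return "used"


names = (f"name{i}.com" for i in range(10_000))
print(len(reservoir_sample(names, k=100)))
print(classify_page("<h1>This domain is for sale!</h1>"))
print(classify_page("<h1>Welcome to our bakery</h1>"))
```

Simple substring matching like this will misfile some pages in both directions, which is the sense in which the classifier was "not very sophisticated."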
Note: picking more data or writing a more sophisticated classifier would simply improve the accuracy of results in the direction they are already going. It is unlikely that finer tools would show results that are widely different from the ones we got. Think of it as zooming in. The more you zoom in, the more closely you see what you were already seeing from afar.
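The "zooming in" intuition can be put in numbers for the sampling part. The sketch below uses the usual normal approximation for the margin of error of a sample proportion. The sample sizes are hypothetical, since per-TLD sample sizes are not listed here, and the formula only covers random sampling error, not mistakes made by the classifier.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p observed in n domains."""
    return z * math.sqrt(p * (1.0 - p) / n)


# p = 0.5 is the worst case and matches the "about half" finding for .COM.
for n in (1_000, 100_000, 1_000_000):
    points = margin_of_error(0.5, n) * 100
    print(f"n={n:>9,}: +/- {points:.2f} percentage points")
```

Going from a thousand domains to a million shrinks the margin from a few percentage points to about a tenth of one, so a bigger sample tightens the estimate rather than moving it somewhere new.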
Only domains that were not obviously for sale, parked, expired, or under construction were counted as "used" against the total number of domains in the sample. Everything else, including domains with no DNS, was counted as "other". That's how the table above was built.
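The cumulative counting described above can be sketched as follows. The `Result` record and the `funnel` function are hypothetical names for illustration, and the four sample domains are invented. Every share is measured against the whole sample, which is what makes the figures cumulative.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Result:
    domain: str
    dns_ok: bool
    http_ok: bool = False        # only meaningful when dns_ok is True
    label: Optional[str] = None  # classifier label, only set when http_ok is True


def funnel(results: List[Result]) -> Dict[str, float]:
    """Share of the whole sample that survives each stage."""
    total = len(results)
    if total == 0:
        raise ValueError("empty sample")
    dns = [r for r in results if r.dns_ok]
    http = [r for r in dns if r.http_ok]
    used = [r for r in http if r.label == "used"]
    return {
        "dns": len(dns) / total,
        "http": len(http) / total,
        "used": len(used) / total,
        "other": 1 - len(used) / total,  # everything not counted as used
    }


sample = [
    Result("a.example", dns_ok=False),
    Result("b.example", dns_ok=True, http_ok=False),
    Result("c.example", dns_ok=True, http_ok=True, label="parked"),
    Result("d.example", dns_ok=True, http_ok=True, label="used"),
]
print(funnel(sample))
```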
To calculate how much it costs to hoard domains, and the $2 billion that it costs for dot-coms, we took the total domains in circulation by TLD, subtracted the domains which actually hosted something, and multiplied by the cost to renew a domain in 2023. At $15.99 per dot-com domain, times (354 million domains - ~165 million used domains), that comes to about 3 billion dollars, so 2 billion is a conservative floor.
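The arithmetic is easy to check in a few lines. The constants are the figures quoted in the text; the function name `hoarding_cost` is just for illustration.

```python
RENEWAL_PRICE_USD = 15.99      # 2023 dot-com renewal price quoted in the text
REGISTERED_DOT_COMS = 354e6    # dot-coms in circulation, as quoted in the text
USED_DOT_COMS = 165e6          # approximate number of dot-coms actually in use


def hoarding_cost(registered: float, used: float, price: float) -> float:
    """Yearly renewal bill for every registered domain that is not in use."""
    return (registered - used) * price


cost = hoarding_cost(REGISTERED_DOT_COMS, USED_DOT_COMS, RENEWAL_PRICE_USD)
print(f"${cost / 1e9:.2f} billion per year")  # $3.02 billion per year
```

The result moves a lot with the inputs: a lower renewal price or a higher count of used domains pulls it back toward 2 billion.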
An interesting note here: Verisign reported in June that they had 354 million dot-coms in circulation, but the ICANN file that contains dot-coms had 415 million unique rows. We don't know who is right here, but the final numbers are close enough for practical purposes.
Since so much of the dot-com namespace is already taken, you have three main choices if you want to get a domain for your business or project.
Choice 1: Buy squatted domains from a domain reseller. This will be a lengthy and expensive process, but possibly worth it as you'd have decent chances at finding a short name you desire.
Choice 2: Explore the alternative namespace. With 1500 domain extensions that are not dot-com, you have a much better chance of finding a domain you want that has not yet been purchased. How come? Even though a larger proportion of registered names is squatted in the alternative space, the absolute numbers are much, much smaller. Dot-com has 400 million domains registered, and 50% of them are squatted. Alternatives have 70%+ of their registered namespace squatted, but one of the biggest ALTs (XYZ) only has 3.6 million domains registered. In comparison to dot-com, you have another 200+ million options to go for.
Choice 3: Try our sophisticated business domain name search tool to find a domain that will work great for your company. Instead of just looking for domains that are out there, we generate domain names based on your individual needs, which produces far better results that are more likely to be available.
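To make the comparison in Choice 2 concrete, here is the same arithmetic using the rounded figures quoted there. The 70% share is taken as the lower bound of "70%+", and the variable names are only illustrative.

```python
def squatted(registered: float, share: float) -> float:
    """Absolute number of squatted names for a given registered total."""
    return registered * share


com_squatted = squatted(400e6, 0.50)  # dot-com: 400 million registered, 50% squatted
xyz_squatted = squatted(3.6e6, 0.70)  # XYZ: 3.6 million registered, 70% squatted

print(f"dot-com squatted: {com_squatted:,.0f}")
print(f"XYZ squatted:     {xyz_squatted:,.0f}")
print(f"ratio:            about {round(com_squatted / xyz_squatted)}x")
```

A higher squatting rate on a small extension still takes far fewer names off the table than a lower rate on dot-com.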
Next, I would like to figure out the quality of domains in both used and unused spaces. A cursory inspection of domain files shows a lot of computer-generated domain names that follow a pattern of numbers, letters, and words. There is no way those domains were purchased by average internet users by hand; they were most likely generated and purchased in bulk.
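One possible starting point for spotting such patterns is to collapse every run of digits into a placeholder and count how many names share the resulting template. This is a hypothetical heuristic that we have not run on the data. It only catches names that differ by a number, and `template`, `bulk_candidates`, and the `min_group` threshold are illustrative choices.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple


def template(domain: str) -> str:
    """Replace each run of digits in the first label with '#'."""
    label = domain.lower().split(".")[0]
    return re.sub(r"\d+", "#", label)


def bulk_candidates(domains: Iterable[str], min_group: int = 3) -> List[Tuple[str, int]]:
    """Templates containing a number that are shared by at least min_group names."""
    groups = Counter(template(d) for d in domains)
    return [(t, n) for t, n in groups.most_common() if "#" in t and n >= min_group]


names = ["shop101.com", "shop102.com", "shop103.com", "bakery.com", "route66.com"]
print(bulk_candidates(names))  # [('shop#', 3)]
```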
The most interesting follow-up question in our mind would be: who generated those domains?
In cases where DNS information is provided, we could do an aggregate count to see if there's one particular company, or a set of companies, that is holding on to the majority of auto-generated domains.
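Such an aggregate count could be sketched like this, assuming the input is a list of (domain, nameserver) pairs. Grouping by the last two labels of the nameserver host is a crude assumption that breaks for hosts under suffixes like co.uk, and the sample data is invented.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def provider(nameserver: str) -> str:
    """Crude provider key: the last two labels of the nameserver host name."""
    labels = nameserver.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def count_by_provider(ns_records: Iterable[Tuple[str, str]]) -> List[Tuple[str, int]]:
    """Count domains per provider; a domain counts once per provider."""
    seen = set()
    counts: Counter = Counter()
    for domain, nameserver in ns_records:
        key = (domain.lower(), provider(nameserver))
        if key not in seen:
            seen.add(key)
            counts[key[1]] += 1
    return counts.most_common()


records = [
    ("shop101.com", "ns1.parking.example."),
    ("shop101.com", "ns2.parking.example."),
    ("shop102.com", "ns1.parking.example."),
    ("bakery.com", "ns1.hosting.example."),
]
print(count_by_provider(records))  # [('parking.example', 2), ('hosting.example', 1)]
```

Running this only over the names flagged as auto-generated would show whether a few providers hold most of them.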
There is probably a better way to spend time than to analyze the internet, but sometimes when you want to know, you just got to know!