Wednesday, 15 September 2010

What is the most commonly Mispelt word on the internet?

I've wondered about this before, and finally got around to doing some research last night. For a measure of how often a word is misspelled on the internet, let's use this: the ratio of the number of times the correct spelling appears to the number of times the misspelling appears. I'm not quite sure what to do if there is more than one common way in which a given word is misspelt - probably take the ratio of the correct spelling to the sum of the incorrect spellings.

Anyway, here are my first few attempts:
definitely: definately - 96,900,000: 13,500,000 = 7.18:1
separate: seperate - 172,000,000: 12,200,000 = 14.10:1
accommodation: accomodation - 98,400,000: 12,600,000 = 7.81:1
So, after my first few tries, the old perennial "definately" seemed to be doing well.

The two people I've discussed this with both suggested having a look at the number of hits for "teh" vs "the", which I did. The ratio is:
the: teh - 11,910,000,000: 23,000,000 = 517.8:1
The most interesting thing about this fact is that "the" appears in almost 12 *billion* webpages. That means that there are now almost certainly more than two webpages in the Google index for every person on Earth. The fact that "the" is so common makes it nearly impossible for the typos to overwhelm the people getting it right.


Anyway, the best I've been able to do so far is the following:
gauge: guage - 33,100,000: 11,200,000 = 2.96:1
I got pretty excited by the number of hits for the word "miniscule", but it appears that the consensus is that that's an acceptable spelling these days, so you don't get points for that (I'm not quite sure what rules I'm using for what makes a spelling count as "acceptable", but I'm sure there are some).

Anyone beat 2.96? 

EDIT (16/9)
As Adrianna points out, I had managed to fail to misspell two of the words in my original post - now corrected. She also suggests "occurring" as a candidate, which is a new clear winner:
occurring: occuring/ocurring - 49,900,000: (22,300,000 + 22,100,000) = 1.12:1
Notice that either of these individual misspellings would already have been winning on its own. Note also that for some reason this word is much more commonly misspelled than either "occurred" or "occurrence".
So - new challenge - can anyone get a ratio below 1?
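
For anyone who wants to check the arithmetic or add their own candidates, here is a rough Python sketch of the measure used above: hits for the correct spelling divided by the summed hits for the misspellings. The names COUNTS, misspelling_ratio and rank are just made up for this illustration, and the hit counts are the ones quoted in this post, which will not match what a search returns today.

```python
# Hit counts as quoted in this post: (correct spelling hits, [misspelling hits]).
# These are a snapshot from the post, not live search results.
COUNTS = {
    "definitely": (96_900_000, [13_500_000]),
    "separate": (172_000_000, [12_200_000]),
    "accommodation": (98_400_000, [12_600_000]),
    "the": (11_910_000_000, [23_000_000]),
    "gauge": (33_100_000, [11_200_000]),
    "occurring": (49_900_000, [22_300_000, 22_100_000]),
}


def misspelling_ratio(correct_hits, misspelt_hits):
    """Correct-spelling hits divided by the summed hits for all misspellings."""
    total_wrong = sum(misspelt_hits)
    if total_wrong == 0:
        raise ValueError("need at least one hit for a misspelling")
    return correct_hits / total_wrong


def rank(counts):
    """Return (word, ratio) pairs, lowest ratio (most often misspelt) first."""
    ratios = {
        word: misspelling_ratio(correct, wrong)
        for word, (correct, wrong) in counts.items()
    }
    return sorted(ratios.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for word, ratio in rank(COUNTS):
        print(f"{word}: {ratio:.2f}:1")
```

A ratio below 1 from misspelling_ratio would mean the misspellings collectively outnumber the correct spelling, which is the challenge set above.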

3 comments:

Adrianna said...

Have you spelt definitely and accommodation identically just to mess with us or have I missed something?

Also how about occurring/ocurring/occuring

Andy said...

A good (apart from the annoying animation) website to use to test words is http://www.googlefight.com

What has surprised me most when looking for a good misspelling is just how unusual most of my common misspellings are!

John Faben said...

Andy - Googlefight gets completely different numbers to those I actually get when I search Google - their number is somewhere between 1/2 and 1/10 of the number of pages Google tells me it's found.

Also, I've just tried repeating the searches in the office and got completely different results than I did when I was at home - I imagine this means we have to change the definition to include the words "when John does the search from his home computer, whilst signed into his Google account", or come up with a better definition.