What is CAPTCHA?

Have you heard of the “CAPTCHA” tool? Probably not by name, but I’m sure you’ve seen it and even used it. Websites use it to prevent automated registrations: it verifies that a human, and not some sort of “bot,” is submitting information to the site. I know you’ve seen it: the box where you have to retype the distorted words to prove you are human. Like this:

[Image: example CAPTCHA box with distorted words]

CAPTCHA stands for “Completely Automated Public Turing Test to Tell Computers and Humans Apart.” It works because humans can read distorted text and current computers can’t. It was developed by four men at Carnegie Mellon University in 2000 for Yahoo. Three of the four creators wrote a fantastic article about it called “Telling Humans and Computers Apart: How lazy cryptographers do AI,” which is available online for free here: http://www.cs.cmu.edu/~biglou/captcha_cacm.pdf. The authors have a sense of humor too, which I loved. While explaining that it is a computer that decides whether the registrant is a human or another computer, they write, “Notice the paradox: a CAPTCHA is a program that can generate and grade tests that it itself cannot pass (much like some professors)” (Von Ahn, Blum & Langford).
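To make that paradox a little more concrete, here is a minimal Python sketch of the “generate and grade” idea. It is only my illustration, not the code from the article: the class name ToyCaptcha and its generate and grade methods are made up, and a real CAPTCHA would draw the answer as a distorted image instead of handing it back as plain text.

```python
import hmac
import secrets
import string

# Characters a challenge may contain (an arbitrary choice for this sketch).
ALPHABET = string.ascii_uppercase + string.digits


class ToyCaptcha:
    """Hypothetical toy: generates challenges and grades the responses."""

    def __init__(self):
        self._pending = {}  # token -> expected answer, kept on the server

    def generate(self, length=6):
        """Create a random challenge and remember its answer under a token."""
        answer = "".join(secrets.choice(ALPHABET) for _ in range(length))
        token = secrets.token_hex(16)
        self._pending[token] = answer
        # A real system would return a distorted image of `answer` here,
        # never the plain text. That image step is left out of this sketch.
        return token, answer

    def grade(self, token, response):
        """Check a response; each token can be graded only once."""
        expected = self._pending.pop(token, None)
        if expected is None:
            return False  # unknown or already-used token
        typed = response.strip().upper()
        return hmac.compare_digest(expected.encode(), typed.encode())
```

The program never has to read distorted text itself. It only has to remember the answer it started with and compare it to what the visitor typed, which is why it can grade a test it could not pass.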

There are several practical uses for the tool including preventing comment spam in blogs; verifying online poll respondents; preventing dictionary attacks; and thwarting spam and worms by ensuring that the person sending you an email is a real person.

If your website needs protection, you can get the CAPTCHA tool for free from the reCAPTCHA project here: http://www.google.com/recaptcha.

There’s also a little-known real-world application of the reCAPTCHA project: helping to digitize text. According to reCAPTCHA, the tool is used to “Stop spam and help digitize books at the same time! The words shown come directly from old books that are being digitized.” This is done through a “sophisticated combination of multiple OCR programs” (OCR, or optical character recognition, is software that reads text from scanned pages). Combined with the millions of answers people have typed into the challenges, it has allowed programmers to “achieve 99.5% transcription accuracy.” At the link I just provided, you can see a comparison of how the same text is transcribed by OCR alone and by reCAPTCHA. It’s pretty incredible. I’ve run across digitized text while working on genealogy and can tell you that the transcription often leaves a lot to be desired.
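Here is a rough Python sketch of how people’s answers could be pooled into a transcription. As I understand it, reCAPTCHA showed one word it already knew the answer to alongside one that the OCR programs couldn’t read, and it only trusted your guess at the unknown word if you got the known one right. The function names and the three-vote threshold below are my own assumptions for illustration, not reCAPTCHA’s actual code or settings.

```python
from collections import Counter


def record_vote(votes, control_answer, control_response, unknown_response):
    """Count the guess for the unknown word only if the control word is right."""
    if control_response.strip().lower() != control_answer.lower():
        return False  # failed the known word, so the other guess is ignored
    votes[unknown_response.strip().lower()] += 1
    return True


def accepted_transcription(votes, min_votes=3):
    """Return the leading guess once it has at least min_votes, else None."""
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None


# Example: four visitors see the known word "morning" plus one unread word.
votes = Counter()
record_vote(votes, "morning", "morning", "upon")
record_vote(votes, "morning", "mourning", "spam")  # control wrong, ignored
record_vote(votes, "morning", "Morning", "upon")
record_vote(votes, "morning", "morning", "upon")
print(accepted_transcription(votes))  # upon
```

Requiring several matching answers is what lets a crowd of strangers beat a single OCR pass: one careless typist or one bot cannot decide the word alone.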

Some historical books have been digitized and posted online as PDFs, and you can readily see the problems with the text. Some of it comes out as stray characters or symbols instead of words, which makes reading difficult.

Who knew that by using a useful tool like CAPTCHA, we would be helping to digitize old documents?

Works Cited

“CAPTCHA: Telling Humans and Computers Apart Automatically.” CAPTCHA.net. 2012. Web. 01 Nov. 2012.

“reCAPTCHA: Digitizing Books One Word at a Time.” Google.com/recaptcha. 2012. Web. 01 Nov. 2012.

Von Ahn, Luis, Manuel Blum and John Langford. “Telling Humans and Computers Apart.” Communications of the ACM. February 2004: Vol. 47, No. 2. Web. 01 Nov. 2012.

12 thoughts on “What is CAPTCHA?”

  1. Heidi Parton says:

    Great post! A Tumblr page you might like: http://www.captchart.com/ It’s “art” inspired by reCAPTCHA word combinations like “pyromaniac firefighter” and “cream atrocity.” It’s pretty funny.

  2. Great information! I think CAPTCHAs are interesting. I see them in use every day, and for those who don’t understand their purpose they seem annoying. I guess I can understand that, but I also know the amount of spam posts and comments that would occur otherwise.

    The company I work for has started looking into photo captchas and other ways to make the process more appealing. “Finish the smiley face” or “Connect the Dots” are much more interactive and still require that a real human complete the task. I wonder what effect these applications would have on the translation help. I mean, eventually captchas would run out of words to translate.

    • Evy says:

      Good ideas on the other ways to make Captchas less intrusive, Hannah. I believe that after they run out of words to translate on reCAPTCHA, then there won’t be a need for that any longer because all the documents will have been translated. 😉

  3. christhew1 says:

    I saw a hilarious episode of a TV show where one of the characters was trying to buy concert tickets online. The offer was only good for a certain amount of time and as many times as she tried she could not decipher the CAPTCHA code to complete the transaction. It was an experience I could certainly relate to!

  4. Evy says:

    I know what you mean, Chris. I have had to go through several CAPTCHAs to get the correct answer. It is frustrating, but necessary, I think. I’m glad they often have the option to listen to the words for those that we can’t decipher. 😉

  5. vickielajoie says:

    I had no idea how this worked or why it worked…thanks for clarifying! I wonder…do the bots “give it their best shot” to decipher the code and sometimes get through? 🙂 I’ll tell you, I’m just plain lucky when I do! 🙂

  6. Evy says:

    Hey again, Vickie. I think some of the CAPTCHAs pretty much let anything get through because I’ve used them on some sites where I was pretty sure I didn’t decipher it correctly but still got through. Yet on other sites, I’ve had to put it in several times because I got it wrong. Go figure.

  7. lahobson says:

    I know these are great assets for a company to sift through spam and fake accounts, but I despise these CAPTCHAs with every ounce of my being. There are so many where I feel like I am looking at ancient Egyptian hieroglyphics as I try to decipher what I am supposed to enter. I understand the importance of using a tool like this, but there has to be an easier way to distinguish between a person and a computer.

  8. KristieO says:

    Evy,
    Great topic and for me, a mystery solved. I’ve used them, understood their rationale and deciphered them, but wasn’t sure of their name and construction. Now I do! Thanks. ko
