FB Doug Meet

June 5, 2011

reCAPTCHAs make sense

reCAPTCHAs make sense of books

If you enrolled in Blurb’s Set Your Price program to sell your book, you had to enter a CAPTCHA. We had to make sure you’re human. It’s nothing personal, just something websites do to prevent fake accounts from being created. CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart, a name that pays tribute to the great British mathematician and codebreaker Alan Turing.

Blurb actually uses reCAPTCHA, a freely available verification tool that helps make books from yesteryear searchable and readable. Long story short: When books are digitized, they’re read by an OCR (Optical Character Recognition) program. Sometimes the OCR can’t make heads or tails of the text, particularly if a word is smudged or incomplete, which can be the case with old type.

ReCaptcha OCR demonstration

That’s where reCAPTCHA comes in. It shows you two words: one the computer already knows (that’s the one used to prove you’re human) and one it doesn’t. Without knowing it, you tell reCAPTCHA what the unknown word is, and that helps the OCR “read” the word correctly. If you’re a real book nerd you’ll start trying to guess which word is which.
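For the programmers in the audience, here is a minimal sketch of that two-word idea in Python. It is not reCAPTCHA's actual code: the function names, the in-memory `votes` store, and the three-vote threshold are all made up for illustration. It only shows the logic described above, where the known word decides whether you pass and your reading of the unknown word is counted only if you do.

```python
from collections import Counter, defaultdict

# Hypothetical store: unknown-word image id -> tally of human readings.
votes = defaultdict(Counter)


def normalize(text):
    """Compare answers case-insensitively and ignore stray whitespace."""
    return text.strip().lower()


def check_challenge(control_answer, typed_control, unknown_id, typed_unknown):
    """Return True if the visitor typed the known (control) word correctly.

    Only then is the visitor's reading of the unknown word recorded as a vote.
    """
    if normalize(typed_control) != normalize(control_answer):
        return False
    votes[unknown_id][normalize(typed_unknown)] += 1
    return True


def consensus(unknown_id, min_votes=3):
    """Return the most common reading once it has min_votes votes, else None.

    The threshold of 3 is an assumption chosen for this sketch.
    """
    if not votes[unknown_id]:
        return None
    word, count = votes[unknown_id].most_common(1)[0]
    return word if count >= min_votes else None
```

Once enough people who passed the known word agree on the mystery word, `consensus` hands back that reading, which is the piece that would go back to help the digitized book.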

Some estimate that about 200 million CAPTCHAs are solved by humans every day. And at about 10 seconds each, that comes to well over 150,000 work hours a day. So, look at it this way: when you’re approved for our Set Your Price program, an angel gets its wings, or at the very least a grad student can search an archaic text halfway across the world.
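If you want to check the back-of-the-envelope math yourself, here is a small Python sketch. It takes the two estimates quoted above at face value (both are rough figures, not measurements), and the function name is just for illustration. Multiplied out, the total lands a good deal higher than 150,000 hours, so that number is best read as a floor.

```python
CAPTCHAS_PER_DAY = 200_000_000  # estimate quoted in the post
SECONDS_EACH = 10               # estimate quoted in the post


def daily_hours(count, seconds_each):
    """Total human time per day, converted from seconds to hours."""
    return count * seconds_each / 3600


hours = daily_hours(CAPTCHAS_PER_DAY, SECONDS_EACH)
print(f"{hours:,.0f} hours per day")  # prints "555,556 hours per day"
```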
