Machines are getting increasingly intelligent, and much of that progress comes down to simple answers to complex questions from software development's past. As enterprise solutions constantly need to interact with humans in a digital world, the technology that tells people apart from machines keeps getting more complex. This is the story of the birth of CAPTCHA and how it, almost accidentally, fueled machine learning.

The million-dollar question

Here is a million-dollar question: how do you create a test that can distinguish humans from robots? Every human should be able to pass it and no robot should, yet a computer must still be able to grade it.

Well, this was one of the great unsolved questions of the 2000s, when Yahoo was one of the major tech innovators and spammers endangered millions of accounts.

So, what told bots apart from humans, especially in the 2000s? The answer was that humans were far better than machines at recognizing characters. In short, people could identify letters even when they were distorted.

CAPTCHA – what does it stand for?

Even though there is an ongoing debate about who invented this technology, the term CAPTCHA first appeared in publications in 2003 by Luis von Ahn and his team.

So, what does CAPTCHA stand for? It is an acronym for Completely Automated Public Turing test to tell Computers and Humans Apart. Basically, it is a reverse Turing test: a computer, rather than a human, acts as the judge.

Why do we feel like we are failing this basic test too many times?

If the test was created for any human to pass, regardless of language, age, or education, why does it feel like it takes too many tries to get it right? Well, it is because these tests are getting harder. To understand why they are getting harder, we need to understand how they work.

How does CAPTCHA work?

It started with simple technology. The CAPTCHA program would generate a code of letters, remember it as the right answer, and then distort it into an image. The computer that created the image could connect the right answer to it, but another computer looking only at the image could not. This is how a computer could grade a test that it could not solve itself: it only had to compare the typed text with the answer it already knew.
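
The generate-then-grade idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real CAPTCHA service was implemented: the names `new_challenge` and `grade` and the in-memory `_answers` store are hypothetical, and the image distortion step is left as a comment.

```python
import secrets
import string

# Hypothetical in-memory store: challenge ID -> expected answer.
_answers = {}

ALPHABET = string.ascii_uppercase


def new_challenge(length=6):
    """Create a random letter code and remember it as the correct answer."""
    code = "".join(secrets.choice(ALPHABET) for _ in range(length))
    challenge_id = secrets.token_hex(8)
    _answers[challenge_id] = code
    # A real system would render `code` as a distorted image here and send
    # only the image, never the text, to the visitor.
    return challenge_id, code


def grade(challenge_id, typed):
    """Compare the typed text with the stored answer (single use)."""
    expected = _answers.pop(challenge_id, None)
    if expected is None:
        return False  # unknown or already used challenge
    return typed.strip().upper() == expected
```

The grader never has to read the image. It only looks up the answer it stored when the challenge was created, which is why the test can be graded automatically even though a machine cannot solve it.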

However, as more and more people started using this technology, it fed computers data, eventually making them better at reading distorted text. As a result, software developers had to keep making changes to the tests to be able to differentiate humans from bots.

Why was there a need for reCAPTCHA?

Hence, they released reCAPTCHA, with the same mechanism but with two words. One was generated by the computer, so it knew the right answer; the other was taken from old articles and books. Since the computer could not read the second word, it showed that word to many different people. If a person's answer was the same as what other people had typed, the computer would rate it as correct. This way computers could further learn what that image meant. As a byproduct, an increasing number of articles and books got digitized.
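
The two-word mechanism can be sketched as a small voting scheme. The sketch below is an assumption-laden illustration rather than the actual reCAPTCHA code: the function names, the rule that a vote only counts when the known word is typed correctly, and the thresholds `min_votes` and `min_share` are all hypothetical.

```python
from collections import Counter
from typing import Optional


def submit(control_answer, typed_control, typed_unknown, votes):
    """Grade the known word; if it is right, count the other word as a vote."""
    passed = typed_control.strip().lower() == control_answer.lower()
    if passed:
        # Assumption: only people who read the known word correctly
        # are trusted to vote on the word the computer cannot read.
        votes[typed_unknown.strip().lower()] += 1
    return passed


def agreed_reading(votes, min_votes=3, min_share=0.6) -> Optional[str]:
    """Return the reading most people agree on, or None if there is no consensus yet."""
    total = sum(votes.values())
    if total < min_votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count / total >= min_share else None
```

One `Counter` would be kept per scanned word. Once `agreed_reading` returns a value, that word can be treated as digitized and could even serve as a known word in later challenges.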

In 2009 the byproduct became the main product, as Google purchased reCAPTCHA. It not only made computers better at reading but also digitized massive amounts of printed content. Later, Google used this technology to teach computers not just how to read, but how to recognize images.

Why was this a million-dollar question?

The amount of money Google paid for reCAPTCHA is not public. However, it was reportedly between 10 and 100 million dollars. Hence, the question at the beginning of the article was actually worth a lot more than a million dollars.

Current tests to tell humans apart from computers

Now that a considerable part of our lives has moved online, so much data about us is constantly being tracked that computers can differentiate between other computers and humans just by behavior. Hence, our limits prove our humanity. For example, the fact that people can’t type as fast as robots tells us apart. But just as with previous versions of CAPTCHA, it is only a matter of time until computers can mimic human behavior as well.
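
The typing-speed example can be sketched as a simple check on the gaps between keystrokes. This is a toy illustration, not how any production system works: the function name `looks_automated` and the 30-millisecond threshold are assumptions chosen only to make the idea concrete, and real behavioral checks combine many more signals.

```python
from statistics import median


def looks_automated(key_times, min_median_gap=0.03):
    """Flag input whose typical gap between keystrokes is faster than a person types.

    key_times: timestamps in seconds, one per keystroke, in order.
    min_median_gap: hypothetical threshold in seconds (an assumption).
    """
    if len(key_times) < 3:
        return False  # too little data to judge either way
    gaps = [later - earlier for earlier, later in zip(key_times, key_times[1:])]
    return median(gaps) < min_median_gap
```

A script that fills a form in a few milliseconds would be flagged, while a person pausing between keys would not. A bot that inserts human-like delays would pass this check, which is exactly the weakness the paragraph above points to.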

As apparently simple puzzles turn out to program intelligence into machines, software development blurs the line between what humans and what machines are able to do.

Image: Freepik

