CAPTCHAs are Completely Automated Public Turing tests to tell Computers and Humans Apart. We all encounter these visual puzzles used as website defenses against robo intrusion. Our solving the puzzle correctly confirms to the website that we are indeed real people. Alan Turing and CAPTCHAs made their initial appearance here at SimanaitisSays more than three years ago. Today, I describe how researchers in artificial intelligence have conquered CAPTCHAs with enhanced AI.

Treatment of a variety of CAPTCHAs by enhanced AI. Image from Science, December 8, 2017.

“Brain-Inspired Vision Model Cracks Website Protection System, With Much Less Training Compared To Deep Learning,” by Dileep George et al., is online at Scienmag, and the research is described in the December 8, 2017, issue of Science, the weekly magazine of the American Association for the Advancement of Science. The full paper is available to AAAS members.

Researchers achieved this CAPTCHA deciphering by incorporating other aspects of human intelligence into their AI. They note that recognizing composition, generalizing from this, and learning from a few examples are three hallmarks of human intelligence. In fact, humans are very good at all three.

The letter “A.” Image by Vicarious AI on Scienmag online.

Originally, AI computers responded to their human programmers’ instructions. Then machine learning took this a step further, with AI improving upon its programmed algorithms. Last, with deep learning, the AI learns by evaluating scads of examples, rather than simply reacting to pre-programmed direction.

Dileep George and his colleagues observe that “A recent deep-learning approach for parsing one specific CAPTCHA style required millions of labeled examples, whereas humans solve new styles without explicit training.”

To perform AI more efficiently, the researchers looked to neuroscience and devised something they call a recursive cortical network. An RCN is a “probabilistic generative model for vision in which message-passing-based inference handles recognition, segmentation, and reasoning in a unified manner.”

A four-level RCN representation of the letter “A.” Compare this to the illustration above. This and the following images from “A Generative Vision Model That Trains with High Data Efficiency and Breaks Text-based CAPTCHAs,” D. George et al., Science, October 26, 2017.

The researchers report that “RCN was effective in breaking a wide variety of CAPTCHAs with very little training data and without using CAPTCHA-specific heuristics. By comparison, a convolutional neural network [a form of deep learning] required a 50,000-fold larger training set and was less robust to perturbations to the input.”

Variations of an “A,” with different deformability settings, each determined by pooling and lateral perturbation factors.

The RCN approach handles clutter more efficiently than conventional deep learning. It also generates realistic responses when confronted with one-shot training of handwritten characters. Researchers note, “RCN outperformed state-of-the-art deep-learning methods while requiring 300-fold less training data.”

The researchers also see other applications of RCN parsing. For example, the approach permits recognition of multiple real-world objects in cluttered scenes with random background.

These detections and segmentations show RCN extended beyond text parsing of CAPTCHAs.

“In addition,” the researchers conclude, “our model’s effectiveness in breaking text-based CAPTCHAs with very little training data suggests that websites should seek more robust mechanisms for detecting automated interactions.”

Before long, we may have to do more than simply “Identify the boxes containing store fronts.” ds

