deCAPTCHA

Defeating reCAPTCHA with Clarifai

Thomas Clarke, Lily Elshaktori, Christopher Green, Sam Thomas
University of Birmingham

Overview

  • reCAPTCHA is really irritating!
  • Can we break reCAPTCHA's image-recognition challenges with the aid of Clarifai's object recognition API?

Solution

  1. Extract reCAPTCHA iframe from target website;
  2. Parse webpage:
    1. Extract search tag;
    2. Extract image grid;
  3. Split image grid into component images;
  4. Query the Clarifai API to obtain a set of tags for each component image;
  5. Correlate tags:
    1. Use the Porter stemming algorithm to remove morphological and inflectional suffixes;
    2. Augment each tag set with the stemmed forms of its tags and with synonyms of its tags;
    3. Find exact matches by intersecting each image's tag set with the search tag's set and checking the cardinality of the intersection;
    4. Find partial matches by intersecting each remaining image's tag set with the tags of the exact matches and comparing the cardinality of the intersection against a threshold;
  6. Generate list of matched images.
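
Steps 3 and 5 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in the project: `crude_stem` is a toy suffix stripper standing in for the Porter stemmer, the synonym table is a hypothetical hand-written dictionary, an "exact match" is assumed to mean a non-empty intersection with the search tag's set, and the partial-match threshold of 2 is an arbitrary choice. The grid split only computes crop boxes; the Clarifai query (step 4) is left out, so the tag sets are taken as given.

```python
from typing import Dict, List, Set, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower) in pixels


def grid_boxes(width: int, height: int, rows: int, cols: int) -> List[Box]:
    """Step 3: crop boxes for a rows x cols grid, in row-major order."""
    boxes = []
    for r in range(rows):
        for c in range(cols):
            boxes.append((c * width // cols, r * height // rows,
                          (c + 1) * width // cols, (r + 1) * height // rows))
    return boxes


def crude_stem(word: str) -> str:
    """Toy suffix stripper (assumption: a stand-in, NOT the Porter stemmer)."""
    word = word.lower()
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def augment(tags: Set[str], synonyms: Dict[str, Set[str]]) -> Set[str]:
    """Step 5.2: add stems and synonyms (and the synonyms' stems) to a tag set."""
    out: Set[str] = set()
    for tag in tags:
        tag = tag.lower()
        related = {tag} | synonyms.get(tag, set()) | synonyms.get(crude_stem(tag), set())
        for t in related:
            out.add(t)
            out.add(crude_stem(t))
    return out


def match_images(search_tag: str,
                 image_tags: List[Set[str]],
                 synonyms: Dict[str, Set[str]],
                 threshold: int = 2) -> List[int]:
    """Steps 5.3, 5.4 and 6: indices of exactly and partially matching images."""
    base = augment({search_tag}, synonyms)
    augmented = [augment(tags, synonyms) for tags in image_tags]

    # Exact matches: the image shares at least one tag with the search tag's set.
    exact = [i for i, tags in enumerate(augmented) if len(tags & base) > 0]

    # Partial matches: enough overlap with the pooled tags of the exact matches.
    exact_pool: Set[str] = set().union(*(augmented[i] for i in exact))
    partial = [i for i, tags in enumerate(augmented)
               if i not in exact and len(tags & exact_pool) >= threshold]

    return sorted(exact + partial)
```

For instance, with the search tag "sandwich" and one image tagged "sandwiches", "food", "bread", that image is an exact match after stemming; a second image tagged "bread", "food", "lunch" then qualifies as a partial match because it shares two tags with the first. Pooling the tags of all exact matches is one possible reading of step 5.4; comparing against each exact match separately would be stricter.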

Demonstration

Conclusion

  • Further work: build a browser plug-in that automatically intercepts reCAPTCHA iframes and solves the challenges.