What is CAPTCHA?
It is a technique used to prevent bots (web robots) from abusing web resources. The core of this technique is a computer program running on the server side that tries to distinguish human users from bots. CAPTCHAs have been widely used on user registration pages and sometimes on login and message posting pages. If you have never heard of this term, the examples below should jog your memory.
Examples
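The server-side program described above can be sketched roughly as follows. This is a minimal illustration only, not any deployed scheme: the names `issue_challenge`, `verify_response` and the in-memory `_pending` store are hypothetical, and the step that renders the answer as a distorted image or audio clip is deliberately left out.

```python
import hmac
import secrets
import string

# Hypothetical in-memory store mapping challenge IDs to expected answers.
# A real server would use a session store with expiry instead.
_pending: dict[str, str] = {}

ALPHABET = string.ascii_uppercase + string.digits


def issue_challenge(length: int = 6) -> tuple[str, str]:
    """Create a random answer and return (challenge_id, answer).

    Assumption: the caller renders `answer` in a form that is hard for a
    machine to read and sends only that rendering to the client.
    """
    answer = "".join(secrets.choice(ALPHABET) for _ in range(length))
    challenge_id = secrets.token_urlsafe(16)
    _pending[challenge_id] = answer
    return challenge_id, answer


def verify_response(challenge_id: str, response: str) -> bool:
    """Check a response; each challenge can be attempted only once."""
    expected = _pending.pop(challenge_id, None)
    if expected is None:
        return False  # unknown or already used challenge
    given = response.strip().upper()
    # Constant-time comparison on bytes avoids leaking timing information.
    return hmac.compare_digest(expected.encode(), given.encode())
```

Removing a challenge from the store on its first attempt means a bot cannot keep guessing against the same challenge.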
The Name of the Game
- CAPTCHA = Completely Automated Public Turing test to tell Computers and Humans Apart
- CAPTCHA = HIP (Human Interactive Proof)?
  - Historically, when Manuel Blum et al. coined the term HIP, they used it to cover a much broader class of security systems involving humans. While one of the major types of HIPs they proposed is CAPTCHA, the term also covers other systems such as HumanOID, a user authentication system secure against observation attacks without the use of an auxiliary device (the so-called "naked human in a glass house" setting).
  - Unfortunately, some people have used the term HIP as a synonym for CAPTCHA, which I don't agree with.
  - So my answer is: "No, CAPTCHA ⊂ HIP".
- CAPTCHA ⊂ ATT (Automated Turing Test)?
  - "The Turing Test is a test of a machine's ability to exhibit intelligent behaviour." The concept was introduced by Alan Turing, the father of computer science, in his 1950 paper "Computing Machinery and Intelligence".
  - The "standard" interpretation of a Turing Test involves a human interrogator who is challenged to judge whether one of two invisible entities in front of him/her is a machine.
  - As its full name suggests, CAPTCHA is considered an automated Turing Test, in the sense that the role of the human interrogator is automated by a machine.
- CAPTCHA ⊂ RTT (Reverse Turing Test)?
  - The reverse Turing test does not have a clear definition, "but has been used to describe various situations based on the Turing test in which the objective and/or one or more of the roles have been reversed between computers and humans."
  - Since in CAPTCHA the role of the human interrogator is played by a machine, some people consider it a reverse Turing test.
The History before CAPTCHA
Here I list work done by others before the term "CAPTCHA" appeared in late 2000 and early 2001. Some papers were published after 2000, but they were not influenced by the work on CAPTCHA done by CMU researchers.
- Moni Naor, "Verification of a human in the loop or identification via the Turing test," 1996
  - Moni Naor is a former PhD student of Manuel Blum. It is obvious that Manuel Blum et al. were inspired by Naor's early work.
- AltaVista's "Add-URL" web page, http://altavista.com/sites/addurl/newurl, protected by a scheme later known as CAPTCHA, 1997
  - This work was backed by a US patent filed in 1998: Mark D. Lillibridge, Martin Abadi, Krishna Bharat and Andrei Z. Broder, "Method for selectively restricting access to computer systems," US Patent 6195698, Original Assignee: Compaq Computer Corporation, filed on 13 April, 1998, issued on 27 February, 2001
- Jun Xu, Richard Lipton and Irfan Essa, "Hello, Are You Human?," Georgia Institute of Technology College of Computing Technical Report, GIT-CC-00-28, 13 November 2000
  - This work was later published at the ICCCN 2003 conference: Jun Xu, Richard Lipton, Irfan Essa, Minho Sung and Yong Zhu, "Mandatory Human Participation: A New Authentication Scheme for Building Secure Systems," in Proceedings of the 12th International Conference on Computer Communications and Networks (ICCCN 2003), pp. 547-552, IEEE, 2003
A Brief History of CAPTCHA
- September 2000: Udi Manber, then chief scientist of Yahoo!, described the so-called "chat room problem" to Manuel Blum at UC Berkeley. The problem concerns web bots posting spam in Yahoo!'s chat rooms.
- 2000: Manuel Blum's group launched the website www.captcha.net, which made the term CAPTCHA public along with several practical CAPTCHA designs.
- 2001: Manuel Blum moved to Carnegie Mellon University (CMU) and continued his research there.
- January 2002: The First Workshop on HIPs was held in Palo Alto, CA, USA. The workshop focused mainly on CAPTCHAs, although it also touched on other HIPs.
- 2002: The first report on breaking CAPTCHAs was released by Greg Mori and Jitendra Malik at http://www.cs.berkeley.edu/~mori/gimpy (now moved to http://www.cs.sfu.ca/~mori/research/gimpy).
  - The paper was not published until the following year, at the CVPR 2003 conference: Greg Mori and Jitendra Malik, "Recognizing Objects in Adversarial Clutter: Breaking a Visual CAPTCHA," in Proceedings of the 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), vol. 1, pp. 134-141, IEEE Computer Society, 2003
- 2002: HP researchers suggested using CAPTCHAs together with passwords to make automated online dictionary attacks harder: Benny Pinkas and Tomas Sander, "Securing Passwords Against Dictionary Attacks," in Proceedings of the 9th ACM Conference on Computer and Communications Security (ACM CCS 2002), pp. 161-170, ACM, 2002
- 2003: Manuel Blum and his collaborators published the first paper on CAPTCHA at EUROCRYPT 2003: Luis von Ahn, Manuel Blum, Nicholas J. Hopper and John Langford, "CAPTCHA: Using Hard AI Problems for Security," in Advances in Cryptology — EUROCRYPT 2003: International Conference on the Theory and Applications of Cryptographic Techniques, Warsaw, Poland, May 4–8, 2003 Proceedings, Lecture Notes in Computer Science, vol. 2656, pp. 294-311, Springer, 2003
- 2004: Manuel Blum and his collaborators published an article on CAPTCHA targeting a broader readership in computer science: Luis von Ahn, Manuel Blum and John Langford, "Telling Humans and Computers Apart Automatically," Communications of the ACM, vol. 47, no. 2, pp. 56-60, ACM, 2004
- May 2005: The Second Workshop on HIPs (HIP 2005) was held at Lehigh University, Bethlehem, PA, USA. Since this event, some researchers have started confusing HIP with CAPTCHA.
- October 2007: The first malware exploiting humans to solve hard CAPTCHAs was spotted: Mike Barwise, "Trojan tricks users into reading captchas," The H Security, 29 October 2007
- 2007: CMU researchers Luis von Ahn et al. designed reCAPTCHA, which exploits the human effort spent on solving CAPTCHAs to transcribe text that the OCR tools of Google Books cannot recognize.
- 2008 (?): The first CAPTCHA advertising service provider(s) started.
- 2009: Google acquired reCAPTCHA from CMU.
- 2010-: ...
Surveys
- Meriem Guerar, Luca Verderame, Mauro Migliardi, Francesco Palmieri and Alessio Merlo, "Gotta CAPTCHA ’Em All: A Survey of 20 Years of the Human-or-computer Dilemma," ACM Computing Surveys, vol. 54, no. 9, Article No. 192, ACM, 2021
- Antreas Dionysiou and Elias Athanasopoulos, "SoK: Machine vs. machine – A systematic classification of automated machine learning-based CAPTCHA solvers," Computers & Security, vol. 97, Article No. 101947, Elsevier, 2020
- José María Gómez Hidalgo and Gonzalo Alvarez, "CAPTCHAs: An Artificial Intelligence Application to Web Security," Advances in Computers, vol. 83, Chapter 3, pp. 109-181, Elsevier, 2011
- Marti Motoyama, Kirill Levchenko, Chris Kanich, Damon McCoy, Geoffrey M. Voelker and Stefan Savage, "Re: CAPTCHAs – Understanding CAPTCHA-Solving Services in an Economic Context," in Proceedings of 19th USENIX Security Symposium, USENIX, 2010
- Elie Bursztein, Steven Bethard, Celine Fabry, John C. Mitchell, Dan Jurafsky, "How Good Are Humans at Solving CAPTCHAs? A Large Scale Evaluation," in Proceedings of the 2010 IEEE Symposium on Security and Privacy, pp. 399-413, IEEE Computer Society, 2010
- Qiujie Li (李秋洁), Yaobin Mao (茅耀斌) and Zhiquan Wang (王执铨), "A Survey of CAPTCHA Technology" (in Chinese), Journal of Computer Research and Development (《计算机研究与发展》), vol. 49, no. 3, pp. 469-480, Editorial Office of the Journal of Computer Research and Development, 2012
Shujun's Work
Oracle-based Attacks
In 2020, Shujun and his collaborators published a paper showing that a new CAPTCHA protection mechanism against learning-based oracle attacks cannot actually prevent such attacks, because statistical differences between images belonging to different classes leak some information about the true labels. This work also led to a new general principle: future CAPTCHA designs should be checked for any statistical imbalance.
- Carlos Javier Hernández-Castro, Shujun Li and María D. R-Moreno, "All About Uncertainties and Traps: Statistical Oracle-based Attacks on a New CAPTCHA Protection Against Oracle Attacks," Computers & Security, vol. 92, Article Number 101758, 2020 © Elsevier Science B. V.
Breaking e-Banking CAPTCHAs
In 2010, Shujun and his collaborators analyzed a large number of CAPTCHA schemes deployed by many financial institutions all over the world and found that none of them is secure. Three of the schemes are used by the affected financial institutions to secure online banking transactions against automated man-in-the-middle attacks.
- Shujun Li, Syed Amier Haider Shah, Muhammad Asad Usman Khan, Syed Ali Khayam, Ahmad-Reza Sadeghi and Roland Schmitz, "Breaking e-Banking CAPTCHAs," in Proceedings of 26th Annual Computer Security Applications Conference (ACSAC 2010, Austin, TX, USA, December 6-10, 2010), pp. 171-180, 2010, DOI: 10.1145/1920261.1920288 (Acceptance rate: 39/227=17.2%) Companion Web Page (Media Coverage: Infosec Island, LLC) © ACM
Captchæcker
Since 2011, Shujun and his collaborators have been developing the idea of automating the security and usability evaluation of CAPTCHAs. Some preliminary work has been published (see below), but a complete system is still to be developed.
- Yousra Javed, Maliha Nazir, Muhammad Murtaza Khan, Syed Ali Khayam and Shujun Li, "Captchæcker: Reconfigurable CAPTCHAs Based on Automated Security and Usability Analysis," in Proceedings of 2011 4th Symposium on Configuration Analytics and Automation (SafeConfig 2011, October 31 - November 1, 2011, Arlington, VA, USA), 2011 (Acceptance rate: 10/29=34.5%) © IEEE
- Maliha Nazir, Yousra Javed, Muhammad Murtaza Khan, Syed Ali Khayam and Shujun Li, "Poster: Captchæcker – Automating Usability-Security Evaluation of Textual CAPTCHAs," in Proceedings of 7th Symposium On Usable Privacy and Security (SOUPS 2011, Carnegie Mellon University in Pittsburgh, PA, USA, July 20-22, 2011), ACM, 2011 © Authors
Pass-CAPTCHA
Since 2009, Shujun has been thinking about how to combine passwords and CAPTCHAs to improve the usability of both systems when they have to appear on the same page. He calls such a combined system "Pass-CAPTCHA". Some ideas have been proposed and one prototype system was tested in 2011-2012. More prototype systems are to be developed and tested. This is still an ongoing line of research, so no results have been published so far.
Note that combining passwords and CAPTCHAs is not in itself a new idea. Some human user authentication schemes have been designed to incorporate CAPTCHAs to reduce the risk of automated attacks. One such system, called PAS, was cryptanalyzed by us in the following paper:
- Shujun Li, Hassan Jameel Asghar, Josef Pieprzyk, Ahmad-Reza Sadeghi, Roland Schmitz and Huaxiong Wang, "On the Security of PAS (Predicate-based Authentication Service)," in Proceedings of 25th Annual Computer Security Applications Conference (ACSAC 2009, December 7-11, 2009, Honolulu, Hawaii, USA), pp. 209-218, 2009, DOI: 10.1109/ACSAC.2009.27 (Acceptance rate: 44/224=19.6%) © IEEE
Audio CAPTCHAs
Shujun has also done some work on audio CAPTCHAs. The main focus is on how to improve usability and accessibility for users with disabilities.
- Matthew Davidson, Karen Renaud and Shujun Li, "jCAPTCHA: Accessible Human Validation," in Computers Helping People with Special Needs: 14th International Conference, ICCHP 2014, Paris, France, July 9-11, 2014, Proceedings, Part I, Lecture Notes in Computer Science, vol. 8547, pp. 129-136, 2014 © Springer
Web Resources
General: T. Pavlidis's Tutorial on CAPTCHA W3C - Inaccessibility of CAPTCHA An ASP.NET Framework for Human Interactive Proofs Top 10 Worst Captchas (Paper)
CAPTCHA Designs
Text CAPTCHAs: Egglue Semantic CAPTCHA textCAPTCHA Accessible Captcha for ExpressionEngine 2.x SI CAPTCHA Anti-Spam for WordPress SMARTCHA (SeMi Automated Reverse Turing test to tell Computer and Human Apart)
Recognition Based CAPTCHAs: HELLOCAPTCHA HKCaptcha CAPTCHA Service @ ProtectWebForm.com Horst Nogajski's PHP Class hn_captcha bot-check 1.2: WordPress anti-spam comment plugin CAPTCHA @ WebSpamProtect.com CAPTCHA Image @ codeproject.com
Image Understanding Based CAPTCHAs: Uncertainty-based CAPTCHA FaceDCAPTCHA Picatcha GigoIts HumanAuth (Implementation @ Uni-Regensburg) IMAGINATION: Image-based Authentication MosaHIP
Interactive CAPTCHAs: Stickman CAPTCHA + CAPTCHA ROCK Sliceya CAPTCHA
2-D+ CAPTCHAs: Moving-Object CAPTCHAs (including Emerging Images based CAPTCHA) (demo) Juraj Rolko's 3D CAPTCHA CAPTCHAs based on depth perception: AniCAP + STE3D-CAP + STE3D-CAP-e Michael G. Kaplan's 3-D CAPTCHA Vappic 4D CAPTCHA
Other CAPTCHAs: SenCAPTCHA GEETEST极验 CAPTCHA @ Arkose Labs Heyes Captcha (demo) Codetcha (demo) Sliding CAPTCHA @ TheyMakeApps.com GeoLang's Second Generation CAPTCHA System Project
CAPTCHAs for Advertising: Confident AdCAPTCHA™ KoolCaptcha Cubecaptcha