20 May 2007
A brief overview of the best-known implementations
Carnegie Mellon's PIX CAPTCHA - a so-called "naming images CAPTCHA" - the user sees a few pictures and has to select a word that fits all of them. The main problems of this type of CAPTCHA are misspellings when typing the answer and synonyms of the answer word (for example: dog, hound, pooch). Here this is solved by transferring all variants of the answer to the client side.
Oli Warner's KittenAuth - to prove that he is human, the visitor has to select all animals of a specified species among the proposed pictures. However, the limited number of pictures makes it possible to recreate the picture base manually.
Microsoft's Asirra - in outline it is similar to KittenAuth: the user has to distinguish cats from dogs. But it works with an extremely large array of pictures (photos of homeless animals from a specialized site), so reconstructing the picture base is practically impossible.
IMAGINATION - a CAPTCHA that requires two steps to be passed. At the first step the visitor clicks somewhere on a picture composed of several images and in this way selects a single image. At the second step the selected image is loaded, enlarged but heavily distorted, and the answer variants are loaded on the client side. The visitor should select the correct answer from the proposed words.
Why are image-based CAPTCHAs not as widespread as text-based ones?
I will not go into a comparative analysis of how hard they are to crack; you can find some thoughts/calculations here and here. I want to express my point of view as a web developer. So why?
- They are too large. A CAPTCHA should not take the dominant position on a web page. It is only an ancillary element that serves to weed out bots when forms are filled in, information is requested, etc.
- Traffic. A few pictures of about 5-10 kB each add up to a lot for a single page, in my opinion. Visitors on a low-bandwidth network will be unpleasantly surprised, to say nothing of visitors using a dial-up connection.
- Inconsistency with the general concept of a web site. A CAPTCHA with cats (or dolphins) is appropriate on leisure sites but irrelevant, for example, on the site of a medical institution. In that case it is possible to gather a number of images on medical subjects, but - by analogy with Asirra - it is problematic to find a site with a large number of photos of homeless doctors :)
- The laborious process of creating the picture base.
I have to note that this is not criticism in any way - I only want to find an answer to the question above.
Let's sum up. An image-based CAPTCHA might be a good alternative to a text-based one if it were a single small, lightweight image based on a limited set of pictures.
Idea
Look at the next two pictures.
[Two pictures side by side: the original image (left) and the changed image (right)]
It is easy to notice that the right image is slightly distorted, and it is not hard to outline the rough region where the distortion takes place. The original image is not actually required for this. A human copes with this task easily even if he sees the image for the first time and does not know what it depicts - this does not apply to expressionist paintings :).
To make a long story short, a working example that you can test is worth many words.
Now about bots. I have never worked with image recognition systems and my knowledge in this area is rather poor. Perhaps the proposed variant is hard for specialized programs to parse, perhaps not - it would be interesting to hear an expert's opinion.
The code overview
Note: I have left the demo project as a demonstration of how to implement this kind of CAPTCHA. It is functional but not optimal. Do not judge this code harshly :), it was written offhand specifically to illustrate this article, which was published on May 20, 2007. The control has changed greatly since then. If you want to include such a CAPTCHA in your project, it is better to use the ready-made Image-Based Bot Detector control, which has been available for download since Jun 24, 2007. It is free, has advanced features, and is tested in most modern browsers.
The proposed image-based CAPTCHA control works as follows: the visitor sees a picture with a distorted part and has to click anywhere within the boundaries of the anomalous region. The point he clicked on is marked with a red spot.
The control has a double function: it renders its HTML content and also generates the picture itself. It has two child controls: an image and a hidden field that stores the coordinates of the point chosen by the visitor. The image URL is formed by adding a special parameter to the current request URL. When a request to this new URL arrives, the control interrupts the usual page loading process and writes the image to the response as a byte array.
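protected override void OnInit(EventArgs e)
{
    // A request that carries the special parameter asks for the image, not the page.
    if (HttpContext.Current.Request[queryKey] != null)
        DrawImage();
    .....
}

private void DrawImage()
{
    Bitmap bitmap;
    ....
    // Replace the page output with the JPEG bytes and stop further processing.
    HttpContext.Current.Response.Clear();
    HttpContext.Current.Response.ContentType = "image/jpeg"; // the registered MIME type is image/jpeg
    bitmap.Save(HttpContext.Current.Response.OutputStream, ImageFormat.Jpeg);
    HttpContext.Current.Response.End();
}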
This approach makes it possible to combine all the functionality in a single control. In a heavily loaded environment it is better to move image creation elsewhere - for example, to an HttpHandler - to avoid needlessly creating the page that hosts the control.
The coordinates of the point where the visitor clicks are calculated and visualized by means of JavaScript (tested on IE 6.0, Firefox 1.0+, Opera 9.1). They are stored in the hidden field so that they are accessible on the server side.
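The way the image URL is derived from the current request URL, as described above, can be sketched roughly as follows. This is only an illustration in JavaScript; the control does this on the server in C#. The function name `buildImageUrl`, the parameter name `captchaImg` and its value are hypothetical (the real parameter name is whatever `queryKey` holds).

```javascript
// Hypothetical helper: appends a marker parameter to the page URL so that
// a request to the resulting URL can be answered with the CAPTCHA image.
function buildImageUrl(pageUrl, queryKey, value) {
    // Drop the fragment, if any: it is never sent to the server.
    var hashPos = pageUrl.indexOf("#");
    var url = hashPos >= 0 ? pageUrl.substring(0, hashPos) : pageUrl;
    // Use "?" for the first parameter and "&" for any further one.
    var separator = url.indexOf("?") >= 0 ? "&" : "?";
    return url + separator + encodeURIComponent(queryKey) + "=" + encodeURIComponent(value);
}

// Example: the image URL for a page that already has a query string.
var imageUrl = buildImageUrl("/register.aspx?step=2", "captchaImg", "1");
```

Because the page and the image share one URL that differs only in this parameter, the check in OnInit is enough to tell the two kinds of request apart.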
function captchaClicked(hidID, e)
{
    var sender = e.target || e.srcElement;
    var offsetX, offsetY;
    // Test the type, not the value: an offset of 0 is a valid coordinate.
    if (typeof e.offsetX == "number") // IE, Opera
    {
        offsetX = e.offsetX;
        offsetY = e.offsetY;
    }
    else if (typeof e.pageX == "number")
    {
        // Sum the offsets of all ancestors to get the image position on the page.
        var left = sender.offsetLeft;
        var top = sender.offsetTop;
        var parentNode = sender.offsetParent;
        while (parentNode != null && parentNode.offsetLeft != null && parentNode.offsetTop != null)
        {
            left += parentNode.offsetLeft;
            top += parentNode.offsetTop;
            parentNode = parentNode.offsetParent;
        }
        offsetX = e.pageX - left;
        offsetY = e.pageY - top;
    }
    // Pass the click position (relative to the image) to the server.
    document.getElementById(hidID).value = offsetX + "," + offsetY;

    // Create the red marker on the first click and reuse it afterwards.
    var spot = document.getElementById("spotOnCaptha");
    if (!spot)
    {
        spot = document.createElement("span");
        spot.id = "spotOnCaptha";
        spot.style.height = "3px";
        spot.style.width = "3px";
        spot.style.fontSize = "1px";
        spot.style.backgroundColor = "red";
        spot.style.position = "absolute";
        spot.style.zIndex = 100;
        sender.parentNode.appendChild(spot);
    }
    // CSS lengths need a unit; without "px" the value is ignored in standards mode.
    spot.style.left = (e.pageX || (e.clientX + document.body.scrollLeft)) + "px";
    spot.style.top = (e.pageY || (e.clientY + document.body.scrollTop)) + "px";
}
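On the server, the value stored in the hidden field has to be parsed and compared with the remembered rectangle. The check is sketched here in JavaScript to stay close to the script above; the control itself performs it in C#. The names `parseClick` and `isInsideRect` and the `{x, y, width, height}` shape of the rectangle are assumptions made for this illustration.

```javascript
// Parses the "x,y" string written by captchaClicked into a point.
// Returns null when the value is missing or malformed.
function parseClick(value) {
    var match = /^(-?\d+),(-?\d+)$/.exec(value || "");
    if (!match) {
        return null;
    }
    return { x: parseInt(match[1], 10), y: parseInt(match[2], 10) };
}

// True when the point lies inside the rectangle. The left and top edges
// are inclusive, the right and bottom edges are exclusive.
function isInsideRect(point, rect) {
    return point !== null &&
        point.x >= rect.x && point.x < rect.x + rect.width &&
        point.y >= rect.y && point.y < rect.y + rect.height;
}

// Example: a 20x20 distorted region whose top-left corner is at (20, 20).
var region = { x: 20, y: 20, width: 20, height: 20 };
var passed = isInsideRect(parseClick("25,31"), region);
```

Treating a malformed value as a failed attempt, as this sketch does, means that a request sent without a real click cannot pass the check by accident.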
Now about the creation of the CAPTCHA image. A rectangle with randomly chosen coordinates is selected in the loaded template picture, and its coordinates are saved in the Session. It is possible to store them in the ViewState instead, but then they have to be encrypted, because the ViewState is accessible on the client side. Then the part of the image inside the rectangle is distorted by rescaling (any other distortion may be used; the only condition is that it has to be noticeable to site visitors).
There is another problem as well. It is possible to compare the template (original) image and the final distorted image pixel by pixel in order to find the distorted area. To prevent this, the template image is also changed in a random way. In the example below it is stretched or compressed by a random factor.
using (System.Drawing.Image img = System.Drawing.Image.FromFile(this.Page.MapPath(TemplateImageUrl)))
{
    // The CAPTCHA is smaller than the template, so even a downscaled template fills it.
    int captchaWidth = (int)(img.Width * 0.9);
    int captchaHeight = (int)(img.Height * 0.9);
    bitmap = new Bitmap(captchaWidth, captchaHeight);
    using (Graphics g = Graphics.FromImage(bitmap))
    {
        int rectWidth = 20;
        int rectHeight = 20;
        Random r = new Random();

        // Rescale the whole template by a random factor of roughly 0.9 to 1.1
        // so that it never matches the original pixel by pixel.
        float scaleX = 1 + r.Next(-100, 100) / 1000f;
        float scaleY = 1 + r.Next(-100, 100) / 1000f;
        g.ScaleTransform(scaleX, scaleY);
        g.DrawImage(img, 0, 0, img.Width, img.Height);
        g.ResetTransform();

        // Pick the rectangle to distort and remember it for validation.
        int x = r.Next(0, captchaWidth - rectWidth);
        int y = r.Next(0, captchaHeight - rectHeight);
        Rectangle distortedRect = new Rectangle(x, y, rectWidth, rectHeight);
        HttpContext.Current.Session["ImgCAPTCHA.coords"] = distortedRect;

        // Take a source area 10 px larger, kept inside the picture,
        // and squeeze it into the chosen rectangle.
        rectWidth = rectWidth + 10;
        rectHeight = rectHeight + 10;
        if (x + rectWidth > captchaWidth)
            x = captchaWidth - rectWidth;
        if (y + rectHeight > captchaHeight)
            y = captchaHeight - rectHeight;
        g.DrawImage(img, distortedRect, new Rectangle(x, y, rectWidth, rectHeight), GraphicsUnit.Pixel);

        // Draw a border around the whole picture.
        g.DrawRectangle(Pens.Black, 0, 0, captchaWidth - 1, captchaHeight - 1);
    }
}
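To see why the template itself has to vary, consider the comparison attack mentioned above. The following JavaScript sketch assumes, purely for illustration, that both pictures are available as equally sized two-dimensional arrays of grayscale values; `findChangedRegion` is a hypothetical name. It returns the bounding box of all pixels that differ by more than a tolerance.

```javascript
// Compares two equally sized grayscale images (arrays of rows) and returns
// the bounding box of the pixels that differ, or null if none differ.
function findChangedRegion(original, changed, tolerance) {
    var minX = Infinity, minY = Infinity, maxX = -1, maxY = -1;
    for (var y = 0; y < original.length; y++) {
        for (var x = 0; x < original[y].length; x++) {
            if (Math.abs(original[y][x] - changed[y][x]) > tolerance) {
                if (x < minX) minX = x;
                if (y < minY) minY = y;
                if (x > maxX) maxX = x;
                if (y > maxY) maxY = y;
            }
        }
    }
    if (maxX < 0) {
        return null; // the images match within the tolerance
    }
    return { x: minX, y: minY, width: maxX - minX + 1, height: maxY - minY + 1 };
}

// Example: a 5x5 picture in which a 2x2 block has been altered.
var before = [], after = [];
for (var row = 0; row < 5; row++) {
    before.push([0, 0, 0, 0, 0]);
    after.push([0, 0, 0, 0, 0]);
}
after[2][1] = 200; after[2][2] = 200;
after[3][1] = 200; after[3][2] = 200;
var box = findChangedRegion(before, after, 10);
```

If the template were served unchanged, the returned box would be exactly the distorted rectangle. Once the whole picture is rescaled by a random factor, most pixels differ and the box grows to cover nearly the entire image, which defeats at least this naive comparison.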
Maybe the proposed image creation algorithm looks imperfect. Indeed, it is. This article is a presentation of the idea rather than a description of the control (the code is just an example).
How to use
To use the control, assign the path of the template image to the TemplateImageUrl property. The result of the CAPTCHA check is the value of the IsValid property. In addition, the control can lock a user after a fixed number of failed attempts per session (FailedAttemptsBeforeLocking, IsLocked). You can see how it works on the test page. The full code is available in the demo project.
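The attempt counting behind FailedAttemptsBeforeLocking and IsLocked might look roughly like the following. This is a sketch of the logic in JavaScript, not the control's actual code: `createAttemptTracker` is a hypothetical name, a plain object stands in for the session, and the assumption that a successful attempt does not reset the counter is mine.

```javascript
// Counts failed attempts in a session-like object and refuses any further
// attempts once the limit has been reached.
function createAttemptTracker(session, failedAttemptsBeforeLocking) {
    session.failedAttempts = session.failedAttempts || 0;
    return {
        isLocked: function () {
            return session.failedAttempts >= failedAttemptsBeforeLocking;
        },
        // Records the outcome of one attempt and returns whether it is accepted.
        register: function (clickWasInsideRegion) {
            if (this.isLocked()) {
                return false; // a locked visitor fails regardless of the click
            }
            if (!clickWasInsideRegion) {
                session.failedAttempts++;
            }
            return clickWasInsideRegion;
        }
    };
}

// Example: lock the visitor after two wrong clicks.
var session = {};
var tracker = createAttemptTracker(session, 2);
tracker.register(false);
tracker.register(false);
var lockedNow = tracker.isLocked();
```

Keeping the counter in the session means that the lock lasts only as long as the session does, which matches the "per session" wording above.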