This is a repost of something I wrote relating to a site I run - hope it will be useful to others here...

A web crawler ("spider") such as Google's will try to look at all pages by following all links. The *spider* is not a logged-in user, so all pages will show the *register* link, which the spider will then follow, ending up at the /user/register page where a *captcha* image is displayed. The CPU and memory required to generate this image are much higher than for pages without the *captcha*. A single access certainly wouldn't bring the server to its knees, but repeated access might.

Following on from the above, and probably more significant: a *badly-behaved* spider or a deliberate DoS attack could cause high CPU loads, and a DDoS attack targeting a captcha-based page could be a very significant risk. Even if not intended as a DoS, a script-based password-cracking or spam attempt might have the same effect.

### Benchmark of access to a standard page (site front page)

ab -n 100 -c 10 http://chat.ksfiomdepositors.org/

Requests per second:    849.65 [#/sec] (mean)

No noticeable effect on server load.

### Benchmark of access to a page with *captcha* image - /user/register

ab -n 100 -c 10 http://chat.ksfiomdepositors.org/user/register

Requests per second:    5.15 [#/sec] (mean)

Server load increased from around 0.5 to around 2.5 while the test was running.

### Simulated DoS attack - /user/register

ab -n 1000 -c 100 http://chat.ksfiomdepositors.org/user/register

Requests per second:    12.37 [#/sec] (mean)

Server load increased to around 15 while the test was running.
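
To put the three throughput figures side by side, the ratios can be worked out directly from the `ab` output above. This is only arithmetic on the numbers already reported; the variable and function names are mine. Note that it compares two different pages, not the same page with and without the captcha.

```python
# Requests per second (mean) as reported by ab in the three runs above.
front_page_rps = 849.65      # ab -n 100 -c 10, site front page
register_rps = 5.15          # ab -n 100 -c 10, /user/register
register_heavy_rps = 12.37   # ab -n 1000 -c 100, /user/register


def slowdown(baseline_rps: float, measured_rps: float) -> float:
    """How many times fewer requests per second were served than the baseline."""
    return baseline_rps / measured_rps


def ms_per_request(rps: float) -> float:
    """Average milliseconds per completed request, across all concurrent clients."""
    return 1000.0 / rps


if __name__ == "__main__":
    runs = [
        ("front page, c=10", front_page_rps),
        ("/user/register, c=10", register_rps),
        ("/user/register, c=100", register_heavy_rps),
    ]
    for label, rps in runs:
        print(
            f"{label:24s} {rps:8.2f} req/s  "
            f"{ms_per_request(rps):8.2f} ms/req  "
            f"{slowdown(front_page_rps, rps):6.1f}x slower than front page"
        )
```

At the same concurrency the register page comes out roughly 165 times slower than the front page.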

### Conclusion

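* This indicates that generating the *captcha* image requires around 100 times or more the processing power of a standard page: the front page was served at 849.65 requests per second against 5.15 for /user/register, a ratio of roughly 165.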

* Drupal's image-based captcha module is a potential target for a DoS attack. Unless the site is protected in some other way, consider replacing the image captcha with an alternative. The current module should, IMO, be updated to provide some internal caching of generated images.
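
To illustrate the kind of caching meant in the last point, here is a minimal sketch of a pool of pre-rendered challenges. It is not how the Drupal CAPTCHA module works, and `render` is a hypothetical stand-in for whatever function actually draws the image. The idea is that the expensive rendering happens a fixed number of times (say, from cron), and each page view only picks an existing image.

```python
import secrets
from typing import Callable, List, Tuple

# Characters that are hard to confuse with each other (an arbitrary choice).
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"


class ChallengePool:
    """Holds pre-rendered (code, image_bytes) pairs so page views do no rendering."""

    def __init__(
        self,
        render: Callable[[str], bytes],  # hypothetical expensive image renderer
        size: int = 50,
        code_length: int = 5,
    ) -> None:
        if size < 1:
            raise ValueError("size must be at least 1")
        self._render = render
        self._size = size
        self._code_length = code_length
        self._pool: List[Tuple[str, bytes]] = []
        self.refresh()

    def refresh(self) -> None:
        """Re-render the whole pool; meant to run periodically, not per request."""
        pool = []
        for _ in range(self._size):
            code = "".join(
                secrets.choice(ALPHABET) for _ in range(self._code_length)
            )
            pool.append((code, self._render(code)))
        self._pool = pool

    def get(self) -> Tuple[str, bytes]:
        """Return a random pre-rendered challenge without rendering anything."""
        return secrets.choice(self._pool)
```

The trade-off is that images are reused between refreshes, so anyone who harvests the pool can solve each image once and replay the answers. The pool size and refresh interval would have to be chosen with that in mind.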

Tests were performed with Drupal's caching disabled but the MySQL query cache enabled, using captcha-6.x-1.0-rc2, captcha_pack-6.x-1.0-beta2 and Drupal 6.8.

OK, so I'm going to replace the image captcha with a text-based captcha, at least temporarily, to see if the "overload" issue we have been experiencing goes away.

-------
# Addition to standard Drupal-installed robots.txt
# Block ALL paths from /user, esp. important for captcha graphic, and probably good for user privacy.
Disallow: /user/*
Disallow: /?q=user/*
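
The `*` in these rules is a wildcard extension that major crawlers such as Googlebot honour, but that some simpler parsers treat as a literal character. As a rough sketch of how wildcard-aware matching works (my own helper, not taken from any crawler's code), the pattern can be turned into a regular expression anchored at the start of the path, with `*` matching any run of characters and a trailing `$` anchoring the end:

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Check a robots.txt path pattern against a URL path, wildcard-style.

    '*' matches any sequence of characters; a trailing '$' means the
    pattern must match up to the end of the path. Otherwise the pattern
    only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then join the pieces with '.*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    return re.match(regex, path) is not None


if __name__ == "__main__":
    for rule in ("/user/*", "/?q=user/*"):
        for path in ("/user/register", "/?q=user/register", "/user", "/node/1"):
            print(f"{rule:12s} {path:20s} {rule_matches(rule, path)}")
```

Under this interpretation `/user/*` covers the same paths as the plain prefix `/user/`, so dropping the `*` should give the same result for wildcard-aware crawlers while also working with parsers that only do prefix matching.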

Comments

andy inman

Should have mentioned: ab was run locally on the server itself. Given the high CPU load involved (network bandwidth being relatively unimportant in this case), I imagine very similar results would be obtained by accessing the captcha-protected page from a remote location.

dddave

Shouldn't security issues be reported directly to the security team?

andy inman

Well, it's arguable whether this is a security issue - a DoS attack is possible in all kinds of ways, but good point, I'll do that.

stoptime

On my 6.9 installation, the robots.txt file already handles all possible pages that a Captcha challenge may be issued on (where a site requires a user to have an account to comment) - check it out:

# Paths (clean URLs)
Disallow: /admin/
Disallow: /comment/reply/
Disallow: /contact/
Disallow: /logout/
Disallow: /node/add/
Disallow: /search/
Disallow: /user/register/
Disallow: /user/password/
Disallow: /user/login/
# Paths (no clean URLs)
Disallow: /?q=admin/
Disallow: /?q=comment/reply/
Disallow: /?q=contact/
Disallow: /?q=logout/
Disallow: /?q=node/add/
Disallow: /?q=search/
Disallow: /?q=user/password/
Disallow: /?q=user/register/
Disallow: /?q=user/login/
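
A rule set like this can be checked mechanically. The sketch below feeds a few of the rules listed above to Python's standard `urllib.robotparser`, which does plain prefix matching. The `User-agent: *` line is an assumption on my part (the rules need one to apply to), and real crawlers may normalise URLs differently, so treat the result as an indication only. The thing worth checking is whether a rule ending in a slash also covers the same path without the slash, since the benchmark above requested `/user/register` with no trailing slash.

```python
from urllib.robotparser import RobotFileParser

# A subset of the rules quoted above, plus an assumed User-agent line.
RULES = """\
User-agent: *
Disallow: /admin/
Disallow: /user/register/
Disallow: /user/password/
Disallow: /user/login/
"""


def is_blocked(path: str, rules: str = RULES, agent: str = "*") -> bool:
    """True if the given rules forbid `agent` from fetching `path`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return not parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/user/register/", "/user/register", "/user/login/", "/node/1"):
        print(f"{path:18s} {'blocked' if is_blocked(path) else 'allowed'}")
```

With this parser, `/user/register/` is reported as blocked but `/user/register` is not, because the rule's trailing slash is part of the prefix being matched.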

Of course, only allowing registered users to comment helps tremendously.

For a very simple alternative approach to blocking spam-bots, check out this page on the Chicago Reader:
http://tinyurl.com/ck7no6

Once you click the 'comment' button you are presented with a form that requires you to select "human" from a select/option list. Very ingenious!
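
For comparison with the image captcha, the server-side cost of this kind of check is a single string comparison. A minimal sketch of the idea follows; I have not seen the Chicago Reader's implementation, and the field name and option labels are hypothetical:

```python
from typing import Mapping

FIELD_NAME = "commenter_type"  # hypothetical form field name
# Options shown in the select list; the first one is the pre-selected default,
# so a bot that submits the form untouched does not pass.
CHOICES = ("- select -", "robot", "human")
EXPECTED = "human"


def passes_human_check(form: Mapping[str, str]) -> bool:
    """True only if the submitted form explicitly selected the expected option."""
    return form.get(FIELD_NAME, "").strip().lower() == EXPECTED
```

A bot written specifically for the site would get past this trivially, so it only filters generic form-filling scripts.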

dddave

of the Chicago Reader. Really easy, and not as intrusive as a lot of captchas are.

gerhard killesreiter

To make any sense you should also have tested the registration page with captcha disabled.

Looking at the results makes me think that you should have examined them more closely before publishing them. I don't think that the registration page should require as many more resources as your test would have us believe.

There are also no error estimates at all.
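
On the point about error estimates: `ab` reports a single mean per run, so one way to get an error bar is to repeat each run several times and summarise the "Requests per second" lines. A minimal sketch, assuming the output of each run has been captured as text; the function names are mine, and the same approach would apply to a run against the registration page with the captcha disabled:

```python
import re
import statistics
from typing import List, Optional, Tuple

# Matches the summary line printed by ab, e.g.
# "Requests per second:    849.65 [#/sec] (mean)"
_RPS_LINE = re.compile(r"^Requests per second:\s+([0-9.]+)", re.MULTILINE)


def parse_requests_per_second(ab_output: str) -> Optional[float]:
    """Extract the mean requests-per-second figure from one ab run, if present."""
    match = _RPS_LINE.search(ab_output)
    return float(match.group(1)) if match else None


def summarize(samples: List[float]) -> Tuple[float, float]:
    """Return (mean, standard error of the mean) over repeated runs."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate an error")
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / len(samples) ** 0.5
    return mean, sem
```

Usage would be along the lines of `summarize([parse_requests_per_second(text) for text in outputs])`, where `outputs` holds the captured text of several identical `ab` runs.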

andy inman

Yes, good point. I was testing on a live site (the whole reason for starting the process was to try to find the cause of occasional overloading on that particular site), so disabling captcha completely was not an option; I would have to repeat the tests on a development site. As I said, the posting was in the hope that it might be useful. I did not mean to suggest that I had completely researched the topic.

wundo

We are discussing it in the CAPTCHA Group thread http://groups.drupal.org/node/19483