This is a repost of something I wrote relating to a site I run - hope it will be useful to others here...
A web crawler ("spider") such as Google's will try to look at all pages by following all links. The *spider* is not a logged-in user, so all pages will show the *register* link, which the spider will then follow and end up at the /user/register page where a *captcha* image will be displayed. The CPU and memory required to generate this image are much higher than for pages without the *captcha*. A single access certainly wouldn't bring the server to its knees, but repeated access might.
Following on from the above, and probably more significant: a *badly-behaved* spider or deliberate DoS attack could cause high CPU loads, and a DDoS attack targeting a captcha-based page could be a very significant risk. Even if not intended as DoS, a script-based password-cracking or spam attempt might have the same effect.
### Benchmark of access to a standard page (site front page)
ab -n 100 -c 10 http://chat.ksfiomdepositors.org/
Requests per second: 849.65 [#/sec] (mean)
No noticeable effect on server load.
### Benchmark of access to a page with *captcha* image (/user/register)
ab -n 100 -c 10 http://chat.ksfiomdepositors.org/user/register
Requests per second: 5.15 [#/sec] (mean)
Server load increased from around 0.5 to around 2.5 while the test was running.
### Simulated DoS attack (/user/register)
ab -n 1000 -c 100 http://chat.ksfiomdepositors.org/user/register
Requests per second: 12.37 [#/sec] (mean)
Server load increased to around 15 while the test was running.
### Conclusion
* This indicates that generating the *captcha* image requires around 100 times or more the processing power of a standard page.
* Drupal's image-based captcha module is a potential target for DoS attack. Unless the site is protected in some other way, consider replacing image captcha with an alternative. The current module should, IMO, be updated to provide some internal caching of generated images.
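
The "around 100 times" figure can be read off the throughput numbers reported by ab. A quick Python check of the ratios follows; the variable names are just labels for the three runs, and requests per second is only a rough proxy for processing cost.

```python
# Requests per second reported by ab in the three runs above.
front_page = 849.65       # front page, -n 100 -c 10
register_c10 = 5.15       # /user/register, -n 100 -c 10
register_c100 = 12.37     # /user/register, -n 1000 -c 100

# How many times more requests per second the front page sustained.
ratio_c10 = front_page / register_c10
ratio_c100 = front_page / register_c100

print(round(ratio_c10))   # 165
print(round(ratio_c100))  # 69
```

So the captcha page was served roughly 70 to 165 times more slowly than the front page, depending on the concurrency level.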
Tests were performed with Drupal's caching disabled but MySQL query cache enabled, captcha-6.x-1.0-rc2, captcha_pack-6.x-1.0-beta2, Drupal 6.8
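
The caching suggestion in the conclusion could take several forms. One possibility is a small pool of pre-rendered challenges, so that serving the form is a lookup rather than a fresh render. The sketch below is in Python purely for illustration (the module itself is PHP), and `render_captcha` and `CaptchaPool` are hypothetical names, not part of the captcha module.

```python
import random
import string

render_calls = 0  # counts how often the expensive step actually runs


def render_captcha(text):
    """Hypothetical stand-in for the expensive image rendering step."""
    global render_calls
    render_calls += 1
    return ("IMAGE:" + text).encode("ascii")


class CaptchaPool:
    """Pre-renders a fixed number of challenges and serves them at random."""

    def __init__(self, size, length=5):
        self._pool = []
        for _ in range(size):
            text = "".join(random.choices(string.ascii_uppercase, k=length))
            self._pool.append((text, render_captcha(text)))

    def get(self):
        # Serving a challenge is a list lookup, not a fresh render.
        return random.choice(self._pool)


pool = CaptchaPool(size=5)
for _ in range(100):
    answer, image = pool.get()

print(render_calls)  # 5 renders for 100 requests
```

Reusing images makes the challenge easier to replay, so a real implementation would probably want to rotate the pool regularly.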
Ok, so, I'm going to replace the image captcha with a text-based captcha, at least temporarily to see if the "overload" issue we have been experiencing goes away.
-------
# Addition to standard Drupal-installed robots.txt
# Block ALL paths from /user, esp. important for captcha graphic, and probably good for user privacy.
Disallow: /user/*
Disallow: /?q=user/*
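
The trailing `*` in these rules is a wildcard extension that Google and several other crawlers honour; crawlers that only do plain prefix matching may treat it literally. As a rough illustration of how such a pattern is matched against a request path, here is a minimal Python sketch. The `rule_matches` helper is hypothetical, and its semantics are an assumption modelled on the commonly documented `*` and `$` handling.

```python
import re


def rule_matches(pattern, path):
    """Return True if a robots.txt path pattern covers the given URL path.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the path, everything else is a literal prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where a '*' was.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


print(rule_matches("/user/*", "/user/register"))        # True
print(rule_matches("/?q=user/*", "/?q=user/register"))  # True
print(rule_matches("/user/*", "/node/1"))               # False
```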
Comments
Should have mentioned, ab was
Should have mentioned, ab was run locally on the server itself, but given the high cpu required (and net bandwidth relatively unimportant in this case) I imagine very similar results would be obtained by accessing the captcha-protected page from a remote location.
Shouldn't security issues be
Shouldn't security issues be reported directly to the security team?
Well, it's arguable whether
Well, it's arguable whether this is a security issue - a DoS attack is possible in all kinds of ways, but good point, I'll do that.
Drupal's Default robots.txt
On my 6.9 installation, the robots.txt file already handles all possible pages that a Captcha challenge may be issued on (where a site requires a user to have an account to comment) - check it out:
# Paths (clean URLs)
Disallow: /admin/
Disallow: /comment/reply/
Disallow: /contact/
Disallow: /logout/
Disallow: /node/add/
Disallow: /search/
Disallow: /user/register/
Disallow: /user/password/
Disallow: /user/login/
# Paths (no clean URLs)
Disallow: /?q=admin/
Disallow: /?q=comment/reply/
Disallow: /?q=contact/
Disallow: /?q=logout/
Disallow: /?q=node/add/
Disallow: /?q=search/
Disallow: /?q=user/password/
Disallow: /?q=user/register/
Disallow: /?q=user/login/
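
Since these rules end in a slash and robots.txt matching is by path prefix, it may be worth confirming that the slash-less URL is covered as well. A quick check with Python's standard `urllib.robotparser` follows; `example.com` is a placeholder host and only three of the rules are reproduced.

```python
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: *",
    "Disallow: /user/register/",
    "Disallow: /user/password/",
    "Disallow: /user/login/",
]

parser = RobotFileParser()
parser.parse(rules)

# example.com is a placeholder host, not the site discussed in this thread.
for path in ("/user/register/", "/user/register"):
    allowed = parser.can_fetch("*", "http://example.com" + path)
    print(path, "allowed" if allowed else "blocked")
```

With this parser the rule blocks `/user/register/` but not `/user/register`, so if the form is served at the slash-less URL, a rule without the trailing slash may be needed.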
Of course, only allowing registered users to comment helps tremendously.
For a very simple alternative approach to blocking spam-bots, check out this page on the Chicago Reader:
http://tinyurl.com/ck7no6
Once you click the 'comment button' you are presented with a form that requires you to select "human" from a select/option list. Very ingenious!
I like the solution
of the Chicago Reader. Really easy and not as intrusive as a lot of captchas are.
To make any sense you should
To make any sense you should also have tested the registration page with captcha disabled.
Looking at the results makes me think that you should have examined them more closely before publishing them. I don't think that the registration page should require as many more resources as your test suggests.
There are also no error estimates at all.
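
One way to get such estimates would be to repeat each ab run several times and report the spread. A minimal Python sketch follows, assuming the requests-per-second figure from each run has been collected into a list; the `summarize` helper is hypothetical and the sample values are made-up placeholders, not measurements from this thread.

```python
import statistics


def summarize(samples):
    """Return mean, sample standard deviation and standard error."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    stderr = stdev / len(samples) ** 0.5
    return mean, stdev, stderr


# Placeholder values for illustration only, NOT real benchmark results.
placeholder_runs = [10.0, 12.0, 14.0]

mean, stdev, stderr = summarize(placeholder_runs)
print(f"{mean:.2f} +/- {stderr:.2f} requests per second")
```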
Yes, good point. I was
Yes, good point. I was testing on a live site (since the whole reason for starting the process was to try to find the cause of occasional overloading on that particular site), so disabling captcha completely was not an option; I would have to repeat the tests on a development site. As I said, the posting was in the hope that it might be useful; I did not mean to suggest that I had completely researched the topic.
Check out g.d.o/captcha
We are discussing it in the CAPTCHA Group thread http://groups.drupal.org/node/19483