Static cache captcha_point form_id query

alienbrain - March 29, 2009 - 22:46
Project:CAPTCHA
Version:6.x-2.x-dev
Component:Code
Category:feature request
Priority:normal
Assigned:alienbrain
Status:needs review
Issue tags:Novice
Description

Hello,

It appears that a performance gain could be achieved by statically caching the captcha_point query in hook_form_alter.

#1

soxofaan - March 29, 2009 - 23:03

I question if that would offer a significant performance gain (that would be worth the additional code complexity/maintenance).
The performance gain would only be for pages with more than one CAPTCHA (which is not a very common use case, as far as I "guess"), and it would be a performance loss (because of useless overhead) for pages with only one CAPTCHA (which is the majority of use cases, as far as I "guess").

Is your suggestion based on measurements/statistics of real live sites?

#2

alienbrain - March 29, 2009 - 23:30

Oh that was quick. Gotta love that :)

To answer your question right away: No, I actually just used it for the first time less than 1 hour ago! I was checking the code for doing some lighty form_alter and this query came to my attention.

However, I'd like to correct something you mentioned which is that the performance gain would be for pages which have more than 1 form. A lot of busy sites have more than one form in their pages. If you have the search block on (like drupal.org) that's 1 form per page, at minimum. Add the login form, and they are 2, and those are just the site-wide ones.

Also, I believe the overhead for static caching is very little that I doubt it's noticeable at all by any measure. So if there is only one form, static caching would never hurt. On the good side though, I'd imagine that this static caching would be sealed behind a nice function (dare I say API?) e.g. captcha_point($form_id) which would do the work of static caching and returning the result when asked to. That's better than caching it in hook_form_alter() because modules can use that (which is actually what I was trying to do, I wanted to add something to the form, only if captcha would be present, and an isset($form['captcha']) wasn't working, apparently my module is executed before captcha.module is).

Anyway, I'm far from qualified to discuss potential API benefits for captcha, I'd be attaching a patch now otherwise :P

Cheers

#3

soxofaan - April 3, 2009 - 16:48
Priority:normal» minor

Ok, I am probably wrong about the performance loss.
It would make sense to introduce static caching for the captcha_point data.
However, there are more important issues to be solved for the moment (and I don't have much time to even solve these important ones), so I'm setting it to minor priority for now (and flagging this low-hanging-fruit as "novice")

#4

alienbrain - April 3, 2009 - 17:24
Assigned to:Anonymous» alienbrain

I've to say I'm slightly disappointed that you regard such a performance improvement (saving 1+ queries per request) as a minor :)

That said, I can help! I will dedicate some time to this in the next fews days and post a patch.

#5

soxofaan - April 3, 2009 - 17:53
Priority:minor» normal

Sorry, didn't mean to hurt your feelings :)
but in the light of lacking enough time to make 6.x-2.x at least work, I have to set some priorities right
and thinking about caching is typically something you do after having something that works (so you know what are the interesting parts to cache).
Anyway, thanks for jumping in!

For what it's worth: currently I'm working on a rather heavy refactoring of the module, but it's not yet in Drupal's CVS,
If you're familiar with git, you can find the current progress at http://github.com/soxofaan/drupal-captcha-dev/tree/hook_elements
So if you want to create a patch, it's probably better to use that version.

#6

alienbrain - April 3, 2009 - 18:33

Brilliant, I'm familiar with git. Actually, I will do just as you say and make a patch against the git tree.

#7

soxofaan - April 7, 2009 - 09:15

I committed the work from the git branch hook_elements to CVS
and I am now working on a new branch captcha_admin_mode, so you should switch to that one.

You'll find (a.o) in captcha.module a function captcha_get_form_id_setting(). That's where the static caching should go in.

#8

soxofaan - April 15, 2009 - 21:19

ok, I'm in doubt again.

In #2 you argue that there can be several CAPTCHA protected forms on the same page, which is certainly true.
But those forms are most likely different forms (different form_id),
so (page level) static caching wouldn't deliver much gain because there would be mainly cache misses.

#9

alienbrain - April 16, 2009 - 00:54

.. there can be several CAPTCHA protected forms on the same page, which is certainly true.

There is nothing I'm disagreeing with here, but I want to say that I'm looking at all the forms, not only the captcha protected ones. Because as I understand, for every form in the page (not only those with Captchas) a query to captcha_points would be executed to figure out if a captcha is needed for the given form (_captcha_form_alter_untrusted_user).

But those forms are most likely different forms (different form_id) so (page level) static caching wouldn't deliver much gain because there would be mainly cache misses.

But we are not going to cache per form, we are going to cache the whole stash of captcha points in one go. Here is an example of what I have in mind (totally untested + written right now):

<?php
function captcha_get_point($form_id) {
  static
$captcha_points = FALSE;

 
// The first form will cause the cache to be populated.
 
if ($captcha_points === FALSE) {
   
$captcha_points = array();
   
// We fire one query, and get all the available points in one go.
   
$result = db_query('SELECT form_id, module, type FROM {captcha_points}');
     while (
$row = db_fetch_object($result)) {
      
$captcha_points[$row->form_id] = $row;
     }
   
// By now, our cache is built and this block won't be executed again.
 
}

 
// Serve from cache
 
return isset($captcha_points[$form_id]) ? $captcha_points[$form_id] : FALSE;
}
?>

And then this function would be used in different places instead of SELECTing from the captcha_points table.

#10

alienbrain - April 15, 2009 - 23:50

I also apologize for not getting earlier at this as promised, got really swamped up with work! :)

#11

alienbrain - April 16, 2009 - 00:56

I obviously was too much OAuth-ing these days; I was using the term 'captcha endpoints' instead of captcha points, I edited #8 with an s/endpoint/point/, sorry if you read #8 already, if you hadn't then just ignore this :)

#12

alienbrain - April 28, 2009 - 21:26
Status:active» needs review

Patch attached! :)

AttachmentSize
captcha-417706-01.patch 1.32 KB
 
 

Drupal is a registered trademark of Dries Buytaert.