I was doing some profiling on a page containing a single webform node, and noticed that there was an API call to Mollom — even though the webform was not one of the Mollom-enabled forms on the site. The API call originated from _mollom_status().

I think we could safely move _mollom_status() to a later point in mollom_form_alter() to avoid running into whatever edge case might trigger an API call on pages where there's no need to validate the keys.

I'm attaching a patch against 7.x-2.x that moves the call to _mollom_status() to just after verifying that the form is Mollom-enabled, before attaching the Mollom widget to the form.

Files: 
CommentFileSizeAuthor
#7 0001-1974148-by-benclark-sun-Fixed-API-keys-are-repetitiv.patch2.13 KBsun
PASSED: [[SimpleTest]]: [MySQL] 5,742 pass(es).
[ View ]
#6 mollom.form-alter.6.patch1.59 KBsun
PASSED: [[SimpleTest]]: [MySQL] 6,826 pass(es).
[ View ]
#4 mollom.status-lock.4.patch4.69 KBsun
PASSED: [[SimpleTest]]: [MySQL] 6,800 pass(es).
[ View ]
#3 mollom.status-lock.3.patch3.35 KBsun
PASSED: [[SimpleTest]]: [MySQL] 6,801 pass(es).
[ View ]
mollom-moving-_mollom_status.patch824 bytesbenclark
PASSED: [[SimpleTest]]: [MySQL] 6,710 pass(es).
[ View ]

Comments

Thanks for the patch! :)

Technically, I don't see an issue with this patch.

However: Calling _mollom_status() without arguments means that the latest known status is retrieved from a variable (if a latest known status exists). Thus, in the vast majority of cases, this means a simple and fast call to variable_get() only.

Or did you encounter performance issues with a primed status, too?

I totally agree, that's definitely how the function is supposed to go. But I think there might be a bug (or intended feature?) that resets the mollom_status variable too frequently?

I was running a load test (~ hundreds of hits per minute) on this webform node (where there are no sidebars or anything where Mollom was needed) and I saw a spike in API calls to Mollom. Using xhprof, I traced it back to the call in form_alter().

Truthfully, I'm not sure how frequently this edge case comes up — or if it's just a bug in my own site build — but after looking at the code I figured it wouldn't hurt to move the status check anyway, just so it's only checked when needed.

StatusFileSize
new3.35 KB
PASSED: [[SimpleTest]]: [MySQL] 6,801 pass(es).
[ View ]

Hm. An aggressive load test could potentially run into an entirely different issue: Race-conditions.

In other words, the first request is still checking + updating the status, while subsequent ones are hammering in. As a result, the subsequent requests do not see the updated status yet. Consequently, each of the subsequent triggers a new status update.

Can you test again with attached patch? :)

StatusFileSize
new4.69 KB
PASSED: [[SimpleTest]]: [MySQL] 6,800 pass(es).
[ View ]

Sorry, I forgot to move the variable_set() into the lock code path — which is essential, since updating the variable also takes time.

OK, I dug a bit deeper into this issue, and I discovered the cause is that "testing mode" was enabled.

MollomDrupalTest::__construct/mollom() is what kicks off the API request, and it does this every single time that _mollom_status() is called when testing mode is on, regardless of the value of the mollom_status variable. I'm not sure, but it looks like that might be intentional.

So, it's not a race condition — although I like your patch to lock the API call and variable_set(), in general.

Although I can fix it by turning off testing mode, I think this brings me back to my original patch. When test mode is enabled, we still don't need to validate the keys on page requests where there are no Mollom-enabled forms. It'll improve performance for sites still in development or sites testing Mollom, and it'll take load off of dev.mollom.com.

Thoughts? I'm not entirely clear on the mechanics behind testing mode, so if I've misunderstood, let me know. :)

StatusFileSize
new1.59 KB
PASSED: [[SimpleTest]]: [MySQL] 6,826 pass(es).
[ View ]

I'm terribly sorry for the late reply!

Over in #2058645: Client-side networking failure is never reported as error, the _mollom_status() implementation has been rewritten to be based on a cache item, instead of a variable. (The variable misbehaved on sites that somehow locked their entire variable values.)

Regardless of that, I think I see what you're saying now. In case the testing mode is enabled and in case the Testing API is down, then the call into _mollom_status() in mollom_form_alter() did trigger an API request for every single form on every page. Each time with a timeout of 3 seconds + retry (= 6 seconds in total), but if you have multiple forms on all pages (user login, search, etc.pp.), then those can easily add up.

#2058645 improved the situation already by caching the (negative) result of the API key verification for some minutes.

Your original patch additionally improves the situation for that (edge-)case in that _mollom_status() is only called when (1) the form is actually protected and (2) the current user is not privileged to bypass the Mollom protection.

That makes sense.

Because it's an edge-case, we should make sure to document the reasoning for posterity (which I've done in attached patch).

What do you think?

Title:Move _mollom_status() later in mollom_form_alter() to avoid API call on pages with no Mollom-enabled formsAPI keys are repetitively checked for all forms on all pages in case they cannot be verified
Version:7.x-2.x-dev» 6.x-2.x-dev
StatusFileSize
new2.13 KB
PASSED: [[SimpleTest]]: [MySQL] 5,742 pass(es).
[ View ]

Backported to D6.

Version:6.x-2.x-dev» 7.x-2.x-dev
Status:Needs review» Fixed

Thanks for reporting, reviewing, and testing! Committed to all branches.

A new development snapshot will be available within the next 12 hours. This improvement will be available in the next official release.

Sorry to follow-up on this thread so late, but I wanted to say thanks for including my patch! I agree with your assessment and adding the inline comments to document why it was moved was a good call, too. Looking forward to seeing this in the next release.

Status:Fixed» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.