While profiling a Drupal 7 site, I noticed that htmlpurifier has a setting for double caching of the filter, and that this is set to FALSE by default. However the check_markup() cache caches the output of the entire text format, which may have any number of filters in use - and which may benefit from caching.

Additionally, text module now caches check_markup() output in the field cache as part of text_field_load() - this means there is no additional cache entry or cache_get()/set() when using text fields (which there used to be with check_markup() in D6) - the only double caching in this case would be a slightly larger field cache entry.

Another side effect of this, is that disabling the filter cache means that hook_filter_info() is called on every request, and htmlpurifier_filter_info() is currently causing a variable_set() on every request, will open an issue for that as soon as I've finished typing up this one.

In light of this, I'd suggest removing the feature altogether, attached a patch for that.

Also, while there is a setting for this, core currently doesn't update filter cache status when hook_filter_info() changes, see bug report at #993230: Text format cacheability is not updated. I implemented a workaround for this at #836000: Media filter should not prevent the core text format cache from being used but not sure what to do about the update numbering in htmlpurifer.install so left that for now.


Status:Needs review» Reviewed & tested by the community

Tested the patch, found no issues. The patch looks good.

Status:Reviewed & tested by the community» Needs work

I've applied the patch, but I'm leaving this bug open pending the other changes necessary that catch mentioned in his bug report. I must admit, I'm a little confused by the new caching structure of Drupal 7; are there any docs about it somewhere?

Version:7.x-1.x-dev» 7.x-2.x-dev

Thanks for the commit! No docs, but the original patch where the text module optimization went in is here - #369011: Performance: do check_markup() in text_field_load(). What it does is cache the results of check_markup() in the field cache, and in doing so doesn't bother using the check_markup() cache at all.

Ok. As for update numbering, the convention is ostensibly XYZZ where X.Y is the Drupal version and ZZ is just a bunch of increasing numbers. So you could probably place it in htmlpurifier_update_7000 and that'd be fine. It'd be lovely if you could port the workaround.

That's right, but I'm pretty sure you'll need the actual Drupal 6-7 upgrade code for htmlpurifier to run first #804240: 7.x Port.

Status:Needs work» Needs review
new1021 bytes

Thought about this some more, here's the update.

This will work for Drupal 7 sites doing head to head updates.

When the Drupal 6-7 update is ready, it should probably be renumbered and put at the end.

re-rolling with updated update number (7002).


Status:Needs review» Fixed

Status:Fixed» Closed (fixed)
Issue tags:-D7 stable release blocker

Automatically closed -- issue fixed for 2 weeks with no activity.