I have a process that selects a set of nodes and then applies HTML Purifier to the text content of each node.
Everything works fine for a small number of nodes. As soon as I increase the number of nodes handled in a batch, HTML Purifier exhausts the PHP memory limit with the following message:
Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 74 bytes) in /var/www/mp/sites/all/modules/contrib/htmlpurifier/HTMLPurifier_DefinitionCache_Drupal.php on line 86
I suspect that something in the purifier code is not freeing the memory used for content that has already been processed and therefore no longer needs to be kept in memory.
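One way to check this suspicion is to record how much memory usage grows after each node is processed. The following is only a minimal sketch: `measure_memory_growth()` and its `$process` callback are hypothetical names, not part of HTML Purifier or the Drupal module, and the `callable` type hint assumes PHP 5.4 or later. The stand-in callback uses `strip_tags()` and would be replaced by the real purifier call.

```php
<?php
/**
 * Runs $process on each item and records how much memory usage has grown.
 *
 * @param array    $items   Text bodies to process (stand-ins for node content).
 * @param callable $process Hypothetical callback wrapping the purifier call.
 * @return int[] Memory growth in bytes after each item, keyed like $items.
 */
function measure_memory_growth(array $items, callable $process) {
    $growth = array();
    $baseline = memory_get_usage();
    foreach ($items as $key => $text) {
        $process($text);
        // Growth relative to the start of the batch, in bytes. The $growth
        // array itself adds a small amount to this figure.
        $growth[$key] = memory_get_usage() - $baseline;
    }
    return $growth;
}

// Example with a stand-in callback; replace it with the real purifier call.
$growth = measure_memory_growth(
    array('<p>one</p>', '<p>two</p>', '<p>three</p>'),
    function ($text) {
        return strip_tags($text);
    }
);
foreach ($growth as $key => $bytes) {
    echo "item $key: $bytes bytes above baseline\n";
}
```

If the reported figure keeps climbing steadily across many nodes, that would support the idea that something is being retained between calls.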
Comments
Comment #1
ezyang commented
Are you using multiple configurations? HTML Purifier is generally good about freeing up its hefty memory consumption when it's done, but it deliberately caches certain data structures in memory to speed things up, and configs are one of them.
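Since configurations are cached in memory, one thing a caller could try is building each purifier once per configuration and reusing it for every node, so the number of cached configurations stays fixed. The sketch below assumes this is not already being done. `shared_purifier()` and the `$factory` callback are hypothetical names for illustration, and the stand-in factory returns a plain object where real code would build the purifier for that configuration.

```php
<?php
/**
 * Returns one shared object per configuration key, building it on first use.
 *
 * @param string   $key     Name of the configuration (hypothetical).
 * @param callable $factory Hypothetical callback that builds the purifier
 *                          for the given key; called at most once per key.
 * @return mixed Whatever $factory returned for this key.
 */
function shared_purifier($key, callable $factory) {
    static $instances = array();
    if (!isset($instances[$key])) {
        $instances[$key] = $factory($key);
    }
    return $instances[$key];
}

// Stand-in factory that counts how often it is invoked.
$calls = 0;
$factory = function ($key) use (&$calls) {
    $calls++;
    $object = new stdClass();
    $object->configName = $key;
    return $object;
};

$a = shared_purifier('config_8', $factory);
$b = shared_purifier('config_8', $factory); // Reuses the first instance.
$c = shared_purifier('config_9', $factory); // Second configuration.
echo "factory calls: $calls\n";
```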
Comment #2
pierat commented
Yes, we are using two configurations that are almost identical.
Here is the one in use when the memory problem occurs:
function htmlpurifier_config_8($config) {
  $config->set('HTML.SafeEmbed', TRUE);
  $config->set('HTML.SafeObject', TRUE);
  $config->set('HTML.TidyLevel', 'heavy');
  $config->set('Core.ConvertDocumentToFragment', TRUE);
  $config->set('HTML.Doctype', 'XHTML 1.0 Transitional');
  $config->set('URI.DisableExternalResources', FALSE);
  $config->set('CSS.Proprietary', TRUE);
}
Comment #3
pierat commented
Comment #4
ezyang commented
What happens if you comment out the following line, which you should find in htmlpurifier.module:
$config->set('Cache.DefinitionImpl', 'Drupal');
so that it reads:
// $config->set('Cache.DefinitionImpl', 'Drupal');
Also, how many nodes are you trying to process at a time?
Comment #5
Yoran commented
Same issue here, and unfortunately disabling the cache doesn't change anything.
Our HTML Purifier processes 200 nodes on each run (Apache Solr cron).
The problem seems to appear only on PHP 5.2.x, not on 5.3.
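As a possible workaround while the cause is unknown, a batch like this could be split into smaller chunks that stop early once memory usage passes a threshold, leaving the rest for the next cron run. This is a sketch under stated assumptions, not a fix for a leak inside the library: `process_in_chunks()`, `$process`, and the threshold are hypothetical, `gc_collect_cycles()` requires PHP 5.3 or later, and the `callable` type hint requires PHP 5.4 or later, so it would not run as written on 5.2.x.

```php
<?php
/**
 * Processes items in fixed-size chunks and stops early when memory runs low.
 *
 * @param array    $items       Items to process (stand-ins for node bodies).
 * @param callable $process     Hypothetical callback wrapping the purifier call.
 * @param int      $chunkSize   Items per chunk; must be at least 1.
 * @param int      $memoryLimit Stop once usage exceeds this many bytes.
 * @return array Processed results, keyed like $items; may be shorter than
 *               $items if the memory threshold was reached.
 */
function process_in_chunks(array $items, callable $process, $chunkSize, $memoryLimit) {
    $results = array();
    foreach (array_chunk($items, $chunkSize, true) as $chunk) {
        if (memory_get_usage() > $memoryLimit) {
            // Leave the remaining items for the next run.
            break;
        }
        foreach ($chunk as $key => $item) {
            $results[$key] = $process($item);
        }
        // Reclaim cyclic garbage between chunks (available since PHP 5.3).
        gc_collect_cycles();
    }
    return $results;
}

// Example with a stand-in callback and a 200 MB threshold.
$bodies = array('a', 'b', 'c', 'd', 'e');
$done = process_in_chunks($bodies, 'strtoupper', 2, 200 * 1024 * 1024);
echo count($done) . " of " . count($bodies) . " items processed\n";
```

The threshold is checked only between chunks, so the chunk size should be small enough that one chunk cannot exhaust the remaining memory on its own.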
Comment #6
ezyang commented
Closing on inactivity. Please reopen if you get more information.
Comment #7
filiptc commented
Exact same thing here with the latest (stable) version. Nothing out of the ordinary: I cleared the HTML Purifier cache and carried on with maintenance tasks. 20 minutes later, out of the blue, an error similar to #1 appeared on a WSoD:
Besides, in dblog I see these two errors repeated a dozen times (might be unrelated):
Cheers!
Comment #8
filiptc commented
OK, this keeps happening every once in a while. The dblog entries I mentioned were indeed unrelated. Reopening.
EDIT: The PHP version was 5.2.9. Upgraded to 5.3.4; will report back with results.
Comment #9
filiptc commented
Still happening on 5.3.4. It happens frequently when opening a chat node from the chatroom module.
Comment #10
heddn
Seeing as this issue hasn't gotten much attention since last summer, I'm going ahead and marking it as completed for now. It can always be reopened if additional reports arise.