I have a process that selects a batch of nodes and then applies HTML Purifier to the text content of each node.
Everything works fine for a small number of nodes. As soon as I increase the number of nodes handled per batch, HTML Purifier exhausts PHP's memory with the following message:

Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 74 bytes) in /var/www/mp/sites/all/modules/contrib/htmlpurifier/HTMLPurifier_DefinitionCache_Drupal.php on line 86

I suspect that something in the purifier code is not freeing the memory used for nodes that have already been processed and therefore no longer need to be kept in memory.
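One way to check this suspicion is to record how much memory each item adds while the batch runs. The following is a minimal sketch, not code from the module: measure_memory_growth() is a hypothetical helper, and strip_tags only stands in for the real purification call.

```php
<?php
// Hypothetical helper: runs $process on every item and records how many
// bytes of memory each call left allocated afterwards.
function measure_memory_growth(array $items, $process) {
  $deltas = array();
  foreach ($items as $key => $item) {
    $before = memory_get_usage();
    call_user_func($process, $item);
    $deltas[$key] = memory_get_usage() - $before;
  }
  return $deltas;
}

// Stand-in data and callback; replace with node bodies and the purifier call.
$nodes = array('<p>one</p>', '<b>two</b>', 'three');
$deltas = measure_memory_growth($nodes, 'strip_tags');

foreach ($deltas as $key => $bytes) {
  echo "item $key: $bytes bytes\n";
}
```

If the per-item figure stays positive across the whole batch instead of settling near zero, memory is being retained between items.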

Comments

ezyang’s picture

Status: Active » Postponed (maintainer needs more info)

Are you using multiple configurations? HTML Purifier is generally good about freeing up its hefty memory consumption when it's done, but it will specifically cache certain data structures in memory to speed things up, and configs are one of them.

pierat’s picture

Yes, we are using two configurations that are almost the same.

Here is the one in use when I hit the memory problem:

function htmlpurifier_config_8($config) {
  $config->set('HTML.SafeEmbed', TRUE);
  $config->set('HTML.SafeObject', TRUE);
  $config->set('HTML.TidyLevel', 'heavy');
  $config->set('Core.ConvertDocumentToFragment', TRUE);
  $config->set('HTML.Doctype', 'XHTML 1.0 Transitional');
  $config->set('URI.DisableExternalResources', FALSE);
  $config->set('CSS.Proprietary', TRUE);
}

pierat’s picture

Status: Postponed (maintainer needs more info) » Active
ezyang’s picture

Status: Active » Postponed (maintainer needs more info)

What happens if you comment out the following line:

$config->set('Cache.DefinitionImpl', 'Drupal');

so that it reads:

// $config->set('Cache.DefinitionImpl', 'Drupal');

as you might see in htmlpurifier.module. Also, how many nodes are you trying to process at a time?
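If editing htmlpurifier.module directly is not convenient, the same experiment can probably be run from a custom configuration function. The sketch below assumes the module calls the custom function (htmlpurifier_config_8, as in the comment above) after applying its own defaults, which should be checked against the installed module version. Setting Cache.DefinitionImpl to null is HTML Purifier's documented way of turning the definition cache off.

```php
<?php
function htmlpurifier_config_8($config) {
  // ... existing settings stay as they are ...

  // Assumption: this override runs after the module has set
  // Cache.DefinitionImpl to 'Drupal', so NULL wins and no cache is used.
  $config->set('Cache.DefinitionImpl', NULL);
}
```

Without the definition cache, purification is noticeably slower, so this is only meant as a diagnostic step.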

Yoran’s picture

Same issue here, and disabling the cache doesn't change anything, unfortunately.

Our HTML Purifier processes 200 nodes on each run (Apache Solr cron).

It seems that the problem only appears on PHP 5.2.X, and not with 5.3.

ezyang’s picture

Status: Postponed (maintainer needs more info) » Closed (cannot reproduce)

Closing on inactivity. Please reopen if you get more information.

filiptc’s picture

Version: 6.x-2.0 » 6.x-2.4

Exact same thing here with the latest (stable) version. Nothing out of the ordinary. I cleared the HTML Purifier cache and carried on with maintenance tasks. 20 minutes later, out of the blue, an error similar to the one in #1 appeared as a WSoD:

  • Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 82 bytes) in /modules/htmlpurifier/HTMLPurifier_DefinitionCache_Drupal.php on line 86

In addition, in dblog I see these two errors repeated a dozen times (might be unrelated):

  • Cannot enable Linkify injector because a.href is not allowed in /modules/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 114.
  • Cannot enable AutoParagraph injector because p is not allowed in /modules/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 114.

Cheers!

filiptc’s picture

Status: Closed (cannot reproduce) » Active

Ok, this keeps happening every once in a while. The dblog entries I mentioned were indeed unrelated. Reopening.

EDIT: The PHP version was 5.2.9. Upgraded to 5.3.4, will report back with results.

filiptc’s picture

Still happening on 5.3.4. Happens frequently when opening a chat node from the chatroom module.

heddn’s picture

Status: Active » Closed (cannot reproduce)

Seeing as this issue hasn't gotten much attention since last summer, going ahead and marking completed for now. It can always be reopened if additional reports arise.