I believe that there is a memory leak due to a circular reference to $node in content.module: v 1.301.2.112 2009/08/03 20:40:26.

First, a bit of background. I discovered the leak with a long-running bootstrapped script that populates Solr. Ideally this script would run for months and load and process many nodes without problems. However, the CLI process was running out of memory even though it had 1 GB allocated. My very rough calculations showed that each processed node added about 100 KB of memory. The Apache Solr Integration code is careful to make sure that loaded nodes are not cached, so that couldn't be the problem. After some tracking down, I believe I have isolated the problem to the content_field() function.

The problematic $op is for 'view'. The calls went along the lines of:
- node_build_content()
- node_invoke_nodeapi()
- content_nodeapi()
- content_view()
- _content_field_invoke_default()
- content_field()

The problem occurs when the #node key is set in $element in content_field():

$element = array(
  '#type' => 'content_field',
  '#title' => check_plain(t($field['widget']['label'])),
  '#field_name' => $field['field_name'],
  '#access' => $formatter_name != 'hidden' && content_access('view', $field, NULL, $node),
  '#label_display' => $label_display,
  '#node' => $node, // this causes a memory leak!
  '#teaser' => $teaser,
  '#page' => $page,
  '#context' => $context,
  '#single' => $single,
  'items' => array(),
);

$node is added to $element, which is added to $wrapper, which is added to $addition. That is passed up to _content_field_invoke_default(), where it is merged into $return, and then up to content_view(), where it is merged again and finally added to $node->content.

The outcome is that $node now holds a reference to itself. I believe that this is the cause of the leaking problem.
http://derickrethans.nl/circular_references.php

If I comment out the "'#node' => $node," line it works OK with no leaks.
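
To illustrate the suspected mechanism outside Drupal, here is a minimal standalone sketch. It is not CCK code: build_example_node() and measure_growth() are hypothetical helpers, and the 100 KB body is just a stand-in for node data. It assumes that the script runs without PHP's cycle collector (which only exists from PHP 5.3 onwards), so the sketch disables it where available to mimic that situation.

```php
<?php
// Assumption: mimic a PHP runtime without cycle collection.
// gc_disable() only exists in PHP 5.3+, hence the guard.
if (function_exists('gc_disable')) {
  gc_disable();
}

// Hypothetical stand-in for a node whose content array may point back at it.
function build_example_node($with_node_key) {
  $node = new stdClass();
  $node->body = str_repeat('x', 100000); // roughly 100 KB per node
  $element = array('#type' => 'content_field');
  if ($with_node_key) {
    $element['#node'] = $node; // the node now references itself via content
  }
  $node->content = array('field_example' => $element);
  // $node goes out of scope here; with the cycle it cannot be freed
  // by reference counting alone.
}

// Hypothetical helper: memory growth after building many nodes.
function measure_growth($with_node_key, $iterations = 200) {
  $before = memory_get_usage();
  for ($i = 0; $i < $iterations; $i++) {
    build_example_node($with_node_key);
  }
  return memory_get_usage() - $before;
}

$growth_with_key = measure_growth(TRUE);
$growth_without_key = measure_growth(FALSE);
```

Under these assumptions, $growth_with_key should grow with the number of iterations while $growth_without_key stays close to zero, which matches what I see when the '#node' line is commented out.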

This problem is currently stopping me from running my population script. The workaround is to stop the script every 1000 or so nodes. This is suboptimal because starting the script puts substantial CPU load on the DB and wastes a lot of time.

cheers

Murray

Comments

murrayw’s picture

One workaround is to unset($node->content) when you are finished with the node in question. This works for me - memory is stable after thousands of nodes.
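
A rough sketch of how that workaround might look in a processing loop. The functions example_build_node() and example_index_node() are hypothetical stand-ins for the real load/build and indexing steps, not Drupal or Apache Solr Integration APIs, and the cycle collector is disabled (where it exists) on the assumption that the script cannot rely on it.

```php
<?php
// Assumption: no cycle collector available, as on PHP before 5.3.
if (function_exists('gc_disable')) {
  gc_disable();
}

// Hypothetical stand-in for loading a node and building its content.
function example_build_node($nid) {
  $node = new stdClass();
  $node->nid = $nid;
  $node->body = str_repeat('x', 100000);
  // Mimics the '#node' key that makes the node reference itself.
  $node->content = array('field_example' => array('#node' => $node));
  return $node;
}

// Hypothetical stand-in for the indexing step.
function example_index_node($node) {
  return strlen($node->body);
}

$before = memory_get_usage();
$indexed_bytes = 0;
for ($nid = 1; $nid <= 200; $nid++) {
  $node = example_build_node($nid);
  $indexed_bytes += example_index_node($node);
  // Break the cycle once the node is no longer needed.
  unset($node->content);
  unset($node);
}
$growth = memory_get_usage() - $before;
```

The important part is that unset($node->content) runs before the last reference to the node is dropped; after that, plain reference counting can free the node on every iteration.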

Scott Reynolds’s picture