On a site with around 100 custom fields, memory usage for _field_info_collate_fields() is around 3 MB. It also causes a memory leak when doing hundreds of node inserts.

Attached patch does two things:

1. Caches the $info array, rather than the $definitions array.
2. Trims the size of the $info array by having $info['fields'] return an array of ids, which can then be used to look up the field data from the $info['field_ids'] array.

However, this doesn't reduce the straight memory usage from calling the function; I'm wondering if it'll help with the memory leak at all, though.

Attached patch reduces memory usage by nearly half, by making $info['fields'] an array of ids rather than duplicating the entire field arrays in both $info['fields'] and $info['field_ids'].
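As a rough sketch (array contents here are invented for illustration, not taken from the patch), the reshaped $info structure looks like this:

```php
<?php
// Hypothetical data illustrating the reshaped structure: full definitions
// live only in $info['field_ids'] (keyed by numeric id), while
// $info['fields'] is just a field name => id map for non-deleted fields.
$info = array(
  'field_ids' => array(
    1 => array('id' => 1, 'field_name' => 'body', 'type' => 'text', 'deleted' => 0),
    2 => array('id' => 2, 'field_name' => 'field_image', 'type' => 'image', 'deleted' => 0),
  ),
  'fields' => array(
    'body' => 1,
    'field_image' => 2,
  ),
);

// Consumers look the full definition up through the id map instead of
// reading a duplicated copy of it.
$field = $info['field_ids'][$info['fields']['body']];
echo $field['type'] . "\n"; // text
```

The lookup costs one extra array access per field, in exchange for serializing and caching each definition only once.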

Comments

catch’s picture

Status: Active » Needs review

Pretty sure field tests will fail, if not lots of other things, but CNR for the bot anyway.

yched’s picture

I approve this. I think I considered something like that at some point, but did not really pursue it for lack of evidence of an actual memory issue.

Note that the patch sneaks in a difference in what gets cached:
With the patch, it becomes the *prepared* field and instance definitions, post _field_info_prepare_[field|instance|widget|display]().
In HEAD, it's only the collection of raw field and instance definitions; the 'prepare' functions run once per request on the raw definitions fetched from cache.

Might have nasty side effects; if possible, I'd rather stick to the current approach of only caching raw definitions.
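The difference between the two strategies can be sketched in plain PHP. This is illustrative only: the cache backend and the prepare step are stubbed here so the snippet runs standalone, and none of these function names are the actual field.info.inc code.

```php
<?php
// Stub cache backend standing in for cache_get()/cache_set().
$backend = array();

function cache_get_stub($key) {
  global $backend;
  return isset($backend[$key]) ? $backend[$key] : FALSE;
}
function cache_set_stub($key, $data) {
  global $backend;
  $backend[$key] = $data;
}
function load_raw_definitions() {
  return array('body' => array('type' => 'text'));
}
function prepare_definitions(array $definitions) {
  // Stand-in for _field_info_prepare_field() etc.: fill in defaults that
  // depend on the current execution context.
  foreach ($definitions as $name => $definition) {
    $definitions[$name] += array('settings' => array());
  }
  return $definitions;
}

// HEAD approach: persistently cache only the raw definitions; the prepare
// step re-runs on every request, so context changes are always picked up.
function collate_head() {
  $definitions = cache_get_stub('field_info');
  if ($definitions === FALSE) {
    $definitions = load_raw_definitions();
    cache_set_stub('field_info', $definitions);
  }
  return prepare_definitions($definitions);
}

// Patch approach: cache the fully prepared array, skipping the prepare loop
// on cache hits - cheaper per request, but stale if the execution context
// changes without a cache clear.
function collate_patch() {
  $info = cache_get_stub('field_info_prepared');
  if ($info === FALSE) {
    $info = prepare_definitions(load_raw_definitions());
    cache_set_stub('field_info_prepared', $info);
  }
  return $info;
}
```

Both return the same prepared structure on the first call; the trade-off only shows up on cache hits after the context has changed.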

Status: Needs review » Needs work

The last submitted patch, field_info.patch, failed testing.

catch’s picture

It doesn't really sneak it in, that was point #1 ;) Although unless there's a hook running on those definitions, I don't see any reason to do all that processing and looping on potentially hundreds of fields unless we absolutely have to. The two fails suggest it might well be fine to cache the prepared array.

However (and I thought I hadn't even posted this) the patch doesn't actually reduce memory usage by half, it looks about the same in my testing. I want to check with devel generate and memory profiling before/after though to see if it affects that situation, which is where the bug was found - will do that a bit later today. Although even if it doesn't actually reduce memory, I don't see any reason to duplicate all that data given it's easy enough to retrieve it in the consumer functions.

catch’s picture

Attachments: 1.15 MB, 1.55 MB

edit: wait a minute, better data coming soon.

catch’s picture

Title: _field_info_collate_fields() memory usage » _field_info_collate_fields() resource usage
Status: Needs work » Needs review
Issue tags: +Performance
Attachments: 96.95 KB, 91.29 KB, 3.87 KB

When you have a lot of fields and instances, _field_info_collate_fields() really starts to add up; I have it at anywhere between 15% and 30% of page execution time on node/n, depending on the page.

Original patch didn't apply so I've attached just one with the caching for now, and before/after webgrind screenshots - this isn't on a clean install, but it's a handy one which has a lot of fields and instances.

CNR just for the bot.

yched’s picture

Odd, I can't seem to apply the patch. Will come back to this later today.

catch’s picture

Attachment: 4.04 KB

Patch was made with bzr, here's one from cvs in case that helps.

bjaspan’s picture

Anyone remember why the current implementation caches the definitions but not the full structure?

re #7, the first screenshot shows 250 calls to collate_fields, the second screenshot shows an obscured but apparently 2-digit number of calls. So I'm sure this patch will help, but I don't think your screenshots prove it. :)

yched’s picture

Anyone remember why the current implementation caches the definitions but not the full structure?

The goal of the _field_info_prepare_*() functions is to prepare the fields, instances, widgets, and formatters for the current execution context: available widget and formatter types, default settings (possibly altered by the various hook_field_*_info_alter() implementations), and so on.
So the definitions are persistently cached, but the 'prepared' definitions are only statically cached, to avoid tricky-to-predict stale cache conditions.
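The persistent-vs-static split yched describes rests on core's drupal_static() idiom. A simplified, standalone sketch (drupal_static() is reimplemented here so the snippet runs by itself; the raw definitions and prepare step are placeholders):

```php
<?php
// Simplified re-implementation of core's drupal_static(): one shared static
// store, returned by reference, which only lives for the current request.
function &drupal_static_stub($name, $default_value = NULL) {
  static $data = array();
  if (!array_key_exists($name, $data)) {
    $data[$name] = $default_value;
  }
  return $data[$name];
}

// Per-request static cache of the *prepared* info. The raw definitions would
// come from the persistent cache; placeholder data stands in for them here.
function collate_fields_sketch() {
  $info = &drupal_static_stub(__FUNCTION__);
  if (!isset($info)) {
    $raw = array('body' => array('type' => 'text'));  // notionally: cache_get()
    foreach ($raw as $name => $definition) {
      $raw[$name] += array('settings' => array());    // notionally: prepare step
    }
    $info = $raw;
  }
  return $info;
}
```

Because $info is a reference into the static store, the prepare work runs at most once per request; only the raw definitions ever cross request boundaries.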

catch’s picture

@barry: yes, I noticed that, but it's completely consistent - 250 calls with the patch applied, 16 without; I haven't figured out why yet, though. Also, 200+ calls which just return from the static cache frankly isn't a lot.

bjaspan’s picture

re #11: Okay, so isn't this patch changing the code to cache all the prepared definitions, with the tricky-to-predict stale cache conditions - and isn't that a problem? Or are we confident that we clear the cache in all the right places now?

yched’s picture

No, right now I cannot be confident that it's not an issue. That's why I asked in #2 if the changes in the OP (memory resources / CPU resources) could be decoupled.

It'd be interesting to look more closely into what takes so long in field_info_prepare_*().
Having finally set up a Linux-based dev environment, I should now be able to dig into profiling data a little more easily.

catch’s picture

Attachment: 92.23 KB

I can't do anything with "potential nasty side effects" that are "tricky to predict" - that basically means issues that we don't know exist, but might possibly crop up at some point in the future. If you have a more concrete example of what those are, perhaps something that appeared before, then great, if not I'm prepared to take that risk for 20% (or even 5%) of page execution time. It's a small patch, easy to roll back if such a race condition gets found without an easy fix otherwise.

Here's some better webgrind data (if you can call posting a screenshot either 'better' or 'data', yay for yched's new dev environment). _field_info_prepare_instance() is the worst offender.

Now, you will not see this on a clean HEAD install with an empty db, because that has about five or fewer field instances, not 295. If you think 295 field instances is crazy, you may well be right, but if 295 leads to 20% of page execution time, then 100 (10 bundles with 10 instances each - quite reasonable, and only going to go up as more things become both fieldable and fields) may well be 6-7%, and 6-7% on just about every page request is still a lot.

We could try to get better data - i.e. have a script generate 25, 50, 100, 200, 500 instances, then profile and benchmark before/after with each. If it is indeed a linear slowdown (and that seems likely), then I'd be tempted to mark this critical.

catch’s picture

And the reason I'm focusing on this rather than the initial memory problem is that I'm not convinced my original patch fixed the memory issue, and because I care about CPU more.

catch’s picture

Priority: Normal » Critical
Attachments: 57.29 KB, 79.35 KB, 80.95 KB, 81.43 KB, 96.94 KB

OK now profiling on a clean HEAD install with xhprof.

I used the following to create fields and instances on the node/article bundle, then profiled user/1 at 50, 100, 200 and 400 fields + instances. There is overhead from displaying a node with lots of fields attached, even if they're not rendered, so profiling user/1 allows us to better isolate the cost of _field_info_collate_fields() by itself against a normal request (although I think there's still some other overhead, but the profiling results are sane enough to not worry too much). Numbers are a couple off because I didn't bother to delete core fields/instances before running this, but we're talking at most 2%.

Screenshots of the _field_info_collate_fields() detail for 50/100/200/400 fields+instances, then 400 with the patch applied all attached.

Once you get to 50 fields + instances (which is very easy to do), it's already taking up 20% of both page execution time and memory, which is enough for me to bump this to critical - Drupal.org has 40 'fields' just on user profiles.

At 400 fields+instances, which I hope never to see in the wild but wouldn't be impossible on some kind of ecommerce or library database, we're looking at 47% inclusive CPU and memory usage, moving the caching reduces the CPU to 10% and memory to 20% (probably more of a gain with something like memcache which doesn't need to unserialize the array).

<?php
// $Id$

/**
 * @file
 * Creates dummy fields and instances.
 */

/**
 * Bulk generates fields and instances for a bundle.
 *
 * @param $quantity
 *   Integer, defaults to 50.
 * @param $entity_type
 *   The entity type, defaults to node.
 * @param $bundle
 *   The bundle, defaults to article.
 */
function fields_generate($quantity = 50, $entity_type = 'node', $bundle = 'article') {
  for ($i = 1; $i <= $quantity; $i++) {
    $field = array(
      'field_name' => 'a' . strtolower(_fields_random_name()),
      'type' => 'text',
    );
    field_create_field($field);
    $instance = array(
      'field_name' => $field['field_name'],
      'label' => $field['field_name'],
      'entity_type' => $entity_type,
      'bundle' => $bundle,
    );
    field_create_instance($instance);
  }
}


/**
 * Generates a random name; adapted from SimpleTest.
 */
function _fields_random_name($length = 8) {
  // Uppercase letters, lowercase letters and digits; the first character is
  // always a lowercase letter so the result works as part of a field name.
  $values = array_merge(range(65, 90), range(97, 122), range(48, 57));
  $max = count($values) - 1;
  $str = chr(mt_rand(97, 122));
  for ($i = 1; $i < $length; $i++) {
    $str .= chr($values[mt_rand(0, $max)]);
  }
  return $str;
}

yched’s picture

Took a closer look at what is done in field_info_prepare_*(), and at the potential result of caching the results:
- fill in default field settings
- fill in default field storage settings
- populate $field['storage']['details'] (runs an alter hook)
- populate $field['bundles'] (list of bundles having an instance of the field)
- fill in default instance settings
- fall back to the default widget if the widget type is unavailable
- fill in default widget settings
- for each view mode:
  - fall back to the default formatter if the formatter type is unavailable
  - fill in default formatter settings
  - explicitly set the field to hidden if nothing is specified for the view mode in the $instance definition

Things that can cause the outcome of those steps to change:
- The content of hook_field_info() / hook_field_widget_info() / hook_field_formatter_info() / hook_field_storage_info() ... (and the corresponding alter hooks) changes.
In most cases this happens when modules are enabled / disabled --> OK, we already clear the cached data in hook_module_enable|disable().
However: there's an edge case where the behavior of hook_field_*_info() / hook_field_*_info_alter() is dynamic (depends on some UI setting...). The module must then take care of calling field_info_cache_clear() when a change is triggered.

- A new view mode is added.
In most cases, this happens on module enable --> OK.
However, consider the case of a (much needed, IMO) module providing a UI for user-defined view modes. In the current state of the 'field display' workflow, this will produce warnings when displaying a field in a view mode its $instance doesn't know about, until the cache is cleared.
This could be fixed by adding some checks down the line in the display pipeline, but the whole point of the 'prepare' steps was precisely to avoid those 'just in time' checks here and there. Or we could keep just that part in the non-cached, done-on-each-request part of _field_info_collate_fields().

So all in all, I think the patch should be OK once we decide how to handle this case of new view modes.
I don't really understand how the patch also reduces RAM consumption, though?
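For the dynamic hook_field_*_info() edge case above, the module's responsibility would look something like this. The module, form, and variable names are invented for illustration; field_info_cache_clear() and variable_set() are the real Drupal 7 APIs.

```php
<?php
/**
 * Submit handler for a hypothetical settings form whose value changes what
 * this module's hook_field_formatter_info() returns.
 */
function mymodule_settings_form_submit($form, &$form_state) {
  variable_set('mymodule_fancy_formatters', $form_state['values']['fancy']);
  // The persistently cached, prepared field info would otherwise go stale,
  // since the formatter list it was built from has just changed.
  field_info_cache_clear();
}
```

Without that explicit clear, the cached prepared definitions would keep referring to formatters that no longer exist (or miss ones that now do) until some unrelated cache clear happens.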

catch’s picture

That overview's great, yched.

I think new view modes should prompt a field info cache clear, that seems like a very rare thing to add, presumably it'd already need to alter the entity info cache too?

RAM consumption - I don't really know, it's possible that some variables are built in the un-cached run which don't make their way into the returned data, and that PHP garbage collection isn't dealing with them properly. If someone's able to reproduce the test on PHP 5.3 (which I don't have locally) that'd be an interesting comparison since garbage handling is very different afaik.

It's also possible that xhprof measurement either skews the memory usage just by virtue of measuring it, or is otherwise imperfect.

If I get some time tomorrow, I'll try to re-roll the original patch with the attempt at memory savings, and see if it makes any difference in either case.

catch’s picture

Attachments: 60.62 KB, 104.2 KB, 56.95 KB, 79.88 KB, 4.69 KB

Updated patch which tries to make the array smaller, as with #1. Combined with the caching, it looks like a further improvement, although the numbers are a bit odd. Everything seems a bit more expensive this round of profiling, not really sure why that is, but I've attached xhprof screenshots of the various combinations, including the previous patch, to compare like with like.

Also confirmed a lot of the memory and CPU on cache hits is from unserializing the cache entry, so sites using a pluggable cache.inc should skip a lot of that.

yched’s picture

Yay - the profile data looks very promising.
The 'late cache only' approach results in a much bigger serialized / unserialized array than in current HEAD (because of the duplication of $field definitions in the 'fields' and 'field_ids' entries).
I'm not sure the additional 'small array' change results in an actual overall memory win (the gain in _field_info_collate_fields() == the loss in the additional static in field_info_fields()), but it's in fact needed to keep the size of the serialized data at its current HEAD level.

So, +1 on #20.
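yched's point about the serialized size can be checked with a standalone script (the field count and definition size below are made up, but the ratio is what matters):

```php
<?php
// Fake a 100-field site where each definition has 50 settings.
$definition = array_fill_keys(range(1, 50), 'some setting value');

$duplicated = array('field_ids' => array(), 'fields' => array());
$id_map     = array('field_ids' => array(), 'fields' => array());
for ($id = 1; $id <= 100; $id++) {
  $name = 'field_' . $id;
  // 'Late cache only': the full definition is stored twice per field.
  $duplicated['field_ids'][$id] = $definition;
  $duplicated['fields'][$name]  = $definition;
  // 'Small array': 'fields' only maps names to ids.
  $id_map['field_ids'][$id] = $definition;
  $id_map['fields'][$name]  = $id;
}

$dup_size = strlen(serialize($duplicated));
$map_size = strlen(serialize($id_map));
printf("duplicated: %d bytes, id map: %d bytes\n", $dup_size, $map_size);
```

The duplicated layout serializes to roughly twice the size, which is the cost every cache hit has to unserialize.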

The only thing that worries me is why the profile data show 'small_array_no_cache' as much slower than HEAD.

+++ modules/field/field.info.inc	23 Jun 2010 17:30:39 -0000
@@ -524,8 +524,14 @@ function field_info_bundles($entity_type
 function field_info_fields() {
+  $fields = &drupal_static(__FUNCTION__);
   $info = _field_info_collate_fields();
-  return $info['fields'];
+  foreach ($info['field_ids'] as $key => $field) {
+    if (!$field['deleted']) {
+      $fields[$field['field_name']] = $field;
+    }
+  }
+  return $fields;
 }

Probably an oversight - shouldn't we loop only if the static is NULL?
That possibly explains the 'small_array_no_cache' CPU loss compared to HEAD, but it would be worth double checking.
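What the review asks for is a guard like the following sketch (an assumption of the fix, not the committed code; _field_info_collate_fields() is stubbed with fixed data so the snippet runs standalone):

```php
<?php
// Stub standing in for _field_info_collate_fields(), returning one live and
// one deleted field.
function _field_info_collate_fields_stub() {
  return array(
    'field_ids' => array(
      1 => array('field_name' => 'body', 'deleted' => 0),
      2 => array('field_name' => 'old_field', 'deleted' => 1),
    ),
  );
}

function field_info_fields_sketch() {
  static $fields;
  // Build the per-request list only when the static is still NULL, so repeat
  // calls within a request return immediately instead of re-looping.
  if (!isset($fields)) {
    $fields = array();
    $info = _field_info_collate_fields_stub();
    foreach ($info['field_ids'] as $field) {
      if (!$field['deleted']) {
        $fields[$field['field_name']] = $field;
      }
    }
  }
  return $fields;
}
```

Without the isset() guard, every call re-runs the foreach and, worse, keeps appending into the same static array.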

+++ modules/field/field.info.inc	23 Jun 2010 17:30:39 -0000
@@ -186,47 +186,47 @@ function _field_info_collate_fields($res
+      // Populate 'fields' only with non-deleted fields.

The comment is no longer accurate - $info['fields'] holds ids only.

Actually, given how the patch uses the 'fields' entry, we probably need to switch names between the 'fields' and 'field_ids' entries.
Current patch :
'field_ids' = list of $field arrays, keyed by id
'fields' = list of field ids, keyed by field name (mapping between field names and ids for non-deleted fields)
Makes more sense the other way around :-).

+++ modules/field/field.info.inc	23 Jun 2010 17:30:39 -0000
@@ -186,47 +186,47 @@ function _field_info_collate_fields($res
-      $field = $info['fields'][$instance['field_name']];
(...)
+        $id = $info['fields'][$instance['field_name']];
+        $field = $info['field_ids'][$id];

Simpler:
$field = $info['field_ids'][$instance['field_id']];


catch’s picture

Status: Needs review » Needs work

field_info_fields() is only called by file_get_references() - http://api.drupal.org/api/function/field_info_fields/7 - and file_get_references() is a bad example for many different things, with at least two critical bugs against it, so I'm happy to accept a bit of memory bloat in that function to keep it out of the critical path. Or, thinking about it, we could just not static cache it and do the foreach each time; the foreach by itself on a few requests doesn't need caching.

All the other comments fully agreed with, I'll re-roll today, and do one more set of profiling - xdebug on the real site (with 295 instances and memcache) and xhprof with a more reasonable 100 fields on HEAD.

catch’s picture

Status: Needs work » Needs review
Attachment: 4.95 KB

Updated patch per #21; more (hopefully the last round of) profiling soon.

Status: Needs review » Needs work

The last submitted patch, field_info.patch, failed testing.

catch’s picture

Status: Needs work » Needs review
Attachments: 184.45 KB, 106 KB

OK, so with ~295 instances and memcache, here's the before and after - xdebug says 20% of the page request down to 1%. The reason this is so much more dramatic is 'cos we save the unserialize() when using memcache. No memory output from xdebug/webgrind, and I don't have xhprof on there.

Devel output though:

With patch: Page execution time was 515.65 ms. Memory used at: devel_boot()=5.17 MB, devel_shutdown()=45.28 MB, PHP peak=46.5 MB.

Without patch: Page execution time was 601.76 ms. Memory used at: devel_boot()=5.17 MB, devel_shutdown()=46.45 MB, PHP peak=47.5 MB.

So at most a 1 MB saving (not sure how much that's out of, though). Still plenty from one function.

catch’s picture

#23: field_info.patch queued for re-testing.

yched’s picture

Status: Needs review » Reviewed & tested by the community

Sorry for dropping off. Gave it another round, and #23 is good to go.

dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD. :)

Status: Fixed » Closed (fixed)
Issue tags: -Performance

Automatically closed -- issue fixed for 2 weeks with no activity.