Problem/Motivation
_field_info_collate_fields() is the central point of access for field and instance definitions, used by the various field_info_*() API funcs :
field_info_fields(), field_info_field_by_ids(), field_info_field(), field_info_field_by_id(), field_info_instances(), field_info_instance().
The function holds an array of all fields and instances across all entity types and bundles, as a monolithic static cache + persistent cache entry. The list is rebuilt or fetched from persistent cache on any request involving a fieldable entity (that is every request, since users are fieldable).
With free-for-all fieldable entity types, that array can rapidly grow out of hand (measurements are from @catch in the OP):
- 140k on a fresh install
- 1M on installing media + media_gallery contrib modules
- 3M on non unrealistic sites
- 10M+ on some commerce sites and other sites with many bundles/instances
Proposed resolution
Split the static and persistent caches per [entity type, bundle]. On each request, only the data about the fields and instances present in the bundles involved in the request are loaded from cache and kept in memory.
That approach stemmed from @catch's work on DrupalCacheArray, which aimed at solving similar issues with the cached theme and db schema registries. However, the special needs here (notably the correlation between fields and instances) make the DrupalCacheArray class itself unapplicable, and the code has evolved into its own thing for the needs at hand.
The FieldInfo class is the central provider of run time data about fields and instances. It holds the static and peristent caching logic.
As discussed in #16/#18 below, caching strategy is optimized for the following use case (most frequent across core and a couple 'big' contrib suspects - Views, Ctools, OG, Field permissions, Field collection, OG, Date, References, Relation) :
foreach ($instances in a given bundle) {get the $field; do something with $field and $instance;}
.
"Give me all fields/instances" functions (field_info_fields(), field_info_instances()...) are as costly as before, and should be avoided when possible. The patch deals with the existing cases in core.
A spreadsheet is being compiled with a list of Contrib modules which use these costly functions, and which will need to be updated to take advantage of this patch: https://docs.google.com/spreadsheet/ccc?key=0AjNi8SoLpdaQdEFjaV9jY3ZwYkl...
Remaining tasks
Some things to consider in follow-ups:
- API refactoring for fields to remove functions like
field_info_field_by_ids()
. - Reading the FieldInfo object from the DIC (currently blocked by #1708692: Bundle services aren't available in the request that enables the module).
- Removing a call to
field_info_instances()
fromfield_ui_existing_field_options()
.
User interface changes
None.
API changes
No API change strictly speaking.
- The existing field_info_*() functions still take the same arguments and return the same values.
Their relative cost, however, has changed: they all used to load all fields / instances in memory; some functions are now less costly than they were, which changes the recommended use patterns.- field_info_fields(), field_info_field_by_ids(), field_info_instances() are still persistently and statically cached. Their use is discouraged, since they have a important memory & CPU cost.
- API ADDITION: field_info_field_map() returns a lightweight map of the use of fields across entity types and bundles. It should be able to replace most use cases for field_info_fields() / field_info_instances().
- field_info_instances($entity_type, $bundle) reads the field+instance data from the persistent cache entry for the bundle, and is statically cached.
- field_info_field($field_name), field_info_field_by_id($field_id), field_info_instance($entity_type, $field_name, $bundle) are static cache hits (no CPU/memory cost) if done on a field/instance previously loaded in cache by field_info_instances($entity_type, $bundle).The following code was the usual way so far, and is now not recommended (although it will not perform worse after the patch) :
// Collect fields in advance to save function calls to field_info_field(). $fields = field_info_fields(); foreach (field_info_instances($entity_type, $bundle) as $field_name => $instance) { $field = $fields[$field_name]; // Do something with the $field and $instance }
Recommended way after the patch :
// Let field_info_instances($entity_type, $bundle) foreach (field_info_instances($entity_type, $bundle) as $field_name => $instance) { $field = field_info_field($field_name); // Do something with the $field and $instance }
- The previous _field_info_prepare_*() helper functions, that prepare a raw field / instance definition for the current runtime context (fill in default settings values, replace unavailable widgets and formatters with defaults) have been internalized as methods in the class. Their signature needed to change anyway (they received a full $field array, which is too costly to get, while they actually only need a field type). There might be valid edge cases for calling them from 3rd party, so the methods have been kept public, and the D7 patch can provide BC versions.
- API addition : field_read_fields($conditions) which reads fields definitions from the {field_config} table, has been modified to accept $conditions on {field_config_instance} ('entity_type' and 'bundle' : "give me the fields that have an instance in this bundle"). That was needed so that field_info_instances($entity_type, $bundle) can pre-cache the fields involved in the bundle.
While the code changes are fairly important, the patch prioritized 1) D7 backportability, 2) overall soundness and consistency of the new shape of the code, in that order. No changes were made just for the sake of it if they weren't needed to solve the D7 performance issue, but then where changed were needed I tried to push the code logic and shape it in a way that makes sense for a "clean-slate" D8.
Future possible D8 evolutions ($field and instances as classed objects / WSCCI plugins, definitions read from CMI) might shuffle things around, but the FieldInfo class introduced here might well stay in one form or another (and I'd advocate getting rid of the field_info_*() wrappers, but that's for another patch).
Original report by catch
The _field_info_collate_fields() cache grows rapidly as soon as you start installing contrib modules.
With vanilla core, _field_info_collate_fields() uses around 140k of memory.
Do this:
drush en -y media media_gallery
The following extensions will be enabled: styles, file_styles, multiform, media, media_gallery
Now, that cache is over 1mb - just due to a few default fields, instances and/or formatters. I've already seen a Drupal 7 site where it was over 3mb.
This bit of code isn't all that different from the equivalent Drupal 6 CCK code, I found it using over 3.5mb of memory on a client site, see #1009596: _content_type_info() memory usage improvements for the CCK issue (and a patch).
#729186: _field_info_collate_fields() resource usage helped with the CPU issues in D7 but not really the memory, or if it did slightly it was only a side effect.
Now the CCK patch was easy, but there are sufficient changes in the D7 logic that I'm not sure the simple approach from there (basically split/duplicate the cache into the parent functions on the assumption they won't all be called on most requests) will cut it.
Attached a screenshot to show the usage anyway.
Comment | File | Size | Author |
---|---|---|---|
#246 | ab-scenario-1-without-patch.txt | 1.26 KB | David_Rothstein |
#246 | ab-scenario-1-with-patch.txt | 1.26 KB | David_Rothstein |
#246 | ab-scenario-2-without-patch.txt | 1.26 KB | David_Rothstein |
#246 | ab-scenario-2-with-patch.txt | 1.26 KB | David_Rothstein |
#246 | ab-scenario-3-without-patch.txt | 1.26 KB | David_Rothstein |
Comments
Comment #1
yched CreditAttribution: yched commentedOne approach could be splitting static caches about instances info by entity type - possibly bundles as well, but less sure about that.
I don't see how we can minimize the 'fields' info size, though. Fields are potentially used in any entity type.
Comment #2
catchWith fields I'm thinking cache it by bundle too - based on the instances. This would be more cache_get() on pages that show entities with different bundles, but each one would be loading much less information. Also there will not be many pages that load every single bundle on the site - but if they did then memory usage could be even worse.
Whatever we do there's going to be some kind of trade-off.
Comment #3
BerdirSubscribing.
One of the reasons for creating drupal.org/project/properties is that the entity/field configuration is global and can become huge if you are trying to create large amount of bundles and fields.
Comment #4
yched CreditAttribution: yched commentedMarked #1165826: field_info_fields cache workaround fail as duplicate.
Comment #5
andypostComing from #1165826 which still mot marked as duplicate, I think both should be fixed
field_info_fields() also CPU-eater because iterates all fields mostly without reason, suppose we need to store deleted items in separate cache
file_get_file_references() the only usage of field_info_fields() should be rewritten with more lightweight implementation of this
Comment #6
podarok+1 to andypost #5 #1165826: field_info_fields cache workaround fail
Comment #7
podarokhere are xhprof on real project with
all the 15968 created by commerce BTW
this run was recorded in xhprof without field_info_field cache bin
Comment #8
podarokhere are output with cached field_info_fields
the same database and codebase
Comment #9
catchI don't have any code here yet, but thinking about this more:
- if we want to split the cache up, we have a choice between per-entity type and per-bundle. Per bundle is going to be more efficient if you have rarely used bundles (like a 'legacy' content type) that have their own set of fields, however there's also more risk of duplication.
- field_info_field() could ideally share the same cache as the per-bundle/type caches for fields and bundles to start with, then fall through to the full list if it's not found there.
- this might be another place where using ArrayAccess would allow us to do fairly complex caching without having to spill it out all over field module, berdir has a git sandbox for the schema issue, I just posted a new patch to #1011614: Theme registry can grow too large for MySQL max_allowed_packet and memcache default slab size doing something similar.
Comment #10
yched CreditAttribution: yched commentedWow. For now, I'll just say that the current code is indeed in no way designed for 16.000 different instances. I don't think Drupal Commerce handles hundreds of different entity types, so I'd more likely suspect they are creating bundles like mad ?
If that # of instances is "legitimate", then caching per-entity-type will hardly cut it.
Pondering on the items 2 and 3 in @catch's #9 (I don't think I get item 2)
Comment #11
podarok2yched
this is possibly offtopic, but in that project there are 800+ types with 20-40 fields in it
duplication of fields was minimized via using same fields(bundles) for different types(instances)
with those huge fields database (50GB +) I`ve got a 10 seconds trouble - any page that used fields displayed in 10 seconds 8(
If You are interesting in it here are few issues that become real when there are more than 10K bundles(fields)
#1173108: Permissions page performance + #1165826: field_info_fields cache workaround fail + #435694: >1M data writes incompatible with memcached (memcached -I 32M -m32) + #1131980: commerce_product_ui_menu performance fail
2catch
We have a thin place in fieldapi when getting info about all the fields for getting fields for a single entity or a bundle...
We need be more specific in that place - without caching all the data but caching a data at different levels
Storing field_info_fields in database is like a bomb - serialized 10M blob(~500 types) unserialized 100M
FE
xxx_get_all_fields - returns an array or an object with only field names at all (1000 fields will be few KB in a single cache blob)
xxx_get_field_info($field_id) - returns particulary the same as _field_info_collate_fields() but only for a single field (a small cache blob too)
Comment #12
Damien Tournoud CreditAttribution: Damien Tournoud commented@podarok: you are trying to bend Drupal 7 outside of its design specifications. Drupal 7 assumes that *entity types, bundles and fields are configuration*, which means that they can be (and are) loaded into memory at every request. A system that scales to a large number of fields would be designed completely differently, and would likely have worse performance in the simple cases.
Comment #13
podarok2Damien Tournoud
nope, I`m using really standard core D7 config.
I just use a simple online store configuration with a 800 product types, created in commerce with a ~30 fields for each
WIth xhprof I`ll found a problem with memory in _field_info_collate_fields() and here I am 8)
As described at #1 - memory trouble is even at small D7 installations and I thought my configuration will be helpfull for this issue
FieldAPI really good for storing and displaing such huge number of fields with a memory trouble for _field_info_collate_fields() workaround
If we fix it - D7 would be a good solution for enterprise systems like electronic, car parts catalogs (10K+ types with 50+ fields each) or any other
Comment #14
Berdir@podarok
You misunderstood Damien. He meant that you are using Drupal 7 in a way for which it was not designed to be used.
Have you looked the Properties project that I linked above? ( http://drupal.org/project/properties ). One of the main reasons for creating that was that you practically *can not* create hundreds of field instances. It works around that by being a single field that has an more or less unlimited amount of key/value pairs, grouped in categories. It is limited to strings as values currently, so it doesn't work for anything but it allows to create any number of pre-defined templates, which are *not* read into memory.
The fields system in D7 simply has not been designed for what you are doing (creating 800 product types). I don't think that can simply "be fixed", certainly not in D7, as that requires architectural changes to really solve this properly.
Comment #15
yched CreditAttribution: yched commented@podarok : everyone agrees there's a data size (memory + CPU) issue with _field_info_collate_fields(), and the discussion in this thread will hopefully come to enhance that (with the particular requirement that a solution for D7 needs to avoid breaking the field_info_*() API).
However, as for any performance and optimization issue, there needs to be a target as to what range of data size we should support with a "reasonable" performance, so that we can reason on the necessary tradeoffs. "800 bundles & 16.000 field instances" seems out of reach (and scope), and definitely not the main case we should try to optimize for.
Comment #16
catch@podarok, if you are using exactly the same set of fields for two different bundles of the same entity type, then that sounds wrong overall. It's likely the changes here would help your use-case too, but they might not make it workable with a small memory limit still.
I have the CacheArrayObject class at #1011614: Theme registry can grow too large for MySQL max_allowed_packet and memcache default slab size working well now, it would be possible to start work on a patch here based on that API (although it may well change before that patch gets anywhere near commit).
Some thoughts, I need to review the order in which the various field_info_*() functions get called on most requests, but iirc the instance functions tend to get called, then the field functions are called usually for fields that are already somewhat known about from instances. If this is correct:
_field_info_collate_fields() should switch to returning a CacheArrayObject. It's a private function so that's fine.
Try to allow field_info_instances() and field_info_instance() to then trigger loading instances (+ possbily the field definitions themselves for those instances) cached per bundle. The instances and fields would be stored for later retrieval during the same request in the CacheArrayObject.
If we stored both instances and fields per-bundle, then field_info_instances($entity_type, $bundle); _field_info_collate_fields() will load all the instance and field definitions for that bundle into memory. If I later run field_info_field(), it can re-use that same data. This means one cache request per bundle per page. We might load duplicate data, but the field info array inside the CacheArrayObject can be unique, so it won't be duplicated in the static at least.
field_info_field() could then consult the CacheArrayObject() to see if the field definition was already loaded, if not fall through to either the full field list, or an individual cache entry. This assumes that most of the time it's called when there's already an instance known about, which may not be the case.
functions that need to return a huge list of fields or bundles or instances across all entity types to the caller we can't do much about, but this won't happen on most requests.
Comment #17
podarok2catch
nope
Every content type (bundle) has its own set of fields, but some fields are common.
but anyway... it is possibly offtopic...
Comment #18
yched CreditAttribution: yched commentedre @catch #16 :
I'd say the assumptions and the approach look correct.
Below is a a small audit of the use of the field_info_*() API functions (at least those that fetch from _field_info_collate_fields()), with core + a couple 'big' contrib suspects (Views, Ctools, OG, Field permissions, Field collection, OG, Date, References, Relation)
Disclaimer, though : the Entity API module does advanced (and convoluted) stuff with fields (entity metadata et al), I'm not familiar enough with it to evaluate the performance impact over there. We wight want to bring @fago in here.
I also didn't dare looking through Drupal Commerce :-/
field_info_field($field_name) / field_info_field_by_id($field_id)
Those are generally called after knowing an instance (sometimes collected higher in the callstack, no problem). So having instances pre-fetch their fields sounds like a good pattern.
For other cases (code inquiring about a specific field, e.g "field settings page for field_foo", tests, or views field handlers, that generally deal with a field across bundles), that will mean a possible extra db hit, should be OK.
field_info_fields() / field_info_field_by_ids()
Then we have the "return all fields" functions : .
We'll need to load every field definition, but we should probably make sure those do not remain in the persistent "on-demand" cache (and possibly not in the static cache as well ?)
field_info_field_by_ids() is mainly used internally, and each time the code could use field_info_field_by_id($id) instead. We can keep the function around, but our current core use cases should be moved to field_info_field_by_id().
field_info_instance($entity_type, $field_name, $bundle_name)
Perfect match for filling the static and persistent "on-demand" caches, and pre-fetching the corresponding $fields. We can probably load all fields and instances for the whole bundle while we're at it.
field_info_instances($entity_type = NULL, $bundle_name = NULL)
That one is a bit special, because of the two optional params.
- field_info_instances(NULL, NULL), like field_info_fields(), needs to load all definitions, and should probably not keep anything in the persistent cache (nor in the static ? in which case there's no need to pre-fetch the corresponding $fields)
- Same for field_info_instances($entity_type, NULL) (very rare).
- field_info_instances($entity_type, $bundle_name), like field_info_instance(), is a perfect match.
Comment #19
yched CreditAttribution: yched commentedAlso : I'm leaving tomorrow for a whole month away from my coding env. Not completely AFK, but I won't be able to help much code-wide :-/.
Comment #20
omercioglu CreditAttribution: omercioglu commentedComment #21
xjmCrossposting #1220168: Standardize information returned by _field_info_collate_fields() storage bins for consideration in whatever fix is implemented.
Comment #22
catchThis is waiting for #402896: Introduce DrupalCacheArray and use it for drupal_get_schema(). I already have 3-4 patches in the queue which are duplicating code from that issue and don't want to add any more, so I'm tagging this as blocked.
Comment #23
catchAnd now it's not!
Comment #24
yched CreditAttribution: yched commentedStarted to look into this now that DrupalCacheArray is in.
I'm actually not sure how we should best leverage the tool here :
- One single DCA object for both fields and instances across all entity types and bundles (filled on demand for each separate entity_type+bundle) ? - means one single db cache entry, which can still theoretically grow huge and hit MySQL max_allowed_packet limit ?
- Two separate DCAs for fields and instances ?
- Different DCAs for each entity_type+bundle ?
Comment #25
catchI think we want a cache entry per bundle including both the instances and the field info triggered by field_info_instance()
I don't know if we want to make those individual entries DrupalCacheArray or not - depends if there is field and instance information that might not get requested I guess.
We then need to figure out how that interacts with requests for field info - it may be the fields are a DCA and this checks for field information that was pulled out for bundles, then falls back to its own cache (or the db) if not. That bit is going to be tricky I think.
There's also situations like when all instances get returned or all fields. We might want to not cache those calls at all (pull from the db and call alter hooks) since that should be rare. Eventually we could look at ArrayAccess + Iterator for things like that - started work on this at #853864: views_get_default_view() - race conditions and memory usage.
It may be we just end up using ArrayAccess for part of this since it's not going fit neatly in to the pattern used for schema and theme registry. Will try to have a more in-depth look at this myself in the next couple of days.
Comment #26
yched CreditAttribution: yched commentedI started to work on some code myself, not nearly finished, but I will try to post it later today.
Comment #27
yched CreditAttribution: yched commentedPosting what I have. This is still largely rough code, unpolished and unfinished - notably, not tied with the actual field_info_*() functions yet.
The FieldInfoFieldCache class extends DrupalCacheArray for now, but uses a different logic around the persistent and static caches (one DCA but several cache entries, static data stored not only in $this->storage, and massaged before returning the accessed values), so that in the end it has little in common. At some point it might be clearer to drop the reference to DCA completely, not sure yet.
Guiding principles :
- We want separate cache entries for each bundle, to keep them to reasonable sizes. Only load / unserialize / keep in memory the data for the bundles used in the page. That's one cache_get() per bundle used on the page, I think this is a fair trade_off.
- For each bundle, the persistent cache stores the list of instances, and the list of corresponding fields. Bundles are populated 'as a whole' : all instances and fields for the bundle; I don't think it makes sense to have on-demand loading of specific fields or instances within a given bundle.
- The static per-bundle entries ($this->storage["$entity_type:$bundle"]) do not hold field definitions, but only a list of field ids, to avoid in-memory duplication of fields shared in several bundles. When the persistent cache entry for the bundle is loaded in memory (in offsetGet()), the field definitions are taken out and merged in a global list of fields ($this->fields_by_id)
- When per-bundle entries are accessed (offsetGet()) or persisted (__destruct()), the relevant field definitions are merged back into the returned value.
Comment #28
bibo CreditAttribution: bibo commentedSubscribing.
Comment #29
yched CreditAttribution: yched commentedFor the record, I'm still working on this, but I've had limited core time recently. I'm currently working in #1305362: Helper issue for #1040790 to have a testbot playground for the work-in-progress patch without cluttering this issue.
Still a fair amount of code polish ahead when the tests over there come back green, though.
Comment #30
yched CreditAttribution: yched commentedStill some polish needed, and a couple @todos, but we should be getting close.
I'll try to post a human-reviewable patch with a patch summary shortly.
Comment #32
yched CreditAttribution: yched commentedHere's a green and almost final patch (still has a couple docs @todos).
I'll try to wrap a summary asap.
Comment #33
yched CreditAttribution: yched commentedComment #35
BerdirInteresting.
Is this still applicable for a D7 backport? The functions which are moved around are prefixed with an _, sure, but this is quite a huge change...
Comment #36
yched CreditAttribution: yched commentedSo much for "here's a green patch". For some reason patch #32 included a revert of my #1391384: Wrong static clear on taxonomy_vocabulary_save() fix and thus failed the tests I added over there. That's called self-biting I guess...
Updated patch fixes that and clears a couple doc @todos.
@Berdir: yes, this is definitely done with a D7 backport as a target.
The changes on existing functions only involve a couple _internal_funcs(). For some of them (field_info_prepare_*()) I suspect there might be legitimate cases for calling them from contrib, but the D7 version of the patch can preserve BC implementations.
Comment #37
xjmWow, big patch! It will be great to see this get in. Somewhere in the middle I switched from reading the code to just doing a docs review, which follows.
This should be one line, 80 chars or less, and start with a third-person verb. Also, "information" is not pluralized. How about simply:
Trailing whitespace at the end of these lines. Also, because of the above change, I'd start the function either with "Holds..." or "This class holds..."
Unneeded space before the parenthesis here. Also, I think we can fit more words on this line?
Typo here (ends with a comma instead of a period).
We should try to word this such that it fits in one line.
I think there is a typo here (a period instead of a space).
Trailing whitespace after this method.
These should read:
"does not read from nor populate..."
"ID" should be uppercase in both these lines.
"Non-deleted" should be hyphenated. Edit: Also, there's a typo ("Ratrieves").
We can remove the word "return" here.
Do we really need to call the query "ad-hoc"? :P
@todo
.Trailing whitespace here.
Trailing whitespace just above the return value.
REALLY nitpicky point, but "Fall back" should be two words when it's a verb.
Phew! :)
Adding a novice tag in case someone wants to clean up the points listed above and write one-line summaries where there are currently
@todo
before others get back to it.Comment #38
xjmSo I didn't mean to mark this NW actually. Conversation in IRC about how dreditor sets it to NW reminded me. :) Sorry.
Comment #39
Anonymous (not verified) CreditAttribution: Anonymous commentedwill run some benchmarks and a do review of this tomorrow - nice work!
Comment #40
yched CreditAttribution: yched commentedFixes @xjm's remarks in #37.
Comment #41
yched CreditAttribution: yched commentedSome more refactoring.
Moving the static and persistent cache gets and sets to separate helper methods really helped make the code flow legible in the rest of the class, but the method names themselves sucked big time (staticCacheGetFieldbyId(), persistentCacheSetBundleInfo()...).
Factorized those into cacheGetX() / cacheSetY() methods, with CACHE_STATIC and CACHE_PERSISTENT class constants as params.
Comment #42
Anonymous (not verified) CreditAttribution: Anonymous commentedthis fatal errors for me on install. but the bot green-lights it. ruh roh.
Fatal error: Undefined class constant 'CACHE_PERSISTENT' in /var/www/drupal8/webroot/core/modules/field/field.info.inc on line 258
will see if i can fix that.
Comment #43
xjmUntagging.
Comment #44
Anonymous (not verified) CreditAttribution: Anonymous commentedattached patch fixes the PERISTENT issue - was just a typo.
only change vs 41 is s/PERISTENT/PERSISTENT/.
Comment #45
Anonymous (not verified) CreditAttribution: Anonymous commentedComment #46
Anonymous (not verified) CreditAttribution: Anonymous commentedok, another fatal, sacheSetFieldMap(), fixed with attached patch.
Comment #47
yched CreditAttribution: yched commentedOops, #41 came back green with 0 passes, which gets easily overlooked. Sorry about the typos.
Comment #48
Anonymous (not verified) CreditAttribution: Anonymous commented#47 no worries, this is awesome work.
taking me a while to work through this patch, because i still need to get up to speed on the way this data changes and is queried.
Comment #48.0
yched CreditAttribution: yched commentedPatch summary.
Comment #48.1
yched CreditAttribution: yched commentedadded API addition on field_read_fields()
Comment #49
yched CreditAttribution: yched commentedAdded an issue summary.
Comment #49.0
yched CreditAttribution: yched commentedformatting
Comment #50
yched CreditAttribution: yched commentedbump - anyone up for review ?
Comment #51
drzraf CreditAttribution: drzraf commentedI would be happy to test this if there were a "light" way to consistently measure the memory consumption.
Something like
drush ev '[RAM usage]; _field_info_collate_fields(); [RAM usage]'
or something usingmysqldump db {cache_field}|wc -c
, so we can do several tests easily before and after the patch.Comment #52
podarok#46
clean drupal 8.x today git clone
with manually created module for filling one entity_type 'node' , bundle 'test_type', with 1000 fields(instances)
all tested on freebsd 9.0-RELEASE-amd64bit Core2Quad Q9300 CPU + 6GB RAM +
apache 2.2.21
php 5.3.9 (all dynamic extensions)
mysql 5.5.20
all 64bit builds
no opcode cachers
without patch #46
truncated cache tables run
xhprof data
cached run
with patch #46
truncated cache tables run
cached run
for the bottom results i`d use most repeatable result from ~5-10 page runs
apachebench for a custom page
without patch
Server Software: Apache/2.2.21
Server Hostname: 127.0.0.1
Server Port: 80
Document Path: /drupal/?q=pfield
Document Length: 1648597 bytes
Concurrency Level: 5
Time taken for tests: 171.750 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 1649080000 bytes
HTML transferred: 1648597000 bytes
Requests per second: 5.82 [#/sec] (mean)
Time per request: 858.750 [ms] (mean)
Time per request: 171.750 [ms] (mean, across all concurrent requests)
Transfer rate: 9376.59 [Kbytes/sec] received
with patch
Server Software: Apache/2.2.21
Server Hostname: 127.0.0.1
Server Port: 80
Document Path: /drupal/?q=pfield
Document Length: 1648597 bytes
Concurrency Level: 5
Time taken for tests: 168.670 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 1649080000 bytes
HTML transferred: 1648597000 bytes
Requests per second: 5.93 [#/sec] (mean) (in other run got 5.7#/sec :( )
Time per request: 843.349 [ms] (mean)
Time per request: 168.670 [ms] (mean, across all concurrent requests)
Transfer rate: 9547.83 [Kbytes/sec] received
~ 1.8% speedup at all
I do not see any speed/memory optimization here
Or possibly show me right way to test it
Comment #53
catchYou won't see an improvement if you add 1,000 instances to a single bundle. A better data set for testing this (with the same number of overall bundles) would be:
- ten bundles
- 100 instances per bundle
- load instance info for three of these bundles during the request.
Comment #54
podarokOk @catch
I`ll try to test such config
Comment #55
podarokok...
another test
14 bundles of node entity
10 of em with 100 fields
without patch
xhprof
apachebench
with patch
xhprof output
apachebench
looks like this patch is a good memory optimization fix
possibly we can also do some work for multiple function call optimization and get a more nice code execution
but anyway 50% memory optimization is good enouph for making it RTBC
Comment #56
catchOK this looks great overall. I didn't do an in depth review yet so I may have missed finer points of the logic, but generally caching by bundle, and not caching when functions are trying to get everything all at once looks great.
A few things that came up reviewing:
Could we use the class name rather than 'static cache' here? This also looks more like it's commenting on the lines changed in the patch than recommending what to do, since it's hook documentation, could we put the performance advice in the phpdoc instead?
Hmm I'm not sure about this, does it really need to be consistent? If the problem is having $cache as the first parameter to everything else, what about swapping the parameter order?
This returns if it's from the persistent cache too?
Again this is either persistent or static so the docs should cover both.
Looking at this, I'm wondering if it wouldn't be easier to have separate methods for persistent and static cahing. So $this->staticCacheSetField() and $this->persistentCacheSetField() or similar.
Typo for 'corresponding'.
Is this only for performance, or is there an API change?
For Drupal 7 we might possibly want to keep the old function and wrap the new one?
Comment #57
yched CreditAttribution: yched commentedMany thanks for the profiling data, @podarok.
Attached patch fixes @catch's remarks, except those on cacheGetX() / cacheSetX() methods.
As I wrote earlier, I wasn't fully satisfied with the code legibility until I moved the caching logic in dedicated methods, but @podarok's numbers in #55 make me reconsider, since this adds function calls in the critical path.
With a primed static cache, field_info_instance($entity_type, $field_name, $bundle) makes :
- 1 nested function call in current D7/D8 : _field_info_collate_fields()
- 3 nested function calls with the patch : _field_info_field_cache()->getBundleInfo(), which calls cacheGetBundleInfo().
Moving from a giant statically cached array to a class with methods means we get two func calls (get the object, call a method) instead of one (get the array, read the part we need), but we should probably stop there and favor performance over legibility.
Will try to inline the cache methods back and see how this translates in XHProf.
- re: field_ui_inactive_instances() : yes, that's performance only, no API change. The current code calls the expensive field_info_instances($entity_type) in all cases, even when a $bundle_name is provided. My take is we could drop support for $bundle_name = NULL, but I don't want to do this in this patch. New patch makes the code in there more explicit and adds comments.
- re: _field_info_collate_fields_reset() : sure, this could be kept as a BC function in D7. I keep it removed in the current D8 patch.
Comment #58
podarokComment #59
podaroklooks like testing bot happy, so will try xhprof results in 24 hours against #55
Comment #60
yched CreditAttribution: yched commented@podarok : #57 has no real changes that would deserve another round of profiling.
I want to explore a bit the impact of cacheGetX()/cacheSetX() methods, I'll do that hopefully tonight or tomorrow.
Comment #61
podarokok, following up
Comment #62
yched CreditAttribution: yched commentedSo after spending some time with XHProf and ab, turns out the cacheGetX()/cacheSetY() methods add like 3% CPU overhead compared to when all caching logic is inlined. So I removed them in favor of the inlined version. A bit less friendly to read (see the getBundleInfo() method), but 3% is worthy.
New patch attached.
My setup (test code attached as well) :
20 node types
On each node type, 20 text fields :
- 10 fields shared across all node types
- 10 fields specific to the node type
That's 210 fields, 400 instances across 20 bundles. A "medium+" sized site.
+ devel_generate to create 100 nodes with dummy field data.
Memory measured with XHProf, CPU measured with "ab -n 300 -c 1"
1st test : Display all the fields on one entity (aka node/%nid)
- 8.x:
Requests per second: 13.80/s
Time per request: 72.441 ms (s/d 5.0)
Total Incl. PeakMemUse 15,662,856 bytes
- patch:
Requests per second: 14.51/s
Time per request: 68.941 ms (s/d 4.5) --> CPU -4,83%
Total Incl. PeakMemUse 8,384,776 bytes --> Memory -46,5%
2nd test: Display 10 entities (all fields) across 10 different bundles (aka "worst-case listing page")
- 8.x:
Requests per second: 6.61/s
Time per request: 151.173 ms (s/d 5.6)
Total Incl. PeakMemUse 17,905,632 bytes
- patch:
Requests per second: 6.64/s
Time per request: 150.651 ms (s/d 4.5) --> CPU -0,35%
Total Incl. PeakMemUse 13,871,712 bytes --> Memory -22,5%
Comment #63
catchThis looks good to me, I don't think losing the methods is too bad for readability.
Minor issues:
- the 'static cache' is just a class property since the class itself is held in a static, we should call it something else.
- There's some very minor whitespace things like extra blank lines.
Wondering if beejeebus got any further with his review?
Comment #64
yched CreditAttribution: yched commentedDang. You puzzle me, actually. I'll try to see if I can make things clearer in the class docblock, but I'd really prefer to keep the expression "static cache" in the code comments within the methods.
Even though not directly based on PHP static vars, conceptually those properties really act as a static, 'request only' cache as opposed to data read from the persistent cache bin. The various methods are just implementations of the common pattern present all over core :
For efficiency I removed the explicit getCache()/setCache() methods which made the pattern very explicit in code, but I tried to keep it clear in code comments. If we can't use the words 'static cache' there while it's the expression used throughout all of core for this pattern, then all I'm left with is some fairly cryptic code :-(.
[dishonest mode] The rest of core says "static cache" but doesn't directly use PHP static vars either, they use a userland construct based on a function that "happens" to have the word 'static' in its name :-p. [/dishonest mode]
Comment #65
yched CreditAttribution: yched commentedWhat about this ?
- tried to make the phpdoc block for the class clearer
- added "quotes" around the term 'static' in code comments.
+ additionally fixes extraneaous arg passed to hook_field_storage_details_alter()
Interdiff attached.
Comment #66
podarok#65
what about testing ab -c5 ?
today we mostly have a multicore environments and c5 is more realistic in results
Comment #67
catchThanks yched, those comments look better to me.
I don't have any more complaints here and the performance numbers look great. Given this has been profiled already I don't see a need to do benchmarks - trust xhprof over ab any day.
Comment #68
yched CreditAttribution: yched commentedactually, in #62, CPU measurements come from ab, memory measurements come from xhprof. I couldn't get xhprof to produce somewhat consistent and reproducible cpu ratios between runs, whereas the ab processing times over 300 runs were fairly stable.
Comment #69
yched CreditAttribution: yched commented@catch : what is missing before we can get this in ?
Comment #70
catchI was hoping someone else would rtbc it, but let's try that and see if any additional reviewers show up, if not I'll commit soon.
Comment #71
sunI didn't test this patch and I'm not able to comment on the functional code changes at this point.
However, I'd like to see the phpDoc warnings/notes to be shortened and clarified, as it seems that it's important that developers understand the implications of using whichever functions quickly and crystal-clearly.
To achieve that, put the most important info upfront. Skip the preceding "Warning:" or "Note:" hint. Since all of the functions seem to be cached, but only some retrieve a particular key/item of the cache, whereas the others can lead to memory issues, it's a bit hard to distill the important bits, but the docs should really try hard to communicate that clearly and quickly.
I'm also missing concrete pointers or examples for when it is appropriate to use the memory-hog-functions. It sounds like that's discouraged and almost not supported anymore (which would make sense to me), but the functions are still around.
E.g.:
and
Comment #72
yched CreditAttribution: yched commentedI'll try to address @sun's remarks tonight.
Comment #73
catchTagging this as novice for the comment changes only, see #71 for what needs to change relative to the current patch.
Comment #74
BerdirRemoving the Novice tag, I don't see this as a Novice tag just because it's only about comments :) #71 merely contains an example and basically says "Improve the documentation" and for that you need to understand the code, and that isn't that easy.
Comment #75
geerlingguy CreditAttribution: geerlingguy commentedSubscribing, and adding that I have two sites now where this issue has reared its ugly head; one community-based site lets people set up 'networks' with profile2 profiles (for each network), and people can put any number of fields into their profiles... this means that we already have a few thousand field instances, and that site's field_info_field blob is 6.1MB.
I've checked through the patch, and code-wise/functionality wise it looks like it's good. Haven't tested it, but will try to review docs and submit a new patch if I can get a few spare moments. (Right now, this issue is pushing cache_clear's on one of the two sites over our memory limit, and causing some other minor performance issues, so it's in our best interest to get this backported too :).
Comment #76
BerdirAttaching a re-roll of the patch from #65, no actual changes yet.
Comment #78
pounardWhy is there an
init()
method on theFieldInfo
class? You could just writeprotected $fieldMap = array();
(default property values) in class properties declarations, and get rid of both the constructor and the init method, this would be more CPU wise and code would be shorter.Instead of
isset()
on the static cache arrays you could usearray_key_exists()
, this ensures you would not retry a second cache load in some cases, where the key would be here but could be NULL due to a previous failure; Another solution is to set an explicit FALSE on every miss (non existing field id or name or whatever) in order to ensure that isset() would return TRUE for non existing identifiers (if you don't likearray_key_exists()
, but that's a matter of taste).EDIT: The previous paragraph is important, because I'm afraid that unstored invalid identifier could cause the underlaying cache get attempts and framework to do unwanted extra SQL or cache backend queries.
Comment #79
BerdirThis should pass the tests, not sure if these two assertions are new or why they fail now. It is a change because previously, we returned "empty" bundles (= no fields) and now we don't do that anymore. Not sure if we that's ok.
I also moved the FieldInfo class to PSR-0 and tried to address the problem with trying to access missing fields. Not sure if that could result in problems because it worked differently previously, for example code that creates a field if it doesn't exist yet and runs multiple times, e.g. for multiple bundles. No idea if the caches are correctly cleared when the field is actually created. Let's see if there any test fails because of this
I also have no clue how this overlaps with the new Field API, any chance @yched or others working on this can check that?
Attaching an interdiff for the changes suggested in #78, there are a few other changes and the move that's not in there, because that makes the interdiff useless (moving and changing code at the same time).
Comment #80
yched CreditAttribution: yched commented@Berdir : thanks for grabbing the task.
I'll try to follow up with a review later today - meanwhile, if that's ok with you, I'll upload the patch in a branch in the new "D8 Field API" sandbox we opened for the we're currently running sprint.
Comment #81
Berdir@yched: Sure, that's ok.
Comment #82
yched CreditAttribution: yched commentedPushed.
D8 field API sandbox : http://drupal.org/node/1736366
field-info-cache-1040790-berdir branch.
Comment #83
yched CreditAttribution: yched commentedReviewed Berdir's changes in #79, adjusted some of them.
See attached interdiff.
I don't think the array_key_exists call is needed here, because we never store NULL in fieldIdsByName, we store MISSING_FIELD_ID.
OK, this allows us to do
return $this->fieldsById[$this->fieldIdsByName[$field_name]]
when we return directly from "static" cache above. Seems a bit hackish though.I'd prefer handling the case where fieldIdsByName[$field_name] == MISSING_FIELD_ID explicitly when we return from static cache.
See attached interdiff.
Those changes don't seem needed since the code below always puts a non NULL / non-empty value in
$this->bundleInfo[$entity_type][$bundle]
We currently don't check whether the [$entity_type, $bundle] pair is a valid, existing bundle.
I started doing so at some point, but it introduced some nasty race conditions & dependency on entity_get_info() caches, and I reverted that.
So I'd I'd go for not checking the validity of bundle. If some code asks for info about a non-existing node type, we cache (both statically and persistently) the fact that we don't have any fields on that node type. Broken code repeatedly asking for non existing bundles will therefore add more "empty" cache entries, but that's not a big deal, those cache entries won't bloat the cache entries that get loaded in memory for valid bundles.
Comment #84
yched CreditAttribution: yched commentedDunno why this switched to "need work".
I still need to get back at @sun's remarks in #71, but let's have the bot run meanwhile.
Comment #85
yched CreditAttribution: yched commented(side note : I forgot to push the changes corresponding to patch #83 in the field-info-cache-1040790-berdir sandbox. Did that just now.)
Comment #86
geerlingguy CreditAttribution: geerlingguy commentedI've tested the patch on D8, and before, there's one row with all the field info (
field_info_fields
) that's over 100KB. After patch, there is a row per entity type (field_info:bundle:node:article
and the like), and each cached entry per-bundle is only about 8KB.Tests passed on the previous patch (which applied cleanly on my local computer), I think this patch looks great (and WOW there is a lot less data being passed around on one page request in my particular instance, instead of having an 8MB field_info_fields row... now it's about 10-50KB per entity type), and docs are reasonable.
Memory savings ftw!
Comment #87
geerlingguy CreditAttribution: geerlingguy commentedAh, still haven't incorporated docs suggestions from sun in #71, so back to needs work, but imo, only the docs need some work. Taking a stab at this now, but only have a little time, so might not be able to get too far.
Comment #88
geerlingguy CreditAttribution: geerlingguy commentedDidn't get time to work on this, and probably won't by next week :(
Comment #89
geerlingguy CreditAttribution: geerlingguy commentedBack to CNR: I've made a few spelling corrections and moved around/edit a couple bits of text for clarity. Everything seems clear to me. For D8, maybe it makes sense to remove some of the now 'wrapper-ish' functions like
field_info_field_by_ids()
, but that would have further implications beyond what this patch is trying to achieve.The warnings about memory use are fairly clear to me (especially after rearranging a couple docblocks), but please highlight what specific bits might be vague or too verbose still.
Comment #91
yched CreditAttribution: yched commented#89: field-info-cache-1040790-89.patch queued for re-testing.
Comment #92
yched CreditAttribution: yched commented#89 is green, the fail is #1779638: Unexplained test failure in LocaleFileImportStatus->testBulkImportUpdateExisting().
Thanks @geerrlinguy !
The comment changes are fine by me.
Fully agreed, there might totally be some API refactors worth considering around here, but let's do that in followups. The API around getting field and instance definitions will probably be impacted by the "convert Field API to CMI" patch.
Also, the FieldInfo object will definitely be read from the DIC at some point, but we'll see that later. (#1708692: Bundle services aren't available in the request that enables the module is still a blocker for this anyway)
The goal here is to keep changes minimal for an easy D7 backport.
I've been willing to see whether a fix for #1400256: Tests for: $info['fields'], not always set? could be baked into this patch, but that can also happen in a followup and shouldn't prevent anyone from RTBCing this :-)
Comment #92.0
geerlingguy CreditAttribution: geerlingguy commentedUpdated issue summary.
Comment #93
geerlingguy CreditAttribution: geerlingguy commentedSince I just changed a few comments, didn't touch code, and since the testbot and my manual testing confirmed no issues with the patch in #83 (now #89), I'm marking RTBC.
I've also updated the Issue Summary with more details, and some things to consider as follow-ups.
Comment #94
yched CreditAttribution: yched commentedThe current patch should fix #1400256: Tests for: $info['fields'], not always set?. The steps-to-reproduce for this bug mostly exist on the D7 side, though, so I'm working on the D7 backport patch (being tested right now at #1305362-38: Helper issue for #1040790).
Comment #95
yched CreditAttribution: yched commentedcrosspost :-)
Comment #95.0
yched CreditAttribution: yched commentedBetter formatting, added example, clarified RTBC status.
Comment #96
yched CreditAttribution: yched commentedAttached is a D7 backport. Passes tests : #1305362-47: Helper issue for #1040790
Comment #97
catchSorry folks I went to commit this but it no longer applies - hunk in field.info.inc, quick re-roll?
Comment #98
swentel CreditAttribution: swentel commentedRe-roll, hope it's ok, did it really quick, so wait for the bot :)
Comment #100
swentel CreditAttribution: swentel commentedMeh forgot a file.
Comment #101
yched CreditAttribution: yched commented#100 should be correct, thks @swentel !
Comment #102
yched CreditAttribution: yched commentedAh wait, the patch was broken by #1757504: Regression: language field is not visible on manage display, made changes in _field_info_prepare_extra_fields(). We need to report the changes to the new location of the function. Working on it.
Comment #103
yched CreditAttribution: yched commentedOK, here it is.
Comment #104
yched CreditAttribution: yched commentedComment #105
bojanz CreditAttribution: bojanz commentedTried to test the D7 backport on a test instance of Commerce Kickstart, got:
.
(both on frontpage and on update.php). So, contrib will need some updates I guess?
Comment #107
yched CreditAttribution: yched commented@bojanz : aw, right, some contrib modules call _field_info_collate_fields() directly :-/. I guess we'll need to bake some more BC in the D7 version. Will try to look into that later today.
Comment #108
yched CreditAttribution: yched commented#103's fail is #1779638: Unexplained test failure in LocaleFileImportStatus->testBulkImportUpdateExisting() again, patch is green.
Back to RTBC.
Comment #109
catch#103: field-info-cache-1040790-103.patch queued for re-testing.
Comment #111
yched CreditAttribution: yched commented#103: field-info-cache-1040790-103.patch queued for re-testing.
@catch : it's the same unrelated fail - it's all over the queues these days :-)
Let's try to be lucky, but #103 is green, really.
Comment #112
yched CreditAttribution: yched commentedGreen :-)
Comment #113
yched CreditAttribution: yched commented@bojanz: can you try with the attached D7 patch ?
(interdiff with #96 attached)
Comment #114
bojanz CreditAttribution: bojanz commentedThis needs more testing.
Applying the patch to a Kickstart v1 (D7 + Entity API + Rules + Commerce) install worked for me, but applying it to Kickstart v2 (Kickstart v1 + many contribs such as message, title, fences, crumbs, etc) didn't (breached 128M memory limit).
Preliminary poking showed that the problem is with calling field_info_instances() from hook_field_extra_fields(), because that causes it to loop back to FieldInfo::getBundleInfo(), over and over. This is something that Title does, as well as Message, as well as Commerce (and yet I didn't trigger from Kickstart v1, but maybe that's because that site uses less memory, so the 128M limit is further away).
After commenting out the call to hook_field_extra_fields() from inside FieldInfo::getBundleInfo(), I got the site to run.
Initial Kickstart v2 install without the patch: 1.5M.
With the patch: 0.25M.
(I used memory_get_usage(TRUE), which might be a bit too crude)
Comment #115
yched CreditAttribution: yched commented@bojanz: what about this one ? (also includes @David Rothsein's test from #1400256: Tests for: $info['fields'], not always set?)
(interdiff with #113 attached)
Comment #116
yched CreditAttribution: yched commentedSorry, rather this one.
(interdiff with #113)
Comment #117
bojanz CreditAttribution: bojanz commented(Still testing Kickstart v2, as a heavy enough environment to represent a typical site)
This one works. However...
"drush cc all" now takes significantly longer (about a minute and a half I'd say).
The first page load after the "drush cc all" exceeds the 30s load time and crashes.
(This happens for every "drush cc all", not just the initial one after applying the patch)
After that it works, and I see a consistent decrease in memory usage (from 0.85M to 1.25M per page as reported by devel).
I'm also seeing a consistent increase in page load times, between 200ms and 500ms per page.
So it seems there's still something fishy going on.
I can xhprof this on sunday if nobody beats me to it.
Comment #118
BerdirTested this on 7.x as well, seeing similar results. looking at with xhprof shows 18 calls to FieldInfo::getFields() through field_info_fields(), which is called by commerce_info_fields() (commerce_product_reference_entity_view()) and _nodeconnect_get_ref_fields() in mycase.
I'm quite sure that commerce can be optimized to get rid of these calls, especially because we're not actually using commerce_product_reference on this site at all. Don't have time to look into that right now, @bojanz, can you have a look at that? Another performance issue related to that function: #1567090: Unconditional call to entity_metadata_wrapper() in commerce_product_reference_entity_view().
Comment #119
swentel CreditAttribution: swentel commentedDid a test with the D7 patch as well on a current site:
- 138 fields
- 305 instances
yes it's a complex site :)
Before patch:
Page execution time was 112.85 ms. Memory used at: devel_boot()=1.82 MB, devel_shutdown()=16.98 MB, PHP peak=17.25 MB.
After patch:
Page execution time was 84.28 ms. Memory used at: devel_boot()=1.82 MB, devel_shutdown()=11.82 MB, PHP peak=12.25 MB.
Also as bojanz reported, drush cc all takes a long time, first call to the page didn't go white though.
Memory drop is fantastic, and didn't see an increase on page loads so far.
I'm going to run this one in pre-production for a few days because I only briefly browsed around on the site so far. Client is giving in a lot of content and browsing a lot, so I'll hear reports from them soon enough once something might go wrong.
Comment #120
geerlingguy CreditAttribution: geerlingguy commented(Firstly, sorry for the brain-dump of a comment to follow… I’ve been debugging for an hour or so, and can’t piece together everything as succinctly as I’d like.)
After applying the patch to a local D7 site, I kept getting out of memory errors (with 256MB RAM allocated in
memory_limit
), in different field files (field.crud.inc
,field.info.inc
,field_sql_storage.module
, etc.).I was using the APC module and caching cache_field and some other caches in APC, but this was causing out of memory errors even when I had PHP’s memory limit set to 1, 2, 3 or 4 GB (yikes!). So, I turned off APC caching, and instead used database caching (which helped the httpd process stay below 200MB for the most part). But now the load switched from memory to CPU, and the max time limit kept getting hit (I tried letting it run for 15 minutes at one point).
When I could get some debug output, after 15 minutes of a process running, I found that there were tons and tons of calls to
field_read_instances()
, each with slow-running queries to{field_config_instance}
and{field_config}
that took about 43ms (looks like these calls were coming fromgetFieldMap()
). One such query is:The below query was also called over 450 times, and I don't think it was cached:
The entire time, CPU usage spiked at 100% and stayed there.
So… for a site with 550 fields, 5,049 field instances, 12 node types, and 479 profile types (so 491 entity bundles), it looks like this patch isn’t very helpful when trying to build caches…
Comment #121
swentel CreditAttribution: swentel commentedYes, I was to hopefull but had to revert as well. Clearing cache, but also simply adding a new field, was a real pain. In our case, cache was handled by memcache which sometimes got completely crazy. I'm going to dig in this a bit during the weekend.
Comment #122
geerlingguy CreditAttribution: geerlingguy commentedIf I statically cache the results of the second query in #120, I still run out of execution time after 470 calls to the first query and 800 seconds of time.
Comment #123
yched CreditAttribution: yched commentedOk, real-life use cases are on D7 right now, so I think we'd better switch this to D7.
When we get a correct D7 patch, we can come back to the usual "fix in D8 then backport" way.
Comment #124
yched CreditAttribution: yched commentedRegarding the slowness reported in #120:
I'd be surprised if getFieldMap() was a problem. It's statically and persistently cached, and even in contexts where cache gets cleared 10 times a request, I don't see how the request could get painfuly slow, even with 5000 records in field_config_instance
What I think is the issue, though, is that the patch changes field_info_fields() and field_info_instances() (the ones that return *all* fields and *all* instances) to not be statically nor persistently cached. As per the early posts in this discussion (around #20 / #30), it was decided to stop caching those.
While I think that's valid for D8, it's not a behavior we can change for D7. While working on this patch earlier this spring, I grepped a couple "big" contrib modules, and there's an *awful* amount of code out there that calls field_info_instances(), even in places where it could do differently - just because they can : field_info_[fields|instances()] currently cost nothing, everything is in memory anyway.
I filed a couple patches against a couple of those modules back then, but it's a lost battle.
- In D8, there will be smarter ways to navigate through the list of fields and instances to find a specific one (the fieldMap is one).
- In D7, we need to keep field_info_fields() and field_info_instances() fully cached - just not loaded in memory by default on every page, only when explicitly requested.
Then, a call to one of those will have a cost (load or rebuild a huge array, have it sit in memory), but a series of calls - which is what I think is happening on the cache clears mentioned in the comments above) will mean no extra cost.
Contrib modules calling those in regular page views will need to be fixed individually.
I see no other way around for D7. We can fix field_info_instances($entity_type, $bundle), we cannot fix calls to the "gimme all" versions, only try to progressively move away from them.
Current D7 patch, being a straight backport of the D8 patch, still has field_info_fields() and field_info_instances() uncached, which is why IMO we have those issues on cache clears.
Comment #125
geerlingguy CreditAttribution: geerlingguy commentedThat sounds workable. The main offender, in my case, is Profile2, and since it's hopefully a candidate for core inclusion, it's already being cleaned up quite a bit... And this kind of conversion would probably be considered as part of the inclusion cleanup. (See #1668292: Move simplified Profile module into core).
Comment #126
yched CreditAttribution: yched commentedAttached patch adds back static caching for field_info_fields() and field_info_instances().
Tested on Commerce Kickstart 2, full setup with demo content :
- without patch, 'drush cc all' takes 30 seconds
- with patch, 'drush cc all' takes 32 seconds
Also :
I actually installed kickstart 2.0-beta2 a couple days ago.
Back then, the patch #116 would die on an infinite recursive loop, because commerce_product_reference_field_extra_fields() was calling field_info_instance().
I needed to split collecting extra fields into a separate function and cache entry to be able to get a cache clear to run.
Since bojanz was able to test the patch in #117, I assume the code has changed in kickstart 2.0-rc1.
It seems wise, however, to keep this separation. I don't know what else we have in contrib. That means an extra cache hit per bundle rendered in the page.
So, attached patch:
- adds back static caching on getInstances() and getFields()
The static lists are shared with getField($field_name) and getBundleInstances($entity_type, $bundle), so that we don't bloat memory with twice the same fields and instances.
Still @todo : add persistent caching ? Not sure, given the numbers above
- moves getBundleExtraFields() to a separate method
Comment #127
yched CreditAttribution: yched commentedOops, missed a rename.
Attached interdiff with #116.
Comment #129
yched CreditAttribution: yched commentedSharing the static lists raises an issue because they keep track of non-existent fields and bundles :
field_info_instances('non_existsing_entity_type') -> array()
field_info_instances('non_existsing_entity_type', 'non_existing_bundle') -> array()
field_info_instances('non_existsing_entity_type') -> array('non_existing_bundle' => array())
field_info_field_by_ids() -> array(1 => $field_1, 2 => $field_2)
field_info_field_by_id(99) -> NULL
field_info_field_by_ids() -> array(1 => $field_1, 2 => $field_2, 99 => NULL)
Hence the test fails above.
Not sure about the right fix.
Comment #130
pounardI see two different fixes (probably more are possible):
1) Keep two statics, one with real data, and the other one which only says 'exists' or 'does not exist' (TRUE or FALSE for example), when requesting a single field, attempt lookup in the second, and if exists then pick the value from the first. When requesting the complete list just return the first.
2) Proceed to array_filter() calls when returning it complete.
EDIT: While 2) is simpler, it's also less performance wise (it creates a whole lot of new arrays with array_filter()), so I think 1) is a good path even if a bit more complex.
Comment #131
yched CreditAttribution: yched commented@pounard :
Right, I considered array_filter() too, but we don't want to create copies of the massive arrays.
Attached patch goes for something like your approach 1).
It passes the 'field info' tests, let's see about the rest.
Patch also updates phpdocs, since field_info_fields() and field_info_instances() do statically cache their stuff now.
Comment #132
yched CreditAttribution: yched commentedComment #133
pounardIt looks fine, but I don't know the whole patch so don't consider my judgement is reliable. It needs to be manually tested and measured I guess.
Comment #134
geerlingguy CreditAttribution: geerlingguy commentedAwesome results! This patch halved the page load times for most pages on my test site. The site has 541 fields, 4,629 field instances, 12 node types, and 490 profile types. I've attached a file with results from three page loads, with a cache clear and a normal page load (cached), on two different pages on the test site. One page was complex, with a couple views, a few entities loaded, etc. Another page was a simple node page.
Here's a quick summary of the results (before and after the patch):
Before Patch
After Patch
Differences
After the caches are built, page load times and memory usage more than halve. This is better than I had expected! For smaller sites, this will only mean a couple MB savings at most... and cache clears won't take nearly as long as my tests (I don't know of any other site that has as many entities, but I imagine Commerce and Profile2-heavy sites may come close).
Anyways, I'd gladly take an extra bit of time during a cache clear (which only happens every few days on this particular site) for 2x faster page loads and half the memory requirements!
[Edit: One more note: if you look at the detailed results file in the next comment, you'll notice I have the sizes of the cache_field table before and after the patch listed as well. Before the patch, it's about 6 MB. After, it's over 100 MB. I wouldn't be able to cache table in APC after the patch (limited RAM), but that's fine with me since the entire 6 MB field doesn't need to be loaded on every request. Just something to note.]
Comment #135
geerlingguy CreditAttribution: geerlingguy commentedWhoops! I forgot to attach the detailed results file...
Comment #136
webchickCool! Let's get some results from a few more people. Also moving to 8.x and to be ported.
Comment #137
Lars Toomre CreditAttribution: Lars Toomre commentedFYI... When next re-rolled, please check for white spaces at the end of lines. There were two instances in the patch in #131.
Comment #138
geerlingguy CreditAttribution: geerlingguy commentedYeah, noticed that, but forgot to mention.
Comment #139
geerlingguy CreditAttribution: geerlingguy commentedAttached patch for D7 fixes whitespace errors.
Comment #140
yched CreditAttribution: yched commented@geerlingguy : thanks for the tests (and reroll) ! It would be good to hear from Commerce Guys guys (sorry, lame, couldn't resist :-p), but your numbers are very encouraging.
Regarding the size of the cached entries : what the patch does is store fields + instances involved in a given bundle in a dedicated cache entry for the bundle. Thus, $field definitions get duplicated in the cache entries for the bundles it appears in.
Going from 6Mb to 100Mb seems a lot, though. Beware that this cache table also contains the field values cache.
Would be interesting to check how many "field_info:bundle:[entity_type]:[bundle]" entries you have, and how many fields are stored in each of them, see if that matches your sites field map.
Also - we still need to add back persistent caching of "all fields" and "all instances" IMO. I'm hoping to be able to roll a patch later tonight
Comment #141
webchickOops. Guess this is still NW then. :(
Comment #142
bojanz CreditAttribution: bojanz commentedSorry, traveling at the moment, will try to test the latest patch as soon as I stop (in about 2 days).
Great work so far!
Comment #143
yched CreditAttribution: yched commentedAttached patch:
- adds back persistent caching for field_info_fields() and field_info_instances()
- exposes the "lightweight" field map (now includes field type) through a new field_info_field_map() function, which should cover most current use cases for field_info_fields() / field_info_instances(), so that contrib can start getting rid of those expensive calls.
- updates phpdocs for the new caching behavior (we do cache full lists)
Comment #144
yched CreditAttribution: yched commentedFixes phpdoc for Fieldinfo::getExtraFields(), was blindly copy pasted from getBundleInstances().
Comment #145
yched CreditAttribution: yched commentedAnd the corresponding D8 patch.
I'm going afk til the end of October. I'm leaving it to you folks to drive this home :-)
Comment #146
yched CreditAttribution: yched commentedEr, D8 patch attached.
Comment #146.0
yched CreditAttribution: yched commentedAdjusted follow-up task description.
Comment #147
yched CreditAttribution: yched commentedUpdated the "API changes" section in the issue summary, to match the current patch.
Comment #147.0
yched CreditAttribution: yched commentedUpdated for latest changes
Comment #149
yched CreditAttribution: yched commented#146: field-info-cache-1040790-145-D8.patch queued for re-testing.
Comment #151
catch#146: field-info-cache-1040790-145-D8.patch queued for re-testing.
Comment #152
pounardI applied the latest D7 patch version and I get exceptions thrown by search api, when rebuilding views data, when flushing the caches. I'm not giving any details right now but I'm not sure of the source, but this wasn't happening before applying the patch.Ignore that this doesn't seem related, just a weird coincidence.Comment #153
catch#134 is very, very encouraging. I think it's fine to trade the extra execution time on a cache miss for better runtime performance, especially considering peak memory usage is a lot lower as well.
Comment #154
joachim CreditAttribution: joachim commentedI've applied this patch to my dev site that has commerce and a ton of fields on lots of bundles, and it certainly seems to make page loads quicker. No error messages on cache clear.
This function has changed its signature.
Does this patch needs a change request, or is it ok because it's a private function?
Comment #155
geerlingguy CreditAttribution: geerlingguy commentedUsing D7 patch in #144, clearing caches takes about 5 seconds longer, but normal page load times and memory usage were the same. It's crazy how much faster the site feels with this patch. Most pages load in half the time they used to!
Hoping to hear more of the same from some commerce user... If all that's needed is a quick test of before/after with a default commerce install, I'd be glad to do that, but it seems like someone who has an actual site with real data to test would probably be preferable.
Comment #156
pounardWhen I tested it over a complex commerce site with 100+ fields here, I didn't see any difference in page generation time at all (tested on pages displaying 30 products and 30 associated nodes). Important note thought: the windows box I'm working on spends half the time on drupal_load() and PDO::execute() alone. Aside of that, drupal_alter() take more time than every other field API operation (3000+ drupal_alter() calls dues to the field API alone). I didn't notice any difference when clearing the caches. After all that I got unrelated troubles with Search API.
This is an objective feedback, aside of that, all other feedbacks are really encouraging.
Comment #157
swentel CreditAttribution: swentel commentedDid two testing scenarios (with the D7 patch), created a setup with a drush script (works for D7 and D8) to quickly create content types and generate a number of fields (currently only textfields though) - see http://drupal.org/sandbox/swentel/1781486 - just run 'drush fact anynumber' (defaults to 10)
Scenario 1
7 content types, 286 fields and instances, 100 nodes - 1 node type has 236 fields, others have between 10 and 26.
Request on frontpage - nodes from various types generated by devel generate (100)
Before: Page execution time 130.94 ms. Memory used at: devel_boot()=0.99 MB, devel_shutdown()=11.14 MB, PHP peak=11.5 MB.
After: Page execution time 134.38 ms. Memory used at: devel_boot()=0.99 MB, devel_shutdown()=10.38 MB, PHP peak=10.75 MB.
Request on node with 236 fields
Before: Page execution time 131.76 ms. Memory used at: devel_boot()=0.99 MB, devel_shutdown()=11.27 MB, PHP peak=12.75 MB.
After: Page execution time 127.95 ms. Memory used at: devel_boot()=0.99 MB, devel_shutdown()=10.28 MB, PHP peak=11.75 MB.
Request on node with 16 fields
Before: Page execution time 62.61 ms. Memory used at: devel_boot()=0.99 MB, devel_shutdown()=10.06 MB, PHP peak=10.25 MB.
After: Page execution time 51.03 ms. Memory used at: devel_boot()=0.99 MB, devel_shutdown()=4.98 MB, PHP peak=5.25 MB.
Request on node with 4 fields (standard core article content type)
Before: Page execution time 57.77 ms. Memory used at: devel_boot()=0.99 MB, devel_shutdown()=10.02 MB, PHP peak=10.25 MB.
After: Page execution time 48.71 ms. Memory used at: devel_boot()=0.99 MB, devel_shutdown()=4.74 MB, PHP peak=5 MB.
Scenario 2
10 content types, 50 fields each (so yeah total of 500), 1000 nodes
Request on frontpage - nodes from various types generated by devel generate
Before: Page execution time 113.48 ms. Memory used at: devel_boot()=0.98 MB, devel_shutdown()=14.95 MB, PHP peak=15.25 MB.
After: Page execution time 107.48 ms. Memory used at: devel_boot()=0.98 MB, devel_shutdown()=11.12 MB, PHP peak=11.5 MB.
On the frontpage, only 7 different node types where rendered.
Request on node with 50 fields
Before: Page execution time 79.62 ms. Memory used at: devel_boot()=0.99 MB, devel_shutdown()=14.07 MB, PHP peak=14.75 MB.
After: Page execution time 62.09 ms. Memory used at: devel_boot()=0.99 MB, devel_shutdown()=5.62 MB, PHP peak=6.25 MB.
Comment #158
geerlingguy CreditAttribution: geerlingguy commentedWhoops—we need to get an answer for #154:
Other than that, it seems like this can go back to RTBC...
Comment #159
catch#154 just needs review for me, and doesn't matter for 8.x commit.
Comment #160
geerlingguy CreditAttribution: geerlingguy commented#146 seems fine to me; works for D8 in manual testing (creating fields, deleting fields, viewing nodes, etc.). I didn't do any performance benchmarks on it, though, since I only had a couple bundles and a few fields. Anyone else able to test and confirm RTBC for D8?
Then we could set this back to CNW to resolve the question mentioned in #154 for D7, after a D8 commit.
Comment #161
catch#146: field-info-cache-1040790-145-D8.patch queued for re-testing.
Comment #163
swentel CreditAttribution: swentel commentedReroll of D8 patch
Comment #165
swentel CreditAttribution: swentel commentedOk, now that's a nice surprise, I'll recheck this evening.
Comment #166
swentel CreditAttribution: swentel commentedOk, needed some re-rolling after #1326634: Clean up API docs for field module and #1785256: Widgets as Plugins got in - c'mon bot!
- edit - also removed the t() calls in fieldInfoTest.php, but mistakenly thought that was a new file, will need a new re-roll tomorrow. see #1797106: Remove t() from asserts messages in tests for the Field module for more background.
Comment #168
swentel CreditAttribution: swentel commentedOk, let's see what this does.
Comment #170
xjmComment #171
xjmSpoke to @swentel and re-un-tagging based on the current status here since there are functional parts of the
_ficf()
patch that are broken after widgets-as-plugins. Sorry for the noise!Comment #172
swentel CreditAttribution: swentel commentedAlright here we go - thanks to Stalski as well for the debugging.
Comment #173
swentel CreditAttribution: swentel commented#172: 1040790-170.patch queued for re-testing.
Comment #174
geerlingguy CreditAttribution: geerlingguy commentedLooks good to me. Here are some quick performance comparisons, from an upgraded site with about 25 fields shared across 8 node types (bundles) - 3 runs each for comparison:
Before Patch
After Patch
The page load time for the complex page is a little odd, don't know why it took longer after patch, but RAM savings are apparent, even though slight on a smaller site. I think swentel said he was going to do a little testing on a larger site instance too.
Comment #175
swentel CreditAttribution: swentel commentedDid a test like the last one in #157 for the D8 patch as well:
10 content types, 50 fields each (so yeah total of 500), 1000 nodes (devel generate)
Request on frontpage - nodes from 7 different node types
Before: Page execution time was 184.68 ms. Memory used at: devel_boot()=1.18 MB, devel_shutdown()=15.47 MB, PHP peak=15.75 MB.
After: Page execution time was 178.27 ms. Memory used at: devel_boot()=1.19 MB, devel_shutdown()=12.5 MB, PHP peak=12.75 MB.
Request on a single node page with 50 fields
Before: Page execution time was 95.89 ms. Memory used at: devel_boot()=1.18 MB, devel_shutdown()=14.41 MB, PHP peak=14.75 MB.
After: Page execution time was 72.91 ms. Memory used at: devel_boot()=1.2 MB, devel_shutdown()=5.72 MB, PHP peak=6 MB.
Comment #176
swentel CreditAttribution: swentel commented#1797170: Remove t() from assertion messages in tests for field_ui.module and #1797106: Remove t() from asserts messages in tests for the Field module got in - last re-roll of the day. Tagging with 'avoid' as well.
Comment #177
BerdirStill looks good. Let's get this in, it's about time :)
Comment #178
catchReally good to see this RTBC again. Committed/pushed to 8.x!
Comment #179
geerlingguy CreditAttribution: geerlingguy commented#144: field-info-cache-1040790-144-D7.patch queued for re-testing.
I think that the D7 patch should be RTBC if tests pass, but there was a lingering question in #154: "This function [
_field_info_collate_fields()
] has changed its signature. Does this patch needs a change request, or is it ok because it's a private function?"Comment #180
pounardI'm thinking that a lot of modules may use this function. Even if the underscore denotes a private function, I guess that a lot of modules will be tempted to use it.
Comment #181
tim.plunkett#180, #1288364: Clean up reset patterns in field.info.inc already changed the signature, and there was no change notice. It's a private function, if they used it that's their problem.
I grep'd through major contrib modules, and all I could see was one call in entity.module that would need to be changed to
_field_info_field_cache()->flush();
Comment #182
geerlingguy CreditAttribution: geerlingguy commentedI agree wholeheartedly. Now that I think about it... everywhere private functions are mentioned, there are heavy warnings to NOT use them in external code, and to do so at your own risk (see two examples).
Therefore, with a passing test in #144, I'm RTBCing.
Comment #183
yched CreditAttribution: yched commentedI *really* can't see any use case for 3rd party code to use _field_info_collate_fields() (no param) directly, we have wrapper API funcs for that. Any module doing so could be rightfully considered broken, but I seriously doubt there is any.
There are direct calls to _field_info_collate_fields(TRUE) (reset the cache), though. Entity API is one example (see #105/#107 above).
Entity API is not *supposed* to do that, but I guess it has kind of valid reasons - possibly because the official field_info_cache_clear() API func also clears the "list of available field types, widgets, formatters" cache, and Entity API, doing funky cross dependent stuff with various info caches, needed a more granular clear). So it seems a valid expectation to keep those calls working, and we can do so with minimal efforts.
IMO, this doesn't deserve a mention in the release notes, but I guess that's up to @webchick and @David_Rothstein to decide.
PS : yay for D8 commit, thanks @swentel & @Stalski for the rerolls !
Comment #184
webchickThis one has serious potential to break things on 7.x, so assigning to David to eyeball first.
Comment #185
alippai CreditAttribution: alippai commentedRelated issue: #1802074: Undefined index: fields field.info.inc:596
Comment #186
guy_schneerson CreditAttribution: guy_schneerson commentedRelated issue #1810178 _field_info_collate_fields() is not localized
This requires language specific versions of the cache variables. It should not affect memory usage as long as global $language; dose not change through the lifetime of a single request.
Happy to close the related issue if this patch can address the issue or look at it again after this is committed.
Comment #187
David_Rothstein CreditAttribution: David_Rothstein commentedThanks for all the work on this patch, as well as the extensive Drupal 8 reviews and Drupal 7 testing!
I'm a bit concerned that despite the fact that this has been filed against Drupal 7 for the last month there haven't really been any in-depth reviews of the Drupal 7 code itself. (Especially because from looking at the first 10 lines of the patch it's pretty clear that there is Drupal 8-specific stuff that carried over into it.)
Anyway, I spent a fair amount of time giving this a review myself. Overall it looks like a great improvement - while it's pretty clear this will hurt performance a bit in some cases, it seems like the huge memory gains (and performance improvements in other cases) do make it worth getting this into Drupal 7.
I separated my feedback below into a few categories based on severity. (I also tried to note places where my comments applied to the Drupal 8 code as well.) And I'd really feel more comfortable if other people gave this a mega-review as well. This is one of the bigger and more far-reaching patches we're ever going to commit to Drupal 7...
Commit-blockers and commit-blocking questions
__DIR__ does not exist in PHP 5.2, so we can't use that. Also, huh? I'm pretty sure that whole include_once line can just be removed :)
This throws up some red flags for me, since field_info_instances() is a public API function... Why is the expected output of field_info_instances() changing?
I looked through the codebase of a Drupal 7 site I had and found one right away:
http://drupalcode.org/project/field_delimiter.git/blob/abca25a20b5e05b06...
(Though granted it's an alpha release and the code is actually already gone in the latest dev version.)
The Internet found me one more:
http://drupalcontrib.org/api/drupal/contributions!cck!cck.module/function/cck_debug_field_info/7
(Although yes, that's dev code also.)
Anyway, this is not a public API function, but it's clear people are using it on occasion and especially for something as fundamental as Field API we try not change it in a stable release unless we have to (#181 points out that this function was changed in #1288364: Clean up reset patterns in field.info.inc but that looks like a Drupal 8 issue only).
So my question is simply this: Would it really be hard to keep around? Wouldn't it mostly just be a wrapper around class methods that the patch is already adding?
Currently the D7 patch removes them outright, though. So, same question as above - is this necessary? I have no idea if people are using them or not, but again, it seems like it would be very very simple to just leave these around as simple wrappers on the new class methods.
Would be really, really good to resolve before commit
(and tons of other places). These code comments are referring to the Drupal 8 class name, not the Drupal 7 one.
I'm not sure if we have an official standard for this, but I really haven't seen Drupal 7 (as opposed to Drupal 8) use this kind of file naming convention before. Shouldn't it be something like field.info.class.inc instead?
This will cause fatal "ambiguous column name" errors in some cases, since a lot of columns have duplicate names between the two tables but the regular keys aren't being prefixed with "fc". (For example, try something like this:
field_read_fields(array('entity_type' => 'node', 'field_name' => 'field_image'))
.)The only reason this isn't a strict commit blocker is that I suppose it can only occur if you're using the new functionality, but it would still be a really good idea to fix.
Affects Drupal 8 also.
Other issues
This looks like the tests from #1400256: Tests for: $info['fields'], not always set? but I don't see it in the Drupal 8 patch. Shouldn't it be forward-ported there (or is that what the other issue is being left open for?)
Typo ("retriev" => "retrieve").
Affects Drupal 8 also.
This function doesn't exist.
Affects Drupal 8 also.
That code comment isn't true, is it?
Affects Drupal 8 also.
Typo ("exists" => "exist").
Affects Drupal 8 also.
I'd probably move this to an inline comment (rather than part of the PHPDoc), no? It doesn't make much sense out of context.
Affects Drupal 8 also.
Missing parentheses after "provided".
Affects Drupal 8 also.
Comment #188
jhodgdonThis patch has also had an "avoid commit conflicts" tag for months. Can that be removed?
Comment #189
geerlingguy CreditAttribution: geerlingguy commentedYeah, I'm pretty sure that was really just for D8 HEAD tracking. We're past that point, and we just have some follow-up things to address.
Comment #190
jhodgdonTHANK YOU. I was getting tired of looking for the patch in #144 every time I needed to make a commit. :) [unsubscribing]
Comment #191
yched CreditAttribution: yched commented@David_Rothstein:
Thanks for the thorough review. I'll try to update the D7 patch asap.
Meanwhile, I posted patches for the items you mention that apply to D8:
#1837018: Possible "ambiguous column name" error in field_read_fields - nice catch !
#1837022: Field API comment fixes
Comment #192
mgiffordAny update? Is the patch in #176 safe to apply to a production environment yet?
We're running into some nasty looping that we're hoping to resolve.
Comment #193
mgifford#176: 1040790-176.patch queued for re-testing.
Comment #194
geerlingguy CreditAttribution: geerlingguy commented#144 is the lastest D7 version of the patch, and it's working very well on one production site, keeping page loads down to 10-20 MB (were 160 MB or more). Cache clears cause a long rebuild, though.
Comment #195
yched CreditAttribution: yched commentedSo, yeah, sorry for not reporting earlier.
David told me he is not planning a new D7 "regular" (non-security) release before January, so I'm focusing on D8 work until feature freeze, and will come back at this after Dec 1st.
Comment #196
yched CreditAttribution: yched commentedGetting back at this.
Attached patch fixes most of @David_Rothstein's remarks in #187.
I'll talk about the remaining points in a followup comment (probably tomorrow, though).
- Removed the implicit include before using the FieldInfo class in _field_info_field_cache()
Not having it caused fatal errors in tearDown() on all upgrade tests.
This gets fixed in UpgradePathTestCase : its setUp() unregisters the code registry, so its tearDown() needs to re-register it before running regular cleanup code.
Also, added a dummy update function to force a registry rebuild.
- Added back BC functions for the _field_info_prepare_*() helper functions.
- Moved the class file from modules/field/FieldInfo.php to modules/field/field.info.class.inc.
Got rid of the other D8-isms.
- Fixed the possible "ambiguous column name" errors in field_read_fields(), and added tests (D8 fix got in meanwhile : #1837018: Possible "ambiguous column name" error in field_read_fields)
- Fixed the doc remarks (D8 fix got in meanwhile : #1837022: Field API comment fixes)
Comment #197
yched CreditAttribution: yched commentedSpotted after I uploaded the patch :
- one extra newline before FieldCrudTestCase::testReadFields()
- assert messages in newly added tests were not wrapped in t()
Fixed locally, will be in the next reroll.
Comment #199
yched CreditAttribution: yched commentedSo, turns out that the fixes in UpgradePathTestCase() in #196 still leave *some* upgrade tests failing with a fatal if we solely rely on the class registry to do the autoloading. I tried to investigate a bit, but I can't make sense of this for now.
Thus, added back an explicit include for the class file (with a @todo) before instanciating the FieldInfo object - and reverted the UpgradePathTestCase changes.
Comment #200
yched CreditAttribution: yched commentedSo, the remaining points in @David_Rothstein's #187 were:
- field_info_instances() output slightly changing :
Previously, _field_info_collate_types() initialized the 'instances' array (the one returned by field_info_instances()) to an empty array for all existing entity types and bundles. So you could do something like:
with no PHP notice, without checking that the entries existed.
We stopped doing that in the D8 version, because this means calling entity_get_info() when rebuilding field_info and causes nasty cache rebuild dependencies.
Also, the above code pattern is actually not the typical use of field_info_instances() - if you want the instances of a specific entity_type and bundle, you'll use field_info_instances($entity_type, $bundle), not field_info_instances(give me all) - when using the latter, you typically iterate on the entries in the resulting array.
But, well, sure, I guess we cannot really change that in D7.
Attached patch reintroduces the the previous behavior (based on entity_get_info()) in FieldInfo::getInstances() - the regular getBundleInstances($entity_type, $bundle) does not rely on entity_get_info().
Patch also reverts the corresponding test changes.
- Should we keep an emulated _field_info_collate_fields() around ?
Well, I'm still convinced that any code calling this function directly is buggy and at fault.
If we can't remove that one, then it means our 'no API breaks" policy extends to all existing functions, and that there is no such thing as an "underscore prefix" ?
But well, I'm not in the shoes of a major version maintainer, and I fully respect your concerns :-).
Attached patch emulates the previous behavior of _field_info_collate_fields() - it's going to be a performance drag though...
Additionally:
- fixed a minor glitch in FieldInfo::getField() (D8 got fixed in #1831202: wrong param for field_read_field(), forgot to report it in the backport)
- found & fixed a minor bug in FieldInfo::getBundleInstances(), incorrectly caches bundles with no instances, the info is never statically cached, gets re-read from persistent cache each time. Since this deals with internal caching in protected members, this is not really testable. Will need to get fixed in D8 as well, I'll open a separate issue.
And well, this, my friends, is comment #200 :-/
Comment #201
yched CreditAttribution: yched commentedFrom #200 above:
Posted D8 patch at #1864720: Minor hysteresis in field_info_instances(). While doing so, I found out that the bug is actually observable and testable.
Attached patch adds the same test than in the D8 patch (also slightly adjusts the comment around the bugfix).
Comment #202
valthebaldWorks for D8. Testbot likes it. Checked locally. May I mark this RTBC?
Comment #203
YesCT CreditAttribution: YesCT commented@valthebald did you mean D7?
Comment #204
valthebald@YesCT: sure. I applied patch from #201 to the latest 7.x sandbox
Comment #205
obrienmd CreditAttribution: obrienmd commentedWorks for me for D7 install w/ lots of custom modules interacting w/ field API and entity API.
I see a 2-3% performance increase on front page w/ 10 nodes and 3-5% on node/%nid.
Comment #206
klausiThis is not a dynamic query, so it should use db_query() instead of db_select(). Minor performance improvement (avoids a lot of expensive PHP function calls), but it should not matter here since this is not called very often anyway.
We really need to update the documentation page on dynamic queries, since in 90% of the cases db_select() should not be used. http://drupal.org/node/310075
Comment #207
geerlingguy CreditAttribution: geerlingguy commentedSome test results (All tests were averaged over three runs on a site with 10 entity types and 101 field instances.)
Before patch:
cache_field table size: 272KB
Node listing page (with about 50 nodes): 509ms / 15.75MB RAM
Node listing page cache clear: 5676ms / 37.75MB RAM
Single node page: 342ms / 15.25MB RAM
After patch:
cache_field table size: 1.5MB
Node listing page (with about 50 nodes): 543ms / 14.5MB RAM
Node listing page cache clear: 7369ms / 39MB RAM
Single node page: 341ms / 13MB RAM
No problems with the patch (this is run on a dev instance of a live site), and the memory savings are apparent. The timings fluctuated a bit, so the accuracy isn't that great. Definitely no regression, except during a cache clear (actually, the rebuild is what takes most of the time).
Leaving at RTBC, and will do a quick re-roll with the query quoted above reworked into db_query() (I agree that there's no reason for db_select() on that one).
Comment #208
geerlingguy CreditAttribution: geerlingguy commentedAttached patch converts the query from field.info.class.inc to db_query(). I was going to format it like below, but it seems the core convention is to spell out the entire query on one super long line, so that's how I made the patch. Is there any guidance on this?
[Edit: According to berdir in IRC, there's no specific guideline, but single-line queries are normal. Just goes against my OCD tendencies :)]
Comment #209
bojanz CreditAttribution: bojanz commentedThe latest patch is missing some files.
Tested #201 on a fresh Kickstart 2.0 install.
Before:
All products view: Page execution time was 1548.19 ms. Memory used at: devel_boot()=0.81 MB, devel_shutdown()=18.2 MB, PHP peak=18.75 MB.
Frontpage: Page execution time was 676.57 ms. Memory used at: devel_boot()=0.81 MB, devel_shutdown()=14.7 MB, PHP peak=15 MB.
admin/content view: Page execution time was 551.83 ms. Memory used at: devel_boot()=0.81 MB, devel_shutdown()=13.86 MB, PHP peak=14.25 MB.
After:
All products: Page execution time was 1805.92 ms. Memory used at: devel_boot()=0.81 MB, devel_shutdown()=17.81 MB, PHP peak=18.25 MB.
Frontpage: Page execution time was 649.89 ms. Memory used at: devel_boot()=0.81 MB, devel_shutdown()=13.69 MB, PHP peak=14 MB.
admin/content: Page execution time was 580.78 ms. Memory used at: devel_boot()=0.81 MB, devel_shutdown()=12.81 MB, PHP peak=13 MB.
Execution times vary wildly and shouldn't be trusted. Memory savings were consistent.
I should also note that Commerce based sites will see a bigger improvement once we commit #1809916: Decrease memory usage by loading fields more selectively to Commerce
I've also tested installing the distro with the patch applied. Found no problems.
Comment #210
geerlingguy CreditAttribution: geerlingguy commentedYikes! Thanks for catching that, bojanz, forgot to add files to staging area before creating patch. Attached patch should be complete.
Comment #211
yannisc CreditAttribution: yannisc commentedApplied patch #210 cleanly. I get no errors or warnings. For a view with a lot of block views, I got 1.58 reqs/sec with the patch, instead of 1.61 reqs/sec without the patch. No cache, no APC enabled.
Comment #212
catchA view's not a particularly good test case for this. Just a node/n page generally works best to see the difference.
Comment #213
alexrayu CreditAttribution: alexrayu commentedTested #210 in WAMP. Site with lots of views, and field collections having fields in each. The patch yielded about 2% memory usage improvement, not distinct database usage improvement, but very high CPU usage resulting in about 1.5 longer page load time. Maybe it's WAMP-specific though.
As for errors and API, all went smooth, the patch brought no errors or any problems in actual functionality or API compatibility.
Comment #214
yannisc CreditAttribution: yannisc commentedIn a node page with 5 fields, I got from 4.01 reqs/sec to 4.09 reqs/sec, logged in as admin.
Comment #215
slashrsm CreditAttribution: slashrsm commentedTested #210 at fairly big media site (195 field instances, 70k articles). Mem usage on frontpage (a LOT of content) went down for ~2,5% and on node page for ~5,5%. CPU usage is slightly lower (but not significant).
Applied cleanly, no errors appeared.
Site uses quite a lot contrib/custom modules. Another level of improvement could be achieved by updating their code too. Did not test that though.
Comment #216
heddnTested #210 on front page of a news media site. Memory usage on front page went down ~12%. Page execution time (single user testing) were roughly 25-100ms faster on page refreshes. There's a lot of external js, etc that gets loaded so page execution time varied widely but was consistently faster.
old-side - page refresh
Memory used at: devel_boot()=5.69 MB, devel_shutdown()=41.37 MB, PHP peak=42 MB.
old-side - cache clear
Memory used at: devel_boot()=6.01 MB, devel_shutdown()=69.7 MB, PHP peak=91.5 MB.
new-side - page refresh
Memory used at: devel_boot()=5.69 MB, devel_shutdown()=36.58 MB, PHP peak=37.25 MB.
new-side - cache clear
Memory used at: devel_boot()=6.01 MB, devel_shutdown()=69.11 MB, PHP peak=90.75 MB.
Marking as needs work though because I tried testing several nodes. The body content of a node is universally empty. Not good. By the time we get to hook_node_view(), it doesn't exist. Not sure what in this patch caused the behavior but removing the patch causes the site to return to normal.
Comment #217
heddnBad me, we also apply #955658-42: Undefined index: module in FieldInfo::prepareInstanceDisplay() and I'd gone ahead and I applied it after applying this patch. I didn't apply it correctly. Moving back to needs review.
Comment #218
becw CreditAttribution: becw commented#210 works for me but provides limited reductions in overall memory usage (.75 to 1 MB) on a site of medium complexity. For example:
I tested the patch on a node page of a site with 70 fields and 136 field instances. I ran each page several times, since there were minor variations in memory usage and I wanted to watch for changes in the execution times. It seems like this patch will have a limited performance impact on sites of this size.
Thinking this was due to the small size, I ran the my test script on a node page of a larger site (which admittedly has some other performance issues). This site has 125 fields and 252 field instances; applying the patch increased memory usage by a small amount:
Is the intent of this patch to reduce the memory usage of
_field_info_collate_fields()
specifically, or of Drupal overall? Because while it may be succeeding at the first, it does not have a significant positive impact on the overall memory usage in the tests I have run.Comment #219
mariusdiacu CreditAttribution: mariusdiacu commentedI tried the patch with a content type with about 60 fields and 12 field collection with 4-5 fields in or even field collection in field collection. Cache is on memcached. For me the improvements was really small.
Comment #220
BerdirFor those not seeing improvements despite having quite a few fields, you might have contrib/custom modules which are calling field_info_instances() without type and bundle or field_info_fields(). (Like commerce does currently, as pointed out be @bojanz)
Right now, that doesn't matter, everything is loaded anyway, so that call is almost free. However, with the patch, that changes and that information is no longer kept in a global cache and needs to be rebuilt every time it is requested. I think that is what even slows down some sites.
Comment #221
yched CreditAttribution: yched commentedThanks to the many people that provided feedback here !
I was just about to post exactly what @Berdir #220 said. The memory gains will be mitigated/nullified by "give me all" field_info_[instances|fields]() calls in contrib module:
- field_info_instances() with $bundle param == NULL
- field_info_fields()
The whole point of the patch is to only load stuff that is needed - i.e the list of fields and instances on a specific bundle.
However, calls to one of the above loads everything in memory, and statically persists it, as in current D7, so as to not degrade performance of existing code. The only difference is that the "whole" info is not automatically loaded in memory on each page request anymore, but read from cache and statically persisted on first call.
The real use cases for field_info_instances("all") are not that frequent, but since the info was available at no cost so far, contrib modules liberally used it instead of more targeted calls.
There's not much that can be done about that, aside from slowly fixing contrib modules to get rid of those calls.
Most uses of field_info_instances("all") can generally be avoided, either:
- by using a slightly different logic (restricting to a series of specific bundles),
- or by using the newly added field_info_field_map(), which provides a lightweight map of fields across entity bundles. Contrib modules would then probably need to use an
if (function_exists('field_info_field_map')) { optimized version } else { old, non optimized version }
construct to preserve compatibility with both pre-patch & post-patch core codebases.Comment #222
tsvenson CreditAttribution: tsvenson commented@yched: (thinking out loud) Would it be possible to task the coder module to help identifying and maybe even assist on rewriting contribs that uses the wrong functions?
Comment #223
DamienMcKenna@tsvenson: The issue summary already includes sample code to that should be used to update existing modules, we just need to identify modules that use the old code pattern and submit patches to improve them.
Comment #224
DamienMcKennaI just looked through some modules I had checked out locally and found the following:
In almost all cases the usage was quite simple and it wouldn't take much effort to update them.
Comment #225
brianV CreditAttribution: brianV commentedTo add to the list in 224 from looking at one of our larger sites:
Invoke field_info_fields():
Apachesolr, Addressfield, ctools, custom_formatter, date, diff, entityreference, field_collection, file_entity, media, geofield, media, microdata, nodequeue, token, views.
Invoke field_info_instances() w/ no arguments or null $bundle:
Acquia Connector, Apachesolr, calendar, ctools, date, devel, entityreference, microdata, style_usage, token, views
Maybe we can create a canonical list, and use it to create a meta-issue to track module-specific issues that address this?
Comment #226
brianV CreditAttribution: brianV commentedI've begun a Google doc at https://docs.google.com/spreadsheet/ccc?key=0AjNi8SoLpdaQdEFjaV9jY3ZwYkl... which tracks the modules reported which need to be adjusted to take advantage of this patch. So far, includes entries from #224 and #225.
Please update it with any modules you are aware of which will need to be updated to take advantage of this patch.
Comment #227
becw CreditAttribution: becw commentedI don't see broad usage of
field_info_fields()
andfield_info_instances()
without arguments in the contrib modules that are installed on the sites I tested against. On the first site, the following modules call one of those functions without arguments outside of the install hooks:These modules have one or more calls to the field_info_(fields|instances) functions without arguments, but none of the calls are in the execution path on my node pages.
Comment #227.0
becw CreditAttribution: becw commentedformatting
Comment #228
yched CreditAttribution: yched commented@becw: field_info_instances($entity_type) (one single argument) is also impacted.
Just as a quick test, I'd suggest adding this at the beginning of field_info_instances() :
and re-test the "patched, cached load" part - some stuff might stop working as they should of course, but just to get an idea.
Thanks for the extensive test, BTW !
@DamienMcKenna, @BrianV:
Sounds like an excellent idea !
Bottomline is: there should be very few real cases for iterating on the list of *all* fields and instances in a "regular" page flow (maybe less true while rebuilding some cached data or participating in an info hook).
Cheap:
- field_info_field_map()
- field_info_instances($entity_type, $bundle)
- field_info_field($field_name) & field_info_field_by_id($field_id), if the field appears in a bundle on which field_info_instances($entity_type, $bundle) has already been called - otherwise it's O(n) on the number of different fields loaded.
(i.e things were optimized for the most common pattern of :
foreach (field_info_instances($type, $bundle) as $instance) { $field = field_info_field($instance['field_name']); }
)- field_info_instance($entity_type, $field_name, $bundle) - same as above, cheap if comes after field_info_instances($entity_type, $bundle), O(n) if not.
Costly:
- field_info_fields(), field_info_field_by_ids()
- field_info_instances(), field_info_instances($entity_type) - extra costly, also loads field data.
We might end up completely removing/renaming the "costly" functions in D8 to make that clearer, but of course that's not an option for D7 :-(
Comment #229
tim.plunkettI used the following to find
field_info_instances()
andfield_info_instances($entity_type)
:grep -nr "field_info_instances([^),]*)" *
Comment #230
DamienMcKenna@tim: Nice regex :)
I identified a few more modules with that: Display Suite, References Manager (that's mine), UserReference URL.
Comment #231
DamienMcKennaSo it seems like we honestly need to first patch a few modules and then compare the numbers.
Comment #232
bojanz CreditAttribution: bojanz commentedLet's start by getting the patch committed.
We've all confirmed that it works correctly and provides memory savings even without contrib patching.
Comment #233
yched CreditAttribution: yched commentedI don't think so. We're not trying to benchmark the patch and see if the changes make sense. This has been established already, and I don't think we can squeeze more perf gains there or take a different approach.
The situation is :
- the patch (strongly) optimizes the good (and most frequent) cases.
- there are some real use cases that cannot be optimized (those that need to iterate the full list of fields or instances), but those should not be the majority, and the patch shouldn't make them perform worse.
- there is a (possibly non-minor) amount of contrib code that currently rely on that non-optimizable logic, and some / most of it could be refactored to use more performant logic.
Comment #234
yched CreditAttribution: yched commentedcrosspost :-)
Comment #235
David_Rothstein CreditAttribution: David_Rothstein commentedI looked at the interdiff between #144 and #210 and it looks great. Just a couple comments; none are commit blockers except (possibly) the first one:
Why did this switch from an inner join to a left join? (In practice it might not matter much, but inner join looked correct here.)
The new patch seemed to introduce lots of these (both in new code and code that already existed in the earlier patch)... we should clean that up in a followup.
Everywhere the patch says that, we should probably switch to refer to the change notification (once one exists).
***
Meanwhile, as for all the testing people have been doing, that looks really great. I get the overall impression that the patch is behaving as expected, but there are a few outliers in some of the reports above...
It actually seems like it's been a long while since anyone did any "official" benchmarking (like with ApacheBench, to get statistically significant results) in this issue; not sure if anyone has done any on a recent patch. I don't think it's a requirement for commit necessarily, but I'd be very curious (and perhaps curious enough to do it myself) what the results are of a few "theoretical" extremes. For example:
My understanding is we expect the patch to provide a big memory improvement for #1 (and maybe even an improvement in page execution time too) and not much of an effect either positive or negative for #2 and #3...
Comment #236
David_Rothstein CreditAttribution: David_Rothstein commentedWhile looking this over I also tried to track things that aren't in D8 yet, since we should make sure there are followups for all of them.
Here's what I found:
Doesn't seem to be in D8... is there a followup issue yet?
None of this seems to be in D8... is there a followup issue yet?
By the way... is "which is uncached" actually correct here?
As mentioned in an earlier comment, doesn't seem to be in D8 yet. I'm just going to repurpose #1400256: Tests for: $info['fields'], not always set? for that after the patch here goes in.
Comment #237
hass CreditAttribution: hass commentedLinkchecker calls field_info_fields(), too. Fixed in #1880652: Refactor _linkchecker_parse_fields() and _linkchecker_replace_fields() to be less expensive.
Comment #238
geerlingguy CreditAttribution: geerlingguy commentedAs far as the D8 issues go, I've opened #1880682: Convert dynamic query in FieldInfo::getFieldMap() to static query for the query, but maybe we could wrap the other changes in another issue?
Comment #239
Jorrit CreditAttribution: Jorrit commentedandypost is right in #5:
file_get_file_references()
needs to be rewritten. What it basically needs is afield_info_field_by_type()
kind of function to retrieve all fields of a certain type. I also need this for the Video module, where I do something similar asfile_file_download()
invideo_file_download()
.Comment #240
yched CreditAttribution: yched commented- Thanks @geerlinguy for the fixes / rerolls !
re @David_Rothstein #236:
- Correct, the "(which is uncached)" comments are not relevant anymore.
Fixed in attached patch.
- Will create a D8 issue to forward-port the other bits.
re @Jorrit #239:
- file_get_file_references() : Right. Created #1880884: Optimize field / instance loading in file_get_file_references() .
- field_info_field_by_type() : I'm not sure we'll provide that, but the new field_info_field_map() would get you 90% of the way there.
Comment #241
yched CreditAttribution: yched commentedCrap, forgot the patch.
Interdiff with #238.
Comment #242
Jorrit CreditAttribution: Jorrit commented#240: field_info_field_map() is sufficient, you're right about that. The changes are small.
Comment #243
marcingy CreditAttribution: marcingy commentedHave tested this on a large drupal site with many nodes and with 295 fields and 657 field instances and am seeing a 5%-10% reduction in memory usage. No other issues with the site were uncovered so this certainly seems to do what it says on the tin.
Comment #244
yched CreditAttribution: yched commentedback to RTBC ?
Comment #245
mgifford#241: field-info-cache-1040790-240-D7.patch queued for re-testing.
Comment #246
David_Rothstein CreditAttribution: David_Rothstein commentedOK, this looks good - thanks everyone! Committed to 7.x: http://drupalcode.org/project/drupal.git/commit/bd1dc1f
This needs the placeholder change notification at http://drupal.org/node/1880666 filled in.
***
To be extra certain this was OK, I decided to run benchmarks for the scenarios I listed in #235 before committing this, specifically:
All the scripts I used and the ApacheBench results are attached to this comment. I was getting some large standard deviations but overall the patch seems to behave as desired. Scenario 2 did have a statistically significant increase in page execution time of a few percent, while the others either stayed the same or got better. Memory results, meanwhile, were consistent with large improvements that others have reported for scenario 1 (PHP peak memory dropped from 33.25 MB to 24 MB with the patch) and no real change for either scenario 2 or scenario 3.
So, I felt this was committable because the kinds of sites that are likely to care about a few percentage points page execution time increase (as seen in scenario 2) are also least likely to find themselves in that scenario in the first place (they are more likely to be large and complex and therefore benefit the most from the other improvements in this patch). Overall, therefore, it seemed like a big win for Drupal 7.
We still have over a week to go until the next release window (February 6) in case anyone finds anything wrong with this in the meantime; I'm also toying with just waiting until March (for other reasons besides this issue, but for this issue it would give some time for people to write patches for contrib modules to optimize their behavior for this change and get more sites over to something resembling scenario #1).
Thanks again!
Comment #247
David_Rothstein CreditAttribution: David_Rothstein commentedAs for things that stil need to be forward-ported to Drupal 8, we have issues for two of them:
#1400256: Tests for: $info['fields'], not always set?
#1880682: Convert dynamic query in FieldInfo::getFieldMap() to static query
I think we still need an issue for porting #236.2 to Drupal 8 though (with the "which is uncached" part removed).
Comment #248
yched CreditAttribution: yched commentedSorry for the delay, been stuck out of the issue queue lately, slowly getting back.
Yay for the commit !
Posted a change notification in http://drupal.org/node/1915646.
I tried to be as clear and precise as I could, but the result is probably too verbose - could use a proof-reading and possibly some edits.
[edit: deleted previous record (swentel)]
Comment #249
podarokYay!
At last!!!
Thanks!!!!!
Comment #250
yched CreditAttribution: yched commentedUpdating tags.
Comment #251
yched CreditAttribution: yched commentedOuch - the placeholder change notification node was linked in several place in the code.
But I superbly ignored it, created a new change notification, and we deleted the old one.
To avoid adding confusion here, patch posted in #1920340: Update URL of change notification for "_field_info_collate_fields() memory usage".
Comment #252
David_Rothstein CreditAttribution: David_Rothstein commentedThanks, I'll get that one committed soon!
The change notification looks pretty good to me. I went through it a little earlier and changed all references from Drupal 7.20 to Drupal 7.21 (since Drupal 7.20 was a security release only).
Making the same change here too...
Comment #253
David_Rothstein CreditAttribution: David_Rothstein commentedHm, thought I changed the tag... trying again.
Comment #254
yched CreditAttribution: yched commented@David_Rothstein :
Yup, posted a D8 patch in #1922892: Last missing bit from D7 "_field_info_collate_fields()" patch
Comment #256
David_Rothstein CreditAttribution: David_Rothstein commentedFixing tags since Drupal 7.21 only contained a fix to deal with fallout from the Drupal 7.20 security release. Hopefully this is really the last time I do this and this will be released in Drupal 7.22 for real :)
Comment #257
David_Rothstein CreditAttribution: David_Rothstein commented.
Comment #258
pounardSomething bothers me a bit with this patch: in certain edge cases (complex sites with *a lot* of fields) it makes *a lot* of cache backend queries. Don't bother trying to fix this, in this case I guess memory is more important than the cache backend. I just wanted to point that this optimisation does have a downside. I just wanted to say it.
Comment #259
yched CreditAttribution: yched commented@pounard: yes, it's been clear from the start that there would be tradeoffs - there has to be.
Not sure what qualifies as "*a lot* of cache backend queries", but the code that got committed results in one cache query per entity bundle involved in the page request (instead of one single cache query previously). That's not strictly related to the number of overall fields & instances on the site, but rather to the "complexity" of the page (as in # of different "kinds of stuff" loaded on for the page).
For sites with many fields and instances the memory gain should definitely outweight the cache reads. The bad case would be pages involving lots of different entity types & bundles, on sites with a moderate # of fields, but this was deemed a reasonable tradeoff - i.e the perf loss in that case would be limited, and more likely happen on sites where performance is probably not too much of a critical issue, while the perf gains are huge for sites with lots of fields.
Comment #260
pounardHere is a simple scenario: using profile2 to store complex user profiles, with many profiles carrying many fields that ends being loaded because you have an authenticated user, viewing a node with some node references or entity references (or simply a product page using commerce) with a cart block on the page and some other nice blocks showing entities: you can easily end up loading half of the fields on the same page. This is actually not rare, but a pretty common Drupal scenario.
But yes, I understand the tradeoff, and that's still a good one.
Comment #261
plachI think the following snippet from the change notice
should become
Please, confirm :)
Comment #262
yched CreditAttribution: yched commentedIndeed. Fixed. Thanks !
Comment #263
hass CreditAttribution: hass commentedCan someone help with #1953804: field_tags is missing in $entity if field is empty on "node/add", please?
Comment #263.0
hass CreditAttribution: hass commentedAdd link to Google spreadsheet which tracks modules to be updated to take advantage of this patch.