Download & Extend

_content_type_info() memory usage improvements

Project:Content Construction Kit (CCK)
Version:6.x-2.x-dev
Component:content.module
Category:bug report
Priority:major
Assigned:catch
Status:reviewed & tested by the community
Issue tags:memory, Performance

Issue Summary

On a D6 site I'm profiling, the _content_type_info() cache takes between 3.5-4mb of memory (as measured by xhprof). This is because it loads all the information for all content types, fields and field types even on requests when only a fraction of that information might be needed by the request.

The following patch isn't done yet, but starts to allow the various functions like content_types() to maintain their own caching - this means (at the moment) a cache entry for widgets, field types, and for each content type individually.

On a Drupal 6 site that I'm profiling, all that was needed on a node/n page was two calls to content_types() for two node types, and one call to _content_field_types().

Before - _content_type_info() - 3,636,504 bytes (peak memory usage is actually 4mb)
After: _content_field_types() 476,696 + content_types() 60,000 - so about 1/7th the memory usage.

On pages with a lot of node types or similar this will mean additional calls to cache_get() where there was previously one, but that ought to be offset by having a lot less data to transfer, and not needing to unserialize() such a large array.

@todos still here are adding $reset params to these functions, and also doing similar for some of the other functions that call _content_type_info() frequently.

AttachmentSize
cck_memory.patch2.74 KB

Comments

#1

Status:needs work» needs review

Updated patch adds _content_widget_types() and $reset arguments for all the functions now with their own caching, also a couple of bugfixes from testing the cck UI.

I was going to look at removing and/or splitting the caching in _content_field_info() itself, but it looks like it's called directly by a number of contrib modules too - see http://drupalcontrib.org/api/drupal/contributions--cck--content.module/f... so for now just trying to deal with the memory issue relatively unobtrusively rather than change any extra.

AttachmentSize
cck_memory_1009596.patch 5.06 KB

#2

Glad to see others are experiencing this problem also.
We're seeing a much larger memory usage than 4mb - almost 160mb!

Will give the patch a test.

#3

Fixes one logical bug in this. I've been running this on a large production Drupal 6 site for a while now with no side-effects (apart from the bug this patch fixes which broke static caching for one function).

AttachmentSize
cck_memory_1009596.patch 5.06 KB

#4

I've tried the patch from #3 - and attached the callgraphs before and after

Before the patch

_content_type_info 184,419,888     94.6%

After the patch

_content_type_info 184,471,168      72.3%
content_fields           61,814,776          24.2%

So for me the content_fields is reducing memory usage by about a third but _content_type_info remains at 184M

AttachmentSize
before-patch.png 170.09 KB
after-patch.png 149.49 KB

#5

This patch doesn't actually reduce the memory usage from _content_type_info(), it just avoids it being called by adding caching to the functions that call it. Can you post a backtrace of where _content_type_info() gets called from?

Also I'm a bit shocked that _content_type_info() is taking 184M, could you do file_put_contents('myfile.txt', print_r(_content_type_info(), 1)); and post the contents here?

#6

Unfortunately we can't post 40mb txt files to d.o so I've uploaded it to my dropbox http://dl.dropbox.com/u/1286874/debug.txt

Scanning through it I found the following array:

[field_literature] => Array
                (
                    [field_name] => field_literature
                    [type_name] => page
                    [display_settings] => Array
                        (
                            [weight] => 12
                            [parent] =>
                            [label] => Array
                                (
                                    [format] => hidden
                                )

                            [teaser] => Array
                                (
                                    [format] => default
                                    [exclude] => 0
                                )

                            [full] => Array
                                (
                                    [format] => default
                                    [exclude] => 0
                                )

                            [4] => Array
                                (
                                    [format] => default
                                    [exclude] => 1
                                )

                        )

                    [widget_active] => 1
                    [type] => filefield
                    [required] => 0
                    [multiple] => 1
                    [db_storage] => 0
                    [module] => filefield
                    [active] => 1
                    [locked] => 0
                    [columns] => Array
                        (
                            [fid] => Array
                                (
                                    [type] => int
                                    [not null] =>
                                    [views] => 1
                                )

                            [list] => Array
                                (
                                    [type] => int
                                    [size] => tiny
                                    [not null] =>
                                    [views] => 1
                                )

                            [data] => Array
                                (
                                    [type] => text
                                    [serialize] => 1
                                    [views] => 1
                                )

                        )

                    [list_field] => 0
                    [list_default] => 1
                    [description_field] => 1
                    [widget] => Array
                        (
                            [0] =>
                            [1] =>
                            [2] =>
                            [3] =>
                            [4] =>
                            [5] =>
                            [6] =>
                            [7] =>
                            [8] =>
                            [9] =>
                            [10] =>
                            [11] =>
                            [12] =>
                            [13] =>
                            [14] =>
                            [15] =>
                            [file_extensions] => pdf
                            [file_path] =>
                            [progress_indicator] => bar
                            [max_filesize_per_file] =>
                            [max_filesize_per_node] =>
                            [16] =>
                            ...
                            [262143] =>
                            [label] => Literature
                            [weight] => 0
                            [description] => Provide a PDF document which contains further printed information about the event.
                            [type] => filefield_widget
                            [module] => filefield

Where ... is there is an extra row for each number. This happens 3 times in the array stack. Hopefully this should help.

#7

I see it in the dump, that's definitely the source of your issue, no idea how it could get like that though.

#8

Status:needs review» reviewed & tested by the community

#3 looks good to me. I was seeing _content_type_info get() called 4,356 times on a particular view, adding 3M of memory and 341ms exclusive wall time.

Patch removes this overhead by avoiding unnecessary calls to _content_type_info which seems very sensible to me.

#9

Hello, is ready this patch to be used with cck 6.3.x?

It's ready for in a production site or it's only a work around?

Thanks.

Oskar

#10

+1

#11

Hm, better late than never I guess :-/. Sorry for the delay, CCK is pretty much abandoned right now. Looking back at the code in CCK is somewhat embarrassing, BTW...

So, the two layers of cache added in the patch (content_types(), content_fields() & friends each have their own static cache + persistent cache entry, and defer to _content_type_info() on cache miss, which has its own static + persistent cache) are far from ideal, but as @catch points out, a deeper refactoring getting rid of _content_type_info() would potentially break too many things.

Attached patch adds a missing call to _content_field_types(TRUE) within content_clear_type_cache().

Other than that, I'm probably OK committing this, but I'd welcome some fresh confirmation that it's been working without a hitch on production sites. I'm a bit nervous rolling a release with such central changes on a module that is basically without active maintainers.

#12

Looks like your attachment didn't arrive?

I've been running this on production on at least one large Drupal 6 site for the past year with no side effects.

#13

Sorry, here's the patch + interdiff.

AttachmentSize
cck_memory-1009596-11.patch 5.31 KB
interdiff.txt 965 bytes

#14

I'm slightly worried by the possible explosion of the execution time for content_clear_type_cache().
Each one of the new partial caches gets re-serialized and re-stored on cache clear (including the "list of all fields", which, as we know, is quite large).
It would probably be best to move the cache logic for those to "just clear cache entries without repopulating on the fly".

No time for a patch tonight, though.

#15

Just a note I tested this on a sandbox site with a D6 client today, hitting the front page:

Before:

_content_type_info(): 11.2mb

After:

content_fields() 3.5mb
_content_field_types() 900k
content_types() 1mb

So that's around 5mb saved on that site, or just under 50% of the memory use from that function.