Problem

text.module runs check_markup() in text_field_load() and caches the 'safe' string result in the field on entity load. The problem here is that with Media enabled, this means that if you are switching between HTTP and HTTPS, editing files, etc, that the rendered file entity is now cached. It seems to me that it causes more problems than it the performance it saves by enabling caching.

Proposoal

Disable the caching for the media filter since it depends on data and other entities that cannot be cached.

Related issues:

#944296: After a cache clear, same image embedded and attached views in the same size for one page load
#1352182: Media assets should have relative paths for wysiwyg editors
#1120188: Default Theme intermittently replaces Sub Theme
#1922308: Paths not correct in multi site enviroment
#1266064: The media filter allows infinite recursion via media_token_to_markup()

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Dave Reid’s picture

Issue summary: View changes

Adding 1922308

aaron’s picture

Status: Active » Needs review
FileSize
418 bytes

Here is the patch with the cache disabled.

thamba’s picture

Patch at #1 by aarom seems to work.
Here is the same patch for the 7.x-2.0-unstable version.

Status: Needs review » Needs work

The last submitted patch, 1984424-disable-filter-caching-2.patch, failed testing.

thamba’s picture

This patch should go into its own issue against the 2.0-unstable7 version. Here is the issue for the correct patch: https://drupal.org/node/2017761

aaron’s picture

Status: Needs work » Needs review
aaron’s picture

Can I take that as RTBC?

aaron’s picture

Status: Needs review » Reviewed & tested by the community
thamba’s picture

A quick note to anyone trying this patch out:

You need to turn off the "Convert media tags to markup" filter in each of your textformats under admin/config/content/formats. Then turn it on again and save.

I did not get the patch working for a long time with or without cache enabled until I did the following above. This could save time for anyone out there trying to solve this issue.

aaron’s picture

Status: Reviewed & tested by the community » Needs work

That sounds like we need to do that in an update, then.

kaidjohnson’s picture

Status: Needs work » Active

I strongly disagree with turning off caching of the media filter. I have run into several scenarios in the past 6 months where a 'cache' => FALSE setting in hook_filter_info() has brought a production site to its knees, causing all sorts of unexpected slowness and out of memory errors. To quote the hook_filter_info documentation:

cache (default TRUE): Specifies whether the filtered text can be cached. Note that setting this to FALSE makes the entire text format not cacheable, which may have an impact on the site's overall performance.

Please see the latest patch here: #1266064: The media filter allows infinite recursion via media_token_to_markup()

aaron’s picture

Status: Active » Needs review
FileSize
1.25 KB

Here we go after some hair-pulling.

slashrsm’s picture

Patch looks good and it does what we wanted it to do.

I can, however, also agree with @kaidjohnson. Filtering text on every page request just because of this does seem a bit too much. Would it be possible to catch events on files (update, ...) and clear appropriate item from cache_field when needed? We should be able to do that since we have usage information for files that are embedded into WYSIWYG.

Thoughts?

aaron’s picture

That would be nice, except that we do not have any context information from the WYSIWYG. For example, if we have embedded a file into the body of node 21, and then later make a change to that file, we have no way currently to clear the cache for that body's text field. Although I suppose that we could clear all the caches every time we made a change to a file, but that seems a bit of an overkill to me. I guess it's six of one, half a dozen of the other.

aaron’s picture

slashrsm’s picture

Cache ID looks like this:

$cache_id = $format->format . ':' . $langcode . ':' . hash('sha256', $text);

Could we do this?
- hook on file_update
- check file usage for this file to get list of entities that use this file
- fetch list of textarea fields on those entites
- use field's content to calculate cache ID
- clear it

It is a bit funky, but probably still less overhead than making entire text format uncachable. We could, alternatively, flush all items that belong to formats that use media filter. Still less overhead than running filters over and over again, I assume.

aaron’s picture

That sounds reasonable.

aaron’s picture

Status: Needs review » Needs work
catch’s picture

File field does it's own caching of check_markup(), so the markup is cached in cache_field instead. That would cut out some of the work in #15.

catch’s picture

Not file field, text module.

Dave Reid’s picture

Pushing to beta blocker for now.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
805 bytes

This patch purges the field cache for entities with references in a file's usage list when the file is updated. It takes the advice of catch and implements the idea in #15.

There should be a very high threshold for making a filter uncacheable because of the knock-on effect it has for an entire format. To support contextual links on embedded file entities, I use hook_filter_info_alter to replace media_filter's process callback to a function that rewrites the media token into something that Inline API picks up when the entity is viewed. This allows the format to remain cacheable but doesn't embed rendered entities in cached data that doesn't vary by permission. (See https://github.com/bangpound/bangpoundmedia)

Devin Carlson’s picture

An updated version of #21 to use file_entity_invalidate_field_caches() and to remove the existing calls to media_filter_invalidate_caches() which essentially does the same thing as we're trying to accomplish but with the {media_filter_usage} table.

The patch also includes an update function to remove the table.

Devin Carlson’s picture

An updated version of #22 that removes the cache clearing from Media since it now exists in File entity.

Devin Carlson’s picture

Forgot to merge in the latest changes.

Devin Carlson’s picture

Status: Needs review » Fixed

Committed #24 to Media 7.x-2.x.

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary: View changes

Added 1266064 to related