
Caching does not properly respect protocol (a problem when dealing with https)

Project: Drupal core
Version: 8.x-dev
Component: cache system
Category: bug report
Priority: normal
Assigned: Unassigned
Status: active

Issue Summary

Caches such as the block cache, filter cache, and field cache do not take the protocol (http vs. https) into account when storing items. This can cause a couple of problems.

  1. If a cached item was generated under SSL and the site does not have a valid SSL certificate, a visitor who hits the non-SSL version of the page will not see the assets the item includes, because the visitor has not accepted the invalid certificate.
  2. If a cached item was generated on a non-SSL page and is then displayed on an SSL page, the browser will show an SSL error for including non-SSL assets in the page.

Code to come soon.

Comments

#1

Priority: normal » major

#2

Priority: major » normal

To clarify the problem, here's an example: the filter cache may return content containing a rendered IMG tag (rendered, for example, by the media module) that references a non-https URL if it was cached while being viewed over http. This would lead to mixed content warnings for the page.
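The example above can be reproduced outside Drupal with a small check. This is only a sketch in Python, not Drupal code: `find_mixed_content` and the `example.com` markup are hypothetical names invented for illustration. It scans a cached HTML fragment for `src` attributes that point at plain http URLs, which is what a browser would flag when the fragment is served on an https page.

```python
from html.parser import HTMLParser


class InsecureSrcFinder(HTMLParser):
    """Collects src attribute values that use plain http://."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "src" and value and value.lower().startswith("http://"):
                self.insecure.append(value)


def find_mixed_content(markup):
    """Return every http:// src found in a cached HTML fragment."""
    finder = InsecureSrcFinder()
    finder.feed(markup)
    finder.close()
    return finder.insecure


# Hypothetical fragment that was cached while the page was viewed over http.
cached = '<p><img src="http://example.com/files/a.png" alt=""></p>'
print(find_mixed_content(cached))
```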

#3

After working on an alternative database caching backend as an interim option, I've come to the conclusion that core shouldn't attempt to automatically check whether a cache entry should carry a marker to signify https. It can work as a workaround, but it isn't clean.

Instead, anything that caches HTML that contains, or may contain, absolute links should build its cache ID so that it reflects https when present. We should fix core to work this way, document this in the handbook/API docs, and file bugs against any modules we know violate this.
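A rough illustration of the idea, written in Python rather than against Drupal's cache API: the in-memory `cache` dict, `scheme_aware_cid`, `render_cached`, and `build_image` are all hypothetical names, and the `:http` / `:https` suffix is just one possible convention. The point is only that the caller, which knows its output may contain absolute links, folds the scheme into the cache ID itself.

```python
# Stand-in for a cache bin; a real backend would be a database or memcache.
cache = {}


def scheme_aware_cid(base_cid, is_https):
    """Append the request scheme to a cache ID (assumed suffix convention)."""
    return f"{base_cid}:{'https' if is_https else 'http'}"


def render_cached(base_cid, is_https, build):
    """Return cached markup, building it once per scheme."""
    cid = scheme_aware_cid(base_cid, is_https)
    if cid not in cache:
        cache[cid] = build(is_https)
    return cache[cid]


def build_image(is_https):
    """Hypothetical renderer that emits an absolute URL."""
    scheme = "https" if is_https else "http"
    return f'<img src="{scheme}://example.com/files/a.png">'


http_markup = render_cached("filter:node:1", False, build_image)
https_markup = render_cached("filter:node:1", True, build_image)
```

With this approach the http and https variants occupy separate cache entries, so neither request can be served markup built for the other scheme. The cost is up to twice the storage for items that vary by scheme.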

If that works as an idea I can start working on the patch.

#4

For the media example, I think a good solution is to avoid the problem and use a schemeless URI. I haven't seen a lot of info about them; I encountered them at #1180646: Linking to google_service.js with https protocol when SSL enabled. They seem good to go, except for referencing style sheets: http://stackoverflow.com/questions/2181207/is-it-safe-to-use-schemeless-....
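For readers who have not seen schemeless (protocol-relative) URIs, here is a minimal sketch of the transformation in Python. `to_protocol_relative` is a hypothetical helper, not something Drupal or the media module provides; it drops only the `http:` or `https:` prefix and leaves everything else in the URL alone.

```python
from urllib.parse import urlsplit, urlunsplit


def to_protocol_relative(url):
    """Strip the scheme from an absolute http(s) URL; leave other URLs as-is."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return url
    # An empty scheme with a netloc serializes as //host/path?query#fragment.
    return urlunsplit(("", parts.netloc, parts.path, parts.query, parts.fragment))


print(to_protocol_relative("https://example.com/files/a.png?itok=abc"))
```

A browser resolves the resulting `//example.com/...` reference against the scheme of the page that embeds it, so the same cached markup works on both http and https pages.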

#5

@drumm Drupal 7 core uses an absolute URL for image style URLs, which is where the media module gets the absolute URL on images. This is core behavior. Going schemeless would be nice and maybe something we should try getting into Drupal 8 and Drupal 7 contrib.

#6

To add a little more detail and context, file_create_url() takes internal Drupal paths (e.g., admin/modules), that is, those that don't start with a full domain or a /, and prepends $base_url to them. This is how CSS and JavaScript end up with a full URL when rendered.
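A simplified model of the behavior described above, written in Python for illustration. This is not Drupal's implementation: `create_url` is a hypothetical stand-in that covers only the case discussed here, and it ignores everything else the real function does. It shows why the scheme of `$base_url` at render time ends up baked into the output.

```python
def create_url(path, base_url):
    """Prefix relative internal paths with base_url; pass other paths through."""
    if "://" in path or path.startswith("/"):
        # Already a full URL, or rooted at /: returned unchanged.
        return path
    return base_url.rstrip("/") + "/" + path


# The same internal path yields different markup depending on the request.
over_http = create_url("misc/drupal.js", "http://example.com")
over_https = create_url("misc/drupal.js", "https://example.com")
print(over_http, over_https)
```

Whichever of the two results is rendered first is the one that lands in the cache, which is how a scheme-specific URL gets served to requests on the other scheme.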

#7

@drumm I wrote up a description of how to implement protocol relative URLs in D7. That advice was very useful. http://engineeredweb.com/blog/11/12/protocol-relative-urls-drupal-7

#8

HTTPS isn't the only case where this can rear its head. If, for example, you want to route authenticated users through a separate domain for added security (via HTTP authentication, VPN, or something else), then you are also likely to see cached data containing the private domain. This can lead to random HTTP authentication popups, or just plain missing content.

#9

Confirming #8: images uploaded via SSL are NOT visible on the non-SSL site. The link to the images is https://site/image.png while the request is done over http://site/image.png (and we run on a self-signed certificate... boom... error). So, let's do some heavy URL rewriting for now...
