Add version number to cache files to enable forced expiration of client's browser cache
| Project: | Javascript Aggregator |
| Version: | 5.x-1.x-dev |
| Component: | Code |
| Category: | feature request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | needs review |
The filename of an aggregated JavaScript (or CSS) file is a hash of the list of files that are aggregated. If one of those files changes but all the filenames stay the same, the hash stays the same too. This is a problem if users have an older version of the aggregate file in their browser cache. A common practice on many websites is to add a version number to .js and .css files and increment it whenever those files change.
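To illustrate the problem, here is a minimal sketch that runs outside Drupal. The file list is made up, and treating `$scripts_js_files` as a plain list of paths is an assumption on my part; only the hashing expression is taken from the module.

```php
<?php
// Assumed shape: a plain list of script paths (hypothetical example values).
$scripts_js_files = array('misc/jquery.js', 'misc/drupal.js');

// Name computed before any file is edited.
$name_before_edit = md5(serialize($scripts_js_files)) . '.js';

// Suppose misc/drupal.js is edited here. Its path is unchanged, so the
// input to md5() is byte-for-byte the same as before.
$name_after_edit = md5(serialize($scripts_js_files)) . '.js';

// The two names are identical, so a browser holding the old aggregate
// has no reason to fetch the new one.
var_dump($name_before_edit === $name_after_edit);
```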
Looking at the code for the aggregation module, I realized that this version number could be used to create a different hash without actually changing any of the source files by simply changing this line:
<?php
$filename = md5(serialize($scripts_js_files)) .'.js';
?>
into this:
<?php
$filename = md5(serialize($scripts_js_files) . variable_get('javascript_aggregator_version_number', 0)) .'.js';
?>
The same approach can be used to force a name change of the aggregated CSS files in includes\common.inc:
<?php
$filename = md5(serialize($types)) .'.css';
?>
gets changed into:
<?php
$filename = md5(serialize($types) . variable_get('javascript_aggregator_version_number', 0)) .'.css';
?>
I guess this means that this feature should really be moved into core, but for production sites running D5, this is a quick solution that seems to work. I invite comments from others about this.
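For anyone who wants to see the effect without patching a site, here is a rough standalone sketch. The `variable_get()`/`variable_set()` functions below are array-backed stand-ins for the Drupal ones so that the script runs on its own, and `example_versioned_js_filename()` and the file list are hypothetical names used only for illustration.

```php
<?php
// Stand-ins for Drupal's variable_get()/variable_set(), backed by a plain
// array so this runs outside Drupal. They are not the real implementations.
if (!function_exists('variable_get')) {
  $GLOBALS['example_vars'] = array();
  function variable_get($name, $default) {
    return isset($GLOBALS['example_vars'][$name]) ? $GLOBALS['example_vars'][$name] : $default;
  }
  function variable_set($name, $value) {
    $GLOBALS['example_vars'][$name] = $value;
  }
}

// Hypothetical helper wrapping the patched line.
function example_versioned_js_filename($scripts_js_files) {
  $version = variable_get('javascript_aggregator_version_number', 0);
  return md5(serialize($scripts_js_files) . $version) . '.js';
}

// Assumed shape: a plain list of script paths.
$files = array('misc/jquery.js', 'misc/drupal.js');

$name_v0 = example_versioned_js_filename($files);

// Bumping the version changes the name although no path changed.
variable_set('javascript_aggregator_version_number', 1);
$name_v1 = example_versioned_js_filename($files);

echo $name_v0 === $name_v1 ? "same name\n" : "new name\n";
```

As long as the version number is left alone, repeated calls return the same name, so the aggregate stays cacheable between page views.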
Attached is a patch for javascript_aggregator.module that patches the filename generation with the version number, and adds a version number entry to admin/settings/performance.
| Attachment | Size |
|---|---|
| javascript_aggregator.module.patch | 1.37 KB |

#1
This patch works fine for me.
But in common.inc I would propose using
<?php
if (module_exists('javascript_aggregator')) {
  $filename = md5(serialize($types) . variable_get('javascript_aggregator_version_number', 0)) .'.css';
}
else {
  $filename = md5(serialize($types)) .'.css';
}
?>
instead of just
<?php
$filename = md5(serialize($types) . variable_get('javascript_aggregator_version_number', 0)) .'.css';
?>
so you won't get into trouble if javascript_aggregator is removed.
Would it be possible to get this versioning for the CSS images too?
#2
Hi smitty,
I don't know what the policy on core depending on other modules would be. In any case, I don't think that new features for core will be accepted for D5 or D6, so this change of common.inc is purely for private use.
Note that we are not calling a function on the module javascript_aggregator, but just using the variable_get function with a default value. So if this variable doesn't exist or gets deleted, variable_get will just return the default value.
Therefore, checking if the module exists might make the code more readable, but it is not necessary for correct functioning.
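A small sketch of that fallback, runnable outside Drupal. The `variable_get()` here is an array-backed stand-in for the Drupal function, and the `$types` value is an invented example; the real structure in common.inc may differ.

```php
<?php
// Stand-in for Drupal's variable_get(), backed by a plain array so the
// sketch runs on its own. Not the real implementation.
if (!function_exists('variable_get')) {
  $GLOBALS['example_vars'] = array();
  function variable_get($name, $default) {
    return isset($GLOBALS['example_vars'][$name]) ? $GLOBALS['example_vars'][$name] : $default;
  }
}

// Hypothetical stylesheet list; the real $types array may look different.
$types = array('module' => array('modules/node/node.css' => TRUE));

// The variable was never set (or was deleted along with the module), so
// variable_get() returns the default of 0 and the line still works.
$filename = md5(serialize($types) . variable_get('javascript_aggregator_version_number', 0)) . '.css';

echo $filename, "\n";
```

Note that the default of 0 is still appended before hashing, so this name is not the same one that the unpatched line would produce; it is simply stable and valid.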
Is there a module that aggregates CSS images? I wasn't aware that this is possible. The best thing you could do is to preload the images (google it).
#3
Another alternative would be to include the current date/time (at the time of cache creation) when creating the hash of the aggregated files. This means the filename will be different every time the cache is cleared (even though the JS files themselves haven't changed). But that should be fine, as the cache is cleared on demand by the site administrator, so visitors get a new file only when the site admin wants them to.
Wouldn't that be a simple solution?
#4
@Bodo Maass: Yes, you are right with your statement about the 'module_exists'. I wasn't aware of that. Thank you for teaching me!
Well, what I meant is not the aggregation of CSS images but the adding of a version number. As far as I have seen, this is done by the Support File Cache module (http://drupal.org/project/sf_cache) - but don't ask me how it is done there! Unfortunately this module does not work for me because of http://drupal.org/node/374030.
#5
Hi ajayg,
That does sound like a good solution and would remove the need to enter a manual version number.
#6
Hi smitty,
The only way I can think of to add versioning to images is to dynamically rename them to include the version number in the filename. This would require changing all references to that file in the CSS and in the JavaScript, which could break pages that reference some images directly.
I haven't looked at the sf_cache module yet, and I don't think I need it for now. Javascript and css aggregation speeds up page loading significantly by reducing the number of requests per page, and that's my main incentive for using it. I'm not sure what the extra benefit of sf_cache is.
#7
@ajayg: If the files were named based on when they were created, the aggregator would need to know what date was used. It would just be a matter of storing it somewhere, but it does complicate things a bit. Also, if something is going to be stored, maybe it should use the files' modification times.
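One way to read the modification-time idea, as a rough sketch: mix each file's mtime into the hash, so nothing has to be stored at all. `example_mtime_js_filename()` is a hypothetical helper, `$scripts_js_files` is assumed to be a plain list of paths, and the temporary file only stands in for a real script.

```php
<?php
// Hypothetical helper: hash the paths together with their modification times.
function example_mtime_js_filename($scripts_js_files) {
  $stamps = array();
  foreach ($scripts_js_files as $path) {
    // Treat a missing file as mtime 0 instead of letting filemtime() warn.
    $stamps[$path] = file_exists($path) ? filemtime($path) : 0;
  }
  return md5(serialize($stamps)) . '.js';
}

// Demo with a temporary file standing in for a real script.
$path = tempnam(sys_get_temp_dir(), 'js');
file_put_contents($path, 'var a = 1;');
touch($path, 1000000000);
clearstatcache();
$name_before = example_mtime_js_filename(array($path));

// Edit the file; the explicit touch() simulates a later save time.
file_put_contents($path, 'var a = 2;');
touch($path, 1000000060);
clearstatcache();
$name_after = example_mtime_js_filename(array($path));

unlink($path);
echo $name_before === $name_after ? "same name\n" : "new name\n";
```

The trade-off is one stat call per source file on every page view, which may or may not be acceptable on a busy site.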
@Bodo Maass: I just applied this patch, and it's a great help, thanks! Our users were sometimes seeing problems with old JS/CSS files being loaded, especially since we set our expiration times for cached files far in the future.
I made one small change, though: I stuck the version number on the string after the MD5sum, like this:
<?php
$filename = md5(serialize($scripts_js_files)) . '_' . variable_get('javascript_aggregator_version_number', 0) .'.js';
?>
That makes it easier to tell that everything's working, and also eliminates the extremely small chance of an MD5 collision between different versions.
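A quick sketch of what the suffixed names look like. Here the version is passed in as a parameter instead of coming from `variable_get()`, purely so the example runs outside Drupal; `example_suffixed_js_filename()` and the file list are hypothetical.

```php
<?php
// Hypothetical helper: same hash as before, with the version appended
// visibly after an underscore.
function example_suffixed_js_filename($scripts_js_files, $version) {
  return md5(serialize($scripts_js_files)) . '_' . $version . '.js';
}

// Assumed shape: a plain list of script paths.
$files = array('misc/jquery.js', 'misc/drupal.js');

$name_v0 = example_suffixed_js_filename($files, 0);
$name_v1 = example_suffixed_js_filename($files, 1);

// The 32-character hash prefix is shared; only the suffix differs.
echo $name_v0, "\n", $name_v1, "\n";
```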
#8
@scottgifford
You would get the current system date/time (perhaps seconds since the epoch, which PHP easily provides) and use it to generate the filename. Why would it need to be stored or complicated? It happens only once, when the admin makes a configuration change, not on every invocation. So I am not clear on what would be more complicated.
#9
@ajayg: But the next time it ran, with the file already generated, how would it know what the generated filename was? In other words, how would it know what time the file was generated at, so it could send back the name of the already-generated file without creating it again?
#10
Here is what I am suggesting.
Instead of
<?php
$filename = md5(serialize($scripts_js_files) . variable_get('javascript_aggregator_version_number', 0)) .'.js';
?>
use the time function:
<?php
$filename = md5(serialize($scripts_js_files) . time()) .'.js';
?>
The time function returns the number of seconds since the epoch. So unless the administrator makes two changes to the same file in less than a second, we are safe (meaning almost always). So we have a unique filename every time we save the cache, without worrying about manually entering a version number.
#11
@ajayg - I understand what you're suggesting, the problem is that the filename won't stay the same between different page views.
If you look at the line of code you're talking about, in javascript_aggregator_cache, it runs every time a page with Javascript is generated. It figures out the filename, checks if it already exists, and if not caches it.
So let's say that we visit a page with Javascript around now, t=1237069068. It uses that time to generate the filename, sees that it doesn't exist, then creates it and returns a link to it. Now 1 second later, at t=1237069069, I visit the page again. It uses that time to generate another filename, which will be different from the one it generated 1 second before, sees that this new file doesn't exist, then creates it and returns a link to it. It would not cache anything for longer than 1 second, because the time used to generate the filename would change every second.
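The scenario above can be reproduced with a short sketch. The clock is passed in as a parameter rather than read from `time()` so that two page views one second apart can be simulated without waiting; `example_time_js_filename()` and the file list are hypothetical.

```php
<?php
// Hypothetical helper: same expression as the time() proposal, with the
// timestamp injected so the demo is repeatable.
function example_time_js_filename($scripts_js_files, $now) {
  return md5(serialize($scripts_js_files) . $now) . '.js';
}

// Assumed shape: a plain list of script paths.
$files = array('misc/jquery.js', 'misc/drupal.js');

// First page view, then another one a second later.
$first_view = example_time_js_filename($files, 1237069068);
$second_view = example_time_js_filename($files, 1237069069);

// The names differ, so the second view would not find the file written by
// the first and would generate a new aggregate.
echo $first_view === $second_view ? "same name\n" : "different name\n";
```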
To make the timestamp idea work, you'd have to save the timestamp used to generate the file the first time so it can use the same file the next time.
#12
Sounds like the timestamp should be stored in the version number variable instead of a user-supplied version number. That way the timestamp is the time the cache was cleared, not the time when the aggregator runs.
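Roughly, that could look like the sketch below. The `variable_get()`/`variable_set()` functions are array-backed stand-ins for the Drupal ones, `example_on_cache_clear()` is a hypothetical hook point (I have not checked where the module actually clears its cache), and the timestamp is passed in so the demo is repeatable; real code would pass `time()`.

```php
<?php
// Stand-ins for Drupal's variable_get()/variable_set() so this runs
// outside Drupal. Not the real implementations.
if (!function_exists('variable_get')) {
  $GLOBALS['example_vars'] = array();
  function variable_get($name, $default) {
    return isset($GLOBALS['example_vars'][$name]) ? $GLOBALS['example_vars'][$name] : $default;
  }
  function variable_set($name, $value) {
    $GLOBALS['example_vars'][$name] = $value;
  }
}

// Hypothetical: called once whenever the admin clears the cache.
function example_on_cache_clear($now) {
  variable_set('javascript_aggregator_version_number', $now);
}

// Hypothetical helper: runs on every page view and reads the stored stamp.
function example_stamped_js_filename($scripts_js_files) {
  return md5(serialize($scripts_js_files) . variable_get('javascript_aggregator_version_number', 0)) . '.js';
}

$files = array('misc/jquery.js', 'misc/drupal.js');

example_on_cache_clear(1237069068);
$view_1 = example_stamped_js_filename($files);
$view_2 = example_stamped_js_filename($files); // later page view, same name

example_on_cache_clear(1237069999);
$view_3 = example_stamped_js_filename($files); // new name after the next clear

echo $view_1 === $view_2 ? "stable between views\n" : "unstable\n";
echo $view_2 === $view_3 ? "unchanged after clear\n" : "renamed after clear\n";
```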
#13
Sure, that would work. It does change the effect of cleaning the cache quite a bit, since changing the filename will cause all browsers to reload the file even if it hasn't changed, but when a site admin presses the "clear cache" button to push a fix out to their users, I think that's what they are likely to want.
The other use of clearing the cache is to clean up old cruft that's built up, in which case you wouldn't want a filename change. It would be useful to have a way to clean old files up without changing the filename. I haven't really looked at what this module does to handle that now; maybe it would just take care of itself.