Race condition in cache file creation
| Project: | Boost |
| Version: | 5.x-1.x-dev |
| Component: | Caching logic |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | needs review |
There is a race condition in creating the cache file. My read of the code shows it is simply opening the file to write, then writing the contents. There doesn't appear to be any locking to prevent multiple writers, or any attempt to prevent other processes from reading the file while one is writing to the file at the same time.
As it stands, two (or more) processes could be writing to the same file at the same time, resulting in a jumbled mess, or another process could try to serve up the cached file before it is fully written. The simple test of testing if the file exists before creating it is insufficient because the two steps are not atomic.
The simplest thing to do would be to write the file with the PID appended to the name, then rename it to the actual file name after successfully closing it for writing. This way, even if multiple processes try to create the same cache file, it will always be correct (never will two writers try to write the same physical file), and readers will never see a partially written file as it is atomically put in place by the rename.

#1
I went ahead and made a patch to make it atomic as described above.
Index: boost.api.inc
===================================================================
--- boost.api.inc (revision 180)
+++ boost.api.inc (working copy)
@@ -134,8 +134,14 @@
if (($filename = boost_file_path($path))) {
_boost_mkdir_p(dirname($filename));
if (!file_exists($filename) || boost_file_is_expired($filename)) {
- if (file_put_contents($filename, $data) === FALSE) {
- watchdog('boost', t('Unable to write file: %file', array('%file' => $filename)), WATCHDOG_WARNING);
+ $tempfile = $filename . getmypid();
+ if (file_put_contents($tempfile, $data) === FALSE) {
+ watchdog('boost', t('Unable to write file: %file', array('%file' => $tempfile)), WATCHDOG_WARNING);
+ } else {
+ // put the temp file in its final location
+ if (rename($tempfile,$filename) === FALSE) {
+ watchdog('boost', t('Unable to rename file: %file', array('%file' => $tempfile)), WATCHDOG_WARNING);
+ }
}
}
I don't know how to make an entry into the translation table for this new error message.
#2
Thanks for contributing the patch! Applied and works.
+1
#3
This patch is working for me. Marking as reviewed and tested.
#4
+1 works for me too and is a critical patch in my opinion.
#5
Looking into this.
#6
Considering using the
LOCK_EXargument to file_put_contents() (but see PHP bug #43182) or just basing the locking on flock(). Thoughts?#7
@arto - that bug has long since been fixed according to that page. I'd say that file_put_contents() with LOCK_EX is a reasonable solution. You *might* want to declare php 5.1 as a requirement in the module since thats when LOCK_EX became available.
#8
@Moshe, well, that PHP bug was apparently fixed. There is no resolution on file other than the automatic "fixed in CVS" at the end (so who knows which option was chosen or what was fixed, exactly), and the version number on the ticket states PHP 5.2.4 in any case, which throws some doubt about relying on consistent
LOCK_EXbehavior prior to that.Also, while I've declared PHP 5.2+ a dependency on all my other modules, I'm somewhat more loath to do so with Boost as the module is most generally useful exactly to people who are running their sites on crappy web hosts. In other words, I need to think about this some more... maybe we should go with the original poster's patch for the time being.
#9
The original patch is fine with me as well.
#10
OK, will test it and commit...
#11
Rerolled for 5.x-1.x-dev.
#12
Updated for 6.x as it is still an issue. Improved watchdog reporting so admin has a better chance of fixing the error if it's a reoccurring issue. Split the file writing off in its own function.
#13
added more reporting
#14
committed with a slight fix to the watchdog call in the above patch
http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/boost/boost...
#15
I have a question about this topic as we have noticed the same conditions on both our boost D5 and D6 installs.
From what I understand that the LOCK_EX argument applies an exclusive lock on the file, closing it to reading and writing. My question is what happens when a script tries to read or write to that file while locked?
For instance, if a page is requested and the cache is being written at the exact time that a second page request is made, would Apache try to serve up a locked file, thus sending an error to the end user?
Also what if 2 concurrent page requests are made and both try to write at close to the same time. Does the process that took slightly longer error out when it tries to write?
It seems like ideally, any process that comes across a locked file waits for the file to be unlocked, but I have a feeling that is wishful thinking. Has anyone noticed either of the scenarios, or have a clear understanding of the inner workings of the LOCK_EX argument?
#16
I just looked through the patch, it looks like everyone has decided to go with the create a file then rename it method. Works for me.
#17
I made some very slight changes to the patch, that is now in
http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/boost/boost...
in short
@unlink($tempfile);when rename fails.#18
Automatically closed -- issue fixed for 2 weeks with no activity.
#19
http://drupal.org/node/223610#comment-1138711
#20