Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Most of the low-level file management functions (the file_unmanaged_*()
functions) do not support remote streamwrappers properly because they incorrectly rely on drupal_realpath()
. That makes it impossible to stores files remotely in the current state.
Comment | File | Size | Author |
---|---|---|---|
#29 | 1083982-remove-unecessary-drupal-realpath-7.patch | 18.65 KB | Damien Tournoud |
#27 | 1083982-remove-unecessary-drupal-realpath.patch | 18.6 KB | Damien Tournoud |
#27 | interdiff.patch | 4.09 KB | Damien Tournoud |
#24 | remove_unnecessary_drupal_realpath_calls.patch | 18.79 KB | jbrown |
#19 | remove_unnecessary_drupal_realpath_calls.patch | 18.75 KB | jbrown |
Comments
Comment #1
Damien Tournoud CreditAttribution: Damien Tournoud commentedFixed the tests and extended them to cover a simple remote filesystem implementation. Let's see the extend of the damages.
Comment #2
Damien Tournoud CreditAttribution: Damien Tournoud commentedNow with the patch that should fix it.
Comment #3
Damien Tournoud CreditAttribution: Damien Tournoud commentedSweet, so there are bugs hidden by other bugs.
The custom DrupalStreamWrapper::chmod() function we added to standard PHP stream wrappers need to clear the stat cache because PHP cannot clear the cache of the managed file (public://test.jpg) even if it does clear the cache of the underlying file.
Comment #4
pwolanin CreditAttribution: pwolanin commentedDiscussing with DamZ - all the changes look reasonable. It's not clear why we used drupal_realpath() many of these places except that there is probably a corresponding realpath() call in Drupal 6.
Comment #5
johnvSo this patch should remove the message in system log, which appears every time after flushing the cashes?
"The file was not deleted, because it does not exist."
Comment #6
Damien Tournoud CreditAttribution: Damien Tournoud commentedCould we get some additional review on this? Those useless drupal_realpath() calls we have all over the place prevent any successful implementation of remote storage in Drupal.
Comment #7
Damien Tournoud CreditAttribution: Damien Tournoud commentedBumping to 8.x.
Comment #8
pwolanin CreditAttribution: pwolanin commentedMinor, but for clarity it would be nice to use parens to group the stuff to the right of the || on this line for those of use who don't automatically recognize operator binding precedence
+ if ($source == $destination || ($real_source !== FALSE) && ($real_destination !== FALSE) && ($real_source == $real_destination)) {
Comment #9
dopry CreditAttribution: dopry commentedFrom a quick look at the code, It looks like drupal_realpath should be dropped wholesale. This can be internalized to the the stream wrapper classes that need realpath support. The current drupal_realpath appears to break the encapsulation created by the stream wrapper layer. Any code using it should probably depend on the stream wrappers to deal with the realpath rather than themselves.
Unless of course I'm missing something important here.
Good find Damien. I hope you find some more reviewers.
Comment #10
dopry CreditAttribution: dopry commentedI poked around some more.
The DrupalStreamWrapperInterface specifies a realpath method.
Any code Stream Wrapper Interfaces not implementing realpath should be considered broken for Drupal 7. The should have one even if it only returns a dummy value such as empty string.
The removal of realpath as a whole should be done as an 8.x improvement to the API... but it also means you need to make sure url_stat is properly implemented in all stream wrappers so is_file etc will work. A large number of file_unmanaged functions which resolve a realpath then pass them to the is_file functions seem patently wrong. It seems like it should depend on StreamWrapperInterface::url_stat to return the proper data so is_file() etc will work.
Then again I haven't worked on this for a while and might be overlooking something. I remember there being some curious bugs in the scheme parsing in PHP's stream wrapper extension that this may be in place to work around.
Comment #11
Damien Tournoud CreditAttribution: Damien Tournoud commentedIt is perfectly legitimate not to implement realpath(). By 'not implement' I mean always returning FALSE.
From the interface:
Comment #12
Damien Tournoud CreditAttribution: Damien Tournoud commentedSee DrupalDummyRemoteStreamWrapper for an example of streamwrapper that doesn't support realpath().
Comment #13
drewish CreditAttribution: drewish commentedI agree that many seem unnecessary but haven't looked closely enough to say with certainty. I was curious to see when all those drupal_realpath() calls got added and it looks like they were there in the commit of #517814: File API Stream Wrapper Conversion but that they were just upgrades of the existing code which dates back to ancient stuff like #23028: file_check_location problems and the commit introducing file.inc. So I think revisiting their use is a good idea.
Comment #14
c960657 CreditAttribution: c960657 commentedBack in #227232: Support PHP stream wrappers I suggested the introduction of drupal_realpath() as a drop-in replacement for realpath(). The problem with realpath() is that it just returns FALSE for all files with a stream wrapper prefix, so I suggested that drupal_realpath() should call realpath() for regular files, and for files with a stream wrapper prefix it should try to normalize the path and return that. This approach is different from what was implemented in D7.
Would it solve your issue if drupal_realpath() returned FALSE only if the file does not exist? Would it introduce a security issue (this was briefly discussed in comments 62 and 63 in #227232-62: Support PHP stream wrappers)?
BTW using a wrapper around realpath() allows us to work around the issue that on BSD and in PHP with the Suhosin patch the function behaves differently than on a regular Linux+PHP:
http://www.php.net/manual/en/function.realpath.php#82770
Comment #15
Damien Tournoud CreditAttribution: Damien Tournoud commentedThe real issue we have here is that we call
drupal_realpath()
in a lot of places where it makes zero sense to call it.drupal_realpath()
only make sense for local filesystems. Everywhere we support remote filesystems, we should just not calldrupal_realpath()
.Comment #16
Starminder CreditAttribution: Starminder commentedsubscribe
Comment #17
johnvReviewing patch #3, which is in my system now for 2 months: It works fine, gives better notices.
Comment #18
pwolanin CreditAttribution: pwolanin commentedI don't see any arguments against this change, and all discussion seems to indicate that the drupal_realpath() calls were just added to mimic older D6 code rather then necessarily being needed at all.
Comment #19
jbrown CreditAttribution: jbrown commentedExample of when these were introduced: #573292: enable file_unmanaged_delete() to handle stream wrapper URIs The patch in that issue actually broke what it was trying to fix. It wasn't even broken to start with...
DrupalStreamWrapperInterface::realpath() is still useful - If a function does not accept URIs, you can determine the local path first (but only for local streamwrappers).
It is correct that DrupalStreamWrapperInterface::realpath() should return FALSE if it cannot provide a local filepath. However, for remote files to be deleted at present, the streamwrapper can return the full URI as the realpath. This is a bit of a hack, but it does work.
I've rerolled and also removed drupal_realpath() from image_gd_get_info(). getimagesize() works just fine with streamwrappers.
Comment #20
jbrown CreditAttribution: jbrown commentedFixing tags.
BTW the patch in #3 no longer applies.
Comment #21
Damien Tournoud CreditAttribution: Damien Tournoud commentedThanks for the reroll. I really hope we can get this in for 7.1. The extended tests show that this is the way forward (contrary to, typically #573292: enable file_unmanaged_delete() to handle stream wrapper URIs that ended up breaking what it was trying to fix).
Comment #22
jbrown CreditAttribution: jbrown commented#19: remove_unnecessary_drupal_realpath_calls.patch queued for re-testing.
Comment #24
jbrown CreditAttribution: jbrown commentedComment #25
chx CreditAttribution: chx commentedMost of the patch is removing drupal_realpath where unecessary and new tests. Very nice job.
As for the new watchdogs, we have discussed this to do death. We do not want to (we can not in D7) change the return value of drupal_realpath and as things stand, this is the cleanest solution.
Comment #26
chx CreditAttribution: chx commentedIn fact, it turns out one of the questions I raised on IRC does need fixing -- the ($realpath = drupal_realpath()) checks need a strict check against FALSE or the super-duper edge case of passing a file called '0' would fail. PHP, die, die, die!!!
Comment #27
Damien Tournoud CreditAttribution: Damien Tournoud commentedMade the tests for drupal_realpath() more strict, and simplified a condition.
Comment #28
chx CreditAttribution: chx commentedWell now we are really good. So many juicy tests and fixes. (Ps. I know PHP doesnt have bugs but the feature where realpath throws a FALSE for a remote URL is superb lame.)
Comment #29
Damien Tournoud CreditAttribution: Damien Tournoud commentedPorted to Drupal 7.
Comment #30
Damien Tournoud CreditAttribution: Damien Tournoud commentedComment #32
Damien Tournoud CreditAttribution: Damien Tournoud commentedFollow-up documentation issue: #1201024: drupal_realpath() should describe when it should be used
Comment #33
chx CreditAttribution: chx commented#27 is still RTBC :)
Comment #34
webchickCopy/pasting from my Skype chat with Damien:
I took a look at this patch, but it looked a bit risky to commit at the last minute before release. I could see Media module or some other custom/contrib file-handling module calling drupal_realpath() (since it was copying core) and then the path getting double-wrapped, causing breakage. I'm happy to commit it as one of the first things after rolling 7.3 though. That'd give us more like a month to find fall-out issues.
Comment #35
jbrown CreditAttribution: jbrown commentedI agree - there should just be a short merge window after each release.
Comment #36
webchickOk! Now that the pesky release is out the door, let's get back to bug fixin'!
HOLY CRAP this patch is awesome. It removes all kinds of annoying extra function calls, adds a butt-load of tests, AND fixes a major bug that makes Drupal more capable. Awesome!
Note that in a quick cursory glance it looks like at least http://drupal.org/project/imagemagick needs a similar change as image.gd.inc. I wonder if it's worth announcing this to the dev list as a small API change. Tagging accordingly. I'm not really sure how best to summarize it, though, so some help on that would be appreciated.
Committed to #27 8.x and #29 to 7.x. Thanks!
Comment #37
jhodgdonIf this is an API change, can someone summarize the API changes for 7.x and 8.x? Also, please leave it as "needs work" for update doc until the API change has been recorded on the API change pages.
Also removed the backport tag, because it looks like something was committed to d7 - if incorrect, sorry!
Comment #38
jhodgdonForgot status change...
Comment #39
catchThis isn't really an API change, it's more just using the API correctly - however clearly there is a lot of misunderstanding around this.
I've re-opened #1201024: drupal_realpath() should describe when it should be used, bumped status to major task instead of normal bug, and I'm closing this again (opened this ticket because I thought the patch wasn't in yet).
Comment #41
effulgentsia CreditAttribution: effulgentsia commentedIf people who were involved in this can review #1277228: Remote files can't be deleted with Drupal 7.6 and later to see if the corresponding fix to Media is proper, I'd appreciate it. Thanks!
Comment #42
pillarsdotnet CreditAttribution: pillarsdotnet commentedAlso see #1283502: FileTransfer::checkPath() uses drupal_realpath() where it should use realpath() instead.
and #1283536: update_manager_file_get() incorrectly assumes a hard-coded list of remote stream wrapper schemes.