Most of the low-level file management functions (the file_unmanaged_*() functions) do not support remote streamwrappers properly because they incorrectly rely on drupal_realpath(). That makes it impossible to stores files remotely in the current state.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Damien Tournoud’s picture

Status: Active » Needs review
FileSize
9.27 KB

Fixed the tests and extended them to cover a simple remote filesystem implementation. Let's see the extend of the damages.

Damien Tournoud’s picture

Now with the patch that should fix it.

Damien Tournoud’s picture

Sweet, so there are bugs hidden by other bugs.

The custom DrupalStreamWrapper::chmod() function we added to standard PHP stream wrappers need to clear the stat cache because PHP cannot clear the cache of the managed file (public://test.jpg) even if it does clear the cache of the underlying file.

pwolanin’s picture

Discussing with DamZ - all the changes look reasonable. It's not clear why we used drupal_realpath() many of these places except that there is probably a corresponding realpath() call in Drupal 6.

johnv’s picture

So this patch should remove the message in system log, which appears every time after flushing the cashes?
"The file was not deleted, because it does not exist."

Damien Tournoud’s picture

Could we get some additional review on this? Those useless drupal_realpath() calls we have all over the place prevent any successful implementation of remote storage in Drupal.

Damien Tournoud’s picture

Version: 7.x-dev » 8.x-dev

Bumping to 8.x.

pwolanin’s picture

Minor, but for clarity it would be nice to use parens to group the stuff to the right of the || on this line for those of use who don't automatically recognize operator binding precedence

+ if ($source == $destination || ($real_source !== FALSE) && ($real_destination !== FALSE) && ($real_source == $real_destination)) {

dopry’s picture

From a quick look at the code, It looks like drupal_realpath should be dropped wholesale. This can be internalized to the the stream wrapper classes that need realpath support. The current drupal_realpath appears to break the encapsulation created by the stream wrapper layer. Any code using it should probably depend on the stream wrappers to deal with the realpath rather than themselves.

Unless of course I'm missing something important here.

Good find Damien. I hope you find some more reviewers.

dopry’s picture

I poked around some more.

The DrupalStreamWrapperInterface specifies a realpath method.

Any code Stream Wrapper Interfaces not implementing realpath should be considered broken for Drupal 7. The should have one even if it only returns a dummy value such as empty string.

The removal of realpath as a whole should be done as an 8.x improvement to the API... but it also means you need to make sure url_stat is properly implemented in all stream wrappers so is_file etc will work. A large number of file_unmanaged functions which resolve a realpath then pass them to the is_file functions seem patently wrong. It seems like it should depend on StreamWrapperInterface::url_stat to return the proper data so is_file() etc will work.

Then again I haven't worked on this for a while and might be overlooking something. I remember there being some curious bugs in the scheme parsing in PHP's stream wrapper extension that this may be in place to work around.

Damien Tournoud’s picture

It is perfectly legitimate not to implement realpath(). By 'not implement' I mean always returning FALSE.

From the interface:

  /**
   * Returns canonical, absolute path of the resource.
   *
   * Implementation placeholder. PHP's realpath() does not support stream
   * wrappers. We provide this as a default so that individual wrappers may
   * implement their own solutions.
   *
   * @return
   *   Returns a string with absolute pathname on success (implemented
   *   by core wrappers), or FALSE on failure or if the registered
   *   wrapper does not provide an implementation.
   */
  public function realpath();
Damien Tournoud’s picture

See DrupalDummyRemoteStreamWrapper for an example of streamwrapper that doesn't support realpath().

drewish’s picture

I agree that many seem unnecessary but haven't looked closely enough to say with certainty. I was curious to see when all those drupal_realpath() calls got added and it looks like they were there in the commit of #517814: File API Stream Wrapper Conversion but that they were just upgrades of the existing code which dates back to ancient stuff like #23028: file_check_location problems and the commit introducing file.inc. So I think revisiting their use is a good idea.

c960657’s picture

Back in #227232: Support PHP stream wrappers I suggested the introduction of drupal_realpath() as a drop-in replacement for realpath(). The problem with realpath() is that it just returns FALSE for all files with a stream wrapper prefix, so I suggested that drupal_realpath() should call realpath() for regular files, and for files with a stream wrapper prefix it should try to normalize the path and return that. This approach is different from what was implemented in D7.

Would it solve your issue if drupal_realpath() returned FALSE only if the file does not exist? Would it introduce a security issue (this was briefly discussed in comments 62 and 63 in #227232-62: Support PHP stream wrappers)?

BTW using a wrapper around realpath() allows us to work around the issue that on BSD and in PHP with the Suhosin patch the function behaves differently than on a regular Linux+PHP:
http://www.php.net/manual/en/function.realpath.php#82770

Damien Tournoud’s picture

The real issue we have here is that we call drupal_realpath() in a lot of places where it makes zero sense to call it.

drupal_realpath() only make sense for local filesystems. Everywhere we support remote filesystems, we should just not call drupal_realpath().

Starminder’s picture

subscribe

johnv’s picture

Reviewing patch #3, which is in my system now for 2 months: It works fine, gives better notices.

pwolanin’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: +Needs backport to D7

I don't see any arguments against this change, and all discussion seems to indicate that the drupal_realpath() calls were just added to mimic older D6 code rather then necessarily being needed at all.

jbrown’s picture

Title: Fix support for remote filesystem » Fix support for remote streamwrappers
Status: Reviewed & tested by the community » Needs review
Issue tags: -Needs backport to D7
FileSize
18.75 KB

Example of when these were introduced: #573292: enable file_unmanaged_delete() to handle stream wrapper URIs The patch in that issue actually broke what it was trying to fix. It wasn't even broken to start with...

DrupalStreamWrapperInterface::realpath() is still useful - If a function does not accept URIs, you can determine the local path first (but only for local streamwrappers).

It is correct that DrupalStreamWrapperInterface::realpath() should return FALSE if it cannot provide a local filepath. However, for remote files to be deleted at present, the streamwrapper can return the full URI as the realpath. This is a bit of a hack, but it does work.

I've rerolled and also removed drupal_realpath() from image_gd_get_info(). getimagesize() works just fine with streamwrappers.

jbrown’s picture

Issue tags: +Needs backport to D7

Fixing tags.

BTW the patch in #3 no longer applies.

Damien Tournoud’s picture

Thanks for the reroll. I really hope we can get this in for 7.1. The extended tests show that this is the way forward (contrary to, typically #573292: enable file_unmanaged_delete() to handle stream wrapper URIs that ended up breaking what it was trying to fix).

jbrown’s picture

Status: Needs review » Needs work
Issue tags: +Needs backport to D7

The last submitted patch, remove_unnecessary_drupal_realpath_calls.patch, failed testing.

jbrown’s picture

Status: Needs work » Needs review
FileSize
18.79 KB
chx’s picture

Status: Needs review » Reviewed & tested by the community

Most of the patch is removing drupal_realpath where unecessary and new tests. Very nice job.

As for the new watchdogs, we have discussed this to do death. We do not want to (we can not in D7) change the return value of drupal_realpath and as things stand, this is the cleanest solution.

chx’s picture

Status: Reviewed & tested by the community » Needs work

In fact, it turns out one of the questions I raised on IRC does need fixing -- the ($realpath = drupal_realpath()) checks need a strict check against FALSE or the super-duper edge case of passing a file called '0' would fail. PHP, die, die, die!!!

Damien Tournoud’s picture

Status: Needs work » Needs review
FileSize
4.09 KB
18.6 KB

Made the tests for drupal_realpath() more strict, and simplified a condition.

chx’s picture

Status: Needs review » Reviewed & tested by the community

Well now we are really good. So many juicy tests and fixes. (Ps. I know PHP doesnt have bugs but the feature where realpath throws a FALSE for a remote URL is superb lame.)

Damien Tournoud’s picture

Version: 8.x-dev » 7.x-dev
FileSize
18.65 KB

Ported to Drupal 7.

Damien Tournoud’s picture

Version: 7.x-dev » 8.x-dev

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 1083982-remove-unecessary-drupal-realpath-7.patch, failed testing.

Damien Tournoud’s picture

Status: Needs work » Reviewed & tested by the community
chx’s picture

#27 is still RTBC :)

webchick’s picture

Copy/pasting from my Skype chat with Damien:

I took a look at this patch, but it looked a bit risky to commit at the last minute before release. I could see Media module or some other custom/contrib file-handling module calling drupal_realpath() (since it was copying core) and then the path getting double-wrapped, causing breakage. I'm happy to commit it as one of the first things after rolling 7.3 though. That'd give us more like a month to find fall-out issues.

jbrown’s picture

I agree - there should just be a short merge window after each release.

webchick’s picture

Status: Reviewed & tested by the community » Fixed
Issue tags: +API change

Ok! Now that the pesky release is out the door, let's get back to bug fixin'!

HOLY CRAP this patch is awesome. It removes all kinds of annoying extra function calls, adds a butt-load of tests, AND fixes a major bug that makes Drupal more capable. Awesome!

Note that in a quick cursory glance it looks like at least http://drupal.org/project/imagemagick needs a similar change as image.gd.inc. I wonder if it's worth announcing this to the dev list as a small API change. Tagging accordingly. I'm not really sure how best to summarize it, though, so some help on that would be appreciated.

Committed to #27 8.x and #29 to 7.x. Thanks!

jhodgdon’s picture

If this is an API change, can someone summarize the API changes for 7.x and 8.x? Also, please leave it as "needs work" for update doc until the API change has been recorded on the API change pages.

Also removed the backport tag, because it looks like something was committed to d7 - if incorrect, sorry!

jhodgdon’s picture

Status: Fixed » Needs work

Forgot status change...

catch’s picture

Status: Needs work » Fixed
Issue tags: -API change, -Needs documentation updates

This isn't really an API change, it's more just using the API correctly - however clearly there is a lot of misunderstanding around this.

I've re-opened #1201024: drupal_realpath() should describe when it should be used, bumped status to major task instead of normal bug, and I'm closing this again (opened this ticket because I thought the patch wasn't in yet).

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

effulgentsia’s picture

If people who were involved in this can review #1277228: Remote files can't be deleted with Drupal 7.6 and later to see if the corresponding fix to Media is proper, I'd appreciate it. Thanks!