As some of you may know, I've written my bachelor thesis about Improving Drupal's page loading performance. More in particular, I've tackled the CDN integration part.

One part of my bachelor thesis involved porting the CDN integration module (which I wrote last year for Drupal 5) to Drupal 6. And that of course also involves porting the Drupal core patch. The core patch must do two things:

  1. unify file URL generation in a single function
  2. allow the file URLs to be altered

At the time I wrote the original CDN integration module, I also submitted its patch for review. It introduced a new function to route all file URLs through and a new hook which allowed multiple modules to perform synchronization to a CDN. Many people didn't see the value of having multiple modules doing that, and rightfully so. I wanted to make it as flexible as possible, but in hindsight, that was overkill.

This time however, I've talked to drewish first (at DrupalCon DC), since he wrote the File API for Drupal 7.

Drupal already has one function to generate file URLs: file_create_url($path). Unfortunately, this function is only designed to work for files that have been uploaded by users or are generated by modules (e.g. transformations of images). And now the bad news: there is no function through which the URLs for the other files (the ones that aren’t uploaded but are shipped with Drupal core and modules and themes) are generated. To be honest, the current method in Drupal core for generating these URLs is very ugly, although very simple: prepend the base path to the relative file path. So if you want to serve the file misc/jquery.js (which is part of Drupal core), then you would write the following code to generate an URL for it:
$url = base_path() . 'misc/jquery.js';
drewish and I agreed that since eventually both kinds of files are typically served from the same server(s), it only makes sense to generate their URLs through one function. So the sensible thing to do was to also route the non-uploaded files through the file_create_url() function to generate their URLs. And then there would be a function that a module could implement, custom_file_url_rewrite($path) which would then allow file URLs to be altered. Much like the custom_url_rewrite_inbound() and custom_url_rewrite_outbound() we have today.

So, I wrote a Drupal core patch exactly according to these specifications, and it works great (*). However, we must fall back to the old mechanisms in case the custom_file_url_rewrite() function returns FALSE (meaning that the CDN can’t or shouldn’t serve the file). But since there is a distinction between uploaded/generated files and shipping files, we must first determine which kind of file it is. This can be done by looking at the path that was given to file_create_url(): if it begins with the path of the directory that the Drupal administrator chose to use for uploaded and generated files, then it is an uploaded/generated file. After this distinction has been made, the original procedures are applied and we’re done.

The attached patch:

  • replaces all code using the following mechanism
    $url = base_path() . $file
    to use the new mechanism:
    $url = file_create_url($file)
  • changes the body of file_create_url() to match the logic described above

(*) It works great in Drupal 6. The file URL generation mechanism is identical in Drupal 6 and 7. The patch was very easy to port to Drupal 7. Almost the exact same patch as the Drupal 6 patch could've been applied, were it not that the coding guidelines have changes concerning spacing around the concatenation operator.
This patch is currently running live on my test case web site: http://driverpacks.net.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Status: Needs review » Needs work

The last submitted patch failed testing.

moshe weitzman’s picture

This is very much like a patch which I am working on for Economist (I will cease work and just use this). Once the testbot is happy, I think this is RTBC. Well done, Wim.

Dries’s picture

Issue tags: +Favorite-of-Dries

Awesome.

Dries’s picture

- I think the custom_file_url_rewrite($path) should be mentioned in the PHPdoc, not just in the inline comments. The inline documentation is really great but it is not as accessible as PHPdoc. This is pretty minor.

- It would be nice, but not required, to have an example custom_file_url_rewrite($path). It's pretty straight-forward so this isn't a must have.

- Can't we do + $is_in_files_directory = (strpos($path, file_directory_path() .'/') === 0); after the CDN check? $is_in_files_directory is only used once.

- This is worth a CHANGELOG.txt entry.

- Could benefit from a test but it is not clear how easy that would be to write.

m3avrck’s picture

Great patch! I was looking at rerolling my old one http://tedserbinski.com/tags/drupal/getting-drupal-play-nice-with-your-cdn but you beat me to it!

Since that patch I've posted, we've used it on a number of sites without issue.

The main change in the patch posted and what we're currently running is the introduction of a variable_get('cdn_path', ''). The advantage to this approach is sometimes we use a different CDN in QA and production to test different combinations. Also in the above path I tried to roughly split the usage between JavaScript and CSS so that you can roughly split requests in 2: http://yuiblog.com/blog/2007/04/11/performance-research-part-4/

I would love to see this patch support directly a variable override instead of a custom_url_rewrite over ride since I'm betting 95% of the time a simple variable override in $conf in settings.php will do the trick.

If not, still a great patch!

Dries’s picture

Maybe we can add an example custom_url_rewrite example in settings.php that covers 95% of the case? A variable seems limiting to me. I can imagine a scenario where CSS, JS and images are hosted on static file servers, but where videos and audio files are hosted on a CDN.

geerlingguy’s picture

@ Dries / #6 - that is pretty much the only thing I ever use a CDN for (streaming and on-demand video / audio content). The site files, js, and such are all on the main server. I would presume most people have a similar situation.

chx’s picture

WIm, first accept my public apologizes for not getting back to you on CDN. I will write up something of what happened but in short: labs.nowpublic.com.

This issue is very, very cool. However we will have two rewrites, one for files , one for paths. I am wondering whether we want to integrate. Please feel free to yell "dont bloat this issue" at me I am aware that I might want to do too much here but this really nags me.

chx’s picture

OK, now I can write a followup that makes sense 'cos I got permission from Michael to post code from the np.com codebase. I wrote some about this in this blog post already. This post will not be short.

So, the current custom_url_rewrite functionality is not really a replacement for aliasing -- it supplements it. I know the canonical example is replacing node/123 with article/123 but that'd be the less common case I believe. I think most people simply use pathauto or similar -- it's not too effective to programatically calculate the outgoing aliases unless it's as simple as replacing node/ with article/ (retrieve every node title from the db for every link? uh huh). With that said, custom_url_rewrite functionality can not be used alas for the above described CDN integration as it is usable for internal paths only. This IMO needs to be fixed and extended to files which is what this issue is about.

So NowPublic cant use url(), we use np_url() instead. We have a number of constants like NP_WWW_HOST, NP_MEDIA_HOST (user posted material), NP_STATIC_HOST (our own static files) pointing to on production to the CDN, on dev and staging to the relevant servers and then the relevant parts of np_url is simply

function np_url($prefix, $path, $rev = NULL, $protocol = NP_PROTOCOL) {
  $prefix = 'NP_'. strtoupper($prefix) .'_HOST';
...
  return $protocol .'://'. constant($prefix) . $path;
}

So what I wanted to say is just this: there is definitely a CDN use case where there is a common rewrite step be it for files or URLs and also the current custom_url_rewrite functionality borders useless (previously it was used for language purposes but now that's covered separately) so it could be removed from the aliasing chain and moved directly to url() and then it be called from file.inc too. Please note that this wont be that dramatically different from what this patch does -- pass in 'file' from file.inc, 'url' from url() and be done. Later we can separate out theme stuff for what i called static above.

febbraro’s picture

subscribe

Wim Leers’s picture

@Moshe, Dries, m3avrck & chx: thanks!

@Dries:

- I think the custom_file_url_rewrite($path) should be mentioned in the PHPdoc, not just in the inline comments. The inline documentation is really great but it is not as accessible as PHPdoc. This is pretty minor.

Done.

- It would be nice, but not required, to have an example custom_file_url_rewrite($path). It's pretty straight-forward so this isn't a must have.

Done. Added to system.api.php. The example function is the 95% use case.

- Can't we do + $is_in_files_directory = (strpos($path, file_directory_path() .'/') === 0); after the CDN check? $is_in_files_directory is only used once.

Done.

- This is worth a CHANGELOG.txt entry.

Wow :)

- Could benefit from a test but it is not clear how easy that would be to write.

If it's just testing file_create_url(), it'd be fairly easy. But I'm not sure how I'm supposed to add a custom function only for one specific test. If somebody could point me to the right documentation, I'll write it.

@m3avrck: Adding support for such a variable makes the possibilities too limited. Or the end result becomes too messy. With this simple function, it's still possible to support your simple use case. See the updated patch for your simple use case.

@geerlingguy: your use case is very easy to support: just change a couple of lines

@chx:

This issue is very, very cool. However we will have two rewrites, one for files , one for paths. I am wondering whether we want to integrate. Please feel free to yell "dont bloat this issue" at me I am aware that I might want to do too much here but this really nags me.

In my opinion, it's as simple as this: don't make it harder than it can be. Some people will want to rewrite file URLs but not Drupal paths. Other people will want to do the opposite, and even other people will want to do both. Also, file URLs may be served from other domains after rewriting and that will never be the case for Drupal paths.
It's simply conceptually easier to separate them. At least for me, that would be the case. I don't see how I could clearly write file URL rewriting logic if it were part of the same function as Drupal path rewriting.

@all:
Patch rerolled, it now passes. Instead of checking if the passed in path in file_create_url() begins with file_directory_path(), I'm now checking if it begins with one of misc, modules, profiles, sites or themes. This is ugly, but only mildly more so than the previous check. I'm open to suggestions of course.

One more thing. Unfortunately, thanks to m3avrck's comment, I stumbled upon one tricky, missing feature. The rewriting of URLs in aggregated CSS files. m3avrck rewrote the URLs in CSS (url(path/to/file)) to be served from the CDN.
From one point of view, this is plain wrong: the CSS files on the Drupal web server should point to files on the Drupal web server. Otherwise, you can no longer fall back easily anymore.
From another point of view, this is right: it allows you to easily serve images referenced from CSS files from the CDN/static file server.
I've always assumed that the files on the Drupal web server should be the fallback, i.e. that they are the "original" files and that any changes to serve files from a CDN/static file server are made dynamically (in the file URL) or through file processing. That is why my synchronization daemon also processes CSS files to update the URLs before syncing.
Maybe we could make it pass in context, so that the author of the custom_file_url_rewrite() can decide what is appropriate for his site setup.

ugerhard’s picture

Status: Needs work » Needs review

Let's see what the bot thinks of the latest patch :-)

moshe weitzman’s picture

I don't yet grok the aggregated css images issue yet. My understanding is that all these images are relative links. So, the prerequisite for serving the images from the CDN is to serve the CSS file from the CDN. And thats what this patch already enables. I don't think we have to support a use case where the css images are served from CDN but the CSS file itself is not.

I think using a separate hook for file URLs is reasonable.

Gerhard Killesreiter’s picture

Status: Needs review » Needs work
Issue tags: +Investigate for use on drupal.org

tagging

Wim Leers’s picture

Status: Needs work » Needs review

@ugerhard: heh, oops, I had to submit my comment 10 times before d.o would accept it (probably my university network was to blame), and I guess I forgot to set it again. Thanks!

@moshe: My apologies for my bad explanation. You're absolutely right. In 99% of the cases, what I was talking about is unnecessary (and so is that part of m3avrck's patch). It is only necessary when the base path of your Drupal site and of your CDN are *different*.
E.g.:
1) http://drupal.org/files/css/aggregated.css references http://drupal.org/misc/bar.png through the relative url /misc/bar.png.
2a) http://drupal.org/files/css/aggregated.css is synced to http://cdn.drupal.org/files/css/aggregated.css. The relative URL /misc/bar.png is still valid, but it now points to http://cdn.drupal.org/misc/bar.png
2b) http://drupal.org/files/css/aggregated.css is synced to http://cdn.drupal.org/different/base/path/files/css/aggregated.css. The relative URL /misc/bar.png is now invalid, because it points to http://cdn.drupal.org/misc/bar.png instead of http://cdn.drupal.org/different/base/path/misc/bar.png.

@Gerhard: changing back to needs review so the test bot can do its thing :)

moshe weitzman’s picture

Status: Needs review » Needs work

@Wim - the standard way to create a function for only one test is to add a testing module that is only enabled by one test class. The module gets a 'hidden' param in its .info file so it does not clutter the UI. The simpletest/tests directory has many examples.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
18.34 KB

Tests have been written.

There is one gotcha: FileURLRewritingTest subclasses FileDownloadTest. This results in 2 things:

  1. in the setUp() method, we can no longer call parent::setUp(); instead we must call DrupalWebTestCase::setUp()
  2. the testPrivateFileTransfer() method is inherited and this is desired, because it ensures private file transfers are still working when the custom_file_url_rewrite() function is present, which doesn't rewrite URLs when private downloads are enabled (this is the 95% use case)

Now, another issue is the cruft present in Drupal core's current (and past) file_create_url():

function file_create_url($path) {
  // Strip file_directory_path from $path. We only include relative paths in urls.
  if (strpos($path, file_directory_path() .'/') === 0) {
    $path = trim(substr($path, strlen(file_directory_path())), '\\/');
  }
  // ...
}

That piece of code implies that you could pass in sites/default/files/foo.png just as well as foo.png and get the same result. This conversion really should only happen for private downloads. Hence the patch changes it to that behavior.
Passing in sites/default/files/foo.png (and not just foo.png) is actually also what Drupal core does in the Upload and Blog API modules. The Upload module for example, stores paths like sites/default/files/foo.png in the database and routes that through file_create_url().
The only places where Drupal core deviates from this is ironically enough in the File tests themselves. I consider this a bug, and I hope you do too. A function sets requirements on the data its parameters contains. I suggest that the $path parameter file_create_url() accepts should be "a string containing the Drupal path (i.e. path relative to the Drupal root directory) of the file to generate the URL for". That way, the path custom_file_url_rewrite() receives is always one that exists. (Note: custom_file_url_rewrite() assumed this already, I had forgotten about this piece of cruft since none of the Drupal core modules actually uses the "path relative to the files directory" alternative.)

Sorry for yet another long post. I hope it gets the message across.

The last submitted patch failed testing.

Wim Leers’s picture

The test bot is broken. The patch applies just fine.

Dries’s picture

moshe weitzman’s picture

@Wim - we very much want to turn custom_url_rewrite_outbound() into a hook instead of a special function. Would you be willing to change the new function into a hook here? Of course it could be later as well.

Wim Leers’s picture

@moshe: I'd prefer not. For the reasons cwgordon7 mentioned and because I think it'd lead to more confusion than the clarity/consistency that can be achieved by turning "special functions" into hooks.

So I'd prefer to see the concept of this patch being approved and therefor committed, and *then* discuss the implications of having potentially multiple implementations of hooks that are not suited very well for multiple implementations.

aaron’s picture

FYI, #227232: Support PHP stream wrappers will break this patch, or vice versa, according to which gets committed first. Additionally, I believe that implementing this will be slightly different, as it would probably be written as a stream wrapper class. I'll take a look at this patch tomorrow during the Media sprint; I only tonight learned about it.

I suspect we might want the Stream Wrapper patch to go through first, as it would otherwise mean we'd have to incorporate a rewrite of a CDN wrapper for this, which will add bloat to that patch, which is already a fairly extensive rewrite of the File API. That patch is fairly much ready for review, as well, if any interested eyes here want to take a look there.

PS Subscribing. :D

Wim Leers’s picture

aaron, I think the stream wrappers patch is *very* far from being RTBC? It's not even entirely sure that it will make it into core. It will surely need benchmarking as well because it is so pervasive.

Also, the changes I made that clash with that patch are all in a single function: file_create_url(). Most changes are about ceasing to use hardcoded paths to files, but to generate proper file URLs instead. That remains a necessity.

So I would very much prefer to see my patch go in first.

Worst case scenario: file_create_url() change again. It remains necessary to keep custom_file_url_rewrite() though. When your patch would get committed, Drupal *could* use stream wrappers, but that doesn't mean it *should* (or could in all situations) use that to sync files to a CDN. For Origin Pull CDNs (CDNs that automatically fetch files from the origin server, the most commonly used CDN type), it also remains necessary to be able to alter the file URL. For Origin Pull CDN file URLs, it's also imposible to stat() them. I'm not sure the stream wrappers patch takes that into account.
In the end, we're talking about a single small function in your patch that would be affected. It doesn't seem sensible to me to postpone a simple, small, almost RTBC patch containing things that are necessary anyway, for a fairly complex, large, not-even-sure-to-make-it-to-RTBC patch.

Dries’s picture

Status: Needs review » Needs work

- I agree with Wim that #227232: Support PHP stream wrappers should go in second. Let's put this patch in first, along with tests.

- I also think that we should use a hook instead of a function. So if we can alter this patch to use a hook, I think we're good to go.

- Instead of creating yet another module file, can we add the tests to an existing module like common_test.module or something.

moshe weitzman’s picture

I agree that there is some complexity when changing from a specially named function to a hook. But without that change, it is impossible for multiple modules to alter urls without the site admin becoming a developer. i think thats a worse outcome, overall.

I'll be watching for the next patch here. Hope to RTBC it since Dries has stated the needed changes.

aaron’s picture

I still think this is a bad idea. It's scratching an important itch, but in so doing means that core will hitherto require CDN support. That means that if we ever do implement stream wrappers, we would then need to include a cdn:// stream wrapper in core. That really needs to live in contrib. The only reason it's not is because of the url rewriting business, which becomes a complete non-issue w/ stream wrappers.

I believe that #227232: Support PHP stream wrappers is actually fairly close to being RTBC, at least from a stability and testing standpoint. Benchmarking will still need to be done, but I believe the only time there would be a performance issue would be when copying the file to the CDN, and batchapi or another progress indicator could be implemented there.

aaron’s picture

If this does go in, how would multiple CDN's be handled? (Say Voxel for images & audio podcasts, and S3 for video)? Obviously, the current patch would not allow this. But even if we use a hook, things get really sticky. Again, stream wrappers in core addresses both of these solutions pretty much out of the box.

Wim Leers’s picture

@Dries & moshe: Thanks for the reviews! A new patch will be rolled in the next few days. Can't say when yet.

@aaron: in not a single way does this patch require CDN support. I have no clue why you are suggesting that. You can combine static file servers with CDNs. You can do anything you want. In fact, my test case, http://driverpacks.net, mixes a static file server with a CDN and assigns it based on the visitor's properties (his country actually, based on his IP). As far as I can tell, no more flexibility than that can be required.

The only thing this patch does, is enable the ability to rewrite file URLs. That's it.

I fully agree that all CDN-handling stuff should live in contrib. That's what this patch makes possible in the first place.

Obviously, this patch *does* support multiple CDNs. The fact that you're saying it does not supports this, suggests that you did not even take a look at the patch. The patch even includes a sample custom_file_url_rewrite() implementation in system.api.php in which exactly that is being demonstrated!

I fully agree that things do get tricky when using hooks, that's what I'll have to tackle next.

However, stream wrappers cannot make that more easy. Stream wrappers is just a way to access remote file systems. The difficulty Drupal (or the site administrator) has to deal with, is *not* how to get files on a remote file system (which can be done through a syncing daemon, through your stream wrapper system …), but the ability to decide from which server each file should be served (or CDN, but a CDN can be considered just another server).

jmstacey’s picture

The Stream wrapper registry in #227232: Support PHP stream wrappers takes care of this use scenario with getExternalUrl(). In our patch, file_create_url() will call the registered wrapper's getExternalUrl(). This means that we don't have to support CDNs directly in core with a hook. What concerns me about this patch is that it doesn't answer the question as to how the files actually get to the CDN. It seems like this should go hand in hand with generating the URL. For example, some might only want to move the file to the CDN on the first request, some might want to move them via cron. This seems to be exactly what the stream wrapper patch is built to accommodate.

I skimmed this patch and file_create_url() will behave as expected if you provide it a valid scheme (s3://, cf://, etc.). On the surface it seems that implementing a mechanism (possibly hook) to change core paths to the appropriate stream scheme is all that's needed [stream is referenced as scheme://stream].

Precautions such as pulling these files locally for some users (e.g. UID 1) would be prudent because there is a possibility of the CDN breaking resulting in an unusable site, even for administration.

jmstacey’s picture

Edit: I didn't see your response Wim Leers until I had submitted. After reading: I think we're on the same page. We'll need to merge file_create_url, but in theory I think we're good.

The override precautions for UID 1 might still be a good idea though. I think the the admin user should always be pulled locally (or at least an option).

Wim Leers’s picture

@jmstacey: You're trying to move a boatload of new code into core, whereas I'm only making file URLs alterable. That's useful in itself, regardless of using a CDN or not. Some people like to run a light-weight Apache instance or lighttpd for serving static files, and this makes that easy to do, too.

In my opinion, it doesn't *have* to be up to Drupal to move files to a CDN. Origin Pull CDNs also don't require Drupal core to do anything. In my case, I'm using a daemon (open source, GPL, will get a homepage soon) to synchronize files. This allows for much more flexibility and speed than letting Drupal sync the files ever could. Because Drupal cannot perform simultaneous uploads, because PHP doesn't allow for that. My daemon also allows files to be preprocessed, e.g. to compress the CSS and JS using YUI Compressor (which gives a better compression ratio than Drupal's compression algorithm), to optimize images (PNG files can easily become 10% smaller), video transcoding, and so on. That sort of preprocessing is unimaginable to do from within PHP, as it'd take ages, because nothing can happen in parallel. In fact, I talked to the Media Mover module author at DrupalCon DC and he was very interested because it is PHP that limits the transcoding speed (due to lack of parallelism).
In the case of the daemon, a input_file -> CDN URL mapping is stored in an SQLite DB. All files' CDN URLs are looked up — it takes about 0.2-0.4 ms to look up a single URL. You have limitless flexibility, as you can even decide to sync the same files to multiple servers (e.g. in the case you have a static file server near the location of an event your organization is hosting (e.g. DrupalCon Paris), to minimize latency for the visitors there) and pick a server dynamically based on the user's characteristics.

Drupal is awesome and should eat its own dog food, but there are limits. It of course depends on your requirements. But after all, if you think about the big picture, a stand-alone, multi-threaded daemon to synchronize files makes more sense than a PHP CMS syncing files, doesn't it? :)

Please don't get me wrong, I'm sure many site owners will love stream wrappers. I'll even use it myself for my smaller web sites. But it's not always the appropriate solution.

So yes, I also think we're on the same page, just with slightly different points of view. Now that we agree that our patches are in fact complementary (I should have used that word in my first reaction), lets' stop derailing this issue :) Ping me on IRC if you've got more questions/thoughts :)

Also, thanks for the excellent idea on never giving uid 1 files from the CDN. I'll definitely add that in my sample code. :)

drewish’s picture

subscribing. obviously i'd like to see the stream wrapper patch hit first but i'm not going to let the perfect be the enemy of the good.

moshe weitzman’s picture

Anyone willing to roll a new patch here?

aaron’s picture

Status: Needs work » Needs review
FileSize
17.37 KB

OK, this one uses drupal_alter to implement hook_url_alter.

Status: Needs review » Needs work

The last submitted patch failed testing.

aaron’s picture

Status: Needs work » Needs review
FileSize
20.04 KB

oops, forgot the tests. i moved the test setup to file_test.module. also fixed up the documentation to match the new functionality.

aaron’s picture

let's try that again; in the correct test setup module this time...

Status: Needs review » Needs work

The last submitted patch failed testing.

aaron’s picture

Status: Needs work » Needs review
FileSize
19.13 KB

Ah, that's why it was in its own module. You don't want its hook_url_alter invoked in the other tests.

Rerolling with the new test module back in place.

Dries’s picture

This looks good to me, but I'd like Wim to review the patch in #40 too.

meba’s picture

Applied with offset. I tried uploading some files and works as expected. Went through the code, I don't have objections.

Note: is this change intentional?

@@ -1916,7 +1935,7 @@ class FileDownloadTest extends FileTestC

     // Create a file.
     $file = $this->createFile();
-    $url = file_create_url($file->filename);
+    $url = file_create_url($file->filepath);
aaron’s picture

"Note: is this change intentional?"

Yes, it was actually an error before, as file_create_url expects a full path, not just the filename (though it would have passed the tests previously anyway).

Wim Leers’s picture

Rerolled, with comment fixes and a couple of minor code style fixes. Instead of hook_url_alter(), I've named it hook_file_url_alter(), because it is really only meant for file URLs.

Note that in the approach aaron has taken — and with which I agree — it is impossible for two modules to alter the same file URL. As soon as one module alters the file URL, it is assumed to be the final one. This simple rule prevents all sorts of weird problems.

Status: Needs review » Needs work

The last submitted patch failed testing.

aaron’s picture

Status: Needs work » Needs review
FileSize
19.22 KB

needed -upN when creating the diff. here's a reroll with the missing files.

chrisakeley’s picture

subscribe

Status: Needs review » Needs work

The last submitted patch failed testing.

neclimdul’s picture

#491456: ImageCache preset API broke this patch(as noted by testing bot). I think it really only raises the importance of this patch though. Will need to update parts from that patch to handle these changes.

neclimdul’s picture

Run at updating the patch to apply. I'd say this actually needs some work still, I didn't really look at the image changes mentioned in #49. Also, I'm not sure about file_directory_strip()'s location. Also, if everything passes through file_create_url, then the need for that function kinda fades away.

neclimdul’s picture

a real patch...

neclimdul’s picture

a real patch...

neclimdul’s picture

The current managed vs core file test doesn't actually work on preprocessed files because we've changed the one of the features of file_create_url, because it previously added file_directory_path to the path. I'm not sure what the correct fix is as I mostly came in to make a review.

Wim Leers’s picture

I'll reroll this patch tomorrow. I should've rerolled it right away after I posted my latest version.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
18.21 KB
neclimdul’s picture

Looks great to me. Everything seems to be working well and provides a much more functional file api function.

neclimdul’s picture

Status: Needs review » Needs work

Found a problem in the imagecache stuff. Add an image to your profile and the image will be broken. Investigating in file_url_alter shows this:

...
modules/toolbar/toolbar.js
htp://mysite/image/generate/thumbnail/sites/default/files/pictures/picture-1.jpg
misc/powered-blue-80x15.png
---

http broken to avoid url filter. Initial assumption would be that the url handling magic is chocking on the full url being passed in. This might be a behavior we want to go with and fix image's call to file_create_url (though I'm not sure it should be calling it if it knows it isn't going to be a real file url).

neclimdul’s picture

figured it out. Here's the issue:

  if (strpos($path, file_directory_path() . '/') !== 0) {
    return base_path() . $path;
  }

When this is passed a full url(http://foo/bar/baz) it chokes and puts the base path in front of it. ie /http://foo/bar/baz
I confirmed by making the return return url($path);. Since url() has an "external url" check, this sorted out the problem.

As I noted in my previous comment though, I'm not sure the image style code is actually passing a value we want to consider acceptable so I'm not posting a follow up patch.

drewish’s picture

humm... this is totally going to conflict with the work we've been doing over on the file stream wrappers (#227232: Support PHP stream wrappers) which is now happening over on #517814: File API Stream Wrapper Conversion.

Jody Lynn’s picture

I had already done some of the same changes to use file_create_url (the css and js ones) over on the stream wrapper git repository, so I went ahead and merged in the rest of those from this patch over there.

I didn't move over the test changes since our stream wrapper tests are a bit different, and didn't change the function definition or add the new hook. We have changed the function name from file_create_url to file_external_url as well.

Wim, any chance we can lure you over into the stream wrapper git world?

Wim Leers’s picture

@neclimdul: why then does imagecache pass absolute URLs? The doxygen said:

- * @param $path A string containing the path of the file to generate URL for.

and now says:

+ * @param $path
+ *   A string containing the Drupal path (i.e. path relative to the Drupal
+ *   root directory) of the file to generate the URL for.

Clearly, it was never intended to accept absolute URLs. So imagecache is at fault, IMO.

@drewish: that has been mentioned already, but this is complementary, really

@Jody Lynn: no, I don't want to start working on that massive patch and it's outside of my scope

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
18.29 KB

Turns out it wasn't imagecache to blame, but my change in theme_image() was incorrect. It didn't accept absolute URLs anymore.

Patch rerolled.

Dave Reid’s picture

Might be of interested to cross-link this with #320331: Turn custom_url_rewrite_inbound and custom_url_rewrite_outbound into hooks so we probably wouldn't need to create a new hook. Maybe file_create_url() could pass $options['file'] = TRUE to url() for use with the url_rewrite hooks.

Wim Leers’s picture

I'd prefer to keep them separate, as I already outlined in #11:

In my opinion, it's as simple as this: don't make it harder than it can be. Some people will want to rewrite file URLs but not Drupal paths. Other people will want to do the opposite, and even other people will want to do both. Also, file URLs may be served from other domains after rewriting and that will never be the case for Drupal paths.
It's simply conceptually easier to separate them. At least for me, that would be the case. I don't see how I could clearly write file URL rewriting logic if it were part of the same function as Drupal path rewriting.

Status: Needs review » Needs work

The last submitted patch failed testing.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
8.59 KB

#227232: Support PHP stream wrappers went in, hence I have to reroll this patch. The previous patch worked just fine, but didn't apply anymore due to stream wrappers going in.

This patch previously replaced all base_path() calls with file_create_url() calls. That is no longer necessary, because the file stream wrappers patch already does this now.

A change over the previous version is that instead of getting a $path parameter, you now get a $uri parameter in hook_file_url_alter(). You can then also decide based on the scheme if you want to rewrite the URL. This is necessary for e.g. uploaded files, which use the public:// scheme.

Thanks to Mollom for letting me re-roll this patch on company time :)

Wim Leers’s picture

Title: CDN integration: use file_create_url() for all file URLs » CDN integration: allow file URLs to be rewritten by hook_file_url_alter()

This title is now more appropriate, since file_create_url() is now already being used for all file URLs, thanks to #227232: Support PHP stream wrappers.

Wim Leers’s picture

And I forgot to port the changes to system.api.php. Fixed that in this new iteration.

neclimdul’s picture

+++ includes/file.inc	27 Aug 2009 20:40:06 -0000
@@ -299,11 +314,29 @@ function file_stream_wrapper_get_instanc
+  if ($uri != $old_uri) {

Does this limit us? What I mean is, why can't we alter into a different stream wrapper URI?

Other than that, I think this is almost a no brainer!

Wim Leers’s picture

I just removed that check (after talking with neclimdul). That automatically allows you to alter the scheme, i.e. to switch to a different stream wrapper.

Wim Leers’s picture

Forgot to remove a little piece of now obsolete documentation.

drewish’s picture

Status: Needs review » Needs work

I think it would be best to just put the url altering test module code into the existing file_test.module. i don't see the need for a separate module.

The first line of file_create_url()'s docs should be a single line, definitely not two sentences. I'd also see if you can avoid using "shipped" and "created" until you define them since they're not common vocabulary.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
10.33 KB

It *needs* to be in its own module, otherwise the hook_file_url_alter() implementation will cause other tests to fail. aaron also didn't see this immediately, but he realized it in #40.

I'm distinguishing between "created" and "shipped" files now because it's otherwise not very clear why no scheme would be set. And previously, base_path() plus some string concatenations would be used to create the URL to created files and hence there was no need for such vocabulary. Now that everything is being routed through file_create_url(), it makes sense to add that explanation IMO.

Changed the first line of file_create_url()'s doxygen.

drewish’s picture

Wim Leers, i don't think that makes any sense. you could use a variable_set()/variable_get() to control the alter. that's what file_test.module was doing at one point to check the hook_file_dowload but it looks like a subsequent patch has removed that. So no, i don't think we need two modules for this.

As for "shipped" vs "created" all i wanted was for you to not mention them in the 1st line since you didn't have the space to define them. The currrent patch meets that goal.

Wim Leers’s picture

Ah, well, a variable_set()/variable_get() to control the alter is possible, yet very ugly/unclear. However, I understand that the number of test modules should be kept to a minimum, so from that perspective your suggestion does make sense.
Changed that in this new iteration: removed the file_url_test.module and moved the hook_file_url_alter() implementation into file_test.module.

And I indeed misinterpreted what you said about "shipped" vs "created". Glad that's sorted out :)

neclimdul’s picture

Status: Needs review » Reviewed & tested by the community

Looks even better than the last time I looked! I think its ready for the eagle eyes of a committer.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD. Thanks! Should we add a CHANGELOG.txt entry about "CDN support"?

webchick’s picture

Status: Fixed » Needs work
Issue tags: +Needs documentation

I think so. :)

We should also document the addition of this new hook in the module upgrade guide: http://drupal.org/update/modules/6/7

Wim Leers’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
758 bytes

Yay! :)

I found a small bug in the system.api.php documentation that I forgot to remove. See the attached patch.

Updated the D6->D7 upgrade guide as well.

Dries’s picture

Status: Reviewed & tested by the community » Needs work

Committed to CVS HEAD. Still need CHANGELOG.txt entry so marking as 'code needs work'.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
720 bytes

Something like this?

Wim Leers’s picture

Status: Needs review » Reviewed & tested by the community

Wrong status.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed. Thanks.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

parrottvision’s picture

Will have you seen this module also?

http://drupal.org/project/simplecdn

Status: Closed (fixed) » Needs review

mozier requested that failed test be re-tested.

Jody Lynn’s picture

Status: Needs review » Closed (fixed)

mozier is a bot?

BenK’s picture

Just need to keep track of this thread....

404’s picture

subscribing