I stumbled on this while running some unit tests: there is a problem with handling paths that contain spaces.

I suspect this may not be an issue because Drupal never expects a space in any of its internal URLs.

But here's the argument that will fail if passed to l() (and hence to url()):
"/a b.html"
which gets converted into
"/a+b.html"
and this will not work (even if a file "a b.html" exists).

Correct conversion should be:
"/a%20b.html" - this will pull up the file named with space in it.

If the already-encoded "/a%20b.html" is passed to url(), it gets encoded again into "/a%2520b.html", which is also an invalid page.
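The three results above can be reproduced side by side. This is a minimal sketch in Python rather than PHP, on the assumption that `urllib.parse.quote_plus` and `urllib.parse.quote` treat the space the same way PHP's urlencode() and rawurlencode() do. The `safe="/"` argument is also an assumption: it stands in for whatever url() does to keep slashes intact and is not taken from Drupal's code.

```python
from urllib.parse import quote, quote_plus

path = "/a b.html"

# Form-style encoding: the space becomes "+", which only means "space"
# inside a query string, not in a file path.
plus_encoded = quote_plus(path, safe="/")

# Percent-encoding: the space becomes "%20", which is valid in a path.
percent_encoded = quote(path, safe="/")

# Encoding an already-encoded path escapes the "%" itself.
double_encoded = quote(percent_encoded, safe="/")

print(plus_encoded)     # /a+b.html
print(percent_encoded)  # /a%20b.html
print(double_encoded)   # /a%2520b.html
```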

As I said, this may not matter for Drupal's internal URLs. However, the PHP urlencode() docs say that, for historical reasons, it encodes a space as + instead of %20, and that rawurlencode() should be used to conform to RFC 1738.
If urlencode() does not conform to the required standard, shouldn't it be replaced with rawurlencode() everywhere?
Just curious.

Comments

Steven

l() and url() should only be used to link to Drupal menu paths, not to files. For menu paths, a + sign is equivalent to %20 (the path always ends up as a GET query argument after mod_rewrite), but it is shorter and easier to read.

The only exception is that you can pass in absolute URLs, but this is mostly for creating custom menu items.

Drupal has no problem with spaces in menu paths. For example:
http://drupal.org/search/node/urlencode+menu+system

There is no custom code to turn those '+' signs into spaces. PHP does this when parsing the GET query.
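The difference between query-string decoding and path decoding can be sketched as follows. The example uses Python's `urllib.parse` as a stand-in for PHP's GET parsing, and the parameter name `q` is an assumption used for illustration only.

```python
from urllib.parse import parse_qs, unquote

# Query-string parsing follows form-encoding rules: "+" decodes to a space.
query = parse_qs("q=search/node/urlencode+menu+system")
menu_path = query["q"][0]
print(menu_path)  # search/node/urlencode menu system

# Path decoding only handles percent-escapes: "+" stays a literal plus sign.
print(unquote("/a+b.html"))    # /a+b.html
print(unquote("/a%20b.html"))  # /a b.html
```

This is why the + form works for a menu path that travels as a query argument but not for a file that the web server looks up by path.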


ms2011

Files and URLs are not accessible if their names contain reserved or non-ASCII characters.
See url_generation.patch here http://drupal.org/node/43505#comment-316523

Mike Smullin
http://www.mikesmullin.com/

davedelong

+1

Any image we upload that has spaces in its name returns a 404 when accessed from our Atom feeds, because the Atom module uses url() internally. This means that either the upload module should be fixed to strip out such characters, or url() shouldn't use urlencode(). (Drupal 5.7)

Dave

pmetras

+1 too.

This is really important for web sites that accept user-generated content. Other solutions, such as transliterating file names, are not acceptable, because users expect their uploaded files to be stored on the site unchanged (and when your platform supports UTF-8 correctly, that's not a problem).

So either Drupal must generate correct URLs, or it should be hacked to look up files whose names contain spaces and exotic (meaning non-US-ASCII) characters.
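One way to generate such URLs is to percent-encode each path segment separately. The following is only a sketch of that idea in Python, not Drupal code: `encode_file_path` is a hypothetical helper, and the file names are made up for illustration.

```python
from urllib.parse import quote

def encode_file_path(path: str) -> str:
    """Percent-encode each path segment, leaving the "/" separators alone."""
    # safe="" escapes reserved characters inside a segment as well;
    # non-ASCII characters are encoded as their UTF-8 bytes by default.
    return "/".join(quote(segment, safe="") for segment in path.split("/"))

print(encode_file_path("files/my photo.jpg"))  # files/my%20photo.jpg
print(encode_file_path("files/été 2008.png"))  # files/%C3%A9t%C3%A9%202008.png
print(encode_file_path("files/a#b?.txt"))      # files/a%23b%3F.txt
```

The stored file name stays unchanged; only the generated link is encoded.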