This patch fixes three things:

The JS test for clean-urls-or-not is fixed, not found is -1, not 0.

String.replace() in JS only replaces the first occurrence, using a regular expression with '/g' replaces all.

Apache sees http://example.com/some//path as http://example.com/some/path, which we do not want. Like '&' and '#', '/' is double decoded by mod_rewrite, which is why clean URLs containing '%2F' are 404ed by Apache. To avoid both Apache annoyances, we double encode every second '/' when using clean URLs.

I have tested on both FireFox 2 and IE 6. If you want to test, go to an autocomplete field and type '//', '&', and '#' with clean URLs both on and off.

Comments

pwolanin’s picture

works as advertised in terms of encoding the path - viewing request in firebug console

I was able to verify the functionality using a free-tagging taxonomy field in the node form, since terms can include these special chars.

Stefan Nagtegaal’s picture

Version: 7.x-dev » 6.x-dev
Component: base system » javascript
Status: Needs review » Reviewed & tested by the community

Seems to work as intended..
RTBC (should be backported to 5.x as well, where this problem also occurs)

gábor hojtsy’s picture

Status: Reviewed & tested by the community » Needs work

Looks to me that "mod_rewrite's unescapes %-encoded ampersands," should simply be "mod_rewrite unescapes...", otherwise I don't understand what it means.

drumm’s picture

Status: Needs work » Reviewed & tested by the community
StatusFileSize
new2.2 KB

Trivial comment fix.

gábor hojtsy’s picture

Status: Reviewed & tested by the community » Patch (to be ported)

Thanks, committed to 6.x! Marked as "to be ported" to 5.x, as Stefan pointed out.

gábor hojtsy’s picture

Version: 6.x-dev » 5.x-dev
drumm’s picture

Version: 5.x-dev » 4.7.x-dev
Status: Patch (to be ported) » Reviewed & tested by the community
StatusFileSize
new1.46 KB

Only the common.inc part is relevant for 4.7.

drumm’s picture

Oh and I committed the full patch to 5.x.

gerhard killesreiter’s picture

Status: Reviewed & tested by the community » Fixed

applied

mfb’s picture

This patch seems to break the audio module's flash player.
http://drupal.org/node/162379
If anyone has ideas on a fix, feel free to suggest..

killes@www.drop.org’s picture

Priority: Normal » Critical
Status: Fixed » Active

this patch just broke a client's site. urlencoding of full urls is clearly broken.

mfb’s picture

Version: 4.7.x-dev » 5.x-dev

Assuming this is not tracked elsewhere in 5, I am moving this from 4.7 to 5.x-dev.

drumm’s picture

Unlike the main URL, the query string does not need the // prevention.

drumm’s picture

Version: 5.x-dev » 6.x-dev
Status: Active » Reviewed & tested by the community

Unless someone has a better idea, I am in favor of reverting this patch.

URL encoding in Drupal is a mess which I do not have the patience to solve now. The main issue is that when mod_rewrite changes a path to an argument, it de-urlencodes it. This is a problem for special URL characters, & and #. drupal_urlencoding the arguments of a URL should not happen, mod_rewrite does not de-urlencode them. However, we do not want basic urlencode() for arguments. When clean URLs are off, %2F is changed to /. This should happen for all encoded arguments.

drupal_query_string_encode() needs to be fixed, it incorrectly uses drupal_urlencode(). All of the urlencoding code in Drupal needs to be reviewed for correct use of the API; there are many places which do not use drupal_query_string_encode() and build their own query string.

For the record, I encountered this when using Drupal's autocomplete for a URL field. Autocomplete puts the query in the path and the // became /. With this patch reverted, any // will not autocomplete.

drumm’s picture

Status: Reviewed & tested by the community » Needs work

Actually, a straight revert won't be good since some bugs in the JS got fixed. Just the // replacement part needs reverting.

drumm’s picture

Status: Needs work » Needs review
StatusFileSize
new1.43 KB

Attached is a revert of the // encoding. I tried testing this with taxonomy's tag autocomplete and discovered that anything after the first slash is ignored, so the encoding of the second slash does not matter at all.

Here are a few points for and against revert:

+1: This fixes any modules which were broken.
-1: It breaks any module which now depends on the current code.
Audio module is the only one I know about, and it will not be affected since they reimplemented drupal_query_string_encode() and it does not call drupal_urlencode().

+1: It put thing back the way they were.
-1: Both ways are a mess.

-1: Some autocompletes theoretically become more broken.
There is a good chance they are broken anyway, they often just check one arg().

I will see what I can do for fixing the mess on HEAD only, with or without this patch.

gerhard killesreiter’s picture

All I see is that the 4.7 patch broke something which was working before, so I am in favour of revokign the relevant parts of this patch.

gábor hojtsy’s picture

Oh, uhm. Chx asked me to look into this, but I'd rather hand this decision off to Dries.

drumm’s picture

Status: Needs review » Closed (fixed)

I put a new issue up at http://drupal.org/node/178615 to fix the new problems I found.