drupal_query_string_encode() should not call drupal_urlencode()
| Project: | Drupal |
| Version: | 5.x-dev |
| Component: | base system |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | patch (to be ported) |
| Issue tags: | test case |
The query string should be encoded using rawurlencode() and not drupal_urlencode(). The latter causes double-encoding of the query string contents, which is not needed: the double-encoding is only needed for clean URLs due to issues with apache's mod_rewrite, and these issues don't affect the query string part of the page request.
Related issues:
http://drupal.org/node/178615 - includes the double-encoding problem, proposes a fix by making drupal_urlencode() more complex.
http://drupal.org/node/74070 - an old "fix" that I think was wrong, or is now wrong, and needs undoing.
http://drupal.org/node/301192 - workaround for the same bug, as found in the Printer module.
Note: the only place that drupal_urlencode() is called (correctly) from core is within the url() function. It might make sense to remove the drupal_urlencode() function and move its contents into url(). This would remove the temptation to call it from inappropriate places. Contributed modules should use url() to construct encoded URLs.

#1
Here's the very simple patch needed to fix the double-encoding problem.
#2
Here's an improved patch, including a comment within the drupal_urlencode() description to warn that the function shouldn't be used for query strings.
(For some reason diff thinks the change is in drupal_json(), but it isn't).
#3
Ok, please remove drupal_urlencode().
Also this will need to go to the current development branch (Drupal 7.x), and we will need to write a complete test case for this.
#4
Here's a patch for HEAD (1.797), not much different apart from the additional space in string concatenation on the second changed line. Hopefully I'm doing this correctly - I'm new to getting involved with Drupal core (I've been coding sites with Drupal 5 and 6 for a year or so now).
This just corrects query_string_encode(), leaving drupal_urlencode() as a callable function. I suspect there's a whole separate discussion needed as to whether drupal_urlencode() is redundant (and dangerous?) - unless by "please remove" you mean get rid of the function too?
Now I just need to investigate writing test cases! I found the problem with data in a query string and the standard Drupal sortable tables (which use drupal_query_string_encode() to generate the sort links in the headers).
#5
Here's another patch, with fixes for some other places where drupal_urlencode() is called on query strings.
I've also improved the function documentation for url() to make it clear when the query optional argument should be urlencoded or not. [I've a feeling the argument should never be encoded, but that would seem to be a fairly big task to change everywhere that calls url() with a pre-encoded and non-array query option.]
Not sure about the two remaining places where this sort of thing happens:
1) misc/drupal.js has a copy of drupal_urlencode(). We might also need to investigate usage of encodeURIComponent() in javascript code, to check it isn't being used on query strings.
2) modules/search/search.test uses drupal_urlencode(), but I don't know if drupalGet() also does encoding...
#6
Now that Drupal 7 requires PHP5, I think that drupal_query_string_encode() can be simplified a lot by using http_build_query().
#7
The patch looks good.
You are right about both points in #5.
1) The problem with drupal.js can be seen on the page admin/settings/date-time: Select “Custom format” and add "//" to the text field. Note how the example date below the field looks wrong.
2) It's true that drupalGet() encodes the path (via url()), so the drupal_encode() call can simply be removed. Your can verify the problem by modifying drupal_web_test_case::drupalCreateNode() to include some special characters in the node title.
I fixed these two things.
#8
The last submitted patch failed testing.
#9
Reroll.
#10
The last submitted patch failed testing.
#11
Reroll.
#12
Just marked the issue submitted at #298498: Drupal assumes mod_rewrite bugs which are not there a duplicate of this one. That involves the JS Drupal.encodeURIComponent() usage of which breaks URLs in l10n_client. Unfortunately when implementing the encoding code for l10n_client, I've followed the usage of Drupal.encodeURIComponent() from system.js, which turns out is bugos. This bugfix should also be backported to Drupal 6!
I'd also ask for a little doc update to Drupal.encodeURIComponent() to mention that it should not be used on URL components other then the path.
#13
Subscribing, I found this today in a site, especial characters in query values encoded twice.
#14
I renamed Drupal.encodeURIComponent() to Drupal.urlencode() to emphasize that it is closer related to drupal_urlencode() than to JavaScript's built-in encodeURIComponent(), and added a comment about its usage.
See also #284899: Drupal url problem with clean urls that suggests another workaround for the mod_rewrite limitation that eliminates the need to use different functions for encoding the path and the query string.
#15
The last submitted patch failed testing.
#16
Reroll.
#17
>The latter causes double-encoding of the query string content.
Have a look at the related issue #385918: destination parameter gets two times urlencoded.
As I've noted there I think it gets double-encoded because it calls drupal_url_encode/url_encode twice on it.
#18
#19
The patch looks great. Now we just need a test case! :)
#20
I added a few tests. I added them to the class that already contains the tests for l(), as the functions seem related. Future tests for url() may be added to this class too.
#21
The last submitted patch failed testing.
#22
Reroll.
#23
The last submitted patch failed testing.
#24
Reroll.
#25
Let's rename drupal_urlencode() to drupal_encode_path(), and stop the confusion. The patch itself looks good.
#26
Renamed drupal_urlencode() to drupal_encode_path(), and the JS function Drupal.urlencode() to Drupal.encodePath().
This should be added to the upgrade docs before this bug is marked as fixed.
#27
#28
Nice! Let's get this in.
#29
The last submitted patch failed testing.
#30
HEAD is broken : http://testing.drupal.org/node/35
#31
I'm ready to roll a D6 backport as soon as this goes in.
#32
The last submitted patch failed testing.
#33
testbot is right - patch does not apply. Re-rolling
#34
re-rolled - must have just been whitespace changes.
#35
Makes complete sense, the test bot it happy, let's get this in.
#36
This makes sense indeed. Committed.
#37
getting ready to backport minus the function name changes.
#38
#40
Looks good.
#41
Committed to Drupal 6, thanks!
#42
backport for D5?
http://api.drupal.org/api/function/drupal_query_string_encode/5