Hi
I'm trying to use a %token in the Redirect Location Custom URL box to send information to an external site, but whatever I do, it gives the error message "The entered URL is not a valid address". As a test, I put a fixed value in the URL (without a %) and the problem went away.
I've looked in the issues list but as far as I can tell, what I am trying to do should be ok.
I'd be glad of any advice.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

quicksketch’s picture

Category: support » bug

After re-reading your request a few times I found the circumstances around how to reproduce (though you described it accurately). This problem only occurs with full URLs (such as http://example.com/sample?nid=%nid) not with relative ones (such as node/%nid/done). This is because absolute URLs are tested against the valid_url() function, which will of course not work with a URL that contains % signs because they're not allowed in URLs other than as encoding characters.

Perhaps we could run _webform_filter_values() on the Redirect URL before running against valid_url()? But then we may end up with empty values when $submission is not passed in... I'll have to take a look at this.

quicksketch’s picture

Title: Use of %tokens in redirect location url » Use of %tokens in external URL redirect locations not allowed
class47’s picture

Thanks. The reason for a full URL is that it needs to go to a secure (https) URL . However, now I know it is ok for relative urls, I can use a workaround.
Thanks for the helpful reply.

kevinwalsh’s picture

@class47: Just wondering what your workaround was?

class47’s picture

Kevin
Just seen your question. Sorry for the long delay.
The workaround was to use tokens that did not involve % characters. The tokens I did use produced a serial number for the message which was added to the redirection URL and was also included in the email message to the Webform recipient. It almost achieved what I had originally wanted to do with % tokens, and was good enough.

Mike

kevinwalsh’s picture

Thanks!

drews2’s picture

This is also a problem in Drupal 7. I need to use URL parameters like %value[name] in order to pass the webform values to a Python script that generates maps from NetCDF data. Is a fix on the way here?

What is the list of "tokens that did not involve % characters."?

rick.kowal’s picture

in case anyone still needs a fix to this, i found a workaround.

Create a hidden form variable in your webform of the external URL you want to submit to. Then use that token in the URL.

For example, I made an easy "Find a Kroger Store Near You" box which simply submits to google maps:

http://maps.google.com/?q=Kroger+near+48375

My Custom URL in Location Settings is:

%value[url]/?q=Kroger+near+%value[zip]

of course, the hidden url field was set with the value of "http://maps.google.com"

works like a charm.

stevengraff’s picture

I tried rick.kowal's workaround but ran into a problem when the value I pass includes a space.

I'm using a webform to gather a visitor's address, so I can re-direct to a google map showing a route to my office. It works fine if the city name is something like Detroit or Farmington, but Oak Park fails. I hit the submit button and the screen just flickers once.

btw, maybe no surprise, but if I enter the city into the form as Oak%20Park, well, everything works fine.

Actually, thinking it through, the city name could be worked around by just sticking to the zip code; the real problem comes from passing in the address, i.e. 125 Main Street.

Any workarounds come to mind for this situation?

I'm using 7.x-3.17.

csc4’s picture

@stevengraff - thanks for a brilliantly simple workaround - perhaps worth documenting somewhere on the module pages/faq to save others the runaround until a full fix is possible?

I've spent 4 and a half hours trying to fix this and was about to bodge the test in webform.pages.inc to exclude URLs with a % in - but really didn't want to! When I found your post http://drupal.org/node/1244072#comment-5946406

Thanks! Just wish I'd found it yesterday!

gaele’s picture

Version: 6.x-3.11 » 7.x-3.x-dev

Still a problem.

Thank you rick.kowal for the workaround!

ubun2Junky’s picture

stevengraff;

Did you find a solution to the space issue? I'm having the exact same problem, however my field requires that I enter terms with spaces. Anyone? Thx.

L

stevengraff’s picture

I'm sorry; I did not.

dotbjorn’s picture

I've written a patch for this very issue for our site.
It creates new tokens prefixed with %urlencode_ that contain URLencoded token replacements, this is only done when creating redirect URLs, it also patches the check for correct URLs on webform create/edit and the help text.

Feel free to use/ignore as you see fit :)

quicksketch’s picture

Title: Use of %tokens in external URL redirect locations not allowed » Use of tokens in external URL redirect locations not allowed in some cases and not URL Encoded

Thanks for the patch dotbjorn. I'd hate to add another parameter to _webform_filter_values(). I'm also not keen on adding new tokens specifically that just URL Encode the values. Most users aren't going to know when it's appropriate to URL Encode a variable.

Marked duplicates:

#1803490: Custom URL Encoded?
#1946460: Allow redirection to custom URL with query parameters

dotbjorn’s picture

OK. Thanks for looking at this.

Your comments make sense, I could easily remove the new parameter to webform_filter_values, that parameter is there mostly because I initially wrote the patch to URL encode all tokens returned, and then realised that this broke some of our existing webforms which depended on the old behavior. It could be removed with only a very slight perfomance penalty.

But I'm somewhat unsure how else to provide URL encoded tokens. Simply replacing all tokens going into URLs with their URL encoded equivalent is problematic, as some users are putting URLs into hidden parameters, and could mean broken redirects for people updating the module.

If you have any suggestions as to a better solution I'm more than happy to re-roll the above patch so that you can include it in the module. Otherwise I'll have a think when I have some time (whenever that might be...). I do think that providing a way to urlencode parameters in redirects is pretty important if redirect based on parameters is being offered as a feature.

quicksketch’s picture

I think what we'd need to do here is explode out the URL that's been entered by the user using parse_url, then we should separately run _webform_filter_values() on the path and query string parts, where we URL encode the query string parts after token replacement but don't do that for the query string. Because this bit of logic will get a little messy, I think making a dedicated function for handling it would be best, i.e. _webform_filter_url() that takes most of the same arguments as _webform_filter_values() but works explicitly on URLs alone.

swirt’s picture

This is how I just resolved it on the 7.x-4.x alpha6 branch in webform.module
The issue is that the redirect is just being passed as a string, so any query parameters and or fragment (hash) are not getting handled correctly by the form state redirect which is set to handle an array.

Just before the line

 $form_state['redirect'] = $redirect;

Add the following:

//process the redirect string into an array
  if (is_string($redirect)) {
      $aOldUrl = parse_url($redirect);
      //Assemble url components
      $sBaseUrl = (!empty($aOldUrl['scheme'])) ? $aOldUrl['scheme'] . '://' : '';
      $sBaseUrl .= (!empty($aOldUrl['host'])) ? $aOldUrl['host'] : '';
      $sBaseUrl .= (!empty($aOldUrl['path'])) ? $aOldUrl['path'] : '';
      $sFragment = (!empty($aOldUrl['fragment'])) ? $aOldUrl['fragment'] : '';
      $aQuery = array();
      if (!empty($aOldUrl['query'])) {
        parse_str($aOldUrl['query'], $aQuery);
      }

      $aNewUrl = array(
          $sBaseUrl,
          array(
            'query' => $aQuery,
            'fragment' => $sFragment,
              ),
      );
    $redirect = $aNewUrl;
  }

My apologies for not rolling a proper patch, but my branch doesn't match so the patch fails.

quicksketch’s picture

Thanks @swirt, I hadn't thought of taking that approach. Using an array instead of a string for the $form_state['redirect'] would solve our encoding problems too. Rather than manually doing it ourselves, Drupal will do that for us if we pass an array. The only additional thing we need to avoid is any double-encoding in the URL. I think we should only URL-encode the token parts and assume that any hand-typed strings are already urlencode'd

swirt’s picture

Good point about the double encoding. Rather than worry about where the first encoding came from (token or hand typed) I think it would work to just treat both the path and the query as "possibly encoded" and decode them both. I might be missing some possible fringe cases, but I think this would handle it.

//process the redirect string into an array
  if (is_string($redirect)) {
      $aOldUrl = parse_url($redirect);
      //Assemble url components
      $sBaseUrl = (!empty($aOldUrl['scheme'])) ? $aOldUrl['scheme'] . '://' : '';
      $sBaseUrl .= (!empty($aOldUrl['host'])) ? $aOldUrl['host'] : '';
      $sBaseUrl .= (!empty($aOldUrl['path'])) ? urldecode($aOldUrl['path']) : '';
      $sFragment = (!empty($aOldUrl['fragment'])) ? $aOldUrl['fragment'] : '';
      $aQuery = array();
      if (!empty($aOldUrl['query'])) {
           $aOldUrl['query'] = urldecode ($aOldUrl['query']);
           parse_str($aOldUrl['query'], $aQuery);
      }

      $aNewUrl = array(
          $sBaseUrl,
          array(
            'query' => $aQuery,
            'fragment' => $sFragment,
              ),
      );
    $redirect = $aNewUrl;
  }
dotbjorn’s picture

Okay... Sorry for taking so long to respond on this. I've been a little busy.

I've tested the code above, posted by swirt. It does generate valid URLs, however it doesn't really correctly handle URLencoding of special characters, &, =, etc.

If a user enteres "something&something" into a text field, the generated URL will have a parameter named "something". Which is not ideal.

The problem is really that we're trying to second guess the admin's intentions when building the form. Which is a tricky thing to do well.

I've rolled 3 patches for this, in the hopes of seeing at least some progess on this issue. My preference would still be to allow site administrators to decide when tokens get url encoded. I can see the downsides to this, but it's the only way to ensure correct behaviour and satisfy my more OCD developer instincts :)

Patches to follow...

dotbjorn’s picture

Patch 1:

Simply provide tokens prefixed "urlencode_" and allow the site administrator to decide when to encode.

This removes the extra parameter added to _webform_filter_values in my first patch. It's otherwise very much the same.

dotbjorn’s picture

Patch 2:

This is a compromise solution.

We attempt to semi-intelligently guess the intentions of the administrator.

If an URL has been entered with a valid path and some kind of query string (as determined by parse_url), we attempt urlencoded token replacement on the query string, and simple token replacement on all other parts of the URL.

If the above does not produce a valid URL, as determined by valid_url, we fall back to swirt's method above.

This adds an extra parameter to _webform_filter_values to do the url encoding. I cannot see how else to acomplish that task without duplicating the function or putting the parameters into a big array.

dotbjorn’s picture

Patch 3:

This is basically the code posted by swirt, with some checks for validity modified in webform.pages.inc.

dotbjorn’s picture

Status: Active » Needs review

I hope the above are of some help. Happy to anwser any questions...

quicksketch’s picture

Status: Needs review » Needs work

Thanks dotbjorn. These patches could use some work, each for the following reasons:

Patch in #23: We're not going to add any more parameters to _webform_filter_values(). Just filter the tokens individually and then urlencode the result.
Patch in #24: This one looks pretty good but it still modifies _webform_filter_values(). It also includes a second patch in the patch itself.
Patch in #25: This has a complete diff between Webform 2.x and 3.x(?) I can't tell what's going on in this one.

So in short, #24 looks best but it should be trimmed down to only include the new _webform_filter_url() function and not modify the signature of _webform_filter_values().

dotbjorn’s picture

Status: Needs work » Needs review
FileSize
4.91 KB

OK...

Sorry about the third patch above. Don't know what happened there. Finger trouble on my part no doubt.

I've modified the second patch so that the signature of _webform_filter_values is not changed. This is done by filtering out all possible tokens from the query string using regex and filtering and urlencoding each one individually. Less robust that way, but should work well enough.

How do we feel about this one?

quicksketch’s picture

Thanks, this patch looks like it's on the right track. The abstraction is right where it should be, IMO. The implementation still looks pretty complex though. I'm going to need to spend some time combing through this implementation and see if it can be cleaned up.

ianthomas_uk’s picture

Issue summary: View changes
Status: Needs review » Needs work

#17 I think we should only URL-encode the token parts and assume that any hand-typed strings are already urlencode'd

This makes sense, the non-token parts are likely to include special characters which shouldn't be encoded (most notably slashes, ampersands and question marks).

#20 I think it would work to just treat both the path and the query as "possibly encoded" and decode them both

That causes problems when you do want something to be double encoded, for example when one of your query string parameters contains a URL encoded as a value.

Review of #27

  1. +++ webform/includes/webform.pages.inc	2013-08-19 15:42:18.000000000 +0100
    @@ -264,7 +264,7 @@
    -    elseif (strpos($redirect_url, 'http') === 0 && !valid_url($redirect_url, TRUE)) {
    +    elseif (strpos($redirect_url, 'http') === 0 && !valid_url(_webform_filter_url($redirect_url, TRUE))) {
    

    I think this is needed to prevent unparsed tokens from looking like url-encoded entities.

  2. +++ webform/webform.module	2013-08-19 15:49:46.000000000 +0100
    @@ -2576,6 +2576,89 @@
    +  // Otherwise, we attempt to reconstruct the correct URL after the fact
    +  $redirect_url = _webform_filter_values($redirect_url, $node, $submission, NULL, FALSE, TRUE);
    

    I think this fallback defeats the much of point of the rest of the code.
    If $request_url was
    http://example.com?foo=[bar] and bar was 'false&letmein=true' then the returned url would be
    http://example.com?foo=false&letmein=true

  3. +++ webform/webform.module	2013-08-19 15:49:46.000000000 +0100
    @@ -2576,6 +2576,89 @@
    +function _webform_assemble_parsed_url($parsed_url) {
    

    Nothing in this is webform specific, I think we should be using url() for this.

I think this will also need the patch from https://drupal.org/node/2032793

This allows tokens to be passed as the values of query string parameters. Do we want to allow tokens to be used in the URL path or the names of query string parameters? If so, what are the security implications and how do we escape them?

ianthomas_uk’s picture

I no longer think the code quoted in #29 point 1 is necessary in Drupal 7. As far as I can tell, this was to handle tokens containing % characters (such as %site-name), but tokens in D7 don't need to be URL escaped (the example becomes [site:name]).

I'm working on a combined, simplified, patch now.

ianthomas_uk’s picture

Status: Needs work » Needs review
FileSize
5.9 KB

Here's my revised patch the incorporates the changes from #2032793: When adding a query string to redirect url, the ? and = are converted and url is borken

The first two hunks for webform.module are mostly refactoring to bring to URL handling closer to the code that needs it and not even run it if redirect_url is one of the magic constants ( or ). The important changes are the switch from webform_replace_tokens to webform_replace_query_string_tokens and the call to drupal_parse_url from the patch on #2032793: When adding a query string to redirect url, the ? and = are converted and url is borken.

The third hunk is implementation for the new function. This doesn't need any explicit urlencoding, as that is handed when the URL array is turned back into a URL.

quicksketch’s picture

Thanks @ianthomas_uk for this patch. It's looking really good. I'd suggest that we pare down the parameters for webform_replace_query_string_tokens() (we don't need $email or $sanitize), but otherwise this concise patch does what we need it to. Anyone else needing this functionality, please test out this patch and report your findings.

ianthomas_uk’s picture

I guess that makes sense. Here's a patch to remove those variables.

Fixing version. This was created against 7.x-4.x-beta2+1.

quicksketch’s picture

This patch looks great, thanks @ianthomas_uk! One more minor thing:

+  $token_data = array();
+  if ($node) {
+    $token_data['node'] = $node;
+  }
+  if ($submission) {
+    $token_data['webform-submission'] = $submission;
+  }
+
+  $parsed_redirect_url = drupal_parse_url($redirect_url);
+  if (!empty($parsed_redirect_url['query'])) {
+    foreach ($parsed_redirect_url['query'] as $key => $value) {
+      $parsed_redirect_url['query'][$key] = token_replace($value, $token_data, array('clear' => true, 'sanitize' => FALSE));

Unless I'm missing something, this is identical functionally to just:

+  $parsed_redirect_url = drupal_parse_url($redirect_url);
+  if (!empty($parsed_redirect_url['query'])) {
+    foreach ($parsed_redirect_url['query'] as $key => $value) {
+      $parsed_redirect_url['query'][$key] = webform_token_replace($value, $node, $submission);

Other than that, things like great. I'll reroll and commit here shortly.

quicksketch’s picture

Hm, additionally, shouldn't tokens be supported in some capacity for the path component as well? It's not uncommon to have a redirect URL that would be something like "node/[submission:values:parent_nid]".

quicksketch’s picture

Here's a rerolled version, which does token replacement in the "path" portion of the URL as well, even though those tokens are *not* currently URL encoded. It'd be better to allow tokens that may not work in some situations than to break sites that may already be depending on tokens in paths that don't need to be encoded.

ianthomas_uk’s picture

Use of webform_token_replace looks fine.

I avoided that because I was concerned that there could be a security issue if a token with a user-provided value was used in the path. I can't think of any specific exploits, it just seemed risky.

That could be avoided by restricting the permitted characters, but I think that could be confusing and sometimes unreliable.

quicksketch’s picture

I was concerned that there could be a security issue if a token with a user-provided value was used in the path

If users could type in a particular path into the URL and have it be a security exploit, that in itself would be a problem. There could potentially be tokens that contained sensitive data (i.e. a session ID or cookie information), but I don't think there are any such tokens provided by core or token module. In any case that'd be an issue with the token, not with Webform for allowing you to use it.

If this looks all good, I'm happy to put it into the 4.x branch. Can you find any problems with the latest version of the patch?

ianthomas_uk’s picture

My concern is that you're potentially allowing the unpriviledged visitor to choose the destination path. See https://www.owasp.org/index.php/Top_10_2013-A10-Unvalidated_Redirects_an... for more details about the sort of exploits possible. The problems are almost entirely mitigated by the fact that we're only talking about the path rather than the full URL, which greatly restricts what an attacker can do.

For an example of a theoretical attack, imagine that you've got an online store selling personalised items (cafepress style) and you have an order form that people can use to submit their artwork etc to get a quote. Because you sell to lots of sports teams you also offer public blog hosting.

After submitting the order form, you might redirect users based on the product they were interested in:

/[submission:values:product-type]/[submission:values:product]/

intended to redirect to

/apparel/hoodies

An attacker then crafts a link to the order form such that product type is 'blogs' and product is 'mallory', so the user is redirected to /blogs/mallory. Mallory has one blog post, which thanks users for their order and and directs them to email their payment details to mallory@example.com

This specific example would probably be mitigated by other factors (like how you get product-type to be blog), but should give you an idea of where my concerns are. Is there anyone who is an expert in this field that we could ask for an opinion?

There could potentially be tokens that contained sensitive data (i.e. a session ID or cookie information), but I don't think there are any such tokens provided by core or token module.

That the responsibility the webform author to choose appropriate tokens, there's not much we can do.

Other than this potential security concern, I'm happy with the patch on #36.

greggles’s picture

There are two potential problems I can think of: http header injection and redirecting people to an unintended destination.

The http header injection problem should be OK, I believe, because drupal_goto protects against that.

On redirecting to an unintended destination...the most simple case feels like not a big deal. The user is entering data into a webform and will get redirected to some of those values based on their entries. So...they will go to the place they have selected, in effect.

Further, the site builder has a responsibility (to usability if nothing else) to have content at every possible landing page to avoid 404s and modules like pathauto prevent creating new urls that clobber existing urls.

A more complex case is if webform values are getting preset from url parameters and are maybe hidden in a fieldset, then I can see how this would be considered a risk.

It seems reasonable to me to warn people of the risks of using tokens in their form (usability and security) and then call this good.

quicksketch’s picture

Status: Needs review » Fixed

Cool, okay well not changes needed in this particular case then. This patch comes back down to just URL-encoding tokens in the query string, which is the only actual change here. Thanks @greggles for your valued input here!

Committed and pushed #36.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

fenstrat’s picture

Version: 7.x-4.x-dev » 8.x-4.x-dev
Assigned: Unassigned » fenstrat
Status: Closed (fixed) » Patch (to be ported)

Needs porting to 8.x-4.x.

fenstrat’s picture

Version: 8.x-4.x-dev » 7.x-4.x-dev
Assigned: fenstrat » Unassigned
Status: Patch (to be ported) » Fixed

Committed and pushed 2a1a005 to 8.x-4.x. Thanks!

  • Commit aec9b2a on 8.x-4.x by fenstrat:
    Issue #1244072 by ianthomas_uk, quicksketch: Use of tokens in external...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

John Pitcairn’s picture

Status: Closed (fixed) » Needs work

This change prevents the confirmation from being displayed as a system message when there is no redirect, because $redirect_url is never set. See #1414176: no confirmation message after form has been sent successfully (no redirect, just pagereload) comment #9-10 for the symptom...

John Pitcairn’s picture

John Pitcairn’s picture

And it was fixed again in #2199925: Message not displayed when Redirection location is 'none'. I was using RC3, which has the same date as dev but does not include the fix. Apologies for the noise.