Link module separates out the query string and fragment identifier in _link_sanitize(). When this happens, it does not add them back before saving into the database, resulting in these parts of the URL being lost. The attached patch corrects this. The patch also removes some trailing whitespace.

Comments

dqd’s picture

Assigned: Unassigned » dqd

@ Liam Morland : thanks for pointing out! First I wasn't able to reproduce. Weird that it first doesn't occur. I've tried with a google.com/search?cgi string and all worked fine. But suddently ... two days later the error you describe came up. Strange.

thanks for the patch! I will test around and hope that some others also can join to confirm that it works with no side effects. Then I will commit it immediately!

Thanks to Liam Mordan++

dqd’s picture

No guess what: the thing becomes even more strange - I thought I just should test it again by entering the ?link again and looking if it breaks again, but then ... no ... it works again. Without any modification. I can't get it. No cache enabled. The only thin I remember is that I have installed another module in between in the first scenario. It was editable fields. I think I have to test more ...

liam morland’s picture

We are using Link in combination with Workbench Moderation. We've found that drafts can be saved for content types which have Link fields and the URLs are fine. However, when they are published from the View tab (not from Edit or Moderate), the query string is removed. I don't fully understand why this happens, but the patch above fixes it and, so far, doesn't appear to have any side-effects.

liam morland’s picture

I should add that I noticed that _link_sanitize() is only called under some conditions in our setup. When it is not called, URLs are stored unchanged. When it is called, the query string and fragment identifier are separated out but not put back together.

dqd’s picture

ah, I see! Ok let me do some more checks on that.

lsolesen’s picture

Also using workbench moderation and has the same issues.

lsolesen’s picture

Priority: Normal » Major
dqd’s picture

Priority: Major » Normal

please stop tagging issues as major randomly here as along as there is no data loss or WSOD. Review soon ...

karlshea’s picture

Status: Needs review » Reviewed & tested by the community

We are using workbench_moderation and I can also confirm the patch fixes the problem.

dqd’s picture

Thanks KarlShea, Thanks Liam

Patch on the way to HEAD

summit’s picture

Hi,
Am I correct this is what needs to be added?

@@ -346,6 +346,14 @@ function _link_process(&$item, $delta = 0, $field, $entity) {
       unset($item['url']);
     }
   }
+
+  // Re-assemble URL from parts separated out in _link_sanitize.
+  if (!empty($item['query'])) {
+    $item['url'] .= '?' . http_build_query($item['query']);
+  }
+  if (!empty($item['fragment'])) {
+    $item['url'] .= '#' . $item['fragment'];
+  }
karlshea’s picture

That looks correct, that was all that I changed in my copy of the module.

I had a semi-related fix in another module, and they changed http_build_query to drupal_build_query.

Jevedor’s picture

Its worth noting that this problem also exists in the link module for drupal 6. the same fix also resolves it however it looks like $item["query"] comes over as already processed into a string so you dont need to wrap it in http_build_query().

dqd’s picture

Status: Reviewed & tested by the community » Fixed
StatusFileSize
new522 bytes

Thanks Liam Morland for the patch. But for the future, please don't mix things up. Provide a patch with only issue depending code next and seperaate the rest for another issue. It makes it easier to follow, merge or even apply patches with a line offset without needing to reroll on changed --dev snapshots-

Thanks to all for report and contribution. Here's the D7 patch (D6 anyone?).

Patched, reviewed, tested, committed and pushed to HEAD.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

michaek’s picture

This patch doesn't seem to solve the issue against the current stable release (beta1). The $item array values set in _link_sanitize don't seem to be available in _link_process, so the full URL cannot be reconstructed.

I've also tested with the current 7.x dev release, and the query string and anchors are still truncated.

Is there a good reason for a URL to be stripped down to its path by _link_sanitize? Reconstructing the URL after stripping it down seems a little odd.

michaek’s picture

StatusFileSize
new420 bytes

check_plain is what's stripping the URL. Assuming we need to use check_plain on the input, how about this patch instead?

michaek’s picture

The patch I added in #17 doesn't like URLs of the form /some/path - the url() function will prepend the Drupal base path unless you pass external=TRUE in the options array. So: don't consider it an appropriate solution as it stands.

michaek’s picture

Status: Active » Closed (fixed)
StatusFileSize
new789 bytes

Sorry for the barrage of comments here.

When I though check_plain had something to do with it, I misread what was happening in that function. Clearly, it's the two substr() calls that are removing the query and the fragment. What's the reason, though, to return the URL stripped of the query and the fragment instead of the full URL?

The attached patch merely comments out those two lines. I don't know what this might break, so I apologize in advance if it's a Bad Idea.

UPDATE: It's a Bad Idea. Don't use this patch.

naeluh’s picture

I am also having an issue with this.
I will use the patch and report my results

heres where I posted my issue http://drupal.org/node/1627588

thanks

naeluh’s picture

Status: Closed (fixed) » Active
naeluh’s picture

I am trying to get this to work but I am unfortunately having no luck.
My urls are still getting trimmed and if I use the patch in #19

My urls become decoded for some reason

It turns my url from -

http://www.google.com/url?sa=X&q=http://kid-stories-online.blogspot.com/2012/05/win-it-anastasia-beverly-hills-5.html&ct=ga&cad=CAcQARgAIAEoATAAOABAu5qJ_gRIAlgAYgVlbi1VUw&cd=4jsrVpUezYY&usg=AFQjCNHRx6DQILIhsFU99gwIYh3NMBLerw

to

http://www.google.com/url?sa=X&q=http://kid-stories-online.blogspot.com/2012/05/win-it-anastasia-beverly-hills-5.html&ct=ga&cad=CAcQARgAIAEoATAAOABAu5qJ_gRIAlgAYgVlbi1VUw&cd=4jsrVpUezYY&usg=AFQjCNHRx6DQILIhsFU99gwIYh3NMBLerw

All the & become & causing the url to be incorrect.

Please advise thanks !

naeluh’s picture

Update: I applied the patch from this issue page

http://drupal.org/node/1309658#comment-5443946

and I still getting the same error

all the & are changing to &

Please I am really eager to get this fixed

thanks

michaek’s picture

I'll take another look at this. I did mention that I had some concerns that what I was proposing was a Bad Idea. :)

Could a maintainer please weigh in on this? The module does not support many forms of valid URL, at the moment.

michaek’s picture

Status: Closed (fixed) » Active

I'm not seeing this issue any longer (after reverting the patch from #19), but since the code hasn't changed I feel like it's unlikely that the problem is actually gone. Bottom line: don't use the patch I provided.

naeluh’s picture

Hi,

@michaek

I understand the situation. I am aware of your solution as it was on another issue.
The patch on #14 works.

The problem is as I stated in my post.

the

amp; gets added into the code 

My issue is still as I stated in http://drupal.org/node/1321482#comment-6105540

Please look there that is the problem it seems that amp; is not encoding for some reason

I am extremely eager to fix this but my background is sort of limited to fix this type php url validation

I think that the query and fragment stripping in this module is effecting my issue because without the patch this does not happen so it is probably an issue there.

It making my google alert urls invalid therefore they do not work when clicked

Please advised I will test any new patch and report here thanks!

Bevan’s picture

naeluh’s picture

@Bevan

Do you think that your patch might wrk to solve my issue ?

please see http://drupal.org/node/1321482#comment-6105540

thanks !

liam morland’s picture

Status: Active » Fixed

The problem I originally encountered was a bug in Workbench Moderation, fixed here: #1245590: Moderation form should not use altered view node. If there are problems not related to Workbench Moderation, then those are different issues and should be put in their own ticket.

naeluh’s picture

Okay I felt as though it was same issue, I was having trouble with queries and fragments
I will create a new issue.

thanks

rypit’s picture

Confirming that this is caused by workflow moderation. See patch 13 for solution. Orignial issue queue #1245590: Moderation form should not use altered view node.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

colepacak’s picture

Issue summary: View changes
StatusFileSize
new1013 bytes

I ran into the same issue on 7.x-1.x and was unable find a relevant patch or one that successfully applied (may have been some line ending issue).
So here's an updated patch that refactors _link_sanitize:

liam morland’s picture

This issue is over three years old. If you have observed a problem, please open a new issue. You can make this issue be the parent if they are related.

gabriel.achille’s picture

For the record: I noticed as well that the bug have been reintroduced in 7.x-1.3... and re-fixed in #2367069: Link 1.3 removes the query string from field tokens which have been committed to dev on the 2014-Nov-04. Hence until 7.x-1.4 is released, using the dev version will be enough to fix this issue.

mrtndlmt’s picture

Patch works for me !! Thanks a lot!