A contributor on one of my sites just discovered what seems a bug. Here is how to reproduce it:
- work on a drupal site, say
http://www.drupal.org/
- create a post, say a story
- prior to submitting, define the "URL path settings" : do not read the description that says these must be local, and type in some absolute URL, like
http//:www.example.com/
. - submit
At this point, you find yourself on http://www.drupal.org/http%3A//www.example.com
, which causes a "page not found" error.
This seems to be a bug: absolute paths should probably cause a post validation error. Just because users can't read doesn't mean sites shouldn't give some clues instead of bailing out.
Comment | File | Size | Author |
---|---|---|---|
#51 | path-alias-test-only-103680-51.patch | 2.38 KB | drupal_was_my_past |
#51 | path-alias-with-test-103680-51.patch | 3.83 KB | drupal_was_my_past |
#48 | path-alias-103680-48.patch | 2.43 KB | Désiré |
#46 | path-alias-103680-46.patch | 2.39 KB | drupal_was_my_past |
#41 | trim-alias-no-dots.patch | 2.15 KB | rsaddington |
Comments
Comment #1
fgmA possible solution would be to sanitize aliases so that incorrect urls like absolute ones are converted to something that will actually be recognizable when requested and point to the same page.
Comment #2
ChrisKennedy CreditAttribution: ChrisKennedy commentedAgreed that this should be fixed; I had it happen to me once or twice when I just started using drupal.
Comment #3
Darrell CreditAttribution: Darrell commentedIn ver 4.6 I was using absolute urls as path aliases for some static html pages. When I upgraded to 4.7.4 that no longer worked but there was no help given to fix it.
What I found that worked for me is this.
Drupal resides in http:www.mysite.com/drupal. The static page resides at http://www.mysite.com.
The old absolute url path was http://mysite.com/index.html
The new url path that works is ../index.html
Comment #4
ChrisKennedy CreditAttribution: ChrisKennedy commentedRather than return an error, this patch silently removes any leading or trailing forward slashes. I think this is the best solution in terms of usability. Without this patch it can be troublesome to fix nodes given an incorrect alias, because the links to the aliased url will be broken too.
I tested with http://www.example.com and it worked fine on FF and IE without any patch, so I didn't include a fix for that input.
Comment #5
ChrisKennedy CreditAttribution: ChrisKennedy commentedDang, forgot to attach the patch again.
Comment #6
Darrell CreditAttribution: Darrell commentedSo am I being told that there is no way to do an absolute URL to another site?
I tried putting a link to another site of mine using fill HTML and it was converted to a local link.
What is going on?????
Comment #7
Darrell CreditAttribution: Darrell commentedForgot to say I am using 4.7.4.
At this rate, every time I upgrade to the latest version I loose at least one thing that I need. My last upgrade I lost the album module. This time I lost absolute URLs, authenticated user and all of the usable info under the status column in the users list.
I am beginning to think that upgrading is not a good idea.
Comment #8
fgmDarrell, this is not what is at point here: the problem being discussed is that you should not create a page on site A and set for it an alias residing on site B, since site B would normally have no way to redirect you to the original page on site A.
This has nothing to do with links in post content.
Comment #9
drummComment #10
keith.smith CreditAttribution: keith.smith commentedpatch no longer applies
# patch -p0 < path_absolute.patch
patching file modules/path/path.module
Hunk #1 FAILED at 128.
1 out of 1 hunk FAILED -- saving rejects to file modules/path/path.module.rej
Comment #11
Jody LynnNew patch for HEAD. This patch trims leading or trailing forward slashes, and removes the warning from the field description. It seems that absolute paths no longer break anything anyway.
Comment #12
moshe weitzman CreditAttribution: moshe weitzman commentedlooks good. i have not tested.
Comment #13
alasda CreditAttribution: alasda commentedVerified issue
Changed url to /my/page/
Upon viewing, was directed to my/page
Editing the page also shows corrected my/page URL
Warning message did not appear in URL setting directions
Ready for commit
Comment #14
Gábor HojtsyErm, what happens to http://www.example.com/ when trim-ed with /? IMHO http://example.com Better? How does this solve the issue?
Comment #15
Jody LynnI might misunderstand your question but I think maybe the confusion is that the issue has morphed and is now mistitled.
The original issue was about absolute urls, but in head that issue is already fixed and no longer breaks anything. So actually this patch is to fix relative aliases that are misentered, like if someone enters a path /node/14, so that it will just fix it rather than having to include fineprint to warn you not to do it.
Comment #16
Gábor HojtsyAh, good titles matter a lot. So what about adding a comment to that effect to the trim(). I remember we were puzzled on some parts of this function before, so it is always good to have this documented. I'd add something along the lines of
// Remove possible slashes from both ends to avoid invalid aliases.
Comment #17
ChrisKennedy CreditAttribution: ChrisKennedy commentedYep, I had a comment in my original patch...
Comment #18
Gábor HojtsyI'll get to massaging this a bit later, but it looks good (although it changes a string, it fixes a quite often encountered usability issue).
Comment #19
PanchoIf we anyway have to change the string, we should put improve on the wording:
In path_admin_form():
In both occurrences:
about
then)?Proposal:
"Specify a path alias by which this page can be accessed. Use a relative path. For example, type <kbd>about</kbd> to refer to an about page."
and
"Optionally specify a path alias by which this node can be accessed. Use a relative path. For example, type <kbd>about</kbd> when writing an about page."
Also it is very misleading that the module is called "Path" (path.module) and the admin page on the other hand "URL aliases". I always intuitively search for a "path" menu entry, then scroll the list down and once I find "URL aliases", I remember: "Ahh, that was the misworded one".
Comment #20
Gábor HojtsyBack to discussion then.
Comment #21
PanchoSome comments would be great, since we are mostly done!
All that's left is the correct naming conventions and the wording of two strings (see #19).
So let's start with the most general issue:
The module's name is "path". Right now, the functionality of path aliasing is inconsistently named "URL alias" (mostly), "Path alias", "alternative path".
While core shouldn't generally follow contrib - the important pathauto module is not called URLauto, but pathauto and it consistently uses the terminology "path alias".
Now I think we should make up a decision whether in the functionality be consistently called "Path alias" or "URL alias" in core... some opinions on this, please!
Comment #22
Jody LynnI made a new patch with Pancho's text changes. I agree with his points about the text.
Comment #23
Pancho+1 for #22, though this only covers part of the naming issues.
Mainly I was criticizing the inconsistant usage of "URL alias" vs. "Path alias". We would need to unify this to "Path alias" IMO. The enclosed patch covers this. Unfortunately we'd need to rename two permissions, change four more strings and remove one. While this might sound inacceptable, this would IMHO be a large UI improvement and finally streamline the inconsistant naming.
Don't know if we should split that off this issue here, as we moved away from the starting point.
As the trim-slash part we all agree upon should be committed asap, this shouldn't hold back the patch in #22 anyway.
Comment #24
Jody LynnComment #25
Robin Monks CreditAttribution: Robin Monks commentedNo longer applies to HEAD:
This is also a large patch that is hard to review, please consider splitting up this patch to make it easier to review quickly.
Robin
Comment #26
Jody LynnAgreed, Robin. The purpose of this patch is to remove leading and trailing slashes in user-entered paths, which is a usability improvement.
Pancho's naming consistency issue should be a separate issue.
The patch in #22 should be rerolled and retested for HEAD.
Comment #27
Jody LynnPatch removes leading and trailing slashes in user-entered path aliases and adjusts path alias form description text to remove the no-trailing-slash requirement.
Comment #29
Jody LynnI couldn't reproduce the test problem, resubmitting.
Comment #31
Bojhan CreditAttribution: Bojhan commentedtagging, failing test
Comment #32
Dave ReidMarked #250525: Disallow trailing slashes in path aliases as a duplicate of this issue.
Comment #33
Jody Lynnreroll
Comment #34
rsaddington CreditAttribution: rsaddington commentedJust reviewing this - if you enter an alias with a . after the / you get the same page not found issue.
Try with alias "http://example.com/."
Comment #35
rsaddington CreditAttribution: rsaddington commentedUpdated patch to fix dot issue noted in #34
Comment #37
rsaddington CreditAttribution: rsaddington commentedResubmitting for testing, incorporates changes made in #33
Comment #38
Jody LynnRemoving all dots is a bad idea, for example a person might want to use an alias of test.html to match old paths. Seems better to use trim to remove dots from the beginning and end.
Comment #39
rsaddington CreditAttribution: rsaddington commentedAgree - patch modified to trim dots from either end.
Also removes "./" and "/." from aliases which cause page not found errors.
Comment #41
rsaddington CreditAttribution: rsaddington commentedre-rolling for another test...
Comment #43
chx CreditAttribution: chx commentedThe description changes and trimming the slash is great. However, str_replace is not, what if you have something like foo/bar/../baz then you can't just str_replace those pieces out , you need a validation error instead. Also, needs a test.
Comment #44
nirbhasa CreditAttribution: nirbhasa commentedIf we took out the 2 str_replace lines in the patch and just submit the trim slashes functionality, would this then be able to get into D7?
Comment #45
XanoWe had some discussion on this in IRC and I'll finish this tonight. I'll merge this issue with #394402: Path field needs better help text and validation.
Comment #46
drupal_was_my_past CreditAttribution: drupal_was_my_past commentedSince #332333: Add a real API to path.module resolved the
trim()
andstr_replace()
issues by including trim() in all relevant cases. I believe what's left here is to the update the help text and to check in the alias validation for non-relative URLs. Patch attached to that effect.Also tagging for backport to D7, though the string updates should probably be left out due to translation issues.
Comment #47
xjm#46 looks good to me, and I agree with the point about a backport without the string changes. We'll also want to add some tests for this.
Note that the Drupal 8.x patch will need to be rerolled, because the core directory structure for Drupal 8 has now changed. (For more information, see #22336: Move all core Drupal files under a /core folder to improve usability and upgrades). When the patch has been rerolled, please set the issue back to "Needs Review."
Tagging as novice for the task of rerolling the Drupal 8.x patch.
Comment #48
Désiré CreditAttribution: Désiré commentedHere is the rerolled patch, for further work.
Comment #49
xjmThanks! Sending to testbot.
Comment #50
Désiré CreditAttribution: Désiré commentedStill needs tests.
And the patch submitted in #46 and rerolled in #48 is not correct:
It check the alias is not an external url, and set an error if it is. But why? This issue is about trailing slashes. Why is it problem to set an external url for alias? It will works without this patch, I think it isn't a problem.
But the patch don't do anything with trailing slashes, and if we set an alias with them, Drupal will still give a page not found page.
Comment #51
drupal_was_my_past CreditAttribution: drupal_was_my_past commented@Désiré you're right. I re-read the history of this issue and agree with you. Here is a test for leading and trailing slashes and a patch to fix leading and trailing slashes.
Comment #52
xjmThanks for the tests. Not sure about the new patch, though... What happened to validation errors for
/.
in the middle and the updated UI text, though? That's what theurl_is_external()
bit was for.Comment #53
drupal_was_my_past CreditAttribution: drupal_was_my_past commented@xjm I'm feeling a bit confused about what the issue is and what the solution should be for it. It seems like external URLs are handled ok by the system, so I didn't address that in the patch. Should this issue be about removing leading and trailing slashes and dots or about external URLs?
Comment #54
xjmHmmm, I don't think external URLs "work." You end up with URLs like:
http://localhost:8888/d7/http%3A//localhost%3A8888/d7/foo
And it is related. E.g., for especial fun, create a node with this path alias if your site's domain is
mysite.com
:http://mysite.com/foo/../
A user might enter an exernal URL for the same reason he or she might enter a relative path using
.
or..
, and I think it makes sense to fail validation on that.Maybe we just need to expand the scope of this issue?
In any case, the current patch above also does not fix other patterns that result in a 404, like:
/d7/foo/../bar
/d7/foo/./bar
...and I think these at least were in the original scope of the issue.
Comment #55
xjmMarking NW for the last case above, at least. We can strip leading or trailing slashes, sure, but I think we should be adding better validation on the field.
Comment #56
Bojhan CreditAttribution: Bojhan commentedran into this again at usability testing.
Comment #57
xjmTagging.
Comment #58
webchickWait. Why removing dots? "index.html" is a perfectly valid path alias. If the concern is that people put an entire URL in there, then prefix the field with the site address. (e.g. http://example.com/[_______________________])
But +1000 for stripping out leading/trailing slashes. That's just silly to do to people.
Comment #59
xjmNot removing all dots... just path components with only dots and no alphanumeric characters. See #54. We need a more sophisticated regex.
Comment #60
jenlamptonupdating tags
And here's a painful video showing the Google Usability study where someone entered a path with a leading slash and got an error.
Comment #61
drupal_was_my_past CreditAttribution: drupal_was_my_past commentedComment #62
fagoI don't think disallowing trailing slashes is the right way - in some situations one might actually prefer them. Some SEOs consultants suggest adding trailing slashes to indicate a directory-like page.
So instead of removing it, I think it should work as via an URL alias - so that can be used to make the trailing slash variant the canonical path. However, I guess it would be a sane default behavior to disallow having two url aliases (with an without trailing slash) pointing at two different paths.
To make the paths work, something like that is enough. As said, better validation would make sense though.
Comment #63
Jody LynnSee also #1847174: Path alias validation should test for relative path, no trailing slash requirements.
Comment #66
TravisCarden CreditAttribution: TravisCarden commented#1847174: Path alias validation should test for relative path, no trailing slash requirements addressed much of this. The only issue that remains seems to be failing validation for full URLs.
Comment #75
quietone CreditAttribution: quietone as a volunteer commentedI tested this on a standard install of drupal 7.79-dev and Drupal 8.9.0 using the steps in the Issue Summary.
On Drupal 8.9.0 it is not possible to enter an alias that does not start with a '/'. A validation error,
The alias path has to start with a slash.
is displayed.On Drupal 7 the alias is accepted and the problem is reproducible.
Therefor, moving to the Drupal 7 issue queue.