Posted by fgm on December 16, 2006 at 6:09pm
23 followers
| Project: | Drupal core |
| Version: | 8.x-dev |
| Component: | path.module |
| Category: | bug report |
| Priority: | normal |
| Assigned: | rocket_nova |
| Status: | needs work |
| Issue tags: | GoogleUX2012, needs backport to D7, Needs tests, Usability |
Issue Summary
A contributor on one of my sites just discovered what seems a bug. Here is how to reproduce it:
- work on a drupal site, say
http://www.drupal.org/ - create a post, say a story
- prior to submitting, define the "URL path settings" : do not read the description that says these must be local, and type in some absolute URL, like
http//:www.example.com/. - submit
At this point, you find yourself on http://www.drupal.org/http%3A//www.example.com, which causes a "page not found" error.
This seems to be a bug: absolute paths should probably cause a post validation error. Just because users can't read doesn't mean sites shouldn't give some clues instead of bailing out.
Comments
#1
A possible solution would be to sanitize aliases so that incorrect urls like absolute ones are converted to something that will actually be recognizable when requested and point to the same page.
#2
Agreed that this should be fixed; I had it happen to me once or twice when I just started using drupal.
#3
In ver 4.6 I was using absolute urls as path aliases for some static html pages. When I upgraded to 4.7.4 that no longer worked but there was no help given to fix it.
What I found that worked for me is this.
Drupal resides in http:www.mysite.com/drupal. The static page resides at http://www.mysite.com.
The old absolute url path was http://mysite.com/index.html
The new url path that works is ../index.html
#4
Rather than return an error, this patch silently removes any leading or trailing forward slashes. I think this is the best solution in terms of usability. Without this patch it can be troublesome to fix nodes given an incorrect alias, because the links to the aliased url will be broken too.
I tested with http://www.example.com and it worked fine on FF and IE without any patch, so I didn't include a fix for that input.
#5
Dang, forgot to attach the patch again.
#6
So am I being told that there is no way to do an absolute URL to another site?
I tried putting a link to another site of mine using fill HTML and it was converted to a local link.
What is going on?????
#7
Forgot to say I am using 4.7.4.
At this rate, every time I upgrade to the latest version I loose at least one thing that I need. My last upgrade I lost the album module. This time I lost absolute URLs, authenticated user and all of the usable info under the status column in the users list.
I am beginning to think that upgrading is not a good idea.
#8
Darrell, this is not what is at point here: the problem being discussed is that you should not create a page on site A and set for it an alias residing on site B, since site B would normally have no way to redirect you to the original page on site A.
This has nothing to do with links in post content.
#9
#10
patch no longer applies
# patch -p0 < path_absolute.patch
patching file modules/path/path.module
Hunk #1 FAILED at 128.
1 out of 1 hunk FAILED -- saving rejects to file modules/path/path.module.rej
#11
New patch for HEAD. This patch trims leading or trailing forward slashes, and removes the warning from the field description. It seems that absolute paths no longer break anything anyway.
#12
looks good. i have not tested.
#13
Verified issue
Changed url to /my/page/
Upon viewing, was directed to my/page
Editing the page also shows corrected my/page URL
Warning message did not appear in URL setting directions
Ready for commit
#14
Erm, what happens to http://www.example.com/ when trim-ed with /? IMHO http://example.com Better? How does this solve the issue?
#15
I might misunderstand your question but I think maybe the confusion is that the issue has morphed and is now mistitled.
The original issue was about absolute urls, but in head that issue is already fixed and no longer breaks anything. So actually this patch is to fix relative aliases that are misentered, like if someone enters a path /node/14, so that it will just fix it rather than having to include fineprint to warn you not to do it.
#16
Ah, good titles matter a lot. So what about adding a comment to that effect to the trim(). I remember we were puzzled on some parts of this function before, so it is always good to have this documented. I'd add something along the lines of
// Remove possible slashes from both ends to avoid invalid aliases.
#17
Yep, I had a comment in my original patch...
#18
I'll get to massaging this a bit later, but it looks good (although it changes a string, it fixes a quite often encountered usability issue).
#19
If we anyway have to change the string, we should put improve on the wording:
In path_admin_form():
In both occurrences:
aboutthen)?Proposal:
"Specify a path alias by which this page can be accessed. Use a relative path. For example, type <kbd>about</kbd> to refer to an about page."
and
"Optionally specify a path alias by which this node can be accessed. Use a relative path. For example, type <kbd>about</kbd> when writing an about page."
Also it is very misleading that the module is called "Path" (path.module) and the admin page on the other hand "URL aliases". I always intuitively search for a "path" menu entry, then scroll the list down and once I find "URL aliases", I remember: "Ahh, that was the misworded one".
#20
Back to discussion then.
#21
Some comments would be great, since we are mostly done!
All that's left is the correct naming conventions and the wording of two strings (see #19).
So let's start with the most general issue:
The module's name is "path". Right now, the functionality of path aliasing is inconsistently named "URL alias" (mostly), "Path alias", "alternative path".
While core shouldn't generally follow contrib - the important pathauto module is not called URLauto, but pathauto and it consistently uses the terminology "path alias".
Now I think we should make up a decision whether in the functionality be consistently called "Path alias" or "URL alias" in core... some opinions on this, please!
#22
I made a new patch with Pancho's text changes. I agree with his points about the text.
#23
+1 for #22, though this only covers part of the naming issues.
Mainly I was criticizing the inconsistant usage of "URL alias" vs. "Path alias". We would need to unify this to "Path alias" IMO. The enclosed patch covers this. Unfortunately we'd need to rename two permissions, change four more strings and remove one. While this might sound inacceptable, this would IMHO be a large UI improvement and finally streamline the inconsistant naming.
Don't know if we should split that off this issue here, as we moved away from the starting point.
As the trim-slash part we all agree upon should be committed asap, this shouldn't hold back the patch in #22 anyway.
#24
#25
No longer applies to HEAD:
This is also a large patch that is hard to review, please consider splitting up this patch to make it easier to review quickly.
Robin
#26
Agreed, Robin. The purpose of this patch is to remove leading and trailing slashes in user-entered paths, which is a usability improvement.
Pancho's naming consistency issue should be a separate issue.
The patch in #22 should be rerolled and retested for HEAD.
#27
Patch removes leading and trailing slashes in user-entered path aliases and adjusts path alias form description text to remove the no-trailing-slash requirement.
#28
The last submitted patch failed testing.
#29
I couldn't reproduce the test problem, resubmitting.
#30
The last submitted patch failed testing.
#31
tagging, failing test
#32
Marked #250525: Disallow trailing slashes in path aliases as a duplicate of this issue.
#33
reroll
#34
Just reviewing this - if you enter an alias with a . after the / you get the same page not found issue.
Try with alias "http://example.com/."
#35
Updated patch to fix dot issue noted in #34
#36
The last submitted patch failed testing.
#37
Resubmitting for testing, incorporates changes made in #33
#38
Removing all dots is a bad idea, for example a person might want to use an alias of test.html to match old paths. Seems better to use trim to remove dots from the beginning and end.
#39
Agree - patch modified to trim dots from either end.
Also removes "./" and "/." from aliases which cause page not found errors.
#40
The last submitted patch failed testing.
#41
re-rolling for another test...
#43
The description changes and trimming the slash is great. However, str_replace is not, what if you have something like foo/bar/../baz then you can't just str_replace those pieces out , you need a validation error instead. Also, needs a test.
#44
If we took out the 2 str_replace lines in the patch and just submit the trim slashes functionality, would this then be able to get into D7?
#45
We had some discussion on this in IRC and I'll finish this tonight. I'll merge this issue with #394402: Path field needs better help text and validation.
#46
Since #332333: Add a real API to path.module resolved the
trim()andstr_replace()issues by including trim() in all relevant cases. I believe what's left here is to the update the help text and to check in the alias validation for non-relative URLs. Patch attached to that effect.Also tagging for backport to D7, though the string updates should probably be left out due to translation issues.
#47
#46 looks good to me, and I agree with the point about a backport without the string changes. We'll also want to add some tests for this.
Note that the Drupal 8.x patch will need to be rerolled, because the core directory structure for Drupal 8 has now changed. (For more information, see #22336: Move all core Drupal files under a /core folder to improve usability and upgrades). When the patch has been rerolled, please set the issue back to "Needs Review."
Tagging as novice for the task of rerolling the Drupal 8.x patch.
#48
Here is the rerolled patch, for further work.
#49
Thanks! Sending to testbot.
#50
Still needs tests.
And the patch submitted in #46 and rerolled in #48 is not correct:
It check the alias is not an external url, and set an error if it is. But why? This issue is about trailing slashes. Why is it problem to set an external url for alias? It will works without this patch, I think it isn't a problem.
But the patch don't do anything with trailing slashes, and if we set an alias with them, Drupal will still give a page not found page.
#51
@Désiré you're right. I re-read the history of this issue and agree with you. Here is a test for leading and trailing slashes and a patch to fix leading and trailing slashes.
#52
Thanks for the tests. Not sure about the new patch, though... What happened to validation errors for
/.in the middle and the updated UI text, though? That's what theurl_is_external()bit was for.#53
@xjm I'm feeling a bit confused about what the issue is and what the solution should be for it. It seems like external URLs are handled ok by the system, so I didn't address that in the patch. Should this issue be about removing leading and trailing slashes and dots or about external URLs?
#54
Hmmm, I don't think external URLs "work." You end up with URLs like:
http://localhost:8888/d7/http%3A//localhost%3A8888/d7/fooAnd it is related. E.g., for especial fun, create a node with this path alias if your site's domain is
mysite.com:http://mysite.com/foo/../A user might enter an exernal URL for the same reason he or she might enter a relative path using
.or.., and I think it makes sense to fail validation on that.Maybe we just need to expand the scope of this issue?
In any case, the current patch above also does not fix other patterns that result in a 404, like:
/d7/foo/../bar/d7/foo/./bar...and I think these at least were in the original scope of the issue.
#55
Marking NW for the last case above, at least. We can strip leading or trailing slashes, sure, but I think we should be adding better validation on the field.
#56
ran into this again at usability testing.
#57
Tagging.
#58
Wait. Why removing dots? "index.html" is a perfectly valid path alias. If the concern is that people put an entire URL in there, then prefix the field with the site address. (e.g. http://example.com/[_______________________])
But +1000 for stripping out leading/trailing slashes. That's just silly to do to people.
#59
Not removing all dots... just path components with only dots and no alphanumeric characters. See #54. We need a more sophisticated regex.
#60
updating tags
And here's a painful video showing the Google Usability study where someone entered a path with a leading slash and got an error.