Break URLs at separator character and remove the separator
tcconway - June 12, 2008 - 19:51
| Project: | Pathauto |
| Version: | 6.x-2.x-dev |
| Component: | Code |
| Category: | feature request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | closed |
Description
Oftentimes we are forced to use a very long title. When that happens, Pathauto truncates the path, as it should, but it sometimes keeps a trailing space (and converts it to a "-", as it should).
Would it be possible for Pathauto to delete the trailing space (the trailing "-", once it is converted)?
An example url that PathAuto generates is:
http://www.example.com/press_releases/2008/05/aol-awards-ohio-university...
Steps to repeat:
1. Create a posting. For the title, give it a REALLLYYY LOOONNGGG name.
2. Submit it.
Thanks!

#1
I just added a lot of pattern trimming to both the 5.x-2.x and 6.x-1.x branches. Could you please confirm this is still happening with the 5.x-2.x-dev version? (You may have to wait a bit though, until the "Last updated" field says "June 13" (or later).)
#2
Hey Freso,
Just installed 5.x-2.x-dev and still having the same issue. It still creates a url that ends in "-".
Thanks.
#3
Steps taken:
1. Create a node with the title
Often times, we are forced to have a very long title name. When that happens, PathAuto often trunkates the path as it should, bu
2. Resulting alias was
often-times-we-are-forced-to-have-very-long-title-name-when-that-happens-pathauto-often-trunkates-th
That seems as expected to me.
So, I tried
1. Create a node with the title
Often times, we are forced to have a very long title name. When that happens, PathAuto often trunkates the path as it
(note all of the spaces after trunkates, I expected pathauto to cut the line there and replace the space with hyphen) but...
2. Resulting alias was
often-times-we-are-forced-to-have-very-long-title-name-when-that-happens-pathauto-often-trunkates-th
So I tried
1. Create a node with the title
Often times, we are forced to have a very long title name. When that happens, PathAuto often trunkates t he path as it
(note the space in the middle of the word "the" in "trunkates t he")
2. Resulting alias was
often-times-we-are-forced-to-have-very-long-title-name-when-that-happens-pathauto-often-trunkates-t-
So, I'd say that is the bug, right?
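The third case in this comment can be reproduced with a minimal Python sketch. This is an illustration only, not Pathauto's PHP code: the function names are hypothetical, the 100-character limit is assumed from the alias lengths seen in this thread, and the title is taken as already cleaned into alias form.

```python
# Hypothetical sketch: Pathauto itself is PHP; all names here are illustrative.
MAX_LENGTH = 100  # assumed limit, matching the alias lengths in this thread
SEPARATOR = "-"


def naive_truncate(alias: str, max_length: int = MAX_LENGTH) -> str:
    """Cut the alias at exactly max_length characters, wherever that lands."""
    return alias[:max_length]


def strip_trailing_separator(alias: str, separator: str = SEPARATOR) -> str:
    """Drop any separator left dangling at the end of a truncated alias."""
    return alias.rstrip(separator)


# The third title from this comment, already cleaned into alias form.
cleaned = (
    "often-times-we-are-forced-to-have-very-long-title-name-"
    "when-that-happens-pathauto-often-trunkates-t-he-path-as-it"
)

cut = naive_truncate(cleaned)
print(cut[-12:])                            # trunkates-t-
print(strip_trailing_separator(cut)[-11:])  # trunkates-t
```

Stripping the trailing separator addresses only the dangling "-"; it does not remove a partial word such as the "th" in the first two cases.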
#4
Well played. That's exactly what we are experiencing as well.
To me, there are two issues:
often-times-we-are-forced-to-have-very-long-title-name-when-that-happens-pathauto-often-trunkates-th
Resulting in:
often-times-we-are-forced-to-have-very-long-title-name-when-that-happens-pathauto-often-trunkates
#5
Hohum. I'm inclined to re-categorise into a feature request for what's described in #4. This also means that I'm inclined to not want to spend time fixing this for 5.x, but have it moved to 6.x-2.x instead.
#6
@Freso - Agreed.
I've also received several requests that when urls are shortened for whatever reason they should be shortened to a good logical point (i.e. separators) instead of just the exact point where 100 characters lands. Since those seem similar in my mind, I'll just bundle that one in here.
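A rough Python sketch of shortening to a "good logical point", assuming a "-" separator and the 100-character limit mentioned above. The function name and parameters are hypothetical and this is not the patch attached to this issue; it only illustrates cutting back to the last separator that fits.

```python
def truncate_at_separator(alias: str, max_length: int = 100, separator: str = "-") -> str:
    """Shorten alias to at most max_length characters, ending on a whole word."""
    if len(alias) <= max_length:
        # Short aliases are returned untouched.
        return alias

    head = alias[:max_length]
    if alias[max_length] == separator:
        # The hard cut already falls on a word boundary.
        return head.rstrip(separator)

    cut = head.rfind(separator)
    if cut <= 0:
        # No usable separator inside the limit: fall back to the hard cut.
        return head

    # Drop the partial last word together with the separator before it.
    return head[:cut].rstrip(separator)


long_alias = (
    "often-times-we-are-forced-to-have-very-long-title-name-"
    "when-that-happens-pathauto-often-trunkates-the-path-as-it-should-bu"
)
print(truncate_at_separator(long_alias)[-15:])  # often-trunkates
```

With the first title from comment #3 this yields the 97-character alias ending in "often-trunkates", rather than the 100-character one ending in "trunkates-th".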
#7
Something like http://api.drupal.org/api/function/truncate_utf8 will sure help when we need to do this.
#8
#9
This should work, but I haven't tested it. So please do.
#10
Just tested it, and it doesn't seem to work. Bugger.
#11
How about this?
#12
Yay! It works! :D
It had a line of pure whitespace though, so fixed that. I also added a line documenting what that block of code does. Should be good to go.
#13
Awesome - thanks, Freso.
http://drupal.org/cvs?commit=131526
#14
Per Freso, this needs a little more testing...
#15
The problem was that if you created a node with a short title this still lopped off the last word.
1. Create a node with the title "this is a short title"
Expected results:
aliased as content/this-short-title
Actual results:
aliased as content/this-short
The attached patch fixes that problem, generalizes this code so it can be re-used elsewhere in pathauto, and then re-uses it where individual tokens are built and shortened to the appropriate length.
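The patch itself is not reproduced in this thread, so the following Python sketch is only a guess at how such a regression can arise: if the last word is dropped unconditionally, short aliases lose a word too, and a length check avoids that. Both function names are hypothetical.

```python
def lop_always(alias: str, max_length: int = 100, separator: str = "-") -> str:
    """Assumed buggy variant: always cuts back to the last separator."""
    head = alias[:max_length]
    return head[: head.rfind(separator)]


def lop_only_when_needed(alias: str, max_length: int = 100, separator: str = "-") -> str:
    """Guarded variant: leaves aliases that already fit untouched."""
    if len(alias) <= max_length:
        return alias
    head = alias[:max_length]
    cut = head.rfind(separator)
    # Fall back to the hard cut when there is no separator to break at.
    return head[:cut] if cut > 0 else head


print(lop_always("this-short-title"))            # this-short
print(lop_only_when_needed("this-short-title"))  # this-short-title
```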
#16
Attached...now...
#17
+1 for separating the logic into its own function, but perhaps call it _pathauto_truncate_url() or truncate_chars instead? (FWIW, the 7.x version of truncate_utf8() will be drupal_truncate_chars().)
Also, "A Pathauto friendly version of truncate_utf8" should be finished off with a ".". ;)
Apart from this, the patch looks good. Looking forward to testing it on the morrow!
#18
Good points.
#19
Works now with both long and short aliases! Yay! And I see no more nits to pick at (well, perhaps truncate_utf8 could be referenced with a pair of parentheses... but I'm somewhat indifferent on that). :p
#20
It breaks the tests though, but this is likely because the tests themselves need to be updated for this new logic.
#21
Actually, without this follow-up patch, it'll break two tests, but with it, it'll only break one. So it's an improvement. The test still breaking is "[testPathAuto]: Node accessible through alias at [[...]/sites/all/modules/pathauto/tests/pathauto.test line 84]", which tries to fetch the node using the alias.
#22
And even reverting the originally committed patch from this issue doesn't take the fail away.
#23
And fixed - http://drupal.org/cvs?commit=131753
Thanks for the reviews/testing.
#24
Automatically closed -- issue fixed for two weeks with no activity.