The site I am working on is going to make extensive use of workflow and other forms of moderated content. During the workflow transitions it is very likely that the title of a node is going to change, and thus create a new URL alias. Until the node reaches the live stage it will remain unpublished.
It is also likely that the writers will change the title while they are working on the drafts of new articles. Therefore it is very likely that the URL alias list will end up with loads of aliases pointing to the same node, all generated while the node was set to "unpublished".
As these unpublished aliases are never going to be publicly available, there is no point in storing them for future redirect use, etc.
Therefore it would be great if Pathauto had an option to always delete the current alias of an unpublished node and create a new one every time the node is saved/updated, and to only keep old aliases once a node has been published.
I hope this makes sense.
Comments
Comment #1
tsvenson commented
Just changed "code" to "node" in the title (no idea where "code" came from?!)
Comment #2
greggles
Sorry I missed this.
I'm pretty opposed to the idea as part of pathauto. I think it should be a submodule if anything, which would be pretty easy to write (hook_nodeapi, see if a node is getting published, look for the "current alias" based on the most recent pid, delete all aliases for the node except for that one).
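The submodule outlined above could look roughly like this for Drupal 6. It is a minimal sketch, not tested on a real site: the module name `prunealias` and the helper `prunealias_pids_to_delete()` are hypothetical, and it assumes the module's weight in the `system` table is higher than Pathauto's, so the newest alias already exists when the `update` step runs. It only looks at the `node/NID` path and does not treat per-language aliases separately.

```php
<?php
/**
 * Hypothetical Drupal 6 module "prunealias" (the name is an assumption).
 */

/**
 * Pure helper: given all alias ids (pid) of one node, return the ids to
 * delete, keeping only the highest pid (the most recently created alias).
 */
function prunealias_pids_to_delete(array $pids) {
  if (count($pids) < 2) {
    return array();
  }
  $current = max($pids);
  $stale = array();
  foreach ($pids as $pid) {
    if ($pid != $current) {
      $stale[] = $pid;
    }
  }
  return $stale;
}

/**
 * Implementation of hook_nodeapi().
 */
function prunealias_nodeapi(&$node, $op, $a3 = NULL, $a4 = NULL) {
  static $was_published = array();

  if ($op == 'presave' && !empty($node->nid)) {
    // Remember the stored status before this save overwrites it.
    $was_published[$node->nid] = (bool) db_result(db_query('SELECT status FROM {node} WHERE nid = %d', $node->nid));
  }
  elseif ($op == 'update' && !empty($node->status) && isset($was_published[$node->nid]) && !$was_published[$node->nid]) {
    // The node just went from unpublished to published: drop draft aliases.
    $pids = array();
    $result = db_query("SELECT pid FROM {url_alias} WHERE src = '%s'", 'node/' . $node->nid);
    while ($row = db_fetch_object($result)) {
      $pids[] = (int) $row->pid;
    }
    foreach (prunealias_pids_to_delete($pids) as $pid) {
      db_query('DELETE FROM {url_alias} WHERE pid = %d', $pid);
    }
    // Rows were deleted directly, so reset the static path lookup cache.
    drupal_clear_path_cache();
  }
}
```

Because the cleanup only fires on the unpublished-to-published transition, aliases created after publication are left alone and can still be used for redirects.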
Comment #3
tsvenson commented
Please don't be sorry greggles, you can't be everywhere. All I am trying to do is find a way to avoid unneeded aliases being created while a node is not published.
Unfortunately my skills are rather limited so I am unable to write such a module myself.
Comment #4
klonos
I can't say whether the best way would be to have this as a submodule or as part of pathauto, but I certainly like the idea of this feature. So, +1 from me.
It would really help cut down on the vast number of unwanted aliases on busy sites and make admins' lives a bit easier.
Comment #5
dave reid
What is your pathauto 'update action' setting set to on admin/build/path/pathauto? If it's set to 'Create new alias, delete old alias', this should be automatic.
Comment #6
klonos
I believe you meant ../admin/build/path/settings, Dave. I already knew that this setting is available (if I'm not mistaken, 'create new - delete old' is the default anyway), so I agree that it exists for exactly this need of deleting old aliases.
You have to give some credit to the feature request though, because one might have changed the default settings after initial setup and only realized after quite some time that they had got themselves into the situation described by Thomas in #0. If that has happened on a really busy site, the only easy way out of it is this feature. In other words, this is not a setting one would choose for the 'Update action' on initial setup, but rather a helpful feature addition to an already useful module.
If you do insist on your opinion (and/or Greg's proposal in #2 that this could be in another submodule), go ahead and set this to 'won't fix', but I strongly think you should reconsider and get this implemented at some point. If you agree, then please set it to 'postponed'. I don't know what Thomas has to say about this though, since he is the one who originally filed this request and he seems to be facing an actual use case.
PS: if you finally decide to do this some time in the future, I think that implementation and UI wise it should be an extra setting (checkbox) 'bound' to the 'Create a new alias. Leave the existing alias functioning.' radio option. Its label could be something like 'Only delete old alias(es) once node gets published'.
Comment #7
tsvenson commented
What about an option such as:
- Don't create alias for unpublished nodes
Then Pathauto only needs to check if the node is published or not and will only create aliases when the node changes from not published to published.
I use the "Create a new alias. Redirect from old alias." option for most sites I am involved in. Sure, it does take care of redirecting from old aliases, but every time a new alias is created for an unpublished node it adds to the administration work and has a small impact on performance.
This option would eliminate that as well as keep the alias list much smaller for sites where a lot of content is being published.
On the sites where I publish content myself I have noticed that I change the title several times while I write the content. Then I always end up having to go to the alias list and remove those that were created before publishing. On sites where I control the content this is not a problem, except for the extra time it takes, but on sites with lots of content contributors, editors, admins etc. this is probably something they forget to do.
Let's say a site publishes 30 new nodes per day and on average each node ends up with three titles before being published. That is 60 extra aliases per day created for unpublished nodes, or almost 22,000 unwanted aliases over a period of one year (60 x 365 = 21,900).
Changing the title 3 times happens quite often for myself when writing. Then, if you also have a workflow where an editor reviews it, making SEO changes and so on, it can easily be changed a few more times.
Comment #8
klonos
That sounds good to me too, but as I've already said, Thomas in #0 implies that they need the aliases while nodes are not published(?). I'd argue that there's always revision history and I cannot possibly imagine why one would want to retain all old aliases, but I know that there is the case of people linking to other nodes from their published content (using aliases), and I honestly do not know of any way (module) that automatically updates these links if the aliases happen to change. You do get a lot of broken links this way, and keeping a history of old aliases around might prove handy for fixing them!
Besides, we are overlooking the case of people who have got themselves into such a situation already and need to remedy it by deleting the old aliases. I guess they could simply set it back to 'create new - delete old' and go through the process of re-saving each node with multiple old aliases. Again, too much manual work.
On a side note, we could simply provide an SQL command (as a tech note in the README.txt) for this issue and spare ourselves the trouble of implementing this in the module.
Comment #9
tsvenson commented
No, aliases are not needed for unpublished nodes. The system paths are just fine while content is being worked on.
Old aliases for published content are needed not only for people bookmarking them or linking to them from other websites. Even more important are search engines. Unless the website properly manages and redirects old (published) URLs, search visitors will face a 404 page when they click on a URL that is old and deleted.
I know Google, for example, will understand the 301 redirect and at some point update its link to the new alias.
Together with a canonical URL you also prevent the same page from having different URLs in search engines, which would then compete with each other for a higher position. That would result in the page(s) being ranked much lower instead.
The problem with the way it currently works is that I have no idea whether an alias was created while the node was published or not. So just deleting and regenerating will not help; it will end up with broken links on your website, in bookmarks, on other websites and in search engines.
The only way to know that is to not create a new alias unless the node is published.
Comment #10
klonos
OK then, 'Don't create alias for unpublished nodes' (or 'Create alias only for published nodes') sounds perfect if it gets implemented.
Comment #11
greggles
We are not introducing an option in the UI for this. It would be too much UI for a module that already has too much.
If anything we should make sure that people who need this can write a little module that overwrites the default behavior and stops the alias from getting created. And...I think that is possible. We have a checkbox value for whether or not the alias is created and pathauto's weight is 10. So a module with a weight less than that can jump in and do this as necessary.
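A minimal sketch of such an override for Drupal 6 follows. The module name `draftnoalias` and its helper are hypothetical, and the sketch assumes the checkbox value mentioned above is exposed as `$node->pathauto_perform_alias` and that Pathauto skips alias generation when it is FALSE. It sets the flag in the `presave` step, which runs before the `insert`/`update` step where the alias would be created.

```php
<?php
/**
 * Hypothetical Drupal 6 module "draftnoalias" (the name is an assumption).
 */

/**
 * Pure helper: only published nodes (status 1) should get an alias.
 */
function draftnoalias_should_alias($status) {
  return !empty($status);
}

/**
 * Implementation of hook_nodeapi().
 */
function draftnoalias_nodeapi(&$node, $op, $a3 = NULL, $a4 = NULL) {
  if ($op != 'presave') {
    return;
  }
  $status = isset($node->status) ? $node->status : 0;
  if (!draftnoalias_should_alias($status)) {
    // Assumption: Pathauto reads this property as its "automatic alias"
    // checkbox value and does nothing when it is FALSE.
    $node->pathauto_perform_alias = FALSE;
  }
}
```

With this in place an unpublished node keeps its system path (node/NID), and the first automatic alias would only be generated on a save where the node is published and the checkbox value is left enabled.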
Comment #12
troybthompson commented
Was there ever a module made to do this? I have a problem where a lot of my nodes never get published, but new ones have the same title and get the same name, so the published ones get the -# suffix even though the clean alias was never published.
Comment #13
rp7 commented
Sorry for bumping this old issue, but for people looking for a way to avoid URL aliases for unpublished entities, this does the trick in Drupal 8:
IMO this would be a nice functionality to be in the main module.