If I bulk generate aliases on nodes, only 50 nodes are aliased at once.
If this is by design, then this is probably a documentation omission/bug.
If this is not by design, but you still wish to throttle bulk generation, maybe provide a field where we can specify the max number of nodes to bulk edit at once?
Comment | File | Size | Author |
---|---|---|---|
#87 | 201151-pathauto-bulk-update-D7.patch | 56.99 KB | Dave Reid |
#85 | 201151-pathauto-bulk-update.patch | 46.02 KB | Dave Reid |
#84 | 201151-pathauto-bulk-update.patch | 46 KB | Dave Reid |
#83 | 201151-pathauto-bulk-update-D7.patch | 56.99 KB | Dave Reid |
#81 | 201151-pathauto-bulk-update-D7.patch | 57.09 KB | Dave Reid |
Comments
Comment #1
greggles: When you visit the admin/settings page, do you have a text box labeled "Maximum number of objects to alias in a bulk update:"? That is the text box you desire, right?
I'm starting to think I should create a new "bulk update" page. That page would have this text box and then also the checkboxes to execute bulk update instead of putting them at the bottom of the object-type fieldsets.
Comment #2
Aren Cambre: Oops, sure is. I think I missed it because I usually expect general settings to be automatically displayed (not hidden in a collapsible block), so I didn't think to look in there.
I agree with your suggestion about the bulk update page. I've changed this to a feature request to reflect that.
Comment #3
doc2@drupalfr.org: This issue might be a duplicate (EDIT: I could be wrong! cf. #4. It could just be related. I'll edit back in that case). Indeed, have you seen this one: #176669: document bulk generation and bulk deletion feature?
(Never mind the crossed-out link, just click.)
Comment #4
Aren Cambre: I think this issue is substantively different, but I'll let greggles decide.
Comment #5
greggles: The original issue seems similar, but the final idea here is to create a separate bulk update page to help users understand how to use it and where to find it. I think that's pretty different from just documenting it. If the separate page is clear enough then the documentation will become less important.
Comment #6
Aren Cambre: Why do I keep getting emails about this issue? Here is the email:
---------- Forwarded message ----------
From: <>
Date: Mon, Jun 16, 2008 at 11:34 AM
Subject: Your submitted bugs for June 16, 2008
To: XXXXXXXXX@XXXXXXX.XXX
[ Pathauto / Code ]=====================================================
Create separate bulk update page
state: active
age: 26 weeks 2 days
url: http://drupal.org/?q=node/201151
Comment #7
greggles: If you mean why you got so many just now, that was due to a bug with drupal.org. There should be an explanation post coming soon.
Comment #8
greggles: Explanation is now live: http://drupal.org/node/271466
Comment #9
Freso: FWIW, +1 to a separate page for bulk creating and updating. It has to happen in the proper branch though.
Comment #10
greggles: Also... this could simply be documentation about how to do this with http://drupal.org/project/views and http://drupal.org/project/views_bulk_operations
That would help a lot, simplify a lot of the code, and make it more valuable for end users (though more complex to set up...)
Comment #11
Aren Cambre: I just now got another email about this issue:
I have opened many issues; I have no idea why I keep getting email notifications of this one.
Comment #12
greggles: You get e-mails because this project is configured to send reminders about open issues once a month.
I've added a note to the top of the issue submission page for this project to let people know it will happen.
Comment #13
Aren Cambre: Thanks. Strangely, it has been many months since my prior emailed notice.
Comment #14
Aren Cambre: Wow, I just got another email. :-)
Comment #15
giorgio79: Batch updating of pathauto aliases is already possible with VBO and a node view.
What I am looking for is batch update via VBO for term aliases, which is not yet possible :)
Any ideas are appreciated.
Comment #16
adamo: If you use the Batch API module when doing bulk updates, there will be no need to set a limit. You would be able to bulk update an infinite number of items without having to worry about the PHP max execution time or HTTP request timing out.
http://drupal.org/node/180528
Comment #17
agentrickard: Here's a module I wrote to accomplish this for a project. It needs some review. I don't really have the bandwidth for a proper patch right now, but will be at DC Paris.
Comment #18
sebos69: Interesting, subscribing...
Comment #19
Aren Cambre: Just ran into this again. Wow, been two years since I filed this request, but glad there's progress.
Want to again plug for this enhancement. Bulk generate is a tool, not a setting, so it doesn't make sense to bury it inside collapsed sections on a settings page.
Also, moving this to its own tab would allow you to move the "Maximum number of objects to alias in a bulk update" section to that page.
Comment #20
sun: Working on this.
EDIT: Marked #212084: perform bulk updates during cron and/or via the batch API as duplicate of this issue.
Comment #21
sun: Looks big, but isn't really. Quite nice.
Posting what I have after testing on a small local development site (works nicely there). I will now test on a (very) large site with >200,000 nodes, >600,000 comments, >150,000 users, and a couple of taxonomy and forum terms.
Comment #22
sun: Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 71 bytes) in sites/all/modules/pathauto/pathauto.admin.inc on line 407
OK, need to rethink how to generate the batch operations.
Comment #23
Dave Reid: Yeah, the approach in #21 is going to create a metric crap-ton of batch operations instead of a preferred one operation per module. That operation should then be using context and looping through groups of objects/entities like we usually do with large batch API processes.
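The pattern described in #23 can be sketched roughly as follows. This is a language-neutral illustration in Python, not Drupal code and not the patch from this issue: `alias_operation`, `load_ids_without_alias`, and `run` are hypothetical names, the `context` dictionary only mimics the shape of the Batch API's `$context` (`sandbox`, `finished`), and the chunk size of 25 and the alias pattern are arbitrary assumptions.

```python
# Hypothetical sketch: a single batch operation that works through
# entities in small groups and keeps its progress in a sandbox.

def load_ids_without_alias(all_ids, aliased, after_id, limit):
    """Stand-in for a database query: up to `limit` un-aliased IDs above `after_id`."""
    candidates = [i for i in sorted(all_ids) if i > after_id and i not in aliased]
    return candidates[:limit]


def alias_operation(all_ids, aliased, context, chunk_size=25):
    """Process one chunk per call; the caller repeats until finished reaches 1."""
    sandbox = context.setdefault("sandbox", {})
    if not sandbox:
        # First call only: count the work and set the starting position.
        sandbox["total"] = len([i for i in all_ids if i not in aliased])
        sandbox["done"] = 0
        sandbox["current"] = min(all_ids, default=0) - 1

    chunk = load_ids_without_alias(all_ids, aliased, sandbox["current"], chunk_size)
    for entity_id in chunk:
        aliased[entity_id] = f"content/item-{entity_id}"  # placeholder alias
        sandbox["done"] += 1
        sandbox["current"] = entity_id

    total = sandbox["total"]
    context["finished"] = 1 if total == 0 else sandbox["done"] / total


def run(all_ids, aliased):
    """Minimal driver: call the operation until it reports completion."""
    context = {}
    calls = 0
    while context.get("finished", 0) < 1:
        alias_operation(all_ids, aliased, context)
        calls += 1
    return calls
```

The point of the sketch is that memory use stays bounded by the chunk size: the list of operations never grows with the number of entities, only the number of calls does.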
Comment #24
sun: Alrighty - this one works. And takes its time ;)
Comment #25
sun: Fixed missing PHPDoc.
Ready to fly, if you'd ask me. :)
Comment #26
sun: Last tweak: Cut down to a single batch set with multiple operations (one op per path alias type).
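As a rough illustration of "a single batch set with multiple operations", here is a small Python sketch. It is not the code from the patch: `make_operation` and `process_batch` are hypothetical helpers, the dictionary keys only imitate the general shape of a Batch API definition, and the ID ranges and chunk size of 50 are made up for the example.

```python
# Hypothetical sketch: one batch set holding one operation per alias type.

def make_operation(alias_type, ids, chunk_size=50):
    """Build an operation that aliases `ids` of one type, one chunk per call."""
    def operation(context):
        sandbox = context["sandbox"]
        start = sandbox.get("offset", 0)
        chunk = ids[start:start + chunk_size]
        for entity_id in chunk:
            context["results"].append(f"{alias_type}/{entity_id}")
        sandbox["offset"] = start + len(chunk)
        if sandbox["offset"] >= len(ids):
            context["finished"] = 1
        else:
            context["finished"] = sandbox["offset"] / len(ids)
    return operation


def process_batch(batch):
    """Run each operation to completion, in order, sharing one results list."""
    results = []
    for operation in batch["operations"]:
        # Each operation gets a fresh sandbox but appends to shared results.
        context = {"sandbox": {}, "results": results, "finished": 0}
        while context["finished"] < 1:
            operation(context)
    return results


batch = {
    "title": "Bulk updating URL aliases",
    "operations": [
        make_operation("node", list(range(1, 121))),
        make_operation("taxonomy", list(range(1, 11))),
        make_operation("user", list(range(1, 6))),
    ],
}
results = process_batch(batch)
```

With this shape the batch definition stays at three entries regardless of how many nodes, terms, or users exist; each operation tracks its own offset.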
Comment #27
sun: I should also mention that those new batch updates use fully loaded entities to generate aliases, which of course slows down the generation process a bit. I'm not sure why it was using minimal database query objects before, but considering that arbitrary tokens can be used for URL alias patterns, this smelled like a bug. (Only nodes were fully loaded via node_load() previously.)
If this is handled elsewhere already, then we can easily revert this to the previous way.
Any chance to get this in?
Comment #28
svendecabooter: Thanks a lot for this patch, sun. This is really a feature that should go into the Pathauto module...
I tested the patch on a dev site with a few thousand nodes & terms, and it worked perfectly.
Comment #29
svendecabooter: That would be patch #26 obviously.
Comment #30
RobLoach: After applying the patch, I ran through a test for all nodes, tags and users. I really like how it has been moved off the settings page. This has to go in. Even the code is nicer!
Comment #31
Dave Reid: The big thing I noticed is an inconsistency with the queries in the operations. Sometimes we're using a $context['sandbox']['query'] and re-using it in the processing, sometimes we're not, but it looks like that pattern should be consistent throughout. I'll test as well today and report back.
Why isn't the pattern of $context['sandbox']['query'] used here?
Ditto about not using the sandbox query pattern.
Ditto.
Comment #32
sun: @Dave: That's really no reason to hold up this patch :P In short: All but user aliases require rather complex queries that depend on configuration values, so the queries to find entities without aliases require quite some preparation. Therefore, the preparation of queries happens once in an initial step. For user-related aliases, all queries are very simple and straightforward, so no special preparation is required. I would consider this difference a good thing, because this way, other developers who may want to integrate with Pathauto have examples for both complex and simple implementations.
Thanks for reviewing! :)
Comment #33
sun: Maintainer feedback?
Comment #34
sun: bump
Comment #35
greggles: I left explanations in comments #7 and #8 and #10.
I think we should even consider removing the bulk generation checkboxes from Pathauto and making it rely on VBO more heavily. Ken raised the point that we shouldn't have to rely on other modules to get this functionality, but that's not really compelling to me. The point of modules is that we rely on them to do what they do best. Pathauto is not a "bulk operations" tool - VBO is.
Comment #36
Dave Reid: I have to agree with Greg. Although this is a nice feature to clean up/fix, I think we should move it to a separate project for batch updating and/or aliasing un-aliased locations on cron since that would be great. If we're going to attempt to get Pathauto into Drupal 8, it's got to be lean and mean and I don't think this feature would come along.
We should stick to supporting hook_node/user/etc_operations() so that other modules (like VBO or a pathauto_cron) can easily extend the basic features.
Comment #37
Dave Reid: Tagging all the bulk alias issues for #713238: RFC: Pathauto Bulk module.
Comment #38
sun: I have to disagree. Strongly. Please reconsider based on the following:
I seriously think that this functionality belongs in the Pathauto module itself. But if absolutely required, then I'll create a new module for it.
Comment #39
Dave Reid: @sun: I'm not sure exactly why #713238: RFC: Pathauto Bulk module is so bad, but I talked a little bit with greggles today and most likely what we'll do is make the batch updating a sub-module of pathauto in 6.x-2.x and 7.x-1.x.
Comment #40
greggles: Thanks for providing some of your motivation.
I appreciate your passion for this issue, but would be disappointed if the end result is that you create a new module for it ignoring the advice that I have on it. After 3.5 years as the maintainer of this module I've learned a little bit about how people currently use it and how they want to use it ;)
Comment #41
sun: I would be disappointed too, if I had to create a separate module. So let's just avoid that! :)
Comment #42
sun: This patch is against 6.x-2.x.
Answers to earlier replies:
I also fail to see how cron is involved in mass-updating URL aliases. Again, doing this is a one-time, one-shot job. If you need to do it, then you need it immediately, not after the next cron runs, or even worse, after the next 500 cron runs.
a) D8 is galaxies away. b) Even for Drupal core, we will need a way to update existing aliases according to the new configuration. Ultimately, the Path module maintainer will have the final word on this, but at least for me, a Pathauto feature without mass-operations makes no sense at all.
But you know that exactly those operations are completely busted, right? For exactly this reason, VBO implements proper actions instead of those operations.
While the generic usage of actions would be worth exploring for Pathauto, I already doubt that it would be the proper use case for actions. Actions are targeted at automated operations that shall be configurable and perhaps even entirely skippable (not enabled by default) for the site administrator. The configurable part applies, but the skippable part does not.
Comment #43
chx: If you can do it via the batch API, do it. Making it rely on the rather big Views + VBO code base is not a wise move.
Comment #44
giorgio79: Big or not, Views + VBO is the future. :) IMHO with Views users will have many more options, like the ability to filter, slice and dice, write custom actions, etc.
For large sites this is essential so they don't have to redo the entire alias table or content type section. With Views this becomes much more flexible, and the nodes needing re-aliasing can be properly isolated with SQL.
Comment #45
agentrickard: greggles and I had this argument once. Personally, I don't think modules should introduce a long chain of dependencies (Views, VBO) for administrating themselves.
However, I have always deferred to his choice as maintainer. Over in another project, I'm in the process of adding support for both, by providing a default mechanism and exposing a hook_node_operations() to VBO.
Comment #46
sun: Guys, I need to advance on this batch API functionality very soon (see #38 and following). It would be great to know the target code-base for my efforts.
Comment #47
greggles: I'm mostly deferring to Dave on this. I'd be fine if this were included in the 6.x-2.x branch.
Comment #49
sun: I pinged Dave via Skype. Hope that we can quickly agree on a direction, as I likely have to advance on this functionality this week.
Comment #50
Dave Reid: We discussed:
1) Let's go with the batch API patch as is for now. We can re-evaluate and discuss splitting this off into a sub-module of Pathauto in a new issue once it lands. It doesn't make sense to try and do both. Let's just fix the darn thing.
2) Re-roll against 2.x.
3) At least basic tests would be required.
4) Forward-port against HEAD would be cool (actually it would be very nice since I'd like to keep 6.x-2.x in sync as much as possible. Ideal is apply to 7.x-1.x first, then backport).
Comment #51
sun: Hm. I can't really explain the notices I get when running tests locally - does D6 support testing a batch at all?
Comment #52
RobLoach: I'm getting a couple hunk hits against DRUPAL-6--2. That's the correct branch, right?
Comment #53
Dave Reid: #51: pathauto.batch_.51.patch queued for re-testing.
Comment #55
sun: Straight re-roll.
Comment #56
sun
Comment #57
Dave Reid: How soon could we get a patch for 7.x-1.x? I don't want to commit something that's going to put 6.x-2.x and 7.x-1.x out of sync.
Comment #58
sun: In the meantime, bulk updates entirely stopped working in 2.x-dev.
And I'm somehow unable to re-roll this patch.
Comment #59
sun: Was a bit tough to re-roll...
Comment #60
sun: What's left to move on here?
Comment #61
rapsli: What's the status here?
Comment #62
chx: Note that if the powers that be want this patch in, a D7 port is readily available.
Comment #63
greggles: http://drupal.org/node/201151#comment-2650384
Comment #64
chx: HEAD patch, unfinished but works for node, user and taxonomy.
Comment #66
chx
Comment #67
Dave Reid: Proper version.
Comment #68
Jody Lynn: I'm concerned that this batch update still only updates items without aliases (correct?). On a live site, if you want to update your aliases, you want to add additional aliases in bulk and not have to delete all your existing aliases first (which the admin/content/node 'refresh aliases' can do but not yet in bulk). I just wrote a custom module to give me a 'refresh' batch callback that goes through all my nodes with pathauto_create_alias.
Comment #69
sun: @Jody: Right, that's more or less what I wanted to add after this baseline hit the floor, but it never did.
I'm a bit disgusted with this issue, somehow hoping that an angel will hand some flowers, rainbows bending, and a black magician takes off the hat. Your wishes may vary. :)
Comment #70
greggles: @Jody Lynn - You could have used the existing action and VBO to achieve a similar thing.
@sun, I'm sorry you feel disgusted by this. I feel frustrated as well that people want to build code to do what is already possible and haven't yet worked on actions for users, terms etc. which would be truly novel functionality. As I mentioned earlier, I've handed off the Bulk generation "component" to Dave Reid and am letting him act as architect and gate keeper for concepts.
Comment #71
Dave Reid: I thought #201151-57: Use batch API to perform path alias bulk updates was pretty darn clear, but it was never answered until chx found me in IRC a few days ago and I asked him to start posting his D7 patch into the issue so I could at least take a look at it.
Comment #72
adrien.gibrat: #64: bulkupdate-head.patch queued for re-testing.
Comment #74
mrtorrent: subscribing
Comment #75
Dave Reid: Revised patch against D7 and executive summary:
Followups:
Comment #77
Dave Reid: Testbot client failure.
Comment #78
sun: 2 typos in bulkupdate here.
I don't understand why this entire module_implements_alter is needed. You don't need it just to support hook implementations on behalf of core modules. Also not sure why pathauto's implementation needs to be called first?
EDIT: Read the explanation elsewhere. Still looks a bit like abusing hook_module_implements_alter() to me, and just because we want to remove 4 other lines that include a file? At the very least, this should be discussed in a separate issue/patch, as it has little to do with this issue.
When I worked on the D6 patch, it became more and more apparent that, with a batch API driven mass-update, the actual (bulk)update of individual entities is not different to a regular update of an individual entity.
I would therefore suggest to drop the entire "bulkupdate" operation and replace it with the already existing "update". Any informational messages are (or should be) generated by the caller anyway. That, however, could also be deferred to another issue.
Likewise, I do not see what other kind of special queries Pathauto should ever perform, so a query tag of "pathauto_bulk_update" seems overly lengthy/verbose to me -- "pathauto" sufficiently cuts it and leads to hook_query_pathauto_alter(). Additionally, it's a bit wonky to tag a SELECT query with the term "update".
s/Implement/Implements/
(and elsewhere)
1) Why $op if there is just one possible? Also, we de-$op-ified D7, so hook_pathauto_info() or similar would be more appropriate.
2) Those $settings are being type-casted to an object everywhere, that doesn't make much sense to me.
3) Missing newline between return and default.
4) default case is empty in all of these switch structures (and a default case wouldn't make much sense for this kind of hook either).
"bulk" and "batch" are synonyms, these function names can definitely be shortened.
Typo in LANGUAGE_NONE
Typo in Pathauto
As visible in this test code, it would be much more readable if these messages would simply use numbers instead of complex wording ("one"/"no new"). They are administrator-only messages after all.
Additionally, the test needs to use the same mechanism to generate the messages, i.e. format_plural()
Comment #79
Dave Reid: This is the same thing we have to do with Token.module to add core-support for hooks if we don't want to clutter them into the main module file. I think this is a valid change for this. Especially since I would have had to add more calls to _pathauto_include() rather than just module_invoke_all('pathauto').
Using $op = 'bulkupdate' disables verbose mode. I'd prefer not to change this functionality too.
The tag should at least be kinda descriptive? We would have the possibility that we'd have other queries in Pathauto. Maybe use 'pathauto_bulk_update_select' as the tag? We don't use 'node' as the node access query tag.
Existing code that I'm not touching. It's just being moved from one file to the next. I'll cover cleanups with this API function in a separate issue.
It's the batch operations of the bulk update process. They're two different things to me. Plus they match the function signatures in pathauto.admin.inc.
I guess I see your point, but I guess I was taught to always use the word for smaller numbers. Core is mixed on this use with full sentences (sometimes uses 'one', most other times '1'). I've changed it to just use format_plural with 1/@count and provide the total number of aliases generated.
We shouldn't need to if we know exactly the number that is expected. We do the same thing in many core tests.
Revised patch with minor tweaks to UI text/messages and fixes for spelling errors.
Comment #80
Dave Reid: Revised patch that changes the post-batch message to use format 'Generated @count URL aliases.' instead of '@count URL aliases generated.'.
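The message format discussed in #79 and #80 can be illustrated with a small Python stand-in for a `format_plural()`-style helper. This is only a sketch of the idea (numbers instead of words, `@count` substituted into the plural string); the Python function below is not Drupal's implementation, `bulk_update_message` is a hypothetical name, and the singular string 'Generated 1 URL alias.' is an assumption inferred from the plural form quoted above.

```python
def format_plural(count, singular, plural):
    """Pick the singular or plural template and substitute @count."""
    template = singular if count == 1 else plural
    return template.replace("@count", str(count))


def bulk_update_message(count):
    """Build the post-batch status message for `count` generated aliases."""
    return format_plural(
        count,
        "Generated 1 URL alias.",         # assumed singular wording
        "Generated @count URL aliases.",  # plural wording quoted in #80
    )
```

A test that knows exactly how many aliases it expects can compare against the literal string, for example `bulk_update_message(25)` against 'Generated 25 URL aliases.'.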
Comment #81
Dave Reid: Revised patch with a little more abstraction and cleanups with the individual entity alias generation helpers.
Comment #82
bleen: subscribe
Comment #83
Dave Reid: Revised patch for D7 that fixes some tiny mistakes found in the documentation and the batch processing start.
Comment #84
Dave Reid: Posting same patch for D6 to see how the bot likes it.
Comment #85
Dave Reid: Re-testing the D7 patch just to be sure before I pull the trigger.
Comment #87
Dave Reid: Heh, D6 patch uploaded instead of D7. :/
Comment #88
Dave Reid: Proud to say this has been committed to 7.x and 6.x-2.x!
http://drupal.org/cvs?commit=399486
http://drupal.org/cvs?commit=399488
http://drupal.org/cvs?commit=399490
Comment #89
bleen: Cool beans man!
Comment #90
Dave Reid: A side note to anyone here, we've got a small D6 core bug that will hold up any module tests that include a batch process (including this new bulk update feature in Pathauto). If anyone could please help review #867722: PHP notices when batch is used without JavaScript that would be greatly appreciated.
Comment #92
Gabriel R.: The module at #17 is made for updating node aliases, not users or others.
Also, it doesn't work for me. On my setup, it throws the error user warning: UID is not a number. in /blahblah/user_stats/user_stats.module on line 88. ...
Comment #93
asb: Bulk updating taxonomy terms with pathauto 6.x-2.0 doesn't work for me either. Tried this on two sites; in the first one, the bulk update script seemed to loop somehow and continued to generate -0, -1, -2 aliases, until I canceled at -48, -49...; on the second site, it does not even start (just says "initializing..."). A couple of thousand unaliased taxonomy terms...
<irony>It could be worse.</irony> Ah yes, it will be worse when Googlebot hits the broken links and triggers search404 thousands of times.
Maybe we could file a couple of new issues for those, but probably we should save ourselves the trouble of yet another bunch of "can not reproduce" issues :(
Comment #94
Dave Reid: Filing new bug reports is better than commenting on an issue that's been closed for over a year.
Comment #95
asb: Sorry, sometimes the frustration with Drupal is simply overwhelming.
Issue 1 ("bulk creating aliases does not work") is already known and has been tagged for 7.x-1.x-dev: #1289918: Bulk update stuck on initializing - is there a command line I could use?. This issue has been closed as "duplicate" because someone requested an action for Drush: #867578: Add drush commands for bulk alias updating/deleting. While the Drush action definitely would be a "nice to have", it'd be still a workaround for the original issue (bulk alias creation is stuck when initializing). However, #1289918 is closed. It is beyond me if it'd be proper procedure to reopen a closed issue against 7.x-1.x-dev which is also virulent in 6.x-2.0 (and probably 6.x-1.x-dev, which were both last updated on 2011-Oct-31).
Issue 2 ("Pathauto creates duplicate aliases") is a freak phenomenon probably nobody will even bother to look into because there is no procedure to reproduce it, and it has been known in various incarnations for a couple of years as well (#537800: duplicate url aliases, #593048: _pathauto_alias_exists handles language-neutral aliases wrong, and some more). In my case, it's probably some weird interaction between 'Pathauto', 'Globalredirect', 'Path_redirect', and 'Content_taxonomy' that fits nowhere in the issue queue, or rather already exists multiple times in various closed (duplicate, can not reproduce, leave me alone...) states.
Comment #96
JCB: I had some issues updating term aliases, as I have a vocabulary with 50,000 terms.
This led to internal server errors.
I can confirm that updating URL alias is working with views bulk operations (VBO).
I installed the latest dev version which supplied the required options to make this possible.
6.x-2.x-dev tar.gz (42.96 KB) | zip (47.99 KB) 2012-Sep-28 Notes