If I bulk generate aliases on nodes, only 50 nodes are aliased at once.

If this is by design, then this is probably a documentation omission/bug.

If this is not by design, but you still wish to throttle bulk generation, maybe provide a field where we can specify the max number of nodes to bulk edit at once?

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

greggles’s picture

When you visit the admin/settings page, do you have a text box to enter the Maximum number of objects to alias in a bulk update: ?

That is the textbox you desire, right?

I'm starting to think I should create a new "bulk update" page. That page would have this text box and then also the checkboxes to execute bulk update instead of putting them at the bottom of the object-type fieldsets.

Aren Cambre’s picture

Title: Bulk generate only does 50 nodes at once » Create separate bulk update page
Category: bug » feature

OOps, sure is. I think I missed it because I usually expect general settings to be automatically displayed (not hidden in a collapsible block), so I didn't think to look in there.

I agree with your suggestion about the bulk update page. I've changed this to a feature request to reflect that.

doc2@drupalfr.org’s picture

This issue might be duplicate (EDIT: I could be wrong! cf: #4. It could just be related. I'll edit back in that case). Indeed, have you seen this one: #176669: document bulk generation and bulk deletion feature ?

(nevermind about the crossed-out link, just click)

Aren Cambre’s picture

I think this issue is substantively different, but I'll let greggles decide.

greggles’s picture

The original issue seems similar, but the final idea here is to create a separate bulk update page to help users understand how to use it and where to find it. I think that's pretty different from just documenting it. If the separate page is clear enough then the documentation will become less important.

Aren Cambre’s picture

Why do I keep getting emails about this issue? Here is the email:

---------- Forwarded message ----------
From: <>
Date: Mon, Jun 16, 2008 at 11:34 AM
Subject: Your submitted bugs for June 16, 2008
To: XXXXXXXXX@XXXXXXX.XXX

[ Pathauto / Code ]=====================================================
Create separate bulk update page
state: active
age: 26 weeks 2 days
url: http://drupal.org/?q=node/201151

greggles’s picture

If you mean why did you get so many just now that was due to a bug with drupal.org. There should be an explanation post coming soon.

greggles’s picture

Explanation is now live: http://drupal.org/node/271466

Freso’s picture

Version: 5.x-2.0 » 7.x-1.x-dev

FWIW, +1 to a separate page for bulk creating and updating. It has to happen in the proper branch though.

greggles’s picture

Also...this could be simply documentation about how to do this with http://drupal.org/project/views and http://drupal.org/project/views_bulk_operations

That would help a lot and simplify a lot of the code and make it more valuable for end users (though more complex to setup...)

Aren Cambre’s picture

I just now got another email about this issue:

[ Pathauto / Code ]=====================================================
Create separate bulk update page
 state: active
 age: 1 year 3 weeks
 url: http://drupal.org/node/201151

I have opened many issues; I have no idea why I keep getting email notifications of this one.

greggles’s picture

You get e-mails because this project is configured to send reminders about open issues once a month.

I've added a note to the top of the issue submission page for this project to let people know it will happen.

Aren Cambre’s picture

Thanks. Strangely, it has been many months since my prior emailed notice.

Aren Cambre’s picture

Wow, I just got another email. :-)


---------- Forwarded message ----------
From: <info@drupal.org>
Date: Sun, Jul 19, 2009 at 12:51 PM
Subject: Your submitted bugs for July 19, 2009
To: xxxx@xxxxxx.xxx


[ Pathauto / Code ]=====================================================
Create separate bulk update page
 state: active
 age: 1 year 31 weeks
 url: http://drupal.org/node/201151
giorgio79’s picture

Batch updating of pathauto aliases is already possible with VBO and a node view.

What I am looking for is batch update via VBO for term aliases, which is not yet possible :)

Any ideas are appreciated.

adamo’s picture

If you use the Batch API module when doing bulk updates, there will be no need to set a limit. You would be able to bulk update an infinite number of items without having to worry about the PHP max execution time or HTTP request timing out.

http://drupal.org/node/180528

agentrickard’s picture

Title: Create separate bulk update page » Create separate batch update page
Status: Active » Needs review
FileSize
3.21 KB

Here's a module I wrote to accomplish this for a project. It needs some review. I don't really have the bandwidth for a proper patch right now, but will be at DC Paris.

sebos69’s picture

interesting, subscribing...

Aren Cambre’s picture

Just ran into this again. Wow, been two years since I filed this request, but glad there's progress.

Want to again plug for this enhancement. Bulk generate is a tool, not a setting, so it doesn't make sense to bury it inside collapsed sections on a settings page.

Also, moving this to its own tab would also allow you to move the Maximum number of objects to alias in a bulk update section to that page.

sun’s picture

Assigned: Unassigned » sun

Working on this.

EDIT: Marked #212084: perform bulk updates during cron and/or via the batch API as duplicate of this issue.

sun’s picture

FileSize
23.15 KB

Looks big, but isn't really. Quite nice.

Posting what I have after testing on small local development site (works nicely there) - will now test on a (very) large site with >200,000 nodes, >600,000 comments, >150,000 users, and a couple of taxonomy and forum terms.

sun’s picture

Status: Needs review » Needs work

Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 71 bytes) in sites/all/modules/pathauto/pathauto.admin.inc on line 407

ok, need to rethink how to generate the batch operations.

Dave Reid’s picture

Yeah the approach in #21 is going to create a metric crap-ton of batch operations instead of a preferred one operation per module. That operation should then be using context and looping through groups of objects/entities like we usually do with large batch API processes.

sun’s picture

Status: Needs work » Needs review
FileSize
34.54 KB

Alrighty - this one works. And takes its time ;)

sun’s picture

Title: Create separate batch update page » Use batch API to perform path alias bulk updates
FileSize
34.9 KB

Fixed missing PHPDoc.

Ready to fly, if you'd ask me. :)

sun’s picture

FileSize
34.58 KB

Last tweak: Cut down to a single batch set with multiple operations (one op per path alias type).

sun’s picture

I should also mention that those new batch updates use fully loaded entities to generate aliases, which of course slows down the generation process a bit. I'm not sure why it was using minimal database query objects before, but considering that arbitrary tokens can be used for URL alias patterns, this smelled like a bug. (Only nodes were fully loaded via node_load() previously.)

If this is handled elsewhere already, then we can easily revert this to the previous way.

Any chance to get this in?

svendecabooter’s picture

Thanks a lot for this patch sun. This is really a feature that should go into the Pathauto module...
I tested the patch on a dev site with a few thousand nodes & terms, and it worked perfectly.

svendecabooter’s picture

That would be patch #26 obviously.

RobLoach’s picture

Priority: Normal » Critical
Status: Needs review » Reviewed & tested by the community

After applying the patch, I ran through a test for all nodes, tags and users. I really like how it has been moved off the settings page. This has to go in. Even the code is nicer!

Dave Reid’s picture

Status: Reviewed & tested by the community » Needs review

Big thing I noticed is an inconsistency with the queries in the operations. Sometimes we're using a $context['sandbox']['query'] and re-using it in the processing, sometimes we're not but it looks like that pattern should be consistent throughout. I'll test as well today and report back.

+++ pathauto_user.inc
@@ -112,91 +112,160 @@ function contact_pathauto($op) {
+  // Get the total amount of items to process.
+  if (!isset($context['sandbox']['total'])) {
+    $context['sandbox']['total'] = db_result(db_query("SELECT COUNT(uid) FROM {users} LEFT JOIN {url_alias} ON CONCAT('user/', CAST(uid AS CHAR)) = src WHERE uid > 0 AND src IS NULL"));
+    $context['sandbox']['count'] = 0;
+    $context['sandbox']['items'] = array();
+  }
+
+  // Get the next stack of items to process.
+  if (empty($context['sandbox']['items'])) {
+    $query = "SELECT uid FROM {users} LEFT JOIN {url_alias} ON CONCAT('user/', CAST(uid AS CHAR)) = src WHERE uid > 0 AND src IS NULL";
+    $result = db_query_range($query, $context['sandbox']['count'], 1000);
+    while ($uid = db_result($result)) {
+      $context['sandbox']['items'][] = $uid;
+    }
+  }

Why isn't the pattern of $context['sandbox']['query'] used here?

+++ pathauto_user.inc
@@ -112,91 +112,160 @@ function contact_pathauto($op) {
+  // Get the total amount of items to process.
+  if (!isset($context['sandbox']['total'])) {
+    $context['sandbox']['total'] = db_result(db_query("SELECT COUNT(uid) FROM {users} LEFT JOIN {url_alias} ON CONCAT('blog/', CAST(uid AS CHAR)) = src WHERE uid > 0 AND src IS NULL"));
+    $context['sandbox']['count'] = 0;
+    $context['sandbox']['items'] = array();
+  }
+
+  // Get the next stack of items to process.
+  if (empty($context['sandbox']['items'])) {
+    $query = "SELECT uid FROM {users} LEFT JOIN {url_alias} ON CONCAT('blog/', CAST(uid AS CHAR)) = src WHERE uid > 0 AND src IS NULL";
+    $result = db_query_range($query, $context['sandbox']['count'], 1000);
+    while ($uid = db_result($result)) {
+      $context['sandbox']['items'][] = $uid;
     }
   }

Ditto about not using the sandbox query pattern.

+++ pathauto_user.inc
@@ -112,91 +112,160 @@ function contact_pathauto($op) {
+  // Get the total amount of items to process.
+  if (!isset($context['sandbox']['total'])) {
+    $context['sandbox']['total'] = db_result(db_query("SELECT COUNT(uid) FROM {users} LEFT JOIN {url_alias} ON CONCAT(CONCAT('user/', CAST(uid AS CHAR)), '/track') = src WHERE uid > 0 AND src IS NULL"));
+    $context['sandbox']['count'] = 0;
+    $context['sandbox']['items'] = array();
+  }
+
+  // Get the next stack of items to process.
+  if (empty($context['sandbox']['items'])) {
+    // We do the double CONCAT because Pgsql8.1 doesn't support more than three
+    // arguments to CONCAT.
+    $query = "SELECT uid FROM {users} LEFT JOIN {url_alias} ON CONCAT(CONCAT('user/', CAST(uid AS CHAR)), '/track') = src WHERE uid > 0 AND src IS NULL";
+    $result = db_query_range($query, $context['sandbox']['count'], 1000);
+    while ($uid = db_result($result)) {
+      $context['sandbox']['items'][] = $uid;
     }

Ditto.

Powered by Dreditor.

sun’s picture

@Dave: That's really no reason to hold up this patch :P In short: All but user aliases require rather complex queries that depend on configuration values, so the queries to find entities without aliases require quite some preparation. Therefore, the preparation of queries happens once in an initial step. For user-related aliases, all queries are very simply and straightforward, so no special preparation is required. I would consider this difference a good thing, because this way, other developers that may want to integrate with pathauto have examples for both complex and simple implementations.

Thanks for reviewing! :)

sun’s picture

Status: Needs review » Reviewed & tested by the community

Maintainer feedback?

sun’s picture

bump

greggles’s picture

I left explanations in comments #7 and #8 and #10.

I think we should even consider removing the bulk generation checkboxes from Pathauto and making it rely on VBO more heavily. Ken raised the point that we shouldn't have to rely on other modules to get this functionality, but that's not really compelling to me. The point of modules is that we rely on them to do what they do best. Pathauto is not a "bulk operations" tool - VBO is.

Dave Reid’s picture

I have to agree with Greg. Although this is a nice feature to cleanup/fix, I think we should move it to a separate project for batch updating and/or aliasing un-aliased locations on cron since that would be great. If we're going to attempt to get Pathauto into Drupal 8, it's got to be lean and mean and I don't think this feature would come along.

We should stick to supporting hook_node/user/etc_operations() so that other modules (like VBO or a pathauto_cron) can easily extend the basic features.

Dave Reid’s picture

Issue tags: +bulk

Tagging all the bulk alias issues for #713238: RFC: Pathauto Bulk module.

sun’s picture

I have to disagree. Strongly. Please reconsider based on the following:

  1. VBO is based on views. Views only work for a single, specific entity. To mass-create and update aliases across entities, you have to 1) have a view for each entity in the first place and 2) invoke the action for every single entity separately.
  2. VBO is based on views. If you look at the queries performed to gather the entities that do not have an alias yet, then you'll quickly recognize that you will have to create new, specific views for Pathauto - which would require views filters that do not exist yet.
  3. This patch is just a start to convert the current functionality. After using it, I ran into the next best problem: Bulk updating aliases, regardless of whether an alias exists already - keeping the old alias to prevent 404s, but not duplicating the existing alias if the new would be the same.

I seriously think that this functionality belongs into Pathauto module itself. But if absolutely required, then I'll create a new module for it.

Dave Reid’s picture

@sun: I'm not sure exactly why #713238: RFC: Pathauto Bulk module is so bad, but I talked a little bit with greggles today and most likely what we'll do is make the batch updating a sub-module of pathauto in 6.x-2.x and 7.x-1.x.

greggles’s picture

Thanks for providing some of your motivation.

  1. This is true, but I don't see what the problem is. The entities we need are pretty simple: nodes, users, terms, and maybe one or two other small things. That is not hard.
  2. Creating and exporting views is quite easy. Look at project module, for example, which exports lots of views. We can create new filters for the things we need.
  3. In general bulk updates should respect the update action. Part of exposing this as VBO is that it would be much easier to give the admin more fine grained control. We can create separate actions as buttons in VBO for "delete old alias, create new" and "create new alias, redirect from old alias" etc. so that the admin has fine grained control over the specific action being done without having to change their default setting for the editing of a node. Many people have requested that.

I appreciate your passion for this issue, but would be disappointed if the end result is that you create a new module for it ignoring the advice that I have on it. After 3.5 years as the maintainer of this module I've learned a little bit about how people currently use it and how they want to use it ;)

sun’s picture

  1. D7 supports many more entities. Infinite, to be exact.
  2. So where are (all) those views located? I still fail to see how views are actually and fundamentally remotely related to bulk updating URL aliases.
  3. We can do this, and a lot more, without Views and VBO, too. And, yes, I still don't see the connection to Views. I need a real view when I want to permanently work with the data in a view. But not for one-time, one-shot mass-operations after installing or re-configuring a module.

I would be disappointed too, if I'd have to create a separate module. So let's just avoid that! :)

sun’s picture

Version: 7.x-1.x-dev » 6.x-2.x-dev

This patch is against 6.x-2.x.

Answers on earlier replies:

I think we should move it to a separate project for batch updating and/or aliasing un-aliased locations on cron

I also fail to see how cron is involved in mass-updating URL aliases. Again, doing this is a one-time, one-shot job. If you need to do it, then you need it immediately, not after the next cron runs, or even worse, after the next 500 cron runs.

If we're going to attempt to get Pathauto into Drupal 8, it's got to be lean and mean and I don't think this feature would come along.

a) D8 is galaxies away. b) Even for Drupal core, we will need a way to update existing aliases according to the new configuration. Ultimately, the Path module maintainer will have the final word on this, but at least for me, a Pathauto feature without mass-operations makes no sense at all.

We should stick to supporting hook_node/user/etc_operations() so that other modules (like VBO or a pathauto_cron) can easily extend the basic features.

But you know that exactly those operations are completely busted, right? Exactly this reason, VBO implements proper actions instead of those operations.

While the generic usage of actions would be worth to explore for Pathauto, I already doubt that it would be the proper use-case scenario for actions. Actions are targeted at automated operations that shall be configurable and perhaps even entirely skippable (not enabled by default) for the site administrator. The configurable part applies, but the skippable part does not.

chx’s picture

If you can it via the batch API, do it. Making it relying on the rather big Views + VBO code base is not a wise move.

giorgio79’s picture

Making it relying on the rather big Views + VBO code base is not a wise move

Big or not, Views + VBO is the future. :) IMHO with views users will have much more options like the ability to filter, slice and dice, write custom actions etc.

For large sites this is essential so they dont have to redo the entire alias table, or content type section. With Views this becomes much much more flexible and the nodes needing realias can be properly isolated with SQL.

agentrickard’s picture

greggles and I had this argument once. Personally, I don't think modules should introduce a long chain of dependencies (Views, VBO) for administrating themselves.

However, I have always deferred to his choice as maintainer. Over in another project, I'm in the process of adding support for both, by providing a default mechanism and exposing a hook_node_operations() to VBO.

sun’s picture

Guys, I need to advance on this batch API functionality very soon (see #38 and following). It would be great to know the target code-base for my efforts.

greggles’s picture

I'm mostly deferring to Dave on this. I'd be fine if this were included in the 6.x-2.x branch.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, pathauto.batch_.26.patch, failed testing.

sun’s picture

I pinged Dave via Skype. Hope that we can quickly agree on a direction, as I likely have to advance on this functionality this week.

Dave Reid’s picture

We discussed:
1) Let's go with the batch API patch as is for now. We can re-evaluate and discuss splitting this off into a sub-module of Pathauto in a new issue once it lands. It doesn't make sense to try and do both. Let's just fix the darn thing.
2) Re-roll against 2.x.
3) At least basic tests would be required.
4) Forward-port against HEAD would be cool (actually it would be very nice since I'd like to keep 6.x-2.x in sync as much as possible. Ideal is apply to 7.x-1.x first, then backport).

sun’s picture

Status: Needs work » Needs review
FileSize
37.93 KB

Hm. I can't really explain the notices I get when running tests locally - does D6 support testing a batch at all?.

RobLoach’s picture

I'm getting a couple hunk hits against DRUPAL-6--2. That's the correct branch, right?

Dave Reid’s picture

Issue tags: -bulk

#51: pathauto.batch_.51.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +bulk

The last submitted patch, pathauto.batch_.51.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
37.86 KB

Straight re-roll.

sun’s picture

FileSize
37.41 KB
Dave Reid’s picture

How soon could we get a patch for 7.x-1.x? I don't want to commit something that's going to put 6.x-2.x and 7.x-1.x out of sync.

sun’s picture

In the meantime, bulk updates entirely stopped working in 2.x-dev.

And I'm somehow unable to re-roll this patch.

sun’s picture

FileSize
37.62 KB

Was a bit tough to re-roll...

sun’s picture

What's left to move on here?

rapsli’s picture

what's the status here?

chx’s picture

Note that if the powers that be want this patch in, a D7 port is readily available.

greggles’s picture

chx’s picture

FileSize
26.61 KB

HEAD patch, unfinished but works for node, user and taxonomy.

Status: Needs review » Needs work

The last submitted patch, bulkupdate-head.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
Dave Reid’s picture

Version: 6.x-2.x-dev » 7.x-1.x-dev

Proper version.

Jody Lynn’s picture

I'm concerned that this batch update still only updates items without aliases (correct?). On a live site, if you want to update your aliases, you want to add additional aliases in bulk and not have to delete all your existing aliases first (which the admin/content/node 'refresh aliases' can do but not yet in bulk). I just wrote a custom module to give me a 'refresh' batch callback that goes through all my nodes with pathauto_create_alias.

sun’s picture

@Jody: Right, that's more or less what I wanted to add after this baseline hit the floor, but it never did.

I'm a bit disgusted with this issue, somehow hoping that an angel will hand some flowers, rainbows bending, and a black magician takes off the hat. Your wishes may vary. :)

greggles’s picture

Component: Code » Bulk generation

I just wrote a custom module to give me a 'refresh' batch callback that goes through all my nodes with pathauto_create_alias.

@Jody Lynn - You could have used the existing action and VBO to achieve a similar thing.

@sun, I'm sorry you feel disgusted by this. I feel frustrated as well that people want to build code to do what is already possible and haven't yet worked on actions for users, terms etc. which would be truly novel functionality. As I mentioned earlier, I've handed off the Bulk generation "component" to Dave Reid and am letting him act as architect and gate keeper for concepts.

Dave Reid’s picture

I thought #201151-57: Use batch API to perform path alias bulk updates was pretty darn clear, but was never answered until chx found me in IRC a few days ago and I asked him to start posting his D7 patch into the issue so I could at least take a look at it.

adrien.gibrat’s picture

Issue tags: -bulk

#64: bulkupdate-head.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +bulk

The last submitted patch, bulkupdate-head.patch, failed testing.

mrtorrent’s picture

subscribing

Dave Reid’s picture

Assigned: sun » Dave Reid
Status: Needs work » Needs review
FileSize
52.49 KB

Revised patch against D7 and executive summary:

  1. Adds a new 'Bulk update' form at admin/config/search/path/update_bulk
  2. Removes the bulk update checkboxes from admin/config/search/path/settings and removes the no longer necessary variables (including pathauto_modulelist).
  3. Adds a proper pathauto hook registry via pathauto_hook_info(). Any hook_pathauto() or hook_path_alias_types() can and should be moved to modulename.pathauto.inc.
  4. Moves pathauto's node, taxonomy, forum, user, and blog implementations to pathauto.pathauto.inc. Adds pathauto_module_implements_alter() to support this move. This also gets rid of _pathauto_include().
  5. Ensure nodes and taxonomy terms have their own 'update alias' API functions that can be re-used in views bulk operations and during batch generation
  6. Adds a 'batch_update_callback' and 'batch_file' properties to hook_pathauto('settings'). batch_file is used to ensure the file with the batch callback is loaded since batch API runs in new requests and may not have files like pathauto.pathauto.inc loaded. The 'batch callback' info is simply a function name. It's up to the function itself to do all the logic; Pathauto just runs it.
  7. Adds batch callback implementations for node, taxonomy, forum, user, and blogs. Since this is a proper batch API process, we don't currently care if some items don't have patterns. There will be minimal execution for those cases. We can refine this further.
  8. The queries executed in the batch operations have a tag of 'pathauto_bulk_update' and the entity type added as metadata. Useful for other modules that want to extend or change these queries.
  9. We're using proper object loading for all callbacks (node_load_multiple/user_load_multiple), so we should have the full object.
  10. Removes the obsolete pathauto_node.inc, pathauto_taxonomy.inc and pathauto_user.inc files.
  11. Includes a test of the bulk updating functionality.

Followups:

  • #863980: Stop special-casing the forum vocabulary - Will remove all the forum-handling code and simply both the bulk update form but also the patterns form page.
  • Add a 'delete aliases before updating' to the update form so we can make it a one-step process to re-generate aliases for all items
  • Add an 'Only generate aliases for items that do not already have an alias' checkbox so we can make the joins on url_alias optional.
  • Add some code to restrict the bulk update queries to only bundles that will have patterns.
  • Doing left-joins from an entity table (e.g. node) to url_alias will not work when entity_uri() returns a non-default value. We need to abstract more logic to re-use entity_info and rely less and less on hook_pathauto().

Status: Needs review » Needs work

The last submitted patch, 201151-pathauto-bulk-update-D7.patch, failed testing.

Dave Reid’s picture

Status: Needs work » Needs review

Testbot client failure.

sun’s picture

Status: Needs review » Needs work
+++ pathauto.install	26 Jul 2010 19:33:44 -0000
@@ -123,6 +116,21 @@ function pathauto_update_6200() {
+function pathauto_update_7000() {
...
+  variable_del('pathauto_forum_bulkupdadate');
...
+  variable_del('pathauto_blog_bulkupdadate');

2 typos in bulkupdate here.

+++ pathauto.module	26 Jul 2010 19:33:45 -0000
@@ -24,6 +24,36 @@
+/**
+ * Implements hook_module_implements_alter().
+ *
+ * Adds pathauto support for core modules.
+ */
+function pathauto_module_implements_alter(&$implementations, $hook) {
+  $hooks = pathauto_hook_info();
+  if (isset($hooks[$hook])) {
+    $modules = array('node', 'taxonomy', 'user', 'forum', 'blog');
+    foreach ($modules as $module) {
+      if (module_exists($module)) {
+        $implementations[$module] = TRUE;
+      }
+    }
+    // Move pathauto.module to get included first since it is responsible for
+    // other modules.
+    unset($implementations['pathauto']);
+    $implementations = array_merge(array('pathauto' => 'pathauto'), $implementations);
+  }
+}

I don't understand why this entire module_implements_alter is needed. You don't need it just to support hook implementations on behalf of core modules. Also not sure why pathauto's implementation needs to be called first?

EDIT: Read the explanation elsewhere. Still looks a bit like abusing hook_module_implements_alter() to me, and just because we want to remove 4 other lines that include a file? At the very least, this should be discussed in a separate issue/patch, as it has little to do with this issue.

+++ pathauto.pathauto.inc	26 Jul 2010 19:33:45 -0000
@@ -0,0 +1,422 @@
+  $query->addTag('pathauto_bulk_update');
...
+  $result_count = pathauto_taxonomy_term_update_alias_multiple($tids, 'bulkupdate');

When I worked on the D6 patch, it became more and more apparent that, with a batch API driven mass-update, the actual (bulk)update of individual entities is not different to a regular update of an individual entity.

I would therefore suggest to drop the entire "bulkupdate" operation and replace it with the already existing "update". Any informational messages are (or should be) generated by the caller anyway. That, however, could also be deferred to another issue.

Likewise, I do not see what other kind of special queries Pathauto should ever perform, so a query tag of "pathauto_bulk_update" seems overly lengthy/verbose to me -- "pathauto" sufficiently cuts it and leads to hook_query_pathauto_alter(). Additionally, it's a bit wonky to tag a SELECT query with the term "update".

+++ pathauto.pathauto.inc	26 Jul 2010 19:33:45 -0000
@@ -0,0 +1,422 @@
+ * Implement hook_pathauto() for forum module.

s/Implement/Implements/

+++ pathauto.pathauto.inc	26 Jul 2010 19:33:45 -0000
@@ -0,0 +1,422 @@
+function forum_pathauto($op) {
+  switch ($op) {
+    case 'settings':
...
+      return (object) $settings;
+    default:
+      break;
+  }
+}

(and elsewhere)

1) Why $op if there is just one possible? Also, we de-$op-ified D7, so hook_pathauto_info() or similar would be more appropriate.

2) Those $settings are being type-casted to an object everywhere, that doesn't make much sense to me.

3) Missing newline between return and default.

4) default case is empty in all of these switch structures (and a default case wouldn't make much sense for this kind of hook either).

+++ pathauto.pathauto.inc	26 Jul 2010 19:33:45 -0000
@@ -0,0 +1,422 @@
+function forum_pathauto_bulk_update_batch_process(&$context) {
...
+function user_pathauto_bulk_update_batch_process(&$context) {
...
+function blog_pathauto_bulk_update_batch_process(&$context) {

"bulk" and "batch" are synonyms, these function names can definitely be shortened.

+++ pathauto.test	26 Jul 2010 19:33:45 -0000
@@ -49,11 +49,21 @@ class PathautoTestHelper extends DrupalW
+  function assertNoEntityAlias($entity_type, $entity, $langauge = LANGAUGE_NONE) {

Typo in LANGUAGE_NONE

+++ pathauto.test	26 Jul 2010 19:33:45 -0000
@@ -404,3 +419,55 @@ class PathautoLocaleTestCase extends Pat
+      'name' => 'Pathuato bulk updating',

Typo in Pathauto

+++ pathauto.test	26 Jul 2010 19:33:45 -0000
@@ -404,3 +419,55 @@ class PathautoLocaleTestCase extends Pat
+    // Run the update again which should only run against the new node.
+    $this->drupalPost('admin/config/search/path/update_bulk', $edit, t('Update'));
+    $this->assertText('One node URL alias generated.');
+    $this->assertText('No new user URL aliases generated.');

As visible in this test code, it would be much more readable if these messages would simply use numbers instead of complex wording ("one"/"no new"). They are administrator-only messages after all.

Additionally, the test needs to use the same mechanism to generate the messages, i.e. format_plural()

Powered by Dreditor.

Dave Reid’s picture

Status: Needs work » Needs review
FileSize
52.17 KB

I don't understand why this entire module_implements_alter is needed. You don't need it just to support hook implementations on behalf of core modules. Also not sure why pathauto's implementation needs to be called first?

EDIT: Read the explanation elsewhere. Still looks a bit like abusing hook_module_implements_alter() to me, and just because we want to remove 4 other lines that include a file? At the very least, this should be discussed in a separate issue/patch, as it has little to do with this issue.

This is same thing we have to do with Token.module to add core-support for hooks if we don't want to clutter them into the main module file. I think this is a valid change for this. Especially since I would have had to add more calls to _pathauto_include() rather than just module_invoke_all('pathauto').

When I worked on the D6 patch, it became more and more apparent that, with a batch API driven mass-update, the actual (bulk)update of individual entities is not different to a regular update of an individual entity.

I would therefore suggest to drop the entire "bulkupdate" operation and replace it with the already existing "update". Any informational messages are (or should be) generated by the caller anyway. That, however, could also be deferred to another issue.

Using $op = 'bulkupdate' disables verbose mode. I'd prefer not to change this functionality too.

Likewise, I do not see what other kind of special queries Pathauto should ever perform, so a query tag of "pathauto_bulk_update" seems overly lengthy/verbose to me -- "pathauto" sufficiently cuts it and leads to hook_query_pathauto_alter(). Additionally, it's a bit wonky to tag a SELECT query with the term "update".

The tag should at least be kinda descriptive? We would have the possibility that we'd have other queries in Pathauto. Maybe use 'pathauto_bulk_update_select' as the tag? We don't use 'node' as the node access query tag.

1) Why $op if there is just one possible? Also, we de-$op-ified D7, so hook_pathauto_info() or similar would be more appropriate.

2) Those $settings are being type-casted to an object everywhere, that doesn't make much sense to me.

3) Missing newline between return and default.

4) default case is empty in all of these switch structures (and a default case wouldn't make much sense for this kind of hook either).

Existing code that I'm not touching. It's just being moved from one file to the next. I'll cover cleanups with this API function in a separate issue.

"bulk" and "batch" are synonyms, these function names can definitely be shortened.

It's the batch operations of the bulk update process. They're two different things to me. Plus they match the function signatures in pathauto.admin.inc.

As visible in this test code, it would be much more readable if these messages would simply use numbers instead of complex wording ("one"/"no new"). They are administrator-only messages after all.

I guess I see your point, but I guess I was taught to always use the word for smaller numbers. Core is mixed on this use with full sentences (sometimes uses 'one', most other times '1'). I've changed it to just use format_plural with 1/@count and provide the total number of aliases generated.

Additionally, the test needs to use the same mechanism to generate the messages, i.e. format_plural()

We shouldn't need to if we know exactly the number that is expected. We do the same thing in many core tests.

Revised patch with minor tweaks to UI text/messages, fixes spelling errors

Dave Reid’s picture

Revised patch that changes the post-batch message to use format 'Generated @count URL aliases.' instead of '@count URL aliases generated.'.

Dave Reid’s picture

Revised patch with a little more abstraction and cleanups with the individual entity alias generation helpers.

bleen’s picture

subscribe

Dave Reid’s picture

Revised patch for D7 that found some tiny mistakes in documentation and the batch processing start.

Dave Reid’s picture

Version: 7.x-1.x-dev » 6.x-2.x-dev
FileSize
46 KB

Posting same patch for D6 to see how the bot likes it.

Dave Reid’s picture

Version: 6.x-2.x-dev » 7.x-1.x-dev
FileSize
46.02 KB

Re-testing the d7 patch just to be sure before I pull the trigger.

Status: Needs review » Needs work

The last submitted patch, 201151-pathauto-bulk-update.patch, failed testing.

Dave Reid’s picture

Status: Needs work » Needs review
FileSize
56.99 KB

Heh, D6 patch uploaded instead of D7. :/

Dave Reid’s picture

Status: Needs review » Fixed
bleen’s picture

Cool beans man!

Dave Reid’s picture

A side note to anyone here, we've got a small D6 core bug that will hold up any module tests that include a batch process (including this new bulk update feature in Pathauto). If anyone could please help review #867722: PHP notices when batch is used without JavaScript that would be greatly appreciated.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Gabriel R.’s picture

The module at #17 is made for updating node aliases, not users or others.

Also, it doesn't work for me. On my setup, it throws the error user warning: UID is not a number. in /blahblah/user_stats/user_stats.module on line 88. ...

asb’s picture

Bulk updating taxononomy terms with pathauto 6.x-2.0 doesn't work for me either. Tried this on two sites; in the first one, the bulk update script seemed to loop somehow and continued to generate -0, -1, -2 aliases, until I canceled at -48, -49...; on the second site, it does not even start (just says "initializing..."). A couple of thousand unaliased taxonomy terms... <irony>it could be worse</irony>. Ah yes, it will be worse when Googlebot hits the broken links and triggers search404 thousands of times.

Maybe we could file a couple of new issues for those, but probably we should save us the trouble of just another bunch of "can not reproduce" issues :(

Dave Reid’s picture

Filing new bug reports are better than commenting on an issue that's been closed for over a year.

asb’s picture

Sorry, sometimes the frustration with Drupal is simply overwhelming.

Issue 1 ("bulk creating aliases does not work") is already known and has been tagged for 7.x-1.x-dev: #1289918: Bulk update stuck on initializing - is there a command line I could use?. This issue has been closed as "duplicate" because someone requested an action for Drush: #867578: Add drush commands for bulk alias updating/deleting. While the Drush action definitely would be a "nice to have", it'd be still a workaround for the original issue (bulk alias creation is stuck when initializing). However, #1289918 is closed. It is beyond me if it'd be proper procedure to reopen a closed issue against 7.x-1.x-dev which is also virulent in 6.x-2.0 (and probably 6.x-1.x-dev, which were both last updated on 2011-Oct-31).

Issue 2 ("Pathauto creates duplicate aliases") is a freak phenomenon probably nobody will even bother to look into because there is not procedure to reproduce it, and it is known in various incarnations for a couple of years as well (#537800: duplicate url aliases, #593048: _pathauto_alias_exists handles language-neutral aliases wrong, and some more). In my case, it's probably some weird interaction between 'Pathauto', 'Globalredirect', 'Path_redirect', and 'Content_taxonomy', that fits nowhere in the issue queue respectively already exists multiple times in various closed (duplicate, can not reproduce, leave me alone...) states.

JCB’s picture

I had some issues updating term aliases as I have vocabulary with 50,000 terms.
This lead to internal server errors.

I can confirm that updating URL alias is working with views bulk operations (VBO).

I installed the latest dev version which supplied the required options to make this possible.
6.x-2.x-dev tar.gz (42.96 KB) | zip (47.99 KB) 2012-Sep-28 Notes