Project: Views Bulk Operations (VBO)
Version: 6.x-1.x-dev
Component: Code
Category: task
Priority: normal
Assigned: Unassigned
Status: closed (fixed)

Issue Summary

Using VBO to set the taxonomy on a set of 500 nodes generally works, but the execution time is very long. On one server it takes 75 seconds. On another server it is faster, but it often simply fails to finish (I don't have access to the logs there to determine why). On larger data sets, an "out of memory" error is reported (unfortunately, I don't have specs for that server configuration).

We are actually interested in using this on sets of ~8K nodes, but that is not feasible at present.

Let me know what you think: whether the code could be optimized, what would be involved, and whether I could help with it.

Thanks.

Comments

#1

I started looking at performance a couple of days ago. But be aware that reducing memory consumption and improving speed tend to pull in opposite directions. For now, I'll focus on making VBO succeed for any number of nodes. Later I'll look into the speed issue.

#2

Great. Thank you.

#3

I committed a fix that allows the direct execution of actions on 10K+ nodes to succeed. Please try it (from CVS if you're in a hurry) and let me know.

#4

I took the dev version marked April 02 but I don't see any appreciable difference:

500 rows processed in about 70956 ms
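
For scale, here is a back-of-the-envelope calculation from that figure. It is only a sketch: it assumes the run time grows linearly with the number of nodes, which nothing in this thread confirms, and the function names are made up for illustration.

```python
# Figure reported above: 500 rows processed in about 70956 ms.
ROWS = 500
ELAPSED_MS = 70956


def per_row_ms(rows: int, elapsed_ms: float) -> float:
    """Average cost of one row, in milliseconds."""
    return elapsed_ms / rows


def projected_minutes(target_rows: int,
                      rows: int = ROWS,
                      elapsed_ms: float = ELAPSED_MS) -> float:
    """Linear extrapolation (an assumption) to a larger node count."""
    return per_row_ms(rows, elapsed_ms) * target_rows / 60_000


if __name__ == "__main__":
    print(f"{per_row_ms(ROWS, ELAPSED_MS):.1f} ms per row")          # 141.9 ms per row
    print(f"{projected_minutes(8000):.1f} minutes for 8000 nodes")   # 18.9 minutes for 8000 nodes
```

So at roughly 142 ms per node, the ~8K-node case from the issue summary would need about 19 minutes in a single request if the cost stayed constant, which is well beyond a typical PHP execution time limit.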

When I run it against 14K nodes, I get an error that the MySQL server has gone away (on a Windows machine). I tried this a few times, with the same result each time. The machine has 2 GB of RAM.

#5

More important than the total RAM of the machine is the memory allocated to PHP in php.ini (the memory_limit setting). I tested with 128MB, and whereas the earlier version exceeded the memory limit, this new version completed successfully. What's your memory setting?

Also, the time taken is irrelevant at this point, since we're only looking at memory consumption.

#6

I also successfully tested publishing 10K nodes under 64MB of PHP memory, running Ubuntu Linux.

#7

Hi,

I've just tested a bulk operation on 9000 nodes, using the module released 2 days ago.

Well,
- VBO : 6.x Dev version, 14th of April 2009
- Hardware : Athlon 64 X2 (don't remember the model), 2GB RAM
- Drupal 6 : 2 CCK Fields per node
- PHP : 30 minutes timeout, 96MB limit.
- System : Ubuntu Server 8.04

I tried to resave all the nodes (because I've added a Computed CCK Field, which is only taken into account when the node is saved). It takes about 25 minutes for about 800 nodes, then fails because of a memory overflow.
Is this expected?

#8

@artscoop, do you mean that you're using the Save node action?

I'm testing again on my own config. I'll follow up on this thread.

#9

Yep kratib, it's the "Save node" action.
Thanks

#10

@artscoop: can you please select the Batch API option in the VBO settings and try again? I believe this is a better option for large datasets.
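
To illustrate why batching helps with large datasets, here is a minimal sketch of the general idea: process the selected IDs in small chunks so that only one chunk's worth of work is in flight at a time. This is not VBO's or Drupal's actual code (the Batch API additionally spreads the chunks over several page requests and shows a progress bar); the names `chunks`, `run_in_batches` and the chunk size of 50 are assumptions made up for the example.

```python
from typing import Callable, Iterable, Iterator, List


def chunks(ids: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield successive lists of at most `size` ids."""
    batch: List[int] = []
    for node_id in ids:
        batch.append(node_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, chunk.
        yield batch


def run_in_batches(ids: Iterable[int],
                   action: Callable[[int], None],
                   size: int = 50) -> int:
    """Apply `action` to every id, one chunk at a time; return the count processed."""
    processed = 0
    for batch in chunks(ids, size):
        for node_id in batch:
            action(node_id)
        processed += len(batch)
        # A real batch runner would end the request here, which frees
        # the memory used by this chunk and lets it report progress.
    return processed


if __name__ == "__main__":
    saved: List[int] = []
    total = run_in_batches(range(9000), saved.append, size=50)
    print(total, "nodes processed")
```

The trade-off is the one mentioned earlier in the thread: smaller chunks keep peak memory low but add per-chunk overhead, so the whole run takes longer.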

#11

I tried editing the taxonomy on 9361 nodes today. It ran for ~11 minutes on an Ubuntu 9 machine with 2G dual core Intel CPU and these settings in php.ini:

max_execution_time = 3000 ; Max execution time in seconds
memory_limit = 256M ; Max amount of memory a script may consume

So it took 11 minutes but it finished. The Batch API option works beautifully and makes this 11 minute issue not too bad. :)

I told my client to try this option and this was his report:

I just enabled this on the live site and selected 3,300 items to change the workflow state (as a test). Before I could get to the confirmation page I got this error message:

Fatal error: Allowed memory size of 67108864 bytes exhausted (tried to allocate 6551759 bytes) in /home/xxx/drupal/includes/database.mysqli.inc on line 303

That shared host machine indeed has memory_limit set to 64M.
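
As a quick check that the byte count in the error matches that setting, here is a small sketch that converts php.ini shorthand sizes to bytes. It assumes the usual K/M/G suffixes with 1024-based multipliers; the helper name `shorthand_to_bytes` is made up for the example, and it does not handle empty or malformed values.

```python
# Multipliers for php.ini shorthand suffixes (powers of 1024).
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def shorthand_to_bytes(value: str) -> int:
    """Convert a value such as '64M' to a number of bytes."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in UNITS:
        return int(value[:-1]) * UNITS[suffix]
    # No suffix: the value is already in bytes.
    return int(value)


if __name__ == "__main__":
    print(shorthand_to_bytes("64M"))   # 67108864, the limit quoted in the error
    print(shorthand_to_bytes("256M"))  # 268435456
```

So 67108864 bytes is exactly 64M, and the single 6551759-byte allocation that failed was a bit over 6 MB on its own.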

Any idea why it might have worked for you on 64M but not on this shared host?

Thanks.

#12

I tried the Batch API scenario on 64M and ran into the memory problem. It was failing for a different reason. I've fixed it now, so please try again.

#13

Client said that he used "the 6.x-1.x-dev snapshot dated 5/6/09" and that "it took about 80 minutes to handle ~8k items. This is with the batch API turned on."

So while this is extremely slow, it does work *and* there is feedback for the user via the batch API.

#14

Status: active » fixed

Shared hosting is notoriously under-resourced. There are a lot of different ways to tune a Drupal installation for performance, but I think that thanks to this thread (and your patience), we've at least enabled VBO to work in low-resource environments. I'll mark this as fixed :-)

#15

Status: fixed » closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.