When I import from a feed using the form (/import/node_importer), I get a nice little progress bar and I can import roughly 4k rows in about a minute and a half. How can I get the same performance using cron (periodic import / process in background)? I currently have it configured to run every 15 minutes, but it seems to make only about 1% progress each time it runs, in other words about 4% every hour. Is there a way to make it import the whole lot (100%) every 15 minutes? Am I missing something?
Thanks a bunch.
Comments
Comment #1
sorensong commented
Anyone?
Comment #2
sorensong commented
Comment #3
surf12 commented
I have the same problem.
Comment #4
surf12 commented
I have the same problem.
Comment #5
davemaxg commented
Cron processes feeds in chunks. The default chunk size of 50 is extremely small. Just add this line to your settings.php file to change the chunk size:
$conf['feeds_process_limit'] = 2000;
I believe the only limit is a number that will not cause a PHP script timeout. 2000 works fine for me.
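The "about 1% per run" in the original question is consistent with this chunk size: 50 rows out of roughly 4,000 is 1.25%. A minimal sketch of that arithmetic, assuming exactly one chunk is processed per cron run (newer Feeds versions may process more than one). The function name `cron_runs_needed` is hypothetical and not part of Feeds.

```php
<?php
/**
 * Estimate how many cron runs an import needs.
 *
 * Assumption: one chunk of $rows_per_run rows is processed per cron run.
 */
function cron_runs_needed($total_rows, $rows_per_run) {
  if ($rows_per_run <= 0) {
    throw new InvalidArgumentException('rows_per_run must be positive');
  }
  return (int) ceil($total_rows / $rows_per_run);
}

$interval_minutes = 15;

// Default chunk size of 50: 80 runs, i.e. 1200 minutes (20 hours).
$runs_default = cron_runs_needed(4000, 50);
printf("limit 50:   %d runs, %d minutes\n", $runs_default, $runs_default * $interval_minutes);

// Raised chunk size of 2000: 2 runs, i.e. 30 minutes.
$runs_raised = cron_runs_needed(4000, 2000);
printf("limit 2000: %d runs, %d minutes\n", $runs_raised, $runs_raised * $interval_minutes);
```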
Comment #6
star-szr
Thanks @davemaxg, that did the trick. I had a large CSV file that was only importing 1% at a time.
Comment #7
ressa
Thanks @davemaxg. I was having the same issue using Elysia cron to trigger a job_scheduler_cron import job, but it got stuck after 33 nodes. Adding
$conf['feeds_process_limit'] = 2000;
in settings.php fixed it.
I wonder why the max. limit is set that low, and not just, e.g., 2500 to start with. If people have issues with PHP scripts timing out, they could always lower that value.
Comment #8
ndf commented
To all,
The default chunk size is set low because a high value can really kill your website's performance.
The default php.ini settings (used on the frontend) give you 30 seconds to finish a PHP script (max_execution_time). On high-performance sites, this setting could be lower.
If you set your feeds_process_limit high, then the import process can easily take more than 30 seconds. At 30 seconds you get your timeout error.
Feeds has multiple ways to run the import process:
In most setups cron runs with different php-settings (via php-cli) than the normal frontend php.ini. This is cool, because on frontend you want speed (a lot of concurrent users / php-processes running short time) and on backend you want power (a php process that imports all your nodes).
Drush also uses these php-cli.ini settings.
The default max_execution_time for php-cli is 0, which means that it could run forever. That should be enough for 2000 nodes.
So if you want to import all your nodes every 15 minutes on a live site, I would recommend importing via php-cli. That way you can set your feeds_process_limit high without hurting your frontend performance.
How to do it:
Drupal variables like "feeds_process_limit" can be changed in multiple locations. The easiest ways are:
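One possible setup, sketched here as an assumption and not taken from the list originally posted in this comment: set the limit in settings.php depending on whether PHP is running from the command line. `PHP_SAPI` is a built-in PHP constant; the values 2000 and 50 are the ones mentioned in this thread.

```php
<?php
// settings.php fragment (sketch, not an official Feeds recommendation).
// Use a large chunk size only when PHP runs from the command line
// (for example cron started through Drush or php-cli), and keep the
// small default for requests served by the web server.
if (PHP_SAPI === 'cli') {
  $conf['feeds_process_limit'] = 2000;
}
else {
  $conf['feeds_process_limit'] = 50;
}
```

This only helps when cron actually runs through php-cli. Cron triggered by requesting cron.php over HTTP runs under the web server's PHP settings and would keep the low limit.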
Comment #9
bennos commented
I think we can close this. The solution is above.
About the limit: normally the chunk size of 50 works in every Drupal environment. If the imports are bigger, you can set this higher.
Comment #10
bennos commented
Comment #11.0
(not verified) commented
wording
Comment #12
mozh92 commented
I need help.
If I set $conf['feeds_process_limit'] = 2000; I get a 504 error.
I tried 1000, 500 and 200, but I always get a 504 error.
If I set the limit to 50, it works!
What settings do I need on my server? I want a limit of 2000. Thanks!
I have max_execution_time 3600 and max_input_time 3600.
Comment #13
Alexandre360 commented
Hello,
I have a very large set of user entities to import.
It seems that $conf['feeds_process_limit'] only works for nodes. How can I speed up the import for user entities?
Comment #14
Alexandre360 commented
Any news about that? I wonder if I'm the only one who imports users with Drupal...
Comment #15
megachriz
The effect of the 'feeds_process_limit' setting depends on the parser being used. Not every parser respects this setting. For example, the CSV parser and the parsers from Feeds extensible parsers respect this setting. So the solution posted in #8 should work for every processor (node, user, taxonomy term).
Since Feeds 7.x-2.0-beta1, Feeds will try to import multiple batches per cron run, depending on whether there is still time left to run another batch. The time limit for this is 60 seconds, see feeds_cron_queue_info(). So if each batch takes 25 seconds, it will do three batches per cron run (when the second batch is completed it has run for 50 seconds, so it will do another one). This behaviour was added in #1231332: periodic import imports only one file per cron.
Comment #16
maxplus commented
Thanks,
I have also started testing the setting "$conf['feeds_process_limit'] = 2000;" because of very slow imports of big sources.
Comment #17
cyclone321 commented
I agree with #15: setting the limit merely increases the number of nodes per queue item, so if, as in my case, you have multiple queue items, you need to increase the processing time in the function feeds_cron_queue_info().
This executes as many queue items as possible in 60 seconds, once per cron run:
$queues['feeds_source_import'] = array(
'worker callback' => 'feeds_source_import',
'time' => 60,
);
This executes as many queue items as possible in 5 minutes:
$queues['feeds_source_import'] = array(
'worker callback' => 'feeds_source_import',
'time' => 300,
);
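The effect of the two 'time' values above can be sketched with a small simulation. It assumes cron keeps claiming queue items for as long as the elapsed time is still below the budget, which matches the 25-second example in #15. The function name `count_items_started` and the 20 waiting items are hypothetical.

```php
<?php
/**
 * Count how many queue items get started within a cron time budget.
 *
 * Assumption: a new item is started whenever the elapsed time is still
 * below the budget, so the last item may finish after the budget ends.
 */
function count_items_started($seconds_per_item, $time_budget, $items_waiting) {
  $elapsed = 0;
  $started = 0;
  while ($elapsed < $time_budget && $started < $items_waiting) {
    $elapsed += $seconds_per_item;
    $started++;
  }
  return $started;
}

// 25-second items with 'time' => 60: items start at 0, 25 and 50 seconds.
echo count_items_started(25, 60, 20), "\n";  // 3

// The same items with 'time' => 300: items start at 0, 25, ..., 275 seconds.
echo count_items_started(25, 300, 20), "\n"; // 12
```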
Hacking the Feeds module is probably not the best solution, but the hard-coded value makes it tricky.
After changing this value, Drupal prior to 7.40 will time out after 240 seconds because of a setting in common.inc, so you should probably upgrade core, or you will need to hack that as well...
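A possible alternative to editing the Feeds module, sketched under assumptions: Drupal 7 provides hook_cron_queue_info_alter(), which lets a custom module change queue definitions declared by other modules. The module name "mymodule" is hypothetical, and the queue key is the one shown in the snippets above. Nobody in this thread has confirmed this approach with Feeds.

```php
<?php
/**
 * Implements hook_cron_queue_info_alter().
 *
 * "mymodule" is a hypothetical custom module name. This raises the time
 * budget of the Feeds import queue from 60 to 300 seconds.
 */
function mymodule_cron_queue_info_alter(&$queues) {
  // Only touch the queue if Feeds has actually declared it.
  if (isset($queues['feeds_source_import'])) {
    $queues['feeds_source_import']['time'] = 300;
  }
}
```

The 240-second timeout mentioned above would still apply on Drupal versions prior to 7.40.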
Comment #18
jomarocas commented
OK, I got Feeds working with this configuration. The import uploads the items but does not show progress:
$conf['feeds_debug'] = true;
$conf['feeds_process_limit'] = 100000;
$conf['http_request_timeout'] = 100000;
In php.ini:
max_execution_time = 10000
max_input_vars = 10000
memory_limit = 19200M
post_max_size = 500M
I have a lot of memory.
I have 7.x-2.0-beta4. It does not show progress but uploads the items, sometimes with errors.
After changing to $conf['feeds_process_limit'] = 30; I see the progress.
Comment #19
grahamvalue commented
Looks like the comments on this page have been copied verbatim as a tutorial on Setup large imports with Drupal Feeds.