Node Import Memory Issues - Suggestions and Best Practices

yountod - October 27, 2009 - 23:55
Project: Node import
Version: 6.x-1.0-rc4
Component: Miscellaneous
Category: support request
Priority: normal
Assigned: Unassigned
Status: active
Description

I was wrestling with giant import files (millions of records and such) and couldn't get the load to complete. I reduced the import file to 1 million records with only one field (title) per record just to see if it would complete. It still bombed after a few thousand rows.

I was just about to give up on this method of force-feeding Drupal my data when I realized that the imports were still queued up and marked as "in progress" - I got a superchub when I clicked on them and they resumed! Very sweet.

Now, with renewed vigor, I am seeking "best practices" on how to keep that barber pole spinning as long as possible without the import bombing out. Measures so far:

- I'm the only session on the box
- I raised memory_limit in my php.ini to about 256M
- I set ApacheSolr and core Search to their lowest settings and excluded the node type I'm importing from indexing
- I whisper sweet things to the server and stroke it while it runs my batch job

Is there more I could be doing? Is it possible to leave a system unattended and have it complete hundreds of thousands of records without bombing? That would be awesome.

#1

zeezhao - October 28, 2009 - 09:01

Hi. I have imported a few million records using version 5 but not version 6 yet... But I assume some of the issues I found then may be of help.

See here for old notes: http://drupal.org/node/309563

More memory always helps; I used 512MB or so. But I was never able to get it to run faster than 30,000 nodes per hour, so 1 million nodes = about 33 hours at best. On average it was 20,000/hour, which works out to about 50 hours per million.

My initial tests with version 6 gave me 16,000/hour on a laptop with 1GB RAM but 512MB or less allocated to PHP.
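
For anyone who wants to plug in their own numbers, here is a rough sketch of the arithmetic behind those estimates. The function name and the labels are made up for illustration; only the three rates come from the figures above. It assumes the rate stays constant for the whole run, which a real import probably won't manage.

```python
def import_hours(total_nodes: int, nodes_per_hour: float) -> float:
    """Estimate wall-clock hours for an import at a constant rate."""
    if nodes_per_hour <= 0:
        raise ValueError("nodes_per_hour must be positive")
    return total_nodes / nodes_per_hour


# Rates quoted in this thread (nodes per hour); labels are illustrative only.
RATES = {
    "version 5, best case": 30_000,
    "version 5, average": 20_000,
    "version 6, laptop test": 16_000,
}

if __name__ == "__main__":
    total = 1_000_000  # one million records, as in the original question
    for label, rate in RATES.items():
        print(f"{label}: {import_hours(total, rate):.1f} hours")
```

So a million records is a one-and-a-half to two-and-a-half day job at these rates, which is why being able to resume an interrupted import matters so much.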

Also, you will need to allocate more memory to MySQL via your my.cnf file.

 
 
