Instead of using drupal_write_record, we should really use the built-in Drupal APIs, most notably node_save(). Without them, it is not possible to write extensions that map things like Joomla content to Organic Groups.

Comments

malclocke

I've considered doing this. My main concern is that comments in the original 5.x code indicate that this is too slow for large imports, but I think it's probably important to at least have it as an option. The addition of cron processing in the 6.x version means we have the option of importing a configurable number of new items per run.

Patches welcomed.

agentrickard

For cron runs we might use the Job Queue module.

For non-cron runs, are you using the batch API in D6?

I may have time to roll some patches. It might also be interesting to integrate with Migrate module.

jcisio

node_save isn't only slow, it also leaks memory. My site has 20k users, 14k nodes and 15k comments. I wrote a custom script:
- drupal_write_record took less than 5 minutes to convert them all
- node_save took no less than 3 hours, with many timeouts (30 seconds, which I can't increase), and memory usage (measured with the devel performance log) increased from 10M (in the first loops) to 40M (in the final loops). I had to reduce the number of nodes converted in each loop to only about 50 (Batch API).
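
The "about 50 per loop" approach can be sketched in a language-neutral way. This is only an illustration in Python, not Drupal code: `process_pass`, `run_import`, and the `sandbox` dictionary are hypothetical names, loosely modeled on how a batch operation keeps its progress between requests so that no single request does more than a bounded amount of work.

```python
from typing import Callable, Dict, List

BATCH_SIZE = 50  # items per pass, the figure quoted above


def process_pass(items: List[int], sandbox: Dict[str, int],
                 save: Callable[[int], None]) -> float:
    """Run one bounded pass and return the fraction finished (1.0 = done)."""
    # The sandbox survives between passes and remembers where we stopped.
    start = sandbox.setdefault("progress", 0)
    chunk = items[start:start + BATCH_SIZE]
    for item in chunk:
        save(item)  # stand-in for the expensive per-node save
    sandbox["progress"] = start + len(chunk)
    if not items:
        return 1.0
    return sandbox["progress"] / len(items)


def run_import(items: List[int], save: Callable[[int], None]) -> int:
    """Drive passes until finished; returns the number of passes needed."""
    sandbox: Dict[str, int] = {}
    passes = 0
    while True:
        passes += 1
        if process_pass(items, sandbox, save) >= 1.0:
            return passes
```

In a real batch run each pass would be a separate HTTP request, so the 30-second limit applies per pass rather than to the whole import.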

Spec: Xeon Quad Core X3220, 2 GB RAM, SATA2.

agentrickard

Or Batch API so that each node is saved individually.

ParisLiakos

+1 on the batch API: with the current procedure it is very likely to hit the memory limit, and the batch API also provides some feedback instead of just waiting for the script to finish.

ParisLiakos

Title: Use node_save() and the APIs » Use node_save() and the batch API
Version: 6.x-1.0-alpha3 » 7.x-1.x-dev
Category: task » feature

About node_save, though, I don't know... jcisio's benchmark is pretty convincing.

ParisLiakos

Status: Active » Needs work

Well, I ran an import of 6k nodes with both drupal_write_record and node_save...

It takes more or less the same time (about 25-30 minutes) with both, but node_save hits the 30-second timeout on 50%, while drupal_write_record times out on 90%.

So they both time out, and the batch API can fix that. I think I will convert it to the batch API anyway, so it won't matter in the long run, but I still need to figure out what to do with the cron runs, since the batch API is not meant for cron, as far as I know.

agentrickard

D7 is supposed to have a solution for that. See http://api.drupal.org/api/examples/queue_example--queue_example.module/7.

You might also consider changing this to a Feeds plugin, since Feeds manages memory for you.

ParisLiakos

Great! Thank you, I have the solution now.

I have already converted the drupal_write_record calls to the APIs, since there is much more flexibility now, plus the ability to map Joomla data to Drupal fields :)

I will now integrate:

- batch API for visual imports
- queue API for cron imports
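
For the cron side, the general idea of a queue worker with a time budget can be sketched like this. Again, this is a Python illustration and not the Drupal queue API: `run_cron_worker`, the `deque`-based queue, and the 15-second default are assumptions chosen for the example.

```python
import time
from collections import deque
from typing import Callable, Deque


def run_cron_worker(queue: Deque[int], worker: Callable[[int], None],
                    time_limit: float = 15.0,
                    clock: Callable[[], float] = time.monotonic) -> int:
    """Process queued items until the queue is empty or the budget is spent.

    Returns how many items were processed in this run. Anything left in
    the queue is simply picked up by the next cron run.
    """
    deadline = clock() + time_limit
    done = 0
    while queue and clock() < deadline:
        item = queue.popleft()
        worker(item)  # stand-in for importing one item
        done += 1
    return done
```

The contrast with the batch approach is that nobody watches a progress bar here: each cron run takes a slice of the queue and stops when its time is up.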

ParisLiakos

Version: 7.x-1.x-dev » 7.x-2.x-dev
Status: Needs work » Fixed

The batch API is ready in 7.x-2.x.
About the queue API I am not that sure... I am thinking of not having cron imports at all; I see no point in them, and I doubt anyone used cron.
The ability to update content will be there, but not for cron runs.

That really is another issue, though.

I will mark this as fixed now.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.