Posted by agentrickard on October 9, 2009 at 4:27pm
| Project: | Joomla to Drupal |
| Version: | 7.x-2.x-dev |
| Component: | Code |
| Category: | feature request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | closed (fixed) |
Issue Summary
Instead of using drupal_write_record, we should really use the built-in Drupal APIs, most notably node_save(). Without them, it is not possible to write extensions that map things like Joomla content to Organic Groups.
Comments
#1
I've considered doing this. My main concern is that comments in the original 5.x code indicate that this is too slow for large imports, but I think it's probably important to at least have it as an option. The addition of cron processing in the 6.x version means we have the option of importing a configurable number of new items per run.
Patches welcomed.
#2
For cron runs, we might use the Job Queue module.
For non-cron runs, are you using the batch API in D6?
I may have time to roll some patches. It might also be interesting to integrate with Migrate module.
#3
node_save() is not only slow, it also leaks memory. My site has 20k users, 14k nodes and 15k comments. I wrote a custom script:
- drupal_write_record took less than 5 minutes to convert them all
- node_save took no less than 3 hours, with many timeouts (30 seconds, which I can't increase), and memory usage (measured by the devel performance log) increased from 10M in the first loops to 40M in the last ones. I had to limit the number of nodes converted in each loop to about 50 (Batch API).
Spec: Xeon Quad Core X3220, 2 GB RAM, SATA2.
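The chunked approach described above (about 50 nodes per pass) can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the module: it runs without Drupal, `example_save_item()` is a hypothetical stand-in for `node_save()`, and `example_run_import()` is a hypothetical driver that plays the role of the Batch API's repeated requests. The `$context['sandbox']` and `$context['finished']` keys follow the convention Batch API operation callbacks use.

```php
<?php
// Hypothetical stand-in for node_save(): a real import would build a node
// object from $row and save it. Here we only record the row id.
function example_save_item(array $row, array &$saved) {
  $saved[] = $row['id'];
}

// Operation callback in the style of the Batch API: each call handles at
// most $limit rows and stores its progress in $context['sandbox'].
// $limit is assumed to be a positive integer.
function example_import_operation(array $rows, $limit, array &$saved, array &$context) {
  if (!isset($context['sandbox']['progress'])) {
    $context['sandbox']['progress'] = 0;
    $context['sandbox']['max'] = count($rows);
  }
  $chunk = array_slice($rows, $context['sandbox']['progress'], $limit);
  foreach ($chunk as $row) {
    example_save_item($row, $saved);
    $context['sandbox']['progress']++;
  }
  // Report the completed fraction; a value of 1 means the work is done.
  $context['finished'] = $context['sandbox']['max'] > 0
    ? $context['sandbox']['progress'] / $context['sandbox']['max']
    : 1;
}

// Hypothetical driver: calls the operation until it reports completion,
// the way the Batch API would across several page requests.
function example_run_import(array $rows, $limit) {
  $saved = array();
  $context = array('sandbox' => array(), 'finished' => 0);
  $passes = 0;
  do {
    example_import_operation($rows, $limit, $saved, $context);
    $passes++;
  } while ($context['finished'] < 1);
  return array('saved' => $saved, 'passes' => $passes);
}
```

Because each pass only touches a small slice, a single request stays short; in a real batch each pass would also be a separate request, which is what keeps it under a fixed 30-second limit.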
#4
Or Batch API so that each node is saved individually.
#5
+1 on Batch API: the current procedure is very likely to hit the memory limit, and Batch API also provides some feedback instead of just waiting for the script to finish.
#6
About node_save(), though, I don't know... jcisio's benchmark is pretty convincing.
#7
Well, I ran an import of 6k nodes with both drupal_write_record and node_save...
It takes more or less the same time (about 25-30 minutes) with both, but node_save hits the 30-second timeout on 50%, and drupal_write_record on 90%.
So they both time out, and Batch API can fix that. I think I will convert it to Batch API anyway, so it won't matter in the long run, but I still need to figure out what to do with the cron runs, since Batch API is not meant for cron as far as I know.
#8
D7 is supposed to have a solution for that. See http://api.drupal.org/api/examples/queue_example--queue_example.module/7.
You might also consider changing this to a Feeds plugin, since feeds manages memory for you.
#9
Great! Thank you, I have the solution now.
I have already converted the drupal_write_record calls to the APIs, since that gives much more flexibility and the ability to map Joomla data to Drupal fields :)
I will now integrate:
- Batch API for visual imports
- Queue API for cron imports
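For the cron side, the idea of draining a queue a little at a time could look something like the sketch below. It is an assumption-laden illustration rather than the module's code: it uses PHP's `SplQueue` so it runs without Drupal, and `example_enqueue()` and `example_cron_run()` are hypothetical names. In Drupal 7 the queue itself would come from `DrupalQueue::get()` and the worker would be registered through `hook_cron_queue_info()`, which budgets by time; the per-run item cap here follows the "configurable amount of new items per run" mentioned in #1.

```php
<?php
// Add every source row to the queue. In Drupal 7 this would be a
// createItem() call on the queue returned by DrupalQueue::get().
function example_enqueue(SplQueue $queue, array $rows) {
  foreach ($rows as $row) {
    $queue->enqueue($row);
  }
}

// One simulated cron run: take at most $max_items rows off the queue and
// hand each one to $worker. Returns how many rows were processed.
function example_cron_run(SplQueue $queue, $max_items, callable $worker) {
  $processed = 0;
  while ($processed < $max_items && !$queue->isEmpty()) {
    $worker($queue->dequeue());
    $processed++;
  }
  return $processed;
}
```

Rows that are not reached in one run simply stay queued for the next one, so no run has to finish the whole import.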
#10
Batch API is ready in 7.x-2.x.
About the Queue API I am not so sure. I am thinking of not having cron imports at all; I see no point in them, and I doubt anyone used cron.
The ability to update content will be there, but not for cron runs.
That really is another issue, though.
I will mark this as fixed now.
#11
Automatically closed -- issue fixed for 2 weeks with no activity.