Biblio should do incremental imports. [#118000]

I did the initial import of the database from EndNote, about 2500 entries, and it loaded the entire file in one go.

With the features of Drupal 4.7 you should be able to do incremental imports, so that memory requirements are not that large and there is no chance of a timeout.

Comments

Comment #1

rjerome commented 11 February 2007 at 13:47

Good idea, but I'm not sure how I would handle it. What format are you using for import?

I'll have to check and see what's the biggest memory/time waster, the parsing or the DB inserts (I suspect it's the inserts, since they are done one at a time through the standard "node_save" mechanism).

Comment #2

gordon commented 11 February 2007 at 21:54

I am actually using the XML format to do imports.

I actually started developing a biblio module to do this early last year, but I never got as advanced as your module. I did, however, have incremental imports running under 4.7. It was most likely slower, as it would open the XML file each time, but it used a lot less resources.

Comment #3

rjerome commented 12 February 2007 at 01:04

So you are saying that it parsed the XML file for each entry (i.e. 2500 times)? Any chance I could see the code? Maybe I could use some of it.

By the way, how long did it take to import 2500 records? I presume you had to modify the script run time limits to get it to finish.

Comment #4

gordon commented 14 February 2007 at 03:40

No, not on each entry, but on each request. And each request would process between 50 and 100 entries.

So it would take longer than processing them all at once, but it is nicer to the web server.
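For readers following the thread, the per-request chunking described here can be sketched roughly as below. This is not gordon's code or anything from the Biblio module; it is a minimal Python illustration, and the function name `import_chunk`, the `record`/`title` element names and the `save_record` callback are all assumptions made for the example. Each call re-reads the file from the start, skips the records handled by earlier requests, and stops after one chunk.

```python
import io
import xml.etree.ElementTree as ET


def import_chunk(xml_source, offset, limit, save_record):
    """Import at most `limit` <record> elements, starting at index `offset`.

    Returns the offset to use for the next request, or None once the
    file is exhausted. Element names are hypothetical, for illustration.
    """
    seen = 0        # records encountered so far in this pass
    processed = 0   # records actually saved in this pass
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag != "record":
            continue
        if seen >= offset:
            if processed == limit:
                # Chunk is full and at least one more record exists.
                return offset + processed
            save_record(elem.findtext("title"))
            processed += 1
        seen += 1
        elem.clear()  # drop the record's children to keep memory flat
    return None


# Usage: five sample records imported two per "request".
SAMPLE = (
    "<records>"
    + "".join(f"<record><title>Paper {i}</title></record>" for i in range(5))
    + "</records>"
).encode()

saved = []
requests = 0
offset = 0
while offset is not None:
    # The source is reopened on every request, as described above.
    offset = import_chunk(io.BytesIO(SAMPLE), offset, 2, saved.append)
    requests += 1
```

The trade-off matches the description above: the file is scanned once per request, so total work goes up, but no single request has to hold or save more than one chunk.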

Comment #5

rjerome commented 14 February 2007 at 13:29

I made some changes since you brought this up. Previously, large chunks of data (the complete contents of the file and the node array) were being passed by value; they are now being passed by reference, so that should help memory-wise. There is still the issue of inserts: they are done one at a time (because it was easier), but could be done en masse by bypassing the node_insert function.
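To illustrate the difference between one-at-a-time and en masse inserts in general terms, here is a small Python sketch using the standard `sqlite3` module. It is not Drupal or Biblio code: the `biblio_demo` table, its columns and both function names are invented for the example, and it says nothing about how node_insert itself works.

```python
import sqlite3

INSERT_SQL = "INSERT INTO biblio_demo (title, year) VALUES (?, ?)"


def insert_one_at_a_time(conn, rows):
    """One statement and one commit per row (the slow, simple path)."""
    for row in rows:
        conn.execute(INSERT_SQL, row)
        conn.commit()


def insert_in_bulk(conn, rows):
    """All rows in a single transaction via executemany."""
    with conn:  # commits on success, rolls back on an exception
        conn.executemany(INSERT_SQL, rows)


def make_db():
    """Create an in-memory database with the hypothetical demo table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE biblio_demo (title TEXT, year INTEGER)")
    return conn


rows = [(f"Paper {i}", 2000 + i % 7) for i in range(2500)]

slow_db = make_db()
insert_one_at_a_time(slow_db, rows)

fast_db = make_db()
insert_in_bulk(fast_db, rows)

slow_count = slow_db.execute("SELECT COUNT(*) FROM biblio_demo").fetchone()[0]
fast_count = fast_db.execute("SELECT COUNT(*) FROM biblio_demo").fetchone()[0]
```

Both paths end with the same rows in the table; the bulk version simply avoids a commit per record. In a Drupal context, going around the normal node save path would presumably also skip whatever per-node processing that path performs, so the speed-up is not free.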

Comment #6

ntripcevich commented 9 July 2008 at 01:35

Version: 4.7.x-2.4 » 5.x-1.9

Following up on this.
I just imported 600+ records using the latest release of Biblio for D5.7, and I was also unable to upload EndNote 8+ XML files with more than 100 records. Breaking the library into chunks of 100 (as Groups in EndNote) and exporting each to XML went OK, but the problem described above seems to persist.
Thanks