automatic import

jeppius - October 17, 2007 - 18:18
Project:Node import
Version:HEAD
Component:Code
Category:support request
Priority:normal
Assigned:jeppius
Status:active
Description

Hi,
I just installed the node_import module and now it's time to test it. Apparently there is a wizard to help with the import procedure, which is fine.
Eventually, though, I need to be able to import the nodes in an unsupervised way. Would that be possible?

With my scenario, would there be an even more elegant way to use node_import? Here it is:

*cron calls php script
*php script queries an external database and extracts some data
*data is collected in a csv file
*node_import gets the file and adds nodes for each row in it

I wish I could skip all the php->csv->node_import steps and write directly into the Drupal database...
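
For illustration, the "query an external database, collect the data in a CSV file" part of the steps above could be sketched roughly as follows. This is only a sketch in Python (the real script would be PHP called from cron); the in-memory SQLite database, the `articles` table and the `export_rows_to_csv` helper are all assumptions made up for the example, not part of node_import.

```python
import csv
import io
import sqlite3


def export_rows_to_csv(conn, query, out):
    """Run `query` on `conn` and write the result, with a header row, to `out`."""
    cur = conn.execute(query)
    writer = csv.writer(out)
    # Use the column names of the result set as the CSV header row.
    writer.writerow([col[0] for col in cur.description])
    count = 0
    for row in cur:
        writer.writerow(row)
        count += 1
    return count


# Stand-in for the external database (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [("First", "Hello"), ("Second", "World")],
)

# In a cron job `out` would be a real file; a string buffer keeps the sketch self-contained.
buf = io.StringIO()
exported = export_rows_to_csv(
    conn, "SELECT title, body FROM articles ORDER BY title", buf
)
csv_text = buf.getvalue()
```

The header row matters because the import step maps CSV columns to node fields by name, so the query's column names should match what the import expects.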

#1

alakon - October 29, 2007 - 21:34

I would also be interested.

#2

millenniumtree - July 23, 2008 - 15:58

I second this. Automated import would make my life so much easier.

#3

jako0013 - July 24, 2008 - 20:55

Same here. I'm working on trying to accomplish this very same task.

#4

drewschmaltz - July 27, 2008 - 05:53

I've sponsored development of this, and the module is coming along great. Email me at drew@imoria.com to inquire. If you are interested in helping me sponsor this development (the module is already usable, but will be getting some spiffy upgrades), I will be happy to give you a patch that makes node_import work seamlessly with Drupal 5 and cron, as soon as today. I will also update you with patches as soon as they come in.

#5

goodeit - July 28, 2008 - 16:34

Awesome news, Drew! Will this support automatically updating nodes as well (or could it tie in with the work here)?

#6

drewschmaltz - August 6, 2008 - 23:45

Yep, auto update based on customizable criteria (i.e. if (this = that) {update or do not update node}). I haven't taken a look at the work there yet; I'll get back to you after I do.

To be specific, this patch can/will be able to:

  1. Trigger via cron
  2. Set Max Execution Time
  3. Set Number of Nodes per cron
  4. Cancel jobs
  5. Ability to upload straight from a file in a folder already on the server (default "files") or upload a new file
  6. Ability to skip "preview" step
  7. Import Based on Criteria (if(this=that){import}), where "this" is a column header in the CSV file and "that" is a text field. For my project the criterion will be whether the updated date is earlier (so it will have =, !=, >=, and <=)
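
As a rough illustration of item 7, the "if(this=that){import}" check could look something like the sketch below. It is written in Python purely as an example and is not the patch's actual code; the `should_import` helper, the `updated` column and the sample rows are assumptions.

```python
import operator

# The four comparisons named in item 7; "=" is treated as equality.
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
}


def should_import(row, column, op, value):
    """Return True if the row's value in `column` satisfies `<op> value`."""
    return OPERATORS[op](row[column], value)


# Hypothetical CSV rows, keyed by column header.
rows = [
    {"title": "Old", "updated": "2008-07-01"},
    {"title": "New", "updated": "2008-08-15"},
]

# ISO dates (YYYY-MM-DD) sort correctly as plain strings, so "<=" selects
# rows updated on or before the cutoff.
selected = [
    r["title"] for r in rows if should_import(r, "updated", "<=", "2008-08-01")
]
```

Comparing raw strings only works here because of the date format; other date formats or numeric columns would need to be parsed before comparing.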

Like I said in my previous post, about 3/4 of this is finished and ready to go. Anyone who can help me fund the last 1/4 of it can have the patch as it is now (perfect for everything I listed except #7). The faster I get the developer the money, the faster we can have this reviewed by the maintainer and get a patch or updated module up there. Anything is appreciated; let me know if you're willing to help.
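
Items 1 to 3 of the list (cron trigger, max execution time, number of nodes per cron) amount to a bounded batch loop that each cron run resumes. A minimal sketch of that idea, in Python for illustration only: `run_batch`, its parameters and the sample queue are hypothetical names, not the patch's code.

```python
import time
from collections import deque


def run_batch(pending, import_row, max_nodes, max_seconds, clock=time.monotonic):
    """Import rows from the front of `pending` until a limit is hit.

    Processed rows are removed from `pending`, so the next cron run
    resumes where this one stopped. Returns the number of rows imported.
    """
    deadline = clock() + max_seconds
    done = 0
    while pending and done < max_nodes and clock() < deadline:
        import_row(pending.popleft())
        done += 1
    return done


# Hypothetical queue of CSV rows waiting to be imported.
queue = deque({"title": f"Node {i}"} for i in range(5))
imported = []

# Two simulated cron runs: the first is capped at 2 nodes, the second finishes the rest.
first = run_batch(queue, imported.append, max_nodes=2, max_seconds=30)
second = run_batch(queue, imported.append, max_nodes=10, max_seconds=30)
```

In a real setup the queue would have to be persisted between cron runs (for example as a row offset into the CSV file), since nothing in memory survives from one run to the next.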

#7

zeezhao - August 8, 2008 - 15:59

Sounds great! Do you know roughly how fast the automatic import performs, e.g. how many nodes per minute on a machine of a given spec?

The reason I ask is that I want to know whether some performance improvements were also implemented. Please see the comments on:
http://drupal.org/node/270584

Thanks

#8

drewschmaltz - August 21, 2008 - 05:40

120 nodes per minute is about average. There were some tweaks made, and CCK is to blame for some of the lag, but when you get up to 75,000+ nodes it slows down to about 10 per minute.

EDIT: At first it flies, maybe 700-800 per minute.

#9

scottrigby - August 22, 2008 - 02:13

Sounds useful! Are there currently any plans for a 6.x port? :) Scott

#10

zeezhao - September 10, 2008 - 17:17

OK, thanks. I can help out with testing, if necessary. What's the current status of the patch for:

"5. Ability to upload straight from file in a folder already on server (default "files") ..."

#11

asak - September 12, 2008 - 16:57

Subscribing (and very excited about this!)

#12

ianchan - September 30, 2008 - 00:50

Subscribing

#13

akaserer - September 30, 2008 - 14:10

Subscribing

#14

duncano74 - October 1, 2008 - 17:48

sub

#15

asak - October 1, 2008 - 18:50

Now that everyone's here...

What/How much would it take to get this completed?
