Update existing nodes on import
| Project: | Node import |
| Version: | 6.x-1.x-dev |
| Component: | Code |
| Category: | feature request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | needs work |
For Node Import to work for me, I need it to update existing nodes, so I have
been hacking my way through the code for the past few days. What I came up with
was a one-function module called member_import that hooks into the
form_X_node_form_alter path. In there, I look at the current record in $data
and do a lookup on my CCK content type; if the record already exists, I set
the nid. From there, node_save() should detect the nid and perform an update.
It all seemed very easy, and from inspecting the data it appeared
that it would work. But in node_import.inc at line 1869, the line
of code
    if ((!empty($form_state['submitted'])) && !form_get_errors() && empty($form_state['rebuild'])) {
fails because form_get_errors() reports that the form has been altered by another user, so the
update is not performed.
Below is my code. Am I on the right track? Is this approach in good Drupal form?
Any pointers into the Form API on how to override the form-alter detection would be welcome.
I picked Node Import as my first module to cut my teeth on for writing Drupal modules...
nothing like jumping into the deep end first.
Thanks
John G
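function member_import_form_member_node_form_alter(&$data)
{
  // Form alter hooks receive the form by reference; $data is the node form.
  $memberid = $data['#post']['cck:field_memberid:value'][0];
  if (is_numeric($memberid))
  {
    $memberid = intval($memberid);
    // Look for an existing member node with the same member ID.
    $nid = db_result(db_query(
      "SELECT nid FROM {content_type_member} WHERE field_memberid_value = %d LIMIT 1", $memberid));
    if (!empty($nid))
    {
      // With nid set, node_save() should update instead of insert.
      $data['nid']['#value'] = intval($nid);
    }
  }
}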

#1
I got it working under Drupal 6; it should work similarly under Drupal 5.
The primary problem was that $data['changed'] was not being updated.
Setting only nid trips this check in node_validate():
    if (isset($node->nid) && (node_last_changed($node->nid) > $node->changed))
Then on Drupal 6 you need to set vid also. On Drupal 5 it appears that it
will auto-increment vid if it is missing on update; see
http://api.drupal.org/api/function/node_save/5
-John G
function member_import_form_member_node_form_alter(&$data)
{
  // Form alter hooks receive the form by reference; $data is the node form.
  $memberid = $data['#post']['cck:field_memberid:value'][0];
  if (is_numeric($memberid))
  {
    $memberid = intval($memberid);
    // Fetch both nid and vid of the existing member node, if any.
    $row = db_fetch_array(db_query("SELECT nid, vid FROM {content_type_member} WHERE field_memberid_value = %d LIMIT 1", $memberid));
    if (!empty($row))
    {
      $data['nid']['#value'] = intval($row['nid']);
      $data['vid']['#value'] = intval($row['vid']);
      // A fresh 'changed' value gets past the check in node_validate().
      $data['changed']['#value'] = time();
    }
  }
}
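To see why setting 'changed' matters, here is a minimal sketch of the comparison quoted from node_validate() above, written as plain PHP so it runs outside Drupal. The helper name and the timestamps are made up for illustration; this is not Drupal code.

```php
<?php
// Hypothetical helper mirroring the comparison quoted above; it is not a
// Drupal function. Both arguments are Unix timestamps.
function example_node_looks_stale($last_changed, $submitted_changed) {
  // The form is rejected when the stored node is newer than the
  // 'changed' value submitted with the form.
  return $last_changed > $submitted_changed;
}

$stored  = 1240000000;   // made-up time the existing node was last saved
$missing = 0;            // 'changed' left unset on the imported form
$fresh   = $stored + 60; // stand-in for time() at import

var_dump(example_node_looks_stale($stored, $missing)); // bool(true): rejected
var_dump(example_node_looks_stale($stored, $fresh));   // bool(false): accepted
```

With only nid set, the submitted 'changed' value is effectively older than the stored node, so the check fires. Setting it to time() makes the comparison come out the other way.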
#2
This is very interesting. Could this be adapted to ignore empty fields, so that we could import just one new column of data into our nodes (say we wanted to add a new field to existing nodes, and had it in a spreadsheet)?
Thanks!
#3
If I read this right, when a column in the import is blank you want to remove it from the
import so that the original value stays in place. I don't know if you can delete
a value from the form, but you can expand your query to return all the original columns;
then, if a column from the import is empty, you replace it with the queried value.
I also use it to come up with aggregate columns. In my case this is for a club I belong
to, where the person in charge of the member list manages it in Lotus Organizer.
I take a CSV export and import it into my Member content type, but there are columns
that the other person does not want to manage, like Title. So in my module I query
the first and last name values and build my own title.
-John G
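The fallback described in #3 could look roughly like the sketch below. It is plain PHP with no Drupal calls, the helper name and the column names are hypothetical, and it assumes the existing values have already been queried into an array keyed by column.

```php
<?php
// Hypothetical helper, not part of Node Import: fill blank imported values
// from the values already stored for the node.
function example_keep_existing_values($imported, $existing) {
  foreach ($existing as $column => $old_value) {
    $is_blank = !isset($imported[$column])
      || trim((string) $imported[$column]) === '';
    if ($is_blank) {
      // Nothing was imported for this column, so keep the stored value.
      $imported[$column] = $old_value;
    }
  }
  return $imported;
}

// Made-up column names and values.
$existing = array('first_name' => 'Jane', 'last_name' => 'Doe', 'phone' => '555-0100');
$imported = array('first_name' => 'Jane', 'last_name' => '', 'phone' => '555-0199');

$merged = example_keep_existing_values($imported, $existing);

// An aggregate column, built the same way as the title mentioned in #3.
$title = $merged['first_name'] . ' ' . $merged['last_name'];
echo $title, "\n";           // Jane Doe
echo $merged['phone'], "\n"; // 555-0199
```

A value of "0" is deliberately not treated as blank, since it can be legitimate data.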
#4
Yes, that's exactly what I'm saying. If I don't select a column during the "mapping" stage, it should keep the existing value for that field.
There could be a node ID column that, if a column is mapped to it, would automatically perform the query and update the existing nodes instead of creating new ones. Even better would be to allow one field (other than nid) to be set by the user as the "primary key" for checking if a node already exists and updating in that case.
#5
Initially I had a plan to add check boxes to the "Options" pane, where you could select what columns were unique key columns...
but my understanding on how the Forms API in Drupal works is not sufficient since I made a mess of it... I need to get back into it and learn more about how the Forms API is being used for the "options" pane..
#6
This is a great hook, thanks a lot.
#7
Hi John, shouldn't there be a patch for review and future commitment?
Subscribing
#8
Hey,
This looks like a very valuable feature. Could someone please tell me how I would go about implementing this, i.e. where would I place this code, and is there anything else I need to do to get it functioning? I haven't been able to find a way so far.
Any help would be appreciated
Thanks
Alex
#9
I have posted a few times about this specific item and in other related posts, but Robrecht
has been mute on the subject. All I was looking for was whether this approach was clean Drupal
style or whether I was trying to slam a square peg into a round hole. So far the module has been
working for me. I say module because I found this hook while single-stepping through
the node import process. This hook is part of Drupal 6, and allows you to modify the
update behavior without modifying the Node Import code.
There is no code to review per se; it's more of a template to follow for your specific
content type / input file. It's actually quite powerful: as you can see in the code I am
attaching here, I am also using it to derive a title by concatenating multiple
fields in my input data.
Alex, to use this code, unzip the attached module and review the code. The code will
ultimately go in your sites/all/modules directory. I named the module member_import
because my CCK content type is called member.
This code takes advantage of a native Drupal callback hook in common.inc called
drupal_alter($type, &$data). See http://api.drupal.org/api/function/drupal_alter/6
Until now, I thought the hook was implemented in Node Import,
but now I see it's a native function in Drupal 6 & 7.
For me, the $type parameter is "form_member_node_form", which translates to the
callback function member_import_form_member_node_form_alter($data) in my module.
The name of your function will depend on the name of your module and content
type. Think of it as {MODULENAME}_form_{CCKTYPE}_node_form_alter($data)
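As an illustration of that naming rule (not code from the attached module), the small sketch below assembles the function name from a module name and a content type. The helper itself is hypothetical; Drupal does not provide it.

```php
<?php
// Hypothetical helper for illustration only; Drupal does not provide it.
function example_node_form_alter_hook_name($module, $content_type) {
  // A node form's ID is '<content type>_node_form', and the alter type
  // passed to drupal_alter() is 'form_' followed by that form ID.
  $form_id = $content_type . '_node_form';
  return $module . '_form_' . $form_id . '_alter';
}

echo example_node_form_alter_hook_name('member_import', 'member'), "\n";
// prints member_import_form_member_node_form_alter
```

So a module called product_sync (a made-up name) altering the form of a content type called product would implement product_sync_form_product_node_form_alter().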
$data includes all the data that is part of the form processing and is too large to
decode in this post. If you have a breakpoint debugger like XDebug or PHPEd's debug.so,
the best thing to do is implement a module similar to mine that does not try to alter any
data; just use it to set a breakpoint so you can inspect the variable.
If you don't have a debugger, you can add the following two lines as the first lines in your
node_form_alter function, but beware: it prints a mess.
function member_import_form_member_node_form_alter(&$data)
{
  // Dump the whole form array and stop, for inspection only.
  print_r($data);
  die();
  // ... rest of the code.
}
IMHO this could easily be handled by defining the primary key components as part of
Step 5, "Defining the Node Import Options", as the Options array is available as part
of the $data array. A generic function could be written to inspect the Options array
to determine what the primary key components are, then use that information to
discover the node ID.
I ventured down this path and managed to totally mess up that form. I need to
spend more time with the Drupal Forms API.
Due to the lack of feedback I was getting on this effort, and because it worked for my
immediate need, it fell from the top of my list.
If the project implementers want to go down this path, I am willing to put the time
into porting this separate project to be part of the Node Import package.
John Gentilin
#10
Thanks for the help John
I haven't got around to modifying the module for my needs yet, but it looks like a great feature and one that I hope will be included in the r5 release.
Alex
#11
The ability to update nodes is a very welcome feature and should be added.
Thanks for your hard work!
#12
Thank you so much!
#13
This is a different approach than #9, but should allow users to skip the custom code. I'm not sure there is anything wrong with the hook_form_alter approach, so long as users know how to write the db query. It could also break if the db location of the cck field changes.
This patch (against 6.x-1.x-dev) should:
Still lacking:
#14
subscribing
#15
Hi Tauno. I tried your patch with ubercart products.
It worked but I noticed that it only updated one of the columns e.g. I changed price & stock quantity, but the stock quantity update which is the last column is the only one that took effect.
Please confirm if this is a bug or how current version works. Thanks.
#16
subscribing
+++++ Allow non-cck columns as keys (title and nid could be useful)
That would be great!
#17
@zeezhao - it should update all the fields, but I haven't tested it much. I definitely haven't tested it against Ubercart product nodes. I do have an Ubercart site that I can try to test it against, but it will be a few weeks before I can get to it. Does a normal Ubercart import work fine with all the fields?
#18
I've tested it with Ubercart nodes and it works like a charm.
#19
@tauno - thanks for your reply. Yes normal import works fine for all fields. I am also using a patch for stock levels - see:
http://www.ubercart.org/contrib/11013
So maybe this patch is causing a conflict, or it may be an issue related to the version of node_import 6.x-1.x-dev I am using. I am using an old one I had from Mar 2009, as the latest (from 2009-Apr-22) did not seem to work.
Also, looking forward to the non-cck fields selection once it is available. Thanks
#20
Hi there,
I would like to make the nid as the key value if using this patch,
Been trying to set the default value for a cck field to be the same as the nid to quickly achieve the desired results,
But having no luck,
Can anyone help with this?
Thanks in advance..
#21
hi!
applied #13 by tauno for ver. 6.x-1.x-dev.
works nice. waiting for other fields to use as keys and inclusion into stable version of module.
thank you!
#22
First of all, thanks to Robrecht Jacques for this great tool.
Like psychoman, I applied the patch in #13, written by tauno. Thanks, tauno.
WORKS WELL !!! on a test site, with basic CCK fields. It needed a little tweak in the theme (Garland); not important at all. I'll run some tests with the real stuff and report any issues.
I would like to help Robrecht with the code, but my PHP skills are "near none" :). Sorry.
@robrecht - how could I help with documentation? (Translation into Portuguese or Spanish?)
Cheers to all
#23
subscribe, looks very useful :)
#24
An update to the module that I definitely am interested in. Subscribing.
#25
+1 for this feature. Subscribing...
#26
Subscribing - this would be a great core feature
#27
I needed to be able to import from CSV and update existing nodes identified by node id, not based on a key CCK field, so I adapted the patch from #13 to do that. This patch doesn't allow for the possibility of using a key field, but perhaps the final version should incorporate both features. As it happens the content type I'm working with didn't have any applicable CCK fields so I removed that functionality because I couldn't test it.
To use this, upload a CSV file with a Node ID field. If the field is blank, a new node will be created.
I see there's a lot of interest in this feature; if everyone who has subscribed can help test this patch, I'm sure it will be ready for commit soon.
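The rule in #27 (a blank Node ID means create, anything else means update) can be sketched as below. This is not code from the patch: the helper name and the 'nid' array key are assumptions, and each CSV row is assumed to be already parsed into an associative array.

```php
<?php
// Hypothetical helper, not taken from the patch in #27. It assumes each CSV
// row has already been parsed into an array with an optional 'nid' key.
function example_import_action($row) {
  $nid = isset($row['nid']) ? trim((string) $row['nid']) : '';
  // Only a positive whole number counts as a usable node ID.
  if (preg_match('/^[0-9]+\z/', $nid) && (int) $nid > 0) {
    return array('action' => 'update', 'nid' => (int) $nid);
  }
  // Blank or unusable Node ID: treat the row as a new node.
  return array('action' => 'create', 'nid' => NULL);
}

$rows = array(
  array('nid' => '42', 'title' => 'Existing product'),
  array('nid' => '',   'title' => 'New product'),
);
foreach ($rows as $row) {
  $plan = example_import_action($row);
  echo $row['title'], ': ', $plan['action'], "\n";
}
// Existing product: update
// New product: create
```

Treating a non-numeric Node ID as "create" is one possible choice. Rejecting the row with an error would be safer if accidental duplicates are a concern.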
#28
Hi mvc,
I used both patches, 13 and your addition. Almost there...
My CSV also has an image name for the CCK module. In this case, I got the error message "The path is already in use." This is because the image was already found and imported in a previous run. That is correct, and if the name did not change there is no problem with that part.
Can you point me in the right direction or update it in your patch?
Furthermore, I also need to export the list in order to update the NID. Which module is best to use? The products on my website are frequently updated from a CSV file which comes from several suppliers.
Thanks in advance!!
#29
@hessie: just to be clear, which CCK image module are you using? I personally only tested this with emimage (I submitted a patch for that elsewhere).
To export your nodes, try the CSV format feed display provided by the views bonus module.
Also, I never tested this with both my patch and #13 applied, so if you do that you're on your own, I'm afraid.
#30
@mvc, thanks for your quick reply.
Changing the original node_import with only your patch gave me the same result. CCK complains that the path already exists.
How can I handle that exception and ignore it, since it is OK?
Did you mean by emimage, Embedded Media Fields?
The details you asked:
drupal core: 6.13
CCK version: 6.x-3.1
node_import: 6.x-1.x-dev (latest)
ubercart: 6.x-2.0-rc6
My 'problem' is this:
I have a CSV file with products, which is attached here. In it I have the SKU and an image name. With all other fields (except for the attribute color) I have no problems.
The image is uploaded to a temp folder for Drupal. With the import I link the image to the product. When an image name is already linked, it should not give an error.
The list is frequently updated by the supplier, so it contains new and current products. The current ones should update the existing records in Drupal; the new ones should be added.
I hope you have a solution. I have not yet mastered the Drupal essentials, but I am getting there.
Does anyone else have a similar 'problem'?
Hope you can help me. Thanks in advance.
#31
@hessie: Yes, when I said emimage, I meant Embedded Image Field, part of the Embedded Media Field project.
I can tell from your export that you're using a different module to handle images, but you didn't answer my question: which image module are you using?
I suspect that whatever module you're using is not supported by Node Import. This makes sense, because the image file can't be included inside the CSV file. So, the only time someone would want to support the image handling module you're using would be when updating existing nodes, which isn't possible without this patch anyways.
I suggest you either 1) switch to either storing images outside the nodes you wish to import, using Node Reference or Embedded Image Field, or 2) you write (or pay someone to write) a Node Import extension which handles whatever image handling module you're using. Good luck.
(PS: Changing issue title for clarity.)
#32
@mvc: I am learning the vocabulary. Thanks.
I use the GD2 image toolkit to handle the images, all within Drupal. I hope this answers your question.
The import goes great! The update, however, does not: it stalls on the node_import update. Without the update, everything goes as the software (including the patch) is written.
However, I installed the embedded image field module, but will that one also take care of the images I put in the temp/files folder to process?
Thanks in advance for this answer. Then I know what to do.
#33
@hessie: First of all, this issue is for updating existing nodes on import. So, I'm changing the title back. Please wait for that feature to land, and then you can file a new feature request to extend this to handle images better :)
No, you didn't answer my question about how you are handling images, you merely told me which toolkit you are using with the Image API module. What I asked is which module you use to attach images to a node.
The embedded image field module is used to reference images stored elsewhere on the internet, such as on Flickr or Picasa, not those in a folder on your webserver. I gather you are working with a CSV file from a supplier who will perhaps not want to start storing all their photos on Flickr, so perhaps that's not the solution for you. If you do decide to use the embedded image field module, you will also need the node import patch I wrote to handle that module: #565424: Support for embedded media fields
I gather you are not an experienced Drupal developer. So, my advice to you is to use the simplest solution. Just upload all the images to your webserver somewhere, and store the filenames in a plain CCK textfield. Then, in your theme templates, use the filename to generate the IMG tag to display the image.
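A minimal sketch of the last suggestion, turning a stored filename into an IMG tag, might look like this. It is plain PHP so it can be tried outside Drupal; the helper name, the base path, and the filename are made up, and in an actual Drupal 6 theme you would more likely use theme('image') or check_plain() for the escaping.

```php
<?php
// Hypothetical helper; the base path used below is an assumed location.
function example_image_tag($filename, $base_path, $alt) {
  // Keep only the file name so a stored value cannot point elsewhere.
  $filename = basename(trim((string) $filename));
  if ($filename === '') {
    return '';
  }
  $src = rtrim($base_path, '/') . '/' . rawurlencode($filename);
  return '<img src="' . htmlspecialchars($src, ENT_QUOTES) . '" alt="'
    . htmlspecialchars($alt, ENT_QUOTES) . '" />';
}

echo example_image_tag('red chair.jpg', '/sites/default/files/products', 'Red chair'), "\n";
// <img src="/sites/default/files/products/red%20chair.jpg" alt="Red chair" />
```

An empty field yields an empty string, so a product without a picture simply renders no tag.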
#34
applying tauno's patch in #13 against 6.x-1.x-dev I get the following SQL error in step 7:
user warning: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '= "PC-60A" LIMIT 1' at line 1 query: SELECT nid, vid FROM content_ WHERE = "PC-60A" LIMIT 1 in /var/www/html/sites/default/modules/node_import/supported/node.inc on line 293.
I can go on to step 8 and run the import, however new nodes are created.
By customizing the dbquery code, replace:
    $row = db_fetch_array(db_query('SELECT nid, vid FROM {%s} WHERE %s = "%s" LIMIT 1', $key_dbinfo['table'], $key_dbinfo['columns']['value']['column'], $key_value));
with
    $row = db_fetch_array(db_query('SELECT nid, vid FROM {uc_products} WHERE "$key" = "%s" LIMIT 1', $key_dbinfo['table'], $key_dbinfo['columns']['value']['column'], $key_value));
I'm able to eliminate the error message, however new nodes are still created, not updated.
Any ideas?
mysql 5.1.37
#35
I don't wanna start more threads, so can I ask the status of node update on import? Seems there are two flavors cooking: using a NID and not. I need the "not" version, as my CSV has no knowledge of NID's but I don't mind requiring a unique key-like field whether it be Title or otherwise.
Is this working now in a patch-induced state or is its release pending? Thanks...
#36
Is it likely one of these two flavours (of node import update patches) will be supported in a future release?
cheers
scotjam
#37
I vote for "not NID". But a unique CCK field can be set as primary key, for example an article number.
#38
For the record, I believe this module would ideally allow both possibilities. There's certainly no technical reason to choose just one or the other. Judging from this issue queue, both are of significant interest to the community.
So, I would say the next step is to merge the patches in #13 & #27, after which we can call for review. Marked "needs work" until this is done, and "feature request" because this is a new feature for node import.
#39
Hi mvc,
Applying patch in #13 gives me errors...
* warning: mysqli_real_escape_string() expects parameter 2 to be string, array given in /blahblah/includes/database.mysqli.inc on line 323.
* user warning: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '= "" LIMIT 1' at line 1 query: SELECT nid, vid FROM content_type_store WHERE = "" LIMIT 1 in /blahblah/sites/all/modules/node_import/supported/node.inc on line 293.
Got no idea what this means, but I thought it might be useful to share.
best wishes
scotjam
#40
Quick Update. I'm getting the same error as 'scholar' see #34 above.
#41
+1 for this feature. Subscribing...
#42
For anyone interested, I hard coded the product SKU field into node.inc
Replace:
    $row = db_fetch_array(db_query('SELECT nid, vid FROM {%s} WHERE %s = "%s" LIMIT 1', $key_dbinfo['table'], $key_dbinfo['columns']['value']['column'], $key_value));
With:
    $row = db_fetch_array(db_query("SELECT nid, vid FROM {drupal.uc_products} WHERE $key = \"$key_value\" LIMIT 1", $key_dbinfo['table'], $key_dbinfo['columns']['value']['column'], $key_value));
The error I was getting in #34 is caused by an empty array call. The above code will only suit people updating products in Ubercart 2, as drupal.uc_products is a hard coded table.
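The warnings in #34 and #39 show the lookup being built with an empty table and column name, and in one case with an array as the key value. One way to avoid running such a query is to check those pieces first. The sketch below is an assumption, not the patch's code: the helper name is made up, field_sku_value is a made-up column, and the shape of $key_dbinfo is inferred from the db_query() call quoted above.

```php
<?php
// Hypothetical helper, not from the patch in #13. The shape of $key_dbinfo
// is inferred from the db_query() call quoted in #34 and #42.
function example_build_key_lookup($key_dbinfo, $key_value) {
  $table = isset($key_dbinfo['table']) ? $key_dbinfo['table'] : '';
  $column = isset($key_dbinfo['columns']['value']['column'])
    ? $key_dbinfo['columns']['value']['column'] : '';
  // Refuse to build a query when either identifier is missing or unsafe.
  $identifier = '/^[A-Za-z0-9_]+\z/';
  if (!is_string($table) || !preg_match($identifier, $table)
      || !is_string($column) || !preg_match($identifier, $column)) {
    return NULL;
  }
  // An array or empty key value cannot identify a node either.
  if (!is_scalar($key_value) || (string) $key_value === '') {
    return NULL;
  }
  return array(
    'query' => "SELECT nid, vid FROM {" . $table . "} WHERE " . $column . " = '%s' LIMIT 1",
    'args' => array((string) $key_value),
  );
}

$dbinfo = array(
  'table' => 'content_type_store',
  'columns' => array('value' => array('column' => 'field_sku_value')),
);
$lookup = example_build_key_lookup($dbinfo, 'PC-60A');
echo $lookup['query'], "\n";
// SELECT nid, vid FROM {content_type_store} WHERE field_sku_value = '%s' LIMIT 1

var_dump(example_build_key_lookup(array(), 'PC-60A'));     // NULL
var_dump(example_build_key_lookup($dbinfo, array('x')));   // NULL
```

When the helper returns NULL the import could fall back to creating a new node or report the row, instead of running a broken query. Otherwise the query and its arguments could be handed to db_query(), which substitutes the '%s' placeholder with an escaped value.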
#43
scholar,
Thanks for the solution.
I have the same error on a CCK field, do you have any idea how I can solve it ?