I made this patch so that taxonomy can be included in the CSV file and will be imported along with the other node data. It's quick and dirty, and ignores any sort of portability and such. The documentation on Drupal's APIs is a bit on the sparse side, so I had to sort of muddle through it. The code works, though.
I don't use this exact version on our site, because I had to make further modifications for our needs, but I figured the community might be interested.
The whole reason behind my need for this is that we needed to import classifieds to our site. I set up a Classified taxonomy, and have our classified department add the classifieds from a local CSV to the site using my version of the node_import module. The version we use actually deletes the classifieds from the previous days, too, but that seems a little specialized for wide release.
So, to add taxonomy to your CSV file, just use the taxonomy term number. For instance, here is a sample CSV file:
title,name,type,term,teaser,body
"AFFORDABLE",thechronicle,story,8,"AFFORDABLE plowing.","AFFORDABLE plowing."
"CNA weeken",thechronicle,story,8,"CNA weekends.","CNA weekends."
"DENTAL Ass",thechronicle,story,8,"DENTAL Assistant.","DENTAL Assistant."
This isn't exactly what our CSV looks like, but it is similar. The first row lists the field names that the node_import module uses to parse the file. Each subsequent row holds the values for those fields.
title is the title of the node.
name is the name of the user that will be credited on this node.
type is the type of node. Flexinode isn't supported, but the built-in nodes work fine.
term is the taxonomy term number. In this case, we use a term called Classified, which is represented by the number 8 on our site. To find out what number corresponds to what term, simply go to that term's edit page. The number will be at the end of the URL.
teaser is the text that will show up when people view the nodes on the front page or some other multi-node page.
body is the whole text of the node.
In our case, teaser and body are always the same, but this allows you to specify what should show up where.
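As a rough illustration of how a file like the sample above breaks down into per-node fields, here is a minimal Python sketch. It is not the module's PHP code: `parse_rows` is a hypothetical helper, and the only assumptions are that the first row holds the field names and that the term column holds a single numeric term ID.

```python
import csv
import io

# Two rows taken from the sample CSV in the post.
SAMPLE = '''title,name,type,term,teaser,body
"AFFORDABLE",thechronicle,story,8,"AFFORDABLE plowing.","AFFORDABLE plowing."
"CNA weeken",thechronicle,story,8,"CNA weekends.","CNA weekends."
'''

def parse_rows(text):
    """Return one dict per data row, keyed by the header row's field names."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # Assumption: the term column holds one numeric taxonomy term ID.
        row["term"] = int(row["term"])
        rows.append(row)
    return rows

for node in parse_rows(SAMPLE):
    print(node["title"], node["type"], node["term"])
```

Each dict corresponds to one node to be created; a real importer would still need to map name to a user and term to the node's taxonomy.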
| Comment | File | Size | Author |
|---|---|---|---|
| #7 | csv-arrays.patch | 1.27 KB | Robrecht Jacques |
| #4 | node_import-unserialize.patch | 745 bytes | Robrecht Jacques |
| #3 | node_import_2_0.patch | 1.62 KB | MJoyce-1 |
| #2 | node_import_2.patch | 1.62 KB | MJoyce-1 |
| | node_import_1.patch | 1.74 KB | rael9 |
Comments
Comment #1
moshe weitzman commented
We need a generic way to handle fields that expect an array, like $node->taxonomy.
Comment #2
MJoyce-1 commented
Same as the previous patch
http://drupal.org/node/18207
except that the term value can now be "8;22;37", so multiple taxonomy associations can be added during import.
This patch should be applied to the original.
I'm not a programmer so I expect there are inefficiencies in the code.
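To make the "8;22;37" format concrete, here is a minimal Python sketch of the splitting step. It is only an illustration, not the code in the attached patch; `parse_term_ids` is a hypothetical name, and treating every item as an integer term ID is an assumption.

```python
def parse_term_ids(value, separator=";"):
    """Split a value such as "8;22;37" into a list of integer term IDs."""
    # Skip empty pieces so an empty cell yields an empty list.
    return [int(part) for part in value.split(separator) if part.strip()]

print(parse_term_ids("8;22;37"))
```

A single value such as "8" still works and gives a one-item list, so files written for the original patch would not need to change.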
Comment #3
MJoyce-1 commented
Same as the previous patch
http://drupal.org/node/18207
except that the term value can now be "8;22;37", so multiple taxonomy associations can be added during import.
This patch should be applied to the original.
I'm not a programmer so I expect there are inefficiencies in the code.
Comment #4
Robrecht Jacques commented
Well, I tried to patch node_import.module too for the purpose of importing taxonomy terms. The route I followed is to allow a CSV file to contain serialized data that is unserialized as needed. One advantage compared to adding special handling of a "term" field for taxonomy is that this solution is more general. On the other hand, it is more complex, I suppose.
The idea is that when the column-name of the CSV file contains "[]", this is taken as a sign that the data of this column should be "unserialized". The "[]" is then stripped from the fieldname. See the patch, it's quite clear (although a quick and dirty implementation of it).
With this patch you can then import a CSV file like this:
Like I said, a more general solution, but the serialized data is probably more difficult to understand for newbies.
BTW: I think the documentation of node_import.module should be a bit more clear on how to import flexinodes and taxonomy terms (when it becomes possible). See: http://drupal.org/node/6784#comment-39755 (how to import flexinodes) and http://drupal.org/node/6784#comment-39822 (a solution for taxonomy).
Kind regards,
Robrecht
Comment #5
moshe weitzman commented
I like the bracket syntax for the column name. I dislike serializing the data. I suggest pipe-delimited items instead.
Comment #6
Robrecht Jacques commented
You're right that the "serialize" data looks (and is) too hard. It works best if you use "php" to create your CSV file, but not if you do it any other way (like manually).
Maybe the "[]" syntax for the field-name together with "1;2;3" (of the other patch) for the data works best, although not so general as the "serialize": it would only allow for arrays, not more complex structures (are there any more complex structures to import?).
I'm not completely sure how you would see if it is an array of integers or an array of strings, but maybe that's not needed anyway. Or maybe it can autodetect this if needed.
Now it would be easier if instead of all these numbers one could just enter the terms themselves (separated by ??).
BTW: I would like to improve the "flexinode" import too: why should one have to enter "flexinode-X" and "flexinode_Y"? Probably the module could figure out for itself whether the type in the "type" column is a flexinode or not.
I'll look into it this weekend and prepare a new patch.
Kind regards,
Robrecht
Comment #7
Robrecht Jacques commented
OK, a new patch.
In order to treat a field as an array you have to add "[]" at the end of the field title. Optionally you can insert a character between the brackets as the separator (the default is "|"). E.g. "taxonomy[]" as field title will create an array called "taxonomy" where the values are separated on "|". Or if you put "taxonomy[;]" the separator will be ";" instead of "|".
Right? An example:
We have overruled the default separator in this example, using ";" instead.
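The field-title convention described in this comment can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the code in csv-arrays.patch: `parse_field_title` and `convert_value` are hypothetical names, and values are left as strings because the thread leaves open whether items should be treated as integers or strings.

```python
import re

# Matches a field title ending in "[]" or "[x]", where x is one separator character.
_ARRAY_FIELD = re.compile(r"^(?P<name>.+)\[(?P<sep>.?)\]$")

def parse_field_title(title, default_separator="|"):
    """Return (field_name, separator); separator is None for plain fields."""
    match = _ARRAY_FIELD.match(title)
    if match is None:
        return title, None
    # An empty pair of brackets falls back to the default separator.
    return match.group("name"), match.group("sep") or default_separator

def convert_value(title, value):
    """Return (field_name, value), splitting the value when the title marks an array."""
    name, separator = parse_field_title(title)
    if separator is None:
        return name, value
    return name, value.split(separator) if value else []

print(convert_value("taxonomy[;]", "8;22;37"))
print(convert_value("taxonomy[]", "8|22"))
print(convert_value("title", "AFFORDABLE"))
```

Keeping the separator in the field title means each column can choose its own delimiter without any change to the data rows themselves.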
Robrecht
Comment #8
Robrecht Jacques commented
The 4.7 version has a nicer design. The 4.6 version will not be extended.