Needs review
Project:
Node import
Version:
5.x-1.9
Component:
Code
Priority:
Normal
Category:
Bug report
Assigned:
Unassigned
Reporter:
Created:
1 May 2008 at 06:43 UTC
Updated:
12 Feb 2009 at 17:13 UTC
Hello, I'm using this great module to upload about 250 pages to my Drupal site. The fields are title, body, and path URL. The body is the full HTML code of the text I've written for each page. When I do the import, the importer cannot seem to read the body as one field and breaks it into a million useless pieces. What could be causing this? I've spent hours troubleshooting this and I can't get it to work. I'd appreciate any help you guys can provide.
| Comment | File | Size | Author |
|---|---|---|---|
| #10 | node_import.patch | 5.58 KB | ngmaloney |
Comments
Comment #1
jonathanwthomas commented
Comment #2
Robrecht Jacques commented
There was a problem with CSV files. That has been fixed in 5.x-1.x-dev. I'll release a new version of node_import soon (5.x-1.6) that will include these fixes and which probably solves this issue as well.
Comment #3
jonathanwthomas commented
OK, great! Thanks for getting back to me! Keep up the good work!
Comment #4
lisrael commented
I'm still experiencing this with the 5.x-1.6 release.
There doesn't seem to be a way to include double quotes in the content. Neither the "" method nor the \" method works.
Comment #5
lisrael commented
Comment #6
lisrael commented
A workaround is to use something other than " as the text qualifier symbol, and to use a single " within the content itself.
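For comparison, here is a minimal sketch of how PHP's native fgetcsv() treats the two conventions mentioned above: the standard doubled "" inside a quoted field, and a different text qualifier. It does not use node_import's own parser. The helper name parse_csv_line() and the choice of | as the alternative qualifier are assumptions made only for this illustration.

```php
<?php
// Hypothetical helper: write one line to a temporary stream and parse it
// with PHP's native fgetcsv(). Not part of node_import.
function parse_csv_line($line, $enclosure = '"') {
  $handle = fopen('php://memory', 'r+');
  fwrite($handle, $line);
  rewind($handle);
  // Arguments: handle, max length (0 = no limit), delimiter, enclosure, escape.
  $row = fgetcsv($handle, 0, ',', $enclosure, '\\');
  fclose($handle);
  return $row;
}

// Standard CSV: a literal double quote inside a quoted field is doubled.
$doubled = parse_csv_line('"Title","<a href=""/about"">About</a>","about"');

// Workaround style: another text qualifier (| is an assumed choice),
// so plain double quotes can appear in the HTML unchanged.
$piped = parse_csv_line('|Title|,|<a href="/about">About</a>|,|about|', '|');
```

In both cases the second field comes back as the same HTML string, so with the native parser the choice of qualifier is mostly a matter of what the exporting tool can produce.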
Comment #7
halstead commented
I had this same problem. I fixed it by opening the CSV file in OpenOffice.org Spreadsheet and doing a find and replace.
I set the find string to \n
The replace string was left blank
I used the More Options button and selected regular expression. And then I checked Replace all.
The newly saved CSV file worked.
Note that you may want to set the replace string to
depending on the data you are importing.
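The same clean-up can be scripted instead of done by hand in a spreadsheet. The sketch below assumes a PHP version whose fgetcsv() already reads multi-line quoted fields correctly; the function name flatten_csv_newlines() and the stream handles are hypothetical and not part of node_import.

```php
<?php
// Hypothetical helper: copy CSV records from $in to $out, replacing line
// breaks inside field values with $replacement (empty by default, as in
// the find-and-replace described above).
function flatten_csv_newlines($in, $out, $replacement = '') {
  $count = 0;
  while (($row = fgetcsv($in, 0, ',', '"', '\\')) !== FALSE) {
    // fgetcsv() returns array(NULL) for a blank line; skip those.
    if ($row === array(NULL)) {
      continue;
    }
    foreach ($row as $key => $value) {
      $row[$key] = preg_replace('/\r\n|\r|\n/', $replacement, $value);
    }
    fputcsv($out, $row, ',', '"', '\\');
    $count++;
  }
  // Number of records written.
  return $count;
}
```

Calling it with two open file handles, for example flatten_csv_newlines($in, $out, ' '), writes a copy of the file in which every record fits on one physical line; the space passed as the third argument is only an example value.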
Comment #8
onelittleant commented
Folks:
Node import incorrectly keeps track of where it is in CSV files when parsing individual field values that span multiple lines. There is a very easy fix for this. I don't know how to make a patch, but here it is:
Change the following line (line 850):
$str = substr($str, (strlen($value) + 2));
to this:
$str = substr($str, $i + 1);
The 1.6 method of trimming the current $str based on the length of the current $value does not work when a line wrap is encountered in a field value and the next line is read into $str. AFAIK this has been an issue since 4.7.
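To illustrate the general idea of advancing by scan position rather than by value length, here is a standalone sketch. It is not node_import's actual code: the function consume_quoted_field() is hypothetical, and it demonstrates the mismatch with doubled quotes, a different trigger from the wrapped-line case described above but the same underlying problem (the parsed value is not as long as the text that was consumed).

```php
<?php
// Hypothetical sketch: consume one quoted field from the start of $str.
// Returns array($value, $rest), or FALSE if the closing quote is not in $str.
function consume_quoted_field($str) {
  $value = '';
  $length = strlen($str);
  // Start scanning just after the opening quote.
  for ($i = 1; $i < $length; $i++) {
    if ($str[$i] === '"') {
      if ($i + 1 < $length && $str[$i + 1] === '"') {
        // Doubled quote: one character in the value, two consumed.
        $value .= '"';
        $i++;
        continue;
      }
      // Closing quote found; move $i onto the delimiter that follows it.
      $i++;
      // Cut by scan position, not by strlen($value).
      $rest = (string) substr($str, $i + 1);
      return array($value, $rest);
    }
    $value .= $str[$i];
  }
  return FALSE;
}

$input = '"say ""hi""",next';
list($value, $rest) = consume_quoted_field($input);
// What a length-based cut would leave behind for the same input.
$length_based = substr($input, strlen($value) + 2);
```

Here the position-based cut leaves the next field intact, while the length-based cut leaves stray quote characters in front of it.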
Comment #9
snarlydwarf commented
My solution was much easier: I uncommented the line that calls fgetcsv():
return fgetcsv($handle);
Despite the comments, it works very well on current versions of PHP, and handles multiline CSV records (which were emitted from Perl as part of my conversion process).
If your PHP is current, you should be fine with fgetcsv() instead of all the code after it.
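As a quick way to check this on a given PHP installation, the sketch below feeds fgetcsv() a record whose body field spans two lines. The sample data and variable names are made up for the illustration.

```php
<?php
// Sample CSV: a header row, then one record whose quoted body field
// contains a line break.
$csv = "title,body,path\n"
     . "\"About us\",\"<p>First line\nSecond line</p>\",\"about\"\n";

$handle = fopen('php://memory', 'r+');
fwrite($handle, $csv);
rewind($handle);

$rows = array();
// Arguments: handle, max length (0 = no limit), delimiter, enclosure, escape.
while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== FALSE) {
  $rows[] = $row;
}
fclose($handle);
```

If fgetcsv() handles multi-line fields on that installation, $rows holds two entries and the body of the second still contains its line break.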
Comment #10
ngmaloney commented
@snarlydwarf
I ran into a similar problem with CSV values not parsing correctly. In my case I had a field containing HTML. I tried changing delimiters in the source csv to no avail.
I modified the node_import module to use the native fgetcsv() and the fields mapped correctly. Rather than fork the code I created a patch. This patch creates a drop-down in the settings tab that allows users to select what CSV parser they wish to use. It defaults to the included node_import parser but also provides an option to use native fgetcsv.
In _node_import_csv_get_row() I just added a conditional that picks the parser based on the settings value.
Comment #11
jdhildeb commented
While the above patch is helpful, the native fgetcsv parser seems to be much better on recent versions of PHP. Moreover, the user shouldn't need to select which parser to use. Instead, I recommend this approach:
* determine which PHP version fixed fgetcsv
* at runtime, check whether PHP is newer than this version, and if so then use fgetcsv
* otherwise use the CSV parser included in node_import, but display a warning to the user that they might have better luck on a newer version of PHP.
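The steps above could be sketched roughly as follows. The thread does not establish which PHP release fixed fgetcsv(), so the threshold is passed in as a parameter rather than hard-coded; the function name, return structure, and warning text are all hypothetical and not taken from node_import or the patch.

```php
<?php
// Hypothetical sketch: choose a CSV parser from the running PHP version.
// $minimum is the first PHP version with a reliable fgetcsv(); the real
// value is NOT known from this thread and must be supplied by the caller.
function example_choose_csv_parser($minimum, $php_version = PHP_VERSION) {
  if (version_compare($php_version, $minimum, '>=')) {
    return array('parser' => 'fgetcsv', 'warning' => NULL);
  }
  return array(
    'parser' => 'node_import',
    'warning' => 'Using the built-in CSV parser; a newer PHP version may give better results.',
  );
}
```

A caller would then branch on the 'parser' key where the row is read and show the 'warning' text, if any, through whatever messaging mechanism the module already uses.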
Comment #12
ngmaloney commented
@jdhildeb - I'd be happy to provide a patch with that functionality. Do you happen to know what version of PHP/fgetcsv I should be checking for?