A file I just exported from WordPress contains two lines (quoted in comment #1 below) that break the import. If I remove them, importing works fine.
Otherwise, it fails and shows:
warning: XMLReader::read() [xmlreader.read]: /chroot/tmp/wordpress.2010-08-09.orig_.xml:50: namespace error : Namespace prefix atom on link is not defined in /home/laincons/public_html/modules/wordpress_import/wordpress_import.module on line 1424.
warning: XMLReader::read() [xmlreader.read]: +xml" href="http://mdotg.wordpress.com/osd.xml" title="mdotg@wordpress.com:~ #" in /home/laincons/public_html/modules/wordpress_import/wordpress_import.module on line 1424.
warning: XMLReader::read() [xmlreader.read]: ^ in /home/laincons/public_html/modules/wordpress_import/wordpress_import.module on line 1424.
warning: XMLReader::read() [xmlreader.read]: /chroot/tmp/wordpress.2010-08-09.orig_.xml:51: namespace error : Namespace prefix atom on link is not defined in /home/laincons/public_html/modules/wordpress_import/wordpress_import.module on line 1424.
warning: XMLReader::read() [xmlreader.read]: in /home/laincons/public_html/modules/wordpress_import/wordpress_import.module on line 1424.
warning: XMLReader::read() [xmlreader.read]: ^ in /home/laincons/public_html/modules/wordpress_import/wordpress_import.module on line 1424.
warning: XMLReader::read() [xmlreader.read]: An Error Occured while reading in /home/laincons/public_html/modules/wordpress_import/wordpress_import.module on line 1424.
This file does not appear to be a valid WXR file. The file is either corrupted or invalid XML. In some versions of WordPress, the export function can produce malformed XML. Please see README.txt (included in the module archive) for further guidance.
Comments
Comment #1
pablog_ commented
The lines were parsed as HTML and did not appear. Here they are:
< atom:link rel="search" type="application/opensearchdescription+xml" href="http://mdotg.wordpress.com/osd.xml" title="mdotg@wordpress.com:~ #" />
< atom:link rel='hub' href='http://mdotg.wordpress.com/?pushpress=hub'/>
Remove the space between "<" and "atom:link".
Comment #2
pablog_ commented
Sorry for reopening this issue. I had not seen that it had already been reported and closed.
I see that the README suggests removing those two lines, but that may be hard for a novice user. In my opinion, this module should provide one of these workarounds:
1) Ignoring this line and its error.
2) Pre-parsing the file and removing the problematic lines.
What do you think?
Comment #3
lavamind commented
This specific error occurs because the PubSubHubBub WordPress plugin adds these lines to the export file. A good solution would be to help the author update that plugin so it doesn't add these lines when running the export, where it doesn't make any sense to have them anyway. So, to be honest, this is mainly a problem with a WordPress plugin interfering with the export.
On the Drupal module side, it's not an easy task. This is because XMLReader is a stream reader, which means that it doesn't load the entire file in memory before processing it. In XMLReader there's no option to ignore errors, so #1 is out. The second idea would be somewhat complicated to implement because it's impossible (from what I could tell) to have XMLReader report what lines, exactly, are problematic.
Comment #4
1kenthomas commented
Open the file in Emacs.
M-x delete-matching-lines
Input: atom:link
Save.
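For anyone without Emacs, the same cleanup can be scripted before uploading the file. The following is only a minimal sketch in Python, run outside Drupal, not part of the wordpress_import module. It assumes each atom:link element sits on its own line, as in the two lines quoted in comment #1; the function name strip_atom_links and the pattern are made up for illustration. Unlike the Emacs command, it only drops lines containing an opening atom:link tag, not every line that mentions atom:link.

```python
import re

# Assumption: the offending elements look like the two lines quoted in
# comment #1, i.e. an opening tag "<atom:link ..." on a single line.
ATOM_LINK = re.compile(r"<\s*atom:link\b")


def strip_atom_links(src, dst):
    """Copy src to dst line by line, skipping lines with an <atom:link> tag.

    The file is streamed, so it is never loaded into memory as a whole.
    Returns the number of lines that were dropped.
    """
    dropped = 0
    # newline="" keeps the original line endings untouched.
    with open(src, encoding="utf-8", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            if ATOM_LINK.search(line):
                dropped += 1
                continue
            fout.write(line)
    return dropped
```

For example, strip_atom_links("wordpress.2010-08-09.orig_.xml", "wordpress.cleaned.xml") would write a cleaned copy (the output filename is just an example) and return how many lines were removed; for the export in this issue that should be 2. The cleaned copy is the file to import.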
Comment #5
1kenthomas commented
Bump. It would be nice to auto-clean the data.