Hi there,

Could someone help me debug the following error, which occurs when I import my WordPress database?

Error message: MigrateException: XMLReader::read() [xmlreader.read]: An Error Occured while reading File /home/xylowuco/public_html/dXXXr/test/sites/all/modules/migrate/plugins/sources/xml.inc, line 732 in MigrationBase->errorHandler() (line 541 of /home/xylowuco/public_html/dXXXr/test/sites/all/modules/migrate/includes/base.inc)

I don't even know where to start in understanding what this error means.

Using:

Migrate: 7.x-2.3-rc1+2-dev (although I downloaded and uploaded migrate-7.x-2.x-dev.tar)

I disabled Migrate Extras, even though I am using pathauto 7.x-2.2, because for some reason it was not working well together with Wordpress_migrate and the Migrate version I am using.

Wordpress_migrate 7.x-2.x-dev

Thanks for any pointers; I am at a dead end.

Attachments:
Comment #4, file: 3-20-2012 3-50-22 PM.png, size: 49.93 KB, author: linkanp

Comments

mikeryan’s picture

Title: Import Error » XML error on import
Priority: Major » Normal
Status: Active » Postponed (maintainer needs more info)

I'm afraid I don't understand what the error means either - unfortunately, when there's an XML parsing error, there's little information provided to help tell where it is. Is it possible for you to send me the Wordpress dump that's giving you the trouble? If so, please contact me through my profile.

Also, what do you mean by migrate_extras "not working together well"? What precisely is the issue? Are you using the latest version of migrate_extras?

Thanks.

sunflowerseeds’s picture

I'm getting the same error when I try to upload the WXR file, and I also get it if I try to import directly from the WP site

MigrateException: XMLReader::read(): An Error Occured while reading File C:\Projects\turn\sites\all\modules\migrate\plugins\sources\xml.inc, line 732 in MigrationBase->errorHandler() (line 541 of C:\Projects\turn\sites\all\modules\migrate\includes\base.inc).

I have Migrate 7.x-2.3-rc1+6-dev, Migrate Extras 7.x-2.3-rc1+3-dev, and Migrate from Wordpress 7.x-2.x-dev

mikeryan’s picture

@sunflowerseeds - could you contact me through my profile and send me your file? Unfortunately, the XML library is not reporting anything specific enough through that error message for me to tell what might be giving you and @zoomzoom trouble; I need to be able to reproduce the problem to help.

Thanks.

linkanp’s picture

Priority: Normal » Major
Status: Postponed (maintainer needs more info) » Active
Attached file: new, 49.93 KB

Unfortunately, I am facing the same problem.
I have successfully migrated a WXR file generated by WordPress 3.0.1,
but migrating a WXR file generated by WordPress 3.1.2 fails.
I tried both at the same time: the WordPress 3.0.1 file works fine, and the WordPress 3.1.2 file does not.
What is wrong? Is the problem in the WXR file or in the module?
Please help.
An error screenshot is attached. Thanks in advance.

mikeryan’s picture

Status: Active » Postponed (maintainer needs more info)

Again, that error message provides no useful information to tell what's going wrong - the only way I'll be able to diagnose the problem is if someone sends me a WXR file that triggers it.

MaffooClock’s picture

After several hours of poking at this, I determined that my 60,000+ line XML file had TONS of non-UTF-8 space characters. I used the `xmllint` command-line tool, whose output at first didn't make any sense. After all, how can your eyeball tell an ISO-8859-1 space from a UTF-8 space? I figured out a way to find all the spaces and perform a find and replace. When `xmllint` finally spit out the entire XML file without any complaints, I knew I would be able to import it... and I was right.

So, moral of the story: your XML file has some characters which are not UTF-8, and you may not even be able to tell. I think the module could be fixed to fail more gracefully when encountering non UTF-8 characters, or try to transliterate, or...
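For anyone who wants to locate such bytes before importing, here is a minimal Python sketch. It is not part of the module and not what MaffooClock used; the function name `find_invalid_utf8` is made up for illustration. It tries to decode each line as UTF-8 and reports the first offending byte sequence on every line that fails.

```python
def find_invalid_utf8(data: bytes):
    """Return (line_number, bad_bytes) for each line that is not valid UTF-8.

    Only the first undecodable sequence on each line is reported.
    """
    problems = []
    # Splitting on b"\n" is safe: byte 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for line_number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as err:
            problems.append((line_number, line[err.start:err.end]))
    return problems


# Hypothetical usage with an export file:
#   with open("wordpress.xml", "rb") as handle:
#       for line_number, bad in find_invalid_utf8(handle.read()):
#           print(line_number, bad)
```

A byte such as 0xA0 showing up in the report is typically an ISO-8859-1 non-breaking space, which looks like an ordinary space in most editors.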

xurizaemon’s picture

I had the same error on OSX. Here's how I got around two issues importing WXR from WP2.3.3

First of all, XMLReader error - line 124 of wordpress.inc for me. Ran WXR through xmllint.

path/to/wordpress.2012-03-30.xml:3396: parser error : Entity 'copy' not defined
on hardboard. You can see the full image here.        All rights reserved &copy;
                                                                          ^
path/to/wordpress.2012-03-30.xml:7338: parser error : Entity 'copy' not defined
t their new album artwork, done by yours truly.       All rights reserved &copy;
                                                                          ^

replaced &copy; with a UTF8 ©

perl -pi -e 's#&copy;#©#g' path/to/wordpress.2012-03-30.xml

OK.

Next issue: error about it not being a valid WXR file because it lacked a version string. "The uploaded file is not a valid WordPress export."

Fixed by adding inside the <channel> tag ...

	<wp:wxr_version>1.1</wp:wxr_version>

OK.

Then WXR imported OK. @mikeryan you are welcome to a copy if you care about importing from WP2.3.3 :)
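The same kind of entity fix can be generalised beyond the `copy` entity. The sketch below is an illustration only, assuming the export is already valid UTF-8; the helper name `replace_html_entities` is hypothetical. It uses Python's standard `html.entities` table to swap HTML named entities for literal characters and leaves the five entities that XML itself defines untouched.

```python
import re
from html.entities import name2codepoint

# Entities that XML defines on its own; these must stay escaped.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def replace_html_entities(text: str) -> str:
    """Replace HTML-only named entities (e.g. copy, nbsp) with literal characters."""
    def substitute(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave XML entities and unknown names alone
        return chr(name2codepoint[name])

    return ENTITY_RE.sub(substitute, text)
```

Note that this also rewrites entities inside CDATA sections, just as the perl one-liner does, so it is worth keeping a backup of the original export.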

drewish’s picture

Yeah, WordPress doesn't seem to properly escape content. I found one line with a <3 in it that I had to switch to a &lt;3.

torvous’s picture

grobot, thank you!! I was about to give up, but this worked for me. If there are any less techy types out there like me that might be helped by what I did, I used http://www.cometdocs.com/xmllint.htm to clean up my xml, added the wp:wxr_version tag detailed by grobot, saved the new xml file and uploaded it. Presto!

mikeryan’s picture

Status: Postponed (maintainer needs more info) » Fixed

I finally got my own WXR file with bogus UTF-8 and, with a little trial and error, found a formula to strip the bogus control characters. Committed.
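For readers who want to pre-clean a file by hand, the following Python sketch shows one way to strip characters that XML 1.0 does not allow. This is an assumption-laden illustration, not the formula committed to the module, and the function names are hypothetical. Undecodable bytes are replaced with U+FFFD rather than dropped.

```python
import re

# Characters outside the XML 1.0 "Char" production: C0 controls other than
# tab, line feed and carriage return, plus U+FFFE and U+FFFF.
ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def strip_illegal_xml_chars(text: str) -> str:
    """Remove characters that an XML 1.0 parser will reject."""
    return ILLEGAL_XML_CHARS.sub("", text)


def clean_wxr_bytes(raw: bytes) -> bytes:
    """Decode leniently, strip illegal characters, and re-encode as UTF-8."""
    text = raw.decode("utf-8", errors="replace")  # bad bytes become U+FFFD
    return strip_illegal_xml_chars(text).encode("utf-8")
```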

knitfreedom’s picture

Great! Is there a way I can take advantage of the fix you just did? I am running into the same problem as the above members.

mikeryan’s picture

Just install the latest -dev release from the project page. Note that you will also need the latest -dev releases of Migrate and Migrate Extras.

knitfreedom’s picture

Thank you!! I'm new here, I didn't even think of that. Yay!

knitfreedom’s picture

Hi, I installed the dev versions of all three modules. I am still getting a similar error, when I import from a file or when I try to do it directly from a URL. I can't attach the XML file here because it's too big, so I sent you a message with a link to download the XML file.

Here is the error message:
An AJAX HTTP error occurred. HTTP Result Code: 500 Debugging information follows. Path: /batch?render=overlay&id=17&op=do StatusText: Service unavailable (with message) ResponseText: MigrateException: XMLReader::read() [xmlreader.read]: An Error Occured while reading File /Users/Liat/Desktop/knitfreedom drupal site/sites/all/modules/migrate/plugins/sources/xml.inc, line 732 in MigrationBase->errorHandler() (line 541 of /Users/Liat/Desktop/knitfreedom drupal site/sites/all/modules/migrate/includes/base.inc).

Thank you for any help you can provide!

Liat

mikeryan’s picture

@knitfreedom: What you have is another instance of #1055310: Could not load WXR file - mismatched tag due to embedded CDATA. Let's pick it up over there.

rimousky’s picture

Hi

I'm new to this forum.

I had (and still have) a lot of problems getting started with importing my WP database into Drupal.

  • The first thing to check before importing a WXR file is whether it follows the XML rules and whether WP has introduced any faults. Special characters such as "&" are one of the main sources of error, and WP also often introduces mistakes in the HTML tag structure (for example, an opening tag without its closing tag), which is enough to stop the import process. If the DB isn't too big (up to around 120,000 lines, depending on your computer configuration), it's possible to check the XML file by editing it in a text editor (Notepad++) and opening it in Firefox, which indicates the type of each error and the corresponding line. This problem is now solved for me, and I have performed a lot of imports using this process.

  • I have a second problem: the import is no longer possible since the latest changes to the migrate module (version 2.3). The new wordpress_migrate module is now incompatible with this version.

How can I get around this difficulty?

mikeryan’s picture

Issue tags: +wordpress migration migrate

Sorry, I thought I had updated the project page to reflect this - it requires a -dev release of Migrate from April 20 or later, because it's making use of changes to Migrate's file handling that were committed then. I've now written that in the project page.

axolx’s picture

I had the same issue even with the latest dev versions (as of today) of `migrate` and `wordpress_migrate`. The issue for me was not with these modules but with the XML generated by WordPress. I used `xmllint` to find the errors, fixed them with a text editor, and then the migration worked.
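If `xmllint` is not available, a rough equivalent of the "find the errors" step can be sketched with Python's built-in expat bindings. This is only an illustration of the idea, not what axolx ran, and the function name `check_well_formed` is made up. It reports the line, column offset, and message of the first well-formedness error, or `None` if the document parses.

```python
import xml.parsers.expat


def check_well_formed(data: bytes):
    """Return (line, offset, message) for the first XML error, or None if OK."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return (err.lineno, err.offset, message)
    return None


# Hypothetical usage with an export file:
#   with open("wordpress.xml", "rb") as handle:
#       print(check_well_formed(handle.read()))
```

Like most parsers, this stops at the first error, so the usual workflow is to fix that line in a text editor and run the check again until it returns `None`.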

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

ioanmar’s picture

@axolx what tool did you use to find and fix the errors using xmllint? Is xmllint accessible through the terminal (Mac OS)?

xurizaemon’s picture

@weborion, it seems to be available on the OSX system I'm using, but any XML validator should be fine.

ioanmar’s picture

Can you suggest a user-friendly XML validator that would do the job?

xurizaemon’s picture

#1469754-9: XML error on import says he used http://www.cometdocs.com/xmllint.htm, while axolx says he used xmllint from the command line.

bigwavemaui’s picture

Hi,

Nothing I do seems to work for migrating from WordPress 3.4.1 to a fresh Drupal 7 install with the newest plugins.
http://www.cometdocs.com/xmllint.htm
doesn't do it for the WordPress .xml file, and I tried a bunch of other validation tools. Also,
<wp:wxr_version>1.1</wp:wxr_version>
is in place.

I hate to fail; the Drupal thing is so attractive. Thanks for any suggestions on how to proceed.

Thanks