Active
Project: Content Construction Kit (CCK)
Version: 7.x-2.x-dev
Component: upgrade path
Priority: Normal
Category: Bug report
Assigned: Unassigned
Reporter:
Created: 11 Feb 2011 at 23:18 UTC
Updated: 25 Jun 2012 at 15:38 UTC
I get a bunch of constraint errors due to duplicate values, both when reading and writing during the data migration. I'm not sure how these got into the field in the first place, but they were halting the migration midway because the error output killed the batch process ("500 server error"). For some errors (like duplicate values) there is no particular action to take other than informing the user; dropping the duplicate values is a reasonable fix in many situations, for example.
Attached is a patch that adds a try/catch around the read and write blocks, similar to the field creation error handling.
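The idea of catching errors per row and reporting them, rather than letting one failure kill the batch, can be sketched roughly as follows. This is not the attached patch: it is plain PHP with no Drupal dependencies, `example_migrate_rows()` and the stand-in writer are hypothetical names, and in Drupal 7 the exception would more likely be a PDOException thrown by the database layer.

```php
<?php
/**
 * Hypothetical helper: write each row, collecting failures instead of aborting.
 *
 * @param array $rows
 *   Rows to write.
 * @param callable $write_row
 *   Callback that writes one row and throws an Exception on failure.
 *
 * @return array
 *   Error messages keyed by the index of the row that failed.
 */
function example_migrate_rows(array $rows, $write_row) {
  $errors = array();
  foreach ($rows as $key => $row) {
    try {
      call_user_func($write_row, $row);
    }
    catch (Exception $e) {
      // Record the problem and carry on with the next row.
      $errors[$key] = $e->getMessage();
    }
  }
  return $errors;
}

// Stand-in for the real insert: rejects a key it has already seen.
$seen = array();
$write_row = function (array $row) use (&$seen) {
  $pk = $row['entity_id'] . '-' . $row['delta'];
  if (isset($seen[$pk])) {
    throw new RuntimeException("Duplicate entry '$pk'");
  }
  $seen[$pk] = TRUE;
};

$rows = array(
  array('entity_id' => 440, 'delta' => 0),
  array('entity_id' => 440, 'delta' => 0),
  array('entity_id' => 441, 'delta' => 0),
);
$errors = example_migrate_rows($rows, $write_row);
```

Here the duplicate second row ends up in `$errors` while the third row is still written. Whether dropping a duplicate is acceptable depends on the site's data, so the collected messages would still need to be shown to the user.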
| Comment | File | Size | Author |
|---|---|---|---|
| | data-error.patch | 2.34 KB | owen barton |
Comments
Comment #1
karens commented
You don't seem to have the latest code that fixed the revisions handling, which was causing duplicate value errors. See if that fixes the problem. We had a try/catch in here originally and it was removed because there were reports that the try/catch seemed to be causing errors if it got triggered during batch processing.
Comment #2
dww
FWIW, I'm using master from today and I fairly regularly get these sorts of exceptions when trying to run this stuff via the UI:
etc. It does seem like try/catch would be nice, to prevent this from killing the batch and halting the whole migration.
@KarenS: do you have a link to an issue about "the latest code that fixed the revisions handling" -- seems like it's not fully fixed or something. I'd try to investigate and try to make that more robust, if you can give me a pointer where to look.
Thanks!
-Derek
Comment #3
owen barton commented
The code just after KarenS's comment about "the latest code that fixed the revisions handling" did fix that error for me, I think, so perhaps this is something else.
Comment #4
bfroehle commented
dww: The code Karen was referring to is in #1043956: Content Migrate fails for nodes with revisions / http://drupalcode.org/project/cck.git/commitdiff/f5df077d
Comment #5
karens commented
We had a try/catch in there originally and it was removed because of reports that it created problems when used in batch processing.
@dww it looks like you are the only one seeing this, so I have no idea what your problems are.
Comment #6
dww
@all: Closer investigation is revealing bugs with Batch API. Not yet sure if it's Batch API itself, or content_migrate's use of it. :/ But, I'm finding that when trying to migrate fields from one of my node types (my image nodes, with imagefield and a few text + int fields for metadata), after 1 second of processing, the batch starts over from scratch. The $context object passed to _content_migrate_batch_process_migrate_data() is empty again. :( So, that's why I'm hitting the duplicate key errors, since it really is trying to reinsert data for nodes it already processed. Peppering the code with watchdog() calls further confirms this. I wasn't able to track down the actual bug yet. I'd like to be able to reproduce on another site, to make sure it's not something insane with my local dev environment. If this is reproducible and I find an underlying bug, I'll definitely let y'all know.
Comment #7
karens commented
What is the state of this issue? Do we have a reproducible bug?
Comment #8
dww
Funny you should ask. ;) I'm back to working on this site after some time away from that particular project. I couldn't get content_migrate to work *at all* on my laptop now. I'm immediately getting PHP errors that kill the batch as soon as I try to launch any conversion. I'm using the end of the master branch for cck (7.x-2.x-dev) and the end of the 7.x branch for core. The error is that _content_migrate_batch_process_create_fields() is getting a $context argument that's the string "teaser", not a real batch context, so this line dies a horrible death:
Since you can't use a string as an array. I haven't tried to debug what's happening with these batches, nor have I had a chance to move the whole thing to a test site on a publicly accessible server and try again for another data point. Sadly, due to how much trouble I hit last time on all the data migration bugs (and my fear that D7 overall is just not stable enough for prime-time), we had to make the reluctant choice to bail on D7 for the moment and just try to get the site working again on D6 for now. So, that's what I'm doing for the rest of this week. Mostly it's not a waste of time, since it's all porting and conversions I need to be doing anyway...
I'll definitely have to come back to this at some point to see if I can get us up to D7, but it'll probably be at least a few months, maybe more. Sorry I can't provide more useful info now. :( I'm going to leave this in "needs more info" for now, since if I were you, I wouldn't consider this reply enough info to call this "active" again. ;)
Thanks,
-Derek
Comment #9
karens commented
The fatal errors got fixed I think, whenever you want to try this again.
Comment #10
dww
@KarenS: in the master branch of cck's content_migrate? Just wondering what to upgrade before I try again. Thanks!
Comment #11
karens commented
Yes, the master branch.
Comment #12
davidhk commented
I've run into this same problem. I'm using the current (2012-Apr-15) dev version of CCK with Drupal 7.14.
I'm trying to migrate a nodereference field. I've successfully migrated several other nodereference fields on this site (plus several other field types) but this one always fails with error:
An AJAX HTTP error occurred.
HTTP Result Code: 500
Debugging information follows.
Path: /batch?id=2026&op=do
StatusText: Service unavailable (with message)
ResponseText: PDOException: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'node-440-0-0-und' for key 'PRIMARY': INSERT INTO {field_data_field_place}
Like #6 above, I added watchdog statements to the code, and see the same problem: _content_migrate_batch_process_migrate_data($field_name, &$context) runs OK for the first couple of batches, then it's called with what looks like an empty $context. That resets the migration, causing it to try to migrate a node it has already processed and triggering the error shown above.
Specifically I added this line at the start of _content_migrate_batch_process_migrate_data:
watchdog('cck-migrate', 'progress= ' . $context['sandbox']['progress'] );
Then in Drupal's log I see a PHP error: "Notice: Undefined index: progress in _content_migrate_batch_process_migrate_data()"
I've rolled this migration forward / backward several times, and one thing that's fishy is that sometimes it fails on the third call to _content_migrate_batch_process_migrate_data, and sometimes on the fourth. (There are almost 7,000 nodes to process, so there should be approximately 70 calls to the batch process in all.)
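A logging line of this kind can avoid the "Undefined index" notice by checking the sandbox before reading it. The sketch below is plain PHP with a hypothetical helper name, and the 'max' key is an assumption about what the sandbox holds. Note that in the usual Batch API pattern the sandbox is legitimately empty on the first call of an operation, so an empty sandbox only points to lost state when it shows up on a later call.

```php
<?php
/**
 * Hypothetical helper: describe the sandbox state without assuming keys exist.
 */
function example_describe_progress(array $context) {
  if (empty($context['sandbox'])) {
    // Expected on a first call; on a later call it means the state was lost.
    return 'sandbox is empty';
  }
  $sandbox = $context['sandbox'];
  $progress = isset($sandbox['progress']) ? $sandbox['progress'] : 'unset';
  $max = isset($sandbox['max']) ? $sandbox['max'] : 'unset';
  return 'progress=' . $progress . ' max=' . $max;
}

// In Drupal 7 the returned string could be passed on to watchdog().
$first = example_describe_progress(array('sandbox' => array()));
$later = example_describe_progress(array(
  'sandbox' => array('progress' => 100, 'max' => 7000),
));
```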
Here's how the field appears on the Migrate fields page:
Field: field_place
Field type: node_reference
Content type(s):
•Film / Movie
•Image
•Image (external)
•Image (gwulo)
•Place
Other information
•The field uses the view -- to determine referenceable nodes. You will need to manually edit the view and add a display of type 'References'.
•Missing formatter: The 'node_reference_hidden' formatter used in 1 view modes for the field_place field is not available, these displays will be reset to the default formatter.
•Missing formatter: The 'node_reference_hidden' formatter used in 1 view modes for the field_place field is not available, these displays will be reset to the default formatter.
I'm not sure where to look next - suggestions welcome!
Regards, David
Comment #13
davidhk commented
A bit more digging around and as far as I can tell the problem is happening in the batch API. _batch_process() in batch.inc calls cck's _content_migrate_batch_process_migrate_data() repeatedly until the migration is complete.
It calls it repeatedly until one second has passed, then it finishes but triggers a new HTTP request that starts it off again.
In #12 above _batch_process() can call cck's _content_migrate_batch_process_migrate_data() 3 or 4 times before the one-second time limit expires. When the new HTTP request calls _batch_process(), for some reason the $context it passes in to _content_migrate_batch_process_migrate_data() doesn't have the right data, and we get the 'Duplicate entry' error.
My workaround is to edit _content_migrate_batch_process_migrate_data() and change the for loop to be a while loop that processes all the nodes in one go, effectively avoiding the batch API altogether.
Not sure if it is relevant but the test server I'm using for this is running XAMPP 1.7.7
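
The failure mode described in this comment can be reproduced in miniature. The sketch below is a simulation in plain PHP, not content_migrate's actual code: `example_batch_op()`, the `$written` array standing in for the field data table, and the per-call limit are all hypothetical. It follows the usual sandbox pattern, where an empty sandbox means "start from the beginning", which is why a lost context leads straight to a duplicate write.

```php
<?php
/**
 * Hypothetical batch operation following the usual sandbox pattern.
 *
 * Processes up to $limit ids per call and records them in $written,
 * which stands in for a table with a primary key on the id.
 */
function example_batch_op(array $ids, $limit, array &$written, array &$context) {
  if (empty($context['sandbox'])) {
    // First call, or the state was lost: start from the beginning.
    $context['sandbox']['progress'] = 0;
    $context['sandbox']['max'] = count($ids);
  }
  $slice = array_slice($ids, $context['sandbox']['progress'], $limit);
  foreach ($slice as $id) {
    if (isset($written[$id])) {
      throw new RuntimeException("Duplicate entry 'node-$id'");
    }
    $written[$id] = TRUE;
    $context['sandbox']['progress']++;
  }
  $max = $context['sandbox']['max'];
  $context['finished'] = $max ? $context['sandbox']['progress'] / $max : 1;
}

$ids = range(1, 10);
$written = array();
$context = array('sandbox' => array());

// Two calls with the sandbox preserved: ids 1 to 6 are written.
example_batch_op($ids, 3, $written, $context);
example_batch_op($ids, 3, $written, $context);

// Simulate a new request arriving with an empty context.
$context = array('sandbox' => array());
$error = NULL;
try {
  example_batch_op($ids, 3, $written, $context);
}
catch (RuntimeException $e) {
  $error = $e->getMessage();
}
```

In this simulation, passing count($ids) as the limit finishes everything in a single call, which mirrors the workaround above: the sandbox never has to survive a request boundary, at the cost of losing the protection against timeouts that batching provides.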