Migrate_d2d knows the column naming structure used by D5 CCK and tries to work out field subvalues from the column names. But it gets confused if a node has two CCK fields where one name is a prefix of the other (in our case, field_triprequest_budget and field_triprequest_budgetnotes). In this case, d2d treats field_triprequest_budgetnotes as a subvalue of field_triprequest_budget and creates queries that read field_triprequest_budget_otes_value. The 'n' is lost because it assumes that a _ or : follows the matched substring.

I'm attaching a patch that fixes this problem for me, but I don't believe it's a complete solution, because other field names could still cause the problem. (I.e., I don't think there's a perfect solution to the problem.) Another way around this issue, if possible, is to rename the field in the source DB so that it doesn't contain the name of another field.

Attached file: d5.inc_.patch (556 bytes), by darrylri

Comments

EclipseGc

Priority: Normal » Major
Status: Needs work » Reviewed & tested by the community

Confirmed this issue exists and that the patch solves it, at least in my case. Would love to see this fixed soonish.

Eclipse

mikeryan

Version: 7.x-2.0 » 7.x-2.x-dev
Status: Reviewed & tested by the community » Needs work

Just like module hook namespacing issues, eh?

I think making the check simply strncmp($field_name . '_', $column_name, strlen($field_name) + 1) would work just as well for your case, but neither would work if that longer field name was field_triprequest_budget_notes. In that case, getFieldTypeColumns() has no way, with the information available to it, to know for sure whether the column field_triprequest_budget_notes_value is the 'value' data for field_triprequest_budget_notes or the 'notes_value' data for field_triprequest_budget. Actually, it only knows about one field at a time, so it's always going to accept anything that looks like it could be a subfield of that field.
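To make the difference concrete, here is a minimal sketch comparing a prefix-only check with the underscore-terminated strncmp() check suggested above. The two helper functions are hypothetical names invented for illustration; they are not the actual migrate_d2d code, and the real check in d5.inc may look different.

```php
<?php
// Hypothetical helpers for illustration only; not the migrate_d2d source.

// Loose check: accepts any column that merely starts with the field name.
function looks_like_subfield_loose($field_name, $column_name) {
  return strncmp($field_name, $column_name, strlen($field_name)) === 0;
}

// Stricter check: the field name must be followed by an underscore.
function looks_like_subfield_strict($field_name, $column_name) {
  $prefix = $field_name . '_';
  return strncmp($prefix, $column_name, strlen($prefix)) === 0;
}

$short = 'field_triprequest_budget';

// The loose check wrongly claims the budgetnotes column; the strict one does not.
var_dump(looks_like_subfield_loose($short, 'field_triprequest_budgetnotes_value'));  // bool(true)
var_dump(looks_like_subfield_strict($short, 'field_triprequest_budgetnotes_value')); // bool(false)

// Still ambiguous when the longer field is named field_triprequest_budget_notes:
// the column could be 'value' of the longer field or 'notes_value' of the shorter.
var_dump(looks_like_subfield_strict($short, 'field_triprequest_budget_notes_value')); // bool(true)
```

The last call shows why a single-field check cannot settle the question by itself: the stricter test still accepts a column that may belong to a different field.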

So, the field detection needs to have more context, at least for fields in content_type_% tables. Basically, get all the columns from the table once, keep track of which ones it's already accounted for, and check against field names in descending order of field name length (meaning, it will favor the shorter possible subfield - 'value' over 'notes_value' in the scenario above).
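A rough sketch of that approach, assuming the full list of field names and the full list of table columns are both available up front. The function name assign_columns_to_fields() and its return shape are made up for this example and do not exist in migrate_d2d; it also only handles the _ separator, not the : case.

```php
<?php
// Hypothetical sketch; names and return shape are illustrative assumptions.

/**
 * Assigns each table column to at most one field.
 *
 * @param string[] $field_names Field names known for the content type.
 * @param string[] $columns     All column names of the content_type_% table.
 * @return array Map of field name => array of subfield => column name.
 */
function assign_columns_to_fields(array $field_names, array $columns) {
  // Longest field names first, so the most specific name claims a column first.
  usort($field_names, function ($a, $b) {
    return strlen($b) - strlen($a);
  });

  $claimed = array();
  $result = array();
  foreach ($field_names as $field_name) {
    $prefix = $field_name . '_';
    $result[$field_name] = array();
    foreach ($columns as $column) {
      if (isset($claimed[$column])) {
        // Already accounted for by a longer field name.
        continue;
      }
      if (strncmp($prefix, $column, strlen($prefix)) === 0) {
        $subfield = substr($column, strlen($prefix));
        $result[$field_name][$subfield] = $column;
        $claimed[$column] = TRUE;
      }
    }
  }
  return $result;
}

$fields = array('field_triprequest_budget', 'field_triprequest_budget_notes');
$columns = array(
  'nid',
  'vid',
  'field_triprequest_budget_value',
  'field_triprequest_budget_notes_value',
);
print_r(assign_columns_to_fields($fields, $columns));
```

With these inputs, field_triprequest_budget_notes_value is claimed as the 'value' subfield of field_triprequest_budget_notes before the shorter field gets a chance to read it as 'notes_value'. A table that really does have a 'notes_value' subfield on the shorter field would still be misread, so this is a heuristic rather than a guarantee.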