The Migration class is built on the fundamental assumption that there is a one-to-one correspondence between source rows coming in and destination objects going out. What if on the source side you have multiple values spread among multiple database rows for one or more of your fields? For example, if your source were a Drupal site, your first stab at a source query to migrate nodes and related terms might look like

$query = db_select('node', 'n')
         ->fields('n', array('nid', 'title')
         ->fields('tn', array('tid'));
$query->leftJoin('term_node', 'tn', 'n.vid=tn.vid');

The problem is, if you have multiple terms for a node, you'll end up with multiple rows for the same node feeding into the migration:

1 First node  23
1 First node  89
2 Second node 23
...

That won't work - you'll create 'First node' with term 23, then the second row will be skipped because nid 1 has already been migrated. What you want is a way to deal with one node at a time, and pull in all related terms with that node. Here are a couple of options:

GROUP_CONCAT

If your source data is in a MySQL database, you can use GROUP_CONCAT to pull the multiple values into a single row:

$query = db_select('node', 'n')
         ->fields('n', array('nid', 'vid', 'title'));
$query->leftJoin('term_node', 'tn', 'n.vid=tn.vid');
$query->addExpression('GROUP_CONCAT(DISTINCT tn.tid)', 'tag_list');
$query->groupBy('n.nid');

The results of your query then look like:

1 First node  23,89
2 Second node 23

Make sure you tell the field mapping to expect multiple comma-separated values (and, since we're migrating tids rather than term names, to expect tids):

$this->addFieldMapping('field_tags', 'tag_list')
     ->separator(',');
$this->addFieldMapping('field_tags:source_type')
     ->defaultValue('tid');

Query in prepareRow

An alternative (and the only option if your source data isn't in MySQL) is to pull the data in after the initial query.

$this->addFieldMapping('field_tags', 'tag_list')
     ->description('Populated in prepare()');
$this->addFieldMapping('field_tags:source_type')
     ->defaultValue('tid');
...
public function prepareRow($row) {
  $row->tag_list = db_select('term_node', 'tn')
                   ->fields('tn', array('tid'))
                   ->condition('vid', $row->vid)
                   ->execute()
                   ->fetchCol();
}

$row->tag_list will be array(23, 89) for node 1, array(23) for node 2, etc.

Comments

patrick.thurmond@gmail.com’s picture

So how would this work for file fields? I don't get that one and I would really like to know.

Patrick Thurmond
patrickthurmond.com

smichel’s picture

MySQL has a size limit on text returned by GROUP_CONCAT. The default value is 1024, so if you are querying to get a large number of items, it can truncate the results on you. It doesn't warn you, of course, and I couldn't even find anything in the error logs about this.

It's explained at in the MySQL documentation.. I had a case where the site I was migrating from had hundreds of files attached to nodes, and it took a while to figure out why it wasn't importing them all.

The solution is to edit my.cnf, and add a line like:

group_concat_max_len = 16K

Which worked for me.

bdimaggio’s picture

Shouldn't the call to prepare() above be prepareRow()? I haven't used either yet, but the docs here indicate that we'd need an $entity as the first argument if you meant prepare.

gaas’s picture

Yes. I've fixed it now.

Bhanuji’s picture

@bdimaggio

As in link, prepareRow() method is called by the source class next() method so it needs to call first then comes prepare, where prepare() method is called by the destination class prepare() method.

lpeabody’s picture

This isn't really a valid example and should cause PHP to fatally error. Specifically I'm referring to building up the SQL query. Building the query in this example should generate:

PHP Fatal error: Call to a member function fields()

<?php
$query = db_select('node', 'n')  // returns a new SelectQuery object
         ->leftJoin('term_node', 'tn', 'n.vid=tn.vid')  // operates on SelectQuery but returns a string (alias of the joined table I believe)
         ->fields('n', array('nid', 'title')  // fails because it's trying to operate on a string
         ->fields('tn', array('tid'));
?>

I would recommend updating the example with a valid solution.

adaddinsane’s picture

The idea of running a query for every row is fairly bad - I'm importing 6000 users currently and it's managing about 60/minute, and that's *not* running a query for every user to fetch their roles (damn I wish I'd read about GROUP_CONCAT before).

Anyway as a better alternative, run one query in the __construct() to fetch all the data you need into an array, and then in prepareRow() just extract the data from the array. (I should say that I'm migrating data from a Drupal 6 database to Drupal 7.)

  protected $userRoles = array();

  function __construct() {
    ...
    // Separately collect and store user roles
    $this->collectUserRoles();
    ...

calls this:

  protected function collectUserRoles() {
    $result = Database::getConnection('default', 'for_migration')
      ->select('users_roles', 'ur')->fields('ur', array('uid', 'rid'))
      ->orderBy('ur.uid')->execute();
    while ($row = $result->fetchAssoc()) {
      $this->userRoles[$row['uid']][$row['rid']] = $row['rid'];
    }
  }

and then:

  public function prepareRow($row) {
    // Build any extra roles for this user
    // (and remove the existing setting which is a dummy)
    $row->rids = array_key_exists($row->uid, $this->userRoles)
      ? implode(',', $this->userRoles[$row->uid])
      : '';
    return TRUE;
  }

And that's it.

Steve Turnbull, D8 developer

Like Science Fiction, Fantasy or Steampunk stories? Check out https://taupress.com/book-list

msbrar’s picture

Thanks. This helped me. I was getting a csv feed where user name of the author was passed in the csv but i did not have the uid. The users had been migrated seperately out of the migrate process. I used the above snippet like this

public function __construct() {
...
$columns = array(
<>
array('User', 'Author Name'),
array('uid', 'Author Id') //this column was introduced even though it was not there in the csv
);
//called a method to collect user ids and user names from table .
$this->collectUserIds();
....
$this->addFieldMapping('uid', 'uid');
...
}

public function prepareRow($row) {
// Build any extra roles for this user
// (and remove the existing setting which is a dummy)
$row->uid = array_key_exists($row->User, $this->userIds)
? $this->userIds[$row->User]
: 1;
return TRUE;
}
protected function collectUserIds() {
$result = db_select('users', 'u')
->fields('u', array('name', 'uid'))
->execute();
while ($row = $result->fetchAssoc()) {
$this->userIds[$row['name']] = $row['uid'];
}
}

ovidenov’s picture

Tip:
Don't forget to use groupBy, if you are using GROUP_CONCAT. Failing to do so will result in only one result row.

ryansians’s picture

Has anyone tried migrating commerce file fields (https://www.drupal.org/project/commerce_file) with multiple source data rows?
It only migrated 1 id where it should have multiple id's.
I've checked the query in the source database, it outputs correctly, my problem is upon migration.

Here's what I did.

$query->addExpression('GROUP_CONCAT(file.field_file_fid)', 'file_id');
.
.
.
.
$this->addFieldMapping('commerce_file', 'file_id')
        ->sourceMigration('PrivateFile')
        ->separator(',');
$this->addFieldMapping('commerce_file:file_class')
        ->defaultValue('MigrateFileFid');
$this->addFieldMapping('commerce_file:preserve_files')
        ->defaultValue(1);

Thanks for any help!