Last updated April 16, 2013. Created by mikeryan on January 2, 2011.
Edited by joetsuihk, Gisle. Log in to edit this page.
The Migration class is built on the fundamental assumption that there is a one-to-one correspondence between source rows coming in and destination objects going out. What if on the source side you have multiple values spread among multiple database rows for one or more of your fields? For example, if your source were a Drupal site, your first stab at a source query to migrate nodes and related terms might look like
<?php
$query = db_select('node', 'n')
->fields('n', array('nid', 'title')
->fields('tn', array('tid'));
$query->leftJoin('term_node', 'tn', 'n.vid=tn.vid');
?>The problem is, if you have multiple terms for a node, you'll end up with multiple rows for the same node feeding into the migration:
1 First node 23
1 First node 89
2 Second node 23
...That won't work - you'll create 'First node' with term 23, then the second row will be skipped because nid 1 has already been migrated. What you want is a way to deal with one node at a time, and pull in all related terms with that node. Here are a couple of options:
GROUP_CONCAT
If your source data is in a MySQL database, you can use GROUP_CONCAT to pull the multiple values into a single row:
<?php
$query = db_select('node', 'n')
->fields('n', array('nid', 'vid', 'title');
$query->leftJoin('term_node', 'tn', 'n.vid=tn.vid');
$query->addExpression('GROUP_CONCAT(DISTINCT tn.tid)', 'tag_list');
$query->groupBy('n.nid');
?>The results of your query then look like:
1 First node 23,89
2 Second node 23Make sure you tell the field mapping to expect multiple comma-separated values (and, since we're migrating tids rather than term names, to expect tids):
<?php
$this->addFieldMapping('field_tags', 'tag_list')
->separator(',')
->arguments(array('source_type' => 'tid'));
?>Query in prepareRow
An alternative (and the only option if your source data isn't in MySQL) is to pull the data in after the initial query.
<?php
$this->addFieldMapping('field_tags', 'tag_list')
->description('Populated in prepare()')
->arguments(array('source_type' => 'tid'));
...
public function prepareRow($row) {
$row->tag_list = db_select('term_node', 'tn')
->fields('tn', array('tid'))
->condition('vid', $row->vid)
->execute()
->fetchCol();
}
?>$row->tag_list will be array(23, 89) for node 1, array(23) for node 2, etc.
Comments
So how would this work for
So how would this work for file fields? I don't get that one and I would really like to know.
Patrick Thurmond
pjerky.com
GROUP_CONCAT Limit
MySQL has a size limit on text returned by GROUP_CONCAT. The default value is 1024, so if you are querying to get a large number of items, it can truncate the results on you. It doesn't warn you, of course, and I couldn't even find anything in the error logs about this.
It's explained at in the MySQL documentation.. I had a case where the site I was migrating from had hundreds of files attached to nodes, and it took a while to figure out why it wasn't importing them all.
The solution is to edit my.cnf, and add a line like:
group_concat_max_len = 16KWhich worked for me.
Shouldn't the call to
Shouldn't the call to prepare() above be prepareRow()? I haven't used either yet, but the docs here indicate that we'd need an $entity as the first argument if you meant prepare.
Yes. I've fixed it now.
Yes. I've fixed it now.
prepareRow will call first
@bdimaggio
As in link, prepareRow() method is called by the source class next() method so it needs to call first then comes prepare, where prepare() method is called by the destination class prepare() method.
Invalid example
This isn't really a valid example and should cause PHP to fatally error. Specifically I'm referring to building up the SQL query. Building the query in this example should generate:
<?php$query = db_select('node', 'n') // returns a new SelectQuery object
->leftJoin('term_node', 'tn', 'n.vid=tn.vid') // operates on SelectQuery but returns a string (alias of the joined table I believe)
->fields('n', array('nid', 'title') // fails because it's trying to operate on a string
->fields('tn', array('tid'));
?>
I would recommend updating the example with a valid solution.
Or this way...
The idea of running a query for every row is fairly bad - I'm importing 6000 users currently and it's managing about 60/minute, and that's *not* running a query for every user to fetch their roles (damn I wish I'd read about GROUP_CONCAT before).
Anyway as a better alternative, run one query in the __construct() to fetch all the data you need into an array, and then in prepareRow() just extract the data from the array. (I should say that I'm migrating data from a Drupal 6 database to Drupal 7.)
<?php
protected $userRoles = array();
function __construct() {
...
// Separately collect and store user roles
$this->collectUserRoles();
...
?>
calls this:
<?phpprotected function collectUserRoles() {
$result = Database::getConnection('default', 'for_migration')
->select('users_roles', 'ur')->fields('ur', array('uid', 'rid'))
->orderBy('ur.uid')->execute();
while ($row = $result->fetchAssoc()) {
$this->userRoles[$row['uid']][$row['rid']] = $row['rid'];
}
}
?>
and then:
<?phppublic function prepareRow($row) {
// Build any extra roles for this user
// (and remove the existing setting which is a dummy)
$row->rids = array_key_exists($row->uid, $this->userRoles)
? implode(',', $this->userRoles[$row->uid])
: '';
return TRUE;
}
?>
And that's it.
Steve Turnbull
The Drupal7-ish Blog
Migrating nodes when user name is provided
Thanks. This helped me. I was getting a csv feed where user name of the author was passed in the csv but i did not have the uid. The users had been migrated seperately out of the migrate process. I used the above snippet like this
public function __construct() {
...
$columns = array(
<>
array('User', 'Author Name'),
array('uid', 'Author Id') //this column was introduced even though it was not there in the csv
);
//called a method to collect user ids and user names from table .
$this->collectUserIds();
....
$this->addFieldMapping('uid', 'uid');
...
}
public function prepareRow($row) {
// Build any extra roles for this user
// (and remove the existing setting which is a dummy)
$row->uid = array_key_exists($row->User, $this->userIds)
? $this->userIds[$row->User]
: 1;
return TRUE;
}
protected function collectUserIds() {
$result = db_select('users', 'u')
->fields('u', array('name', 'uid'))
->execute();
while ($row = $result->fetchAssoc()) {
$this->userIds[$row['name']] = $row['uid'];
}
}
Double post
(ooops)
Steve Turnbull
The Drupal7-ish Blog