Creating your own source class

Implementing your own source class (say, because you're migrating from a database system not supported by Migrate, or from some unusual file format) is not that difficult. Derive a class from MigrateSource and implement a few methods; how simple those methods are depends on your source.

Here's a very simple source class that generates ten sample source rows. For more sophisticated examples, review the source classes implemented by Migrate itself in plugins/sources.

<?php
class SimpleMigrateSource extends MigrateSource {
  protected $currentId;
  protected $numRows;

  // Your constructor will initialize any parameters to your migration. It's
  // important to pass through the options, so general options such as
  // cache_counts will work.
  public function __construct($num_rows = 10, array $options = array()) {
    parent::__construct($options);
    $this->numRows = $num_rows;
  }

  /**
   * Return a string representing the source, for display in the UI.
   */
  public function __toString() {
    return t('Generate %num sample rows', array('%num' => $this->numRows));
  }

  /**
   * Returns a list of fields available to be mapped from the source,
   * keyed by field name.
   */
  public function fields() {
    return array(
      'id' => t('ID'),
      'title' => t('Title'),
      'body' => t('Body'),
    );
  }

  /**

MigrateSourceCSV

The MigrateSourceCSV source class lets you use a CSV file as a migration source.

The class constructor has the following parameters:

  • $path: The full system filepath to the source CSV file.
  • $csvcolumns: An array describing the CSV file's columns. Keys are integers (or may be omitted), values are an array of field name then description. This may be left empty if the CSV file has a header row: see below.
  • $options: An array of options. See below for options relevant here.
  • $fields: Optional - keys are field names, values are descriptions. Use to override the default descriptions, or to add additional source fields which the migration will add via other means (e.g., prepareRow()).

The options used by the CSV source class are:

  • header_rows: The number of rows to count as headers. If this is set, you can pass an empty array for $csvcolumns.
  • embedded_newlines: Set to TRUE if your input file has embedded newlines which throw the record count off. Setting this does make getting the record count significantly slower.
  • 'length', 'delimiter', 'enclosure', 'escape': These are passed as parameters to the PHP fgetcsv() function.
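
To make the shape of $csvcolumns and the fgetcsv() options more concrete, here is a minimal sketch in plain PHP. It does not use MigrateSourceCSV itself and is not how the class is implemented internally; the sample data, the column list, and the semicolon and single-quote settings are all assumptions chosen for illustration. It only shows how a column description of the form "field name, then description" can be applied to rows read with those four options.

```php
<?php
// Hypothetical column description in the shape $csvcolumns expects:
// integer keys, each value an array of field name then description.
$csvcolumns = array(
  0 => array('id', 'ID'),
  1 => array('title', 'Title'),
  2 => array('body', 'Body'),
);

// Sample data held in memory so the sketch runs without a real file.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "1;'First page';'Hello; world'\n2;'Second page';'More text'\n");
rewind($handle);

// The four options named above, in fgetcsv() parameter order.
// A length of 0 means the line length is not limited.
$options = array(
  'length' => 0,
  'delimiter' => ';',
  'enclosure' => "'",
  'escape' => '\\',
);

$rows = array();
while (($data = fgetcsv($handle, $options['length'], $options['delimiter'], $options['enclosure'], $options['escape'])) !== FALSE) {
  // Map each positional value to its field name from $csvcolumns.
  $row = new stdClass();
  foreach ($csvcolumns as $index => $column) {
    $row->{$column[0]} = $data[$index];
  }
  $rows[] = $row;
}
fclose($handle);
```

After the loop, $rows holds one stdClass object per CSV line, with id, title and body properties. The delimiter inside the quoted body value of the first line is kept as part of the field because it is enclosed.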

MigrateSourceSQL

Basics

When migrating directly from a database system which is supported by the Drupal database API, and accessible to the destination server where you're running Migrate, use the MigrateSourceSQL source class. In its simplest form, just construct a query of the source data you need and pass it to the MigrateSourceSQL constructor:

$query = db_select('example_pages', 'p')
         ->fields('p', array('pgid', 'page_title', 'page_body'));
$this->source = new MigrateSourceSQL($query);

Do not execute the query - execution will be controlled by the Migrate module.

It is important to understand that the query you use as your source must return a single row for each object to be created. While it may seem natural at first blush to join to, say, a "category" table which has multiple rows for a given content item, that will produce multiple rows in the source query - see the topic Multiple source data rows for suggestions on dealing with such situations.

The above example assumes that the example_pages table is in your default Drupal database - usually, however, your source data will be in a different database. If you have defined a connection named external_db in settings.php, your query above would look like:

<?php

Commonly implemented Migration methods

While simple migrations may get by with just a constructor, in many real-world cases you'll find a need to implement one or more of these Migration class methods:

function prepareRow($row)

The prepareRow() method is called by the source class next() method, after loading the data row. The argument $row is a stdClass object containing the raw data as provided by the source. There are two primary reasons to implement prepareRow():

  1. To modify the data row before it passes through any further methods and handlers: for example, fetching related data, splitting out source fields, combining or creating new source fields based on some logic.
  2. To conditionally skip a row (by returning FALSE).
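
Both uses can be sketched in a few lines of plain PHP. The classes below are hypothetical stand-ins so that the sketch runs without Drupal or Migrate installed; StubMigration is not the real Migration class, and the title and tag_list fields are invented for illustration. Only the prepareRow() contract described above is modeled.

```php
<?php
// Hypothetical stand-in for the parent migration class. Its prepareRow()
// accepts every row.
class StubMigration {
  public function prepareRow($row) {
    return TRUE;
  }
}

class ExampleMigration extends StubMigration {
  public function prepareRow($row) {
    // Let the parent class reject the row first.
    if (parent::prepareRow($row) === FALSE) {
      return FALSE;
    }
    // Reason 2: skip rows conditionally by returning FALSE.
    if (trim($row->title) === '') {
      return FALSE;
    }
    // Reason 1: create a new source field from the raw data.
    $row->tags = array_map('trim', explode(',', $row->tag_list));
    return TRUE;
  }
}

// Feed two sample rows through prepareRow(), keeping those not skipped.
$migration = new ExampleMigration();
$source_rows = array(
  (object) array('title' => 'First', 'tag_list' => 'news, drupal'),
  (object) array('title' => ' ', 'tag_list' => ''),
);
$kept = array();
foreach ($source_rows as $row) {
  if ($migration->prepareRow($row) !== FALSE) {
    $kept[] = $row;
  }
}
```

Here $kept ends up holding only the first row, now carrying a tags array built from tag_list; the second row is skipped because its title is blank.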

Consider this example, where there is additional data not directly accessible to the source class (e.g., the source is an XML feed and there's also related data from a database, or vice versa):

<?php
public function prepareRow($row) {
  // Always include this fragment at the beginning of every prepareRow()
  // implementation, so parent classes can ignore rows.
  if (parent::prepareRow($row) === FALSE) {
    return FALSE;
  }

  $related_data = $this->getRelatedData($row->id);
  // If marked as spam in the related data, skip this row
  if ($related_data->spam) {
    return FALSE;
  }

  // Add the related data of interest

External references

People have applied the Migrate module in many different contexts, using it in different ways. Fortunately, some have taken the time to share their experiences:

Migration classes

The Migrate module provides an abstract class named Migration - in most cases, you'll do nearly all your work in extending this class. The Migration class handles the details of performing the migration for you - iterating over source records, creating destination objects, and keeping track of the relationships between them.

Steps to take:
