File classes

File classes, implementing the PHP interface MigrateFileInterface, are the key to the file import approach. A file class represents the logic for taking a particular representation of a file (such as a URL or a database blob) and tying it to a Drupal 7 file entity in the appropriate way. Each file field mapping, or migration to a file destination, will have a file class associated with it, and that file class determines what options and fields are available when mapping it.

There are three file classes provided by the Migrate module itself:

MigrateFileUri

This is the default file class - the one that will be used if you don't specify a file class. When using this class, the primary value (the value you map to a file or image field, or to 'value' in MigrateDestinationFile) is taken to be a URI or a local filespec, and the assumption is that the resource referenced will be copied into a Drupal file scheme.

It takes the following subfields and options:

source_dir - This is, logically, the source directory relative to which the primary value is interpreted. As a practical matter, it's a prefix applied to all primary values. That is, the migration process will look for the source file at $source_dir/$value. If omitted, the primary value will be treated as an absolute path. Drupal must be able to access that path. In most cases a public URL will suffice, e.g. https://drupal.org/

destination_dir - This represents the parent path within Drupal to store the file. It should use a Drupal-defined stream wrapper such as public://. If omitted, it defaults to public:// for file destinations, and the configured file directory for file and image fields.

destination_file - This is the filename relative to destination_dir at which to store the file. If omitted, the filename portion of the primary value will be used. If you have a hierarchical structure you wish to preserve, pass the same value here as you do for the primary value. For example, suppose you implement a migration given physical files at:

/mnt/files/images/foobar.jpg
/mnt/files/images/2012/newfile.png
/mnt/files/misplaced.gif

and a database table with these filename values:

images/foobar.jpg
images/2012/newfile.png
misplaced.gif

with these mappings:

$this->addFieldMapping('value', 'filename');
$this->addFieldMapping('source_dir')
     ->defaultValue('/mnt/files');

You will end up with files in Drupal at public://foobar.jpg, public://newfile.png, and public://misplaced.gif. However, if you add

$this->addFieldMapping('destination_file', 'filename');

you will get public://images/foobar.jpg, public://images/2012/newfile.png, and public://misplaced.gif.
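
The path arithmetic in this example can be sketched in plain PHP. The two helpers below are hypothetical illustrations, not part of Migrate's API, and they only approximate how MigrateFileUri combines source_dir, the primary value, destination_dir, and destination_file:

```php
<?php
// Hypothetical helpers for illustration only; Migrate does not define them.

// Where the source file is looked up: $source_dir/$value, or $value alone.
function example_source_path($value, $source_dir = NULL) {
  if ($source_dir === NULL || $source_dir === '') {
    return $value;
  }
  return rtrim($source_dir, '/') . '/' . ltrim($value, '/');
}

// Where the file lands: destination_dir plus destination_file, falling back
// to the filename portion of the primary value.
function example_destination_uri($destination_dir, $value, $destination_file = NULL) {
  $file = ($destination_file === NULL) ? basename($value) : ltrim($destination_file, '/');
  // Keep a bare scheme such as "public://" intact; otherwise join with one slash.
  if (substr($destination_dir, -3) === '://') {
    return $destination_dir . $file;
  }
  return rtrim($destination_dir, '/') . '/' . $file;
}

$value = 'images/2012/newfile.png';
echo example_source_path($value, '/mnt/files'), "\n";
// Without a destination_file mapping only the filename portion survives.
echo example_destination_uri('public://', $value), "\n";
// Mapping the same value to destination_file preserves the hierarchy.
echo example_destination_uri('public://', $value, $value), "\n";
```

The last two lines correspond to the two outcomes described above: public://newfile.png without the destination_file mapping, and public://images/2012/newfile.png with it.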

file_replace - This determines the behavior when attempting to import a file and discovering there is already a file at the destination location. It takes three possible values, two of them core constants and one a Migrate extension:

MigrateFile::FILE_EXISTS_REUSE - If the destination file already exists, do not attempt to copy in the source file - just use the existing file (creating a file entity for it if one doesn't already exist).

FILE_EXISTS_REPLACE - Copy the new source file over the pre-existing one, rewriting the pre-existing file entity if it exists. This is a core constant, so it takes no MigrateFile:: prefix.

FILE_EXISTS_RENAME - Create a new, unique name for the incoming file (by appending _0, _1, etc.) and make a file entity pointing to that. This is the default behavior. Like FILE_EXISTS_REPLACE, it is a core constant.

    $this->addFieldMapping('field_my_image:file_replace')
         ->defaultValue(FILE_EXISTS_REPLACE);

preserve_files - Normally, importing and rolling back a migration involving file imports will delete the files on rollback. Sometimes this is not desirable - for example, one approach to file migration may be to copy all the files into their desired location in the Drupal installation and use MigrateFile::FILE_EXISTS_REUSE to just link to them, or you may have a symbolic link to a read-only mounted filesystem containing the files. Setting preserve_files to TRUE will cause Migrate to add a file_usage entry so that the deletion of referencing entities or fields will not delete the actual file. Note that preserve_files requires Migrate 7.x-2.6 or later for use with a file_class of MigrateFileFid.

    $this->addFieldMapping('field_my_image:preserve_files')
         ->defaultValue(TRUE);

urlencode (new in Migrate 2.6) - When fetching files from a remote URL, if any of the path components contain special characters like % they need to be encoded - MigrateFileUri will do this encoding by default. However, sometimes the characters are already encoded in your source data - in this case, you can disable the encoding:

    $this->addFieldMapping('field_my_image:urlencode')
         ->defaultValue(0);
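
To see why already-encoded source data needs urlencode disabled, the sketch below percent-encodes each path component with PHP's rawurlencode(). It only illustrates the double-encoding problem. example_encode_path() is a hypothetical helper, and Migrate's own encoding logic may differ in detail:

```php
<?php
// Hypothetical helper: percent-encode each path component, keeping the slashes.
function example_encode_path($path) {
  return implode('/', array_map('rawurlencode', explode('/', $path)));
}

// Raw source data: encoding is needed and yields a usable URL path.
echo example_encode_path('my files/a b.jpg'), "\n";
// Already-encoded source data: each % is itself encoded again as %25.
echo example_encode_path('my%20files/a%20b.jpg'), "\n";
```

The second call turns %20 into %2520, which is the kind of broken URL that disabling urlencode avoids.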

MigrateFileUri can be used in combination with file source data to import an existing directory of files.

An example:


/**
 * Migration for files.
 */
class ExampleFileMigration extends Migration {

  protected $baseDir;

  public function __construct($arguments) {
    parent::__construct($arguments);
    $this->description = t('Import files.');
    $this->baseDir = '/path/to/source/directory';

    $this->map = new MigrateSQLMap($this->machineName,
      array(
        'sourceid' => array(
          'type' => 'varchar',
          'length' => 255,
          'not null' => TRUE,
          'description' => t('Source ID'),
        ),
      ),
      MigrateDestinationFile::getKeySchema()
    );

    $directories = array(
      $this->baseDir,
    );

    // Edit to include the desired extensions.
    $allowed = 'jpg jpeg gif png txt';
    if (module_exists('file_entity')) {
      $allowed = variable_get('file_entity_default_allowed_extensions', $allowed);
    }
    $file_mask = '/^.*\.(' . str_replace(array(',', ' '), '|', $allowed) . ')$/i';
    $list = new MigrateListFiles($directories, $this->baseDir, $file_mask);
    // Send FALSE as second argument to prevent loading of file data, which we
    // don't need.
    $item = new MigrateItemFile($this->baseDir, FALSE);
    $fields = array('sourceid' => t('File name with path'));
    $this->source = new MigrateSourceList($list, $item, $fields);
    $this->destination = new MigrateDestinationFile('file', 'MigrateFileUri');

    // Save to the default file scheme.
    $this->addFieldMapping('destination_dir')
      ->defaultValue(variable_get('file_default_scheme', 'public') . '://');
    // Use the full file path in the file name so that we retain the directory
    // structure.
    $this->addFieldMapping('destination_file', 'destination_file');
    // Set the value to the file name, including path.
    $this->addFieldMapping('value', 'file_uri');
    // Uncomment this if you want to replace existing files.
    // $this->addFieldMapping('file_replace')
    //   ->defaultValue(FILE_EXISTS_REPLACE);
  }

  public function prepareRow($row) {
    if (parent::prepareRow($row) === FALSE) {
      return FALSE;
    }

    $row->file_uri = $this->baseDir . $row->sourceid;

    // Remove the leading forward slash.
    $row->destination_file = substr($row->sourceid, 1);
  }

}
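
The $file_mask expression in the example turns a list of extensions into a regular-expression alternation. One caveat is that a list separated by a comma plus a space produces an empty alternative (jpg||png), which also matches names ending in a bare dot. The sketch below shows that behavior and a slightly more defensive builder. example_file_mask() is a hypothetical helper, not something Migrate provides:

```php
<?php
// Hypothetical helper: build a case-insensitive extension mask from a list
// separated by spaces, commas, or both.
function example_file_mask($allowed) {
  $extensions = preg_split('/[\s,]+/', trim($allowed), -1, PREG_SPLIT_NO_EMPTY);
  $quoted = array();
  foreach ($extensions as $extension) {
    // Escape characters that are special inside a regular expression.
    $quoted[] = preg_quote($extension, '/');
  }
  return '/^.*\.(' . implode('|', $quoted) . ')$/i';
}

// The approach used in the example migration, applied to "jpg, png".
$naive = '/^.*\.(' . str_replace(array(',', ' '), '|', 'jpg, png') . ')$/i';
var_dump(preg_match($naive, 'README.'));

// The defensive builder rejects the same name but still accepts real images.
$mask = example_file_mask('jpg, png');
var_dump(preg_match($mask, 'README.'));
var_dump(preg_match($mask, '/images/2012/newfile.PNG'));
```

For a space-separated list such as 'jpg jpeg gif png txt', both approaches produce the same mask.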

MigrateFileBlob

This file class will interpret the primary value (the value you map to a file or image field, or to 'value' in MigrateDestinationFile) as file data (for example, from a database blob field) to be saved as a true file in Drupal.

MigrateFileBlob has the same subfields and options as MigrateFileUri, except for source_dir, which does not apply. The one difference in interpretation is that destination_file is required.

MigrateFileFid

This file class interprets the primary value as an existing file entity ID (fid). Apart from preserve_files (described above), it has no subfields or options, and is only relevant with fields, not with file destinations. This is used to tie files migrated to MigrateDestinationFile to a file field:

$this->addFieldMapping('field_my_image', 'source_filename')
     ->sourceMigration('Image');
$this->addFieldMapping('field_my_image:file_class')
     ->defaultValue('MigrateFileFid');

This can also be used if you already have the files inside of Drupal or you are handling the import by some other means. For example, you can use the prepareRow() function in your migration class to add a list of fids to a fids property on $row.

  public function prepareRow($row) {
    // ...
    $row->fids = $this->myfunctionWhichFindsFids($row);
    // ...
  }

Then you can specify the use of the MigrateFileFid in your mappings like this:

$this->addFieldMapping('field_images', 'fids');
$this->addFieldMapping('field_images:file_class')
     ->defaultValue('MigrateFileFid');
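
The myfunctionWhichFindsFids() call in the prepareRow() example is a placeholder. One way such a helper might work, assuming you have already built a lookup table from source filenames to Drupal fids (the function name and the $fid_by_filename map are both hypothetical):

```php
<?php
// Hypothetical helper: translate source filenames into existing Drupal fids.
function example_find_fids(array $filenames, array $fid_by_filename) {
  $fids = array();
  foreach ($filenames as $filename) {
    // Skip files that were never imported rather than mapping an empty fid.
    if (isset($fid_by_filename[$filename])) {
      $fids[] = (int) $fid_by_filename[$filename];
    }
  }
  return $fids;
}

// Assumed lookup table; in practice it might come from a query on the
// file_managed table or from an earlier migration's map.
$map = array('a.jpg' => 12, 'b.jpg' => 15);
print_r(example_find_fids(array('a.jpg', 'missing.jpg', 'b.jpg'), $map));
```

The result is a plain indexed array of integers, which is the shape a multi-value field mapping with MigrateFileFid expects.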

File destinations

To migrate files directly to file entities, define a migration with a destination of MigrateDestinationFile. The constructor takes three arguments. First is the entity bundle, which defaults to 'file' - this is what you'll usually use with core file entities. Second is the file class, described above, which defaults to MigrateFileUri. Third is an array of destination options, passed through to the parent destination class - MigrateDestinationFile defines none of its own.

The critical field to map in a file migration is 'value' - this is the primary value, dependent (per above) on the file class. Besides the file class fields described above, the key fields available for mapping are uid (owner of the file) and timestamp (upload date of the file). A simple blob migration would look like this:

$query = db_select('legacy_file_data', 'f')
         ->fields('f', array('imageid', 'imageblob', 'filename', 'file_ownerid'));
$this->source = new MigrateSourceSQL($query);
$this->destination = new MigrateDestinationFile('file', 'MigrateFileBlob');
$this->addFieldMapping('value', 'imageblob');
$this->addFieldMapping('destination_file', 'filename');
$this->addFieldMapping('uid', 'file_ownerid')
     ->sourceMigration('User');

File and image fields

To migrate files directly into a file or image field, you can map the subfields directly. Note that you generally don't need to specify destination_dir, which is defaulted in the field definition, and that you should not specify destination_file if migrating multiple values per field (i.e., with an array for image_filename).

$this->addFieldMapping('field_image', 'image_filename');
$this->addFieldMapping('field_image:source_dir')
     ->defaultValue('/mnt/files');
$this->addFieldMapping('field_image:alt', 'image_alt');
$this->addFieldMapping('field_image:title', 'image_title');

Implementing your own file class

To make your own file class, implement MigrateFileInterface, with concrete implementations of fields() (documenting any options or fields your implementation supports) and processFile() (which will take whatever the incoming file representation is and return a file entity).

This complete example, from Migrate Extras, implements support for migrating YouTube URLs into the media_youtube module:

class MigrateExtrasFileYoutube implements MigrateFileInterface {
  /**
   * We have no custom fields
   */
  static public function fields() {
    return array();
  }

  /**
   * Implementation of MigrateFileInterface::processFile().
   *
   * @param $value
   *  The URI or local filespec of a file to be imported.
   * @param $owner
   *  User ID (uid) to be the owner of the file.
   * @return object
   *  The file entity being created or referenced.
   */
  public function processFile($value, $owner) {
    // Convert the Youtube URI into a local stream wrapper.
    $handler = new MediaInternetYouTubeHandler($value);
    $destination = $handler->parse($value);

    // Create a file entity object for this Youtube reference, and see if we
    // can get the video title.
    $file = file_uri_to_object($destination, TRUE);
    if (empty($file->fid) && $info = $handler->getOEmbed()) {
      $file->filename = truncate_utf8($info['title'], 255);
    }
    $file = file_save($file);
    if (is_object($file)) {
      return $file;
    }
    else {
      return FALSE;
    }
  }
}

Comments

david.gil

As for Migrate 2.3 in D7, image fields use a different syntax from the one presented here for source_dir, alt, title...
The beer example has it correct:


    $arguments = MigrateFileFieldHandler::arguments(drupal_get_path('module', 'migrate_example'),
      'file_copy', FILE_EXISTS_RENAME, NULL, array('source_field' => 'image_alt'),
      array('source_field' => 'image_title'), array('source_field' => 'image_description'));
    $this->addFieldMapping('field_migrate_example_image', 'image')
         ->arguments($arguments);

cmalek

I had to migrate a library of images plus associated articles from a site where one image in the library can be used in many articles. I want to share how I finally got this to work with MigrateFileFid. I wanted to end up with Drupal having an Article content type with a field_images file field that could contain an unlimited number of images.

First, I migrated the images with their own migration class. Then in my article migration class, I have a prepareRow() function build an array() of the Drupal fids of the images I needed to link to the article. Then:

$this->addFieldMapping('field_images', 'fids');
$this->addFieldMapping('field_images:file_class')
     ->defaultValue('MigrateFileFid');

Note that 'fids' here is an array() of integers.

oknate

FYI, if your file is using the media module, you'll need to enable migrate_extras to migrate files from Drupal 6 to Drupal 7.

First, migrate the files.

Then in your migrate class:


      // Map the image field to the new image field, base this on the Files migration
      $this->addFieldMapping('field_image', 'field_images')
        ->sourceMigration('Files');
      $this->addFieldMapping('field_image:file_class')
        ->defaultValue('MigrateFileFid');
      $this->addFieldMapping('field_image:preserve_files')
        ->defaultValue(TRUE);

I spent a long time trying to figure out why this wouldn't work. I figured it was due to the media module. Finally, I saw that migrate_extras provides support for it.

jcandan

For those wishing to Migrate directly into a File Field or an Image Field, the following might clear up any misunderstanding.

$this->destination = new MigrateDestinationCommerceProduct('commerce_product', 'product');

$this->addFieldMapping('commerce_file','filename');
$this->addFieldMapping('commerce_file:file_class')->defaultValue('MigrateFileBlob');

You may also find this useful (not documented, nor listed as a destination):

$this->addFieldMapping('commerce_file:description', 'description');

I was able to do this with Amazon S3 and FileField_Paths, but in my case had to specify

$this->addFieldMapping('commerce_file:destination_file', 'filename');

Where filename contained what was the filefield_paths custom token path.

I was also able to pass filename as a non-associative array, built within prepareRow()! However, I was not able to assign a commerce_file:description for each file in the array. :(

rsbecker

The example of a file class for importing directories was very helpful. Thanks.

But it has one small typo problem and I have another issue it doesn't solve.

The typo is in the line.

  $this->base_Dir = ...

It needs to be:

  $this->baseDir = ...

The problem it doesn't solve for me arises because a novice (me) set the site up originally, and it did not have the transliteration module installed. So there are spaces and other undesirable characters in file names. I cannot figure out how in the prepareRow() to replace those characters with '_' and still have the class work properly. If I use str_replace() on the destination_file or file_uri fields the class breaks because $sourceid no longer matches.
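
One possible approach to this filename problem, sketched in plain PHP and not tested against Migrate itself: leave sourceid and file_uri untouched, since they must still match the files on disk, and clean only the value mapped to destination_file. example_clean_destination() is a hypothetical helper:

```php
<?php
// Hypothetical helper: replace undesirable characters in each path component
// while keeping the directory separators.
function example_clean_destination($path) {
  $parts = explode('/', $path);
  foreach ($parts as &$part) {
    // Any run of characters other than letters, digits, dot, underscore, or
    // hyphen becomes a single "_".
    $part = preg_replace('/[^A-Za-z0-9._-]+/', '_', $part);
  }
  unset($part);
  return implode('/', $parts);
}

echo example_clean_destination('images/2012/my photo (1).png'), "\n";
```

In the prepareRow() of the directory example, this would mean assigning the cleaned result of substr($row->sourceid, 1) to $row->destination_file and leaving $row->file_uri as it is.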

elephant.jim

S'pose you have a separate migration just for image files (and aren't migrating files as part of, say, a node migration directly into a field). If you're using file_entity or media along with migrate_extras, then you can set the $bundle of the MigrateDestinationFile to 'image' to get access to the extra $fields field_file_image_alt_text and field_file_image_title_text.

For example, if you're migrating PDFs, then the code sample above that says $this->destination = new MigrateDestinationFile('file', 'MigrateFileUri'); is entirely appropriate.

But s'pose you're migrating JPEGs or PNGs or GIFs. Then you'll want to adjust the bundle on the destination: $this->destination = new MigrateDestinationFile('image', 'MigrateFileUri');. Among your field mappings, you'll want something like:

    $this->destination = new MigrateDestinationFile('image', 'MigrateFileUri');
    $this->addFieldMapping('value', 'my_source_uri');
    $this->addFieldMapping('field_file_image_alt_text', 'my_source_alt_text');
    $this->addFieldMapping('field_file_image_title_text', 'my_source_title_text');

You can check all the extra fields you get with the bundle change by running drush migrate-fields-destination MyFileMigration.

Hope this helps someone out in the ether(net).

Alan D.

Having hit an unusual use case, an image source directory where the images always have the same filenames, I created a special MigrateItem class to take this into account. It creates the hash from the file properties rather than just the filename.


/**
 * Implementation of MigrateItemFile.
 *
 * Overrides the hash to take file properties into account.
 */
class MigrateItemFileProperties extends MigrateItemFile {

  /**
   * Generate a hash of the source file.
   *
   * @param $row
   *
   * @return string
   */
  public function hash($row) {
    migrate_instrument_start('MigrateItemFileProperties::hash');
    // Overkill, checks size, ctime, width and height.
    $properties = array(
      'row_hash' => md5(serialize($row)),
      'ctime' => 0,
      'size' => 0,
      'width' => 0,
      'height' => 0,
    );

    if ($data = stat($row->file_uri)) {
      $properties['ctime'] = $data['ctime'];
      $properties['size'] = $data['size'];
    }

    // Get the image width / height.
    $data = @getimagesize($row->file_uri);
    if (isset($data) && is_array($data)) {
      $properties['width'] = $data[0];
      $properties['height'] = $data[1];
    }

    $hash = md5(serialize($properties));
    migrate_instrument_stop('MigrateItemFileProperties::hash');

    return $hash;
  }
}

And it's as simple as flushing your registry and changing the class.

  $item = new MigrateItemFileProperties($this->baseDir, FALSE);

Total overkill, just size & ctime would probably be enough ;)


Alan Davison
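
A pared-down version of the same idea, as a plain PHP sketch outside of Migrate: fingerprint a file using only the size and ctime reported by stat(). example_file_stat_hash() is a hypothetical helper, not part of MigrateItemFile:

```php
<?php
// Hypothetical helper: fingerprint a file by its size and inode change time.
function example_file_stat_hash($path) {
  // stat() results are cached within a request; clear them to read fresh values.
  clearstatcache(TRUE, $path);
  $data = @stat($path);
  $properties = array(
    'ctime' => $data ? $data['ctime'] : 0,
    'size' => $data ? $data['size'] : 0,
  );
  return md5(serialize($properties));
}

// Demonstration on a temporary file: the hash changes when the content does.
$path = tempnam(sys_get_temp_dir(), 'mig');
file_put_contents($path, 'first version');
$before = example_file_stat_hash($path);
file_put_contents($path, 'second, longer version');
$after = example_file_stat_hash($path);
unlink($path);
var_dump($before !== $after);
```

Because the two writes differ in length, the size alone is enough to change the hash here. A same-size rewrite would depend on ctime, which has one-second resolution.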