Just discovered this code of MigrateDestinationFile::import() while I was wondering why the updated fields of a field entity weren't stored:

    if ($existing_files = file_load_multiple(array(), array('uri' => $file->uri))) {
      // Existing record exists. Reuse it.
      $file = reset($existing_files);
      // TODO: Do we really need to save again? Copied from File Field.
      // $file = file_save($file);
    }
    else {
      // Get this orphaned file into the file table.
      $file->fid = NULL;
      $file->status |= FILE_STATUS_PERMANENT; // Save a write in file_field_presave().
      $file = file_save($file);
    }

Since files are entities too we really should consider to call the save. Otherwise the contents of the fields won't be updated even if the image is still the same.
Suggestion from my side - change the code to this:

    if (!file_load_multiple(array(), array('uri' => $file->uri))) {
      // Get this orphaned file into the file table.
      $file->fid = NULL;
      $file->status |= FILE_STATUS_PERMANENT; // Save a write in file_field_presave().
    }
    $file = file_save($file);
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

mikeryan’s picture

Assigned: Unassigned » moshe weitzman
Status: Needs work » Active

Seems to make sense - Moshe, you committed the lines in question, any thoughts?

Thanks.

moshe weitzman’s picture

Committed suggested fix to 7.x-2.x. Looks right to me. I think I was trying to save cycles but I miunderstood that file_save() is not part of file api but rather entity api.

moshe weitzman’s picture

Status: Active » Fixed
das-peter’s picture

Status: Fixed » Needs work

Thank you very much for the change - unfortunately it looks like I missed one important thing before.
I played around with the change today and got nice exceptions like this one (I wonder why I didn't get them before):

SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry    [error]
'public://MyImage.JPG' for key 'uri': UPDATE
{file_managed} SET uid=:db_update_placeholder_0,
filename=:db_update_placeholder_1, uri=:db_update_placeholder_2,
filemime=:db_update_placeholder_3, filesize=:db_update_placeholder_4,
status=:db_update_placeholder_5, timestamp=:db_update_placeholder_6,
type=:db_update_placeholder_7
WHERE  (fid = :db_condition_placeholder_0) ; Array
(
    [:db_update_placeholder_0] => 2
    [:db_update_placeholder_1] => MyImage.JPG
    [:db_update_placeholder_2] => public://MyImage.JPG
    [:db_update_placeholder_3] => image/jpeg
    [:db_update_placeholder_4] => 1255583
    [:db_update_placeholder_5] => 1
    [:db_update_placeholder_6] => 1319894500
    [:db_update_placeholder_7] => image
    [:db_condition_placeholder_0] => 8
)
 (drupal/includes/common.inc:6909)
Drush command terminated abnormally due to an unrecoverable error.

I don't get the point why this happens - this is a freak-in UPDATE statement and not an INSERT. But it's likely I don't know enough about MySQL to understand that or my setup is somehow broken (I've the Media Module with its File Entity Module installed).
Thus it would be nice if someone could give this a try - if the error happens on other setups I suggest to change the code to remove the $file->uri before update:

   if (!file_load_multiple(array(), array('uri' => $file->uri))) {
      // Get this orphaned file into the file table.
      $file->fid = NULL;
      $file->status |= FILE_STATUS_PERMANENT; // Save a write in file_field_presave().
    }
    if ($updating) {
      unset($file->uri);
    }
    $file = file_save($file);
    migrate_instrument_stop('file_save');
    $this->complete($file, $row);

Better ideas very much appreciated :D

drewish’s picture

Status: Needs work » Postponed (maintainer needs more info)

I'm guessing you're not running into a Migrate bug. The URI field has a unique key but it's not case sensitive so if you have "public://foo" and "public://Foo" that'll throw the exception. See #966210: DB Case Sensitivity: system_update_7061() fails on inserting files with same name but different case for more.

donquixote’s picture

Status: Postponed (maintainer needs more info) » Needs work

Something else is wrong with the code in MigrateDestinationFile::import() for file saving.

<?php
    if (!file_load_multiple(array(), array('uri' => $file->uri))) {
?>

$file->uri is NOT the primary key for files!
The primary key is $file->fid.

Effect:
The migration will create new files, instead of updating existing ones.
In some cases this might not matter, but it does matter if you create stub files (such as, for user avatars).

The following things have to be checked:
- Is the $file->fid set, or are we creating a new file?
- If $file->fid is set, does this file actually exist?
- Does another file exist with the same filepath (uri)? Is this our friend that has the same fid, or is this something else?
- If $file->fid is set, and the file entity with that fid does exist already, do we have a modified uri?

So, here is the logic we should be using:

<?php
    // Look for existing files
    $existing_by_uri = file_load_multiple(array(), array('uri' => $file->uri));
    // TODO:
    //   We already load $old_file above, but only for Migration::DESTINATION.
    $old_file = file_load($file->fid);
    if (empty($file->fid)) {
      // We are creating a new file entity.
      if (!empty$existing_by_uri)) {
        // Another file entity does already exist, which refers to the same physical file.
        // -> dedupe the uri!
        // In fact, we should have done this before already.
        // So, we rather throw an exception.
        throw new MigrateException(t('Trying to create a new file entity for an already-existing physical file location.'));
      }
      else {
        // We are safe to create a new file.
        .. // create
      }
    }
    elseif (empty($old_file)) {
      // Trying to save a file with explicit fid,
      // but there is no existing entity to overwrite.
      // Do we allow this?
    }
    elseif {
      // This is an update.
      if ($old_file->uri === $file->uri) {
        // uri has not changed, but the file contents might have.
        // At least, we don't need to be afraid of conflicts with other file entities.
      }
      elseif (!empty($existing_by_uri)) {
        // Another file entity does already exist, with the same file location (uri).
        // -> dedupe the uri!
        // In fact, we should have done this before already.
        // So, we rather throw an exception.
        throw new MigrateException(t('Trying to update a file entity with an uri that is already occupied.'));
      }
      else {
        // uri has changed. file contents might have changed.
        // Delete the old one, create the new file, update the uri.
      }
    }
?>

Or am I missing something?

donquixote’s picture

Amazing, now this does actually work :)

Here is a very basic patch.

EDIT:
There is probably a lot of redundancy with other parts of code, where things are already being checked.

Also, the dedupe stuff that I talk about in the comments, needs to happen somewhere. This all depends how the $file objects are prefilled with uri etc, stuff that is happening before this function.
Honestly, I did not study any of this.

donquixote’s picture

Btw, about dedupe, there is also this beast still lurking.
#1358318: dedupe() overdedupes on --update
(for anyone who tries to work on this, this will be useful to know)

mikeryan’s picture

Status: Needs work » Fixed

All this stuff about URIs has nothing to do with the original issue, it should be opened as a separate issue.

donquixote’s picture

Damn, now I don't remember any of this.
Issue: #1400284: Offer options on how to handle uri collisions

I could wait for a better time to post this, but there is also the chance that I will totally drop it. So I rather do it now.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.