I'm migrating between two installations on the same box and the use of rawurlencode() breaks the copy() on windows when the file names had spaces in them. I report back in a hour or two on how unix is handling this.

The quick hack was just:

-    $this->sourcePath = $prefix . rawurlencode($basename);
+    $this->sourcePath = $prefix . $basename;

Comments

alan d.’s picture

And also fails on Unix :(

Workaround for a File field

// Remember to define this in your module.info file
// files[] = filename_containing_class.inc
class MigrateLocalFile extends MigrateFileUri {
  protected function copyFile($destination) {
    // Perform the copy operation, with a cleaned-up path.
    $basename = basename($this->sourcePath);
    $prefix = substr($this->sourcePath, 0, strlen($this->sourcePath) - strlen($basename));
    // This class is defined just to remove rawurlencode() from this line.
    $this->sourcePath = $prefix . $basename;
    if (!@copy($this->sourcePath, $destination)) {
      throw new MigrateException(t('The specified file %file could not be copied to ' .
        '%destination.',
        array('%file' => $this->sourcePath, '%destination' => $destination)));
    }
    else {
      return TRUE;
    }
  }
}

Usage

    // Standard File field mapping
    $this->addFieldMapping('field_resources', 'field_documents__filename');
    $this->addFieldMapping('field_resources:source_dir')
        ->defaultValue(variable_get('htmigrate_file_path', ''));
    $this->addFieldMapping('field_resources:description', 'field_documents__description');
    // This forces migrate to use our class rather than MigrateFileUri
    $this->addFieldMapping('field_resources:file_class')
        ->defaultValue('MigrateLocalFile');
David_Rothstein’s picture

Title: Local files and rawurlencode() in MigrateFileUri » Local files and rawurlencode() don't work correctly in MigrateFileUri
Priority: Minor » Normal
StatusFileSize
new711 bytes

I'm using a slightly older version of Migrate, but the rawurlencode() was giving me trouble too, and for remote files in that case (basically anything with an unusual character in the remote file name wouldn't copy correctly).

Is it possible rawurlencode() isn't needed here at all?

Here's the patch I used to remove it. It doesn't apply against the latest 7.x-2.x dev code though (I'm running a bit of an older version), so I'm leaving the issue status at "active" for now.

mikeryan’s picture

Category: bug » feature

That piece of code has evolved to urlencode all segments of the path, so won't apply.

Simply removing the urlencoding will break existing migrations depending on it, although I have also run into situations where it's not appropriate (when the incoming paths were already encoded) - we should add a boolean option 'urlencode', defaulting to TRUE for compatibility, then you'll be able to set it to FALSE if it gives you trouble.

mikeryan’s picture

Issue tags: +Migrate 2.6

Tagging feature requests I'd like to get into Migrate 2.6.

mikeryan’s picture

Status: Active » Fixed

Done, set urlencode to 0 to prevent encoding.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

evanbarter’s picture

StatusFileSize
new2.32 KB

I backported this to 2.5.

donquixote’s picture

Issue summary: View changes
Status: Closed (fixed) » Needs work

I think the real problem here is that MigrateFileUri::urlencode() is meant only for web urls like http://.., but it also applies to filesystem schema uris like private://.. or public://..

As a simple fix, I would suggest to extend the test done in MigrateFileUri::urlencode(), instead of introducing a setting.

  /**
   * Urlencode all the components of a remote filename.
   *
   * @param $filename
   *
   * @return string
   */
  static public function urlencode($filename) {

    // Detect if this is a URL.
    $pos = strpos($filename, '://');
    if ($pos === FALSE) {
      // This is clearly not a URL.
      return $filename;
    }
    $protocol = substr($filename, 0, $pos);
    if ($protocol === 'private' || $protocol === 'public') {
      // This is a stream wrapper file path, not a remote url.
      return $filename;
    }

    // This is a remote URL, so apply the rawurlencode() on all components.
    $components = explode('/', $filename);
    foreach ($components as $key => $component) {
      $components[$key] = rawurlencode($component);
    }
    $filename = implode('/', $components);
    // Actually, we don't want colons encoded
    return str_replace('%3A', ':', $filename);
  }

@mikeryan:
I'm reopening, but maybe you can clarify why we need a setting for this? Then we could close it again, if I am wrong.
And why do we need rawurlencode() for a url that goes into copy() ? Does copy() need encoded remote urls, but raw local paths?

And if so, how would I apply this setting?
E.g. in my Migrate subclass, for a filefield:

    $this->addFieldMapping('field_document_file', 'doc_file');

(I had a look at the doc page at https://www.drupal.org/node/1540106, but was a little confused because in many of these examples the first argument to addFieldMapping() does not look like a Drupal field name.)

donquixote’s picture

Or for a more complete understanding:
- File paths with public:// or private:// or anything that passes file_valid_uri() is a local file and should not ever be urlencoded, no matter what the setting says.
- File paths with http:// or http:// are remote files and need to be urlencoded, BUT the user might already have done this. So here the setting from #5 should apply.
- File paths with e..g randomprotocol:// are tricky, here we have no idea if it is remote or local. We would need some mechanism to find out.

pifagor’s picture

Status: Needs work » Closed (outdated)