Would replacing file_copy with file_move in field_file.inc cause anything to fail?

I ask because my app is designed to store huge data sets and users are uploading files of 1GB and larger. The SAN it's hosted on has awful I/O speed, so there's a long lag after the upload completes while the copy takes place.

And if this would work, will I also have to keep in mind that I can't specify a path on another filesystem?

  if (!file_copy($file, $file->destination, FILE_EXISTS_RENAME)) {
    form_set_error($file->source, t('File upload error. Could not move uploaded file.'));
    watchdog('file', 'Upload error. Could not move file %file to destination %destination.', array('%file' => $file->filename, '%destination' => $file->destination));
    return 0;
  }

Also, I was still setting up filefield when a user uploaded 20GB, so I'd like to manually move the files and edit the DB directly to change the path. I'm pretty comfortable editing the filepath field of the files table, which seems to be the only place that needs changing. Can you confirm this?

Thanks.

Comments

quicksketch

That's a good question. I'm not sure why file_copy() is being used there when it looks like file_move() should work. I haven't tested this out at all, but it's definitely worth investigating.

As for moving files, that'd be a good API improvement (Drupal 7 already has file_move support that updates the database records for you). But if you'll be doing it manually, yes, just update the "files" database table whenever you move a file around and FileField will use the value there.

mudd

Thanks for verifying that, QS.

PS, by looking at file.inc in D6 core I think file_copy and file_move do essentially the same thing (file_move calls file_copy in file.inc). Also, now I'm not seeing any lag -- the large files appear to be moved rather than copied. (I'm using tokens in the path, so the files get uploaded to [path-token-name]/filename then renamed to [path-token-value]/filename.)

So all is well! :)

quicksketch

Category: support » task

Oh, sure enough. So file_move() won't be any faster at all, which is too bad. I'm going to keep this open as a potential "task", since I'm very curious whether we're leaving a copy of the file around on the server somewhere by using file_copy() instead of file_move().

davidredshaw

I've just been looking at the same thing (I'm using field_file_save_file to attach multi-GB files to nodes) and it would definitely be useful to move files rather than copy them (or copy/delete).

Depending on the SAN software, some of them will detect that the copy destination is on the same volume and not actually copy the data, which makes a copy practically instantaneous (and may well be what's happening in mudd's case). That can't be guaranteed, though, so I would say that moving rather than copying would be a valid change.

I presume that file_move does a copy/delete for safety and I suspect that would be difficult to change. It might be worth adding a parameter to field_file_save_file which forces the use of an internal move function rather than the core file_copy. This could then be requested as a change to core file_move, or we could suggest the addition of a new file_rename() function in core (which doesn't exist today).

The point about leaving files around is valid - my calls currently do leave the source files, but whilst I'm still testing it's useful to keep them around...

Overall I think I'd vote for field_file_rename() being added to field_file.inc and the use of it triggered by a parameter which was set by default.

I'm happy to look at this if you think it's a good idea.
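
To make the idea concrete, here is a rough, untested sketch of the kind of helper I have in mind. It is plain PHP rather than Drupal API, and the name example_move_file() is made up for illustration; it is not an existing FileField or core function. It tries rename() first, which on the same filesystem should not need to rewrite the file data, and only falls back to copy plus delete if that fails (for instance when the destination is on another filesystem). Unlike FILE_EXISTS_RENAME, this sketch simply refuses to overwrite an existing destination.

```php
<?php
/**
 * Hypothetical helper: move a file, preferring a cheap rename.
 *
 * Not part of FileField or Drupal core; the name and behaviour are
 * assumptions made for this sketch.
 *
 * @return bool TRUE if the file now lives at $destination, FALSE otherwise.
 */
function example_move_file($source, $destination) {
  // Bail out early if there is nothing to move, or if moving would
  // clobber an existing file at the destination.
  if (!is_file($source) || file_exists($destination)) {
    return FALSE;
  }

  // Cheap path: try a plain rename first. Errors are suppressed so we
  // can fall through to the slower path below.
  if (@rename($source, $destination)) {
    return TRUE;
  }

  // Slow path: copy the data, then remove the original so that no
  // stray copy is left behind on the server.
  if (!@copy($source, $destination)) {
    return FALSE;
  }
  return @unlink($source);
}
```

The calling code would still need to update the filepath in the "files" table after a successful move, as discussed above.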

quicksketch

Status: Active » Closed (fixed)

This change won't be made to FileField, but if you're interested in improving this functionality, I'd suggest changing file_move() in Drupal core, since that will fix FileField in the future (which is being merged into core anyway).