field_file_save_file should take an extra parameter for how it should threat existing files. We almost always use FILE_EXISTS_REPLACE in our code. Inside the function calls to file_destination and file_copy the file replace behavior is hard coded to FILE_EXISTS_RENAME.

This is how it is today:

function field_file_save_file($filepath, $validators = array(), $dest = FALSE, $account = NULL)

Something like this would be nice:

function field_file_save_file($filepath, $validators = array(), $dest = FALSE, $account = NULL, $replace = FILE_EXISTS_RENAME)

Then have the functions file_destination and file_copy use the variable $replace

Comments

quicksketch’s picture

This is hard-coded to FILE_EXIST_RENAME because it makes a new entry in the "files" database table. If you just replaced the file, you'd end up with two database rows pointing at the same physical file, but with different properties (such as mime/type, filesize, etc.). If you just want to replace a file, use core's file_move() or file_save_upload().

erlendstromsvik’s picture

Status: Active » Closed (fixed)

Yes. Discovered the same thing, while going through the code. One problem with this, is that the same image can't be used for different nodes, but that might be a known case and "as described"?
I worked around it with just doing a select fid from db where path = filepath. But because of compatibility I ended up going back to the stock code.

greg.harvey’s picture

Category: feature » support
Status: Closed (fixed) » Active

Bear with me... Re-opening, as this might be useful as an answer to people Googling having hit the same issue (as I did) - not to ask again for a "fix". =)

So, related to this - I have this code in a custom import module:

/**
 * This function adds a file to the named CCK file field.
 * 
 * @param $node
 * (object) Drupal node object to add the file to.
 * 
 * @param $file_temp
 * (string) The full system path to the file.
 * 
 * @param $field_name
 * (string) The field name of the CCK file field to save to.
 * 
 * @return
 * (array) An array representing the Drupal file object saved or
 * FALSE if the file does not exist.
 */
function _drivingforce_import_attach_file($node, $file_temp, $field_name) {
  $image = $file_temp;
  
  if (file_exists($file_temp)) {
    // Load up the CCK field
    $field = content_fields($field_name, $node->type);
  
    // Load up the appropriate validators
    $validators = array_merge(filefield_widget_upload_validators($field), imagefield_widget_upload_validators($field));
  
    // Where do we store the files?
    $files_path = filefield_widget_file_path($field);
  
    // Create the file object, replace existing file with new file as source and dest are the same
    $file = field_file_save_file($image, $validators, $files_path);
  
    // Apply the file to the field, this sets the first file only, could be looped
    $node->$field_name = array(0 => $file);
  
    // Save the node with image data
    node_save($node);
  
    // send back the file array, we might need it
    return $file;
  }
  // the image wasn't there
  else {
    watchdog('drivingforce_import', t('Expected image was missing. %filename was not attached to node of ID %nid.', array('%filename' => basename($file_temp), '%nid' => $node->nid)), array(), WATCHDOG_ERROR);
    return FALSE;
  }
  
}

Am I right in thinking the line $file = field_file_save_file($image, $validators, $files_path); should simply be replaced with the core function, like so:

    // Create the file object, replace existing file with new file as source and dest are the same
    $file =file_save_upload($image, $validators, $files_path, FILE_EXISTS_REPLACE);

Then all will be cool?

quicksketch’s picture

Yes, that should work, with the exception that any modules that implement hook_file_field_insert() will not be triggered. You'll need to make sure that you set the files path manually (as it seems you are), since the filepath will not automatically be read from the $field instance.

quicksketch’s picture

Status: Active » Closed (fixed)
TrevorBradley’s picture

Status: Closed (fixed) » Active

(Mostly documented here for my own sanity)

Doing a bit of research on this, I believe greg.harvey's solution only works if the file has been uploaded via the web, as . I'm trying to do something similar with batch API and a local file store: I'm trying to import node data from a csv, which includes a list of files to read in and import. Many of the nodes share files but the filefield generates duplicates (which is fair for most scenarios)

However, blindly making a copy of the field_file_save_file function and replacing FILE_EXISTS_RENAME with FILE_EXISTS_REPLACE results in bizarre behaviour. Files are uploaded without error but then go missing. file_copy returns true, but after the node is saved the file isn't in the destination). I've finally tracked down how this works and what's breaking.

The normal behaviour for filefield uploading is as follows:

1) Rename existing file (during field_file_file_save)
2) Upload file with current filename (during field_file_file_save)
3) Delete Old file (during node_save's content_update hook)

If you skip step 1, and tell Step 2 to overwrite (with FILE_EXISTS_REPLACE in a modified local version of field_file_file_save), this procedure becomes:

1) Ignore existing file
2) Upload and overwrite existing file
3) Delete Old file (which is also the current file).

So the file exists right up to the moment the node is saved, then the "old" file, also the new file, is deleted, while the database entries remain intact.

The key is in the fieldfield_field_update function, near the bottom is the function:
filefield_field_delete_file($oitem, $field);
If I comment this out my import behaves the way I want it to.

However all of this is nested so deeply in the cck node apis I'm not sure it's easy to replace.

I'm going to keep this closed at the moment, it's not a fix request, just documenting. I think attempting to permit Drupal to keep a single copy of the same file, referenced from multiple nodes is exceptionally more complicated than I feared.

quicksketch’s picture

Status: Active » Closed (fixed)
TrevorBradley’s picture

I'm going to try to keep this closed, honest... (Sorry for my continued ramblings here.. this is more to document what I've been going through than anything else)

I've managed to build my filefield importer. Basically I have a number of conditions checking the existing file. file_field_save_file should only be called sparingly when a new file is truly uploaded.

I thought I'd report on one huge glitch I found in my database that might be useful to others. If for some reason your node has a filefield, the file data entry exists, but for some reason the file is missing, it's impossible to upload another copy of that file with the same name. The copy works fine, but during node_save, the file is saved, and the old file entry (with the same file name) is deleted, and once again you're left with a filefield entry with no actual file attached to it.

There's two solutions for this: 1) upload a file with a different name, then reupload your original file. 2) delete the file, and then reupload. My importer actually does (2)... when it detects a corrupt filefield entry (filefield existing with the filename about to be uploaded, but file doesn't exist, it saves the node without the file, then declares a "do-over", and calls node-save *again* this time with the field entry.

tl;dr: filefields can be badly corrupted by deleting the physical files associated with them. Be careful about this.

EDIT: Just discovered something awesome. filefield_field.inc, line 185, makes reference to a $node->skip_filefield_delete... just set this to a non-null value, things aren't deleted!!!