Filefield suffers from the same critical database problem that imagefield suffers from: deleted files are deleted from the {files} table, but not from the CCK field (residing in the {content_type_TYPE} table). Also see the comments further below in this imagefield issue.

This patch doesn't fix the problem, but adds an additional guard so that non-existing files are not even attempted to be displayed. Please review and apply (this patch is for filefield HEAD).

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

jpetso’s picture

Ok, here's an attempt for the real solution. Rationale:

  • File deletions happen during hook_field($op='update'), so this is where the important stuff belongs.
  • For multiple-value fields, $node_field entries have to be unset so that they are deleted.
  • For single-value fields, once a database entry has been written (either by filefield or by another single-value CCK field), you won't get rid of it with current CCK. So for single-value fields, write a default entry with fid == 0 instead. I can't extract the default value from hook_field_settings, because it's only given in "database string" format there.
  • Because of the latter one, you'll possibly get filefield entries loaded even though they've never been written, or if they have been deleted again. In order to prevent such empty filefield fields from being loaded, they are being removed in hook_field($op='load'). For that matter, filefield_file_load() gets the same check that imagefield got before, which also returns an empty array when fid == 0.
  • The "return '';" in the theme function won't make much of a difference, but makes for more consistency between filefield and imagefield.
jpetso’s picture

As requested on IRC, here's a version with a bit more documentation, and the steps on how to reproduce the bug that this patch fixes. I seem to like bullet lists, so here's for each of the three different cases that can happen if this patch is not applied:

Database inconsistency with a multiple-value upload field

  • Create a content type (named "multitype") containing a filefield (named "multifile") with Multiple on and Required off.
  • Create a node of this type, upload two files, and submit. Have a look at the db, where there's two entries in {content_field_multifile}.
  • Delete both files, and their entries remain in {content_field_multifile}, even if replaced by default entries. They should be deleted instead.
  • This is already fixed in imagefield, with the unset() part in 'update'.

Database inconsistency with a single-value upload field

  • Create a content type (named "singletype") containing a filefield (named "singlefile") with Multiple off and Required off.
  • Create a node of this type, upload nothing, and submit. Have a look at the db, where there's no entry in {content_type_singletype}.
  • Edit the node, upload a file, and submit. Have a look at the db, where there's now a filled-in entry in {content_type_singletype}.
  • Delete the file, and its entry remains unchanged in {content_field_singletype}. The only thing that has changed is that the file itself has been deleted. Ideally, the entry should be deleted again, but current CCK offers no way to do that, because there can be more than one field in {content_type_singletype}. (See the explanation below.)
  • This is not yet fixed in imagefield, and requires the default item assignment in 'update'.

Database inconsistency with two single-value fields, one of which being filefield

  • Create a content type (named "doubletype") containing a filefield (named "coll_damage_file") with Multiple off and Required off, as well as an imagefield (or any other field type instead of imagefield) (named "inflicting_image") also with Multiple off and Required off.
  • Create a node of this type, upload nothing, and submit. Have a look at the db, where there's no entry in {content_type_doubletype}.
  • Edit the node, upload an image to the imagefield, but leave the filefield as is. Have a look at the db, where there's now a filled-in entry for the imagefield in {content_type_singletype}, but also a default entry for the filefield, even though it has never been filled in by the user. The latter can't be avoided as the row has to be filled in, but now causes hook_field($op='load') and all subsequent hooks to be called with a real filefield entry with default values (fid == 0, and stuff).
  • The check and unset() in hook_field($op='load') removes those unwanted items, and as a consequence, all items with default database values (fid == 0).

I consider this last issue to be a shortcoming of CCK, because it places empty single-field items into {content_type_TYPE} even when they are not required, that is, even when they may be empty. And that causes the relatively unexpected behaviour mentioned in the last explanation. ...I guess that's a good opportunity to file a bug against CCK so that it also places single-value, non-required fields into their own {content_field_FIELD} tables. When that happens, we can remove the additional check on $op='load'.

yched’s picture

I consider this last issue to be a shortcoming of CCK, because it places empty single-field items into {content_type_TYPE} even when they are not required, that is, even when they may be empty. And that causes the relatively unexpected behaviour mentioned in the last explanation. ...I guess that's a good opportunity to file a bug against CCK so that it also places single-value, non-required fields into their own {content_field_FIELD} tables. When that happens, we can remove the additional check on $op='load'.

This is not wanted nor required. CCK tries to keep fields in the same (content-type) table whenever possible, for performance reasons, to limit the number of table joins. OK, node data are cached in a dedicated cache table so node_load is OK, but this is no reason to multiply tables. Caching has no effect on views queries, for instance.

Field modules have to be prepared to the fact a value can get stored even for empty field (the value defined as 'default' in hook_field_settings($op = 'database columns') )

As a matter of fact, the right behaviour is the opposite of what you propose : multiple field or not, shared field or not, a value (possibly NULL) should _always_ be stored in the db for empty fields.
This ensures consistency for empty fields between single / non shared fields and multiple or shared fields. That way, you can differentiate records (nodes) for which the field exists but is empty, from records for which the field 'is not defined', and build queries selecting the nodes where a given field is empty / not empty.

jpetso’s picture

As a matter of fact, the right behaviour is the opposite of what you propose : multiple field or not, shared field or not, a value (possibly NULL) should _always_ be stored in the db for empty fields.
This ensures consistency for empty fields between single / non shared fields and multiple or shared fields. That way, you can differentiate records (nodes) for which the field exists but is empty, from records for which the field 'is not defined', and build queries selecting the nodes where a given field is empty / not empty.

Whatever, let it be one way or the other :) Your desired approach would still be less error-prone than the current one, because at the moment you just don't know what to expect unless you dive into content.module's code. No matter what the best behaviour might be, it ought to go all the way. imho.

jpetso’s picture

Oops, forgot to remove the parameter from filefield_default_item(), which was a leftover from when I tried it with filefield_field_settings($op='database columns'). Revised version attached.

jpetso’s picture

Another one, eliminating the $output variable in filefield_field() in favor of direct return statements, like done in imagefield.

jpetso’s picture

Another one, using array_merge() instead of the "+=" operator for adding the $file to $node_field['delta'] in filefield_field($op='load'). Imagefield does so as well.

jpetso’s picture

And yet another revision, this time with better code documentation as requested in the related imagefield issue.

jpetso’s picture

Status: Needs review » Fixed

Committed to filefield HEAD. I am so relieved that this could finally be applied.

Anonymous’s picture

Status: Fixed » Closed (fixed)