Following up on the work done in #115267 we need to add a hook for file operations.

Here's a list of possible operations (taken from dopry's filefield module):

function hook_file(&$file, $op) {}
where $op is one of the following:
'prepare'  // after upload, file is temporary?
'validate' // before save, check quotas, file format, etc
'load'     // allows other modules to load extended file attributes
'save'     // file becomes a non-temporary file?
'delete' 

In #115267 there's still a lot of upload module specific validation that's occurring in file_validate(). I'd like to move that into upload_file($op=validate).

I'd also like to have hook_file($file = null, $op = 'validate') test the users' quotas and permissions so that if they've got file restrictions we can inform them of the before they begin uploading.

CommentFileSizeAuthor
#249 file.php_.txt5.22 KBdrewish
#248 file.php_.txt4.61 KBdrewish
#248 hook_file.patch77.04 KBdrewish
#241 hook_file.patch76.03 KBdrewish
#238 hook_file.patch73.93 KBdrewish
#237 hook_file.patch67.6 KBAnonymous (not verified)
#235 hook_file.patch70.26 KBjpetso
#232 hook_file.patch70.26 KBdrewish
#230 hook_file.patch70.91 KBdrewish
#226 hook_file.patch70.38 KBdrewish
#225 hook_file.patch66.14 KBdrewish
#224 hook_file.patch65.09 KBdrewish
#221 hook_file.patch61.25 KBdrewish
#219 hook_file.patch58.84 KBdrewish
#216 hook_file_c216.patch58.7 KBjpetso
#214 hook_file_c213.patch58.84 KBjpetso
#213 hook_file.patch58.69 KBdrewish
#211 hook_file.patch58.82 KBdrewish
#206 hook_file_c206.patch110.53 KBjpetso
#200 hook_file_142995.patch110.52 KBdrewish
#198 hook_file_142995.patch108.29 KBdrewish
#195 hook_file.patch104.72 KBdrewish
#194 hook_file.patch63.69 KBdrewish
#186 hook_file.patch104.71 KBdrewish
#184 hook_file.patch102.71 KBjhedstrom
#183 hook_file.patch102.38 KBdrewish
#182 hook_file.patch62.19 KBdrewish
#181 hook_file.patch62.19 KBdrewish
#180 hook_file_142995.patch87.92 KBjhedstrom
#175 hook_file_142995.patch58.17 KBjhedstrom
#166 file.test26.39 KBdrewish
#166 hook_file_142995.patch58.42 KBdrewish
#165 file.test20.73 KBdrewish
#164 hook_file_142995.patch57.53 KBdrewish
#161 hook_file-142995-159.patch71.08 KBfloretan
#157 hook_file-142995-157.patch71.24 KBfloretan
#156 hook_file_142995.patch73.49 KBdrewish
#155 hook_file_142995.patch70.91 KBjhedstrom
#154 hook_file_142995.patch65.97 KBaaron
#150 hook_file_142995.patch58.27 KBdrewish
#149 hook_file_142995.patch53.15 KBdrewish
#145 drupal-hook-file-142995-145.patch37.63 KBcburschka
#143 SP20080620-hook-file.patch37.63 KBsamirnassar
#136 hook_file_142995.patch54.31 KBdrewish
#130 drupal.hook-file.patch40.81 KBsun
#129 drupal_hook_file-142995-128.patch39.78 KBSenpai
#126 drupal_hook_file-142995-126.patch41.03 KBfloretan
#122 drupal_hook_file.patch38.74 KBjpetso
#119 drupal_hook_file-142995-119.patch40.98 KBfloretan
#111 drupal_hook_file2.patch39.65 KBquicksketch
#110 drupal_hook_file.patch48.67 KBquicksketch
#104 drupal_hook_file.patch48.66 KBquicksketch
#103 hook_files_142995.patch48.66 KBquicksketch
#102 hook_files_142995.patch48.55 KBquicksketch
#94 hook_files_142995.patch48.56 KBdrewish
#90 142995.patch45.09 KBdopry
#88 142995.patch45.21 KBdopry
#84 142995.patch45.66 KBdopry
#82 hook_file_142995.patch47.28 KBdrewish
#81 142995.patch45.59 KBdopry
#68 142995.patch43.86 KBdopry
#61 142995.patch43.91 KBdopry
#56 142995.patch45.13 KBdopry
#50 142995_4.patch45.41 KBdopry
#48 142995_3.patch45.04 KBdopry
#46 142995_2.patch44.31 KBdopry
#44 142995_1.patch42.6 KBdrewish
#41 142995_0.patch42.59 KBdrewish
#38 hook_file_2.patch36.29 KBFrando
#30 142995.patch35.36 KBdopry
#27 hook_file_142995_7.patch35.23 KBdrewish
#23 hook_file_142995_6.patch28.71 KBdrewish
#22 hook_file_142995_5.patch27.24 KBdrewish
#17 hook_file_142995_4.patch25.89 KBdrewish
#15 hook_file_142995_2.patch19.25 KBdrewish
#11 hook_file_142995_1.patch9.06 KBdrewish
#10 hook_file_142995_0.patch7.72 KBdrewish
#8 hook_file_142995.patch12.07 KBdrewish
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

drewish’s picture

it might make sense to roll hook_file_download into this as a $op = 'download'

moshe weitzman’s picture

subscribe .. a reasonable enhancement

dopry’s picture

I'd rather keep the op names as terminology related to whats actually going on with the files, instead of trying to copy form or node events, I think the op names should correlate to the functions they're being called from as well.

so maybe upload, validate, load, copy, move, update status, delete. I'll try to roll a patch over the weekend...

validate would allow modules to validate new uploads before saving to db.

upload would allow modules to act on new uploads while they're still in temp status. generating admin ui thumbnails.

load would allow modules to extend file objects. This doesn't exists because file_load doesn't exist yet.

copy would allow modules to react to copy events, creating new derivatives, access control entries, or whatever they may need to do. this would happen after a file has been successfully copied on the filesystem and inserted into the database.

move is the same as copy, but you would probably move existing derivs or update existing information.

update status will allow modules to react to status updates.. changing from temporary to permanent, or whatever we dream up in the future.

delete would be called after successful file_deletion.

Owen Barton’s picture

Subscribe

Owen Barton’s picture

I marked http://drupal.org/node/18934 as a duplicate of this (since it was super old).

However there was a patch (basically the same idea as here) - so go check it out anyway :)

magico’s picture

(subscribe)

keve’s picture

Good idea! Finally this could solve my long-awaited problem of conversion of filenames containing i18n characters to alphanumerics. ( http://drupal.org/node/146752 )

drewish’s picture

Status: Active » Needs work
FileSize
12.07 KB

Now that #115267 has been committed we can get started. I talked with dopry on IRC a bit yesterday and got an idea of his plan, this patch reflects my (probably incorrect) understanding of it. This patch is not anywhere close to done but I've got to run out and wanted to get it posted.

It turns file_copy, file_delete and file_move into wrappers which handle the database updating end of things while _file_copy and _file_delete do the file end of things. It also adds file_load() and file_save() to read and write to the database.

Stefan Nagtegaal’s picture

*subscribing*

I definatly want this in Drupal 6. Not sure yet what I can do to help..

drewish’s picture

FileSize
7.72 KB

here's a bit better patch. it moves the code in a more logical manner so that the changes are clearer.

stefan, if you wanted to add some code to delete the record when the file is deleted that'd be good. i've tried to add a couple of TODOs to make it easy to see what needs to happen.

drewish’s picture

FileSize
9.06 KB

whoops, that was missing some stuff. this has better TODOs. feel free to jump in and work on some.

Stefan Nagtegaal’s picture

I'll try to come up with something after 10 hours from now...
I don't promiss anything, but i'll give it a shot..

Stefan Nagtegaal’s picture

I did some quick reviewing on the patch and while I am not sure it's in scope of this patch, I think using arrays instead of objects is much easier to grasp.

Perhaps this could be a personal taste, but drupal is moving to arrays more and more.. (Think FAPI as an example)

Furthermore this goes pretty much above my knowledge of the file api. While I would *love* to see the functionality hit the trunk, my PHP skills are not up to the level where it meets the criterium to move this further.

I'll keep an eye on the issue, and review/test wherever I can to help, but unfortunatly cooperation on developing it wouldn't be a good idea.

Stefan Nagtegaal’s picture

@drewish: Any progress on this?

drewish’s picture

FileSize
19.25 KB

here's something closer. i've done a little testing on it but i almost need to write a module that uses files to shake all the bugs out.

the current hook operations look like:

  • load - fired by a call to file_load(), ignores returned values.
  • insert - fired by a call to file_save() for a new file, ignores returned values.
  • update - fired by a call to file_save() for an existing file, ignores returned values.
  • status - fired by a call to file_set_status(), ignores returned values.
  • copy- fired by a call to file_copy(), ignores returned values.
  • move - fired by a call to file_move(), ignores returned values.
  • delete - fired by a call to file_delete(), ignores returned values.
  • validate - fired during file_save_upload(), expects an array of error messages.
chx’s picture

If we would use file arrays instead of file objects we could spare some drupal_clones. Second, if we are changing the name of the operating functions, like _file_copy, it's a good time to make that a variable, like $function = variable_get('file_backend', '_file') .'_copy' -- can you spell s3_backend.module :) ?

drewish’s picture

FileSize
25.89 KB

I liked chx's suggestion of adding a layer of indirection on the underlying file operations. I started looking at converting $file to an array and it'll be a major undertaking. I think it's the right thing to do but it'd touch a lot of files and result in a brittle patch that's out of the scope of this issue.

Digging into this I realized that there are really two classes of file operations, database tracked ones and non-database ones. Things like the aggregated JS and CSS should only be written locally and, at least in this patch, aren't tracked by the DB.

At this point there are only three functions a file driver would need to implement: _copy, _delete, _save_data.

dikini’s picture

I like the patch and the approach it takes. Makes sense.

What can I do to help?

dikini’s picture

Is this the place to add folder handling logic?

A module implementing this hook could, in theory, do that but it can be a chicken and egg problem, depending on module logic and invocation time of the hook. We could separate the path from filename, or just store the path separately, the memory overhead will be minimal considering the possible use cases - uploads and file manipulations are rare, usially one off ops on a few of files per run.

I'm think about the implications for modules creating or using some form of logical folders

Gábor Hojtsy’s picture

I would not convert $file objects to arrays, especially not in this patch, if you want this to get reviewed ;)

dopry’s picture

The layer of indirection is out of scope. Fear the scope creep! There are more issue involved than just the indirection. Settings for particular backends, backend selection, and handling http timeouts on your webserver with large transfers to and from services like S3 or over FTP none of which none are addressed in this patch. I would prefer the scope be limited to just the hooks, and the indirection layer become another feature request.

drewish’s picture

FileSize
27.24 KB

pulled out the indirection layer. it was a bit off topic.

cleaned up a lot of the phpdocs. started working the upload module. it really should be implementing hook_file to load its data into the file object. i made some changes to it and everything seems to be working correctly. i added some caching to file_load based on what's in node_load.

feedback on the best way to treat upload.module would be much appreciated.

drewish’s picture

Status: Needs work » Needs review
FileSize
28.71 KB

re-rolled. there's probably still some issues but at this point i need reviews.

drewish’s picture

the patch still applies.

drewish’s picture

Title: hook_file » Add hook_file and make files into a 1st class Drupal object

little bit better title.

moshe weitzman’s picture

Status: Needs review » Needs work

Very nice architecture improvement. I tested just about all the code changed by this patch. I just have a laundry list of minor improvements.

  • typos: search for scalled and memebers and thew
  • FILE UPLOAD SETTINGS: 'List files by default' should be a checkbox
  • FILE UPLOAD SETTINGS: "If an image toolkit is installed, files exceeding this value will be scalled down to fit.". this message could be more helpful if it looked for a toolkit and tailored the text accordingly. furthermore, the toolkit page says "No image toolkits found. Drupal will use PHP's built-in GD library". So i don't know if my images will be scaled or not.
  • when uploading an image to a node: notice: Trying to get property of non-object in /Users/mw/htd/multisite/modules/upload/upload.module on line 632.
  • perhaps document the status values in the comments for file_set_status() or tell the developer where to look.
  • tabs in upload_file(). use spaces instead.
  • i got these errors when uploading a user picture:
    # Object of class stdClass could not be converted to string in /Users/mw/htd/multisite/includes/file.inc on line 252.
    # notice: Object of class stdClass to string conversion in /Users/mw/htd/multisite/includes/file.inc on line 252.
    # The selected file /Users/mw/htd/multisite/Object could not be copied, because no file by that name exists. Please check that you supplied the correct filename.
    # Failed to upload the picture image; the pictures directory doesn't exist or is not writable.
    
  • upon uploading a logo in theme settings: Fatal error: Cannot use object of type stdClass as array in /Users/mw/htd/multisite/modules/system/system.module on line 2224
drewish’s picture

Status: Needs work » Needs review
FileSize
35.23 KB

Thanks for the review Moshe. I think I've corrected all the issues you raised in your review other than:

perhaps document the status values in the comments for file_set_status() or tell the developer where to look.

The thing is at this point there are only values temporary and permanent. And dopry wanted the status as a bitmapped field.

I ran into a little, related problem with this testing the upload quotas on non-uid-1 users. It looks like their temp files are counted against their quota, which makes sense but it's a pain for them because they can't force the cron run that will delete their old files. That might be in the scope of a post feature freeze fix.

As a status update, I'm going to be offline next week. I'll try to keep this patch up until I leave and pick it up when I get back if it hasn't been committed.

moshe weitzman’s picture

1 hunk now fails on upload.module :(. please reroll and i will retest.

dopry’s picture

FileSize
35.36 KB

This version of the patch updated to to work with HEAD.

It makes behavioral changes.

  • file_space_used to only return total filesize used by permanent files when passed a uid.
  • file_delete will not delete a file if another module blocks the delete and return an array of modules blocking the deletion, that will still test as true.

I also made a few minor documentation improvements, and document the implementation of hook_file in upload.module.

dmitrig01’s picture

Status: Needs review » Needs work

Applies with offset. Could you re-roll?

dmitrig01’s picture

Status: Needs work » Needs review

/me heard that fuzz was fine
review starts in a few minutes

drewish’s picture

I really don't like that hook_file(op=delete) can block. It doesn't jive with the rest of Drupal's deletion. hook_delete and hook_nodeapi(op=delete) don't allow other modules to prevent a node's deletion, they allow modules to react to the deletion. I'd also like to see the deletion queries put into the delete API's package but there's probably not a clean way to do that.

The documentation for hook_file shouldn't be put into upload_file()'s comment, but rather added to contributions/docs/developer/hook/core.php

dopry’s picture

File deletion is not node deletionm if it were the same we would use the same api. There should be another hook in file_delete to react to the deletion.

The idea is based on the behavior of hard links in posix file systems. Where you can have multiple links to the same file data. In Drupal the file data would be the row in the files table and the file itself. If you call file_delete from a module and the deletion is blocked by another module, file_delete still returns true, since there wasn't an error with the actual file deletion, so only the reference to the file gets removed. Once no modules are blocking the physical file_deletion the file will be removed. It is the responsibility of the blocking modules to provide any notices about why they blocked the file delete.

The deletion API has been brought up. There is one problem with the deletion API. It seems to enforce a process of Database query, then callbacks. We really cannot remove files from the database until we confirm that the file has been physically deleted, otherwise the database becomes inconsistent.

One major issue has arisen in this patch. I'll try to re-roll with corrections tonight.

My last implementation of file_delete only allows the delete hook to be a reference count. Modules implementing hook_file cannot react to file deletion consistently this way. Drewish has reservations to the approach.

drewish’s picture

okay after lots of chatting on IRC with hunmonk and dopry i think we've got a solution that extends the deletion api.

i wanted to make one, unrelated, note though: _file_delete should return TRUE if the file doesn't exist... it's only an error if the file exists and can't be deleted.

drewish’s picture

i opened up a deletion api issue: http://drupal.org/node/154604

drewish’s picture

i'd had high hopes for getting #154604 committed and having a nice clean way of implementing the file deletion but dries is talking about rolling back the deletion api patch.

at this point i don't really care how the pre/post-delete hooks are setup. #30 is fine by me but i think dopry wants to make some changes. i'd settle for just about anything so long as the rest gets committed.

Frando’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
36.29 KB

I am just reviewing the patch in #30. IMO, it is a *huge* improvement over what we currently have in Drupal. This opens up many many more possiblities for file handling.

I tested upload module functionality and the new or changed API functions. Everything worked as advertised. The concept of hook_file is great. Deleting a file that is still used on a node by upload.module successfully fails. Documentation is very good.

The attached patch only fixes a minor code style issue, removes some debug code that was still in the last patch and fixes one single E_ALL notices in upload.module hook_file op load.

This is RTBC. Great work.

Gábor Hojtsy’s picture

Code-reviewed the patch:

- you introduce wrappers around internal _file_* functions, but still use some internal looking functions in blogapi and system module with this patch
- the phpdoc is not up to current conventions... I understand that the file.inc comments are not properly formatted, and to keep this reviewable, it should not be reformatted to comply to current conventions... but new functions added elsewhere should not follow that bad convention
- I don't understand the rationale behind this:

+      // If upload.module is still using a file, do not let other modules delete it.
+      $count = db_result(db_query('SELECT count(*) FROM {upload} WHERE fid = %d', $file->fid));
+      if ($count) {
+        return 'upload';
+      }
+      // Delete all file revision information associated with the file.
+      db_query('DELETE FROM {upload} WHERE fid = %d', $file->fid);
+      break;

If there are rows, then the deletion is blocked. If there are no rows, we delete these rows??

Dries’s picture

Status: Reviewed & tested by the community » Needs work

Why is this useful? Maybe add some code comments so we got to understand this a bit better?

+      $values = db_fetch_array(db_query('SELECT vid, description, list FROM {upload} u WHERE u.fid = %d', $file->fid));
+      if (!empty($values)) {
+        foreach ($values as $key => $value) {
+          $file->{$key} = $value;
+        }
+      }
drewish’s picture

Status: Needs work » Needs review
FileSize
42.59 KB

I still question the rationale behind the delete blocking stuff and since I'm re-rolling I'm going to drop it. I think it complicates the patch and I just want this done with.

Gábor Hojtsy,
- I intentionally didn't modify the blogapi to use file_save_data because the workflow of that module doesn't make it clear to me that it should be tracked in the database. Where is that file deleted? The system.module is a little different case, if that's what people want to do then I'm willing to modify it.
- The attached patch fixes all the PHPDoc in file.inc to bring it to current standards. I'm of the opinion that while we're in there we might as well fix it all.

Dries,
- I added a comment to explain what was happening there.

drewish’s picture

I also pulled out this block of text from upload.module. It needs to go in the hooks.php file once this patch is committed:

/**
 * Implementation of hook_file().
 *
 * @param $op the file operation being performed.
 *  'copy' - the file was just copied to $file->filepath
 *  'move' - the file was just moved to $file->filepath
 *  'delete' - The file is about to be deleted.
 *  'validate' - The file is being validated.
 *  'status' - the status of the has just been set.
 *  'load' - the file is being loaded.
 *  'insert' - the file is being inserted in the database.
 *  'update' - the file's database entry is being updated.
 *
 * @param $source The original path to the file on copy and move
 *                operations.
 * @return mixed
 *     'validate' - return an array of errors.
 */
moshe weitzman’s picture

Status: Needs review » Needs work

@drewish - sorry i get a hunk fail in upload.module. i will review if you can fix this. it is more than 2 lines so i bailed.

drewish’s picture

Status: Needs work » Needs review
FileSize
42.6 KB

here's a re-roll that fixes conflicts in system.module and upload.module.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

tested uploads and user pictures and all works fine. code looks good as well. RTBC.

dopry’s picture

FileSize
44.31 KB

Here is a reintroduction of file_delete with reference counting. I'll leave it to the powers that be to decide which to commit. It also fixes some direct deletions from the files table in upload_delete and upload_delete_revision.

dopry’s picture

Status: Reviewed & tested by the community » Needs review

re-rolled to sync with head. Set back to code needs review, since I added the reference counting on delete.

dopry’s picture

FileSize
45.04 KB

I guess it helps if I attach the patch.

There are also some changes in _file_delete I've added two additional tests when $path is not a file.

  1. in the case a path doesn't exists it returns true, but watchdogs an error since the desired final state of _file_delete($path) is already true, This makes the soft delete of file_delete behave nicely in case you attempt to delete a file from the db where the file has already been removed from the file system.
  2. in the case $path is a directory it returns false and watchdogs an error.
RobRoy’s picture

Status: Needs review » Needs work

Test
- Applied fine, uploaded two files attached to a page node okay.

Bugs
- When editing page node I checked "Delete" next to an existing upload File B and then attached a third file upload File C. When hitting Submit, File B gets deleted but File C doesn't stick around leaving only File A.
- Are files supposed to be deleted from file system when not used anymore? If so, they weren't as all files stuck around after checking "Delete" and hitting save, and still stayed after deleting the node to which they were uploaded.
- By default the file upload box desc says "The maximum size of file uploads is 1 MB. Images larger than 0 will be resized." Images larger than 0? I know that's the default, but is poor usability and confusion. So we need a check in there if (!empty(variable_get('...', 0)) { $desc .= ' '. t('Images larger than...'); }
- Also, I think we should set the default for the above var to '' instead of 0. We tell users to leave setting blank for no restriction elsewhere IIRC, why not here? 0 seems unnecessary and inconsistent. If we keep the !empty() code we'll be okay with 0 or ''.
- Why do all files get an underscore before the .ext? Maybe this is stated above, if so, sorry.
- We should remove "node" from user facing content on the File uploads config screen. I know it was there before, but since we're changing it, let's change line 185 of upload.module to something like 'Determines whether files attached to posts are listed or not in the full page view by default.' That might suck, but we should try and remove node. :P

dopry’s picture

Assigned: Unassigned » dopry
Priority: Normal » Critical
Status: Needs work » Needs review
FileSize
45.41 KB
  1. - When editing page node I checked "Delete" next to an existing upload File B and then attached a third file upload File C. When hitting Submit, File B gets deleted but File C doesn't stick around leaving only File A.

    I could not reproduce.

  2. - Are files supposed to be deleted from file system when not used anymore? If so, they weren't as all files stuck around after checking "Delete" and hitting save, and still stayed after deleting the node to which they were uploaded.

    There was indeed a missing file_delete.

  3. - By default the file upload box desc says "The maximum size of file uploads is 1 MB. Images larger than 0 will be resized." Images larger than 0? I know that's the default, but is poor usability and confusion. So we need a check in there if (!empty(variable_get('...', 0)) { $desc .= ' '. t('Images larger than...'); }

    This is fixed using the !empty test instead of isset.

  4. - Also, I think we should set the default for the above var to '' instead of 0. We tell users to leave setting blank for no restriction elsewhere IIRC, why not here? 0 seems unnecessary and inconsistent. If we keep the !empty() code we'll be okay with 0 or ''.

    This is not from this patch. It should be filed as a separate UI issue.

  5. - Why do all files get an underscore before the .ext? Maybe this is stated above, if so, sorry.

    I am unable to reproduce. although some files get that so apache won't try to execute them via it's multi type handling.

  6. - We should remove "node" from user facing content on the File uploads config screen. I know it was there before, but since we're changing it, let's change line 185 of upload.module to something like 'Determines whether files attached to posts are listed or not in the full page view by default.' That might suck, but we should try and remove node. :P

    This is not from this patch and should be filed as a separate UI issue. There are too many API changes happening in this patch for me to want to entertain scope creep.

RobRoy’s picture

Status: Needs review » Needs work

Uploading files does not work with this recent patch. Did a clean install and attaching files appears to work, but hitting Submit results in no files attached.

dopry’s picture

Status: Needs work » Needs review

Upload failures are not related to this patch, but to changes in FAPI, see http://drupal.org/node/157747, for details. Setting back to CNR, so I can solicit architectural and code style reviews until 157747 is resolved.

drewish’s picture

patch still applies

Dave Cohen’s picture

I'm looking forward to these improvements!

For those interested in this bug, I wanted to point out this module I built for drupal 5.x. In particular it has a form element for file uploads. If files are really a first class entity in Drupal, I feel that should be part of it.

My form field is somewhat complex because the various actions that must take place when a file is uploaded (and previewed) do not map well to the form hooks. I understand improvements to FAPI will make this sort of thing easier in the future (yeah!). But I'd encourage folks to check out what I did and consider it's features when adding this stuff to core. That's all - thanks for your time.

chx’s picture

Version: 6.x-dev » 7.x-dev
dopry’s picture

FileSize
45.13 KB

And just to keep things somewhat close to head.. here is an updated patch.

philippejadin’s picture

Just tested #56 on head, it works fine.

Only strange behavior :

- add a node
- browse for a file
- click attach
- click delete checkbox
- save node

-> File is still attached

This is a very good addition to Drupal too bad it's too late for 6. This is my first patch review, I came too late to the game

catch’s picture

subscribing.

jpetso’s picture

subscribing. should have done that long ago.

mlncn’s picture

subscribing

dopry’s picture

FileSize
43.91 KB

A couple conflicts... Updating to current HEAD.

dopry’s picture

@phillipjadine, I can no longer reproduce the bug in comment #57. Can you confirm?

drewish’s picture

i don't think 'reference' is a good name for the operation name. all the others ('copy', 'validate', 'load', 'move') are verbs directing the module to take some action. if 'reference' were to be consistent the module would have to add a reference to that file. what you're really doing is directing the module to report the number of references. something like 'count_references', 'get_usage', 'are_using' would also be better.

dopry’s picture

how about just 'references'? that is a noun and it almost feels like a interrogative to me.

mlncn’s picture

list_references

philippejadin’s picture

Cannot reproduce bug in #57 so consider it fixed. Thank you

Overall, this is a good patch and great addition to filesystem api

catch’s picture

I like 'list_references'.

dopry’s picture

FileSize
43.86 KB

but I so don't like the underscore and I like the single word for brevity... So I made it references.

@peanut gallery:
How about reviewing the patch? Please post some actual feed back. Things like 'work's for me', 'doesn't work for me', 'part Y is confusing can you document it better' are useful.

@drewish, phillipe:
You most definitely aren't in the peanut gallery. :) Thanks for your continued attention to this patch. I really appreciate your feedback.

@drewish:
I really appreciate you doing the lions share of the work to port this to core.

catch’s picture

OK stepping out of the peanut gallery.

I'm not sure I understand why the private _file_copy function has exactly the same phpdoc as file_copy() Maybe I missed something and that's standard for private functions, but it looked weird to me.

Otherwise I applied it, enabled upload module, and everything worked as normal.

mikl’s picture

+1 for this. I hate to having to hack the core to fix broken filenames :(

dopry’s picture

@catch: the doc isn't the same.. It's just similar. file_copy work on file objects _file_copy works on file paths.

@all: 6 is out, head is open? *wink* *wink* can we get some reviews or gatekeeper reservations?

Wim Leers’s picture

@dopry: I'd appreciate your comments here: http://drupal.org/node/214934. I'm not sure yet if there's overlap… will give it a look this weekend.

RobLoach’s picture

Should hook_file also interact with Forms API? If so, we could move the Form API stuff to this related issue.

drewish’s picture

Wim Leers, I know that one of dopry's goals was to allow that type of functionality.

Rob Loach, they're definitely related but I don't think there's much form stuff going on in this patch...

Wim Leers’s picture

drewish, ok cool! :)

I'll await dopry's feedback to see if they can continue to be separate patches or not. I'd prefer separate patches because I have to ship my patch in D5 and D6 versions as well, for my CDN integration module.

dopry’s picture

Status: Needs review » Reviewed & tested by the community

@Rob Loach: No the form stuff is out of scope for this patch. This patch only includes api changes to file.inc and hook_file. I am working on a new file form element that leverages the new file api, since it allows file operations to be done asynchronously of the node creation process.

@Wim Leers: Again I think it is out of scope. file_create_url will need modification to support public and private files simultaneously.

@all: can we get reviews of this patch or a RTBC?

dopry’s picture

Status: Reviewed & tested by the community » Needs review

oops didn't mean to change status...

RobLoach’s picture

I'd be happy to give it a review, but I'm not too sure how to test it. Could you provide a module or something that makes use of hook_file()?

cburschka’s picture

This will allow node-independent files for uploading, browsing and downloading, and move the n:1 file:node relationship to upload.module? Subscribing.

breyten’s picture

Status: Needs review » Needs work

The patch from #68 doesn't apply cleanly anymore:

patching file includes/common.inc
Hunk #1 succeeded at 1787 (offset -3 lines).
Hunk #2 succeeded at 1888 (offset -3 lines).
Hunk #3 succeeded at 2246 (offset -3 lines).
Hunk #4 succeeded at 2256 (offset -3 lines).
patching file includes/file.inc
Hunk #14 FAILED at 474.
Hunk #19 FAILED at 641.
Hunk #20 FAILED at 662.
Hunk #25 FAILED at 1124.
4 out of 25 hunks FAILED -- saving rejects to file includes/file.inc.rej
patching file modules/blogapi/blogapi.module
Hunk #1 succeeded at 378 (offset 9 lines).
patching file modules/system/system.admin.inc
patching file modules/system/system.module
Hunk #1 FAILED at 1223.
1 out of 1 hunk FAILED -- saving rejects to file modules/system/system.module.rej
patching file modules/upload/upload.admin.inc
patching file modules/upload/upload.module
Hunk #2 succeeded at 261 (offset -8 lines).
Hunk #3 succeeded at 434 (offset -8 lines).
Hunk #4 succeeded at 450 (offset -8 lines).
Hunk #5 succeeded at 463 (offset -8 lines).
Hunk #6 succeeded at 493 (offset -8 lines).
Hunk #7 succeeded at 518 (offset -14 lines).
Hunk #8 succeeded at 599 (offset -14 lines).
patching file modules/user/user.module
Hunk #1 succeeded at 389 (offset -29 lines).
Hunk #2 succeeded at 1500 (offset 2 lines).

Otherwise, I'm just subscribing :)

dopry’s picture

Status: Needs work » Needs review
FileSize
45.59 KB

This patch has been updated for head as of today, including the removal of drupal_clone.

@Rob Loach: hook_file upload module implements hook_file op==load and op==references. I don't think we will be able to fully test hook_file in this patch. I suspect as long as core file handling works as expected there should be no issues with getting it committed. We can clear up bugs and issues with additional hook_file ops in the future. Once hook_file is in core I will begin port filemeta and derivative to core, both of which utilize hook_filefield thoroughly.

@Arancaytar: how about reviewing the patch instead of just subscribing and adding useless noise to the issue queue. The file/node relationships were moved to upload.module in D6.

@breyten: this new patch should apply cleanly In the future you can just let me know the patch doesn't apply cleanly. I don't need to know which hunks failed :).

This patch implements:

  • hook_file which allows module developers to interact with the file lifecycle
  • stratifies the file.inc api into two layers, one that works on drupal file objects and one that works on filepaths
  • provides a reference counting hook so file_delete is only a soft delete, and will only remove records from the files table and delete the actual file if no modules claim to be using it.

How to test:

  1. Core file handling:
    1. upload logo in the admin >> themes section of your test site. (only tests path based layer of file.inc api)
    2. upload a favicon in the admin >> themes section of your site.(only tests path based layer of file.inc api)
  2. User.module
    1. upload a user picture. (only tests path based layer of file.inc api)
  3. Upload.module
    1. create a story and attach/remove files. (This tests the drupal file object layer of the file.inc api and hook_file op==load)
    2. create another story without any uploads. manually insert a record in the upload table point to a file
      that is associated with another node. Delete the file from one of the nodes through the UI. (This tests the soft delete feature)
drewish’s picture

Status: Needs review » Needs work
FileSize
47.28 KB

looks like your last patch had some unrelated dblog changes that have since been committed... re-rolled to fix those failed hunks.

there's a problem with the way the user.module's picture are being tracked. the comment says that it should be making an untracked copy but it's calling file_copy() (rather than _file_copy()) but isn't calling files_set_status() to mark it as permanent. i kind of think it makes sense to track the user's profile picture in the database. it'd make it a lot easier to avoid bugs like #92224. if we do track them then we also need to delete the files in user_delete().

i wonder if it might be better to rename the _file_* functions that deal with untracked files to something like file_raw_*() since the leading underscore implies a private function and there are good reasons that people might not want to manipulate non-db tracked files. thoughts?

drewish’s picture

oh i should mention that i banged on the upload module portion of the code and that all seemed to be working correctly. i didn't get around to messing with the theme portions.

dopry’s picture

Status: Needs work » Needs review
FileSize
45.66 KB

This patch fixes to user.module to use the fileapi as documented, also removes my db_log snafu which is another patch that snuck in there.

I think renaming the functions and altering the way user.module behaves is scope creep. We're doing a lot of API changes as is in this patch. I don't want to change more than absolutely necessary. I think _ is a perfectly acceptable prefix for the low level file functions. If you really want raw file access you have php's file functions. I also hope to implement the mountpoint system in the _file_* layer which will make them more private and drupal specific and not raw file access at all. If you want raw file access user php's file functions.

moshe weitzman’s picture

I read through the patch and found only 2 small issues:

  • case 'delete':
    +      // Delete all information associated with the file.
    +      db_query('DELETE FROM {upload} WHERE fid = %d', $file->fid);
    +      break;

    Can we save a query here by checking if $file "belongs" to upload module?

  • file_save() does separate insert and update statements with each field named. It looks like the old code does a cleaner drupal_write_record(). search for drupal_write_record in the patch.
dopry’s picture

To check if the file is owned by upload.module would require a select against the upload table. I opted for one query instead of two.

I didn't keep the drupal_write_record on purpose. I'd still have to do the insert/update logic in file_save and call drupal_write_record ( which includes a bunch of query building insanity ) in both conditional blocks. So it doesn't improve readability or enhance performance I didn't see any advantage in using it in this case.

moshe weitzman’s picture

One of the benefits of drupal_write_record() is that you can add a column in your own install and add a properly named value to the $file object (for example) and drupal will automatically write to the custom column. I'm not going to +1 a patch that avoids using schema API functions as you are proposing.

dopry’s picture

FileSize
45.21 KB

@moshe, here is one with drupal_write_record...

moshe weitzman’s picture

Status: Needs review » Needs work

The test plan in #81 works fine except for the node upload which does not work properly at all. I could not attach a file when clicking the Preview button and I saw a number of NOTICE errors. After thats fixed, I'll test soft delete and we should be good to go.

There is a regression also for the small bug fixed in http://drupal.org/node/193331.

dopry’s picture

Status: Needs work » Needs review
FileSize
45.09 KB

This patch fixes the $replace regression, and uses the correct table name in drupal_write_record which is what broke node uploads.

keith.smith’s picture

Status: Needs review » Needs work

I read through this, and I think it all looks good. There are (a very) few typos in code comments:

+ * - Adds thew new file to the files database. If the source file is a 

("thew")

+ * move is performed by copying the file to the new location and the deleting

(the -> then)

+  // file will not be deleted from drupal, but file_delete will

(we use "Drupal")

+  // return a populated array that tests as TRUE.

(Initial caps.)

+  // Make sure the file is deleted before removing it's row from the 
+  // database, so UI's can still find the file in the database.

("it's" -> "its")

+  // Return true for non-existant file, but log that nothing was actually delete, as the intended
+  // result of _file_delete is in fact the current state.

(I think we use "TRUE", and also "delete" -> "deleted".)

+ *   - FILE_EXISTS_REPLACE - Replace the existing file
+ *   - FILE_EXISTS_RENAME - Append _{incrementing number} until the filename is unique
+ *   - FILE_EXISTS_ERROR - Do nothing and return FALSE.

(Inconsistent use of periods.)

+ *   - FILE_STATUS_PERMANENT - A permanant file that will drupal's garbage

("permanant" => "permanent", capitalize "drupal's".)

+ *   A numeric file id or string containg the file path.

("containg" => "containing")

-    '#description' => t('Display attached files when viewing a post.'),
+    '#description' => t('Determines whether files attached to nodes are listed in the node view by default.'),

(For better or worse, we generally avoid mention of a "node" in user-facing text.)

+        // return the name of the module and how many references it has to the file.

(Capitalize "return".)

cburschka’s picture

Just on reading those lines...

+  // Make sure the file is deleted before removing it's row from the
+  // database, so UI's can still find the file in the database.

("it's" -> "its")

And UI's -> UIs. It seems that the apostrophe is only added when the acronym has periods, as in U.I.'s.

+ *   - FILE_STATUS_PERMANENT - A permanant file that will drupal's garbage
+ *                             collection will not remove.

("permanant" => "permanent", capitalize "drupal's".)

And drop the extra "will".

aaron’s picture

subscribing

drewish’s picture

FileSize
48.56 KB

made corrections based on Arancaytar's fixes.

dopry’s picture

Status: Needs work » Needs review
mfer’s picture

subscribe.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

I went through the test plan in #81 and all tests pass. Nice work.

vaish’s picture

subscribe

scroogie’s picture

subscribe

Freso’s picture

This is so exciting! (And, *subscribes*)

Arto’s picture

Subscribing.

quicksketch’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
48.55 KB

I went over this patch today and found a few bugs which have been fixed in this version.

Bugs:
- There wasn't a way to test file_move() in core, so I added file_move() to index.php to try to move a file with a record in the database. But I received that the function "c()" was undefined. Looks like it's supposed to be _file_delete() from a previous patch.
- _file_move() was undefined, so I was getting undefined function when enabling CSS/JS caching. I wrote a function for this purpose and utilized it in file_move() rather than doing a _file_copy() and _file_delete inside of file_move().

Docs:
- I updated all the documentation so that the first line of PHPDoc is a brief description, with a longer description separated into another paragraph. I also elaborated a bit on file_move vs. _file_move (and other similarly named functions).
- Added @see lines where appropriate.

quicksketch’s picture

FileSize
48.66 KB

I find that using _file_move/_file_copy/_file_save all seems very dirty. The preceding underscore is supposed to mean "private", and not to be used outside of the module it's defined in. Encouraging module developers to use "non-API" functions throughout their modules as system, upload, and user have done in this patch would introduce new inconsistencies in what it means to prefix a function with an underscore.

So, to correct this I've renamed all the prefixed functions with a new name: the suffix '_plain'. So we now have file_move_plain, file_copy_plain, and file_save_plain. I considered trying to rename 'file_move' to 'file_record_move' (or similar), but because file_move invokes hook_file_move(), I think it's the right decision to rename the prefixed versions of the functions.

quicksketch’s picture

FileSize
48.66 KB

Fixes remaining _file_delete() calls to file_delete_plain().

drewish’s picture

quicksketch, I'm glad you changed those function names, back on comment #82 I'd proposed _raw but I think _plain is a better choice. I'll try to give this a review shortly.

dopry’s picture

Assigned: dopry » Unassigned

You guys have fun. I'm totally off this issue. There has been no feedback from a committer in a while, and I'm not going to review superfluous change at this juncture in this patches life... You can hound people to re review it.

Anonymous’s picture

subscribe

dopry’s picture

@justinrandell, wow you're really helpful with you subscribe comment! thank you for participating in the development process. We appreciate your extremely useful feedback.

@drewish, @quicksketch, before running off and changing patches that are RTBC... I really wish you guys would wait for a committer to comment on the patch and what needs changing. You basically moved this patch out of their field of vision by restarting development to cover a personal aesthetic issue. I'll point to _ being reserved for private methods which are object constructs, not functions. I'll point to the fact that I'm only keeping the private functions in use in those modules because I want to limit the changes introduced by this patch so it is easier to review. Later I would much rather them using file_* functions instead...

I have a lot of work and time invested in this patch, not just the patch but foundation work in filesystem and fileapi. I could have done all this in contrib by now, had I not decided to work on getting this stuff in core instead. I even got Dries sorta blessing at OSCMS on this path for D6... This patch even received a code freeze exception and couldn't garner enough reviews or feedback from a core committer to get comitted...

I have pretty much lost hope for playing nice with core, and seriously feel like going back to contrib only development. I feel this patch should just be abandoned.

quicksketch’s picture

FileSize
48.67 KB

Hey dopry, I wasn't meaning to be discouraging. I was just trying to clean up a little further. The last time Gabor looked at the patch he specifically stated (#39):

- you introduce wrappers around internal _file_* functions, but still use some internal looking functions in blogapi and system module with this patch
- the phpdoc is not up to current conventions... I understand that the file.inc comments are not properly formatted, and to keep this reviewable, it should not be reformatted to comply to current conventions... but new functions added elsewhere should not follow that bad convention

So even if we marked it RTBC, I doubt it would get in without fixing these problems. I gave this code another review today and found that I missed one last _file_copy() instance. Patch attached. Another review would be appreciated.

quicksketch’s picture

FileSize
39.65 KB

Okay, eating my own words here. Gabor specifically said *not* to update the documentation (which is exactly what keith.smith did in #91, drewish in #94, and me in #102).

So here's the patch again, only with the documentation fixes not included. (PHPdoc is still correct for new or modified functions though). Sorry for the trouble. I'll re-roll the documentation fixes after this is committed in a different issue.

Anonymous’s picture

i applied the patch at #111 and ran through the test procedure at #81 - everything Just Worked :-)

are there any other things that need testing?

dopry’s picture

Still private/_file_* is a matter of scope. Both drewish and I have indicated that we didn't want to convert those modules to use the new file_* functions because it was unclear how to implement it properly and created workflow changes that make the patch harder to review. Neither of us got a return comment from Gabor regarding if it was acceptable to leave them unchanged until after this patch got committed... I think you're addressing the wrong issue by renaming the functions instead of updating modules using private functions that *should* no longer be used outside of file.inc.

I still believe one of the most wonderful features of functional coding and OO design is the ability to break out of OO when it's effective. I still don't agree with the function name change, but then again that is just a matter of preference... Anyways have fun finding more reviewers...

dopry’s picture

Status: Needs review » Reviewed & tested by the community

I hate file_*_plain but who cares anymore, eh? So you have you look at the end of the function name to determine if you're working with objects or paths...

Would some who could commit this please take a look and give us a final say on filenaming or remaining reservations, or just commit it?

Gábor Hojtsy’s picture

FYI at this time only Dries can commit to 7.x.

Stefan Nagtegaal’s picture

Now, I'm really wondering why it takes so long to get this in (or even get feedback)..
No wonder you gave up on this dopry...

Dries’s picture

I'll review this patch shortly. Stay tuned, I've been working with/other patches. :)

Dries’s picture

Btw, note that I'm not the only one who can review patches. If we all think it is truly ready to be committed, than it should be easy to get this in as soon as I've looked at it.

floretan’s picture

The patch from #111 still applied with some offset, just re-rolling to apply cleanly to HEAD.

Tested the patch as well, and I can confirm the RTBC status.

dopry’s picture

@Dries, I don't for one second thing you're the only person who can review patches, but there are bandwidth issues for community review. There are several questions floating that are waiting on a commiter's feedback... and there hasn't been a reply from a committer in almost a year... Really this issue has needed is someone to step in and make an architecture and name space decision the principals working on this patch haven't been able to agree upon. It would have take a 5 minute note...

like follow up #100 to present should sum it up.

catch’s picture

Status: Reviewed & tested by the community » Needs work

Like 95% of active patches against HEAD at the moment, this needs re-rolling for the concat operator spacing changes.

jpetso’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
38.74 KB

Rerolled with concat operator spaces. cvs diff is darn slow for me at the moment, so this is svn diff output instead. Pushing back to its previous state.

David Strauss’s picture

Subscribe.

litwol’s picture

Please don't use hook_file($op). instead use hook_file_$op to take advantage of hook registry.

cwgordon7’s picture

Status: Reviewed & tested by the community » Needs work

Agreed, so this needs to be updated for the registry.

floretan’s picture

Updated patch to use hook_file_$op instead of hook_file($op).

We also want to be able to test this patch, but at the moment the upload.test has multiple failures and exceptions. I have a patch to clean it up (see #253506: contact.test broken (was upload.test)), but simulating file uploads with simpletest seems to be broken at the moment. If you are familiar with file uploads using curl, please review that issue.

floretan’s picture

Status: Needs work » Needs review
Senpai’s picture

Status: Needs review » Reviewed & tested by the community

Tested against a fresh copy of HEAD. Although it works, and works well, several things stand out as possibly needing attention.

1. File uploads while creating a new node (as uid1) work as expected, but the form admonished me that the maximum file upload size was 1MB, and then proceeded to save the node with a 9MB music file attached to it.

2.

Senpai’s picture

Weird. My point number 2 above got cut off, and now I can't remember what it was. Ah well.

I went to test the patch again, to see if it would remind me, but the patch no longer applies. Re-rolled, retested, and still RTBC.

sun’s picture

FileSize
40.81 KB

Re-rolled with minor fixes. Still needs review and feedback by Dries (since more than a month now; I can understand dopry).

IMO, file_*_plain() sounds a bit awkward -- maybe file_*_direct(), file_*_unsync(), file_*_now()... or to call it appropriately by layer file_*_filesystem(), resp. file_*_fs()?

Stefan Nagtegaal’s picture

I gave this patch another test and it really _is_ RTBC.

Now, Dries why are you not commenting/reviewing this patch? What is wrong, and why aren't we getting any pointers what could be wrong or does not meet your vision?

You wanted better file/media handling, and this is certainly a big part of it!

webchick’s picture

dopry helped me with some file-handling questions, and I promised him I'd take a look at this patch in return. That probably can't happen before the weekend, but subscribing. :)

dopry’s picture

So I noted that file_move is probably incorrect.. I should probably only return a boolean, and not clone the $source, only update the file path if the move is successful.

chx’s picture

Status: Reviewed & tested by the community » Needs work

First of all, I know how this feels. You have such a great patch and all you get is nitpicking. Sorry, that's my duty. Please understand that I would not have spent so much time with reviewing this patch if I would not think Drupal needs it. I have patched up file.inc and began to read where the patch began so it's not impossible I am addressing old wrongs in file.inc -- feel free to say "this was always so, we will fix in a new issue" but then please open said issue and link to it.

  1. Line 249, A file object of the original file. <= I am sorry but what's that? What about "A stdClass object with properties A, B, C" listed on place and adding @see everywhere else? file_save_upload sounds good because that's where the file object is born.
  2. Drupal's 'files' directory will be used <= system_file_system_settings says t('File system path'), which might be more udnerstandable.
  3. the FILE_EXISTS_RENAME line is 88 chars long.
  4. the @see file_copy_plain() would be better at the end of the copy block.
  5. "If the source file is a temporary file, the resulting file will also be a temporary file" -- as file_save_upload explains what's a temporary file, we need a "@see file_save_upload about temporary files".
  6. $dest really shoudl be $destination. Everywhere in file.inc
  7. clone($source); there is no need for () around $source, that was a trick for PHP4 w/ drupal_clone.
  8. When should I use and when I should use file_copy_plain? And why "plain" ? I understand it's not easy to name these functions (see drupal_get_form vs drupal_retrieve_form ) but as the doxygens are almost the same, there are no directions for the aspiring developers. From the issue it's clear I more want to use file_copy but then why I would use file_copy_plain...? Help!
  9. By changing the incoming parameters to file_copy_plain , it remains a pain to debug -- experience shows that by the time the messages fire, one or another variable becomes empty.
  10. file_destination does not handle FILE_EXISTS_REPLACE at all.
  11. file_move_plain does not use rename or move_uploaded_file. I do not readily understand why.
  12. file_delete "Force File Deletion ignoring reference counting" Too Many Caps Words to begin with. But then again what's a reference in this content? there is documentation inside the function, why not in the doxygen? Also, consider a better word because reference and reference counting are already pwnd by Zend Engine itself.
  13. if (!$force && $references = module_invoke_all('file_references', $file)) please add () around the assignment just to avoid guessing on operator precedence.
  14. You can save a little code by if ($return = file_delete_plain($file->filepath)) { } return $return;
  15. file_delete_plain @return description is too long, 81 chars.
  16. Again, at the end of file_delete_plain you can save a return statement.
  17. file_space_used o_O is the bitwise AND ANSI SQL? Can't we use = IN or something? Also, I would be so much happier if you would concat the SQL together based on $uid than doubling (almost) the same code.
  18. file_save_upload doxygen, "The file will be added to the files table as a temporary file. Temporary files" ends at 82th column.
  19. you put the $validators assigment before the cache check...? Why?
  20. I find a bit strange the $validators array AND the file_validate hook. Why do we need both?
  21. $message .= '<ul><li>' . implode('</li><li>', $errors) . '</li></ul>'; HTML tags? In Drupal 7? Heathens! theme('item_list')
  22. if ($file = file_save($file)) erm, references were invented quite long ago and I would prefer that. This is weird.
  23. file_save_data_plain really should use file_put_contents and the doxygen should mention that like file_copy and file_move doxygen mentions the relevant php functions.
  24. A numeric file id or string containing the file path just say no. Move this function into _file_load and give me wrappers for a numeric file id and another for the string.
  25. return clone($files[$file_id]); another clone with superflous().
chx’s picture

About #22, I am getting tired. $file is an object, no need for references, then... just file_save($file) will do.

drewish’s picture

FileSize
54.31 KB

hey chx, thanks for such a thorough review. i'd way rather have a hard review than having this continue to be ignored.

1. Line 249, A file object of the original file. <= I am sorry but what's that? What about "A stdClass object with properties A, B, C" listed on place and adding @see everywhere else? file_save_upload sounds good because that's where the file object is born.

TODO i tried this and couldn't come up with anything that was an improvement.

2. Drupal's 'files' directory will be used <= system_file_system_settings says t('File system path'), which might be more udnerstandable.

i think in this context that's less understandable.

3. the FILE_EXISTS_RENAME line is 88 chars long.

fixed it and all the other long comments.

4. the @see file_copy_plain() would be better at the end of the copy block.

looking else where in core the @see's are listed below @param and @return.

5. "If the source file is a temporary file, the resulting file will also be a temporary file" -- as file_save_upload explains what's a temporary file, we need a "@see file_save_upload about temporary files".

done.

6. $dest really shoudl be $destination. Everywhere in file.inc

done

7. clone($source); there is no need for () around $source, that was a trick for PHP4 w/ drupal_clone.

done.

8. When should I use and when I should use file_copy_plain? And why "plain" ? I understand it's not easy to name these functions (see drupal_get_form vs drupal_retrieve_form ) but as the doxygens are almost the same, there are no directions for the aspiring developers. From the issue it's clear I more want to use file_copy but then why I would use file_copy_plain...? Help!

TODO still needs work

9. By changing the incoming parameters to file_copy_plain , it remains a pain to debug -- experience shows that by the time the messages fire, one or another variable becomes empty.

TODO I agree, this could use some work.

10. file_destination does not handle FILE_EXISTS_REPLACE at all.

added an empty case so it's clear that nothing is done intentionally.

11. file_move_plain does not use rename or move_uploaded_file. I do not readily understand why.

that's the original drupal behavior... i'm guessing the idea was that it was a safer way to rename...

12. file_delete "Force File Deletion ignoring reference counting" Too Many Caps Words to begin with. But then again what's a reference in this content? there is documentation inside the function, why not in the doxygen? Also, consider a better word because reference and reference counting are already pwnd by Zend Engine itself.

i updated the comment to explain that this would force the file to be deleted even if hook_file_references() reports it is in use.

13. if (!$force && $references = module_invoke_all('file_references', $file)) please add () around the assignment just to avoid guessing on operator precedence.

done

14. You can save a little code by if ($return = file_delete_plain($file->filepath)) { } return $return;

personally, i like the current version. i think it makes it clearer what's returned when.

15. file_delete_plain @return description is too long, 81 chars.

fixed

16. Again, at the end of file_delete_plain you can save a return statement.

again, i like the current version.

17. file_space_used o_O is the bitwise AND ANSI SQL? Can't we use = IN or something? Also, I would be so much happier if you would concat the SQL together based on $uid than doubling (almost) the same code.

i can't really speak to this. personally i'd rather have separate sql queries than have to deal with people bitching about the security of dynamically building the where clause.

18. file_save_upload doxygen, "The file will be added to the files table as a temporary file. Temporary files" ends at 82th column.

fixed

19. you put the $validators assigment before the cache check...? Why?

good call moved, that.

20. I find a bit strange the $validators array AND the file_validate hook. Why do we need both?

because the caller wants validation and other modules want validation.

21. $message .= '

  • ' . implode('
  • ', $errors) . '

'; HTML tags? In Drupal 7? Heathens! theme('item_list')

i don't think that function existed when this patch was first started ;)

22. if ($file = file_save($file)) erm, references were invented quite long ago and I would prefer that. This is weird.

TODO i could go either way on this.

23. file_save_data_plain really should use file_put_contents and the doxygen should mention that like file_copy and file_move doxygen mentions the relevant php functions.

good call, this was written before we were php5 only.
TODO i didn't really have time to test this so i got it half way there... it uses it to write to the temp file and then call file_move_plain() so that renaming is handled.

24. A numeric file id or string containing the file path just say no. Move this function into _file_load and give me wrappers for a numeric file id and another for the string.

the current behavior is consistent with node_load and user_load...

25. return clone($files[$file_id]); another clone with superflous().

done.

leaving this as CNW until the TODOs are resolved

chx’s picture

#11 that's the original drupal behavior... <= not a really good argument as you are just throwing out the original drupal behaviour :D

than have to deal with people bitching about the security of dynamically building the where clause. <= What "people"? There are very few people who would bitch over the security of something I suggested :) But you have not answered the more important very strange bitwise SQL operator.

I still do not understand why a module can't simply define hook validators. Maybe we want to instead name/tag/whatever the file objects so the hook_validator knows when to fire?

node_load and user_load takes complex data structures or a scalar. NOT two different kind of scalars. You are opening up for really weirdo bizarre bugs when the file is named "2".

dopry’s picture

@chx, I chose the bit wise operator for the file.status column, and that is not being added in this patch. It was added in a previous patch. Therefore it's of no consequence in the review of the patch.

The time to lodge your complaints against the bitwise approach would have been when the file table restructuring and initial file handling changes were made. This is fact in D6 as well. My original intention was for this field to also include public, private, viewable, processing currently unavailable, read only etc as core status... However to date only FILE_STATUS_TEMPORARY and FILE_STATUS_PERMANENT are valid statuses.
@see: http://drupal.org/node/115267 where this column and behavior was introduced.

to answer your question regarding ANSI SQL and the bitwise AND operator.... No the bitwise AND is not a part of the ANSI SQL92 specification but is implemented in Oracle, MSSQL, MySQL, PostGres, SQLlite, Sybase, and a few others.

samirnassar’s picture

FileSize
37.63 KB

Several hunks had offsets and some hunks failed. Re-rolled against head. Also learning how to cvs diff properly.

cburschka’s picture

Parsing error; you're missing a curly brace on the end of an if-block on line 119 of file.inc.

cburschka’s picture

Status: Needs work » Needs review
FileSize
37.63 KB

Here's a fix.

Susurrus’s picture

In file_create_path, shouldn't a more sensible default be used, like the empty string, instead of a 0? It would further emphasize that $destination is a string.

Also, can the test plan in #81 be turned into a real SimpleTest? I'm not familiar enough with the framework to know if uploads are possible, but it seems like this patch is ready to roll based on others comments. I'll try to test it out later today.

cwgordon7’s picture

Uploads are definitely possible with SimpleTest. :)

aaron’s picture

update for anyone subscribed: we're in the middle of the Media Code Sprint right now, and this issue is top of the list. You're welcome to jump in for some heavy-duty testing and whatnot on this patch over the next couple of days! find aaronwinborn, drewish, or dopry at #drupal in irc if you're interested.

drewish’s picture

FileSize
53.15 KB

the patch that steamedpenguin posted in #143 is lost a bunch of changes (54k down to 37k). i've merged his re-roll and my last patch on #137. i also changed the simpletest module's call to file_delete so that tests don't totally fail.

aaronwinborn and i are working on getting a baseline of tests that pass before applying this patch vs those that work afterwards.

drewish’s picture

FileSize
58.27 KB

here's the start of a file.test file that aaronwinborn and i worked on today. mostly just trying to get a feel for the testing system.

aaron’s picture

file_create_filename should probably be called file_create_filepath, based on what it does. not sure if we want to get into that in this patch, but i wanted to record that thought while it was freshly thunk.

aaron’s picture

or maybe it should even be file_create_new_filepath, since my first recommendation could be confusing if a developer thought it would simply create the path to an existing file, which the function name could be interpreted as doing without reading the documentation.

Senpai’s picture

In the interests of naming things in plain, understandable, non-RTFM style, I vote a hearty +1 for file_create_new_filepath. Nice call.

aaron’s picture

Status: Needs review » Needs work
FileSize
65.97 KB

This patch rolls the latest additions to file.test into #150. It includes more complete documentation, and FileSaveTest. It also includes better documentation for file_check_location, which previously confused the heck out of me while writing tests... Note that two tests included currently fail. And there are more tests to be written.

jhedstrom’s picture

FileSize
70.91 KB

This patch adds to #154, creating a FileDirectoryTest class with many directory handling tests, and moves one test (testFileDirectoryPath) from FileSaveTest to FileDirectoryTest. The patch also fixes an error in file_check_directory() where it would return true if a directory wasn't writeable, but the mode was 0. I also edited the comments referring to file_create_directory(), which is a non-existent function, to correctly refer to file_check_directory().

drewish’s picture

Status: Needs work » Needs review
FileSize
73.49 KB

this adds in the rest of the file_validate_* tests.

floretan’s picture

FileSize
71.24 KB

I took a look at the two failing tests. The first one points out a bug with file_check_location(), so I'm creating a separate issue for it.

The second failing test is a simple $this->assertEqual() instead of $this->assertNotEqual(). Here's a patch with that minor change.

Senpai’s picture

Status: Needs review » Needs work

The testing environment
Tested against a clean copy of HEAD, but not a clean db.

9 PHP notices against core form handling in system.admin.inc
With patch #157 applied, I get a

notice: Undefined index: #value in /www/drupalhead/modules/system/system.admin.inc on line 776

on the admin/build/modules page, which triggers about 8 other related notices as well. Line 776 looks like this:

<?php
772.  else {
773.    $form['enable'] = array(
774.      '#markup' =>  theme('image', 'misc/watchdog-error.png', t('incompatible'), $status_short),
775.    );
776.    $form['description']['#value'] .= theme('system_modules_incompatible', $status_long);
777.  }
?>

Simpletests
When running the File tests, Directory Tests passed, File Save failed, and File Validation passed. The single point of failure in File Save is the

Non-existent file fails when checked against realpath().

which happens in the testFileCheckLocation() function. Since this is a separate issue now, I'll paste a link in here once flobruit makes that issue know to me. (Florian!!! Send me the linky!)

File uploads and saves
While testing the capability of Drupal to upload, attach, save, and delete files, well, I didn't come across any problems. Nor did I find anything that could even remotely be considered a bug.

EDIT: Woops, I found something right after I wrote that. warning: strtr() [function.strtr.html]: The second argument is not an array in /www/drupalhead/modules/syslog/syslog.module on line 106. occurs when attempting to delete a picture that's been attached to an article by a user greater than uid1.

I'm calling this RTBC once the other issue gets resolved.

drewish’s picture

senpai, we don't touch that part of system.admin.inc... are you sure you don't get the same warning without the patch?

Senpai’s picture

I'll clean house until I get a virgin sandbox, and try again.

EDIT ~ Yup drewish, A fresh copy of core plus a fresh db yields these same "line 776" notices. I tried it with and without the #157 patch. It's not me, it's you. I mean, it's not this patch, it's core. ;-)

As soon as that one little problem with testFileCheckLocation() gets fixed, this baby is ready to take to the skies. Over a year later too. W00t!

floretan’s picture

Status: Needs work » Needs review
FileSize
71.08 KB

Apparently, since PHP 5.2, most distributions' PHP package has a patch which alters the behavior of realpath(): http://ch2.php.net/manual/en/function.realpath.php#82770

Since realpath() can have two different return values for a non-existing file, we can't test for it, attached patch removes the bad assertion. The file tests now run without failures on Ubuntu Hardy (which uses the BSD version of the realpath() behavior).

After a quick look through the instances of realpath(), the only place in core that seem to be dependent on the alternate behavior of realpath() is in file_check_location().

Senpai’s picture

Status: Needs review » Reviewed & tested by the community

Tested again, and still working.

chx’s picture

Nice job. I reviewed my list above and I agree with this being RTBC.

drewish’s picture

Status: Reviewed & tested by the community » Needs work
FileSize
57.53 KB

using the patch in #161 i still get the fail on OSX for the "Non-existent file validates when checked for location in directory, but name contains a non-existent subfolder." test.

organized the tests and added tests for file_delete(), file_delete_plain(), and file_load(). the tests for file_load() found a fatal error when loading an invalid filepath or fid so i fixed that bug.

drewish’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
20.73 KB

i didn't mean to switch the status... though my patch didn't include file.test. i'm going to keep writing tests and rather updating the whole patch I think I'll just keep file.test separate and post it by itself.

drewish’s picture

Status: Reviewed & tested by the community » Needs work
FileSize
58.42 KB
26.39 KB

the patch makes a change so you can create "temp" files by not passing in a name (basically the same as if you passed in an empty name it just documents the behavior). also tries to get rid of the =0 defaults in favor of more applicable values. also adds some @see docs.

more tests, this covers file_save_data_plain(), file_save_data(), file_move_plain() and file_move(). i'm getting failures on file_move that don't seem right. out of time to debug them now so i'm posting the tests as is.

drewish’s picture

Read over the #134-136 and came up with a small list of things still to do with this patch:

* Finish writing tests for file_move*() and file_copy*().

* It would be really helpful if someone could write up a paragraph or two for the top of the file to document the file object format. The files table schema definition would probably have some useful bits to copy and paste.

* Read over the docs for file_copy/move/delete/save_data functions and compare them to their plain counter parts. Make sure it's clear when each function should be used.

* Give file_load() a really hard look. I'm coming to appreciate chx's concern about accepting two different kind of scalars as a parameter opening up the possibility for really weirdo bizarre bugs when the file is named "2".

* Another, lesser concern for file_load() is the caching of loaded files. Should we be clearing the cache after a file is moved or deleted? node_load() does similar caching but doesn't seem to be called to reset the cache on delete or save.

drewish’s picture

looks like the file_check_location() problem with non-existant paths has been known for a while: #200586: file_check_location and non-existing directories

robertDouglass’s picture

Subscribing and adding words of encouragement. When Dries returns from vacation I'm sure he'll feel it's like Christmas with such a nice patch waiting for him!

bdragon’s picture

subscribing

mfer’s picture

What needs to happen to get this RTBC?

drewish’s picture

mfer, scroll up a couple of comments and read #167 ;)

greg.harvey’s picture

I was going to raise a feature request, having picked through the Form Interface stuff, which was going to be something like this:

PROBLEM:
file_save_data() does not add your file to the 'files' table. However, file_save_upload() does.

RESOLUTION:
Make file_save_upload more flexible, so it doesn't have to get the file from a form field - it could be passed as binary data in variable after being received over a web service, or something like that.

It seems this new hook will resolve this issue though, right? It looks like it, at first glance. For Drupal 6.x I guess I will have to manually insert in to the 'files' table. =(

drewish’s picture

greg.harvey, how about actually trying out the patch? if you did you'd notice file_save_data() puts records into the database while file_save_data_plain() does not. i know it's almost as 200 comment issue but at least look at the patch before chiming in.

jhedstrom’s picture

FileSize
58.17 KB

Re-rolling to patch cleanly. Also removed references to dsm() from file.inc. The file.test from #166 is still the latest set of tests, although many of them are currently failing. If I get a chance tomorrow I'll try to get those fixed and finish the tests mentioned in #167.

patching file includes/common.inc
Hunk #1 succeeded at 1819 (offset 19 lines).
Hunk #2 succeeded at 1920 (offset 19 lines).
Hunk #3 succeeded at 2283 (offset 24 lines).
Hunk #4 succeeded at 2294 (offset 24 lines).
patching file includes/file.inc
Hunk #18 FAILED at 705.
Hunk #33 succeeded at 1598 (offset 372 lines).
1 out of 33 hunks FAILED -- saving rejects to file includes/file.inc.rej
patching file modules/blogapi/blogapi.module
patching file modules/simpletest/simpletest.install
patching file modules/simpletest/simpletest.module
Hunk #1 succeeded at 556 (offset 6 lines).
patching file modules/system/system.admin.inc
patching file modules/system/system.module
patching file modules/upload/upload.admin.inc
patching file modules/upload/upload.module
patching file modules/user/user.module
Hunk #4 succeeded at 1535 (offset 3 lines).

greg.harvey’s picture

greg.harvey, how about actually trying out the patch? if you did you'd notice file_save_data() puts records into the database while file_save_data_plain() does not. i know it's almost as 200 comment issue but at least look at the patch before chiming in.

Woah, that's a bit harsh!

I can't try the patch because I'm not running Drupal 7.x - perhaps that wasn't clear enough, but I did elude to Drupal 6.x in my original post. I was actually trying to do the decent thing and make sure this was being addressed in 7.x *before* raising a feature request in Drupal 6.x. Since I don't have the luxury of *any* free time on my current project, I think it's grossly unfair of you to have a pop at me for not spending several hours installing a totally new instance of D7 just to see how a patch works. Sure, I could've read the patch file - but I read the jist of the post instead and decided it was easier to just seek clarification.

A simple "Yes" would have sufficed. I didn't need a lecture.

drewish’s picture

greg.harvey, my comment was blunt but there's already been a ton of noise in this issue and your comment didn't add anything to it. i apologize if i hurt your feelings.

chx’s picture

"When should I use and when I should use file_copy_plain? And why "plain" ? I understand it's not easy to name these functions (see drupal_get_form vs drupal_retrieve_form ) but as the doxygens are almost the same, there are no directions for the aspiring developers. From the issue it's clear I more want to use file_copy but then why I would use file_copy_plain...? Help!"

Edit: in general the issue is stalled because of this: " I care, though not enough to dive through the issue and read up on it and test it (I wouldn't even know where to start), etc. It's just so overwhelming" quoted straight from #drupal. We either need smaller pieces or we need testing instructions or something. There is absolutely no writeup about what does the patch do, what were the design considerations... nothing. It's very hard to comprehend such a big change without any help whatsoever.

webchick’s picture

Testing instructions and issue summary can go at Patch spotlight. That's what it's there for; so people don't have to read all 400 replies to help get something in.

jhedstrom’s picture

FileSize
87.92 KB

The attached patch adds the ability to test for file_hook_*() being invoked via a file_test helper module which implements the various hook_file hooks (although not all of them yet). I've moved file.test back into the patch, since the patch also adds file_test.module and file_test.info. Many tests still fail (from actual errors in file.inc from what I can tell).

Edit: changed file.inc to file.test, which is now back in the patch.

drewish’s picture

FileSize
62.19 KB

Following up on jhedstrom's work. Expanded out the file_test module to catch all the file_* hooks and a couple of functions to see which have been called.

Working on my issue list from comment #167, I went ahead and changed file_load() to take either an integer file id or an array with search criteria. And based on my work with the hook_file that dopry built into filefield I've refactored the code from file_save_upload() that did validation into file_validate() making it more re-usable.

There's some changes to file_copy_plain() to handle bad scenarios like copying a file onto itself.

Leaving it as CNW because there's still some documentation to be written:
- A paragraph or two for the top of the file to document the file object format.
- Make sure it's clear when the docs for file_copy/move/delete/save_data functions should be used and when the _plain versions should be used.

And it needs unit tests for:
- file_copy()
- file_set_status()
- file_save_upload()

drewish’s picture

FileSize
62.19 KB

This adds a test for file_copy(). And modifies file_save_data() to use the new file_get_mimetype(). I've got a partially working upload test form working but curl doesn't seem to want to POST for me for any tests.

drewish’s picture

FileSize
102.38 KB

jhedstrom brought it to my attention that my patch didn't include the tests.

jhedstrom’s picture

FileSize
102.71 KB

I was able to get the upload test working, although hook_file_save is currently not getting called. I think this is a problem with the file_test module's implementation, rather than a bug in file.inc.

I didn't get a chance to separate out the upload tests into their own class.

webchick’s picture

I would like to give this a thorough look over this weekend. So if there's a way to get it RTBC by then, that would be awesome. :)

drewish’s picture

Status: Needs work » Needs review
FileSize
104.71 KB

okay got all the tests working (generated a few issues for simpletest in the process) and i'm going throw this out for reviews.

Status: Needs review » Needs work

Patch failed to apply. More information can be found at http://testing.drupal.org/node/14128. If you need help with creating patches please look at http://drupal.org/patch/create

Patch failed to apply. More information can be found at http://testing.drupal.org/node/14126. If you need help with creating patches please look at http://drupal.org/patch/create

Patch failed to apply. More information can be found at http://testing.drupal.org/node/14125. If you need help with creating patches please look at http://drupal.org/patch/create

Patch failed to apply. More information can be found at http://testing.drupal.org/node/14124. If you need help with creating patches please look at http://drupal.org/patch/create

Patch failed to apply. More information can be found at http://testing.drupal.org/node/14127. If you need help with creating patches please look at http://drupal.org/patch/create

Patch failed to apply. More information can be found at http://testing.drupal.org/node/14130. If you need help with creating patches please look at http://drupal.org/patch/create

Patch failed to apply. More information can be found at http://testing.drupal.org/node/14129. If you need help with creating patches please look at http://drupal.org/patch/create

drewish’s picture

Status: Needs work » Needs review
FileSize
63.69 KB

okay so on the odd chance those last 6 comments were correct i've re-rolled the patch...

drewish’s picture

FileSize
104.72 KB

for got the -N on the cvs diff... this isn't just a shameless attempt to get this to 200+ comments.

webchick’s picture

Awesome!! I've allocated 2 hours tomorrow to dedicate to this patch. Thanks so much, drewish.

webchick’s picture

So it turns out 2 hours is not nearly enough, but here's what I managed to get done. This is my first major API patch review, so please be kind. :D

So what's this all about then?
Step 1 was figuring out what exactly this patch does, which took the bulk of the time. Unfortunately, at ~200 replies, it's really difficult to sift through all the comments and determine which are still valid and which are not. Please add this information to http://drupal.org/patch/spotlight to make it easier for testers. I've put the bulk of the contents of this post over there as a starter.

According to what I've studied, this table summarizes the changes made in this patch. Please feel free to correct me if I'm wrong (or better yet, correct patch spotlight! :D).

Drupal 6 Drupal 7 Description
n/a file_copy() Copies a file to a new location and adds a file record to the database. Also invokes hook_file_copy() so that other modules may act on the copy action.
file_copy() file_copy_plain() Copy a file to a new location without saving a record in the database.
n/a file_move() Move a file to a new location and update the file's database entry. Also invokes hook_file_move() so that other modules may act on the move action.
file_move() file_move_plain() Move a file to a new location but make no changes to the database.
n/a file_delete() Delete a file and its database record. Also involes hook_file_delete() to let other modules perform clean-up actions when file is deleted.
file_delete() file_delete_plain() Delete a file.
n/a file_save_data() Save a string to the specified destination and create a database file entry.
n/a file_load() Load a file object from the database. Also invokes hook_file_load() to allow other modules to do things as the file is loaded.
n/a file_validate() Check that a file meets the criteria specified by the validators. Accepts an associative array of callback functions used to validate the file. Also calls hook_file_validate() to let other modules perform validation on the new file.
n/a file_save() Save a file object to the database. Calls either hook_file_insert() or hook_file_update(), depending on whether a $file->fid is specified.

There are obviously some other things, such as tests for file API!!!! :D, renaming $dest to $destination in various places, and cleaning up bucket-loads of code comments which cant help but jump out at you when you're staring at something like file.inc for 300 hours. ;) But this is the major jist: treat file operations similar to node operations, and let other modules step in and do things when files are being manipulated in the database. We also retain a group of low-level file manipulation functions which just do their basic deeds (delete a file) without invoking hooks or manipulating the database. This is to allow for things like the CSS aggregator or favicon.ico upload, which don't need that level of overhead.

Patch review plz!
Ok, so now on to the patch review. Unfortunately, this is only a partial review because that sucker is BIG and most of my time was spent figuring out what exactly was going on. Anyway!

This is absolutely beautiful, gorgeous work. This addresses a number of glaring inconsistencies in file.inc and starts to make working with files in Drupal really powerful. Two of the top 13 things in Dries's hit-list at Szeged are images handling in core and better document management in core. I feel like this patch is going to get us there. :)

I have two classes of feedback: actual "looks like a bug" stuff, and nit-picky webchick stuff.

Buggy stuff
I know that this has been brought up by others, and I know you're sick of hearing it, but _plain() just... it just doesn't work. :( My first thought upon reading the patch was that the _plain() extension meant that this function was for dealing with plaintext files. Obviously, this is completely and utterly wrong. So far the only alternative I like is file_save_fs() but then it becomes "why not file_save_db()?" and if we do that, it ruins the simple elegance of the API as you have it currently. Maybe we can brainstorm on IRC about it this week?

Some code-level stuff:

   file_scan_directory(file_create_path('js'), '.*', array('.', '..', 'CVS'), 'file_delete', TRUE);
+  file_scan_directory(file_create_path('js'), '.*', array('.', '..', 'CVS'), 'file_delete_plain', TRUE);

Looks like you forgot to remove the line above this that's referencing the old file_delete() function?

+ *     Files that are temporary will removed be removed during

during...? :) Also, this line should not be indented, for consistency with filename.

+    // TODO: test the reset parameter

Is this (and other) TODOs in this patch pending the bugs you found with simpletest.module, or..?

-    if ($user->picture && file_exists($user->picture)) {
-      file_delete($user->picture);
+    if ($user->picture) {
+      file_delete_plain($user->picture);

There are a couple of places like this in user.module where you've removed file_exists() wrappers. Just curious, why is that? Are we instantiating this earlier in the process so we know that it's guaranteed to be there, or..?

Nit-picky stuff

+ *   filename - Name of the file with no path components. This may differ

Needs a - before filename like the line above it.

+ * Move a file to a new location but make no changes to the database.

The description for file_copy_plain() is "Copy a file to a new location without **saving a record** in the database." Let's make all of these functions' descriptions consistent, so it's clear they all belong together. Same with the PHPDoc descriptions of the ones that do save to the db.

+  // Let other modules perform validation on the new file.
+  return array_merge($errors, module_invoke_all('file_validate', $file));

I really liked seeing inline comments like this to explain what other modules might do with the hook. Could you add a little sentence like this to each one?

Ok, what next?
In order for this to be committed, a couple of things need to happen:

  1. I need a pow-wow with Dries about it; since this is a pretty major thing that touches a lot of stuff, I wouldn't feel right about committing it without running it past him first. At least not until I get a couple hundred more commits under my belt. ;) j/k
  2. I would love for everything (or at least most things) that touches a file to be run through by a human. There are enough weird things going on with our testing framework that it'd be nice to get old-fashioned human confirmation on some of this stuff. As far as I can tell, that means the following:
    - Adding, re-adding, and deleting a user picture, and posting a comment and node with user pictures enabled on the theme to ensure that they show up properly.
    - Adding, re-adding, and deleting a favicon and site logo in the themes settings.
    - Adding, re-adding, and deleting multiple files through upload.module.
    - Post a file to Drupal with an desktop editor that uses BlogAPI.
    - Testing the CSS and JS aggregators, clearing cache.
    - Testing picking various color schemes in Color module.
    I estimate this won't take someone longer than about 15-20 minutes to do.
  3. I need to find another uninterrupted chunk of time to sit down and go over this again -- for example, I only glanced through the .test file and didn't get a chance to run it yet. Drewish, are you coming to the Lullabot Portland meetup next Tuesday? If so, maybe you, Nate, and I could hammer out any final issues if they remain, and I'll work on "sitting down" with Dries between now and then and get his shortlist as well?
drewish’s picture

Status: Needs review » Needs work
FileSize
108.29 KB

webchick, i've addressed the bugs things that you'd mentioned. basically removed the line, fixed the comments, and restored the file_exists (i'm not sure why that changed). i've also started making some additions to the spotlight page. i noticed that there seems to be an issue with a couple of the user and upload tests so i'm bumping this back to CNW until i can dig into it further.

as for the _plain names i agree that it's not good but it's the least bad we've come up with. perhaps _bare? _raw? _dont_use_this? ;) i'd rather that the db variations be the shorter names because that's what most modules will be using, so a bit more typing for the non-db ones isn't that big of a deal.

i will be at the lullabot even next week and i'm looking forward to torturing nate with this further ;)

drewish’s picture

Status: Needs work » Needs review

got the tests working again. upload was failing because of a UI string change and user was failing because the path needed to be relative to the site root rather than the files directory.

drewish’s picture

FileSize
110.52 KB
scroogie’s picture

drewish, webchick: If I understand correctly, file_delete_plain() checks for non-existing files, so I think it's unneccessary to check again. Additionally, having an entry in $user->picture without the file being existent should be an illegal state and thus be logged (which file_delete_plain() does), so I think it's even _better_ without the file_exists() check.

jpetso’s picture

For the _plain() issue, maybe it could help to swap the verb and the suffix? I would think that file_plain_copy() is more intuitive to read than file_copy_plain(). It might also open up more suffixes prefixes to consider, e.g. file_pure_copy(), file_unmanaged_copy(), or whatever you think of.

catch’s picture

file_copy_raw (or file_raw_copy) sounds better to me than _plain. Maybe _unmanaged would work too.

drewish’s picture

scroogie, re: the file_exists() the reason I restored it to user.module is that file_delete_plain() logs a watchdog message and it seemed like since it was going to be overwritten any way it wasn't worth cluttering it up. i didn't think it mattered much either way and in cases like that i'll usually just skip the change.

jpetso, i think you're absolutely correct about switching the ordering. file_plain_* or file_unmanaged_* sound way better. i'm actually a big fan of file_unmanaged_* it's very descriptive and it's long to type so you'll be annoyed everytime you have to and finally break down and use the managed functions.

jpetso’s picture

As for the ordering, I'm actually totally indifferent on that, it was just an idea to maybe get better ideas.

I only thought of _unmanaged when I had already written the whole post, and the more I think about it, the more I like it. Might sound a little C#-ish, but it seems like a nice mixture of "caution, don't do that unless you know what you're doing" and actually describing what it does (or rather, doesn't) accomplish. Putting it in front of the verb would also have the effect that autocompleting IDEs don't show it together with the "managed" function but apart from those in a separate section.

jpetso’s picture

FileSize
110.53 KB

And as drewish seems to like it, let's play my 1337 regex skills and have the exact same patch from above with file_unmanaged_* instead of file_*_plain.

drewish’s picture

i haven't eyeballed my way through jpetso's changes but i ran all the tests and it didn't affect those.

scroogie’s picture

drewish: As I said, I think a watchdog message is really adequate here, because $user->picture shouldn't point to a file that doesn't exist. Or am I wrong with that assumption? If you leave the file_exists() in the if() this illegal state is just skipped without reporting it.

drewish’s picture

Status: Needs review » Postponed

in the interest of getting this freaking thing moving, i've split off most of the non-db and non-hook related changes from this patch and created #308434: Clean up file.inc ahead of hook_file and add unit tests.. it should be much easier to review since the changes are much more targeted. please give it a look.

drewish’s picture

Status: Postponed » Needs work

okay webchick helped me to get #308434: Clean up file.inc ahead of hook_file and add unit tests. committed just now so this should be a lot smaller patch that focuses solely on the cool new database backed files and hooks. it's going to be fun to re-roll this...

drewish’s picture

FileSize
58.82 KB

Here's a re-roll, it's looking a lot lighter and easier to review.

There's something broken in simpletest_clean_temporary_directories() that's causing a bunch of exceptions during clean up which is affecting every test (even non-file ones). It'd be nice if someone else could rerun the tests for me to make sure it's not just my machine. Marking CNW until this is sorted out.

drewish’s picture

Status: Needs work » Needs review

turns out that was a problem with my files directory. this patch is okay to review.

drewish’s picture

FileSize
58.69 KB

re-rolled to fix a few rejects around the followup to #308434: Clean up file.inc ahead of hook_file and add unit tests..

jpetso’s picture

FileSize
58.84 KB

When your mind feels narrowed and you're lazy as hell, boring janitorial stuff might be the appropriate response. Consequently, yay for fixing a few spelling errors and ending sentences with dots. I guess I should feel embarrassed now.

Questions:
- Is there any logic behind marking asserts with the 'File' group and not doing so?
- In file.test, testFileSaveData(), why is there a drupal_get_messages() that's commented out? (Shouldn't it be either in use or not exist at all?)

+    $this->assertTrue(file_delete($this->file), t("Delete worked."));

If file_delete() returns either TRUE, FALSE or an array with references, then this would also catch references, right? Does it make sense to have it as

+    $this->assertTrue(file_delete($this->file) === TRUE, t("Delete worked."));

or am I missing something?

Other than this pointless nitpicking, I don't find coding issues. Not tested functionality-wise, but style and concepts are fine. (I had the pleasure of working with the latter in filefield, and they also work in practice :P )
Good work!

If someone rerolls the patch instead of modifying the .patch file (yeah, *that* lazy), you might also replace the remaining "existant" with "existent" in file.test. Not that someone would care about spelling errors or missing sentence-trailing dots in comments.

Status: Needs review » Needs work

Patch failed to apply. More information can be found at http://testing.drupal.org/node/14417. If you need help with creating patches please look at http://drupal.org/patch/create

jpetso’s picture

Status: Needs work » Needs review
FileSize
58.7 KB

Merging my changes into drewish's new patch. Sorry for the testbot and overall being an annoyance today.

drewish’s picture

Status: Needs review » Needs work

jpetso looks like you just edited the patch? i'm hoping you documented all the changes you made...

drewish’s picture

Status: Needs work » Needs review

looks like we cross posted...

drewish’s picture

FileSize
58.84 KB

jpetso, thanks for taking a look at this.

Is there any logic behind marking asserts with the 'File' group and not doing so?

Uh, I think that was just having different people working on the unit tests. I did some cleanup but there still not a lot of consistency to them.

+ $this->assertTrue(file_delete($this->file) === TRUE, t("Delete worked."));

Excellent point, I've changed this to assertIdentical() which uses ===.

If someone rerolls the patch instead of modifying the .patch file (yeah, *that* lazy), you might also replace the remaining "existant" with "existent" in file.test. Not that someone would care about spelling errors or missing sentence-trailing dots in comments.

got those fixed.

smk-ka’s picture

Status: Needs review » Needs work
  • file_unmanaged_save_data() contains a unnecessary global $user; at the beginning.
  • file_set_status() always executes the part inside the if() statement unless there is an SQL error (which would be quite unusual at that point), I suspect the condition should read if (db_affected_rows(...)).
drewish’s picture

Status: Needs work » Needs review
FileSize
61.25 KB

smk-ka, right on both counts. i've fixed both issues and in the process realized that there wasn't any test coverage on file_set_status() so i added a basic one.

drewish’s picture

marked #6005: file_move() should return the new object on success as a duplicate since this would do exactly that.

drewish’s picture

Status: Needs review » Postponed

I'm going to postpone this until #310358: Add a test for file_save_upload and clean up file.test gets committed. It'll require some big cleanups in these tests too.

drewish’s picture

Status: Postponed » Needs work
FileSize
65.09 KB

here's are re-roll with the cleaner test format. adds a new base test class to handle the hook tests. the upload test is failing because the validate and insert hooks aren't being called. need to figure that out.

drewish’s picture

Status: Needs work » Needs review
FileSize
66.14 KB

got all the tests passing. flushed out a bunch of the tests for the new managed file code.

drewish’s picture

FileSize
70.38 KB

At chx's urging I've been working on a write up to explain what's screwed up with file.inc and how this patch addresses it. I just posted this over on the patch spotlight page but I'll post it here so everyone can see it. Also posting a re-roll of the patch to get rid of fuzz.

The problems

Currently Drupal's provides little to no support for tracking files in the database. The file_save_upload() function creates temporary file records but provides no support for loading or saving records after that point.

When a module moves, deletes or alters a file they need to be sure to update the database. The burden of loading and saving the database records falls to each contributed module that needs to work with files. Every module re-implements the same functionality, and each additional implementation brings the possibility of new bugs and incompatibilities between modules.

Drupal also lacks any mechanism for providing notification and confirmation of file related events. The files table was re-designed in D6 to allow modules to share files but the lack of a mechanism for modules to notify each other of changes has prevented it's use. One module can delete a file and the others will have no way of knowing short of specifically checking for it's existence in the database.

A solution

The primary motivation of this patch is to make it easier for contributed modules to correctly deal with files. This is achieved by moving the management of file database records into core, and adding a set of hooks that inform modules of changes and provide a mechanism for affecting these changes.

This patch promotes files to a first class Drupal object by adding file_load() and file_save() for loading and saving file database records and adding several file hooks so modules are notified of changes to the files and can influence these changes.

Since there are times when you only need basic file manipulation--such as uploading a site logo--that don't require the overhead of databases and hooks, the current unmanaged copy, move and delete operations have been preserved but given new names.

meba’s picture

subscribing

CalebD’s picture

Status: Needs review » Needs work

I decided to give this a shot. This is a functional review, not a code review.

I tested the following:

- Adding, re-adding, and deleting a user picture, and posting a comment and node with user pictures enabled on the theme to ensure that they show up properly.
- Adding, re-adding, and deleting a favicon and site logo in the themes settings.
- Adding, re-adding, and deleting multiple files through upload.module.
- Testing the CSS and JS aggregators, clearing cache.
- Testing picking various color schemes in Color module.

Most of the above worked without a problem, however I ran across two issues:

1. When removing an attachment to a story node, I got a PHP warning:
warning: strtr() [function.strtr]: The second argument is not an array in /usr/local/apache2/htdocs/drupal/head/dev2/modules/syslog/syslog.module on line 106.

That may be a problem with my Linux server's syslog, I haven't had time to check. Of course when the syslog module was disabled, the warning did not occur.

2. When downloading an attachment with the file system set to private, the download worked, but upon next page load I got:
notice: ob_end_clean() [ref.outcontrol]: failed to delete buffer. No buffer to delete. in /usr/local/apache2/htdocs/drupal/head/dev2/includes/file.inc on line 1172.

Looks like an unnecessary ob_end_clean() on that line?

Marking as code needs work because of the above issues. I'll try to review the code when I find more time.

Oh, and all tests passed when I ran them.

jpetso’s picture

Error 2 from the above list has already been there before the patch, and I believe somewhere further up in this issue there is a statement that this shouldn't be fixed by this patch. Don't know about error 1, therefore leaving the issue state at "code needs work".

drewish’s picture

Status: Needs work » Needs review
FileSize
70.91 KB

CalebD, thanks for taking the time to run though all those tasks.

The first point is a bug in the current version of file_delete(). Its watchdog() call is incorrectly calling t(). I've fixed that in the attached patch.

As jpetso indicated your second point is a known issue: #205227: If output_buffering = Off then ob_end_clean fails to delete buffer

RobLoach’s picture

Is there any reason why there's a drupal_get_messages() in there at all? It's good to have the messages and error handlings during tests.....

Index: modules/simpletest/drupal_web_test_case.php
===================================================================
RCS file: /cvs/drupal/drupal/modules/simpletest/drupal_web_test_case.php,v
retrieving revision 1.45
diff -u -r1.45 drupal_web_test_case.php
--- modules/simpletest/drupal_web_test_case.php	20 Sep 2008 20:22:24 -0000	1.45
+++ modules/simpletest/drupal_web_test_case.php	30 Sep 2008 19:51:58 -0000
@@ -312,7 +312,7 @@
       }
     }
     // Clear out the error messages and restore error handler.
-    drupal_get_messages();
+#    drupal_get_messages();
     restore_error_handler();
   }

I got one error after running the Upload and File tests:

File sites/default/files/simpletest/html-1.txt was allowed to be uploaded Other upload.test 121 UploadTestCase->testFilesFilter()

Other then that, I was able to upload and delete file attachments to nodes without any problem.

drewish’s picture

FileSize
70.26 KB

RobLoach, thanks for taking the time to review the patch.

i shouldn't be commenting that out, it was done for debugging purposes. i've removed that hunk of the patch. see #314112: Add an easy way to write debug statements for some discussion on the topic of messages though.

as for the the failing tests, i ran into those and posted #314351: UploadTestCase improvements but it's hard to replicate. i'm guessing that if you uninstalled simpletest, deleted the files/simpletest directory, and re-installed you wouldn't have the problem any more. but try the patch first to confirm that it works for you both before and after.

meba’s picture

Status: Needs review » Needs work

Doesn't apply to HEAD anymore :(

RobLoach’s picture

I worry, are we starting to eat baby kittens here?

jpetso’s picture

Status: Needs work » Needs review
FileSize
70.26 KB

No baby kittens, this is one coherent piece of code. The test cases for existing functionality and other fixes have been split out for the largest part, what remains is really just the "make file.inc database aware" functionality and associated new test cases that wouldn't make sense otherwise. Heck, if I had just a bit of motivation for testing stuff, I'd RTBC this in a minute.

Oh, and attached is the patch that applies to current HEAD.

meba’s picture

Tested following:

- Adding attachments to nodes, listing/delisting, deleting them.
- Switching to private files, downloading
- CSS/JS aggregation, color module, favicon & logo
- User picture, reupload, delete
- Changing file system path

Without any error.

I've run simpletest, with some errors, not sure about them:
Upload: 102 passes, 24 fails, 2 exceptions
User - Upload user picture: 61 passes, 12 fails, 0 exceptions

Anonymous’s picture

FileSize
67.6 KB

the patch applies with some fuzz to HEAD. ran all of the file, user and upload tests, 567 passes, 0 fails, 0 exceptions.

don't have time to kick the tyres more than that, but here's an updated patch without any fuzz. can anyone else reproduce test errors? looks good to me.

drewish’s picture

FileSize
73.93 KB

webchick went over the patch this morning and pointed out some things that needed to be fixed. Highlights:
* Added @see hook_file_* to each function to make it easier to get the docs for each function.
* Updated the file.inc and changed parts of upload.module to use TNGDB calls.
* Added PHPDocs to all the file.test test functions.
* The op killing patch had added upload_nodeapi_delete() and upload_nodeapi_delete_revision() that did nothing more than call upload_delete() and upload_delete_revision() respectively. This seemed silly since upload_delete() and upload_delete_revision() weren't called from any where else so I just merged the code up into the upload_node_delete*() functions.

webchick’s picture

+  elseif (is_array($param)) {
+    // Turn the conditions into a query.
+    $cond = array();
+    $arguments = array();
+    foreach ($param as $key => $value) {
+      $cond[] = 'f.' . db_escape_table($key) . " = '%s'";
+      $arguments[] = $value;
+    }
+    $result = db_query('SELECT f.* FROM {files} f WHERE ' . implode(' AND ', $cond), $arguments);
+  }

Does the new database layer give us a nicer way to do this by any chance?

I spent a good hour on this patch earlier. Want to give it another once-over probably tomorrow night (esp. the tests where I kind of spaced out for a bit), but the "meat" of this patch has the strong webchick seal of approval. :)

drewish’s picture

webchick, i couldn't find one. i need to track down crell and pick his brain but it doesn't seem like there's an easy way to do a select * in a dynamically built query: http://drupal.org/node/310075#comment-1047879

drewish’s picture

FileSize
76.03 KB

webchick, i opened an issue for the whole select * problem: #318440: Make it easier to add multiple fields in a dynamic SELECT

the attached patch splits the file_load() and file_save() tests into their own classes.

Dries’s picture

I did a good review and this patch seems RTBC to me. It is simple, it is clean, and it is well-tested. I have no issues with it, but I'll leave it up to webchick to commit as she's been on top of it more than I have.

According to http://acquia.com/files/test-results/index.html, file.inc has 89% code coverage (pre-patch), I'll be interested to see what it will have after this patch lands.

Excellent work drewish et al!

webchick’s picture

YAY!!!

Ok, I need to finish up some phone calls, then I'll give this one last look-over and commit in a couple hours. :) Then.. .UNSTABLE-2!

Crell’s picture

DBTNG version of the snippet in #239:

$select = db_select('files', 'f');
$select->addField('f', 'fid');
$select->addField('f', 'uid');
$select->addField('f', 'filename');
$select->addField('f', 'filepath');
$select->addField('f', 'filemime');
$select->addField('f', 'filesize');
$select->addField('f', 'status');
$select->addField('f', 'timestamp');

foreach ($param as $key => $value) {
  $select->condition($key, $value);
}

$result = $select->execute();
drewish’s picture

Crell, actually that's not quite correct. Until #316868: Default SelectQuery::addField() alias to $field, not $table_$field is committed it would have to be:

$select = db_select('files', 'f');
$select->addField('f', 'fid', 'fid');
$select->addField('f', 'filename', 'filename');
$select->addField('f', 'filename', 'filename');
$select->addField('f', 'filepath', 'filepath');
$select->addField('f', 'filemime', 'filemime');
$select->addField('f', 'filesize', 'filesize');
$select->addField('f', 'status', 'status');
$select->addField('f', 'timestamp', 'timestamp');

foreach ($param as $key => $value) {
  $select->condition($key, $value);
}

$result = $select->execute();

All I got to say about that way of doing the query is "yuck!" Lets hold off on changing that until there's a better DBTNG idiom for adding multiple fields to a query.

jpetso’s picture

Such as, say, this? Not sure if it really needs to be more specialized than that.

$select = db_select('files', 'f');
$fields = array('fid', 'filename', 'filepath', 'filemime', 'filesize', 'status', 'timestamp');

foreach ($fields as $field) {
  $select->addField('f', $field, $field);
}
foreach ($param as $key => $value) {
  $select->condition($key, $value);
}
$result = $select->execute();
webchick’s picture

Hm. Neither one of those are much more legible than what's there now, and what's there now has an important thing functionality-wise where other modules that hook_schema_alter() additional fields onto the files table would also get their stuff loaded in one query, rather than a join on another table.

So I'm fine with leaving it like that for now. Making final pass now.

drewish’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
77.04 KB
4.61 KB

one last tweak that removes a couple of mystery $source = NULL parameters to the hook_file_* functions in the upload.module. not sure where those came from but i noticed them when i was trying to put together some documentation for the hooks. attached a really rough version as a .txt file.

just re-ran ALL the tests and everything looks good (ignoring that node revisions problem that's been around for a while).

added in a changelog.txt message as well since it seems like we're close ;)

drewish’s picture

FileSize
5.22 KB

more work on the hook docs.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Fiiiiiiiiixxxeeeeeeddddd!!! :)

drewish’s picture

Status: Fixed » Active

webchick, you're too kind but as chx pointed out i still need to do some update documentation. i went ahead and committed the rough hook documentation.

drewish’s picture

Status: Active » Fixed

posted update instructions on: http://drupal.org/node/224333#unmanaged_files

quicksketch’s picture

Three cheers for drewish! Few people could carry on through so much uncertainty, (sadly) indifference, and neglect. I had to use a calculation to figure out how long this had been open!

format_interval(time('October 10, 2008') - strtotime('May 10, 2007')) == '1 year 22 weeks' // !!

Congratulations to everyone, this will make substantial improvements in our managing of files. Finally we can have a real file-management solution shared between modules. Thanks again drewish, we never could have done it without you.

RobLoach’s picture

Oh man! How did this go in without me noticing 5 minutes after the fact?! Yay Drewish! There goes another item on the wish list.

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.