Add hook_file and make files into a 1st class Drupal object

drewish - May 10, 2007 - 18:19
Project:Drupal
Version:7.x-dev
Component:file system
Category:task
Priority:critical
Assigned:Unassigned
Status:patch (code needs work)
Description

Following up on the work done in #115267 we need to add a hook for file operations.

Here's a list of possible operations (taken from dopry's filefield module):

function hook_file(&$file, $op) {}
where $op is one of the following:
'prepare'  // after upload, file is temporary?
'validate' // before save, check quotas, file format, etc
'load'     // allows other modules to load extended file attributes
'save'     // file becomes a non-temporary file?
'delete'

In #115267 there's still a lot of upload module specific validation that's occurring in file_validate(). I'd like to move that into upload_file($op=validate).

I'd also like to have hook_file($file = null, $op = 'validate') test the users' quotas and permissions so that if they've got file restrictions we can inform them of the before they begin uploading.

#1

drewish - May 10, 2007 - 18:23

it might make sense to roll hook_file_download into this as a $op = 'download'

#2

moshe weitzman - May 10, 2007 - 18:34

subscribe .. a reasonable enhancement

#3

dopry - May 11, 2007 - 19:04

I'd rather keep the op names as terminology related to whats actually going on with the files, instead of trying to copy form or node events, I think the op names should correlate to the functions they're being called from as well.

so maybe upload, validate, load, copy, move, update status, delete. I'll try to roll a patch over the weekend...

validate would allow modules to validate new uploads before saving to db.

upload would allow modules to act on new uploads while they're still in temp status. generating admin ui thumbnails.

load would allow modules to extend file objects. This doesn't exists because file_load doesn't exist yet.

copy would allow modules to react to copy events, creating new derivatives, access control entries, or whatever they may need to do. this would happen after a file has been successfully copied on the filesystem and inserted into the database.

move is the same as copy, but you would probably move existing derivs or update existing information.

update status will allow modules to react to status updates.. changing from temporary to permanent, or whatever we dream up in the future.

delete would be called after successful file_deletion.

#4

Grugnog2 - May 23, 2007 - 05:02

Subscribe

#5

Grugnog2 - May 23, 2007 - 05:13

I marked http://drupal.org/node/18934 as a duplicate of this (since it was super old).

However there was a patch (basically the same idea as here) - so go check it out anyway :)

#6

magico - May 23, 2007 - 11:49

(subscribe)

#7

keve - May 28, 2007 - 09:34

Good idea! Finally this could solve my long-awaited problem of conversion of filenames containing i18n characters to alphanumerics. ( http://drupal.org/node/146752 )

#8

drewish - May 30, 2007 - 17:33
Status:active» patch (code needs work)

Now that #115267 has been committed we can get started. I talked with dopry on IRC a bit yesterday and got an idea of his plan, this patch reflects my (probably incorrect) understanding of it. This patch is not anywhere close to done but I've got to run out and wanted to get it posted.

It turns file_copy, file_delete and file_move into wrappers which handle the database updating end of things while _file_copy and _file_delete do the file end of things. It also adds file_load() and file_save() to read and write to the database.

AttachmentSize
hook_file_142995.patch12.07 KB

#9

Stefan Nagtegaal - May 30, 2007 - 20:15

*subscribing*

I definatly want this in Drupal 6. Not sure yet what I can do to help..

#10

drewish - May 30, 2007 - 22:25

here's a bit better patch. it moves the code in a more logical manner so that the changes are clearer.

stefan, if you wanted to add some code to delete the record when the file is deleted that'd be good. i've tried to add a couple of TODOs to make it easy to see what needs to happen.

AttachmentSize
hook_file_142995_0.patch7.72 KB

#11

drewish - May 30, 2007 - 22:28

whoops, that was missing some stuff. this has better TODOs. feel free to jump in and work on some.

AttachmentSize
hook_file_142995_1.patch9.06 KB

#12

Stefan Nagtegaal - May 30, 2007 - 22:37

I'll try to come up with something after 10 hours from now...
I don't promiss anything, but i'll give it a shot..

#13

Stefan Nagtegaal - May 31, 2007 - 13:14

I did some quick reviewing on the patch and while I am not sure it's in scope of this patch, I think using arrays instead of objects is much easier to grasp.

Perhaps this could be a personal taste, but drupal is moving to arrays more and more.. (Think FAPI as an example)

Furthermore this goes pretty much above my knowledge of the file api. While I would *love* to see the functionality hit the trunk, my PHP skills are not up to the level where it meets the criterium to move this further.

I'll keep an eye on the issue, and review/test wherever I can to help, but unfortunatly cooperation on developing it wouldn't be a good idea.

#14

Stefan Nagtegaal - June 3, 2007 - 21:19

@drewish: Any progress on this?

#15

drewish - June 4, 2007 - 00:12

here's something closer. i've done a little testing on it but i almost need to write a module that uses files to shake all the bugs out.

the current hook operations look like:

  • load - fired by a call to file_load(), ignores returned values.
  • insert - fired by a call to file_save() for a new file, ignores returned values.
  • update - fired by a call to file_save() for an existing file, ignores returned values.
  • status - fired by a call to file_set_status(), ignores returned values.
  • copy- fired by a call to file_copy(), ignores returned values.
  • move - fired by a call to file_move(), ignores returned values.
  • delete - fired by a call to file_delete(), ignores returned values.
  • validate - fired during file_save_upload(), expects an array of error messages.
AttachmentSize
hook_file_142995_2.patch19.25 KB

#16

chx - June 4, 2007 - 00:57

If we would use file arrays instead of file objects we could spare some drupal_clones. Second, if we are changing the name of the operating functions, like _file_copy, it's a good time to make that a variable, like $function = variable_get('file_backend', '_file') .'_copy' -- can you spell s3_backend.module :) ?

#17

drewish - June 4, 2007 - 16:48

I liked chx's suggestion of adding a layer of indirection on the underlying file operations. I started looking at converting $file to an array and it'll be a major undertaking. I think it's the right thing to do but it'd touch a lot of files and result in a brittle patch that's out of the scope of this issue.

Digging into this I realized that there are really two classes of file operations, database tracked ones and non-database ones. Things like the aggregated JS and CSS should only be written locally and, at least in this patch, aren't tracked by the DB.

At this point there are only three functions a file driver would need to implement: _copy, _delete, _save_data.

AttachmentSize
hook_file_142995_4.patch25.89 KB

#18

dikini - June 5, 2007 - 08:38

I like the patch and the approach it takes. Makes sense.

What can I do to help?

#19

dikini - June 5, 2007 - 11:34

Is this the place to add folder handling logic?

A module implementing this hook could, in theory, do that but it can be a chicken and egg problem, depending on module logic and invocation time of the hook. We could separate the path from filename, or just store the path separately, the memory overhead will be minimal considering the possible use cases - uploads and file manipulations are rare, usially one off ops on a few of files per run.

I'm think about the implications for modules creating or using some form of logical folders

#20

Gábor Hojtsy - June 5, 2007 - 14:53

I would not convert $file objects to arrays, especially not in this patch, if you want this to get reviewed ;)

#21

dopry - June 5, 2007 - 15:34

The layer of indirection is out of scope. Fear the scope creep! There are more issue involved than just the indirection. Settings for particular backends, backend selection, and handling http timeouts on your webserver with large transfers to and from services like S3 or over FTP none of which none are addressed in this patch. I would prefer the scope be limited to just the hooks, and the indirection layer become another feature request.

#22

drewish - June 5, 2007 - 20:41

pulled out the indirection layer. it was a bit off topic.

cleaned up a lot of the phpdocs. started working the upload module. it really should be implementing hook_file to load its data into the file object. i made some changes to it and everything seems to be working correctly. i added some caching to file_load based on what's in node_load.

feedback on the best way to treat upload.module would be much appreciated.

AttachmentSize
hook_file_142995_5.patch27.24 KB

#23

drewish - June 8, 2007 - 18:15
Status:patch (code needs work)» patch (code needs review)

re-rolled. there's probably still some issues but at this point i need reviews.

AttachmentSize
hook_file_142995_6.patch28.71 KB

#24

drewish - June 13, 2007 - 22:34

the patch still applies.

#25

drewish - June 13, 2007 - 22:35
Title:hook_file» Add hook_file and make files into a 1st class Drupal object

little bit better title.

#26

moshe weitzman - June 14, 2007 - 15:28
Status:patch (code needs review)» patch (code needs work)

Very nice architecture improvement. I tested just about all the code changed by this patch. I just have a laundry list of minor improvements.

  • typos: search for scalled and memebers and thew
  • FILE UPLOAD SETTINGS: 'List files by default' should be a checkbox
  • FILE UPLOAD SETTINGS: "If an image toolkit is installed, files exceeding this value will be scalled down to fit.". this message could be more helpful if it looked for a toolkit and tailored the text accordingly. furthermore, the toolkit page says "No image toolkits found. Drupal will use PHP's built-in GD library". So i don't know if my images will be scaled or not.
  • when uploading an image to a node: notice: Trying to get property of non-object in /Users/mw/htd/multisite/modules/upload/upload.module on line 632.
  • perhaps document the status values in the comments for file_set_status() or tell the developer where to look.
  • tabs in upload_file(). use spaces instead.
  • i got these errors when uploading a user picture:
    # Object of class stdClass could not be converted to string in /Users/mw/htd/multisite/includes/file.inc on line 252.
    # notice: Object of class stdClass to string conversion in /Users/mw/htd/multisite/includes/file.inc on line 252.
    # The selected file /Users/mw/htd/multisite/Object could not be copied, because no file by that name exists. Please check that you supplied the correct filename.
    # Failed to upload the picture image; the pictures directory doesn't exist or is not writable.
  • upon uploading a logo in theme settings: Fatal error: Cannot use object of type stdClass as array in /Users/mw/htd/multisite/modules/system/system.module on line 2224

#27

drewish - June 15, 2007 - 05:48
Status:patch (code needs work)» patch (code needs review)

Thanks for the review Moshe. I think I've corrected all the issues you raised in your review other than:

perhaps document the status values in the comments for file_set_status() or tell the developer where to look.

The thing is at this point there are only values temporary and permanent. And dopry wanted the status as a bitmapped field.

I ran into a little, related problem with this testing the upload quotas on non-uid-1 users. It looks like their temp files are counted against their quota, which makes sense but it's a pain for them because they can't force the cron run that will delete their old files. That might be in the scope of a post feature freeze fix.

As a status update, I'm going to be offline next week. I'll try to keep this patch up until I leave and pick it up when I get back if it hasn't been committed.

AttachmentSize
hook_file_142995_7.patch35.23 KB

#28

moshe weitzman - June 22, 2007 - 03:55

1 hunk now fails on upload.module :(. please reroll and i will retest.

#30

dopry - June 23, 2007 - 07:46

This version of the patch updated to to work with HEAD.

It makes behavioral changes.

  • file_space_used to only return total filesize used by permanent files when passed a uid.
  • file_delete will not delete a file if another module blocks the delete and return an array of modules blocking the deletion, that will still test as true.

I also made a few minor documentation improvements, and document the implementation of hook_file in upload.module.

AttachmentSize
142995.patch35.36 KB

#31

dmitrig01 - June 23, 2007 - 14:35
Status:patch (code needs review)» patch (code needs work)

Applies with offset. Could you re-roll?

#32

dmitrig01 - June 23, 2007 - 19:41
Status:patch (code needs work)» patch (code needs review)

/me heard that fuzz was fine
review starts in a few minutes

#33

drewish - June 25, 2007 - 18:32

I really don't like that hook_file(op=delete) can block. It doesn't jive with the rest of Drupal's deletion. hook_delete and hook_nodeapi(op=delete) don't allow other modules to prevent a node's deletion, they allow modules to react to the deletion. I'd also like to see the deletion queries put into the delete API's package but there's probably not a clean way to do that.

The documentation for hook_file shouldn't be put into upload_file()'s comment, but rather added to contributions/docs/developer/hook/core.php

#34

dopry - June 25, 2007 - 19:22

File deletion is not node deletionm if it were the same we would use the same api. There should be another hook in file_delete to react to the deletion.

The idea is based on the behavior of hard links in posix file systems. Where you can have multiple links to the same file data. In Drupal the file data would be the row in the files table and the file itself. If you call file_delete from a module and the deletion is blocked by another module, file_delete still returns true, since there wasn't an error with the actual file deletion, so only the reference to the file gets removed. Once no modules are blocking the physical file_deletion the file will be removed. It is the responsibility of the blocking modules to provide any notices about why they blocked the file delete.

The deletion API has been brought up. There is one problem with the deletion API. It seems to enforce a process of Database query, then callbacks. We really cannot remove files from the database until we confirm that the file has been physically deleted, otherwise the database becomes inconsistent.

One major issue has arisen in this patch. I'll try to re-roll with corrections tonight.

My last implementation of file_delete only allows the delete hook to be a reference count. Modules implementing hook_file cannot react to file deletion consistently this way. Drewish has reservations to the approach.

#35

drewish - June 25, 2007 - 20:21

okay after lots of chatting on IRC with hunmonk and dopry i think we've got a solution that extends the deletion api.

i wanted to make one, unrelated, note though: _file_delete should return TRUE if the file doesn't exist... it's only an error if the file exists and can't be deleted.

#36

drewish - June 25, 2007 - 22:58

i opened up a deletion api issue: http://drupal.org/node/154604

#37

drewish - June 29, 2007 - 21:25

i'd had high hopes for getting #154604 committed and having a nice clean way of implementing the file deletion but dries is talking about rolling back the deletion api patch.

at this point i don't really care how the pre/post-delete hooks are setup. #30 is fine by me but i think dopry wants to make some changes. i'd settle for just about anything so long as the rest gets committed.

#38

Frando - July 1, 2007 - 14:46
Status:patch (code needs review)» patch (reviewed & tested by the community)

I am just reviewing the patch in #30. IMO, it is a *huge* improvement over what we currently have in Drupal. This opens up many many more possiblities for file handling.

I tested upload module functionality and the new or changed API functions. Everything worked as advertised. The concept of hook_file is great. Deleting a file that is still used on a node by upload.module successfully fails. Documentation is very good.

The attached patch only fixes a minor code style issue, removes some debug code that was still in the last patch and fixes one single E_ALL notices in upload.module hook_file op load.

This is RTBC. Great work.

AttachmentSize
hook_file_2.patch36.29 KB

#39

Gábor Hojtsy - July 1, 2007 - 16:58

Code-reviewed the patch:

- you introduce wrappers around internal _file_* functions, but still use some internal looking functions in blogapi and system module with this patch
- the phpdoc is not up to current conventions... I understand that the file.inc comments are not properly formatted, and to keep this reviewable, it should not be reformatted to comply to current conventions... but new functions added elsewhere should not follow that bad convention
- I don't understand the rationale behind this:

+      // If upload.module is still using a file, do not let other modules delete it.
+      $count = db_result(db_query('SELECT count(*) FROM {upload} WHERE fid = %d', $file->fid));
+      if ($count) {
+        return 'upload';
+      }
+      // Delete all file revision information associated with the file.
+      db_query('DELETE FROM {upload} WHERE fid = %d', $file->fid);
+      break;

If there are rows, then the deletion is blocked. If there are no rows, we delete these rows??

#40

Dries - July 1, 2007 - 17:09
Status:patch (reviewed & tested by the community)» patch (code needs work)

Why is this useful? Maybe add some code comments so we got to understand this a bit better?

<?php
+      $values = db_fetch_array(db_query('SELECT vid, description, list FROM {upload} u WHERE u.fid = %d', $file->fid));
+      if (!empty(
$values)) {
+        foreach (
$values as $key => $value) {
+         
$file->{$key} = $value;
+        }
+      }
?>

#41

drewish - July 1, 2007 - 21:47
Status:patch (code needs work)» patch (code needs review)

I still question the rationale behind the delete blocking stuff and since I'm re-rolling I'm going to drop it. I think it complicates the patch and I just want this done with.

Gábor Hojtsy,
- I intentionally didn't modify the blogapi to use file_save_data because the workflow of that module doesn't make it clear to me that it should be tracked in the database. Where is that file deleted? The system.module is a little different case, if that's what people want to do then I'm willing to modify it.
- The attached patch fixes all the PHPDoc in file.inc to bring it to current standards. I'm of the opinion that while we're in there we might as well fix it all.

Dries,
- I added a comment to explain what was happening there.

AttachmentSize
142995_0.patch42.59 KB

#42

drewish - July 1, 2007 - 21:48

I also pulled out this block of text from upload.module. It needs to go in the hooks.php file once this patch is committed:

<?php
/**
* Implementation of hook_file().
*
* @param $op the file operation being performed.
*  'copy' - the file was just copied to $file->filepath
*  'move' - the file was just moved to $file->filepath
*  'delete' - The file is about to be deleted.
*  'validate' - The file is being validated.
*  'status' - the status of the has just been set.
*  'load' - the file is being loaded.
*  'insert' - the file is being inserted in the database.
*  'update' - the file's database entry is being updated.
*
* @param $source The original path to the file on copy and move
*                operations.
* @return mixed
*     'validate' - return an array of errors.
*/
?>

#43

moshe weitzman - July 2, 2007 - 14:51
Status:patch (code needs review)» patch (code needs work)

@drewish - sorry i get a hunk fail in upload.module. i will review if you can fix this. it is more than 2 lines so i bailed.

#44

drewish - July 2, 2007 - 15:08
Status:patch (code needs work)» patch (code needs review)

here's a re-roll that fixes conflicts in system.module and upload.module.

AttachmentSize
142995_1.patch42.6 KB

#45

moshe weitzman - July 2, 2007 - 15:34
Status:patch (code needs review)» patch (reviewed & tested by the community)

tested uploads and user pictures and all works fine. code looks good as well. RTBC.

#46

dopry - July 2, 2007 - 19:20

Here is a reintroduction of file_delete with reference counting. I'll leave it to the powers that be to decide which to commit. It also fixes some direct deletions from the files table in upload_delete and upload_delete_revision.

AttachmentSize
142995_2.patch44.31 KB

#47

dopry - July 5, 2007 - 11:35
Status:patch (reviewed & tested by the community)» patch (code needs review)

re-rolled to sync with head. Set back to code needs review, since I added the reference counting on delete.

#48

dopry - July 5, 2007 - 12:54

I guess it helps if I attach the patch.

There are also some changes in _file_delete I've added two additional tests when $path is not a file.

  1. in the case a path doesn't exists it returns true, but watchdogs an error since the desired final state of _file_delete($path) is already true, This makes the soft delete of file_delete behave nicely in case you attempt to delete a file from the db where the file has already been removed from the file system.
  2. in the case $path is a directory it returns false and watchdogs an error.
AttachmentSize
142995_3.patch45.04 KB

#49

RobRoy - July 6, 2007 - 10:17
Status:patch (code needs review)» patch (code needs work)

Test
- Applied fine, uploaded two files attached to a page node okay.

Bugs
- When editing page node I checked "Delete" next to an existing upload File B and then attached a third file upload File C. When hitting Submit, File B gets deleted but File C doesn't stick around leaving only File A.
- Are files supposed to be deleted from file system when not used anymore? If so, they weren't as all files stuck around after checking "Delete" and hitting save, and still stayed after deleting the node to which they were uploaded.
- By default the file upload box desc says "The maximum size of file uploads is 1 MB. Images larger than 0 will be resized." Images larger than 0? I know that's the default, but is poor usability and confusion. So we need a check in there if (!empty(variable_get('...', 0)) { $desc .= ' '. t('Images larger than...'); }
- Also, I think we should set the default for the above var to '' instead of 0. We tell users to leave setting blank for no restriction elsewhere IIRC, why not here? 0 seems unnecessary and inconsistent. If we keep the !empty() code we'll be okay with 0 or ''.
- Why do all files get an underscore before the .ext? Maybe this is stated above, if so, sorry.
- We should remove "node" from user facing content on the File uploads config screen. I know it was there before, but since we're changing it, let's change line 185 of upload.module to something like 'Determines whether files attached to posts are listed or not in the full page view by default.' That might suck, but we should try and remove node. :P

#50

dopry - July 7, 2007 - 01:07
Priority:normal» critical
Assigned to:Anonymous» dopry
Status:patch (code needs work)» patch (code needs review)
  1. - When editing page node I checked "Delete" next to an existing upload File B and then attached a third file upload File C. When hitting Submit, File B gets deleted but File C doesn't stick around leaving only File A.

    I could not reproduce.

  2. - Are files supposed to be deleted from file system when not used anymore? If so, they weren't as all files stuck around after checking "Delete" and hitting save, and still stayed after deleting the node to which they were uploaded.

    There was indeed a missing file_delete.

  3. - By default the file upload box desc says "The maximum size of file uploads is 1 MB. Images larger than 0 will be resized." Images larger than 0? I know that's the default, but is poor usability and confusion. So we need a check in there if (!empty(variable_get('...', 0)) { $desc .= ' '. t('Images larger than...'); }

    This is fixed using the !empty test instead of isset.

  4. - Also, I think we should set the default for the above var to '' instead of 0. We tell users to leave setting blank for no restriction elsewhere IIRC, why not here? 0 seems unnecessary and inconsistent. If we keep the !empty() code we'll be okay with 0 or ''.

    This is not from this patch. It should be filed as a separate UI issue.

  5. - Why do all files get an underscore before the .ext? Maybe this is stated above, if so, sorry.

    I am unable to reproduce. although some files get that so apache won't try to execute them via it's multi type handling.

  6. - We should remove "node" from user facing content on the File uploads config screen. I know it was there before, but since we're changing it, let's change line 185 of upload.module to something like 'Determines whether files attached to posts are listed or not in the full page view by default.' That might suck, but we should try and remove node. :P

    This is not from this patch and should be filed as a separate UI issue. There are too many API changes happening in this patch for me to want to entertain scope creep.

AttachmentSize
142995_4.patch45.41 KB

#51

RobRoy - July 7, 2007 - 03:15
Status:patch (code needs review)» patch (code needs work)

Uploading files does not work with this recent patch. Did a clean install and attaching files appears to work, but hitting Submit results in no files attached.

#52

dopry - July 8, 2007 - 21:19
Status:patch (code needs work)» patch (code needs review)

Upload failures are not related to this patch, but to changes in FAPI, see http://drupal.org/node/157747, for details. Setting back to CNR, so I can solicit architectural and code style reviews until 157747 is resolved.

#53

drewish - July 18, 2007 - 00:18

patch still applies

#54

Dave Cohen - July 18, 2007 - 03:35

I'm looking forward to these improvements!

For those interested in this bug, I wanted to point out this module I built for drupal 5.x. In particular it has a form element for file uploads. If files are really a first class entity in Drupal, I feel that should be part of it.

My form field is somewhat complex because the various actions that must take place when a file is uploaded (and previewed) do not map well to the form hooks. I understand improvements to FAPI will make this sort of thing easier in the future (yeah!). But I'd encourage folks to check out what I did and consider it's features when adding this stuff to core. That's all - thanks for your time.

#55

chx - August 30, 2007 - 16:40
Version:6.x-dev» 7.x-dev

#56

dopry - December 4, 2007 - 12:56

And just to keep things somewhat close to head.. here is an updated patch.

AttachmentSize
142995.patch45.13 KB

#57

philippejadin - December 7, 2007 - 09:27

Just tested #56 on head, it works fine.

Only strange behavior :

- add a node
- browse for a file
- click attach
- click delete checkbox
- save node

-> File is still attached

This is a very good addition to Drupal too bad it's too late for 6. This is my first patch review, I came too late to the game

#58

catch - December 7, 2007 - 10:41

subscribing.

#59

jpetso - December 22, 2007 - 20:31

subscribing. should have done that long ago.

#60

Benjamin Melançon - December 29, 2007 - 22:04

subscribing

#61

dopry - January 19, 2008 - 06:50

A couple conflicts... Updating to current HEAD.

AttachmentSize
142995.patch43.91 KB

#62

dopry - January 19, 2008 - 07:59

@phillipjadine, I can no longer reproduce the bug in comment #57. Can you confirm?

#63

drewish - January 19, 2008 - 08:25

i don't think 'reference' is a good name for the operation name. all the others ('copy', 'validate', 'load', 'move') are verbs directing the module to take some action. if 'reference' were to be consistent the module would have to add a reference to that file. what you're really doing is directing the module to report the number of references. something like 'count_references', 'get_usage', 'are_using' would also be better.

#64

dopry - January 20, 2008 - 01:39

how about just 'references'? that is a noun and it almost feels like a interrogative to me.

#65

Benjamin Melançon - January 20, 2008 - 19:04

list_references

#66

philippejadin - January 21, 2008 - 11:51

Cannot reproduce bug in #57 so consider it fixed. Thank you

Overall, this is a good patch and great addition to filesystem api

#67

catch - January 21, 2008 - 11:57

I like 'list_references'.

#68

dopry - January 22, 2008 - 08:21

but I so don't like the underscore and I like the single word for brevity... So I made it references.

@peanut gallery:
How about reviewing the patch? Please post some actual feed back. Things like 'work's for me', 'doesn't work for me', 'part Y is confusing can you document it better' are useful.

@drewish, phillipe:
You most definitely aren't in the peanut gallery. :) Thanks for your continued attention to this patch. I really appreciate your feedback.

@drewish:
I really appreciate you doing the lions share of the work to port this to core.

AttachmentSize
142995.patch43.86 KB

#69

catch - January 22, 2008 - 10:42

OK stepping out of the peanut gallery.

I'm not sure I understand why the private _file_copy function has exactly the same phpdoc as file_copy() Maybe I missed something and that's standard for private functions, but it looked weird to me.

Otherwise I applied it, enabled upload module, and everything worked as normal.

#70

mikl - February 1, 2008 - 10:24

+1 for this. I hate to having to hack the core to fix broken filenames :(

#71

dopry - February 14, 2008 - 06:18

@catch: the doc isn't the same.. It's just similar. file_copy work on file objects _file_copy works on file paths.

@all: 6 is out, head is open? *wink* *wink* can we get some reviews or gatekeeper reservations?

#72

Wim Leers - February 14, 2008 - 19:37

@dopry: I'd appreciate your comments here: http://drupal.org/node/214934. I'm not sure yet if there's overlap… will give it a look this weekend.

#73

Rob Loach - February 14, 2008 - 20:48

Should hook_file also interact with Forms API? If so, we could move the Form API stuff to this related issue.

#74

drewish - February 14, 2008 - 21:23

Wim Leers, I know that one of dopry's goals was to allow that type of functionality.

Rob Loach, they're definitely related but I don't think there's much form stuff going on in this patch...

#75

Wim Leers - February 14, 2008 - 22:24

drewish, ok cool! :)

I'll await dopry's feedback to see if they can continue to be separate patches or not. I'd prefer separate patches because I have to ship my patch in D5 and D6 versions as well, for my CDN integration module.

#76

dopry - February 18, 2008 - 01:11
Status:patch (code needs review)» patch (reviewed & tested by the community)

@Rob Loach: No the form stuff is out of scope for this patch. This patch only includes api changes to file.inc and hook_file. I am working on a new file form element that leverages the new file api, since it allows file operations to be done asynchronously of the node creation process.

@Wim Leers: Again I think it is out of scope. file_create_url will need modification to support public and private files simultaneously.

@all: can we get reviews of this patch or a RTBC?

#77

dopry - February 18, 2008 - 01:34
Status:patch (reviewed & tested by the community)» patch (code needs review)

oops didn't mean to change status...

#78

Rob Loach - February 18, 2008 - 08:30

I'd be happy to give it a review, but I'm not too sure how to test it. Could you provide a module or something that makes use of hook_file()?

#79

Arancaytar - February 18, 2008 - 09:30

This will allow node-independent files for uploading, browsing and downloading, and move the n:1 file:node relationship to upload.module? Subscribing.

#80

breyten - February 18, 2008 - 22:24
Status:patch (code needs review)» patch (code needs work)

The patch from #68 doesn't apply cleanly anymore:

patching file includes/common.inc
Hunk #1 succeeded at 1787 (offset -3 lines).
Hunk #2 succeeded at 1888 (offset -3 lines).
Hunk #3 succeeded at 2246 (offset -3 lines).
Hunk #4 succeeded at 2256 (offset -3 lines).
patching file includes/file.inc
Hunk #14 FAILED at 474.
Hunk #19 FAILED at 641.
Hunk #20 FAILED at 662.
Hunk #25 FAILED at 1124.
4 out of 25 hunks FAILED -- saving rejects to file includes/file.inc.rej
patching file modules/blogapi/blogapi.module
Hunk #1 succeeded at 378 (offset 9 lines).
patching file modules/system/system.admin.inc
patching file modules/system/system.module
Hunk #1 FAILED at 1223.
1 out of 1 hunk FAILED -- saving rejects to file modules/system/system.module.rej
patching file modules/upload/upload.admin.inc
patching file modules/upload/upload.module
Hunk #2 succeeded at 261 (offset -8 lines).
Hunk #3 succeeded at 434 (offset -8 lines).
Hunk #4 succeeded at 450 (offset -8 lines).
Hunk #5 succeeded at 463 (offset -8 lines).
Hunk #6 succeeded at 493 (offset -8 lines).
Hunk #7 succeeded at 518 (offset -14 lines).
Hunk #8 succeeded at 599 (offset -14 lines).
patching file modules/user/user.module
Hunk #1 succeeded at 389 (offset -29 lines).
Hunk #2 succeeded at 1500 (offset 2 lines).

Otherwise, I'm just subscribing :)

#81

dopry - February 19, 2008 - 00:33
Status:patch (code needs work)» patch (code needs review)

This patch has been updated for head as of today, including the removal of drupal_clone.

@Rob Loach: hook_file upload module implements hook_file op==load and op==references. I don't think we will be able to fully test hook_file in this patch. I suspect as long as core file handling works as expected there should be no issues with getting it committed. We can clear up bugs and issues with additional hook_file ops in the future. Once hook_file is in core I will begin port filemeta and derivative to core, both of which utilize hook_filefield thoroughly.

@Arancaytar: how about reviewing the patch instead of just subscribing and adding useless noise to the issue queue. The file/node relationships were moved to upload.module in D6.

@breyten: this new patch should apply cleanly In the future you can just let me know the patch doesn't apply cleanly. I don't need to know which hunks failed :).

This patch implements:

  • hook_file which allows module developers to interact with the file lifecycle
  • stratifies the file.inc api into two layers, one that works on drupal file objects and one that works on filepaths
  • provides a reference counting hook so file_delete is only a soft delete, and will only remove records from the files table and delete the actual file if no modules claim to be using it.

How to test:

  1. Core file handling:
    1. upload logo in the admin >> themes section of your test site. (only tests path based layer of file.inc api)
    2. upload a favicon in the admin >> themes section of your site.(only tests path based layer of file.inc api)
  2. User.module
    1. upload a user picture. (only tests path based layer of file.inc api)
  3. Upload.module
    1. create a story and attach/remove files. (This tests the drupal file object layer of the file.inc api and hook_file op==load)
    2. create another story without any uploads. manually insert a record in the upload table point to a file
      that is associated with another node. Delete the file from one of the nodes through the UI. (This tests the soft delete feature)
AttachmentSize
142995.patch45.59 KB

#82

drewish - February 21, 2008 - 00:23
Status:patch (code needs review)» patch (code needs work)

looks like your last patch had some unrelated dblog changes that have since been committed... re-rolled to fix those failed hunks.

there's a problem with the way the user.module's picture are being tracked. the comment says that it should be making an untracked copy but it's calling file_copy() (rather than _file_copy()) but isn't calling files_set_status() to mark it as permanent. i kind of think it makes sense to track the user's profile picture in the database. it'd make it a lot easier to avoid bugs like #92224. if we do track them then we also need to delete the files in user_delete().

i wonder if it might be better to rename the _file_* functions that deal with untracked files to something like file_raw_*() since the leading underscore implies a private function and there are good reasons that people might not want to manipulate non-db tracked files. thoughts?

AttachmentSize
hook_file_142995.patch47.28 KB

#83

drewish - February 21, 2008 - 00:27

oh i should mention that i banged on the upload module portion of the code and that all seemed to be working correctly. i didn't get around to messing with the theme portions.

#84

dopry - February 21, 2008 - 01:27
Status:patch (code needs work)» patch (code needs review)

This patch fixes to user.module to use the fileapi as documented, also removes my db_log snafu which is another patch that snuck in there.

I think renaming the functions and altering the way user.module behaves is scope creep. We're doing a lot of API changes as is in this patch. I don't want to change more than absolutely necessary. I think _ is a perfectly acceptable prefix for the low level file functions. If you really want raw file access you have php's file functions. I also hope to implement the mountpoint system in the _file_* layer which will make them more private and drupal specific and not raw file access at all. If you want raw file access user php's file functions.

AttachmentSize
142995.patch45.66 KB

#85

moshe weitzman - February 22, 2008 - 03:31

I read through the patch and found only 2 small issues:

  • case 'delete':
    +      // Delete all information associated with the file.
    +      db_query('DELETE FROM {upload} WHERE fid = %d', $file->fid);
    +      break;

    Can we save a query here by checking if $file "belongs" to upload module?

  • file_save() does separate insert and update statements with each field named. It looks like the old code does a cleaner drupal_write_record(). search for drupal_write_record in the patch.

#86

dopry - February 22, 2008 - 04:30

To check if the file is owned by upload.module would require a select against the upload table. I opted for one query instead of two.

I didn't keep the drupal_write_record on purpose. I'd still have to do the insert/update logic in file_save and call drupal_write_record ( which includes a bunch of query building insanity ) in both conditional blocks. So it doesn't improve readability or enhance performance I didn't see any advantage in using it in this case.

#87

moshe weitzman - February 22, 2008 - 12:30

One of the benefits of drupal_write_record() is that you can add a column in your own install and add a properly named value to the $file object (for example) and drupal will automatically write to the custom column. I'm not going to +1 a patch that avoids using schema API functions as you are proposing.

#88

dopry - February 22, 2008 - 19:09

@moshe, here is one with drupal_write_record...

AttachmentSize
142995.patch45.21 KB

#89

moshe weitzman - February 28, 2008 - 15:49
Status:patch (code needs review)» patch (code needs work)

The test plan in #81 works fine except for the node upload which does not work properly at all. I could not attach a file when clicking the Preview button and I saw a number of NOTICE errors. After thats fixed, I'll test soft delete and we should be good to go.

There is a regression also for the small bug fixed in http://drupal.org/node/193331.

#90

dopry - March 4, 2008 - 13:48
Status:patch (code needs work)» patch (code needs review)

This patch fixes the $replace regression, and uses the correct table name in drupal_write_record which is what broke node uploads.

AttachmentSize
142995.patch45.09 KB

#91

keith.smith - March 4, 2008 - 14:56
Status:patch (code needs review)» patch (code needs work)

I read through this, and I think it all looks good. There are (a very) few typos in code comments:

+ * - Adds thew new file to the files database. If the source file is a

("thew")

+ * move is performed by copying the file to the new location and the deleting

(the -> then)

+  // file will not be deleted from drupal, but file_delete will

(we use "Drupal")

+  // return a populated array that tests as TRUE.

(Initial caps.)

+  // Make sure the file is deleted before removing it's row from the
+  // database, so UI's can still find the file in the database.

("it's" -> "its")

+  // Return true for non-existant file, but log that nothing was actually delete, as the intended
+  // result of _file_delete is in fact the current state.

(I think we use "TRUE", and also "delete" -> "deleted".)

+ *   - FILE_EXISTS_REPLACE - Replace the existing file
+ *   - FILE_EXISTS_RENAME - Append _{incrementing number} until the filename is unique
+ *   - FILE_EXISTS_ERROR - Do nothing and return FALSE.

(Inconsistent use of periods.)

+ *   - FILE_STATUS_PERMANENT - A permanant file that will drupal's garbage

("permanant" => "permanent", capitalize "drupal's".)

+ *   A numeric file id or string containg the file path.

("containg" => "containing")

-    '#description' => t('Display attached files when viewing a post.'),
+    '#description' => t('Determines whether files attached to nodes are listed in the node view by default.'),

(For better or worse, we generally avoid mention of a "node" in user-facing text.)

+        // return the name of the module and how many references it has to the file.

(Capitalize "return".)

#92

Arancaytar - March 4, 2008 - 15:10

Just on reading those lines...

+  // Make sure the file is deleted before removing it's row from the
+  // database, so UI's can still find the file in the database.

("it's" -> "its")

And UI's -> UIs. It seems that the apostrophe is only added when the acronym has periods, as in U.I.'s.

+ *   - FILE_STATUS_PERMANENT - A permanant file that will drupal's garbage
+ *                             collection will not remove.

("permanant" => "permanent", capitalize "drupal's".)

And drop the extra "will".

#93

aaron - March 4, 2008 - 17:05

subscribing

#94

drewish - March 4, 2008 - 17:43

made corrections based on Arancaytar's fixes.

AttachmentSize
hook_files_142995.patch48.56 KB

#95

dopry - March 4, 2008 - 18:01
Status:patch (code needs work)» patch (code needs review)

#96

mfer - March 7, 2008 - 11:07

subscribe.

#97

moshe weitzman - March 8, 2008 - 01:23
Status:patch (code needs review)» patch (reviewed & tested by the community)

I went through the test plan in #81 and all tests pass. Nice work.

#98

vnd - March 8, 2008 - 01:28

subscribe

#99

scroogie - March 8, 2008 - 13:51

subscribe

#100

Freso - March 10, 2008 - 14:03

This is so exciting! (And, *subscribes*)

#101

Arto - March 10, 2008 - 15:36

Subscribing.

#102

quicksketch - March 14, 2008 - 22:43
Status:patch (reviewed & tested by the community)» patch (code needs review)

I went over this patch today and found a few bugs which have been fixed in this version.

Bugs:
- There wasn't a way to test file_move() in core, so I added file_move() to index.php to try to move a file with a record in the database. But I received that the function "c()" was undefined. Looks like it's supposed to be _file_delete() from a previous patch.
- _file_move() was undefined, so I was getting undefined function when enabling CSS/JS caching. I wrote a function for this purpose and utilized it in file_move() rather than doing a _file_copy() and _file_delete inside of file_move().

Docs:
- I updated all the documentation so that the first line of PHPDoc is a brief description, with a longer description separated into another paragraph. I also elaborated a bit on file_move vs. _file_move (and other similarly named functions).
- Added @see lines where appropriate.

AttachmentSize
hook_files_142995.patch48.55 KB

#103

quicksketch - March 14, 2008 - 23:11

I find that using _file_move/_file_copy/_file_save all seems very dirty. The preceding underscore is supposed to mean "private", and not to be used outside of the module it's defined in. Encouraging module developers to use "non-API" functions throughout their modules as system, upload, and user have done in this patch would introduce new inconsistencies in what it means to prefix a function with an underscore.

So, to correct this I've renamed all the prefixed functions with a new name: the suffix '_plain'. So we now have file_move_plain, file_copy_plain, and file_save_plain. I considered trying to rename 'file_move' to 'file_record_move' (or similar), but because file_move invokes hook_file_move(), I think it's the right decision to rename the prefixed versions of the functions.

AttachmentSize
hook_files_142995.patch48.66 KB

#104

quicksketch - March 16, 2008 - 07:27

Fixes remaining _file_delete() calls to file_delete_plain().

AttachmentSize
drupal_hook_file.patch48.66 KB

#105

drewish - March 16, 2008 - 21:25

quicksketch, I'm glad you changed those function names, back on comment #82 I'd proposed _raw but I think _plain is a better choice. I'll try to give this a review shortly.

#106

dopry - March 19, 2008 - 01:15
Assigned to:dopry» Anonymous

You guys have fun. I'm totally off this issue. There has been no feedback from a committer in a while, and I'm not going to review superfluous change at this juncture in this patches