Now that Drupal core has a file entity, we would prefer to extend that entity instead of having the Media module create a new entity

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

becw’s picture

I pushed what we worked on this afternoon into a sandbox. It runs. http://drupal.org/sandbox/bec/1087308

aaron’s picture

w00t!

aaron’s picture

JacobSingh’s picture

So, this *kinda* runs, but it doesn't solve the tough part here. Since there is no file object on the file entity, it doesn't really display files. Adding a file object to a media entity has always been a weird kludge, but the fact of the matter is, if you add a file object to a File entity, the universe explodes as it references itself.

So to really do FileEntity, and not go into a recursive loop, it seems that we'd have to rewrite how filefield works. Or have a completely different way of talking about formatters / view modes.

Ultimately, filefield formatters are just talking about the file, and we want something which talks about the entity. We want a node reference pattern, without all the node junk that comes along with it and with some other UI enhancement specifically for dealing with media.

It might be that the "Media" entity we have (meaning the file + the fields) is actually the right approach even if kinda weird. The issue people run into then is choosing a formatter for a Media field which is actually a media view mode and feeling restricted by that. (e.g. you pick Large which references a different formatter depending on the type of media). I understand this, but I also am not sure it is solvable.

So where do you go:

1. FileEntity is fieldable, however the formatters used to reference it no longer do it justice. Image styles no longer work, etc because they only talk about the file, not the entity (file + fields). Ultimately then, they would have to reference a view mode (like w/ media), and on the manage display page we'd have to hack in a formatter selection even though there is no file field. This means changing how filefields are referenced when using formatters AND hacking the manage display / fields API bit.

2. Keep what we have (optionally move Media to a new base table, something I've been reticent about, but increasingly okay with). Then we basically have the old node-as-file reference pattern. Only now, we have targeted interfaces and workflow for Media w/o all the node overhead.

becw’s picture

I agree, Jacob. This was a useful proof of concept; it's clear now that changing the behavior of the core file entity would prevent us from taking full advantage of the file + fields + styles stack.

Would moving Media to a new base table (your #2) mean creating a table that stores the file id and the media type? Would all files be represented in this table? Because this would create an opportunity to have multiple media types that apply to a particular type of files (ie, pngs could be 'photo' or 'screenshot' media types, rather than just 'image' media).

JacobSingh’s picture

@bec

I'm not sure we'd want an n:n join there if we did pull it out... I was thinking more that we would then make the separation more clearly understood. I don't know the performance impact though of doing this. One use case I do see coming down the road though is the Video with screenshot file field and video with multiple formats as different fields...

I could see some usecases where you want different meta-data on the same file, but I like the fact that we just create an entity for every file on file_save now and it isn't another API you have to intentionally use. If we lost that association, I feel like we'd be hurting ourselves... Whattya think?

Best,
Jacob

quicksketch’s picture

I know I'm jumping in on this from a theoretical standpoint (since I haven't looked much at the code). But regarding Jacob's question on how to proceed with this approach:

1. FileEntity is fieldable, however the formatters used to reference it no longer do it justice. Image styles no longer work, etc because they only talk about the file, not the entity (file + fields). Ultimately then, they would have to reference a view mode (like w/ media), and on the manage display page we'd have to hack in a formatter selection even though there is no file field. This means changing how filefields are referenced when using formatters AND hacking the manage display / fields API bit.

I totally agree that formatters are not suitable for including a full range of display options for all the fields attached to a file entity. What I would have expected here is that we create a dedicated UI for configuring display modes for different file types (audio, video, image, other) similar to what we have for configuring fields on node types or taxonomy terms. Then when you configure your formatters, you actually choose which display style you want to render. So in other words, the display style *is* the formatter. You configure how that display style works just like you do for any other entity under a dedicated "Display fields" tab when configuring the file entity.

If you wanted to get super fancy, you'd probably actually only create a single formatter, like "Entity display", which then is configurable to select 4 different display styles, depending on whether the file uploaded is an image, video, audio, or other; making it so that you could upload whatever you wanted into a single field and it would be rendered according to it's $file->type appropriately.

I'm still a huge fan of the File entity over a Media entity, so I'd love to see this continue forward.

EDIT: Wrapped Jacob's comment in blockquote.

aaron’s picture

@quicksketch: you describe the architecture of the styles module precisely there! particularly the 'Entity display'. it's the stupid ui for it all where everything goes haywire right now; i'd live to get some #d7ux eyes on this...

JacobSingh’s picture

@quicksketch: We need to chat about this more. You just described exactly the architecture of media + styles. But the problem is mainly in the UI, and the WTF of a parallel entity system, etc... We need to fix this. Can we pick a time to have a 1hr call w/ screen sharing to hash out some objectives?

becw’s picture

I think quicksketch is right on. The fact that the architecture quicksketch describes matches the intent behind the way the Styles module works right now is a good indication (to me) that it's the right direction. If there's an opportunity to take better advantage of D7 view modes and formatters, I'm all for it.

effulgentsia’s picture

Assigned: Unassigned » effulgentsia

FYI. I'm continuing on becw's sandbox. Will post again here when there's something worth looking at.

effulgentsia’s picture

My latest work is pushed to http://drupal.org/sandbox/bec/1087308. There's now a separate "File Entity" module in there, along with a "File Entity Test" module for demonstrating it. Those can be enabled without Media or Styles. Media is also rewritten there to use it, and it no longer has a Styles dependency. I still want to add a little glue code, so that people who do have the Styles module can benefit from its advanced features and existing other modules that are written as Styles plugins, but the basics of getting files to be formatted differently based on the file is now working entirely via a new hook, hook_file_formatter_info(), inspired by Field formatters, but made appropriate to the needs of files.

Either this weekend or Monday, I'll write up an architectural summary.

At this time, there's some BC breakage with no update functions yet written. So if you want to play with this, do so using a fresh install.

JacobSingh’s picture

First thing:

Sandbox
I'm thinking we should put a branch in the project for this and not use a sandbox. I'd like to make some commits as I look at stuff or at least make it more widely accessible since it's really far along.

A few next steps

- I can't see previews out of the box when I install. I figured out eventually how to configure this, but we need to add it to the media install so that the browser works. This applies to all styles which would commonly be setup I suppose.

- Also will need an issue to fix the install profile

- We need to prove the model by providing preview filters (or sourcing them) for non-images. If nothing else, we have to fix the youtube formatter stuff. Not sure how this will work yet...

- WYSWIYG is broken if you try to select a view mode for which there is no formatter provided.

- Even though the file entity is not bound to media (which is awesome). It seems like you can't really use it w/o media. Do you plan to extend the existing file and image fields somehow to support it?

effulgentsia’s picture

I still want to add a little glue code, so that people who do have the Styles module can benefit from its advanced features and existing other modules that are written as Styles plugins

That is now working and pushed to the sandbox. I added directly into the File Entity module the ability to reuse file field formatters, and therefore, by extension, Styles (since they're just a special case of file field formatters). This also provides the foundation with which to write update functions for existing users of Media, though I have not yet written them.

We need to prove the model by providing preview filters (or sourcing them) for non-images

There's now a media_youtube_test module in the "modules" folder that does exactly that. Its code can eventually move to media_youtube itself. It shows that the system works without Styles. Though again, you can also use Styles if you want to.

Even though the file entity is not bound to media (which is awesome). It seems like you can't really use it w/o media.

Media is the canonical use-case right now. But it's written to be possible to use without it. All you need is a module that defines a page for viewing a file entity, a form for adding/editing one, and one or more type definitions within hook_file_entity_info(). The "File Entity Test" module demonstrates exactly this minimum. I'm sure that quicksketch will want to use this without the Media module, and will find an appropriate home in which to implement the equivalent of what's in File Entity Test, plus whatever extra stuff he cares about.

Meanwhile, the Media module defines the basic media types (image, audio, video, other), some view modes (preview, large, etc.), a UI for selecting existing file entities by browsing your "library" or adding new items to the library, a field that connects to that browser UI, wysiwyg integration for that UI, and some other stuff.

By the way, the system is also written to support multiple modules that can separately define file types, view modes, and formatters. For example, you can enable both Media and "File Entity Test". So for example, someone might create a foundational module for document management, separate from Media, and then that module can co-exist with Media, but not require it.

Either this weekend or Monday, I'll write up an architectural summary.

I lied. Will try to do it tomorrow though.

A few next steps...

Thanks for the review. Will try to make progress on those tomorrow.

JacobSingh’s picture

We should really move file_entity to its own module, huh? so we can file bugs against it?

anyway, this bug:
#1015580: Media entities : Cannot access empty property in field.attach.inc line 677 is happening again. I'm going to re-roll the fix for media onto it.

JacobSingh’s picture

Status: Active » Needs review
FileSize
14.5 KB

Okay this fixes the most heinous bug:
#1015580: Media entities : Cannot access empty property in field.attach.inc line 677.

I'll spare you reading the entire thing, but basically if you have a column specified as the bundle key (type in our case), and that column has a null value in it. Drupal will implode with a fatal error.

This was fixed in media by doing two things:

1). Trying to find the type and updating upto 100 records on install.
2). If there are more than 100 file records with no type, it showed a nag message on every admin screen until you click it and go to a form which launches a batch operation to "re-type" everything.

To fix this in file_entity:
1). Made the type field default to 'default' and NOT NULL
2). Implemented hook_file_type_info to declare the 'default' type in file_entity.
3). Moved the batch rebuild code and batch op to file_entity, changed the language a bit. (this is at admin/config/file-types/rebuild but not exposed in the UI right now. Not sure if it should be
4). Left the "fix it on install" code and nag message in media since Media is adding types, file_entity doesn't care - everything gets marked as default.
5). Changed the file_presave() hook which is supposed to rebuild type info if there is no type to check for no type or type == 'default' I did this because file_load() doesn't even work if there is no type.

Status: Needs review » Needs work

The last submitted patch, fix_undefined_type_in_file_entity.patch, failed testing.

quicksketch’s picture

Yay! Nice one. Great to see this moving along. Such segmented out code will make putting it into Drupal core much easier (and easier for non-Media modules to benefit from a file entity in D7). I'm all-for a separate project.

jeffwidman’s picture

subscribe

effulgentsia’s picture

Assigned: effulgentsia » Unassigned
Status: Needs work » Needs review
FileSize
138.64 KB

Latest work is pushed to the 1086958-file-entity branch of the media repo (becw's sandbox is no longer the latest). For those who prefer patch format, here's a patch against 7.x-1.x. I believe everything is essentially complete and ready for merging to dev, though it's a big patch so could use a review or two (or ten). We should also release beta4 prior to merging this to 7.x-1.x, so people can get up to date on something stable, before this destabilizes everything again (for example, Media Gallery and some other media related modules will be broken by this patch, until they update their code accordingly).

I still plan on writing up a summary for this. Sorry I haven't gotten to it yet. I decided to spend my time finishing the code itself first. Much thanks to JacobSingh for several intermediary reviews that helped get it to this point.

Leeteq’s picture

Subscribing.

Trunkhorn’s picture

sub

emackn’s picture

#20: media-file-entity.patch queued for re-testing.

emackn’s picture

patch applied to my 7.x-1.x branch, had to run update as well. things seemed to be working OK. but yes, all the other media dependent modules started failing. Disabled them and issues went away. Any update on write up for patch? Just curious to read it.

effulgentsia’s picture

Ok, here's an explanation of what this patch does and why. I hope it's helpful for whoever wants to review the patch.

The background situation is that Drupal core's File entity is incomplete in 2 ways:

  • It's not fieldable. Prior to D7, you could only add fields to nodes. In D7, you can add fields to nodes, comments, taxonomy terms, and users, but not to file entities.
  • There's no API for viewing a file entity. Drupal core comes with node_view() and node_view_multiple() functions for viewing nodes, comment_view() and comment_view_multiple() functions for viewing comments, etc., but no equivalent for files. Within core, the way you view a file is via a file field or image field, and their respective formatters.

The Media project has a long history (as part of other projects, like http://drupal.org/project/asset, http://drupal.org/project/emfield, and others) of living with these limitations. If I remember correctly, even when the current (D7) incarnation of the Media project started, D7 core at the time didn't really have a well defined concept of files as entities: we still thought of them as fields only. D7 core later evolved to define files as entities, though as per above, quite limited ones.

So, the architectural direction that Media took was to define a 'media' entity type consisting of a mandatory file field plus additional fields (e.g., Tags, Description, License, etc.) that other modules or the administrator could add. Sounds a lot like resurrecting the old debate of Image nodes vs. image fields, doesn't it? But this time with a twist. The special relationship between the media entity and the file field would be preserved by using the file's fid as the id of the media entity, and a whole lot of other code got added to address the simultaneous goals of wanting the media entity to both be the file and also be more than just the file.

The other component of all this is the Styles module. If you structure a Media entity as *containing* a file field, then the way you display the file portion of the media entity is via a file field formatter. But, how the file needs to be displayed depends on the file itself. The markup for displaying video is different than the markup for displaying an image, and the markup for displaying a YouTube video is different than the markup for displaying a blip.tv video is different than the markup for displaying an uploaded FLV file. One way to deal with this is to map each of these to a different field type, so you can have a blip.tv video field, and a youtube video field, but that leads to various annoyances, so the Styles module provides its own architecture for keeping it all as file fields, but allowing for a lot of administrator control and runtime flexibility in what happens when a Styles module provided file field formatter is told to display a file field. The price of this flexibility is complexity, as Aaron and I wrote about, and as people are struggling with.

The key thing that's happened since the Media project went down this approach is the D7 core entity APIs have improved and stabilized. While the core file entity has the limitations I mention at the top of this comment, the core APIs allow for the file entity type to be extended. That's what this patch does: it adds a File Entity module that extends the File entity type to be fieldable and viewable, and then changes the Media module to use the File entity type directly, not via a Media entity. An update function is added to the Media module to update existing site content and settings accordingly.

Fieldable file entities:
When you make an entity fieldable, you also need to decide how its bundles are determined. You can either have something like the node entity, where $node->type determines the bundle (so that you can have different fields on Article nodes vs. Page nodes), or you can have something like the user entity, where while you can add fields to users, you can't add different fields to different sets of users. Media's history is to assume that people will potentially want different fields for different types of media (audio might require an "album cover" field, whereas images might not). Therefore, the File Entity module in this patch takes that approach. But, as per #14, the goal of File Entity is to be usable indepenantly of the Media module, and potentially with multiple different modules using it (for example, a document management module in addition to the Media module). So, File Entity adds a 'type' column to the {file_managed} table, and connects that column to field bundles, but it does not by itself "install" any particular types (e.g., "image", "audio", etc.). Rather, it invokes hooks to find out from other modules what the different file types are. The Media module implements the hook to define its "image", "audio", "video", and "default" file types.

No self-referential field:
As per first paragraph in #4, it's not desirable to have a file entity contain a file field that references itself. So, this patch doesn't do it. Instead, it implements hook_field_extra_fields() to let the administrator configure the file's visibility and weight relative to the other fields on the "Manage fields" and "Manage display" pages.

Managing file display without the Styles module:
The File Entity module includes file_view() and file_view_multiple() functions to provide an API for viewing the file entity. Note that these are API functions only; the File Entity module itself does not add any hook_menu() pages for viewing files; that is left to other modules to do (and deal with things like access control accordingly). Still, even if just an API function, the API function must work, and that gets us into the same problem space of flexible file display described above for which the Styles module exists. However, since we're no longer dealing with a file field, but rather with the underlying file object, we don't have the same limitations that Styles has to deal with. We're not limited by existing field formatter APIs for addressing this problem. Instead, we can define a new API, which this patch does. The File Entity module defines a hook_file_formatter_info() hook, which is similar to hook_field_formatter_info(). The patch doesn't yet document it fully, but the File Entity Test module has an example, and the file_view_file() function can be read to see how the hook is invoked in order to display a file. The way it works is that unlike field formatters (where the administrator can only enable one for a particular field instance in a particular view mode), with file formatters, there's a dedicated "Manage file display" tab next to the normal "Manage display" tab, and on the "Manage file display" tab, you can enable multiple file formatters. The UI is nearly identical to the way in which you can enable multiple text filters when configuring a text format. But, in this case, instead of all enabled formatters being applied, it's more of a pass through system. Each formatter can either return a render array or not. Once the first formatter returns one, that's the file's display. But this way, you can for example, configure a youtube formatter and a blip.tv formatter, and each of these can be coded to only return a render array if the passed in $file is from the expected provider, so when a youtube file is being displayed, the youtube formatter grabs it, and when a blip.tv file is being displayed, the blip.tv formatter grabs it. Think of it as Styles lite. It's a simpler UI, and a simpler API, baked right into the File Entity module, so that the File Entity module can run without any dependency on Styles.

Styles module (and all other file field formatters) can still be used:
The File Entity module implements a bridge function, file_entity_file_formatter_file_field_view(), that allows any file field formatter to be used for displaying the file. Even though the file isn't a field, this bridge function mocks up the parameters needed to make it look like it is to field formatter functions. So, if you have the Styles module enabled, you can use it, and all of the plugins already written for it. This also provides a seamless upgrade path for existing websites using Media (and therefore, Styles).

Media cleanup:
With a fieldable and viewable file entity module in place, the rest of the patch is just cleaning up Media to use it. Specifically:

  • Removes MediaEntityController.
  • Removes media_type_configure_fields(), media_type_configure_formatters(), media_load(), media_save(), media_admin_type_list(), and other code that specifically dealt with managing the media entity / file field relationship.
  • In lieue of media_enable() calling media_type_configure_formatters() to apply certain file display defaults, similar logic is now handled in media_file_type_media_default_view().
  • Renamed media_file_view() to media_view_page(), for consistency with node_view_page(), and to avoid it running as hook_file_view().
  • And other details that are hopefully self explanatory within the patch.

BC breakage:
By its nature, this patch does break quite a lot of backwards compatibility. Fortunately, we're still in beta, so we can do that without moving to a 2.x versioning. Though as per #20, we should get beta4 out first to help manage the transition. Modules that would need to be updated are any that invoke API functions that assume the media entity / file field architecture (e.g., media_load() and media_type_configure_formatters() are two of the most common). This includes the Media Gallery module that the Acquia engineering team is committed to updating once this patch gets committed. We'll want to make sure other Media related module maintainers get notified of this as well. But not all media related modules fall into this category. In particular, modules that simply implement provider integration (e.g., the Media: YouTube module) should not require any changes. In my testing, the Media: YouTube module continues to work exactly as is. This patch does include a "media_youtube_test" module, but only to prove that the non-Styles model of displaying files works as intended, not to imply any breakage of provider modules.

bryancasler’s picture

subscribe

JacobSingh’s picture

Epic write-up

Dave Reid’s picture

Quick thought: I would think that other contrib modules might want to be able to take advantage of fieldable file entities, but to do so they'd have to declare dependencies[] = file_entity while the actual download dependency is Media module. Maybe file_entity should be spun off into its own contrib module?

JacobSingh’s picture

It will be. But for the time being, since we've got huge refactoring to do, it will stay in Media as we build around it. It would be great if there were other maintainers who want to step forward to work with file_entity. I think that's a big motivating factor for moving it out as well. ;)

JacobSingh’s picture

Status: Needs review » Fixed
quicksketch’s picture

Wooohoo! Thanks guys. This was a great issue to watch from the peanut gallery. :)

bryancasler’s picture

agreed

RobW’s picture

There should be a best-of-Drupal a la craigslist for that write up.

good_man’s picture

subscribe just for the sake of the giant patch + explanation.

Niklas Fiekas’s picture

Subscribe.

aaron’s picture

wow! great work, @effulgentsia!

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.