There are a few places in the D7 port of Project* where it seems like the most simple and sane solution involving "off-the-shelf" parts would be to add a dependency on field_collection:
#1609382: Port release files with extra metadata to D7
#1612050: Figure out how to represent branches for D7 project_release
...
I really like the idea of just using a generic module that lots of other sites are using, instead of trying to roll our own custom entity types or otherwise implement these ourselves in Project*.
However, I don't want to get into a situation N weeks from now when I'm ready to deploy and the rest of the Infra Team freaks out about this new dependency. ;) So, I wanted to open this issue now to get "pre-approval" for deploying this.
Other than it being a useful/generic solution to some problems we're facing, field_collection has the following things going for it:
- timplunkett is an extremely responsive and helpful co-maintainer. He's already offered to help in any ways he can as needed for d.o.
- both merlinofchaos and I have worked with and contributed to field_collection in the past
- even though it's still only on beta4, it's already got 14078 sites using it
- timplunkett has said that an official 7.x-1.0 release is entirely feasible by the time we'd actually be deploying
Not sure what else to say. ;) killes, nnewton, drumm -- any objections?
Thanks!
-Derek
Next Steps
This issue is no longer postponed now that #1609382: Port release files with extra metadata to D7 is fixed.
Comments
Comment #1
gerhard killesreiter commented
Can we see the kind of queries that it produces?
Comment #2
dww commented
Uhhh. I don't know what you mean. It's D7 field API hell. It's a module for basically "fieldable fields". You make 1 field, on the parent entity, which is essentially an entity reference to a sub-entity. Then you can add whatever fields you want to the sub-entity. So, for example, on a release node entity, we might have a field collection called "release files", which is multi-valued. This just points to a new entity that field_collection creates for us called "field_release_files". And then we hang whatever fields off of that that we need for each file associated with a release (a file field, download count, md5hash, etc).
Fields might not even live in a DB, so "the kind of queries that it produces" is sort of a meaningless question. Everything gets reassembled during entity_load(). That's The D7 Way(tm). You missed your opportunity to object to that about 2 years ago. ;)
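To make the "fieldable fields" shape concrete, here is a minimal sketch in Python rather than Drupal code. The class names, the `load_release_files` helper, and the sample values are all hypothetical; the sketch only mirrors the structure described above: a multi-valued field on the release node that points at sub-entities, each carrying its own fields.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FieldCollectionItem:
    """Hypothetical stand-in for the sub-entity that field_collection creates."""
    item_id: int
    bundle: str  # named after the host field, e.g. "field_release_files"
    fields: Dict[str, object] = field(default_factory=dict)


@dataclass
class ReleaseNode:
    """Hypothetical stand-in for a release node (the parent entity)."""
    nid: int
    title: str
    # The multi-valued field stores only references (item ids), not the data.
    field_release_files: List[int] = field(default_factory=list)


# Illustrative sample data: two files attached to one release.
ITEMS = {
    1: FieldCollectionItem(1, "field_release_files",
                           {"file": "example-7.x-1.0.tar.gz", "download_count": 10, "md5hash": "aaa"}),
    2: FieldCollectionItem(2, "field_release_files",
                           {"file": "example-7.x-1.0.zip", "download_count": 4, "md5hash": "bbb"}),
}

release = ReleaseNode(nid=100, title="example 7.x-1.0", field_release_files=[1, 2])


def load_release_files(node: ReleaseNode, items: Dict[int, FieldCollectionItem]) -> List[FieldCollectionItem]:
    """Reassemble the sub-entities a node points to, roughly what happens at load time."""
    return [items[item_id] for item_id in node.field_release_files]


files = load_release_files(release, ITEMS)
total_downloads = sum(item.fields["download_count"] for item in files)
```

Adding another per-file property here means adding a key to the sub-entity's fields; the release node itself does not change.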
Cheers,
-Derek
Comment #3
gerhard killesreiter commented
Well, whatever the API, it might somehow impact the database. And since that's the part of the infra that is hardest to scale, the question makes total sense. If the answer is "it doesn't impact the DB queries", then this can be marked fixed, of course.
Comment #4
dww commented
I think you need to spend more time with the internals of D7 to know what I'm talking about. Field API has pluggable storage. By default, every field lives in a separate DB table. However, you can swap out the storage such that all the fields for a given bundle (think "node type") live in a single DB table. Or, you could store your fields in MongoDB. Or any other crazy storage mechanism you want. The code isn't supposed to know how you're storing your fields.
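A rough sketch of the pluggable-storage idea, again in Python and not the actual Field API: the `FieldStorage` interface, both backends, and the `lookups` counter are hypothetical names invented for illustration. The calling code gets the same values either way; only the number of lookups behind the scenes differs.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class FieldStorage(ABC):
    """Hypothetical storage interface; callers never see how values are kept."""

    @abstractmethod
    def save(self, entity_id: int, values: Dict[str, object]) -> None: ...

    @abstractmethod
    def load(self, entity_id: int, field_names: List[str]) -> Dict[str, object]: ...


class PerFieldStorage(FieldStorage):
    """One 'table' per field, modelled on the default layout described above."""

    def __init__(self) -> None:
        self.tables: Dict[str, Dict[int, object]] = {}
        self.lookups = 0  # counts simulated table reads

    def save(self, entity_id: int, values: Dict[str, object]) -> None:
        for name, value in values.items():
            self.tables.setdefault(name, {})[entity_id] = value

    def load(self, entity_id: int, field_names: List[str]) -> Dict[str, object]:
        result = {}
        for name in field_names:
            self.lookups += 1  # one read per field table
            result[name] = self.tables[name][entity_id]
        return result


class PerBundleStorage(FieldStorage):
    """All fields of an entity in a single 'row', as with a per-bundle table."""

    def __init__(self) -> None:
        self.rows: Dict[int, Dict[str, object]] = {}
        self.lookups = 0

    def save(self, entity_id: int, values: Dict[str, object]) -> None:
        self.rows.setdefault(entity_id, {}).update(values)

    def load(self, entity_id: int, field_names: List[str]) -> Dict[str, object]:
        self.lookups += 1  # one read fetches the whole row
        row = self.rows[entity_id]
        return {name: row[name] for name in field_names}


def load_entity_fields(storage: FieldStorage, entity_id: int, names: List[str]) -> Dict[str, object]:
    """Calling code: identical no matter which backend is plugged in."""
    return storage.load(entity_id, names)


values = {"file": "example.tar.gz", "download_count": 10, "md5hash": "aaa"}
per_field, per_bundle = PerFieldStorage(), PerBundleStorage()
for backend in (per_field, per_bundle):
    backend.save(1, values)

from_fields = load_entity_fields(per_field, 1, list(values))
from_bundle = load_entity_fields(per_bundle, 1, list(values))
```

Both calls return the same dictionary; the per-field backend needed three reads and the per-bundle backend one, which is the kind of difference the storage choice (not field_collection) decides.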
Nothing about field_collection changes any of these fundamental aspects of how D7 fields work. All field_collection is doing is making it easier for me to assemble the fields and entities I care about to represent the stuff I need to represent. And it makes it possible for sites to extend the kinds of things Project* is storing just via the UI instead of necessarily having to write code for it.
But ultimately, the (non)performance of all the fields and entities is D7 core's problem. That's really a separate question. I agree that we should be afraid of how d.o is going to scale with D7 core fields. We're probably going to have to jump through a lot of hoops to make that work. But that's totally orthogonal to the question of whether those fields come from the core field UI, are created programmatically in Project*, are clicked together via field_collection, etc.
Comment #5
gerhard killesreiter commented
I still want to know the database impact. :p
If you say that the impact will be the same regardless of whether we use field_collection or not, that's fine too.
Comment #6
senpai commented
I think what we should do is benchmark the loading of two entities. One with a hundred fields in it, and one with 10 fields that also includes a field collection containing 90 fields. Let's see if there's any real-world difference between the two in a direct comparison test. If it's acceptable, we use Field Collection. If it's not, we don't.
Come to think of it, the creators of Field Collection have to have done something like this before in order to see if their idea was valid. Let's ask them for some metrics, shall we?
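Until real metrics exist, a toy model can at least show the shape of that comparison. The sketch below uses Python and an in-memory SQLite database, not Drupal or MySQL, and it assumes one table per field with one SELECT per field read. The `CountingDB` class, the table names, and the field names are all hypothetical, and the model ignores base-table loads, caching, and any batching of multi-entity loads, so it says nothing about real timings.

```python
import sqlite3
from typing import Dict, List


class CountingDB:
    """In-memory SQLite wrapper that counts SELECTs (one table per field)."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.selects = 0

    def write(self, field_name: str, entity_type: str, entity_id: int, value: str) -> None:
        self.conn.execute(
            f'CREATE TABLE IF NOT EXISTS "{field_name}" '
            "(entity_type TEXT, entity_id INTEGER, value TEXT)"
        )
        self.conn.execute(
            f'INSERT INTO "{field_name}" VALUES (?, ?, ?)', (entity_type, entity_id, value)
        )

    def read(self, field_name: str, entity_type: str, entity_id: int) -> str:
        self.selects += 1
        row = self.conn.execute(
            f'SELECT value FROM "{field_name}" WHERE entity_type = ? AND entity_id = ?',
            (entity_type, entity_id),
        ).fetchone()
        return row[0]


def load_fields(db: CountingDB, entity_type: str, entity_id: int, names: List[str]) -> Dict[str, object]:
    return {name: db.read(name, entity_type, entity_id) for name in names}


FLAT_FIELDS = [f"field_{i}" for i in range(100)]      # entity with 100 fields
HOST_FIELDS = [f"field_{i}" for i in range(10)]       # entity with 10 fields...
ITEM_FIELDS = [f"item_field_{i}" for i in range(90)]  # ...plus a 90-field collection

db = CountingDB()
for name in FLAT_FIELDS:
    db.write(name, "node", 1, "x")
for name in HOST_FIELDS:
    db.write(name, "node", 2, "x")
db.write("field_files", "node", 2, "1")  # reference to collection item 1
for name in ITEM_FIELDS:
    db.write(name, "field_collection_item", 1, "x")

# Case 1: flat entity with 100 fields.
db.selects = 0
flat = load_fields(db, "node", 1, FLAT_FIELDS)
flat_queries = db.selects

# Case 2: 10 fields, then follow the reference and load the 90-field item.
db.selects = 0
host = load_fields(db, "node", 2, HOST_FIELDS + ["field_files"])
host["field_files"] = load_fields(db, "field_collection_item", int(host["field_files"]), ITEM_FIELDS)
collection_queries = db.selects
```

Under these assumptions the two cases differ by a single SELECT (100 versus 101), the extra one being the read of the reference field itself. A real benchmark on d.o's stack would still be needed to say anything about actual load.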
Comment #7
dww commented
Just had an IRC chat with nnewton, killes, and timplunkett. Summary:
So, that's the plan. I'll move forward with field_collection at #1609382 and then we can circle back and try to assess again once we have something real to test. For now, this issue is postponed...
Comment #8
damien tournoud commented
I would recommend not using Field Collection in places where it would make more sense to have a custom entity type. That is the case when there is strong business logic in play. Not sure whether it applies here.
The impact of field collection itself in terms of performance should be the same as creating a custom entity type. Nothing we cannot manage by denormalizing some queries.
Comment #9
j0rd commented
I'm curious whether the performance implications of Core Fields vs. Custom Entity vs. Field Collection ever came to light. This would be useful information for people like me, who find this thread via Google.
My guess is that Custom Entities would be less of a burden on the database, since they keep multiple field values in a single table when using the standard MySQL database backend... but I'm curious what you guys have discovered.
Comment #10
senpai commented
The field_collection module is now a dependency of D7's project_release, and thus it needs to go live on drupal.org.
Comment #10.0
senpai commented
Adding a Postponed section
Comment #11
senpai commented
So that's that then. It's been five months with no benchmarks and no real decision-making discussions, the module is already in the staging server's codebase, and it is also a dependency of this new D7 site, so we're going with field_collection on drupal.org.
If @nnewton's performance testing finds this to be a problem, we can revisit it then. Until that point, carry on my friends.
Comment #12.0
(not verified) commented
This issue is no longer postponed now that #1609382: Port release files with extra metadata to D7 is fixed.