continuation of the thread i started on the devel list:
http://lists.drupal.org/pipermail/development/2007-May/024194.html
basically, everything and its dog stuffs whatever it wants directly into $node
. there are tons of problems caused by this, e.g. where many modules all think $node->pid
is their pid. pid's a great example, since that nails even core (both book.module and path.module think they "own" $node->pid
.
attached patch is an example of one way to enforce this. nodeapi('load') gets an array back from each module. instead of array_merge()'ing that into $node, it could put each array into $node->module_name instead.
however, node.module doesn't itself use nodeapi, it does its own thing in node_load(). in order to make this not require a 2000+ line patch, which will basically break every other patch in the queue and be very difficult to get into D6, i was going to use a shortcut, and say that everything in {node} and {node_revisions} that's handled by node_load() gets to stay at the top level of the $node object, so every place in all of drupal that does $node->nid
, $node->title
, etc, etc, would still work.
unfortunately, this won't work, either, due to comment.module. :( {node} contains a 'comment' column, which is how we know if comments are enabled or disabled on a given node. that'd get clobbered by the $node->comment
array when nodeapi loads in comment's additions. :(
comment is the only conflict like this in core, but in IRC, chx pointed out we'd be screwed by a contrib calling itself the promote.module, for example. :(
so, either we have to break *everything* and have $node->node['comment']
(and therefore $node->node['nid']
), or we have to give up on this approach for nodeapi's results to get stuffed into an array at $node->module_name
...
also, chx was dreaming about having this exactly mirror the schema array, and have $node[table_name][column_name]
but that couldn't be enforced at the nodeapi('load') level, since not every DB table matches the module name, and a given module might have N tables. so, if we wanted to go schema-style, we'd have to rely entirely on convention, and we couldn't force modules to get this right.
Comment | File | Size | Author |
---|---|---|---|
node_namespace_0.patch.txt | 784 bytes | dww |
Comments
Comment #1
dwwoh, and maybe a compromise to avoid the worst of the callsite badness, maybe we keep
$node->nid
, but then put everything else in their respective module-named-arrays. something like:Comment #2
pwolanin CreditAttribution: pwolanin commentedcould hook_nodeapi just check isset($node->modulename) before doing the assignment? This would still penalize the novice developer, but in a way that's perhaps less likely to make their whole site break.
A change of this magnitude also makes a change of $node->comment to $node->comment_setting (or some such) seem reasonable.
Comment #3
nedjoYes, we need to fix this.
In doing so, we should be thinking ahead to the possibilities that the new schema API opens up.
From http://drupal.org/node/145684:
With this structure, eventually, we won't need
node_load()
ornode_save()
.drupal_load('node', 21)
will be enough. We read what needs to be loaded (or saved) from the table relationships as mapped in the schema API.Yes, those heady days are a bit far off now. But if we're taking the trouble to change our object structure, we may as well get it right.
Comment #4
pwolanin CreditAttribution: pwolanin commentedthis patch: http://drupal.org/node/146425#comment-253153 now puts all the book module data in $node->book, and nothing else in the top-level $node object.
Comment #5
Frando CreditAttribution: Frando commentedThis is such a great thing.
I mostly agree with nedjo, and I like his '@tablename' syntax very much.
I think it is better to stick data into arrays named after the tables the data comes from than to just have an array for each module that invokes nodeapi, because this would be a big step towards a data api. There would however also have to be a place to stick data in that is not directly coming from a table (e.g. node links).
So, what we could do is
- change the $node object to a $node array (for consistency with the sub-arrays)
- only fields of the node table get top level keys
- data coming from other tables (like terms or files) lives in a '@tablename'-array
- other data that modules want to insert into the node array are either stored in '#properties' or in a 'modulename' array (this is basically the same discussion as with the node rendering patch - what is an array, and what is a child? There could also be namespace conflicts between module arrays and fields of the node table, as they'd both be top level keys. Hmmm.)
This would be a major step forward.
A fixed syntax of data arrays makes many things possible, a unified data syntax is, now that the schema API is in, probably the most important step towards a consistent and simple CRUD or data API.
Comment #6
pwolanin CreditAttribution: pwolanin commentedWhile I see the possible advantages of naming per table, how would you *enforce* this? A major advantage of the suggestion above to have the nodeapi code do this is that it is automatic.
Comment #7
chx CreditAttribution: chx commentedWe will enforce it by documentation and good example (ie. core). Woe upon to you if you do not keep convention. In an upcoming Drupal version your code will suddenly break...
Comment #8
pwolanin CreditAttribution: pwolanin commented@chx - I though that that breaking code was the whole point of new Drupal versions? Not much of a threat ;-)
Comment #9
dwwGlad to see lots of clueful folks participating here, that's a good sign. ;)
After spending time with this yesterday, I'm afraid of allowing *anything* (with the possible exception of nid) at the top level of the $node object/array -- otherwise, we're going to have namespace collisions again (modules and/or tables that conflict with columns from {node} and {node_revisions}, etc). If we had different syntax for table names, module names, and raw column names, these collisions wouldn't necessarily break things (though it could get a little confusing). With enough crazy syntax, we *could* keep some stuff at the top level, but if we're moving it to an array, we're going to have to break *every* call site, anyway, so it doesn't really buy us much to keep anything at the top.
I can certainly see the benefit of mirroring the schema API's array layout, but it's true we also have stuff in $node that's not from the DB (at least not directly) such as links, etc. I'm also very attracted to the nodeapi('load') forcing you into a unique namespace already (since every module must have a unique name). But if core is universally doing the schema-based array thing, and it's documented as such, I can live with nodeapi('load') still providing you with access to the whole $node so you can populate it as appropriate for your module.
Re: syntax. If we're going to special-case tables in here, how about we use
{tablename}
, which is the universal Drupal visual cue for a DB table? If we end up needing both table names and module names,@
is nice for modules, since it's sort of like an address conceptually ("if I'm not getting this info directly from a table, where must I go to find it? It's over there @ the foo module..."). Then, the actual column names have no special syntax at all. So, a new $node array might look something like this:It's certainly cleaner/safer this way, and is a step towards the data API Nedjo and others are talking about, which is cool. However, I see a few potential problems with this:
$node['{node}']
array to indicate it's been joined on *that* particular uid, or do we just put it at the top level like my example above, and somewhere document that the stuff in$node['{users'}]
is data JOIN'ed with the uid from {node}?$node->links
array? This data is the result of a hook invocation, so it's neither from a specific table nor a specific module. However, it's also not a property of the node array in the regular FAPI sense of#property
entities. Maybe we also have$node['%hook_name']
entries? such as:$node->taxonomy
doesn't strictly follow the schema's table layout. It's also got some more funky JOIN stuff going on (sort of like the problem with {users} above). So, it's not accurate to use:Perhaps this is a candidate for just saying "this is taxonomy.module's data in its own special format", and treat this as:
However, I fear developers will be confused by this, and it'll introduce this yucky ambiguity of if you should do it table-based using {} or module-based compound data using @.
All that said, I think we're getting close to something worth starting a patch for. I'd just like to have a better top-level view of (and agreement on) where we're headed and how we're going to address these issues before I start writing a script to convert the call sites and hack up node_load(). ;)
Comment #10
eaton CreditAttribution: eaton commentedThis is the stuff that was hashed through on the *display* side for the node rendering patch. Is there any chance that, if we're going to the work of reshuffling everything about nodes already, we could consider using renderable formapi style array? that would allow us to attach quite a bit of metadata if we needed it, and it supports constructs like access control on elements, relatively clean nesting of sub-elements, and so on.
I would REALLY hate for us to reinvent the wheel. If we're smart about it, we could even make it possible for the results of node_load() to be directly renderable for simple node types.
Comment #11
pwolanin CreditAttribution: pwolanin commentedI'll again argue for the module name solution. I think many modules join 2 or 3 tables to add data to a node, and it makes much more sense to me to have that data together in one array.
Also, in terms of forms, this leads pretty naturally to constructs like:
And then the form values map directly back into the right parts of the node object for the preview, etc.
Comment #12
eaton CreditAttribution: eaton commentedI'm startign to feel a bit concerned about this direction. One of the real advantages of Drupal's system (thus far, at least) is that it gives modules 'equal access' to stuff that's being built. I worry that if we try to enforce a rigid standard like this, we'll just end up pushing more and more complex logic into other areas of the system.
One of the benefits of using FAPI-style arrays is that, in theory at least, #module and #db_table properties could be easily added to any element if needed. Am I missing something? Adding more and more prefix types and naming conventions seems... less than ideal.
Comment #13
pwolanin CreditAttribution: pwolanin commented@eaton - I'm not sure I have a clear idea of the method you're suggesting. Can you post a couple lines of example code?
Comment #14
Frando CreditAttribution: Frando commentedI think eaton is talking of something like this:
$node['node'] = array('#module' => 'node', '#table' => 'node');$node['node']['nid'] = ..
$node['node']['title'] = ..
$node['node']['body'] = ..
$node['author'] = array('#table' => 'user');
$node['author']['uid'] = ..
$node['author']['name'] = ..
$node['taxonomy'] = array('#module' => 'taxonomy')
$node['links'] = array(...)
... and I've to admit that I quite like this... I'll definitely think about this some more.
Comment #15
eaton CreditAttribution: eaton commentedThat still leaves the important question of what the *standards* for that structure should be... but goodness if it doesn't feel more familiar to someone who's worked with FAPI before.
The fact that we already have lots of utility functions for working with this style of array, that we can (if we plan things right) render it straightaway, and the fact that moving forward even something like FQuery could be used for complex manipulation... those are bonus.
Comment #16
eaton CreditAttribution: eaton commentedAlso, it's worth noting that in this scheme, things like nid and title and body would need to be #properties rather than elements. The final tricky bit is that $node['#type'] should probably be 'node' rather than 'story' or 'blog' etc, for safety's sake. But I digress.
Comment #17
RobRoy CreditAttribution: RobRoy commentedThis sounds awesome. Each step we take to become more consistent across the board, is lowering the barrier to Drupal development, reducing code, and giving us greater flexibility to do some powerful stuff. This is a logical step and gives us one more object in that familiar construct. We already have hook_nodeapi('view') adding pieces to $node->content somewhat like this, so why not construct the whole $node array like that?
Thank goodness for the extra 4 weeks. So what's the game plan now?
Comment #18
Jürgen Depicker CreditAttribution: Jürgen Depicker commentedsubscribing
Comment #19
eaton CreditAttribution: eaton commentedI posted my first stab at unifying some of these various issues in a blog post at http://www.lullabot.com/blog/trouble_nodes ... I think if we can figure out a way to load nodes (and any other objects, if possible) as structured arrays we'd gain a lot. The real problem would be the change in syntax. $node->foo would go away, for the most part. $node['foo']['#value'] would be the syntax, instead.
Comment #20
dmitrig01 CreditAttribution: dmitrig01 commentedI had this idea:
here is some psudo code to help
function drupal_load($type, $load_args) {
call $type_load.
in the node case, this would be
drupal_load('node', 234)
ordrupal_load('node', array('node' => array('nid' => 1234))) (syntax: array('tablename' => array(fieldname => fieldval)))
that would return an array of specially formatted table names and field names to load. e.g.
then pull in data from hook_load('node') which would add extra fields
*build query here*
*execute query*
*parse results*
*return results*
}
Comment #21
dwwLast week, Eaton and I discussed this at length. We decided:
a) $node shall become a #FAPI-style structured array.
b) Everything at the top-level of $node will either be #properties, like
$node['#nid']
or arrays keyed by module name, like$node['book'][...]
c) Eaton was going to draft some guidelines on what should be a #property and what should be a regular element with a #value.
d) Eaton was going to draft some ideas on the proper use of #type inside a $node array.
It'd sure be great to make progress on this...
Comment #22
dwwOh, and who am I fooling? ;) I'm not going to have time to work on this anytime soon. Please, let's hope someone else can pick up the torch.
Comment #23
pwolanin CreditAttribution: pwolanin commented@dww If you and Eaton don't have time to work on the full refactoring for D6, we should consider the very simple version of $node->modulename['key'] enforced via the node module as you proposed above.
Comment #24
dwwhttp://lists.drupal.org/pipermail/development/2007-June/024886.html
Comment #25
criznach CreditAttribution: criznach commentedOne thing to keep in mind that no one has mentioned specifically is how the FAPI rebuilds node objects using only the form tree after submission. With any existing module (I'm looking at project), there will be cases where forms are divided up into fieldsets, widgets, etc in the tree. Simply turning the #tree property on does not work in this case because your generated node structure will mimic the UI, and not the desired module namespace tree. In most current cases, the FAPI will simply generate a flat node structure because #tree is false by default. We can hack our way around this by manually setting the #parents array on each form element so the generated node has a namespace structure, but that's not automatic, and a maintenance nightmare.
Comment #26
beginner CreditAttribution: beginner commentedper message linked in #24.
Comment #27
webchickhunmonk brought this patch to my attention tonight.
a) I do think it'd be great to get this resolved in core. I'm all for increased consistency, etc.
b) It's going to break everything.
c) So I need to touch base with Dries about it.
d) However, unless this change is coupled with something that has profound improvements in what it allows Drupal to do (such as the page/node rendering stuff, or data API) I don't think it has much of a shot on its own. :\ It's really hard to justify every module, theme, block, etc. having to change its code around simply because a few contributed modules never name-spaced properly.
In the meantime, I'd suggest contrib modules use the convention proposed by dww in #1.
Comment #28
mdupontJust stumbled upon this old issue. Is it something still interesting for D8+ now we have entities and classed objects, or is it an obsolete issue we can close?
Comment #32
andypostwith new entity api there's no more collisions