Note: this issue is a summary of a private email chain between several developers, apologies if any details are left out.

Many modules (such as mailcomment/mailhandler) use comment_save to insert comments directly, without going through FAPI. These modules may also make changes to a parent node in direct relation to the comment (see #813356: Expanded support for mailcomment attachments). No general mechanism exists to deal with such comments. It has been suggested that modules such as mailcomment either specifically code in support for comment_driven, or that comment_driven implement some mechanism (such as hook_comment) so that ALL new comments can be intercepted, not just those going through FAPI.

The attached patch implements the latter approach and is a proof-of-concept (not for production!). In a nutshell, it implements hook_comment and ONLY acts on comments that have bypassed FAPI (and thus comment_driven). It simply looks at any changes that have occurred to the node since the most recent revision tagged by comment_driven, and associates those changes with the comment. It requires whatever module generated the comment (in this case mailcomment) to make the changes to the node in a new revision.

Comments

danepowell’s picture

Here are comments from arhak and my responses:

1- comment_driven is closest to be a node-related module that a comment-related one (pending explanation that I'll provide in a few days) therefore it "would" make more sense to react to hook_nodeapi that hook_comment

I don't understand how that's the case really... if I'm not mistaken, it seems to simply hook into the comment editing form, pass the results to driven, save the changes to comment_driven_log, and then load the changes with comments when they are displayed. Let me know if I'm missing something.

2- comment_driven_processed is misplaced, because on its absence it is assumed that it wasn't submitted via FAPI when there are a lot of cases that comment_driven steps out of the way and let the comment be submitted without any intervention (this would be fixable placing it earlier in hook_alter, before any "step out of the way" comment)

Cool, not a blocking issue.

3- seeking the two revisions to compare will lead to unpredictable behaviors, since a lot of node edition could happen in the middle an there is no consideration for those in between vids

If we simply compare the "current" node revision to the most recent (by timestamp, not by vid), wouldn't that all but guarantee that only changes that occurred coincident with the comment actually get associated with it?

4- any approach for seeking what vids should proceed is an unnecessary performance impact (they could come already indicated instead)

Hrm? I don't see how this represents any significant performance impact at all...

5- having comment_driven depending on driven_inspect is too much to ask for such a capricious/uncommon case

Okay, we'll make this user-configurable and only allow it to be enabled if comment_driven is enabled.

6- implementing the same logic in a 3rd party module relying on comment_driven and driven_inspect would make more sense to me (even I would expand driven_inspect capabilities to make it easier, having the complex code maintained within the Driven API)

I could see spinning this off into a separate module, however it's such a tiny amount of code that I don't know if it'd make sense. I do think it would make sense to at least distribute and maintain it with the comment_driven package.

7- last but not least, I insist that comparing two revisions is NOT the same, it is LIMITED, several properties like author/published/date don't support revisions and wouldn't be noticed, resulting in an undesired behavior

That's a failing of Drupal revisions, not of this module or this patch :) If another module is bypassing FAPI and absolutely needs support for these properties, we are not stopping them from specifically coding support for comment_driven. Also, this problem isn't one that I've given much thought too; there might still be some way to crack it.

arhak’s picture

1-

Let me know if I'm missing something.

IMO you're missing something, but I haven't explained that part yet, that is what needs time to be written down (we'll get there)

2- it is a minor "bug" of this patch, but somehow related to point 3

3- you will react to comments not posted through FAPI that are NOT intended to introduce any difference, (a little harder to obtain that point 2 above) but the same as cross posting occur and locking framework is needed in some cases, this point might be reached once a node edition has happened with a sufficient timestamp to be wrongly identified as intended to be a diff introduced by the comment when it wasn't (in short, it is not robust, but dirty)

4- you're querying and sorting, when there is no need to seek what at some point you knew already
pass the info through, don't go hunt what you had at hand

5- not a blocker issue, agreed

6- not a blocker issue

7-

That's a failing of Drupal revisions, not of this module or this patch :) If another module is bypassing FAPI and absolutely needs support for these properties, we are not stopping them from specifically coding support for comment_driven.

comment_driven supports them already, but it just can't do it when working with vids
but it can if you pass the actual nodes (before vs after) instead of node_load-ing revision ids
that's why I would prefer to use its full potential over a limited version

arhak’s picture

BTW, this issue is related to #741274: Using comment driven programmatically
and a part of the e-mail thread that is missing is how to provide more support than that
and how/why you got to this proof-of-concept

PS: once I explain the point 1 (mentioned above) I will suggest another proof of concept that I would feel more comfortable

obrienmd’s picture

Subscribing.

obrienmd’s picture

Arhak, any chance to write up your further explanation of point 1?

arhak’s picture

why is comment_driven more close to be a node-related module than a comment-related one?
first of all it relies on Driven API, which addresses only nodes
secondly its main goal is to maximize compatibility with node-related module (at the expense of the compatibility with some comment-related modules that might come up unsupported)
approach: what really happens when comment_driven gets in, is that a comment_form stops being an actual comment_form and it is replaced with a node_form for node edition.
it becomes then a node edition page with a comment_form attached to it, which is almost what revision log textarea would provide if it would work as a threaded history (the only reason to not use revision logs instead is that there are a lot of comment-related modules that are already bringing new features to comment forms and are perfectly compatible with comment_driven, which otherwise would have to duplicate their features to bring them to node's revision logs)
Driven API works with two nodes (not comments at all), and comment_driven seek to handle a node edition and a comment insertion at once

attempting to react on hook_comment makes necessary to determine which would be the two node's snapshots to use, which can be node objects, revisions, or submitted data (a $node_form & its $form_state['values']), resulting in either:
- a performance waste determining which revisions ids to use (IMO the worst approach)
- passing through two nodes (or revision ids)
- working with revisions (instead of actual nodes or submitted data) brings the limitation for properties which don't support revision

last, but not least: inventing a driven diff violates the principle of comment_driven to reflect the changes that were introduced at commenting time, (which actually is a comment being inserted at node edition time), having them apart and then bundling them together might result in a coarse fake

proposal: instead of telling comment_driven whether NOT to react (with the proposed flag to detect whether a form was actually submitted or not) it can be handled in hook_nodeapi, at the time a node its being edited (node_save) we have the two instants we need: pre-insert and insert, on pre-insert time we have the node about to be changed, and on insert time we have it again but already saved
if we also receive a cid (especially flagged to tell comment_driven to react) it would be (IMO) much better:
- NO performance penalty to determine the two node's snapshots neither the comment to bound
- telling comment_driven to react would be as simple as passing a cid (special flag) and then comment_driven_nodeapi would take care of preserving the node_before on pre-insert to make sure it will be available for comparison (against node_after) at insert time
- it wouldn't be working with revisions, and therefore there would be NO limitation for those properties that don't support revisions

PS: I would like to submit this patch, but right now I don't have the time to spare on it.
however, I'll do my best in reviewing proposed patches, and providing more info if needed (so far it seems to me that I accomplished easing Dane's work with the info I previously provided through mails, so IMHO the hook_nodeapi proposal shouldn't be taken lightly)

danepowell’s picture

Thanks for your detailed reply arhak, I believe I fully understand what you are proposing now- I will take a crack at implementing hook_nodeapi and see what I can do.

danepowell’s picture

Status: Active » Needs review
StatusFileSize
new1.01 KB

Okay, here is a patch that implements hook_nodeapi. If a module wishes to add support for comment_driven, then when it saves a node it simply needs to add the parameter $node->associated_cid indicating which comment is associated with changes to the node. Please let me know what you think.

Again, this is part of a larger effort to get Mailhandler/Mailsave/Mailcomment/Comment Driven to play nicely together as part of a unified workflow with attachments and Driven-based tracking. The associated issues in other queues are #855312: Pass full node to implementations of hook_mailhandler_post_save and #813356: Expanded support for mailcomment attachments.

arhak’s picture

is this working as expected?

it seems to me that the $op is pre-insert (instead of "preinsert")

and since all the processing is taking place on the same $op
$changes must be empty (almost) every time

the proposal at #6 talks about reacting on two $op: re-insert and insert
did you choose your own way or did you miss it?

danepowell’s picture

It is working as expected in my test environment.

There is no "pre-insert" op that I'm aware of, just insert. And anyway, we are not reacting to node inserts, we are reacting to node updates, for which the relevant ops are "presave" and "update". See hook_nodeapi.

Maybe I did something slightly different from what you envisioned - I take the version that's about to be saved (passed as a parameter to hook_nodeapi) and compare it to the current version (from node_load). Do you see something wrong with that?

arhak’s picture

sorry, my bad
yes presave it is, and afterwards insert/update

so, $node is meant to be node_after and compared against $node_before from node_load

then two remarks:
- some comments would come handy, because it relies on no other module modifying the $node on presave, which is unlikely due to the enormous weight of cdriven (but should be said IMO)
- insertion code should be common (placed in one place to be reused)

danepowell’s picture

Would it be better to have a low-level insert function like...

function comment_driven_insert_log($cid, $old_vid, $new_vid, $changes, $diff_render) {
  $query = "INSERT INTO {comment_driven_log}(cid, old_vid, new_vid, changes, diff_render, timestamp) VALUES(%d, %d, %d, '%s', '%s', %d)";
  db_query($query, $cid, $old_vid, $new_vid, serialize($changes), serialize($diff_render), time());
 }

or a higher-level function that takes a $node_before, $node_after, and $cid and inserts a log entry like above?

Or both? :)

arhak’s picture

low-level

what if the high-level version receives node_before/after that haven't been saved, and therefore have no vid yet?

PS: I'm running out the door right now, that was from the top of my head (which is not very accurate)

arhak’s picture

on presave $node->vid is equal to $node_before->vid
right?

you'll have to wait until insert/update to get the proper vid once the revision have been saved

danepowell’s picture

StatusFileSize
new1.81 KB

Here is a patch with some improvements made. We must first act on presave to load the latest revision of the node and compare it to the new node, generate the changes, and attach them to the node object. Then we have to wait until update to get the new vid and actually insert the cdriven log.

arhak’s picture

We must first act on presave to load the latest revision of the node

yes, otherwise there won't be a chance to have it back

and compare it to the new node, generate the changes, and attach them to the node object

well, if you're forced to act in two times then comparison doesn't "have to be" in the first phase
you can attach the $node_before instead, and do the comparison in the second phase

Then we have to wait until update to get the new vid and actually insert the cdriven log.

yes, if we want the proper revision id (when revisions are enabled)
then we have to wait for the second phase

question
do we only care about update?
won't the associated_cid work for insert? (I mean for other modules attempting to get advantage of this improvement but not on update)

BTW using driven_inspect_diff_nodes implies a new dependency
(which I don't like that much, I would prefer some kind of module_exists conditional)

danepowell’s picture

you can attach the $node_before instead, and do the comparison in the second phase

That's true- it would mean attaching more data (mostly redundant), not that that really makes a difference though. Do you think attaching $node_before instead of $changes is better, or vice versa?

do we only care about update?

Is there any case you can imagine where a comment would be created simultaneously with a new node creation and need to be logged by cdriven?

BTW using driven_inspect_diff_nodes implies a new dependency

That's true, I actually added that but forgot to include it in the patch. I guess options include adding the dependency as mentioned (I understand why you don't like this), adding a module_exists conditional (I don't really like this on principal, and it might be confusing for people expecting this to work out-of-the-box), or making this functionality a sub-module of cdriven with the appropriate dependencies (I like this idea, since the functionality and methodology are really completely different from the rest of cdriven)

arhak’s picture

Do you think attaching $node_before instead of $changes is better, or vice versa?

it was just a comment, but giving it a second thought, I prefer the comparison in advance (1st phase)

  • it is more clear the intention of the attached data, and easier to recognize within the second phase (otherwise, one might ask, what is that "node_before" about, and if it is going somewhere else)
  • having it after save would mean after every other module react to update which is not the usual approach of cdriven when inserting via FAPI

Is there any case you can imagine where a comment would be created simultaneously with a new node creation and need to be logged by cdriven?

out of the blue cdriven needed it when operating via FAPI,
in addition, read the titles of these issues:
#741274: Using comment driven programmatically
#825304: Support comments not inserted via FAPI
so, I guess yes, we should care

[...] I guess options include [...]

  • adding the dependency as mentioned
  • adding a module_exists conditional
  • making this functionality a sub-module of cdriven with the appropriate dependencies

I guess we agree on why we don't like the first ones (pros and cons), and also why a third module might come handy, just it feels wrong having an extra module for such small code and out of cdriven
... lets not make a big deal of it
make it your call

danepowell’s picture

so, I guess yes, we should care [about node inserts]

Sorry, I guess I'm still not clear on what advantage trying to act on node inserts/creation (not just updates/edits) would offer. I don't know how or why you would create a new node and simultaneously create a comment, and what exactly cdriven would log in that case - you've just created the node, what "change" is cdriven supposed to capture?

Or are you proposing something else?

arhak’s picture

you've just created the node, what "change" is cdriven supposed to capture?

oh, please forgive me, I was talking nonsense

danepowell’s picture

No worries, it happens to the best of us... I've had my fair share of head-desk moments on this project.

danepowell’s picture

Just to keep everyone on the same page, the patch in #15 is the latest and should work- it requires comment driven inspect to be enabled. We/I still need to decide how to handle that dependency- I'm leaning toward rolling a submodule of cdriven at this point.

danepowell’s picture

Issue tags: +mailflow 2.x

Tagging to keep track of issues related to the mailhandler-2.x/notifications-4.x workflow

obrienmd’s picture

Status: Needs review » Reviewed & tested by the community

arhak, can this (#15) be committed? I have CVS commit access, and will do so if you approve.

danepowell’s picture

Status: Reviewed & tested by the community » Needs review
StatusFileSize
new2.55 KB

Ah taking another look at this I see that arhak was wanting roll this functionality into a submodule. Here's a revised patch with that in mind- note that I haven't had time to test it, but it's substantially the same code as #15. Note that it creates two new files as part of a "Comment Driven NodeAPI" submodule.

danepowell’s picture

Status: Needs review » Reviewed & tested by the community

Yes, patch #25 does work. Let's commit it!

obrienmd’s picture

I have tested #25 as well, looks to be working great!

arhak’s picture

Dane & Obrien, you are the reviewers of this patch,
the commit call is yours

PS: the sub-module is good for now, after a rewrite it should be most likely within the main module

obrienmd’s picture

Dane - Given your recent work w/ driven, are you comfortable w/ the state this patch is in? If so, I'll commit it.

arhak’s picture

@#29 Dane has commit access, he can do it

danepowell’s picture

I am comfortable with it, I just haven't had time to commit it yet (been working on the other issues). Feel free to commit it or I'll get around to it shortly (probably tomorrow).

danepowell’s picture

Status: Reviewed & tested by the community » Fixed

Sneaking this one in under the wire so to speak :) Committed #25 to HEAD (which I assume is 6.x-1.x).

danepowell’s picture

In the spirit of developing good habits, here's a link to the commit message...
http://drupal.org/cvs?commit=504718

Status: Fixed » Closed (fixed)
Issue tags: -mailflow 2.x

Automatically closed -- issue fixed for 2 weeks with no activity.