it'd be nice to have an input filter for the description textfield in a release node that automatically converted something like:

#83339

into:

#83339: make releases real nodes

so, it turns it into a valid link, and also extracts the title of the pointed-to issue and appends that.
this will make it much easier for folks to write decent changelog/description text for their releases, which will help end users.

chx pointed out in IRC that we can add such a filter but restrict it only to project_release nodes by judicious use of form_alter(). basically, the filter would be installed and enabled site-wide, but in project_release.module (assuming that's what provides the filter), we'd just form alter *every* form that comes our way to strip this radio out of the valid options whenever there's an "input format" fieldset, *except* for the project_release node add form...

patch, anyone? ;)

thanks,
-derek

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

moshe weitzman’s picture

i think any site running project would want this filter running everywhere. one of trac's finest points is a filter just like this one. that filter is really smart and will show the issue as crossed out if the issue is fixed/closed which we could do too if we use the nocache feature of filtering. but nocache might be a bit controversial on drupal.org due to resources. we should ask steven/gerhard what they think.

Gerhard Killesreiter’s picture

sounds like a good idea

Gerhard Killesreiter’s picture

Should have read it all: I don't think a nocache filter is currently a good idea (unless somebody shows benchmarks that it isn't all that bad). The filtered strings are currently stored for 24h, I guess we could live with some delay regarding crossed out issues.

moshe weitzman’s picture

Project: Project » Project issue tracking
Version: 4.7.x-1.x-dev » 5.x-2.x-dev
Component: Releases » Issues

steven reminded me that nocache is for the whole input format so this is basically not solvable within the filter system (real time status). project_issue might be able to solve it itself within the issue system. or we accept 24 hour delay.

moshe weitzman’s picture

steven reminded me that nocache is for the whole input format so this is basically not solvable within the filter system (real time status). project_issue might be able to solve it itself within the issue system. or we accept 24 hour delay.

dww’s picture

Category: bug » feature

i wonder how this got classified as a bug... i'm pretty sure i submitted it as a feature request originally, and no one seems to have changed this with a followup. weird. must be a bug. ;) (needs a different issue, though)...

moshe weitzman’s picture

Title: add input filter to convert #12345 into a link to an issue » add input filter to convert #12345 into an issue link and [789] into a commit link

a related filter which is nice would be [x] where x is a commit number as per cvslog module. for example, http://drupal.org/cvs?commit=49762 would be [49762] ... this syntax is open for debate but this is how trac does it, and feels quite natural. yes, it would occasionally mishyperlink stuff but thats a small price to pay for the easy linking when you want it.

mwrochna’s picture

Assigned: Unassigned » mwrochna
Status: Active » Postponed (maintainer needs more info)
FileSize
2.28 KB

I'm working on it as a part of GHOP, tell me about every mistake - it's easier to change bad habits at the beginning :)
The patch includes the implementation of hook_filter('process'):
-should it be a part of project_release? I think project_issue would be more appropriate.
-should only project_issue node ids be turned into links? (isset($node->sid))
-should the title be separated? Usually #1234 turns into a link+title in the middle of a sentence, so i think the title should be put in parentheses or into the link.
-it's disabled by default in input formats - should it be enabled by default? (, how?)
and the main question:
how should the array of striked issue state ids be customizable? An additional checkbox in Administer>>Project administration>>Project issue status options would be the best, but this would involve changing issue.inc: theme_project_issue_admin_states_form() and a new column in table {project_issue_state} (add a project_release_update_5002() to .install) - is it ok?
I don't understand what "chx pointed out in IRC" (in issue description)- should something be done with input formats and forms?

aclight’s picture

Status: Postponed (maintainer needs more info) » Needs work

Nice work so far! Here are some comments I had regarding the code itself:

  1. + .......Issues with certain statuses are crossed out');
    needs a period at the end of the second sentence:
    + .......Issues with certain statuses are crossed out.');
  2. I would change
    + return array(0 => t('Issue to link filter'));
    to
    + return array(0 => t('Project Issue to link filter'));
  3. + return t('Converts references to project issues (in the form of #12345) into links'); needs a period at the end of the sentence.
  4. + if (!is_object($node) || !isset($node->sid)) continue; does not follow the Drupal coding standards. Specifically, this needs to be broken up into a multiline block enclosed by curly braces. See http://drupal.org/coding-standards for more details.
  5. + if(in_array($node->sid, $striked)) $link = '<strike>'.$link.'</strike>'; also doesn't follow coding standards.
  6. For + if(in_array($node->sid, $striked)) $link = '<strike>'.$link.'</strike>';, I think I would break this out into a (very simple) theme function. This allows individual site administrators to override how links to all issues are displayed. In general, it's best to try to avoid adding any HTML outside of theme_functionname() type functions. Project* does not always follow this rule of course, but in this case I think we gain a lot by creating a theme function to handle this (see below).
  7. I think your regular expression needs to be a little smarter. For example, if someone added a URL like http://example.com/node/123#12345 we don't want the #12345 part to be turned into a link. However, we do want people to be able to include a list of project issues like #12345, #123456,#1234567 and have those be turned into links. (see end for a caveat to this comment).

Now, for your specific questions:

-should it be a part of project_release? I think project_issue would be more appropriate.

I think project_issue is the best place to put this.

-should only project_issue node ids be turned into links? (isset($node->sid))

Yes, I think so. However, I think allowing this filter to work from within any node type is what we want. So, for example, a forum posting could have #12345 turned into a link with the title, if 12345 was a node with $node->type=='project_issue'

-should the title be separated? Usually #1234 turns into a link+title in the middle of a sentence, so i think the title should be put in parentheses or into the link.

I think I would recommend:
#89673: add input filter to convert #12345 into an issue link and [789] into a commit link. This is another place where having a theme function might be nice.
So, off the top of my head, I might do something like this:

function theme_project_issue_issue_link($node) {
  $output = '';
  if (isset($node->sid)) {
    $class = "project-issue-status-$node->sid";
    $output .= "<span class=$class>"
    $output .= l(check_plain($node->title), "/node/$node->nid");
    $output .= '</span>';
  }
  return $output;
}

Then, you'd include in your patch changes to project_issue.css that added styles for .project-issue-status-2, .project-issue-ststus-3, etc. I haven't tested this to see if it will actually work, however. I'm not sure that span is the right thing to use here either.

-it's disabled by default in input formats - should it be enabled by default? (, how?)
and the main question:

No, I think disabled by default is what we want. Administrators can enable this if they would like.

how should the array of striked issue state ids be customizable? An additional checkbox in Administer>>Project administration>>Project issue status options would be the best, but this would involve changing issue.inc: theme_project_issue_admin_states_form() and a new column in table {project_issue_state} (add a project_release_update_5002() to .install) - is it ok?

I think my theming suggestion above removes the need to add any additional options in the administrative interface. dww or hunmonk may have a different idea, however, but if you include a theme function that can be overridden by a sites template.php file and/or style.css files, I think we're providing sufficient opportunity for customization by site admins without cluttering up the project_issue admin UI.

I don't understand what "chx pointed out in IRC" (in issue description)- should something be done with input formats and forms?

Don't worry about what he said. I don't think it's relevant since output of this filter will be cachable.

I haven't actually tested this patch out yet. So, my comment about your regexp needing to be smarter could be wrong. I'll try to test it later today, but if you can get an updated patch before I get around to testing it that would be great.

Again, nice start on this--I think this will be a nice feature to have.

hunmonk’s picture

Title: add input filter to convert #12345 into an issue link and [789] into a commit link » add input filter to convert #12345 into an issue link

@moshe: the commit filter is out of scope for this patch -- i believe if we add that it actually belongs as part of cvs module or it's successor. feel free to open another issue for that as part of cvs.module or versioncontrolAPI.

my vote is also that this work goes in project_issue module. i'll wait for the updated patch and aclight's testing before i dig into this further.

dww’s picture

@mwrochna: thanks for taking on this task, and for the patch!

@aclight: thanks for the great review and comments. i agree entirely with almost everything you said. my one hesitation is that i'm a little torn about the strike-out functionality only living in the .css at the theme layer, but i suppose that's fine. we already make them chose colors for each row in issue listings based on status via hard-coding it in the theme's .css, so i guess it's reasonable to put this logic in the theme, as well.

@all: I just had a related idea: it'd be nice to give the href a title attribute (for mouse-over) that includes the current status of the issue. So, when you mouse-over one of these crossed-out links, you can see that it's either "closed", "won't fix", "duplicate", etc. Something like:

Title of issue (current_status)

So, for example:

add input filter to convert #12345 into an issue link and [789] into a commit link (patch (code needs work))

Or, maybe that's silly, and the mouse-over should just have the current status? Either way, I think it'd be a nice touch...

@moshe, hunmonk: yes, I agree the commit filter is out of scope, and belongs elsewhere, but it'd be worth doing. mwrochna: if you're interested in working on that, too, we can write it up as another GHOP task. ;)

@all: yes, if this is going to be a global filter for any node types to use, then this code belongs in project_issue. i had originally just imagined it for release nodes, but it's true, this auto-linking could be handy in other places. however, my concern is that we're always referring to comments in issues via "patch in #23 is RTBC", etc. so, we're going to get *a lot* of false links in the issue queues, exactly the place it's going to be the most confusing. :( not sure what to do about this... a few possible ideas:

A) Use [#12345] as the syntax if you want the linking.

B) Another possible alternate: #12345#. Maybe slightly wonky, but no one refers to comments by "just tested #12#, but i get fatal errors...", etc.

C) Have a setting somewhere that controls the minimum nid you need for the issue-linking to trigger. This would work on d.o, where we'd set the minimum to something like 500. Yes, there are a handful of issues in the queue somewhere below that, but if they don't get auto-linking, no big deal. and i don't know of any issue that had 500 replies. ;) so, that'd work for us, but wouldn't work at all on other sites with fewer nodes, and more active issues below around nid 200-500. :(

D) Have the filter only target specific node types, and we disable the auto-linking from within issue nodes. I think this would be a shame, since it'd be handy to have this feature in the issue queue, but I know we're going to get tons of false positives if we don't solve this some other way.

E) Some other bright idea someone else comes up with. ;)

Thanks everyone,
-Derek

mwrochna’s picture

Status: Needs work » Needs review
FileSize
1.8 KB
3.31 KB

Thanks for every suggestion!
Periods added, ifs multilined, moved to project_issue, themed.
I think the regex does what it should (back- and forward-asserts there's no word char, asserts string isn't directly in <a>), it won't match issues that are already linked.

I'll be happy to do the [commit] links too, not necessarily as another GHOP task (I guess it's just a copy-paste and some small changes).
@dww
A) I think [#1234] is the best solution, though potentialy good matches in old comments won't change.
C) I also thought about a minimum nid setting, it's easy and it would prevent the most common #1 mismatches.
E) Maybe a #1234 link, which will show the title and status only in an on-hover:visible, floating css box ;) ? (that's probably not to everybody's taste, but similar boxes work well on my forum)
P.S: a.title=status added.
P.P.S: minlink.patch for C) added

aclight’s picture

Status: Needs work » Needs review

Or, maybe that's silly, and the mouse-over should just have the current status? Either way, I think it'd be a nice touch...

I think just having the status in the href title attribute is the best. Otherwise it might be too long, if the issue title is long, and it will be truncated by some browsers.

however, my concern is that we're always referring to comments in issues via "patch in #23 is RTBC", etc. so, we're going to get *a lot* of false links in the issue queues, exactly the place it's going to be the most confusing.

Can't we just change:

+        if (is_object($node)){

to this:

+        if (is_object($node) && $node->type=='project_issue'){

That'll prevent your example from being a link, right? We'd need to add another line to the code to return the unfiltered text if this isn't true though, I think. If we're checking for node type in project_release_filter() then we can take that same check out of the theme function.

I just got a new computer this weekend and am still getting my development environment set up. Hopefully I'll be able to test this tonight.

Again, great work so far.

dww’s picture

Status: Needs review » Needs work

@aclight: Thing is, for example, both http://drupal.org/node/8 and http://drupal.org/node/9 are valid issue nodes (and #8 is still open, in fact!). :( So, anytime someone refers to "#8 is RTBC" in an issue, #8 will become a link. :( Behold:

mysql> select count(*) from node where nid<50 and type='project_issue';
+----------+
| count(*) |
+----------+
|       22 | 
+----------+

mysql> select count(*) from node where nid<100 and type='project_issue';
+----------+
| count(*) |
+----------+
|       51 | 
+----------+

mysql> select count(*) from node where nid<200 and type='project_issue';
+----------+
| count(*) |
+----------+
|      117 | 
+----------+

As you can see, there are a *lot* of issue nodes on d.o in the under 200 nid-space. :( So, restricting to issue nodes doesn't help us. We either need a different syntax for the auto-linking (which is probably best), or a threshold, or something else.

[#12345] seems good to me, but then i'm concerned about the syntax to use for the cvs commit ids. maybe [cvs:12345] ? Kinda off topic for this thread, but we might as well have a clear plan for both so there's no conflicts/confusion.

Patch needs work:
A) Some functions are still named project_release_*
B) Needs a solution to the above problems, probably [#12345]

Thanks,
-Derek

dww’s picture

Status: Needs review » Needs work

@mwrochna: Please just add new comments, instead of editing old ones to add more patches or text. This makes it easier to follow what's going on and see what changed. Thanks.

dww’s picture

Oh, and I don't think minlink is a good solution to this, honestly. It's not going to work well at all for most other sites, and it'll be annoying when you actually do want to link to nid #8. ;) I think #8: Let users cancel their accounts would be better.

Speaking of which, I didn't check, but does your regexp correctly ignore stuff already inside certain HTML tags? For example, stuff wrapped in <code> tags (as above) should never be converted. I think you mentioned it already ignores <a>. It should probably also ignore <pre>, too...

Thanks,
-Derek

aclight’s picture

@dww: ah...I see your point. Sorry for being dense :)

In that case, I agree that something like [#12345] is best.

mwrochna’s picture

FileSize
3.35 KB

Changed project_release_* to project_issue* (remember to recheck your input formats)
Changed format to [#1234], I think everybody agreed.
Oh, <code> and <pre> tags... I added a workaround (it forward-asserts the next '<' isn't '</code>', just like how it did with </a>) - but this will match anything like
<code>[#1234] ... < ... </code>.
I don't know how this could be improved, I think it would be simpler (and faster) for the <code> filter to prepare and process things, so that they couldn't be changed by other filters.

Should $restrict be set to true when calling project_issue_state()?

webchick’s picture

Title: add input filter to convert #12345 into an issue link » GHOP #53: add input filter to convert #12345 into an issue link
aclight’s picture

Status: Needs work » Needs review

@mwrochna--make sure to change the status of the issue back to code needs review when your patch is ready for review. This helps get eyes on the issue and helps us keep track of things better.

aclight’s picture

Status: Needs review » Needs work
FileSize
12.72 KB
12.47 KB

Nice work. Here are some comments on the code:

  1. In project_issue.css, you're setting postponed (sid=4) to be the same style as fixed, closed, etc. I think it should be displayed like active, cnw, etc. since a postponed issue isn't really closed--it's just postponed.
  2. + if ($node->type=='project_issue') {
    should have spaces on each side of the '==' as per drupal coding standards.
  3. It occurs to me that this filter is opening a potential access elevation if users who are not permitted to view issues (or all issues) can view, say, a forum post where there is a reference to an issue. They wouldn't be able to see the actual issue, but would be able to see if an issue with a certain nid existed and what the title and status of the issue was. If I'm missing some access checking somewhere, let me know. The best way to handle this is probably to change:
    +        if (is_object($node)) {
    

    to:

    +        if (is_object($node) && project_issue_access('view', $node)) {
    
  4. Remove the leading '/' in the URL for the link here:
    +                 "/node/$node->nid",
    
  5. There's a problem with filter ordering here. If I put the project_issue filter in this patch to be at the top of my filters, so that the order, from lightest to heaviest, is:
    1. Project Issue to link filter
    2. URL filter
    3. HTML filter
    4. Line break converter

    then I get the results seen in the attached screenshot named project_issue_filter_light.png. Issue #329 is fixed, but is not crossed out. I think this is because is not an allowed tag in the filtered HTML filter, and so it's getting taken out when the HTML filter filters the text after the project issue filter adds the span elements.

    On the other hand, if I move the project issue filter to be the heaviest, so that my filter order is:

    1. URL filter
    2. HTML filter
    3. Line break converter
    4. Project Issue to link filter

    then the results are shown in the attached image named project_issue_filter_heavy.png. In that case, the span elements come through, but now a reference to an issue within a <code></code> block is also made into a link, which is not what we want. See below for more on this.

Should $restrict be set to true when calling project_issue_state()?

Probably. I don't see how this would hurt anything. dww?

One other thing (related to the filter ordering problem I mentioned above):
I think the problem where references to issues within <code></code> tags getting highlighted is that the line break filter inserts <br /> tags, and your regexp isn't ignoring those. I'm not quite sure how to ignore them, but I would guess it's possible.

For example, here's the $text variable on my test page before it's processed by preg_match(). Note that I'm using the php filter instead of the code filter here so that I don't have to escape some of the tags. Also note that the code below doesn't quite match the screenshots, but I didn't feel like remaking the screenshots:

<p>issue [#329] should be linked<br />
issue #329 should not be linked<br />
issue [#329a], [abc#329], [#32abc9] should not be linked<br />
issues [#329],#217: a reindex in the search module times out with a fatal error,#214: drupal sites update via XML-RPC should be bidirectional should all be linked<br />
issues [#329]#217: a reindex in the search module times out with a fatal error#214: drupal sites update via XML-RPC should be bidirectional should all be linked<br />
project release [#326] should not be linked since it's not an issue<br />
it would be cool if [#204#comment-1133] was also linked but it's probably not<br />
maybe [#329,#217,#214] should be linked?  probably not<br />
&lt;br /&gt;&#10;issue #329 should not be linked&lt;br /&gt;&#10;issue [#329] should not be linked because it&#039;s inside code tags&lt;br /&gt; // This does not work and is linked&#10;<br />
issue [#329] should not be linked because it&#039;s inside code tags<br />  // This works and is not linked
this time it should be linked because the issue reference is outside the code tags[#329]<br />
<a href="">this time it should not be linked  [#329] because it's in an &lt;a&gt; tag</a></p>

Let me just say that this filter is really sweet. This is going to be really nice to have, especially if it's installed and enabled on d.o.

dww’s picture

Re: $restrict for project_issue_state() -- no you don't really want that, and yes, it would "hurt" since it complicates the job that project_issue_state() has to do, costing some additional performance load. That option just says "should the array of choices passed back be restricted to the issue states the current user is allowed to use?". In our case, we don't care about that, we just want the name of the current state.

Keep on rocking, this is looking great. Can't wait to deploy it on d.o. ;)

mwrochna’s picture

Status: Needs work » Needs review
FileSize
3.61 KB

I'm not sure why some of the <code>ed ids got linked, but this patch (3) should help.
(Now it forward-asserts there's no match for: any non< chars + any non<code> tags, and then a '</code>').
Sorry if it's still wrong, everything appears correct on my install.

Added support for [#204#comment-1133] - maybe these should look shorter?

by the way - shouldn't issue.inc: project_issue_access('view') return TRUE when user_access('access project issues') is true? (it just break;s and returns nothing instead)

dww’s picture

re: project_issue_access('view') -- no, it most definitely should not return TRUE. That'd clobber all other access control modules (e.g OG, node_access_by_role, etc, etc). By returning NULL, it means "i don't care what happens, let other modules that want to reject it do so". if no one returns FALSE, and the user has permission to view content at all, it will be allowed. for example, if we returned TRUE, it'd mean that issues posted to private groups wouldn't be private for anyone with 'access project issues' permission.

sorry, no time to review this closer now, just wanted to answer your question...

aclight’s picture

Status: Needs review » Needs work

Ok--you've fixed all of the concerns I raised in #21. Unfortunately, I think I wasn't thinking through the access elevation issue clearly--we want to use node_access(), not project_issue_access(). Sorry about that.

While you're re-rolling the patch, you might as well go ahead and add a bit of documentation about the filter and mention that the weight of the filter needs to be heavier than that of the HTML filter. It might even be wise to imple hook_requirements() to make sure that this is the case. If you want an example of how to do this, I know that the GeSHi filter module implements hook_requirements() for this purpose. Of course, you'd want to check in hook_requirements to make sure that the project issue input filter is enabled, because otherwise there's no need to produce an error message.

A few sentences of documentation can go in INSTALL.txt.

After you reroll the patch I'll test it one last time, but this is looking very good. great work!

moshe weitzman’s picture

is this supposed to be a cacheable filter? if so, i don't see how it can be compatible with node_access() modules. whomever happens to view the text when it is evaluated will determine the access control that gets cached.

if we make it non cacheable, it negates all caching for the whole input format so you make this whole effort virtually worthless from a performance perspective. sorry to bear bad news.

i think we need a big status report warning if a node access module is found ... we should still use node_access as dww suggests since some admins are ok with the problem i describe.

mwrochna’s picture

Status: Needs work » Needs review
FileSize
5.33 KB

regex matching: I wrote hook_requirements(), but when I tested it turned out that (after the regex change in (3)?) both weight arangments work. I left the function in the patch in case I'm wrong, but if it works for you too, then it should be removed.

documentation: I only mentioned about enabling the filter in INSTALL.txt. Should I add something in UPGRADE/README/filter_tips?

security: If we removed node_access() any user could check all issue titles by previewing comments. Now (with node_access() used) in the majority of cases, when an author will write [#1234] in a node, he will cache it and make it's title available to all who read the node - I think it's not worth a mention, just like if he used the title directly. Links will be sometimes randomly disabled by users who can see the comment, but not the issue - but I don't think this happens in reality. There's still a little probability that a user could post an unauthorized [#1234] and wait until some admin recaches it, but again - imho it's not worth warning about it.

aclight’s picture

Status: Needs review » Needs work

It looks like you're creating this patch against project_issue.module v. 1.38.2.5. Ideally, you should be creating this against HEAD, which is at 1.73. The patch still applies, though with fuzz:

File to patch: project_issue/project_issue.module
patching file project_issue/project_issue.module
Hunk #1 succeeded at 948 with fuzz 2 (offset 74 lines).

I tested your most recent patch and you are right--it doesn't seem like the order of the filter matters any more. So I think it's safe to remove the hook_requirements(). Sorry about that!

Moshe reminded us of a good point above. The way you've written this so far is to allow the results of this filter to be cached. That's good performance wise, but as Moshe said this won't really work if a node access module is installed. I don't think this will be a problem on d.o, since as far as I know there are no node access modules installed, and all users can view project issues. However, certain sites might want to restrict viewing of project issues to certain users, and this filter needs to respect this, or at least make it pretty clear when it isn't respecting this.

So, here's my suggestion:
A. Create a filter setting that determines if the results of the filter are cached or not. In the description, it needs to be made clear that filter caching is by input format, not by filter. In other words, if the project issue filter is part of the Filtered HTML format, all text that uses the Filtered HTML format will NOT be cached. I think the default of this filter should be NOT to cache. The description should also mention that sites using node access modules should not cache because they will not get the desired results.

B. If caching is turned on, and if the site runs a node access module, I think we should implement hook_requirements and return a message of REQUIREMENT_WARNING level suggesting that this is not a good idea because:
a) Results will be unexpected
b) Issue titles will be revealed to users who may not otherwise have access to the issues. However, users will not be able to read or otherwise access the issues. Instead, they will get an "Access denied" page.

So, if we're going to do B (other suggestions are welcome, of course), that brings up the question of how we know if any node access control modules are enabled. I don't know how to check this, though. Anyone else?

moshe weitzman’s picture

it occurs to me now that we can just change our filter declaration based on whether we see any node access module. here is some code for hook_filter('no cache').


if (empty(module_implements('node_grants'))) {
  return TRUE
}
else {
  return FALSE;
}

aclight’s picture

@mwrochna: I chatted with dww briefly about this today in #drupal-ghop, and he is behind the general plan in #29 and #30. We agreed that moshe's suggestion is much simpler than my suggestion in #29 B, so let's go with that. Let's add some help text that makes it very clear that any site with a node access control module that uses this filter will give up caching on any node with a content type that includes the filter. But I don't expect this to be a huge problem.

mwrochna’s picture

Status: Needs work » Needs review
FileSize
6.28 KB

Updated from v1.38 to HEAD.
When testing the order I forgot the problem was about <span>, not <code>, so the requirement is necessary.
Added moshe's code, a REQUIREMENT_WARNING with same condition and a note in $op='description'.
I understand that issue information is never restricted when there is no hook_node_grants() (because there's no project_issue_access()), yes?

aclight’s picture

Status: Needs review » Needs work

A. moshe's code is reversed. We really want this:

    case 'no cache':
      $grants = module_implements('node_grants');
      if (empty($grants)) {
        return FALSE;
      }
      else {
        return TRUE;
      }

The double negatives are confusing.

B. We still have a problem now if not all users can view project issues. In a sense, project_issue acts like a node access control module with respect to project_issue nodes, but since it doesn't implement hook_node_grants() the input filter will be cached if there are no node access control modules but if all users can't view all project_issues.

So, I initially thought that we could just cache the filter results if anonymous users have access project issues permission. But then I thought that this will be a little too restrictive, because we really only need to not cache the filter results if the access to any project_issue node is more restrictive than access to any node where the filter could be used. So, for example, we'd be fine caching the filter results if anonymous users did not have access nodes privs. but authenticated users had both access nodes and access project issues privs.

Since this is getting pretty complicated, I would be fine if you just added a simple check in the 'no cache' operation that checked to see if anonymous users had 'access project issues' privs. If so, and if there are no node access control modules installed, then the filter results can be cached. Otherwise, they are not.

C. I installed simple_access (a node access control module) after I installed and configured the project issues input filter. However, I don't think that project_issue_filter() was called with $op='no cache' until I clicked the "Save configuration" button at admin/settings/filters/1.

I'm not sure if it's possible to do this, but we probably want to force the 'no cache' operation to be called whenever a module is enabled or disabled so we make sure we're doing the right thing regarding caching.

moshe weitzman’s picture

A: oops

B: there can be no access control like 'access project issues'. if it exists, it is a bug. no core node types have such a permission because it is incompatible with the node access control system. thus B is moot (but we might have uncovered a different problem) ... i am not too familiar with project codebase so sorry for the high level POV.

C: good catch. i would never have thought of that.

aclight’s picture

@moshe:

B: there can be no access control like 'access project issues'. if it exists, it is a bug. no core node types have such a permission because it is incompatible with the node access control system. thus B is moot (but we might have uncovered a different problem) ... i am not too familiar with project codebase so sorry for the high level POV.

Maybe they shouldn't be there, but they're there....

From project.module

function project_perm() {
  $perms = array(
    'administer projects',
    'maintain projects',
    'access projects',
    'access own projects',
  );
  return $perms;
}

and from project_issue.module:

function project_issue_perm() {
  $perms = array(
    'create project issues',
    'access project issues',
    'edit own project issues',
    'access own project issues'
  );
  $states = project_issue_state();
  foreach($states as $key => $value) {
    $perms[] = "set issue status ". str_replace("'", "", $value);
  }
  return $perms;
}
dww’s picture

@moshe: and they've been around a *long* time. :( and, coincidentally, something of a giant pain in the neck: http://drupal.org/node/168760

mwrochna’s picture

FileSize
7.24 KB

A. Oops, sorry, corrected.
B. So.. do I add || !node_access('access project issues', drupal_anonymous_user())?
C. 'no cache' are only called by filter.module:filter_admin_format_form_submit(), then the & result is saved into the database per input format. Anyway, there's no callback function when enabling a module or anything, so we'd have to check often or modify the core. I added a check and a REQUIREMENT_ERROR.
Maybe adding a setting for caching and informing about it would be better?

Tomorrow and the day after I'm away, sorry for any delay.

mwrochna’s picture

Status: Needs work » Needs review
moshe weitzman’s picture

B: there is no right way to do this beyond ripping out the insecure permission 'access project issues' and a'ccess projects' and any code that depends on it. and then warn any sites when the release comes out that the broken feature that they are depending on is no longer available and they should install content_access() or similar instead.

we need to involve dww at this point.

dww’s picture

removing these perms is *WAY* outside the scope of this issue. i don't see why you call them "insecure", either. they're just additional restrictions. no where does hook_access() return TRUE, subverting other node access modules. they're just another way to return FALSE...

that said, we could certainly consider removing them at some point, just not now...

and, *that* said, i'm fine with cutting corners a little in this patch if the edge cases caused by these perms are causing grief. something like don't cache the filter if (!user_access('access project issues', array('uid' => 0))).

i really don't have much time to ponder this more, but i wanted to share my initial impressions.

moshe weitzman’s picture

i call them insecure becuase anyone can see project_issues in listings like home page, search_results, etc regardless of their 'access issues permission. Thats misleading on a very important point.

I agree it isn't especially in scope here.

dww’s picture

@moshe: not true. behold project_db_rewrite_sql(). That's exactly what the SA I pointed you to above was about.

aclight’s picture

dww and I talked about this again on IRC and came up with the following plan:

1. Add a setting in the filter where the user can choose whether the filter results should be cached or not. So, for example,

    case 'settings':
      $formats = filter_formats();
      $form['project_issue'] = array(
        '#type' => 'fieldset',
        '#title' => t('Project Issue to link filter'),
        '#collapsible' => TRUE,
        '#collapsed' => FALSE,
      );
      $form['project_issue']['project_issue_filter_nocache'] = array(
        '#type' => 'radios',
        '#title' => t('Filter caching'),
        '#default_value' => variable_get('project_issue_filter_nocache', 'prevent'),
        '#callback' => 'project_issue_filter_settings_submit',
        '#options' => array(
          'prevent' => t('Prevent caching'),
          'allow' => t('Do not prevent caching')
          ),
        '#description' => t('If you are using any node access control modules on your site,
          <em>or</em> if anonymous users on your site do not have access to view project issues,
          you should choose "Prevent caching". Otherwise, users may have the ability to see the
          titles of project issues that they do not otherwise have access to.
          <p><strong>Please note the following:</strong>
          <ol>
            <li>If you prevent caching any input format in which the Project issue to link filter is enabled
            will not be cached.  This could decrease performance on your site.</li>
            <li>If you choose "Do not prevent caching", output of this input format will not be cached
            if the other enabled filters for this input format prevent caching.</li>
            </ol>', array('%input_format' => $formats[$format]->name))
      );
      return $form;

    case 'no cache':
      $nocache = variable_get('project_issue_filter_nocache', 'prevent') == 'prevent';
      return $nocache;

2. Use hook_requirements() to warn the administrator in the following circumstances:

if (!nocache)
  // see if anonymous users have 'access project issues' perms
  $allowed_roles = user_roles(FALSE, 'access project issues');
  if (!isset($allowed_roles[DRUPAL_ANONYMOUS_RID])) {
        --> warning <--
  }

  // see if there are any node access control modules enabled
  $grants = module_implements('node_grants');
   if (!empty($grants)) {
       --> warning <--
    }
}

The warning would advise the administrator to prevent caching for the project issue input filter.

I spent some time trying to figure out how to do this, and I don't think it's even possible.

hook_filter() is only called with $op == 'no cache' when the settings for the input format (admin/settings/filters/#) are saved. When the configuration for the input filters is saved (admin/settings/filters/#/configure) 'no cache' is not reevaluated. I can't figure out a w ay to do this, either. So, right now, I can't think of a way to make the filter caching conditional based on a filter configuration option.

Any suggestions?

@mwrochna: Unless there are suggestions about how to address this, why don't you concentrate on getting the filter to work properly (in other words, the regular expressions). Here's the problem:
I hadn't tried enabling the codefilter module before, so I did that this time, and it also conflicts with the project_issue filter. The project_issue filter needs to be weighted below the codefilter filter, or otherwise issue references within code tags are linked and references between &lt?php ?> tags are not made into links but instead the HTML for the link is displayed. I believe this is because the codefilter module prepares text it processes by replacing <code> and <?php tags with \xFEcode\xFF and \xFE?php tags. So we either need to fix this in the regexp or add another requirement that project_input filter be the heaviest filter. An alternative might be to respond to the 'prepare' op and replace [#XXXXX] project issue references with \xFE#XXXXX\xFF references. Then, in process, if the reference isn't valid or otherwise isn't converted into a link switch it back to [#XXXXX]. I haven't thought this idea through carefully so there could be reasons that this would be a bad idea.

Also, it's Drupal coding convention to make all comments true sentences (unless they are really short or it doesn't make sense to make them a sentence). So, capitalize the first letter, and put a period at the end.

If we can't figure out a way to get the caching to work properly with node access control modules, etc., then for the purposes of the GHOP task I think we'll just say make the patch for a filter that never caches and you'll get credit for that. But hold off on changing any of the caching stuff yet--let's see if anyone has good ideas for how to get around this problem.

moshe weitzman’s picture

once again, a filter which is non cacheable will never get installed on drupal.org or any other performance sensitive site.

basically, i think this module should cache by default and if a node access module is recognized, throw a status report warning. i don't even think we should implement a non cache option. it is worthless.

dww’s picture

@moshe: i mostly agree. however i disagree with the final point: not every site running project_issue has the scalability concerns that d.o itself has. for example, i could imagine wanting this feature, even without caching (since we have a bunch of node privacy stuff there) on sec.d.o. therefore, i think there's value in an option to enable/disable caching with clear "more accurate for sites with access control and fresher status/title results" vs. "actually scales well but only works in simpler configurations" sort of choices/instructions.

but, fundamentally i agree that caching should be on by default, and we should just generate a warning if we think we detect a configuration that doesn't play nicely with caching.

aclight’s picture

@moshe: I agree with you that a non-cacheable filter has no chance of being installed on d.o. But at this point, I'm less concerned about whether this can be installed on d.o but more with getting to a point where we can call this ghop task completed so mwrochna can start working on another (hopefully Drupal) task.

Unless I'm missing something, there's really no clean way to allow the user to select whether or not to make the filter results cacheable. Drupal just doesn't have the necessary hooks, etc. to do this in an intuitive way.

Here's an idea--why don't we create two filters. Project issue links (cached) and Project issue links (not cached). An administrator would enable one, but not both, of the filters. The two filters would use almost the same code--only the 'may cache' op would be different. This way it's clear to the user if caching is happening, and it's easy to effectively turn caching on and off. It's not perfect, because ideally we would need only one filter to do this, but this is the best way I can think of to get what we want without going to great lengths to get there. What do others think?

dww’s picture

@aclight: I guess that's reasonable. However, I'd hate to see a big function forked to change a single line in 1 case. Some care would need to go into how to share the code as much as possible, given the constraints core places on us with the somewhat whacky API we're running into. :(

aclight’s picture

Status: Needs review » Needs work

@dww: There's no need for function forking. Here's an example of what the two functions would like. I haven't tested the filtering itself to make sure that it works as expected, but I think it should.

/**
 * Implementation of hook_form_filter()
 * @ingroup project_issue_filter
 */
function project_issue_filter($op, $delta = 0, $format = -1, $text = '') {
  switch ($op) {
    case 'list':
      return array(
        0 => t('Project Issue to link filter (not cached)'),
        1 => t('Project issue to link filter (cached)'),
      );

    case 'no cache':
      switch ($delta) {
        case 0:
          return TRUE;
        case 1:
          return FALSE;
      }

    case 'description':
      switch ($delta) {
        case 0:
          return t('Converts references to project issues (in the form of [#12345]) into links. Use this filter when using node access control modules or if access to project issues is not given to all users (both anonymous and authenticated).');
        case 1:
          return t('Converts references to project issues (in the form of [#12345]) into links. Use this filter if no node access control modules are used on your site and if all users can access all project issues.');
      }

    case 'prepare':
      return $text;

    case 'process':
      $regex = '(?<!\w)(\[#(\d+)(#comment-(\d+))?\])(?![^<]*<\/a>|([^<]|(<(?!code>)))*<\/code>|([^<]|(<(?!pre>)))*<\/pre>|\w)';
      $offset = 0;
      while(preg_match('/'.$regex.'/', $text, $preg_matches, PREG_OFFSET_CAPTURE, $offset)) {
        $offset++;
        $match = $preg_matches[1];
        $nid = $preg_matches[2][0];
        $comment_id=$preg_matches[4][0];
        $node = node_load($nid);
        if (is_object($node) && node_access('view', $node)) {
          $link = theme_project_issue_issue_link($node, $comment_id);
          $text = substr_replace($text, $link, $match[1], strlen($match[0]));
          $offset = max($offset, $match[1]+strlen($link));
        }
        else {
          $offset = max($offset, $match[1]+strlen($match[0]));
        }
      }
      return $text;
  }
}

/**
 * Implementation of hook_requirements()
 * @ingroup project_issue_filter
 */
function project_issue_requirements($phase) {
  $requirements = array();
  $input_formats = array();
  if ($phase == 'runtime') {
    foreach (filter_formats() as $format => $input_format) {
      $filters = filter_list_format($format);
      if (isset($filters['project_issue/0']) && isset($filters['project_issue/1'])) {
        // Put up an error when both Project issue link (cached and non-cached) are enabled.
          $incorrect_formats['!input_format'] = l($input_format->name, "admin/settings/filters/$format");
          $requirements[] = array(
            'title' => 'Project Issue to link filter',
            'value' => t('Filter conflicts were detected.'),
            'description' => t('The Project issue to link (not cached) and Project issue to link (cached) filters are both enabled for the !input_format input format.  Only one Project issue to link filter should be enabled for any input format.', $incorrect_formats),
            'severity' => REQUIREMENT_ERROR,
          );
        }
        //error when HTML filter's weight is higher
        // @todo: look for position of both filter/0 and filter/1
        // @todo: may need to check position with respect to other filters such as codefilter
        if (isset($filters['filter/0']) && $filters['project_issue/0']->weight <= $filters['filter/0']->weight) {
          $description_names['%issuefilter'] = $filters['project_issue/0']->name;
          $description_names['%htmlfilter']  = $filters['filter/0']->name;
          $requirements[] = array(
            'title' => 'Project Issue to link filter',
            'value' => t('Some filter conflicts were detected.'),
            'description' => t('%issuefilter should come after %htmlfilter to prevent loss of layout and highlighting.', $description_names)
               .' '.l(t('Please rearange the filters.'), "admin/settings/filters/$format/order"),
            'severity' => REQUIREMENT_ERROR,
          );
        }
      }
  }
  return $requirements;
}

moshe weitzman’s picture

well, if we are going to tr to support both cacheable and non cacheable then i think we should do so dynamically. we can just add a submit handler (this is a form api feature) to the modules form which checks for node access and if found, calls a 'rebuild filter formats' function, whatever that function may be.

aclight’s picture

@moshe: Though making the caching/non-caching status dynamic would be nice, I think the logic to do so will be to complicated and error prone. Quoting myself from comment #33:

So, I initially thought that we could just cache the filter results if anonymous users have access project issues permission. But then I thought that this will be a little too restrictive, because we really only need to not cache the filter results if the access to any project_issue node is more restrictive than access to any node where the filter could be used. So, for example, we'd be fine caching the filter results if anonymous users did not have access nodes privs. but authenticated users had both access nodes and access project issues privs.

Even if we could cleanly work the logic out with regards to when caching should be enabled, I'm not sure how we'd do what you mention above. There's not really a rebuild filter formats function--it's just a submit handler. So I think we'd have to call the form building function and then pass that off to the submit handler. Isn't that considered hackish?

As for adding a submit handler to the modules page, I'm not sure how to do that, so I can't really speak to how practical that suggestion is.

dww’s picture

Adding the submit handler to the modules page is easy. I don't know about the rest of this, since I don't know the filter system that well and I haven't spent the time really groking this issue.

mwrochna’s picture

FileSize
7.42 KB

The regex will work correctly if all code-escaping filters output a <code> or <pre> before the regex acts (tags like <php do so, when they have a lower weight). Trying to make the regex more general (ex. to permit some filters to have a higher weight) will just make it slower and help in few cases. I added an array ($low_filters) of code-escaping filters (filter/0,codefilter/0,filter/1), an error will be put when one of them has a higher weight. Should I add every appropriate filter to $low_filters?

ad #49-51: I tried to do this dynamically before, but as aclight said, there's no clean way to 'rebuild filter formats', we'd have to call the form building function and call the input_format's form submit handler, or change the database manually. I ended up putting a requirement_error redirecting to this form (in patch 6).

I think it's a good idea to give the user the choice (by 'forking' with $delta) - many admins prefer publishing all issue titles and statuses (still hiding the issue's content) to disabling cache.

aclight’s picture

I think I prefer that we create two separate filters (one cached, one not cached) as I suggested in #48. But ultimately it's going to be dww or hunmonk that will have to make the call. I think there are arguments for both sides, but I think it just gets too complicated if we try to have the caching be determined dynamically by the code itself.

If you're in a hurry to get this closed, you might want to create two separate patches, one that implements #48 and one that implements what was suggested in #49 and #51. Unfortunately, I can't be of much help with regards to how to implement #49/#51, since I'm not very familiar with FAPI.

If you're not in so much of a hurry, then I guess you should just wait and see what dww and/or hunmonk think.

As for the regexp stuff, I think it's fine as is. I think it's fine to just require that the filter be weighted heavier than other code filters.

hunmonk’s picture

my feeling is that we've entered territory that is out of scope for the original task, which was to write the filter. these other considerations are important in terms of getting this code properly committed to project*, but i feel that this is something that the project* maintainers should work out.

so what i'd like to propose is this: implement the no-cache filter in the drupalorg.module for now. this accomplishes three important things:

  1. it allows mwrochna to actually finish this task :)
  2. it allows the drupal community to immediately benefit from his work.
  3. it allows the project* maintainers to work out a more complete solution over the long haul, without anyone feeling like we're committing a hack. when we figure out a best implementation, we can move it from drupalorg.module to project_issue, or wherever it belongs.

i'd like dww to weigh in on this idea before we go forward with it. if he's ok with it, then i'll post some more details about how exactly to implement/test this functionality in drupalorg.module

dww’s picture

I'm fine with splitting it. However, I think hunmonk means "implement the caching-version of the filter now", and let's worry about the no-cache configurations later. The drupal community can't benefit from a no-cache filter...

Thanks, all.

hunmonk’s picture

right, sorry. caching filter it is.

@mwrochna: i'm going to let aclight fill you in on how to implement this in drupalorg.module, since he's more familiar w/ the task.

dww’s picture

i think the drupalorg.module should stay out of this. project_issue should provide the cache-able filter for now, and we document the limitations, but the cache-able version would be useful to many other sites, too.

aclight’s picture

@dww: hunmonk and I had a long chat in IRC about this today. The reason hunmonk suggested putting this in the drupalorg.module is that we know that d.o gives all users 'access project issues' privileges and that no node access control modules are installed.

@mwrochna: I'm sorry doing this task has become so drawn out. You have had this task for 10 days, and you have been responsive with providing good patches that for the most part have done what we have requested. However, due to limitations in the Drupal filtering system and a lack of understanding on our part of how complicated this task would get, you've gotten drawn into these arguments about how to best attack the problem.

Here's your escape: if you do what I describe below, and if your patch works within the parameters I describe below, I will mark this GHOP task as closed, regardless of whether the patch is marked RTBC here. I think keeping you from taking other tasks when we can't agree on what to do about this issue would be unfair. I hope that you'll stick around with this issue even after you're officially done with the task, because this will be a very nice improvement on d.o and you've done a great job so far.

So, here's the plan:
1. The filter should always return FALSE to hook_filter when $op == 'no cache'. Don't worry about checking for node access control modules there.
2. Within the filter itself, keep the logic as you have it now. In other words, keep this part:

+        if (is_object($node) && node_access('view', $node)) {

3. The filter should remain a part of project_issue.module
4. In hook_requirements(), you should check for the following and return an appropriate error:
a) Is a node access control module installed (ie. do any modules implement hook_grants())?
b) Do anonymous users have the 'access project issues' permission? If not, return an error.
c) Is the project issue link filter weighted inappropriately? (you've already done this part, just keep it in).
5. Add text to both README.txt and INSTALL.txt pointing out the limitations of this filter. You don't need to go into great details, but you should strongly suggest that sites using either a node access control module or on which anonymous users do not have 'access project issues' should not enable this filter.

In the hook_requirements() logic, take out any check to see if the filter is being cached, since you're changing it so that it will always be cached.

Please test your patch with and without a node access control module to make sure that the warnings from hook_requirements() are present when appropriate. Please do the same by changing the 'access project issues' permissions for anonymous users.

I expect that 1 more patch should wrap this up. You've basically done all of the above at one point or another, so it will just be an issue of putting these pieces of code together in one patch. I'll review your patch, and assuming it does what I specify above I'll close this GHOP issue and give you credit.

Whether dww or hunmonk will commit it is another issue, but lets keep that separate from finishing the GHOP task.

@all: I think there have been a lot of valid issues that have been brought up here with regards to how this should be implemented. Carefully considering all of our options and doing this right the first time doesn't fit well with the relatively rushed nature of GHOP tasks. I hope I'm not offending anyone by, in a sense, putting my foot down about what is needed to complete the task.

dww’s picture

@aclight: You're being utterly reasonable about the whole thing. ;) I appreciate your efforts, and I'd be shocked if your level headed approach offends anyone. If so, it's their problem, not yours... Thanks!

@mwrochna: I'd like like to reiterate aclight's appreciation for your efforts in here, and to express my apologies that this has mushroomed into a huge, nasty, complicated issue. Sorry we weren't clear enough earlier about keeping the scope of the GHOP task separate from the final, perfect solution that gets committed for the long haul. Thanks for sticking with it this far! ;)

mwrochna’s picture

Status: Needs work » Needs review
FileSize
8.44 KB

Appart from doing what you mentioned, I corrected the regex so it doesn't omit tags like '<code class=...>', and added geshifilter/0 and bbcode/0 to $low_filters. Please check if the comments in readme and install.txt are understandable, correct English :)
Thanks for all the support so far!

aclight’s picture

Status: Needs review » Needs work

@mwrochna: You're officially done! Nice work. Thanks for sticking with this patch when things got difficult. I'm going to go ahead and mark this GHOP task closed so you can move on to something else if you would like. As I said before, it would be great if you could stay with this patch until it is committed, but we understand if you don't have time.

@all: Setting to cnw because I think some of the text can be clarified a little. I'll try to post a patch doing this shortly.

aclight’s picture

Status: Needs work » Needs review
FileSize
9.31 KB

Here's a patch that (hopefully) clarifies the node access control and access project issues permissions warnings/explanations.

hunmonk’s picture

Status: Needs review » Needs work

haven't tested yet. this is only a code style review:

  1. do we want the hard-coded issue states numbers for the CSS classes? we're wanting to get away from that. maybe for now it's best, but i wanted to at least mention it.
  2. i'd like to see more thorough doxygen on all the new functions -- specifically descriptions of the args.
  3. spacing on $comment_id=0 should be $comment_id = 0
  4. NULL not null
  5. $title = check_plain("#$node->nid#comment-$comment_id: $node->title"); -- that title looks awfully noisy to me. maybe we shouldn't include the fragment there?
  6. $output = "<span class=project-issue-status-$node->sid>"; -- attribute values should be double quoted.
  7. Statuses are shown on hover, issues with certain statuses are crossed out -- the word 'statuses' feels clunky, can we reword that?
  8. $match[1]+strlen should be $match[1] + strlen
  9. perhaps the description for the anon user access check in project_issue_requirements() could mention the conflict more specifically? ie, currently there's no mention that b/c anon doesn't have access, it's causing a problem.
  10. .' '.l(t('Please rearange the filters.') bad spacing
dww’s picture

@hunmonk: right on. a few replies:
re: 1. see what i said @aclight in comment #11 above.
re: 5. yeah, i think we should leave out the comment fragment in what we display.
re: 10. bad spelling, too ("rearrange").

mwrochna’s picture

re: 1. it's not that hard to add a column to admin/project/project-issue-status, I can do this if you think hard-coded css is not enough.
re: 2.,3.,4.,6.,8.,10. corrected
re: 7.,9. I tried to change something, but probably someone else should do this ;)

re: 5. using any comments in automatic links is noisy broadly speaking. Sometimes we want to show the issue title, because it's its only mention in a thread, sometimes we list comment-ids, and we don't want the repeat the title each time. Sometimes we use the #89673#comment-658331 syntax (with the comment_id), copying the url, sometimes we refer to 'comment #64' (the relative number of the comment in an issue).
Maybe instead of outputting #89673#comment-658331: issue title we should output just #89673#64, or even #64, and let the user write #89673: GHOP #53: add input filter to convert #12345 into an issue link [#comment-658331],[#comment-658332]?

$result=..query..('SELECT subject FROM {comments} WHERE cid = %d', $comment_id);
theme_project_issue_issue_link($node, $result);

(the subject is usually in the form of '#64')

Patch uploaded on http://token.w.staszic.waw.pl/inne/project_issue-filter10.patch , d.o attachments stopped working.

hunmonk’s picture

Status: Needs work » Needs review
FileSize
9.3 KB

uploading the patch in #65 to the queue

hunmonk’s picture

Status: Needs review » Needs work

re #1: let's just leave it as is for now.

re #5: what about #89673-64? that looks pretty readable to me.

also, shouldn't this have a t() call?

'title' => 'Project Issue to link filter',

there are a few instances like that in project_issue_requirements().

mwrochna’s picture

Status: Needs work » Needs review
FileSize
9.72 KB

Oops, t()'s added.
5. I changed to [#89673#comment-658331] -> #89673-64, and disabled the crossing out for comments.
I don't have any idea how to fix 9.

hunmonk’s picture

Status: Needs review » Needs work

this is getting very close, but there are still some problems

  1. the theme calls in project_issue_filter() are written incorrectly. they should be of the form theme('project_issue_issue_link', $node) -- this runs the theme call through the theming system instead of calling it directly, which allows theme developers to properly override the function in their theme if they desire (ie, if they have a theme called 'foo', they could write a function foo_project_issue_issue_link($node), which would override theme_project_issue_issue_link($node)
  2. $title = check_plain("#$node->nid-". substr($comment_subject, 1)); <-- this isn't working correctly. if i do check out [#125#comment-1], i get check out #125-omment #1 back
  3. i think we can tweak the behavior of the comment situation a little better, to make the usage and display consistent. what i'm thinking is that we do #125-3: Who's Online and Chatbox, and that displays #125-3: Issue title. thoughts?
  4. i don't feel like we should be checking the node type in the theming function -- the theming should strictly handle the display. can we move that check to the filter processing?

i tested the requirements checking pretty thoroughly, and it's working well. let's keep at this, we're almost there!

mwrochna’s picture

Status: Needs review » Needs work
FileSize
9.46 KB

1.,4.: corrected
2.: fixed with 3.
3.: You mean using the comment number=69 instead of the cid=667478, and writing #89673-69: GHOP #53: add input filter to convert #12345 into an issue link? Ok, just check if the query searching for the cid isn't too slow (we need it to append the '#comment-6674748' fragment). When no cid is found the comment number is ignored (false passed to theme()).
To list comments, I thought about writing comments #125-1: Who's Online and Chatbox,#125-2: Who's Online and Chatbox,#125-4: Who's Online and Chatbox in issue #125: Who's Online and Chatbox and show the title only in the latter, but comments #125-1: Who's Online and Chatbox,2,4 or something like this is fine, so I reverted the issue's title in 'commented' links.

mwrochna’s picture

Status: Needs work » Needs review
hunmonk’s picture

man, this patch is frought with edge cases...

i found another one, which we should address if we're going to keep the strikethrough formatting:
since the filter output is cached, when you edit the status of an issue to a strikethrough status, it doesn't display the strikethrough, since we're not clearing the cached filter output on status edit.

to be on the safe side, we might want to clear the cached data anytime somebody edits the issue, period. thoughts?

also, you don't need the check_plain() around the link titles in theme_project_issue_issue_link() -- l() provides a check_plain() to the link text automatically.

mwrochna’s picture

see comments #3,#4 - I concluded that cache is cleared every 24h (at least on d.o), and that this is a delay we can live with.
Or, to workaround caching we'd have to
-for each issue, save what content refers to them (ugly, new edge cases),
-or for each {cache_filter} row save what issues it refers to, then search (ugly and slow).
-alter project_issue_project_edit_form_submit and project_issue_update_by_comment
Or come up with some bright idea :) (some separate, uncached place for those states? or a small, fast, uncached post-filtering?)
Or we can remove the feature (or make it optional/disabled when cache is enabled)

I think we should leave it as it is, just document/warn about the delay (unless problems with caching arise often, in which case more general changes could pay off)

hunmonk’s picture

ok, i guess we can live w/ the 24 hour cache as compared to nothing. so here's the very small list of remaining things:

  1. add some doc regarding the delay of the status stuff. i think adding it to the README file is fine, or possibly the project_issue help page -- use best judgement.
  2. remove those two extraneous check_plain() calls that i mentioned in #72
  3. i need some explanation regarding the thread usage in this query:
    $comment_id = db_result(db_query("SELECT cid FROM {comments} 
    WHERE nid = %d AND thread = '%s'", $nid, int2vancode($comment_number) .'/'));
    

    it seems to work fine, i just want to make sure this is a reliable way to pull the comment id, and i don't really understand the vancoding stuff.

i gave the regex what i feel is a decent workout, and it seems to be performing perfectly. this should be ready for commit after those issues above are addressed.

hunmonk’s picture

Status: Needs work » Fixed

@mwrochna: i've decided you suffered long enough on this patch, so i took it home for you. great work, and great persistance! we'd love to have you on board for more project* related work anytime :)

  1. fixed the extraneous check_plain() calls.
  2. add doc to README.txt to detail the caching delay with the reference links when an issue status is edited.
  3. investigated the vancode/thread method of determining the comment id, and i'm satisfied it works.
  4. cleaned up some minor coding style stuff in comments, and added a few extra code comments where necessary.

committed to HEAD.

hunmonk’s picture

Status: Fixed » Active

hrm. i get a merge conflict when i try to upgrade drupal.org, because of the views patch.

reopening until we can get this running on drupal.org

hunmonk’s picture

Assigned: mwrochna » hunmonk
hunmonk’s picture

Status: Active » Needs work

sigh, the drupal.org curse strikes again...

apparently, the query we're using to get the comment ID based on the comment number isn't working properly. thus, i don't think we can count on the thread to get us to the comment id.

i see three options here:

  1. query directly for the subject ie, db_query("SELECT cid FROM {comments} WHERE subject = '%s'", '#'. $comment_number);
  2. add a comment_number filed to {project_issue_comments} and query for that. this has the added benefit of allowing us flexibility with the comment titles later on
  3. drop support for the filter with respect to links directly to comments

thoughts?

hunmonk’s picture

well, it gets even worse. when i enabled this filter on d.o, all hell broke loose :(

couldn't access certain issues, apache was segfaulting every couple of minutes, etc, etc...

now what??

moshe weitzman’s picture

@hunmonk - it is possible that this is what happens when you clear the filter cache on drupal.org regardless of our new filter. cleariung this cache is an extremely rare, and extremely expensive event. once the cache is significantly rebuilt, the site stabilizes. just throwing out a possibility.

mwrochna’s picture

@ #78.1. I used the subject before, but I thought it didn't work (see comment #69.2), and like you said - we may need to change the comment title.
So of course we can/should add a comment_number field, just I'm not sure if 'thread' isn't enough.
Can you give some examples where this didn't work ({comments}.thread wasn't int2vancode($comment_number) .'/')?
(ex. SELECT thread,subject FROM comments WHERE thread !~ '^[[:alnum:]]*/$' LIMIT 2;)

hunmonk’s picture

@mwrochna: i tried it on a couple of issues w/ a lot of followups. all i can tell you is that some of them didn't match up.

hunmonk’s picture

@mwrochna: alright, i take back my assertation that the threading stuff doesn't work -- i just re-tested a bunch, and they seem to be working perfectly. dunno what happened before. maybe i made a mistake.

so now we just need to figure out what's causing the segfaults on d.o. it was suggested that the filtering code was eating too much memory (d.o has some kind of process memory protection running). i wasn't able to reproduce the problem on either project.drupal.org or my local test install, though -- even pulling up huge issues.

hunmonk’s picture

regarding the segfaulting on d.o, killes, nnewton and i have gotten this far:

  • it's a stack overflow
  • it happens on issues with over 47 comments
  • it only happens when the filter is turned on
  • it's not bombing in the filter code itself
  • we can only repro it on drupal1, etc
mwrochna’s picture

I'll try to move code tag escaping out of the regex (it may be the slowest/ the most memory consuming part).
For now, you can try to drop support for comments (patch attached), this removes one query and a little from the regex.

mwrochna’s picture

The patch removing comment support was against an earlier version, I attach it again in case you'd like to try that (_noc1.patch).
Try it if you think a db_query (the one with int2vancode) could have caused the segfault.

I tried to change the filter as I said (_lin1.patch) - the regex is shorter, and there's a bunch of code to escape code tags in a linear way. But as comments are usually short, the change in time won't be noticable, the code is longer, and it may use more memory than it did (copying $text into $newtext), so it's most probably useless - I'm attaching it just in case.
Try it if you think the preg_match loop was causing the segfault, or if you want a minor speed improvement.

Did the segfault happen on issues/comments with no issue-links (when the filter didn't change anything)?

hunmonk’s picture

@mwrochna: i've clearly determined the following:

  1. the segfault happens in the this function: preg_match("/$regex/", $text, $preg_matches, PREG_OFFSET_CAPTURE, $offset)
  2. the segfault only happens when both the project issue filter AND the code filter are enabled. with the code filter disabled, everything works fine -- except that we have no code filter :P
  3. the segfault seems to only happen when there's actually <code> tags used in the comment. i know for a fact that the comment at http://drupal.org/node/89673#comment-653228 causes the segfault.
ezyang’s picture

Hey folks, while I haven't tried it yet myself, have you tried generating a backtrace when PHP segfaults and forwarding it over to the devs?

hunmonk’s picture

FileSize
19 KB

attached is the exact string that the preg_match() function segfaults on. this is the text pulled from comment #48 above, after it has been run through the code filter.

so if anyone can have a look at this, and at the regex we're using, and see if they can figure out what the problem is, it would be greatly apprciated :)

hunmonk’s picture

FileSize
8.52 KB

attached snippet has been tested as creating the segfault on php 5.2.5

Gerhard Killesreiter’s picture

Note: I've been able to shorten the script a bit, but the result didn't crash on drupal.org anymore, only on my laptop.

Gerhard Killesreiter’s picture

Status: Needs work » Needs review
FileSize
9.46 KB

I've investigated some more and we do run into a stack overflow. This is actually well documented for pcre.

There are two workarounds:

1) increase stack

2) fix regexp

I've done 2) and the simplified regexp is:

$regex = '(?|([^<]++|(<(?!code)))*<\/code>|([^<]++|(<(?!pre)))*<\/pre>|\w)';

Note the two double plusses.

I've updated the patch from #70.

aclight's test cases from #84 still work as can be seen on http://test.drupal.org/node/89673

Gerhard Killesreiter’s picture

The filter is now enabled here on drupal.org

hunmonk’s picture

Status: Needs review » Fixed

we didn't need the whole patch again, so i just took the relevant change in $regex.

looks like it passes aclight's testing post with flying colors. committed to 5.x-2.x, officially deployed on d.o, sec.d.o, project.d.o.

rejoice!

Pasqualle’s picture

aclight’s picture

Status: Fixed » Active

Hm....this is interesting.

I believe the problem here is that for this issue, there is no comment #28.

I did a little test comment here: http://test.drupal.org/node/89673#comment-705776

Just look at the top 7 lines of that.

For the first three links, I deleted comment #3 of that issue. Note that when I do a shortcut link to #3, the filter does not print a comment for that link (since it does not exist) and instead just links directly to the issue node itself.

For the second set of links (these point to this same issue but on test.drupal.org), the reference to #28 (which does not display when you look at all the comments on this issue) actually links to comment #29 and the reference prints as #89673-28. All references to comment n where n > 28 have links that point to comment n + 1

I'm a site admin on d.o and don't see any evidence of comment #28, so I don't think it's unpublished.

I'm not sure where the bug is here. It's probably not actually related to this patch itself, but rather it's either just an anomaly somehow related to the IFAC upgrade, or maybe it's just a one off in the d.o database, or possibly in the int2vancode() or vancode2int() functions from comment.module.

This probably isn't worth spending much time on, but I suppose it's possible that this bug is a symptom of a greater problem somewhere.

hunmonk’s picture

from comment_save():

$max = db_result(db_query('SELECT MAX(thread) FROM {comments} WHERE nid = %d', $edit['nid']));

since the comment numbers are always tracked sequentially, i don't think we can rely on the thread value -- the code above suggests that it's possible to delete the most recent comment, add a new one, and have the same thread value, even though the comment number would be incremented by one. this would throw things out of sync.

so we need another way to map the comment numbers to the cids. two suggestions:

  1. just use the subject: db_query("SELECT c.cid FROM {comments} c INNER JOIN {project_issue_comments} pic ON c.cid = pic.cid WHERE c.subject = '%s'", '#'. $comment_number);
  2. add a 'comment_number' column to {project_issue_comments}

1 would certainly by fine for now, 2 would allow us more flexibility long term if we decided not to control the comment subjects like we do now.

thoughts?

Pasqualle’s picture

interesting problem.
for issue link
we use node.nid as issue id,
but we do not use comments.cid as comment id (because it is not displayed), so we use comments.subject..
It seems very site specific, but users are use to it, so it will remain this way..

Is not the node.id missing from the SELECT statement in the first option? That select should return multiple rows.

As I know comparing strings is slower than comparing numbers, so the second option could be faster, but there are not to many comments per node id, so the difference should be minimal..

As we use comment subject for the link, the first option is fine. As I feel, there is no such thing like $comment_number, it is a subject. Do not create a column for it.

aclight’s picture

As a note, i discovered that if you put in a comment number that is valid but which points to an unpublished comment, you still get a link that goes specifically to that comment and not just to the general issue. The only information revealed here is that the comment number (cid) is that of a valid comment--there's no information about the actual comment itself that is visible to anyone who wouldn't otherwise have the ability to see the unpublished comment.

But it should be easy to add a "AND status" condition to the WHERE clause when we fix this issue, so I figured I'd just report this here instead of creating a separate issue.

hunmonk’s picture

Status: Active » Needs review
FileSize
4.28 KB

lord, the update queries for this took me awhile... :|

tested the upgrade path on mysql and postgres and it seems to work perfectly. i also addressed the unpublished comment issue mentioned in #99

created a new comment and it worked fine. filter seems to be working perfectly w/ the new setup.

hunmonk’s picture

forgot mysqli in the update code. this should fix it.

aclight’s picture

Status: Needs review » Reviewed & tested by the community

Tested with mysqli and functionality works as expected, even when an unpublished comment is referenced in the project issue, eg:

This fixes blah blah [#12345-45]

The only potential gotcha I noticed is that if a comment is unpublished, any links to it using the syntax above will stay as they are until the filter cache is cleared for that node, which I think takes 24 hours by default. This is not a problem at all in my eyes because that's completely out of our control. I only point this out in case someone tests this and thinks at first that it doesn't work as intended.

hunmonk’s picture

Status: Reviewed & tested by the community » Fixed

the update also passed the test on project.drupal.org.

committed to 5.x-2.x. i'll deploy on d.o later today.

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.