The node teaser generator keeps the whole body as the teaser, if there is indication that PHP code is included in the body (since it is unkown, what is going to come out of the PHP parser). The only serious problem is that it does not check if the PHP filter is included in the input format, so it does keep the complete body even if there is no PHP parsing on the code. This breaks PHP code examples in articles or blog posts, which are not supposed to be parsed, but which require proper teasers.

The solution is to make the teaser generation code aware of the input format (pass the whole node), and let it provide the teaser that way, depending on the input format.

The attached patch also cleans up that very unprofessionaly looking sequence of copy-pasted stuff in the node_teaser function.

Comments

gábor hojtsy’s picture

Here is an alternate approach, which does not take the node as reference, but asks for the body and the format, and then returns the teaser, as before. Pick this or the previous one, or suggest something else :)

gábor hojtsy’s picture

Is someone willing to review these fine patches? Either one :)

jonbob’s picture

-1 on using the node reference; some node types need to extract the teaser from a field or collection of fields not named "body."

The other patch looks fine to me, but I haven't tried running it.

gábor hojtsy’s picture

JonBob, this is just an internal decision, as node_teaser() is not a hook implemented, it is just the default logic used to generate the teaser. Other modules are free to provide a teaser for a node in the node validation process before this logic is invoked, so that the custom teaser is not overwritten. Passing on the node reference or the body and the format is just a matter of programming style...

jonbob’s picture

Not entirely. It seems sensible to me for a module to concatenate two fields and run them, as a unit, through node_teaser() to generate a teaser that way. Using the node reference, they would have to abuse the body field for this (and if one of the two fields in question is "body" already, that's a sticky situation).

gábor hojtsy’s picture

Ah, understood, and agreed!

Steven’s picture

Can't this be solved by allowing the passing of NULL as the $format, and to disable this check in that case? I really like this solution, it's much better than the current hack.

Steven’s picture

I was talking about the alternate solution by the way.

gábor hojtsy’s picture

Steven's suggestion implemented.

gábor hojtsy’s picture

Seems like everyone is happy with this patch :) Or is that only my single minded view? ;)

dries’s picture

I wonder if this is the best way to check for the PHP code filter. It's not exactly clean code. Except for that check, this patch looks good.

gábor hojtsy’s picture

Dries, the filter format API provides filter_list_format() which returns an array in this format. I don't think that redefining the filter API is a good move in this part of the release cycle.

dries’s picture

But what do we do when I create a filter X which needs special treatment just like the PHP filter?

gábor hojtsy’s picture

Node module will not know anything about arbitrary filters, but it is not supposed to know anything about them either... We can put some new shiny hook in here to let filters help doing teasers, but I think it is just overcomplicating the question here. Sure this patch will not solve issues with arbitrary filters, this just solves an existing bug, and cleans up code. I don't have a good idea though to handle arbitrary filters while generating the teaser, and would really like to get the fix into Drupal for 4.6.0. Any good ideas are welcome!

gábor hojtsy’s picture

I would be happy to work more on this patch, given a direction...

TDobes’s picture

+1 for this patch.

As for arbitrary filters that may require special handling, that problem existed before this patch existed. My thought on solving it would be to move teaser generation to occur on the filtered text. This would solve the problem of needing to deal with special-cases filters like the PHP code filter and any others that will encounter problems if their input is truncated. Of course, moving teaser generation in this way will cause it to potentially generate unclosed tags which won't be fixable by an htmlcorrector filter because the filters would have already done their processing. Of course, if htmlcorrector were part of core, we could choose to run it after the teaser generation rather than with the other filters.

Anyway, those are just thoughts... not necessarily "good ideas." :-) I suggest we commit this patch as-is to deal with the PHP code filter as a special case, and deal with allowing other filters to be special cases later. As far as I know, no other filter needs special handling at the moment other than the PHP filter.

gábor hojtsy’s picture

TDobes, just a small comment: we don't do teaser generation on the output to allow dynamic teasers. The PHP filter for example allows for dynamic information to be put into the text.

dries’s picture

Generating the teaser from the raw input isn't going to work; this is illustrated by the PHP filter, but I could come up with a handful of other examples. The proposed patch is a quick hack; it fails to address the larger problem.

Generating the teaser from the output (filtered input) may result in unexpected behavior, but at least it isn't a hack. Generating the teaser from the output would work if the filter system would not cache all generated output. I'd have to dig the code but if I remember correctly, this was within the filter system's capabilities. For example, how does it deal with 'non-teaser dynamic output'?

I'd certainly prefer to explore the 'on output' path.

gábor hojtsy’s picture

Dries, you don't care about unexpected output, as long as it is generated by some fine code? Really? I needed to read your words multiple times, but still cannot understand your point here...

dries’s picture

Goba, not quite. I suggest to use 'on output teasers' in combination with _not_ caching dynamic teasers. Like that, there won't be unexpected behavior because PHP-based teasers wouldn't get cached. This should be fairly easy to accomplish because the filter system supports this already. Open filter.module and search for 'no cache'. Makes sense?

gábor hojtsy’s picture

Aham, so 'on output teaser' generation would mean that teasers would be dynamically generated all the time, and not stored in the DB? The suggested name of the approach implies that. Or I just got totally lost in the discussion... I increasingly feel like wasting your time asking to explain to me what you are up to... And most of the time I arrive at dead ends. Like changing the teaser generation so dramatically in this phase of the 4.6 release process does not seem to be a feasible project, although it might make sense for 4.7.

dries’s picture

Goba: most teasers would get cached except those that are specified to be dynamic (i.e. those who are generated by a filter with the 'no cache' attribute set). The current filter system only caches filtered output if it can be cached. For example, it does not cache filtered output that is the result of the PHP filter. This shouldn't (and wouldn't) be any different for teasers. See filter_format_allowcache() and check_output() for details.

Either way, I'll give it some more thought. Like, I want to investigate how this affects teasers being stored in the database.

gábor hojtsy’s picture

Well, this would surely be nice to fix in 4.6.

Steven’s picture

I'm definitely in Goba's camp. Yes we would be dealing with the PHP filter as a special case, but then it is a really special filter. In fact, it is the only filter like it: it does not transform user-input from one format to the other, it executes it. It is itself a hack, but one which works very well within the context of the filter system.

By generating teasers and caching them, we would need to regenerate all teasers if filter settings change. This isn't a problem in Dries' situation, but it is not clear to me how his approach would deal with storing teasers in the database. We also need to consider how this would mesh with custom teasers, like flexinode.module or excerpt.module.

Using PHP code in Drupal core is only going to increase when we're going to collect code snippets in the handbook. I don't think treating the PHP filter as a special case is unacceptable for core: until we see another code evaluator filter, this whole discussion is unnecessary.

There is another thing to consider: the only reason we can short-cirtcuit teaser generation for PHP code is because you can decide yourself in the PHP code whether to output a teaser or the full body. This is not something you can do when the "no cache" filter in question is a 'normal' filter without conditional logic in it.

The current filter system works the way it does because people currently want it to do all those things. Custom evaluation filters is not one of those things.

The fact is, this patch improves PHP teaser behaviour massively already. What we have now in core is an even bigger hack. +1 on this patch, for 4.6 final.

Steven’s picture

StatusFileSize
new4.23 KB

I think there was a bug in Goba's patch. strpos(...) != false should be strpos(...) !== false. strpos() can return 0 when the string is matched at the beginning.

New patch for CVS, should apply to 4.6 too.

Steven’s picture

I convinced Dries recently that this patch is still a good idea. Generating teasers dynamically is something which we /may/ do in the future, but until someone actually asks for that, this solution is much better than what we had before.

Commited to 4.6 and HEAD.

Anonymous’s picture