The HTML corrector input filter mistakenly encodes (entifies) the less-than symbol in the "<!--break-->" tag. This results in the string "&lt;!--break-->" being added to node output in some cases (for example, in RSS feeds generated by taxonomy.module), which displays the literal text "<!--break-->" at the summary split.
The patch adds a single line to _filter_htmlcorrector() that removes the string "<!--break-->" before entifying the less-than character.
+ // Remove the teaser separator before entifying angle brackets.
+ $text = str_replace('<!--break-->', '', $text);
+
// Properly entify angles.
$text = preg_replace('!<([^a-zA-Z/])!', '&lt;\1', $text);
This is a conservative patch intended to fix only the appearance of the "<!--break-->" tag. I propose expanding this change to remove all HTML comments prior to encoding the less-than character. To remove all HTML comments, the line $text = str_replace('<!--break-->', '', $text); above should be changed to:
$text = preg_replace('/<!--(.|\s)*?-->/', '', $text);
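To make the difference between the current behaviour, the patch, and the broader proposal concrete, here is a minimal sketch. The three function names are hypothetical stand-ins and are not part of filter.module; only the str_replace() and preg_replace() calls are taken from the patch and the proposal above, and the real _filter_htmlcorrector() does more work than this.

```php
<?php
// Hypothetical helper: the "entify angles" step on its own.
// Any "<" not followed by a letter or "/" becomes the entity &lt;.
function entify_angles($text) {
  return preg_replace('!<([^a-zA-Z/])!', '&lt;\1', $text);
}

// Hypothetical helper: the patch. Drop the teaser separator first.
function entify_angles_without_break($text) {
  $text = str_replace('<!--break-->', '', $text);
  return entify_angles($text);
}

// Hypothetical helper: the broader proposal. Drop every HTML comment first.
function entify_angles_without_comments($text) {
  $text = preg_replace('/<!--(.|\s)*?-->/', '', $text);
  return entify_angles($text);
}

$input = "<p>Teaser</p><!--break--><p>Body <!-- note --> text, 1 < 2</p>";

print entify_angles($input) . "\n";
// <p>Teaser</p>&lt;!--break--><p>Body &lt;!-- note --> text, 1 &lt; 2</p>

print entify_angles_without_break($input) . "\n";
// <p>Teaser</p><p>Body &lt;!-- note --> text, 1 &lt; 2</p>

print entify_angles_without_comments($input) . "\n";
// <p>Teaser</p><p>Body  text, 1 &lt; 2</p>
```

One caveat on the broader pattern: the (.|\s)*? alternation may run into PCRE backtracking limits on very long comments, in which case preg_replace() returns NULL. The pattern '/<!--.*?-->/s' should match the same text with less backtracking, though I have not tested it against filter.module itself.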
Reproducing the bug
- Create a series of nodes using the "Full HTML" input filter.
- Give these nodes unique summaries (teasers). That is, instead of "splitting" the node, give these nodes unique teasers that are not intended to be joined with the rest of the body. (This is achieved by unchecking "Show summary in full view" above the node's summary/body textareas.)
- Tag each of these nodes with the same term.
- View the RSS feed for that term (at example.com/taxonomy/term/TID/0/feed). You should see the literal text "<!--break-->" at the top of each item.
The "<!--break-->" tag is also visible on each node's Dev Load tab.
| Comment | File | Size | Author |
|---|---|---|---|
| | html_corrector_break_tag.patch | 523 bytes | todd nienkerk |
Comments
Comment #1
todd nienkerk commented
Please note that, in the example above, the text "<!--break-->" has been stripped from inside the <code></code> tags. This results in my patch looking like it does nothing. Please view the patch file itself to see the un-stripped code.
(Perhaps the code filter should allow HTML comments?)
Comment #2
todd nienkerk commented
Fixing typo in the issue title.
Comment #3
damien tournoud commented
Related, but strangely the opposite: #222926: HTML Corrector filter escapes HTML comments
Comment #4
todd nienkerk commented
@Damien Tournoud: Actually, I think that issue is exactly the same. (And, unfortunately, it's showing up in 6.x and 7.x.)
Comment #5
jcnventura commented
Indeed, the problem is the same, and it had already been identified two years ago (#97182: <!--break --> is transformed into html code with lt and gt), which makes me wonder how the HTML corrector module was merged into core with such a major problem.
João
Comment #6
todd nienkerk commented
@jcnventura: I left a message in the issue you posted directing people to the patch posted here. How can we get this into the next core release?
Comment #7
gpk commented
@6: See the roadmap at http://drupal.org/node/222926#comment-1086392.
I'm marking this as duplicate of #222926: HTML Corrector filter escapes HTML comments because the underlying problem is incorrect handling of HTML comments by filter.module, and I don't believe a partial fix like this would ever be committed. Your patch may remain a useful fix though for some people until such time as the other issue is fixed.
Often patches don't get in because people don't test them, so that might be somewhere you can help.
Comment #8
g10tto commented
How does one apply this patch? What file and where inside does one place the code?
Comment #9
todd nienkerk commented
g10tto:
First, read this: Applying patches. The patch is applied to filter.module, which is the "component" of this issue (listed above). It's found in the core modules directory: /modules/filter/filter.module.
Comment #10
g10tto commented
This was patched in the latest version, v6.12. However, I still get issues from time to time (like now) on new nodes, even if I replace the filter.module file with one that I know works on another Drupal site.
Comment #11
drupert55 commented
I see this in 6.13.
Comment #12
spade commented
I still see this in 6.16.
The patch fixed it for me.
Thanks.
Comment #13
gpk commented
Note that this should be fixed in 6.17, since #222926-102: HTML Corrector filter escapes HTML comments got committed, though that issue is still marked "needs work" for one or two follow-ups/edge cases.
Comment #14
Pomax commented
Still broken. I moved my comment on it to http://drupal.org/node/222926#comment-3444322