The following text will cause codefilter to prematurely stop syntax highlighting:

<code>
<?php
print "Hello </code>";
?>
</code>

Here's a live example:

<?php
print "Hello 

";
?>

As you can (currently) see, that definitely needs to get fixed.

An HTML snippet that includes <code> will also cause codefilter to stop syntax highlighting prematurely:

<code><p>This is a snippet of <code>html</code> that includes the code tag</p></code>

Live Example:

<p>This is a snippet of html that includes the <code>code element.

Comments

johnalbin

Title: <code> within <code> prematurely stops syntax highlighting » </code> string prematurely stops syntax highlighting

It's actually the </code> string that causes the error, not <code>.
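
A minimal sketch of the failure, written in Python for brevity. It assumes the filter locates blocks with a non-greedy <code>...</code> pattern; `CODE_BLOCK` is a hypothetical stand-in, not the module's actual expression.

```python
import re

# Hypothetical stand-in for the filter's block matcher (an assumption:
# the module's real pattern is not quoted in this issue).
CODE_BLOCK = re.compile(r"<code>(.*?)</code>", re.DOTALL)

# The first example from this issue: a PHP string that contains </code>.
text = '<code>\n<?php\nprint "Hello </code>";\n?>\n</code>'

match = CODE_BLOCK.search(text)
captured = match.group(1)      # what would be treated as the code block
leftover = text[match.end():]  # what falls outside the block

print(repr(captured))
print(repr(leftover))
```

The non-greedy match ends at the first </code>, so the block is cut off right after `Hello ` and the rest of the PHP, plus the real closing tag, lands outside it.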

zeta ζ

Whereas:

print "Hello ";

is OK.

It only fails if there is an unclosed <code> outside the <?php ?>.

nancydru

As I mentioned in the other post, I had this problem too, so I'm tracking it.

zeta ζ

Example #2a is not very useful, because any HTML should be outside the <?php ?>.

If the code tags are outside the <?php ?>, the whole lot needs another pair of enclosing code tags, which is when codefilter fails.

Shouldn't we also bear in mind the vast back catalogue of nodes that were written with codefilter as it works now? We might need

if (nid < 210000) {
  ...
}

for d.o.

nancydru

I doubt that fixing this will break any existing nodes, because people haven't been able to get it to work. Also bear in mind that codefilter is not used solely on d.o. I use it on two sites, and neither one is in any danger of reaching 210,000 nodes in my lifetime.

zeta ζ

Nancy: I wasn’t intending to put it in codefilter, just in d.o.

Currently I can’t get my test site to put <?php ?> in <code>tags</code>, so I’ll have to work on this.

soxofaan

The solution/workaround we offer in GeSHi filter is to support both <code> and [code] as code block delimiters, so if you need <code> in a code block, you write [code]...<code>...</code>...[/code].

Supporting all sorts of nesting corner cases will lead to a regular expression nightmare or the need for a parser, which isn't worth it IMHO.
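
As a rough illustration of the two-delimiter idea (not GeSHi filter's actual code), the Python sketch below accepts either [code]...[/code] or <code>...</code>. The names `BLOCK` and `code_blocks` are made up for this example.

```python
import re

# Either delimiter style opens a block, and the block must be closed by the
# same style. Illustrative pattern only; it is not taken from any module.
BLOCK = re.compile(
    r"\[code\](?P<square>.*?)\[/code\]"
    r"|<code>(?P<angle>.*?)</code>",
    re.DOTALL,
)

def code_blocks(text):
    """Return the bodies of all code blocks found in text, in order."""
    blocks = []
    for m in BLOCK.finditer(text):
        body = m.group("square")
        if body is None:
            body = m.group("angle")
        blocks.append(body)
    return blocks

snippet = "[code]<p>This is <code>html</code> with the code tag</p>[/code]"
print(code_blocks(snippet))
```

Because the outer block is delimited by [code], the inner <code> and </code> are kept as literal text. A block that has to contain both delimiter styles would still break, which is the kind of corner case the comment above says is not worth chasing.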

nancydru

I looked at the module and admit I don't understand preg_replace. Could you increment a depth counter every time you encounter a start tag, decrement it when you encounter an end tag, and only quit when the depth is back to zero?

johnalbin

Nancy, unfortunately, you can't implement a depth counter, because the code can be just a snippet and there's no guarantee that it will be well-formed (with matching opening and closing tags). See my first example for what I mean.
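
To make that concrete, here is a hypothetical depth counter sketched in Python (`find_block_end` is an illustrative name, not something in the module). It handles well-formed nesting, but it stops early on the first example from this issue, where the </code> inside the PHP string has no matching opener.

```python
import re

TAG = re.compile(r"</?code>")

def find_block_end(text):
    """Return the index just past the </code> that brings the depth back
    to zero, or None if the depth never returns to zero."""
    depth = 0
    for m in TAG.finditer(text):
        if m.group() == "<code>":
            depth += 1
        else:
            depth -= 1
            if depth == 0:
                return m.end()
    return None

# Well-formed nesting: the counter finds the true end of the block.
balanced = "<code><p>a <code>html</code> b</p></code>"
print(find_block_end(balanced) == len(balanced))

# Snippet with a stray </code> inside a string: the counter stops too soon.
unbalanced = '<code>\n<?php\nprint "Hello </code>";\n?>\n</code>'
end = find_block_end(unbalanced)
print(repr(unbalanced[:end]))
```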

Stefaan's solution is not bad. But I'm testing another solution (which I'm not sure will work at this stage).

This issue is similar to: http://drupal.org/node/38047 ("?>" string prematurely stops syntax highlighting)

zeta ζ

Ah yes… I was thinking of a depth counter (using a gargantuan regex) :-(

zeta ζ

I think soxofaan in #7 is on the right track, although I wouldn’t support both as such. Rather, I would leave <code>...</code> as HTML tags and require a different pair of tags to invoke codefilter (e.g. [cf] for this post). There would then be no reason to nest codefilter tags, unless you want to show what to type to invoke codefilter, in which case we could handle the single exception of [cf][cf].*[/cf][/cf], i.e. no need to handle partial quotes.

Rationale:
The W3C doesn’t seem to have much to say about <code>...</code>: "CODE: Designates a fragment of computer code." I’m not even sure they were thinking of HTML as computer code. By default, <code>...</code> renders only as monospaced and doesn’t even preserve consecutive spaces. In a browser, partial snippets like <code>...<code>...</code> break the page, and <code>...</code>...</code> fail (both as you might expect).

I’m confident this will make codefilter much simpler and quicker. Although it is supposed to filter input, as far as I understand it does so at output time (and time again), so it could be the source of a performance hit if we handle too many edge cases.

corsix

As #9 mentioned the "?>" premature closing issue, I feel it is pertinent to point out that the patch in that issue handles this edge case as well (see http://img174.imageshack.us/img174/7770/codefilterow7.png for an example).

nancydru

@zeta-zoo: it's interesting that they have so much more to say about <pre>.

@corsix: I'm sure Mr. Albin is aware of that and considering it.

mr.baileys

Issue tags: +Invalid XHTML

Subscribing. I was trying to show my users how to use the code filter by embedding <code>-tags inside another set of <code>-tags...

As an unfortunate side effect, nesting <code>-tags also seems to render the resulting document invalid XHTML because of an unmatched </code> tag.