Widont treats <pre> as </p> and alters preformatted text

guardian - April 10, 2008 - 22:17
Project:Typogrify
Version:6.x-1.x-dev
Component:Code
Category:bug report
Priority:normal
Assigned:mikl
Status:closed
Description

I noticed that the widont feature operates on the last line of code snippets put inside <pre> </pre> tags so I had a look at the implementation and noticed that the regular expression used is incorrect.

Please find attached two files:

  • php-typogrify.fix_.patch provides my personal fix to the current implementation
  • php-typogrify.new_.patch patches the current implementation to the version found at http://wordpress.org/extend/plugins/wp-typogrify/ (possibly more complete and up to date)

patching against the newer implementation seems to work for me, otherwise you can stick to the implementation the module currently uses and only apply the widont fix

cheers

AttachmentSize
php-typogrify.fix_.patch1.52 KB
php-typogrify.new_.patch19.3 KB

#1

guardian - April 11, 2008 - 09:55
Title:Widont fails to bypass text inside <pre></pre> tags» Widont treats </pre> as </p> and alters preformatted text
Status:active» needs review

more accurate title and status

#2

guardian - April 11, 2008 - 09:56
Title:Widont treats </pre> as </p> and alters preformatted text» Widont treats &lt;pre&gt; as &lt;/p&gt; and alters preformatted text

#3

neochief - January 8, 2009 - 19:36

I can say that newer widont implementations breaks pages on my site. If you have a big article, complex regexp which widont contains overflows preg's stack. It's not hapening when I'm using older version. Maybe python has a better regexp implementation, but php sucks in this case.

#4

neochief - January 9, 2009 - 17:13
Status:needs review» needs work

In widont($text) the preg_replace($widont_finder, '$1&nbsp;$2', $text); returns NULL in my case. The preg_last_error(); returns PREG_BACKTRACK_LIMIT_ERROR.

By playing with that big regexp in widont(), I found that if you'll remove [^\s<>]* inside it, everything begins to work fine. Actually, I simply don't understant what they trying to match with this thing (" <>" ????), but it's very likelly that the devil sits in it.

I've attached a content on which is happens.

AttachmentSize
test.txt 8.4 KB

#5

mikl - January 12, 2009 - 18:06
Assigned to:Anonymous» mikl

Okay, that change seems sensible. I'll give it a shot.

#6

guardian - April 3, 2009 - 09:36

is this fix applied in 5.x-1.0-beta4 please?

#7

mikl - April 3, 2009 - 15:25

No, I've been trying to put some test cases together, so I can try and refactor some of the heavy RegExes in Typogrify without creating new bugs – sorry for the delay…

#8

DamienMcKenna - April 20, 2009 - 17:20

I think I ran into this on v6.x, I posted an article with heavy use of PRE and CODE only to have the entire body disappear when the Typogrify filter was enabled.

#9

neochief - April 21, 2009 - 00:23
Title:Widont treats &lt;pre&gt; as &lt;/p&gt; and alters preformatted text» Widont treats <pre> as </p> and alters preformatted text
Version:5.x-1.x-dev» 6.x-1.x-dev
Status:needs work» needs review

Here's a patch with which my site works fine about a 3 month so far. It's against latest 6.x

AttachmentSize
typogrify-widont.patch 1.33 KB

#10

mikl - April 25, 2009 - 16:28
Status:needs review» fixed

All right, I've committed the patch from #9, going to roll a beta 5 release soon :)

#11

mikl - April 26, 2009 - 17:42

Hmm, it would seem that the patch from #9 introduces a regression – the nbsp doesn't replace the space, but is added, giving in effect two spaces.

I've written some more tests, so now I need to figure out a way to fix one without breaking the other…

#12

mikl - April 26, 2009 - 17:42
Status:fixed» needs work

#13

mikl - April 28, 2009 - 16:06
Status:needs work» fixed

The regression was fixed in #447416: Widont not working properly – thanks dboulet

#14

System Message - May 12, 2009 - 16:10
Status:fixed» closed

Automatically closed -- issue fixed for 2 weeks with no activity.

 
 

Drupal is a registered trademark of Dries Buytaert.