Widont treats <pre> as </p> and alters preformatted text
guardian - April 10, 2008 - 22:17
| Project: | Typogrify |
| Version: | 6.x-1.x-dev |
| Component: | Code |
| Category: | bug report |
| Priority: | normal |
| Assigned: | mikl |
| Status: | closed |
Description
I noticed that the widont feature operates on the last line of code snippets placed inside <pre> </pre> tags, so I had a look at the implementation and found that the regular expression it uses is incorrect.
Please find attached two files:
- php-typogrify.fix_.patch provides my personal fix to the current implementation
- php-typogrify.new_.patch patches the current implementation to the version found at http://wordpress.org/extend/plugins/wp-typogrify/ (possibly more complete and up to date)
Patching against the newer implementation seems to work for me; otherwise you can stick with the implementation the module currently uses and apply only the widont fix.
Cheers
| Attachment | Size |
|---|---|
| php-typogrify.fix_.patch | 1.52 KB |
| php-typogrify.new_.patch | 19.3 KB |
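
For readers without the attachments, the behaviour asked for in the description can be sketched roughly as follows. This is a minimal illustration in Python, not the contents of either patch: the names PRE_BLOCK, LAST_SPACE and widont_skip_pre are hypothetical, and the pattern is a much simplified stand-in for the module's real widont regular expression. The idea is to split the text on whole <pre>...</pre> blocks and apply the widont substitution only to the pieces outside them.

```python
import re

NBSP = "&nbsp;"

# Hypothetical splitter: the capturing group makes re.split() keep each
# whole <pre>...</pre> block as its own list element.
PRE_BLOCK = re.compile(r"(<pre\b[^>]*>.*?</pre>)", re.IGNORECASE | re.DOTALL)

# Simplified widont rule (an assumption, not the module's pattern):
# the whitespace before the last word that precedes </p> or </h1>..</h6>.
LAST_SPACE = re.compile(r"([^\s<>])\s+([^\s<>]+\s*</(?:p|h[1-6])>)", re.IGNORECASE)


def widont_skip_pre(text):
    """Apply the simplified widont rule everywhere except inside <pre> blocks."""
    parts = PRE_BLOCK.split(text)
    # Even indexes are ordinary markup; odd indexes are the captured <pre> blocks.
    for i in range(0, len(parts), 2):
        parts[i] = LAST_SPACE.sub(r"\1" + NBSP + r"\2", parts[i])
    return "".join(parts)


if __name__ == "__main__":
    print(widont_skip_pre("<p>Hello big world</p><pre>a b\nc d</pre>"))
```

Splitting first keeps the substitution pattern itself small, which may also help with large articles, although that has not been measured here.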

#1
more accurate title and status
#2
#3
I can say that the newer widont implementation breaks pages on my site. If you have a big article, the complex regexp that widont contains overflows preg's stack. It doesn't happen when I'm using the older version. Maybe Python has a better regexp implementation, but PHP sucks in this case.
#4
In widont($text), the call preg_replace($widont_finder, '$1 $2', $text); returns NULL in my case, and preg_last_error(); returns PREG_BACKTRACK_LIMIT_ERROR.
By playing with that big regexp in widont(), I found that if you remove [^\s<>]* from it, everything starts working fine. Actually, I simply don't understand what they are trying to match with this thing (" <>" ????), but it's very likely that the devil sits in it.
I've attached the content on which this happens.
#5
Okay, that change seems sensible. I'll give it a shot.
#6
Is this fix applied in 5.x-1.0-beta4, please?
#7
No, I've been trying to put some test cases together, so I can try and refactor some of the heavy RegExes in Typogrify without creating new bugs – sorry for the delay…
#8
I think I ran into this on v6.x: I posted an article with heavy use of PRE and CODE, only to have the entire body disappear when the Typogrify filter was enabled.
#9
Here's a patch with which my site has worked fine for about 3 months so far. It's against the latest 6.x.
#10
All right, I've committed the patch from #9, going to roll a beta 5 release soon :)
#11
Hmm, it would seem that the patch from #9 introduces a regression – the nbsp doesn't replace the space, but is added, giving in effect two spaces.
I've written some more tests, so now I need to figure out a way to fix one without breaking the other…
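
The kind of regression described in this comment can be reproduced with a small Python sketch. It does not show the actual patch from #9, whose pattern is not quoted in this thread; both patterns and both function names below are hypothetical. The sketch only contrasts a substitution that keeps the matched space and adds the nbsp next to it with one that consumes the space.

```python
import re

NBSP = "&nbsp;"


def widont_adds_nbsp(text):
    # Hypothetical buggy variant: the whitespace is captured in group 1 and
    # written back, so the entity ends up next to the original space.
    return re.sub(r"(\s+)([^\s<>]+\s*</p>)", r"\1" + NBSP + r"\2", text)


def widont_replaces_space(text):
    # Hypothetical intended variant: the whitespace is matched outside any
    # group, so it is dropped and only the entity remains.
    return re.sub(r"\s+([^\s<>]+\s*</p>)", NBSP + r"\1", text)


if __name__ == "__main__":
    sample = "<p>one two three</p>"
    print(widont_adds_nbsp(sample))
    print(widont_replaces_space(sample))
```

A test that compares the two outputs on the same input would catch this class of bug without depending on the full widont pattern.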
#12
#13
The regression was fixed in #447416: Widont not working properly – thanks dboulet
#14
Automatically closed -- issue fixed for 2 weeks with no activity.