Line break converter can result in empty node display - PCRE limits
Behrang - April 2, 2007 - 16:48
| Project: | Drupal |
| Version: | 6.x-dev |
| Component: | filter.module |
| Category: | bug report |
| Priority: | critical |
| Assigned: | Unassigned |
| Status: | reviewed & tested by the community |
Description
When very long text is entered in the body of a node and the input format includes the Line break converter filter, the content is not displayed in the view tab, although it is still available in the edit tab.
This seems to happen inside "_filter_autop" function.
Steps to reproduce:
1- Go to Create content > Page (or any other node type)
2- Enter a title
3- For body, enter a text that is longer than 40000 characters
4- Submit
5- Now in the view tab, body is not displayed
If you edit the body and enter less content (Under 30000), it will be viewed.
My configurations:
- Windows XP
- Apache 2.2.3
- PHP 5.2.0
- Drupal 5.1

#1
Hardly critical. I would like to see this reproduced on various OSes and PHP versions before trying to find the regex among the many which runs out (if it's indeed autop).
#2
I have the same problem as described. This is a very disagreeable bug. I traced the problem on a FreeBSD dedicated server. On my home Windows 2000 PC (with Apache, PHP, MySQL) everything works fine.
#3
The problem is not in the drupal filter module but in the php settings.
Find and uncomment these lines in php.ini:
;pcre.backtrack_limit=100000
;pcre.recursion_limit=100000
then set them to
pcre.backtrack_limit=1000000
pcre.recursion_limit=1000000
for example.
#4
Confirming this issue on Drupal 5.3 (PHP 5.2.4)
The #3 recipe has worked for me by putting in settings.php:
ini_set('pcre.backtrack_limit', 1000000);
ini_set('pcre.recursion_limit', 1000000);
#5
I entered 147000 characters and it works very well.
D.5-dev apache on windows xp and mysql, browser Firefox 2
#6
OK so it appears from the above that the problem is with config of PHP (actually only in PHP 5.2.0 or later, which introduced the PCRE limits http://uk3.php.net/manual/en/ref.pcre.php). See also http://bugs.php.net/bug.php?id=40846.
Solution appears to be to increase the limits as per #4 in settings.php. The link to the PHP bug above actually suggests 10,000,000 as a more sensible limit (i.e. 100 times the PHP default of 100,000, and 10 times the suggested 1,000,000 at #4). Might want to check that we are in fact increasing the system's current limits...
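A minimal sketch of the "check that we are in fact increasing" idea, for settings.php. The helper name example_raise_ini_limit() is hypothetical and not part of Drupal; it only raises a setting when the current value is lower than the requested one, so a server that already has higher limits is left alone.

```php
<?php
// Hypothetical helper (not a Drupal API): raise a numeric ini setting only
// when the value currently in effect is lower than the requested minimum.
function example_raise_ini_limit($name, $minimum) {
  $current = (int) ini_get($name);
  if ($current < $minimum) {
    ini_set($name, (string) $minimum);
  }
  // Return the value that is in effect after the attempt.
  return (int) ini_get($name);
}

// The 10,000,000 figure is the one suggested in the PHP bug report above.
example_raise_ini_limit('pcre.backtrack_limit', 10000000);
example_raise_ini_limit('pcre.recursion_limit', 10000000);
```

The return value can be compared with the requested minimum to see whether the change actually took hold on a given host.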
Would probably need to be addressed in 6.x/7.x first and then backported to 5.x but as there's no patch yet I'll leave it against 5.x since that's where most people will be hitting this problem at the moment.
@ricabrantes: what version of PHP are you using? What are the values of pcre.backtrack_limit and pcre.recursion_limit (e.g. from phpinfo)? Are you saying that you reproduced this bug, and that #4 fixed it?
#7
My versions are: Windows xp sp3(beta), PHP 5.2.5, MySql 5.0.51, Apache/2.2.6, pcre.backtrack_limit 100000 and pcre.recursion_limit 100000..
I tested on Firefox 2.0.0.12, ie6, opera 9.26 and Safari 3.0.4 for windows..
I can't reproduce the bug; the text is shown correctly.
#8
Upping the limits to:
ini_set('pcre.backtrack_limit', 10000000);
ini_set('pcre.recursion_limit', 10000000);
doesn't seem to solve the problem. I've entered these values in settings.php, and the changes are confirmed in PHPinfo()
I have a 70,535 character count node and it will not display under the view tab...
Drupal 5.7
PHP 5.2.5
Apache (Unix)
Shared hosting environment
-Chris
#9
Some additional information:
I've found some errors listed in my hosted site's control panel error log regarding these long character nodes that I'm trying to edit/submit:
[Mon Mar 31 14:27:33 2008] [error] [client 70.137.148.72] ALERT - configured request variable value length limit exceeded - dropped variable 'field_body_of_chapter[0][value]' (attacker 'IP.address', file '/example_home_directory/index.php'), referer: http://www.examplesite.com/en/node/2170/edit
[Mon Mar 31 13:51:30 2008] [error] [client 70.137.148.72] ALERT - configured request variable value length limit exceeded - dropped variable 'field_body_of_chapter[0][value]' (attacker 'IP.address', file '/example_home_directory/index.php'), referer: http://www.examplesite.com/en/node/1481/edit
After some searching I found this error is related to the Suhosin Hardened PHP extension. Specifically the suhosin.request.max_value_length value of 65000 . My problem node/post of 70,535 characters is exceeding these limits and the field: 'field_body_of_chapter[0][value]' is being dropped... I didn't fully determine when it's being dropped (when I submit the node, when I view the node, etc). But somewhere in the process of creating/viewing the node, it's getting ...killed by this limit.
I've tried increasing these limits via ini_set, but they don't take hold, phpinfo() returns the same 65000 limit:
(tried:
ini_set('hphp.post.max_value_length', 180000); // how I've seen it described on other forums
ini_set('hphp.request.max_value_length', 180000);
and
ini_set('suhosin.post.max_value_length', 180000); // how the variable actually appears in my phpinfo()
ini_set('suhosin.request.max_value_length', 180000);
)
My host provider recently upgraded to PHP 5.2.5 (which may have included Suhosin Hardened PHP) - I have existing long nodes in the database, and they are displayed under the View tab. But any edits I try to submit to these existing long text nodes are not submitted - so the problem appears to be occurring during the Submit phase...
So, for those people (like myself) who first try to solve the problem with
ini_set('pcre.backtrack_limit', 10000000);
ini_set('pcre.recursion_limit', 10000000);
and still don't see the 'unable to view long text nodes problem' go away, try looking at your PHP setup to see if the same Hardened PHP restrictions are in place.
-Chris
#10
I ran into this because I kept getting completely empty node contents displayed on my site seemingly at random.
http://drupal.org/node/225335 was duplicate. This is a nasty one.
#11
#12
@hubris: Just to clarify/confirm: I conclude that in your case the problem you were having is unrelated to the original problem of PCRE limits but a specific restriction on your server's PHP setup which I can't imagine Drupal should try to work round (i.e. there is no way on your server of POSTing more than 65k in the node body). Thanks for the update since that does at least clarify the situation.
Also just to note that the problem is much less likely to occur prior to PHP 5.2.0 since the PCRE limits were essentially reduced with this version of PHP.
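For anyone trying to confirm that a PCRE limit is what empties the output: preg_replace() returns NULL when PCRE reports an error, and preg_last_error() (available since PHP 5.2.0) gives the error code. The wrapper below is only a sketch for debugging; example_safe_preg_replace() is a hypothetical name, and falling back to the unfiltered text is an assumption about what a site would want, not what filter.module does.

```php
<?php
// Hypothetical debugging wrapper: run preg_replace() and fall back to the
// original text if PCRE reports an error instead of returning a result.
function example_safe_preg_replace($pattern, $replacement, $text) {
  $result = preg_replace($pattern, $replacement, $text);
  if ($result === NULL && preg_last_error() !== PREG_NO_ERROR) {
    // PREG_BACKTRACK_LIMIT_ERROR and PREG_RECURSION_LIMIT_ERROR are the
    // codes relevant to this issue; other PCRE errors end up here too.
    return $text;
  }
  return $result;
}
```

Logging the value of preg_last_error() at the point of the fallback would show which limit, if any, was hit.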
#13
Yeah I should note my install is standard debian etch on 5.2, and we have a bunch of articles which break this limit. Bumping this back to critical since it's a pig to track down and we had a load of visitors asking about 'missing pages' etc.
Not to mention everyone will be running 5.2+ when we release.
#14
@gpk: You are correct on the clarification/confirmation: the problem I'm having is not the PCRE limits -- the 65k POSTing limit I'm experiencing is due to the Suhosin Hardened PHP POST limits as set up by my host provider, and isn't something that Drupal development needs to take into account.
-Chris
#15
i guess i am having the same issue here. #4 did not help.
#16
@yngens: what are the values of pcre.backtrack_limit and pcre.recursion_limit reported by phpinfo() on your server? Also, is it running suhosin (again, this should be reported by phpinfo())?
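If reading through phpinfo() is awkward, a short script along these lines should report the same values. The function name example_pcre_report() is hypothetical; ini_get() returns FALSE for a setting that does not exist, which is what the suhosin entry will show on a server without that extension.

```php
<?php
// Hypothetical diagnostic helper: collect the settings discussed in this
// thread so they can be pasted into a comment.
function example_pcre_report() {
  return array(
    'php_version' => PHP_VERSION,
    'pcre.backtrack_limit' => ini_get('pcre.backtrack_limit'),
    'pcre.recursion_limit' => ini_get('pcre.recursion_limit'),
    'suhosin_loaded' => extension_loaded('suhosin'),
    'suhosin.request.max_value_length' => ini_get('suhosin.request.max_value_length'),
  );
}

print_r(example_pcre_report());
```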
Also what is the size of the post you are trying to make?
#17
gpk, i don't remember, but i am sure i tried even bigger numbers than the ones recommended here. not sure about suhosin either - as a workaround i decided to require users to divide big posts into chapters instead of putting everything into one post. but the problem is still there and when i have a little more time i will try to test again and report here. thanks
#18
OK awaiting your input ...
#19
I have had this empty node problem after upgrading to php5.x from php4.x
I put this code in my php.ini file in root directory:
pcre.backtrack_limit = 150000
pcre.recursion_limit = 150000
the longest node of my site (53896 characters including spaces, 55704 bytes) is now shown again.
Since it solved my needs, I did not set a higher limit because I read here about some side effects:
http://de.php.net/manual/en/pcre.configuration.php
but it's good to know that it worked and that a higher value can solve the problem for longer nodes.
Thanks,
Renirtor
#20
Alternative solution
Just want to confirm that I saw this issue when I upgraded from PHP 4.x to PHP 5.2.6
drupal 5.12
php 5.2.6
Linux Fedora FC8
Resolved the issue by trying solution in comment #3.
But you can also use the paging module which solved the problem without making changes to PCRE limit.
#21
#22
@John Morahan
Do you mean that by applying the patch you don't need to update the PCRE limits?
#23
That's the idea, yes.
#24
the idea is that the new regex just replaces the \n\n and variants without trying to remember the bits in between.
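To illustrate the difference, here is a rough sketch. It is NOT the patch attached to this issue, and both function names are hypothetical. The first function uses a capturing pattern of the kind described above, where the regex has to carry each paragraph body in a group; the second only matches the blank-line separators and replaces those.

```php
<?php
// Illustrative only, not the committed patch.

// Capturing approach: the lazy (.+?) group has to hold each paragraph body,
// so the matching work grows with the length of the text.
function example_autop_capture($text) {
  return preg_replace('/\n?(.+?)(?:\n\s*\n|\z)/s', "<p>$1</p>\n", $text);
}

// Separator approach: only the blank-line runs are matched and replaced;
// the text in between is never captured.
function example_autop_separators($text) {
  $text = trim($text, "\n");
  return '<p>' . preg_replace('/\n\s*\n/', "</p>\n<p>", $text) . "</p>\n";
}
```

On a short input such as "one\n\ntwo" both functions produce the same two paragraphs; any difference would only be expected on text long enough to reach the PCRE limits.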
moving this issue back to filter.module
#25
with test
#26
Regarding the test:
#27
#28
The last submitted patch failed testing.
#29
apparently an installer change confused the testbot
#30
I made a mistake (sorry, I don't know why) and resubmitted the patch in #25 for retesting as well. I hope this does not affect the testing started previously for the patch in #27. If it conflicts, my apologies. I should be more careful next time. I suspect that even if the tests themselves do not conflict, the system message about the result may, since it only reports results for the "last patch submitted" rather than the time/date the retesting was requested. In that case my request would be the last one.
#31
Nicely done.
#32
Eh. Can we please have a couple of the 20 people or so who reported having this issue test the patch?
#33
Hmm, chx "assigned" me this issue to review ... but unlike chx i can be (and was) distracted ... so i am not sure whether my input is still relevant ...
Well ... the last patch replaces a backtracking regex with a simpler regex. Testing strings of pcre.backtrack_limit-length is kinda superfluous now, as there is no backtracking or recursive regex in the _filter_autop function left, that could run into that "limit". I would suggest removing the addition to filter.test, and can re-roll the patch if needed.
Yet the new regex also leads to slightly different output than the old regex - there's whitespace in the last <p> tag (which has no impact in HTML). This could be trivial, but as I am no regex-ninja, there could also be other implications I don't see ... I have attached a demo script - illustrating the (trivial?) difference.
#34
Thanks for the review frega!
Yeah, I forgot to handle the ending \n's as a special case like the beginning (and also dropped a \n from the final </p>). Will fix later.
I do think the test (or something like it) should stay, so that it will fail if someone later makes a change that unintentionally runs into these limits again. It's not always immediately obvious from looking at a regex how it will behave in these situations.
#35
#36
#37
Good patch. Assuming it is enough to test the string exactly at the limit, rather than a longer string...
#38
well, $this->randomName() adds a short prefix too
#39
Committed to CVS HEAD. Thanks.
#40
#41
Untested backport.
#42
thank you
#43
#537788: Urgent : When there are many unicode words node shows up as just blank rendering the site useless was a likely duplicate.
#44
This fixed a problem I had on my local install that had a backtrack limit of 1000. I also tested that increasing the backtrack limit solves the problem, but this is a good fix. It took me 30 minutes to debug that it was the line break filter, which led me to this issue.