Closed (fixed)
Project:
Drupal core
Version:
7.x-dev
Component:
base system
Priority:
Normal
Category:
Bug report
Assigned:
Unassigned
Reporter:
Created:
2 Aug 2009 at 14:56 UTC
Updated:
10 Jul 2010 at 02:40 UTC
Comments
Comment #1
gerhard killesreiter commented
This issue is due to the limits of the Perl-compatible regular expressions (PCRE) that we use to validate that you posted valid UTF-8.
If you increase the limits in your settings.php file, you will see the text.
ini_set('pcre.backtrack_limit', 200000);
ini_set('pcre.recursion_limit', 200000);
You will need to experiment to find which values work for you.
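As a rough illustration of why text can silently vanish instead of producing an error: when a PCRE limit is exceeded, PHP's preg functions return NULL rather than the processed text, and preg_last_error() reports the cause. The sketch below is not code from Drupal core. The pattern, the sample text and the artificially low limit are assumptions chosen only to trigger the failure quickly, and the JIT is switched off so that the backtrack limit is the one being exercised.

```php
<?php
// Illustration only: not code from Drupal core.
// Turn off the PCRE JIT (assumption: the directive exists on PHP 7+;
// on older versions ini_set() simply returns false and nothing changes).
ini_set('pcre.jit', '0');
// Artificially low limit so the failure is easy to reproduce.
ini_set('pcre.backtrack_limit', '100');

// A pattern that backtracks heavily on input containing no "!" or "?".
$pattern = '/(?:\D+|<\d+>)*[!?]/';
$text = 'foobar foobar foobar';

// preg_replace() returns NULL on error; code that prints the result
// unchecked ends up showing an empty string.
$result = preg_replace($pattern, '', $text);
$error = preg_last_error();

if ($result === null) {
  echo ($error === PREG_BACKTRACK_LIMIT_ERROR)
    ? "backtrack limit hit\n"
    : "other PCRE error\n";
}
else {
  echo "replaced OK\n";
}
```

Raising pcre.backtrack_limit in the second ini_set() call is the same kind of change as the two settings.php lines above.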
Comment #2
avpaderno
As people normally post in English on Drupal.org, the problem would not appear very frequently.
The text of the reported forum post is also longer than a normal post you can find here. I would think that this problem is not so relevant for Drupal.org itself.
Comment #3
gerhard killesreiter commented
Right, I am not going to make the proposed changes on drupal.org.
Comment #4
kaakuu commented
Thanks.
However, this question is NOT answered: "What is the max word limit that is allowed on drupal.org?"
Knowing this can help to "titrate" the pcre.backtrack_limit value, since there is no documentation or max-words provision either on drupal.org or in the Drupal CMS itself.
@Gerhard Killesreiter
I am not finding these lines
ini_set('pcre.backtrack_limit', 200000);
ini_set('pcre.recursion_limit', 200000);
in the settings.php file that comes with Drupal core.
Comment #5
gerhard killesreiter commented
There is no "max word" setting on either drupal.org or inside the Drupal CMS.
Comment #6
kaakuu commented
So why does the post not show up beyond a certain number of words? That means there IS a "max words" limit working or lurking somewhere which is NOT known to the poster, and which ideally he should know.
I am also not finding these lines
ini_set('pcre.backtrack_limit', 200000);
ini_set('pcre.recursion_limit', 200000);
in the settings.php file that comes with Drupal core.
Comment #7
kaakuu commented
It seems there has been some work, and perhaps a solution, regarding this in http://drupal.org/node/133188
but @KiamLaLuno - this does not happen with non-English text only, BUT with English also.
Whether this happens frequently or once in a blue moon, I personally feel there needs to be a note (wherever there is a submission form) stating the maximum number of words that will be displayed, beyond which the text is cut off. It is just like image or video uploads, where you are told you have a 10 MB or 500 MB limit even though not everyone uses that much.
Comment #8
gerhard killesreiter commented
Because of the limits that preg has, it _appears_ as if there were a limit on the number of words. It seems that the fact that your example post used Indic script has something to do with it. Probably all the letters were in fact multi-byte characters.
The preg settings are not in settings.php. Normally the default is sufficient. We could add them to settings.php and advise users who want to post long texts in Indic scripts to raise them.
Comment #9
kaakuu commented
@Gerhard - Thanks!
This happens with English words also. If you have edit permissions (which you probably have), you can just convert the text in http://drupal.org/node/537780 to ASCII and see that the vanishing problem still happens.
Whether it happens inside or outside Drupal, it is only decent to note the maximum possible number of words somewhere prominent, even though regular users may not post that much.
Comment #10
avpaderno
I am not sure it's possible to convert that Indic text to ASCII without losing all the characters.
Comment #11
gerhard killesreiter commented
Actually, yes, you are right. I've converted each "character" to a, b, c, and so on, and if I reset the pcre variables back to their original values, neither your text nor mine is displayed.
I am guessing that this is a side effect of a not-well-tested security patch.
Your text has about 650 lines with about 400 characters per line (roughly 260,000 characters in total) after I converted it. I guess that most people don't write that much on the net...
Comment #12
gerhard killesreiter commented
It seems that disabling some or all of the filters at
admin/settings/filters/1
will also fix the problem.
Comment #13
kaakuu commented
Thanks Gerhard.
@KiamLaLuno - You need not do any conversion; just use any English text with 20,000 characters.
@Gerhard - Regarding "most people don't write that much on the net", I am afraid this depends on the use case. We came to know of this problem with our faces hung in shame because people did write that much (long stories in blog posts), and these people were doing the same thing without any problem in WordPress. (Details are here: http://drupal.org/node/537780#comment-1880014)
In any case this seems to be a duplicate issue, as pointed out by Damien Tournoud - #133188: Line break converter can result in empty node display - PCRE limits.
BUT it is rather urgent to know the exact fixes that will work for Drupal 6.x (and 5.x)
without disturbing any filters (filters are set that way for some purpose, and those are working).
IMHO, any form on Drupal.org or in the CMS needs to mention the maximum number of characters that will be displayed, because some people do write long posts, especially fiction writers or authors of scientific posts.
Comment #14
avpaderno
Mentioning the maximum number of characters allowed should not be difficult, as it is possible to add an explanation or submission guidelines for each content type defined in Drupal; that doesn't even require modifying any Drupal core code.
That is, of course, a temporary solution that can be used until the Drupal code is changed.
Comment #15
kaakuu commented
@KiamLaLuno
Thanks. Mentioning the limit may not be difficult, but it is difficult to know what the limit is. In some tests it is 15,000 plus; in others it is 30,000 plus or so.
If it is not difficult, let it be implemented.
In much software, the admin interface allows the admin to define the minimum and maximum possible values, and if a maximum value is impossible it simply refuses to let the admin set it. This is very useful, practical and a must-have, but there is either a tendency or already a decision to keep this feature out of core.
If I remember correctly, Geeklog once suffered from an attack of empty posts as well as thousands of MBs of blah-blah text, which makes the above necessary for "security" purposes also. Breaches in so-called security happen not because of what most people do but because of what a very few people may do on still fewer occasions. Being proactive is wise :)
Comment #16
gerhard killesreiter commented
There are no "exact" fixes. You need to test which setting of the preg limits, or which filter configuration, works for your site.
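One way to approach that testing, as a minimal sketch: run the failing replacement under a few candidate limits and see which ones complete. The helper name example_preg_ok(), the stand-in text and pattern, and the candidate values are all hypothetical and not part of Drupal; on a real site you would substitute an actual long post and the filter pattern that fails there.

```php
<?php
// Hypothetical helper (not part of Drupal): runs one replacement under a given
// pcre.backtrack_limit and reports whether PCRE finished without an error.
function example_preg_ok($pattern, $text, $limit) {
  $previous = ini_set('pcre.backtrack_limit', (string) $limit);
  $result = preg_replace($pattern, '', $text);
  $ok = ($result !== null && preg_last_error() === PREG_NO_ERROR);
  // Restore the previous limit so the probe has no lasting effect.
  if ($previous !== false) {
    ini_set('pcre.backtrack_limit', $previous);
  }
  return $ok;
}

// Keep the PCRE JIT out of the picture (assumption: directive exists on PHP 7+).
ini_set('pcre.jit', '0');

// Stand-in input and pattern, chosen only for illustration.
$text = str_repeat('a', 2000);
$pattern = '/^(?:a|b)*$/';

// Candidate limits to probe, from far too low to very generous.
$results = array();
foreach (array(100, 200000, 10000000) as $limit) {
  $results[$limit] = example_preg_ok($pattern, $text, $limit);
  echo $limit . ': ' . ($results[$limit] ? 'ok' : 'failed') . "\n";
}
```

The smallest limit that reports "ok" for your longest realistic post is a reasonable starting point for the settings.php value, with some headroom added.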
Comment #17
vm commented
At this point, should this be considered a bug in Drupal rather than on d.o? My take is that this affects all sites.
Comment #18
avpaderno
I agree; this is a problem in Drupal. If we could find a temporary fix for the problem while the Drupal core code is being changed, that would be good.
Comment #19
gerhard killesreiter commented
I guess we could document the variables in the settings.php file.
Comment #20
gerhard killesreiter commented
Here's a patch after some text auditing by webchick.
Comment #21
gerhard killesreiter commented
Comment #22
agentrickard
Text looks good. Doc link helpful.
Comment #23
webchick
Removed the double spaces between the sentences and committed to HEAD. Thanks!
Moving down to 6.x. Looks like it commits fine with a little fuzz.
Comment #24
moshe weitzman commented
Let's not commit this to D6.
This is going to cause a lot of useless conflicts on settings.php for CVS deployment folks who rightfully edited settings.php. Also, people who deploy without version control are going to be hand-copying these edits just to stay up to date. I don't think it is worth it for a few commented-out lines.
Comment #25
webchick
I'm fine either way, but CVS deployment folks shouldn't get any conflicts since this patch is against default.settings.php, not settings.php, no? I'm not sure why anyone would've hacked their default.settings.php. I agree with not committing it to D5 for this reason, though.
Comment #26
avpaderno
It would not make sense to change default.settings.php, as it would be overwritten by a Drupal update; it doesn't make sense not to copy that file either, as there could be some lines that need to be copied into settings.php.
Comment #27
gábor hojtsy
@moshe weitzman: fixes and improvements are committed to default.settings.php in other issues as well. As people explained, it should not cause conflicts, since people are not supposed to edit this file for any reason.
I've also removed the double spaces in the comment and committed that, as Angie did.
Comment #29
pwolanin commented
Looks like this was never committed to Drupal 7.
Comment #30
pwolanin commented
Here's a simple re-roll of the patch above (which applies cleanly with offset).
Comment #31
damien tournoud commented
No-brainer (just to confirm the RTBC).
Comment #32
dries commented
Committed to CVS HEAD. Thanks.
Comment #33
kaakuu commented
Thanks for ultimately solving this crippling and absurd problem. Thanks.