I cannot see what I posted - please see http://drupal.org/node/537780

Please open the Edit tab to see the text I posted.
What is the maximum word count allowed on drupal.org?

Knowing this will also help us understand what is happening with invisible posts on our own sites.

Comments

gerhard killesreiter’s picture

This issue is due to the limits of the Perl-compatible regular expressions (PCRE) that we use to validate that you posted valid UTF-8.

If you increase the limits in your settings.php file you will see the text.

ini_set('pcre.backtrack_limit', 200000);
ini_set('pcre.recursion_limit', 200000);

You will need to experiment to find which values work for you.
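A minimal sketch of how this kind of silent failure can be observed in plain PHP. The pattern and input are hypothetical and chosen only because they backtrack heavily; they are not Drupal's actual UTF-8 check or filter code. JIT is switched off here as an assumption, so that the interpreter's limit is what gets exercised; results with JIT enabled may differ.

```php
<?php
// Assumption: disable PCRE JIT so the plain backtrack limit is exercised.
ini_set('pcre.jit', '0');
// Deliberately tiny limit so the failure is easy to reproduce.
ini_set('pcre.backtrack_limit', '100');

// Hypothetical pattern and input, used only to force heavy backtracking.
$pattern = '/(?:\D+|<\d+>)*[!?]/';
$text = str_repeat('foobar ', 1000);

// preg_match() returns FALSE (not 0) when PCRE gives up on the input.
$result = preg_match($pattern, $text);

if ($result === FALSE && preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR) {
  // Code that ignores this case can end up with no output at all.
  $status = 'backtrack limit hit';
}
else {
  $status = 'pattern ran to completion';
}
print $status . "\n";
```

Checking preg_last_error() after a preg call that returned FALSE or NULL is what distinguishes "no match" from "PCRE gave up", which is the difference between an empty post and a post that merely failed to render.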

avpaderno’s picture

Category: bug » support
Priority: Critical » Normal

As people normally post in English on Drupal.org, the problem would not appear very frequently.
The text of the reported forum post is also longer than a normal post you can find here. I would think that this problem is not so relevant for Drupal.org itself.

gerhard killesreiter’s picture

Status: Active » Fixed

Right, I am not going to make the proposed changes on drupal.org.

kaakuu’s picture

Thanks.
However, this question is NOT answered: "What is the max word limit that is allowed on drupal.org ?"

Knowing this can help to "titrate" the pcre.backtrack_limit value passed to ini_set(), since there is no documentation or max words provision either on drupal.org or in the Drupal CMS code.

@Gerhard Killesreiter

I am not finding these
ini_set('pcre.backtrack_limit', 200000);
ini_set('pcre.recursion_limit', 200000);
in settings.php file that comes with drupal core.

gerhard killesreiter’s picture

There is no "max word" setting on either drupal.org or inside the Drupal CMS.

kaakuu’s picture

So why does the post not show up beyond a certain number of words? That means there IS a "max words" limit working or lurking somewhere which is NOT known to the poster, and which ideally he should know.

Also not finding these
ini_set('pcre.backtrack_limit', 200000);
ini_set('pcre.recursion_limit', 200000);
in settings.php file that comes with drupal core.

kaakuu’s picture

It seems there has been some work, and perhaps a solution, regarding this in http://drupal.org/node/133188
but @KiamLaLuno - this does not happen with non-English text only BUT with English also.
Whether this happens frequently or once in a blue moon, I personally feel there needs to be a note (wherever there is a submission form) stating the maximum number of words that will be displayed, beyond which the text is cut off. It is just like image or video uploads, where you are told you have a 10 MB or 500 MB limit even though not everyone reaches that limit.

gerhard killesreiter’s picture

Because of the limits that preg has, it _appears_ as if there was a limit on the number of words. It seems that the fact that your example post used Indic script has something to do with it. Probably all the letters were in fact multi-byte characters.

The preg settings are not in settings.php. Normally the default is sufficient. We could add them to settings.php and advise users who want to post long texts in Indic scripts to raise them.

kaakuu’s picture

@Gerhard - Thanks!

This happens with English words also. If you have edit permissions (which you probably have) you can just convert the text in http://drupal.org/node/537780 to ASCII and see that the vanishing problem still happens.
Whether it happens inside or outside Drupal, it is only decent to note the maximum possible number of words somewhere prominent, even though regular users may not post at that length.

avpaderno’s picture

you can just convert the matter in http://drupal.org/node/537780 to ASCII and see that the vanishing problem still happens

I am not sure it's possible to convert that Indic text to ASCII without losing all the characters.

gerhard killesreiter’s picture

Category: support » bug
Priority: Normal » Critical
Status: Fixed » Active

Actually, yes, you are right. I've converted each "character" to a, b, c, and so on, and if I reset the pcre variables back to their original values, neither your text nor mine is displayed.

I am guessing that this is a side-effect of a not-well-tested security patch.

Your text has about 650 lines with about 400 characters per line after I converted it. I guess that most people don't write that much on the net...

gerhard killesreiter’s picture

It seems that disabling some or all of the filters at

admin/settings/filters/1

will also fix the problem.

kaakuu’s picture

Thanks Gerhard.
@ KiamLaLuno - You need not do any conversion, just use any English text with 20,000 characters.
@ Gerhard - Regarding "most people don't write that much on the net": I am afraid this depends on the use case. We found out about this problem to our embarrassment, because people did write that much (long stories in blog posts), and the same people had been doing so without any problem in WordPress. (Details are here: http://drupal.org/node/537780#comment-1880014)

In any case this seems to be a duplicate issue, as pointed out by Damien Tournoud - #133188: Line break converter can result in empty node display - PCRE limits.
BUT it is rather urgent to know the exact fixes that will work for Drupal 6.x (and 5.x) without disturbing any filters (the filters are set that way for a purpose, and they are working).

IMHO, any form on drupal.org or in the CMS needs to mention the maximum number of characters that will be displayed, because some people do write long posts, especially fiction writers or authors of scientific posts.

avpaderno’s picture

Mentioning the maximum number of characters allowed should not be difficult, as it is possible to add an explanation or submission guidelines for each content type defined in Drupal; that doesn't even require modifying any Drupal core code.

That is indeed a temporary solution that can be used until Drupal code is changed.

kaakuu’s picture

@KiamLaLuno

Thanks. It is fine to say that mentioning the limit is not difficult, but it is difficult to know what the limit is. In some tests it is a little over 15,000 characters; in others it is a little over 30,000.
If it's not difficult, let it be implemented.

In much software the admin interface lets you define the minimum and maximum possible values, and if a maximum value is impossible it simply refuses to let the admin set it. This is very useful, practical and a must-have, but there is either a tendency or already a decision to keep this feature out of core.

If I remember correctly, Geeklog once suffered an attack of empty posts as well as thousands of MB of blah-blah text, which makes the above necessary for "security" purposes too. Security breaches happen not because of what most people do but because of what a very few people may do on still fewer occasions. Being proactive is wise :)

gerhard killesreiter’s picture

There are no "exact" fixes. You need to test which setting of the preg limits works for you / which filter configuration works for your site.
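One way to do that testing can be sketched as a small helper that tries a list of candidate limits against a representative long post. The function name, the pattern and the sample text are hypothetical and are not part of Drupal; a real test would use the site's own long content and filter configuration.

```php
<?php
/**
 * Hypothetical helper: returns the first candidate limit under which
 * $pattern runs over $text without a PCRE error, or NULL if none suffices.
 */
function find_working_backtrack_limit($pattern, $text, array $candidates) {
  // Remember the current setting so it can be restored afterwards.
  $original = ini_get('pcre.backtrack_limit');
  $found = NULL;
  foreach ($candidates as $limit) {
    ini_set('pcre.backtrack_limit', (string) $limit);
    // preg_replace() returns NULL when PCRE gives up on the input.
    if (preg_replace($pattern, '', $text) !== NULL) {
      $found = $limit;
      break;
    }
  }
  ini_set('pcre.backtrack_limit', $original);
  return $found;
}

// Usage sketch with made-up sample text and a made-up pattern.
$sample = str_repeat('A long story. ', 2000);
$limit = find_working_backtrack_limit('/\s+/', $sample, array(100000, 200000, 400000));
print ($limit === NULL ? 'no candidate was enough' : $limit) . "\n";
```

The helper restores the original setting before returning, so trying it out does not leave the site running with an experimental value.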

vm’s picture

At this point, should this be considered a bug in Drupal rather than on d.o? My take is that this affects all sites.

avpaderno’s picture

I agree; this is a problem in Drupal. If we could find a temporary fix for the problem while drupal core code is changed, that would be good.

gerhard killesreiter’s picture

Project: Drupal.org site moderators » Drupal core
Version: » 7.x-dev
Component: Other » base system
Priority: Critical » Normal

I guess we could document the variables in the settings.php file.
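A sketch of what such a documented block in settings.php might look like. The values are the ones suggested earlier in this thread, not tested recommendations, and the comment wording is illustrative rather than the text of any committed patch.

```php
/**
 * Illustrative only: if long posts are saved and can still be edited but
 * show up empty when viewed, the PCRE limits may be too low for the text
 * filters. Uncomment the lines below and adjust the values by testing.
 * See http://php.net/manual/en/pcre.configuration.php for details.
 */
// ini_set('pcre.backtrack_limit', 200000);
// ini_set('pcre.recursion_limit', 200000);
```

Keeping the lines commented out means sites that never hit the problem keep the PHP defaults.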

gerhard killesreiter’s picture

File status: new
File size: 1.23 KB

Here's a patch after some text auditing by webchick.

gerhard killesreiter’s picture

Title: I cannot see what I posted » Document possible preg memory issues
Status: Active » Needs review

agentrickard’s picture

Status: Needs review » Reviewed & tested by the community

Text looks good. Doc link helpful.

webchick’s picture

Version: 7.x-dev » 6.x-dev

Removed the double-spaces between the sentences and committed to HEAD. Thanks!

Moving down to 6.x. Looks like it commits fine with a little fuzz.

moshe weitzman’s picture

Let's not commit this to D6.

This is going to cause a lot of useless conflicts on settings.php for CVS deployment folks who rightfully edited settings.php. Also, people who deploy without version control are going to be hand-copying these edits just to stay up to date. I don't think it is worth it for a few commented-out lines.

webchick’s picture

I'm fine either way, but CVS deployment folks shouldn't get any conflicts since this patch is against default.settings.php, not settings.php, no? I'm not sure why anyone would've hacked their default.settings.php. I agree with not committing it to D5 for this reason, though.

avpaderno’s picture

It would not make sense to change default.settings.php, as it would be overwritten by a Drupal update; it doesn't make sense not to copy that file either, as there could be some lines that need to be copied into settings.php.

gábor hojtsy’s picture

Status: Reviewed & tested by the community » Fixed

@moshe weitzman: fixes and improvements are committed to default.settings.php in other issues as well. As people explained, it should not cause conflicts, since people are not supposed to edit this file for any reason.

I've also removed the double spaces in the comment and committed that as Angie did.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

pwolanin’s picture

Version: 6.x-dev » 7.x-dev
Status: Closed (fixed) » Active

looks like this was never committed to Drupal 7

pwolanin’s picture

Status: Active » Reviewed & tested by the community
File status: new
File size: 1.05 KB

here's a simple re-roll of the patch above (which applies cleanly with offset).

damien tournoud’s picture

No brainer. (just to confirm the RTBC).

dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD. Thanks.

kaakuu’s picture

Thanks for ultimately solving this crippling and absurd problem. Thanks.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.