The latest update of PCRE to 8.30 produces the following warning in Drupal's installation:
Warning: preg_match(): Compilation failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 2112 in truncate_utf8() (line 339 of /home/sites/drupal7/includes/unicode.inc).

This is discussed in full here: http://drupal.org/node/1442876
Systems affected so far are Arch Linux and FreeBSD. It is possible that more systems will display such warnings once they update PCRE to 8.30.
A user has mentioned that this also affects Drupal searches (http://drupal.org/node/1442876#comment-5622720). I haven't tried it myself yet, so I cannot tell.

I created a bug report on Arch Linux (https://bugs.archlinux.org/task/28533), but they suggest that this be fixed on the Drupal side.

Files:
Comment | File | Size | Author | Test result
#53 | d5-1446372-unicode-fix.patch | 763 bytes | Les Lim | none listed
#23 | do-1446372-remove-surrogate-D6.patch | 773 bytes | champlin | PASSED: [[SimpleTest]]: [MySQL] 190 pass(es).
#22 | do-1446372-remove-surrogate-D7.patch | 683 bytes | Heine | FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch do-1446372-remove-surrogate-D7.patch.
#19 | do-1446372-remove-surrogates.patch | 703 bytes | Heine | PASSED: [[SimpleTest]]: [MySQL] 34,608 pass(es).
#17 | do-1446372-remove-surrogate.patch | 1009 bytes | Heine | FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch do-1446372-remove-surrogate_0.patch.
#16 | do-1446372-remove-surrogate.patch | 1009 bytes | Heine | FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch do-1446372-remove-surrogate.patch.
#1 | 1446372-modified-unicodeinc-to-play-nice-with-PCREv830.patch | 985 bytes | bserem | FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 1446372-modified-unicodeinc-to-play-nice-with-PCREv830.patch.

Comments

File status: new | Size: 985 bytes
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 1446372-modified-unicodeinc-to-play-nice-with-PCREv830.patch. Unable to apply patch. See the log in the details link for more information.

The attached solves the problem. I simply removed the \x{D800}-\x{F8FF} part as Pierre Schmitz suggests here: https://bugs.archlinux.org/task/28533

Version: 7.12 » 7.x-dev
Status: Active » Needs review

Silly me... The above patch is against drupal-7.x-dev, as it should be. I forgot to update the issue status.

Status: Needs review » Needs work

The last submitted patch, 1446372-modified-unicodeinc-to-play-nice-with-PCREv830.patch, failed testing.

I got the same error on FreeBSD 8.2 with pcre 8.30, apache22, php5.3 and mysql55. On another FreeBSD system of mine, with php5.2 and php52-pcre, Drupal 7.12 works well.

Version: 7.x-dev » 8.x-dev
Issue tags: +needs backport to D6, +needs backport to D7

Looks like the patch is breaking existing test coverage, didn't look to see if it's a problem with the test or a problem with the patch. Note all bugfixes are applied to 8.x first then backported.

Assuming this will need to be backported to D6 as well.

The patch has 1 fail and 38700 passes. Probably something very specific fails that needs the characters I patched.

@alan mccoll, there is a subscribe button now in drupal.org, you do not need to answer in order to subscribe

Yes it's the search test that breaks, so it's directly related to the patch.

Status: Needs work » Needs review
Issue tags: -needs backport to D6, -needs backport to D7

Status: Needs review » Needs work
Issue tags: +needs backport to D6, +needs backport to D7

The last submitted patch, 1446372-modified-unicodeinc-to-play-nice-with-PCREv830.patch, failed testing.

It was said that PCRE 8.30 uses libpcre.so.0 instead of libpcre.so.1, which causes the problem. I will test on my system tomorrow.

Help me out here, we should be testing against D8??

I haven't had any experience with D8, I'll look at it this week

I recently updated PCRE to version 8.30 and now I constantly get this error when I do any search:

•warning: preg_replace(): Compilation failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 1816 in /www/istmat/soft/drupal/modules/search/search.module on line 333.

I think this error must be eliminated ASAP, as it is a really nasty bug and from now on more and more sites will have an updated PCRE!

Title: Updating PCRE to 8.30 in the operating system produces warnings on Drupal installation » Invalid Unicode code range in PREG_CLASS_UNICODE_WORD_BOUNDARY fails with PCRE 8.30

bserem, invalid codepoints in modules/search/test/UnicodeTest.txt are most likely the culprit.

Thanks Heine.

Anybody having any ideas on the patch? It failed (as it says) one time in the search module (it had 650 passes of course).
That is against D7 of course, not D8.

I'm out of ideas

File status: new | Size: 1009 bytes
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch do-1446372-remove-surrogate.patch. Unable to apply patch. See the log in the details link for more information.

The test fails on line 311 of UnicodeTest.txt which contains the byte sequences:

ee 80 80 ef a3 bf

ef a3 bf is UTF-8 for U+F8FF, which has been removed with the patch.

It appears that the range \x{D800}-\x{F8FF} is not possible as a single sequence because it encompasses the following Unicode areas:

Surrogate Area: U+D800 - U+DFFF
Private Use Area: U+E000 - U+F8FF

The surrogate area is not allowed in PCRE 7.3+. The attached patch removes it.

(I also believe we shouldn't break on U+FEFF, which is a zero-width no-break space that acts as a word joiner, but that's best left for another issue.)
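The byte arithmetic in this comment can be checked with a short script. This is only an illustrative sketch in Python (not part of any patch in this issue); the helper names `classify` and `utf8_encodable` are made up for the example, and the surrogate bounds used are the U+D800 to U+DFFF limits that the PCRE warning itself reports.

```python
# Byte sequence quoted in the comment above (line 311 of UnicodeTest.txt).
raw = bytes.fromhex("ee 80 80 ef a3 bf")
code_points = [ord(ch) for ch in raw.decode("utf-8")]

# Area boundaries; the surrogate bounds match the range named in the PCRE warning.
SURROGATE_AREA = range(0xD800, 0xDFFF + 1)
PRIVATE_USE_AREA = range(0xE000, 0xF8FF + 1)


def classify(cp: int) -> str:
    """Name the area a code point falls in (hypothetical helper)."""
    if cp in SURROGATE_AREA:
        return "surrogate"
    if cp in PRIVATE_USE_AREA:
        return "private use"
    return "other"


def utf8_encodable(cp: int) -> bool:
    """Return True if the code point can be encoded as strict UTF-8."""
    try:
        chr(cp).encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


for cp in code_points:
    print(f"U+{cp:04X}: {classify(cp)}")

# Shrinking the range D800-F8FF to E000-F8FF drops only surrogate code points.
old_range = range(0xD800, 0xF8FF + 1)
dropped = [cp for cp in old_range if cp not in PRIVATE_USE_AREA]
print(len(dropped), "code points dropped")
print("all dropped are surrogates:", all(cp in SURROGATE_AREA for cp in dropped))
print("U+D800 encodable as UTF-8:", utf8_encodable(0xD800))
```

The two decoded code points are the first and last of the Private Use Area, which is why a patch that removes the whole D800-F8FF range breaks the search test while one that only drops the surrogates should not.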

Status: Needs work » Needs review
File status: new | Size: 1009 bytes
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch do-1446372-remove-surrogate_0.patch. Unable to apply patch. See the log in the details link for more information.

This is the Drupal 8 patch. Please test with PCRE 8.30 as I do not have a test install with it atm.

Status: Needs review » Needs work

The last submitted patch, do-1446372-remove-surrogate.patch, failed testing.

File status: new | Size: 703 bytes
PASSED: [[SimpleTest]]: [MySQL] 34,608 pass(es).

*sigh*

Patch in the Dark Ages format :(

Status: Needs work » Needs review

Just did a minimal-D8 install, all went fine.

It should however be backported to 7 and 6 in order to test it with the search function.

File status: new | Size: 683 bytes
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch do-1446372-remove-surrogate-D7.patch. Unable to apply patch. See the log in the details link for more information.

Drupal 7 patch in updated format to aid testers. D8 contains the same test BTW.

File status: new | Size: 773 bytes
PASSED: [[SimpleTest]]: [MySQL] 190 pass(es).

A similar patch is needed for D6 where the code is located in PREG_CLASS_SEARCH_EXCLUDE (search.module).
Thanks. -virgil

I can confirm that on D6, after applying the patch from #23, the error is gone and search works again.

I confirm D8, though I didn't have a site with a search-index to test search.

All my D7 sites are still on PCRE 8.2, so we need somebody else to confirm it for D7.

Status: Needs review » Reviewed & tested by the community

Go. Before catch pointed out this issue (I was only looking at criticals) I had the same patch prepared.

I checked, and the next Ubuntu release, due in April, seems to be using PCRE 8.12 http://packages.ubuntu.com/search?keywords=pcre&searchon=names&suite=pre... . CentOS/RedHat is using PCRE 7; we can expect this most excellent and modern OS to break some time around when the Sun grows cold. I suspect Fedora will be the next to break on May 8, when Fedora 17 releases, because their trunk ("Rawhide") contains 8.30.

Priority: Major » Critical
Issue tags: +D7 stable release blocker

I'm going to tag this as Drupal 7 stable release blocker and critical. The next D7 release is likely to be end of March, that only gives people 5 weeks to upgrade before the Fedora upgrade, after which point I'd expect us to start seeing duplicate tickets getting opened etc.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, do-1446372-remove-surrogate-D6.patch, failed testing.

Status: Needs work » Reviewed & tested by the community

Not sure why the suffixed patches are being tested, back to RTBC.

Not sure why they got tested suddenly now when they were submitted previously, but:
#1092232: Bot needs to handle patches named for all core versions -D[678]

Version: 8.x-dev » 7.x-dev

Committed/pushed this to 8.x.

There was some discussion on IRC about trying to get this into 6.25. I don't think we should try to do that, but it'd be good to get it into 6/7 so there's time for it to bed in before an end of March release (i.e. 6.26).

I'm leaving this RTBC for 7.x since #22 is the same patch re-rolled.

Version: 7.x-dev » 6.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

Committed and pushed to 7.x. Thanks!

Moving to 6.x for backport.

Status: Patch (to be ported) » Reviewed & tested by the community

#23 has the patch.

thanks, went to dev 7, issue seems to be gone for now.

Hrm, we wanted to get this in this month, right?

#23: do-1446372-remove-surrogate-D6.patch queued for re-testing.

The problem has also turned up on Debian wheezy now. The patch in comment #23 resolves it.

Status: Reviewed & tested by the community » Fixed

Thanks all, committed and pushed #23 to Drupal 6. Should be in the next bugfix release.

The D7 patch from comment #22 fixed the issue of 'You must include at least one positive keyword with 3 characters or more.' on any search under Debian Testing. Actually I couldn't get the patch to apply but I manually replaced the one line from the patch file and it fixed search.

# dpkg -l | grep pcre
ii libpcre3:amd64 1:8.30-4 Perl 5 Compatible Regular Expression Library - runtime files
...
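For anyone wanting to check whether an installed PCRE is new enough to show the warning, a version string like the one in the `dpkg` output above can be compared against 8.30. The following Python sketch is only an illustration: the helper names are hypothetical, and the 8.30 threshold is taken from the reports in this issue, with the assumption that later releases keep the same behaviour.

```python
import re
from typing import Optional, Tuple

# First release reported in this issue to reject surrogate code points in patterns.
FIRST_AFFECTED = (8, 30)


def pcre_version(text: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from strings such as '1:8.30-4' or '8.12 2011-01-15'.

    An optional Debian-style epoch prefix ('1:') is skipped.
    Returns None when no 'major.minor' pair is found.
    """
    match = re.search(r"(?:\d+:)?(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def rejects_surrogates(text: str) -> bool:
    """Guess whether this PCRE version would raise the warning (assumption: >= 8.30)."""
    version = pcre_version(text)
    return version is not None and version >= FIRST_AFFECTED


print(pcre_version("1:8.30-4"))
print(rejects_surrogates("1:8.30-4"))
print(rejects_surrogates("8.12"))
```

Comparing tuples rather than floats matters here: as numbers, 8.3 and 8.30 are equal, while as versions 8.3 is older than 8.12.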

Status: Fixed » Needs review

#22: do-1446372-remove-surrogate-D7.patch queued for re-testing.

This is a Drupal 7 patch but the test system thinks it's a Drupal 6 patch. :(

Status: Needs review » Fixed

Please do not re-test the patch. The issue is marked fixed. This means the patches have been committed to all applicable development branches for D6, D7, and D8.

Status: Fixed » Needs review

#22: do-1446372-remove-surrogate-D7.patch queued for re-testing.

Status: Needs review » Fixed

Version: 6.x-dev » 7.12
Component: install system » node system

Patch #19 worked for me on a D7 site with FreeBSD 9 (pcre 8-30_1).

Version: 7.12 » 6.x-dev
Component: node system » install system

Issue tags: +7.13 release notes

Calling this out as something to mention in the release notes.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Anyone care to tell me how I can fix this on a Drupal 5.x install? Can I ask the ISP to roll back the PHP version, or is it Perl? I'm not too clear on this and am just trying to keep an old site live until we can finally complete a Drupal 7.x migration.
The error on search in Drupal 5.x is: "You must include at least one positive keyword with 3 characters or more."

Comment 23 patch worked for my D6 issue! First time I ever installed a patch too. But it still didn't fix my site search :( I can see now that the root issue is migrating the site from old web server (Ubuntu 11.04, set up ~2010) to new web server (Ubuntu 13, Linode VPS, set up a few days ago). Still, hoping I can get it working so that the old D6 site can be running on a more secure OS, more secure Apache/PHP/MySQL, etc.

File status: new | Size: 763 bytes

Needed this for D5. Obviously it'll never be committed, but posting the patch here for reference.

Hi Ben,
Small world, I see you are chasing the PCRE thing in Drupal also.
Bil Herd

Any Help still available?

I am on Drupal 7.8. The search worked fine when running Squeeze. Upgraded to Wheezy and search is dead on multiple drupal sites.
I have tried manually applying a couple things I've read here, specifically

I've changed D800 to E000 in the unicode.inc file
I've looked at the search.module tweaks as well but these appear to be for other versions as my file has no PREG_CLASS_SEARCH_EXCLUDE section.

the exact error I get is:

Warning: preg_replace(): Compilation failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 2102 in search_simplify() (line 445 of /home/njellis/public/PcComputerGuy.com/public/modules/search/search.module).
You must include at least one positive keyword with 3 characters or more.

This is a Linode LAMP. I have full ssh access so can tweak or do whatever, but so far - I've been unsuccessful at resolving this.

Thanks for any suggestions!

njellis: The simplest way is to grep all .php and .inc files of your Drupal installation for the string "{d800}". The right way is to upgrade Drupal to the latest version, 7.24.
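The search suggested above can also be scripted. Below is a minimal Python sketch of the same idea, not an official tool: the function name `find_surrogate_ranges` is made up for this example, and including `.module` files alongside `.php` and `.inc` is an added assumption (the D6 pattern mentioned earlier lives in search.module).

```python
import re
from pathlib import Path

# Case-insensitive match for the literal text "{d800}" inside a regex source string.
SURROGATE_START = re.compile(r"\{d800\}", re.IGNORECASE)


def find_surrogate_ranges(root, extensions=(".php", ".inc", ".module")):
    """Return (path, line number, line) for every line that mentions {d800}.

    `root` is the top directory of the installation to scan.
    Files are read as UTF-8, with undecodable bytes replaced rather than raising.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in extensions:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if SURROGATE_START.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_surrogate_ranges(".")` from the top of a Drupal tree lists every core or contributed file that still carries the old range, which also catches copies in modules outside core.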

Thanks RedRat. I just did that, and only a single module comes up that appears unrelated:

sites/all/modules/ctools/includes/cleanstring.inc:'\x{a80b}\x{a823}-\x{a82b}\x{d800}-\x{f8ff}\x{fb1e}\x{fb29}\x{fd3e}\x{fd3f}' .

My report says I am running drupal 7.8.

Any other ideas?

Ctools has the same Unicode bug: https://drupal.org/node/1697538. You can manually apply that patch to your modules/ctools/includes/cleanstring.inc file.

But keep in mind that 7.8 is a very outdated version and has to be updated ASAP because of its many bugs and security flaws.
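For illustration, the manual edit amounts to moving the start of the offending range from D800 to E000, as was done in unicode.inc earlier in this thread. The Python sketch below shows that rewrite on the cleanstring.inc line quoted above. It assumes the linked ctools patch makes the same kind of change, which has not been verified here, and `skip_surrogates` is a hypothetical helper name.

```python
import re

# A character-class range in PCRE source text that starts at U+D800,
# for example \x{d800}-\x{f8ff}. Hex digits may be upper or lower case.
RANGE_FROM_D800 = re.compile(r"\\x\{(d800)\}-\\x\{([0-9a-f]+)\}", re.IGNORECASE)


def skip_surrogates(pattern_source: str) -> str:
    """Rewrite ranges starting at U+D800 so that they start at U+E000 instead."""

    def fix(match):
        end = int(match.group(2), 16)
        if end <= 0xDFFF:
            # Nothing outside the surrogate area would remain; leave that to a human.
            raise ValueError("range lies entirely inside the surrogate area")
        # Keep the letter case used by the original source line.
        start = "E000" if match.group(1).isupper() else "e000"
        return "\\x{" + start + "}-\\x{" + match.group(2) + "}"

    return RANGE_FROM_D800.sub(fix, pattern_source)


line = r"'\x{a80b}\x{a823}-\x{a82b}\x{d800}-\x{f8ff}\x{fb1e}\x{fb29}\x{fd3e}\x{fd3f}' ."
print(skip_surrogates(line))
```

Only the start of the range changes, so the Private Use Area (U+E000 to U+F8FF) stays in the character class. That is the part the search test depends on.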

Ah, ok, I will try and get that done. I know I tried it at one point and many things broke. I thought 7.8 > 7.24 but apparently not! :)

Thanks for the help.

The last submitted patch, 22: do-1446372-remove-surrogate-D7.patch, failed testing.