The latest update of PCRE to 8.30 produces the following warning during Drupal installation:
Warning: preg_match(): Compilation failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 2112 in truncate_utf8() (line 339 of /home/sites/drupal7/includes/unicode.inc).

This is discussed in full here: http://drupal.org/node/1442876
Systems affected so far are Arch Linux and FreeBSD. More systems will probably display this warning once they update PCRE to 8.30.
A user has mentioned that this also affects Drupal searches (http://drupal.org/node/1442876#comment-5622720). I haven't tried it myself yet, so I cannot tell.

I created a bug report on Arch Linux (https://bugs.archlinux.org/task/28533), but they suggest that this be fixed on the Drupal side.

Comments

bserem’s picture

The attached patch solves the problem. I simply removed the \x{D800}-\x{F8FF} part, as Pierre Schmitz suggests here: https://bugs.archlinux.org/task/28533

bserem’s picture

Version: 7.12 » 7.x-dev
Status: Active » Needs review

Silly me... The above patch is against drupal-7.x-dev, as it should be. I forgot to update the issue status.

Status: Needs review » Needs work

The last submitted patch, 1446372-modified-unicodeinc-to-play-nice-with-PCREv830.patch, failed testing.

limingtao’s picture

I got the same error on FreeBSD 8.2 with pcre 8.30, apache22, php5.3 and mysql55. On my other FreeBSD system (php5.2, php52-pcre), Drupal 7.12 works well.

catch’s picture

Version: 7.x-dev » 8.x-dev
Issue tags: +Needs backport to D6, +Needs backport to D7

Looks like the patch is breaking existing test coverage, didn't look to see if it's a problem with the test or a problem with the patch. Note all bugfixes are applied to 8.x first then backported.

Assuming this will need to be backported to D6 as well.

bserem’s picture

The patch has 1 fail and 38700 passes. Probably something very specific fails that needs the characters I patched.

@alan mccoll, there is a subscribe button now in drupal.org, you do not need to answer in order to subscribe

catch’s picture

Yes it's the search test that breaks, so it's directly related to the patch.

bserem’s picture

Status: Needs work » Needs review
Issue tags: -Needs backport to D6, -Needs backport to D7

Status: Needs review » Needs work
Issue tags: +Needs backport to D6, +Needs backport to D7

The last submitted patch, 1446372-modified-unicodeinc-to-play-nice-with-PCREv830.patch, failed testing.

limingtao’s picture

It was said that PCRE 8.30 uses libpcre.so.0 instead of libpcre.so.1, which causes the problem. I will test on my system tomorrow.

bserem’s picture

Help me out here: should we be testing against D8?

I haven't had any experience with D8, I'll look at it this week

RedRat’s picture

I recently updated PCRE to version 8.30, and now I constantly get this error when I do any search:

warning: preg_replace(): Compilation failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 1816 in /www/istmat/soft/drupal/modules/search/search.module on line 333.

I think this error must be eliminated ASAP, as it is a really nasty bug and from now on more and more sites will have an updated PCRE!

Heine’s picture

Title: Updating PCRE to 8.30 in the operating system produces warnings on Drupal installation » Invalid Unicode code range in PREG_CLASS_UNICODE_WORD_BOUNDARY fails with PCRE 8.30

bserem, invalid codepoints in modules/search/test/UnicodeTest.txt are most likely the culprit.

bserem’s picture

Thanks Heine.

Anybody having any ideas on the patch? It failed (as it says) one time in the search module (it had 650 passes of course).
That is against D7 of course, not D8.

I'm out of ideas

Heine’s picture

The test fails on line 311 of UnicodeTest.txt which contains the byte sequences:

ee 80 80 ef a3 bf

ef a3 bf is UTF-8 for U+F8FF, which has been removed with the patch.

It appears that the range \x{D800}-\x{F8FF} is not possible as a single sequence because it encompasses the following Unicode areas:

Surrogate Area: U+D800 - U+DFFF
Private Use Area: U+E000 - U+F8FF

The surrogate area is not allowed in PCRE 7.3+. Attached patch removes it.

(I also believe we shouldn't break on U+FEFF, the zero-width no-break space, which acts as a word joiner, but that's best left for another issue.)
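
As an aside, the byte-level reasoning in this comment can be checked with a short sketch. It is written in Python purely for illustration (Drupal itself is PHP), and the helper name `classify` is hypothetical. It decodes the six bytes quoted from UnicodeTest.txt and reports which Unicode area each code point belongs to, then shows that a surrogate has no valid UTF-8 encoding at all.

```python
# Bytes quoted from line 311 of UnicodeTest.txt in the comment above.
raw = bytes.fromhex("ee 80 80 ef a3 bf")

# Decode the UTF-8 bytes and collect the numeric code points.
code_points = [ord(ch) for ch in raw.decode("utf-8")]

SURROGATES = range(0xD800, 0xE000)   # U+D800..U+DFFF
PRIVATE_USE = range(0xE000, 0xF900)  # U+E000..U+F8FF


def classify(cp):
    """Hypothetical helper: name the area a code point falls in."""
    if cp in SURROGATES:
        return "surrogate"
    if cp in PRIVATE_USE:
        return "private use"
    return "other"


labels = [(f"U+{cp:04X}", classify(cp)) for cp in code_points]
print(labels)

# A lone surrogate cannot be encoded as UTF-8, so no valid UTF-8
# subject string could ever match the surrogate part of the range.
try:
    "\ud800".encode("utf-8")
    surrogate_encodable = True
except UnicodeEncodeError:
    surrogate_encodable = False
print(surrogate_encodable)
```

Both decoded code points sit at the two ends of the Private Use Area, which is consistent with the test failing once the whole \x{D800}-\x{F8FF} range was removed rather than only its surrogate part.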

Heine’s picture

Status: Needs work » Needs review

This is the Drupal 8 patch. Please test with PCRE 8.30 as I do not have a test install with it atm.

Status: Needs review » Needs work

The last submitted patch, do-1446372-remove-surrogate.patch, failed testing.

Heine’s picture

*sigh*

Patch in the Dark Ages format :(

Heine’s picture

Status: Needs work » Needs review

bserem’s picture

Just did a minimal-D8 install, all went fine.

It should however be backported to 7 and 6 in order to test it with the search function.

Heine’s picture

Drupal 7 patch in updated format to aid testers. D8 contains the same test BTW.

champlin’s picture

A similar patch is needed for D6 where the code is located in PREG_CLASS_SEARCH_EXCLUDE (search.module).
Thanks. -virgil

RedRat’s picture

I can confirm that on D6, after applying the patch from #23, the error is gone and search works again.

bserem’s picture

I confirm D8, though I didn't have a site with a search-index to test search.

All my D7 sites are still on PCRE 8.2, so we need somebody to confirm it for D7.

chx’s picture

Status: Needs review » Reviewed & tested by the community

Go. Before catch pointed out this issue (I was only looking at criticals) I had the same patch prepared.

chx’s picture

I checked, and the next Ubuntu release, due in April, seems to be using PCRE 8.12 http://packages.ubuntu.com/search?keywords=pcre&searchon=names&suite=pre... . CentOS/RedHat is using PCRE 7; we can expect this most excellent and modern OS to break some time around when the Sun grows cold. I suspect Fedora will be the next to break, on May 8 when Fedora 17 releases, because their trunk ("Rawhide") contains 8.30.

catch’s picture

Priority: Major » Critical
Issue tags: +D7 stable release blocker

I'm going to tag this as Drupal 7 stable release blocker and critical. The next D7 release is likely to be end of March, that only gives people 5 weeks to upgrade before the Fedora upgrade, after which point I'd expect us to start seeing duplicate tickets getting opened etc.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, do-1446372-remove-surrogate-D6.patch, failed testing.

catch’s picture

Status: Needs work » Reviewed & tested by the community

Not sure why the suffixed patches are being tested, back to RTBC.

xjm’s picture

Not sure why they got tested suddenly now when they were submitted previously, but:
#1092232: Bot needs to handle patches named for all core versions -D[678]

catch’s picture

Version: 8.x-dev » 7.x-dev

Committed/pushed this to 8.x.

There was some discussion on IRC about trying to get this into 6.25. I don't think we should try to do that, but it'd be good to get it into 6/7 soon so there's time for it to bed in before an end-of-March release (i.e. 6.26).

I'm leaving this RTBC for 7.x since #22 is the same patch re-rolled.

webchick’s picture

Version: 7.x-dev » 6.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

Committed and pushed to 7.x. Thanks!

Moving to 6.x for backport.

chx’s picture

Status: Patch (to be ported) » Reviewed & tested by the community

#23 has the patch.

webengr’s picture

Thanks, I went to 7.x-dev and the issue seems to be gone for now.

xjm’s picture

Hrm, we wanted to get this in this month, right?

cafuego’s picture

The problem has also turned up on Debian wheezy now. The patch in comment #23 resolves it.

Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Fixed

Thanks all, committed and pushed #23 to Drupal 6. Should be in the next bugfix release.

Corwin’s picture

The D7 patch from comment #22 fixed the issue of 'You must include at least one positive keyword with 3 characters or more.' on any search under Debian Testing. Actually I couldn't get the patch to apply but I manually replaced the one line from the patch file and it fixed search.

# dpkg -l | grep pcre
ii libpcre3:amd64 1:8.30-4 Perl 5 Compatible Regular Expression Library - runtime files
...

Corwin’s picture

Status: Fixed » Needs review

#22: do-1446372-remove-surrogate-D7.patch queued for re-testing.

This is a Drupal 7 patch but the test system thinks it's a Drupal 6 patch. :(

xjm’s picture

Status: Needs review » Fixed

Please do not re-test the patch. The issue is marked fixed. This means the patches have been committed to all applicable development branches for D6, D7, and D8.

berteam’s picture

Status: Fixed » Needs review

Heine’s picture

Status: Needs review » Fixed

ecommercium’s picture

Version: 6.x-dev » 7.12
Component: install system » node system

Patch #19 worked for me on D7 with FreeBSD 9 (pcre 8.30_1).

xjm’s picture

Version: 7.12 » 6.x-dev
Component: node system » install system

webchick’s picture

Issue tags: +7.13 release notes

Calling this out as something to mention in the release notes.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

jessZ’s picture

Anyone care to tell me how I can fix this on a Drupal 5.x install? Can I ask the ISP to roll back the PHP version, or is it Perl? I'm not too clear on this and am just trying to keep an old site live until we can finally complete a Drupal 7.x migration.
The error on search in Drupal 5.x is: "You must include at least one positive keyword with 3 characters or more."

bbakelaar’s picture

Comment 23 patch worked for my D6 issue! First time I ever installed a patch too. But it still didn't fix my site search :( I can see now that the root issue is migrating the site from old web server (Ubuntu 11.04, set up ~2010) to new web server (Ubuntu 13, Linode VPS, set up a few days ago). Still, hoping I can get it working so that the old D6 site can be running on a more secure OS, more secure Apache/PHP/MySQL, etc.

Les Lim’s picture

Needed this for D5. Obviously it'll never be committed, but posting the patch here for reference.

BilHerd’s picture

Hi Ben,
Small world, I see you are chasing the PCRE thing in Drupal also.
Bil Herd

njellis’s picture

Is any help still available?

I am on Drupal 7.8. Search worked fine when running Squeeze. After upgrading to Wheezy, search is dead on multiple Drupal sites.
I have tried manually applying a couple things I've read here, specifically

I've changed D800 to E000 in the unicode.inc file
I've looked at the search.module tweaks as well but these appear to be for other versions as my file has no PREG_CLASS_SEARCH_EXCLUDE section.

the exact error I get is:

Warning: preg_replace(): Compilation failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 2102 in search_simplify() (line 445 of /home/njellis/public/PcComputerGuy.com/public/modules/search/search.module).
You must include at least one positive keyword with 3 characters or more.

This is a Linode LAMP. I have full ssh access so can tweak or do whatever, but so far - I've been unsuccessful at resolving this.

Thanks for any suggestions!

RedRat’s picture

njellis: The simplest way is to grep all .php and .inc files of your Drupal installation for the string "{d800}". The right way is to upgrade Drupal to the latest version, 7.24.
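
For anyone who prefers a script to grep, the search suggested here could look roughly like the sketch below. It is in Python only for illustration, and the function name `find_surrogate_ranges` is made up for this example. It walks a directory tree and reports every line in .php and .inc files that contains "{d800}", ignoring case.

```python
import os


def find_surrogate_ranges(root, needle="{d800}", extensions=(".php", ".inc")):
    """Yield (path, line_number, line) for each line containing needle.

    The match is case-insensitive, so {D800} and {d800} are both found.
    """
    needle = needle.lower()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on files that are
            # not valid UTF-8.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line.lower():
                        yield path, number, line.rstrip("\n")
```

Calling `list(find_surrogate_ranges("/path/to/drupal"))` (the path is a placeholder) returns the hits. The extension tuple can be widened, for example with ".module", since the warnings quoted in this thread also point at search.module.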

njellis’s picture

Thanks RedRat. I just did that, and only a single module comes up that appears unrelated:

sites/all/modules/ctools/includes/cleanstring.inc:'\x{a80b}\x{a823}-\x{a82b}\x{d800}-\x{f8ff}\x{fb1e}\x{fb29}\x{fd3e}\x{fd3f}' .

My report says I am running drupal 7.8.

Any other ideas?

RedRat’s picture

Ctools has this Unicode bug too: https://drupal.org/node/1697538. You can manually apply that patch to your modules/ctools/includes/cleanstring.inc file.
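
A rough sketch of what a manual edit to that line might amount to, shown in Python for illustration only. The contents of the ctools patch are not quoted in this thread, so this assumes it mirrors the core change described earlier, where the range start is moved from D800 to E000 so that the surrogates are excluded and the Private Use Area is kept.

```python
# The line reported by grep in cleanstring.inc, as quoted in the comment above.
line = r"'\x{a80b}\x{a823}-\x{a82b}\x{d800}-\x{f8ff}\x{fb1e}\x{fb29}\x{fd3e}\x{fd3f}' ."

# Assumption: start the range at U+E000 instead of U+D800, so it no
# longer covers the surrogate area U+D800..U+DFFF.
fixed = line.replace(r"\x{d800}-", r"\x{e000}-")

print(fixed)
```

Only the start of the range changes; every other character class entry on the line is left alone.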

But keep in mind that 7.8 is a very outdated version and has to be updated ASAP because of its many bugs and security flaws.

njellis’s picture

Ah, ok, I will try and get that done. I know I tried it at one point and many things broke. I thought 7.8 > 7.24 but apparently not! :)
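
The 7.8 versus 7.24 mix-up is a common one: release numbers are compared component by component as integers, not as decimal fractions. A minimal Python sketch of the difference follows; `parse_version` is a hypothetical helper, not anything Drupal provides.

```python
def parse_version(text):
    """Split a dotted version such as '7.24' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


# Read as decimals, 7.8 looks larger than 7.24 ...
decimal_says_newer = float("7.8") > float("7.24")

# ... but compared component-wise, release 8 comes before release 24.
version_says_older = parse_version("7.8") < parse_version("7.24")

print(decimal_says_newer, version_says_older)
```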

Thanks for the help.

dietr_ch’s picture

The last submitted patch, 22: do-1446372-remove-surrogate-D7.patch, failed testing.

wescleyteixeira’s picture

#23 worked for me on D6.