cron hang on preg_replace param count @ line 327

Dave Sherohman - October 8, 2009 - 12:50
Project:Porter-Stemmer
Version:6.x-2.2
Component:Code
Category:bug report
Priority:normal
Assigned:Unassigned
Status:closed
Description

All 6.x-2.x versions of porterstemmer contain an undocumented (and unnecessary) dependency on PHP version 5.1.0 or later: they pass the optional "count" argument to preg_replace, which earlier PHP versions reject. The call fails during stemming and puts cron into an endless loop of retrying it. For 6.x-2.2, the error logged is:

Wrong parameter count for preg_replace() in /path/to/drupal/sites/all/modules/porterstemmer/porterstemmer.module on line 327.

The same error appears in 6.x-2.0 and 6.x-2.1, albeit on different line numbers.

The offending code reads:

  // y -> Y if we should treat it as consonant
  $tmp = preg_replace('/^y/', 'Y', $tmp);
  $count = 1;
  while ( $count ) {
    // Do this replacement one by one, to avoid unlikely yyyy issues
    $tmp = preg_replace('/(' . PORTERSTEMMER_VOWEL . ')y/', '$1Y',
      $tmp, 1, $count);
  }

Replacing this loop with:

  // y -> Y if we should treat it as consonant
  $tmp = preg_replace('/^y/', 'Y', $tmp);
  $done = 0;
  while ( !$done ) {
    // Do this replacement one by one, to avoid unlikely yyyy issues
    $before = $tmp;
    $tmp = preg_replace('/(' . PORTERSTEMMER_VOWEL . ')y/', '$1Y',
      $tmp, 1);
    $done = ($tmp == $before);
  }

will restore compatibility with pre-5.1.0 versions of PHP.
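
For anyone who wants to try the idea outside the module, here is a minimal standalone sketch of the same compare-before-and-after loop. The function name demo_mark_consonant_y() and the DEMO_VOWEL constant are made up for illustration, and the vowel pattern is only my assumption about what PORTERSTEMMER_VOWEL contains; it is not copied from the module.

```php
<?php
// Hypothetical stand-in for the module's PORTERSTEMMER_VOWEL constant.
// The actual pattern in porterstemmer.module may differ.
define('DEMO_VOWEL', '[aeiouy]');

// Illustrative helper (not part of the module): marks an initial y, and
// each y that follows a vowel, as Y without using the 5th "count" argument.
function demo_mark_consonant_y($word) {
  $tmp = preg_replace('/^y/', 'Y', $word);
  do {
    // Replace at most one match per pass, then check whether anything changed.
    $before = $tmp;
    $tmp = preg_replace('/(' . DEMO_VOWEL . ')y/', '$1Y', $tmp, 1);
  } while ($tmp !== $before);
  return $tmp;
}
```

With the assumed vowel pattern, demo_mark_consonant_y('boyish') gives 'boYish' and demo_mark_consonant_y('cry') is left unchanged. The loop stops on the first pass where the string comes back identical, which is the same condition the $count argument was reporting.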

#1

jhodgdon - October 8, 2009 - 13:56

Thanks! I'll investigate and get that fix into a release later today.

#2

jhodgdon - October 8, 2009 - 16:22
Status:active» fixed

I also looked through the rest of the code, and I don't think there are any PHP version dependencies beyond Drupal's requirement (PHP 4.3 for Drupal 6.x). And just as a note, this affects PHP 4.x only (the change to preg_replace came in PHP 5.1.0).

Anyway, with essentially the patch above, the Porter Stemmer tests still pass and the module still works fine with PHP 5; it should also now work with PHP 4.x. This change is in the porterstemming_prestemming() function, which is called for each word that is stemmed, so the test coverage is pretty comprehensive.

Due to the severity of the problem for PHP 4.x users, and because the testing leaves me reasonably certain the fix doesn't break anything, I decided to release immediately. So this fix will be out as version 6.x-2.3 within a few minutes. Keep in mind that the packaging scripts on drupal.org introduce a short delay between my requesting a new version and the download link working.

#3

System Message - October 22, 2009 - 16:30
Status:fixed» closed

Automatically closed -- issue fixed for 2 weeks with no activity.

 
 
