Replace ereg with preg

meba - May 22, 2006 - 20:47
Project:Drupal
Version:7.x-dev
Component:file system
Category:task
Priority:critical
Assigned:Unassigned
Status:closed
Description

As a first small result of my effort to replace as much as i can handle occurences of ereg* functions by preg_match in Drupal ( http://groups.drupal.org/node/544 ), this is the first patch i am creating. Please let me know if i am doing things well as i am new to Drupal development :)

This patch removes ereg() in includes/file.inc at:

elseif ($depth >= $min_depth && ereg($mask, $file)) {

At my point of view, everything is working with preg_match().

Attaching patch (is this correct? should i put it elsewhere?)

AttachmentSizeStatusTest resultOperations
file.inc_ereg_patch598 bytesIgnoredNoneNone

#1

meba - May 22, 2006 - 20:52

Well, i have a typo in the patch, attaching a new one

AttachmentSizeStatusTest resultOperations
file.inc_ereg_patch_0600 bytesIgnoredNoneNone

#2

chx - May 23, 2006 - 00:49
Status:needs review» needs work

Awesome idea. But the code needs work. Please fix the caller places to pass pregs.

  1. image.inc:10:  $toolkits = file_scan_directory('includes', 'image\..*\.inc$');
  2. system.module:591:    foreach (file_scan_directory(dirname($theme->filename), 'style.css$') as $style) {
  3. system.module:689: *   The key to be passed to file_scan_directory().
  4. system.module:707:    $files = array_merge($files, file_scan_directory($dir, $mask, array('.', '..', 'CVS'), 0, TRUE, $key, $min_depth));
  5. system.module:547:  $themes = system_listing('\.theme$', 'themes');
  6. system.module:550:  $engines = system_listing('\.engine$', 'themes/engines');
  7. system.module:696:function system_listing($mask, $directory, $key = 'name', $min_depth = 1) {
  8. system.module:893:  $files = system_listing('\.module$', 'modules', 'name', 0);

In the future someone might change them and then it'll be crucial for them to be full pregs. Also, I really do not advice using a slash for a delimiter, because we are dealing with files, paths and such. Someone might write [^/] so I'd recommend you ^lt; > as delimiters. Or #.

#3

chx - May 23, 2006 - 00:49

I wanted to say < and > but an obvious typo stopped this.

#4

meba - May 24, 2006 - 17:34

ok, will do next week, now i am working on my exams at university .)

#5

meba - January 11, 2007 - 23:16
Version:x.y.z» 5.x-dev
Component:file system» other
Category:bug report» task
Status:needs work» needs review

Rerolling against 5.x. Searched for all eregs using file_scan_directory and should be fixed.

includes/common.inc:  file_scan_directory(file_create_path('css'), '.*', array('.', '..', 'CVS'), 'file_delete', TRUE);
includes/common.inc:    $files = array_merge($files, file_scan_directory($dir, $mask, array('.', '..', 'CVS'), 0, TRUE, $key, $min_depth));
includes/file.inc:function file_scan_directory($dir, $mask, $nomask = array('.', '..', 'CVS'), $callback = 0, $recurse = TRUE, $key = 'filename', $min_depth = 0, $depth = 0) {
includes/file.inc:          $files = array_merge($files, file_scan_directory("$dir/$file", $mask, $nomask, $callback, $recurse, $key, $min_depth, $depth + 1));
includes/image.inc:  $toolkits = file_scan_directory('includes', 'image\..*\.inc$');
includes/install.inc:    $installs = array_merge($installs, file_scan_directory('./modules', '^' . $module . '\.install$', array('.', '..', 'CVS'), 0, TRUE, 'name', 0));
install.php:  $profiles = file_scan_directory('./profiles', '\.profile$', array('.', '..', 'CVS'), 0, TRUE, 'name', 0);
install.php:  $locales = file_scan_directory('./profiles/' . $profilename, '\.po$', array('.', '..', 'CVS'), 0, FALSE);
modules/system/system.module:    foreach (file_scan_directory(dirname($theme->filename), 'style\.css$') as $style) {

Test please

AttachmentSizeStatusTest resultOperations
eregpreg.patch2.04 KBIgnoredNoneNone

#6

RobRoy - January 11, 2007 - 23:19
Status:needs review» needs work

<?php
// Code style. Should be
'^'. $module .'\...'

// not

'^' . $module . '\...'
?>

#7

meba - January 11, 2007 - 23:50
Status:needs work» needs review
AttachmentSizeStatusTest resultOperations
eregpreg_0.patch2.04 KBIgnoredNoneNone

#8

RobRoy - January 12, 2007 - 00:10

Not related to your patch specifically, but should we prefix the style line with a '^' to not match somestyle.css also?

#9

drumm - June 27, 2007 - 04:43
Version:5.x-dev» 6.x-dev

#10

Gábor Hojtsy - June 28, 2007 - 00:21

I bet this does not apply to Drupal 6 at all. But a nice undertaking to start with :)

#11

Gábor Hojtsy - June 28, 2007 - 00:53

I mean the patch does not apply, not the idea :) So the patch probably needs to be rerolled.

#12

catch - October 29, 2007 - 11:47
Status:needs review» needs work

#13

Pancho - February 3, 2008 - 16:45
Title:Replacing ereg with preg_match» Replacing ereg by preg_match
Version:6.x-dev» 7.x-dev
Priority:normal» critical

Good start, indeed! We need to get away from ereg soon, as POSIX style regex are to be removed from PHP6.
However I think we should tackle this once and for all as a big task, therefore D7.
As this will break Drupal in PHP6, it is definitely a critical task.

#14

catch - July 25, 2008 - 09:26
Title:Replacing ereg by preg_match» Replace ereg with preg
Assigned to:meba» Anonymous
Status:needs work» needs review

http://drupal.org/node/286893 was duplicate, but has a new patch. http://drupal.org/files/issues/drupal-preg-286893-4.patch

#15

Arancaytar - July 25, 2008 - 09:49

It's easier to follow an issue when the patch is attached directly.

AttachmentSizeStatusTest resultOperations
drupal-preg-64967-15.patch4.64 KBIgnoredNoneNone

#16

earnie - July 25, 2008 - 11:59

Subscribing

#17

lilou - August 23, 2008 - 19:11

Still apply.

#18

Susurrus - August 23, 2008 - 21:39

If we do this then this additional issue should be addressed: #270824: Require preg_* functions for regular expressions.

#19

Robin Monks - September 1, 2008 - 01:39
Status:needs review» reviewed & tested by the community

Please note, this issue will be affected by this as well: http://drupal.org/node/74645

+1 for replacing ereg with preg!

Patch still applies, and appears to work perfectly. RTBC!

Blog API functionality: 47 passes, 0 fails, 0 exceptions
Filter administration functionality: 95 passes, 0 fails, 0 exceptions
Core filters: 20 passes, 0 fails, 0 exceptions
Test single fields: 112 passes, 0 fails, 0 exceptions
Test select field: 33 passes, 0 fails, 0 exceptions
Test date field: 35 passes, 0 fails, 0 exceptions
Test field weights: 47 passes, 0 fails, 0 exceptions

Robin

#20

Dries - September 5, 2008 - 09:26
Status:reviewed & tested by the community» fixed

Committed to CVS HEAD. Thanks.

#21

drewish - September 6, 2008 - 18:41

did we intentionally skip the ereg() in file_scan_directory()?

#22

Dave Reid - September 7, 2008 - 17:56
Status:fixed» active

Not sure if it was intentional, but it's definitely the last instance left in HEAD. I'll see if I can patch that to preg_match today.

#23

drewish - September 16, 2008 - 19:56
Status:active» needs review

here's a first wack at this. i ran through the installer and enabled all the modules with no problems. running the tests now.

AttachmentSizeStatusTest resultOperations
preg_64967.patch14.56 KBIgnoredNoneNone

#24

drewish - September 16, 2008 - 20:28

finished the tests and found one regex i'd missed in simpletest.

AttachmentSizeStatusTest resultOperations
preg_64967.patch12.07 KBIgnoredNoneNone

#25

drewish - September 19, 2008 - 18:03
Component:other» file system

since it's only affecting the file_scan_dir now i'll bump the component.

#26

webchick - September 19, 2008 - 19:31

Hm. Could we not keep the regexes in the calling function the same, and surround the pattern with "/ ... /" in file_scan_directory()?

#27

drewish - September 19, 2008 - 20:21

webchick, we could do that but we'd loose the ability to use case insensitive regular expressions. And this is a more honest way of doing it that won't lead to subtle bugs where ereg expressions fail with preg.

#28

webchick - September 20, 2008 - 03:48
Status:needs review» needs work

Well, that's true. And also, people could also use a different delimiter like # if they wanted to not have to escape /s or whatnot.

Reading this code reminds me that #255551: DX: Array-itize file_scan_directory()'s parameters and #74645: modify file_scan_directory to include a regex for the nomask. would both be nice improvements to file_scan_directory(). Those 120+ character function calls are nasty. ;)

Re-ran the file tests and everything looks good, so committed this. Thanks!

The change to file_scan_directory()'s parameters needs to be noted in the upgrade docs, so marking this code needs work until then.

#29

drewish - September 20, 2008 - 20:58
Status:needs work» fixed

just submitted: http://drupal.org/node/224333#preg_match

#30

Anonymous (not verified) - October 4, 2008 - 21:12
Status:fixed» closed

Automatically closed -- issue fixed for two weeks with no activity.

#31

donquixote - September 4, 2009 - 22:04

What can we do for 6.x ?
I see this issue being closed since October 2008, but in 6.x we still have ereg() instead of preg_match(). With all the nasty errors in PHP 5.3.

Should we make a new issue for 6.x, or switch this one from 7.x to 6.x ?
I would like to point out that the issue was reported for x.y.z (?), then switched to 5.x, then to 6.x. Now it's fixed in D7, but then was forgotten with no backport for almost one year! Is there something wrong with the issue workflow?

EDIT:
Would "@ereg" instead of "ereg" be ok as a temporary fix?

#32

Dave Reid - September 4, 2009 - 22:04

Using preg() instead of ereg() in Drupal 6 would break existing APIs, so it will never happen. Your best bet is to set your PHP settings to not show E_DEPRECATED errors. See http://us2.php.net/manual/en/function.error-reporting.php

 
 

Drupal is a registered trademark of Dries Buytaert.