I'd like to get rid of some old server-side includes code in my nodes. So I want to use a search like this:
<!--.+?-->
There are some pages in which if I don't add that final "?" to make the search "lazy," I'll end up deleting real content that needs to stay in the page. But when I try to use this search in Regular Expressions mode, I get this error:
* user warning: Got error 'repetition-operator operand invalid' from regexp query: SELECT t.body as content, t.nid, n.title FROM node_revisions t INNER JOIN node n ON t.vid = n.vid WHERE n.type = 'curriculum_page' AND t.body REGEXP '<!--.+?-->' in /usr/local/web/lamp/users/wsbe/site/sites/all/modules/scanner/scanner.module on line 801.
* user warning: Got error 'repetition-operator operand invalid' from regexp query: SELECT t.body as content, t.nid, n.title FROM node_revisions t INNER JOIN node n ON t.vid = n.vid WHERE n.type = 'page' AND t.body REGEXP '<!--.+?-->' in /usr/local/web/lamp/users/wsbe/site/sites/all/modules/scanner/scanner.module on line 801.
No matches are found, though I get 64 results if I leave out the "?" and let the regexp be greedy. Any suggestions?
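To illustrate the greedy versus lazy difference described above, here is a minimal sketch in Python (whose `re` module accepts lazy quantifiers, unlike the database-side regex that produced the error). The sample `body` string is hypothetical and is not taken from any real node.

```python
import re

# Hypothetical page body: two SSI-style comments with real content between them.
body = (
    '<!--#include virtual="header.html"-->'
    "<p>Keep me</p>"
    '<!--#include virtual="footer.html"-->'
)

# Greedy: .+ runs to the LAST "-->", swallowing the paragraph in between.
greedy = re.sub(r"<!--.+-->", "", body)

# Lazy: .+? stops at the FIRST "-->", so each comment is removed separately.
lazy = re.sub(r"<!--.+?-->", "", body)

print(repr(greedy))  # ''
print(repr(lazy))    # '<p>Keep me</p>'
```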
Comments
Comment #1
emdalton commented
Workaround: I searched on
<!--[^<>]+-->
instead. But if scanner doesn't support lazy matching, probably this should be documented.
Comment #2
aasarava commented
Thanks for the heads up and the workaround. I'll look into this to see if lazy matching is truly disallowed, or if there's some change I can make to the code to get this working.
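For anyone weighing the workaround from comment #1, the sketch below (Python, for illustration only; the sample strings are hypothetical) compares the negated character class with the lazy pattern. On simple comments the two give the same result, but the negated class does not match a comment that itself contains "<" or ">", so it is not a full substitute.

```python
import re

WORKAROUND = r"<!--[^<>]+-->"  # pattern from comment #1, no lazy quantifier
LAZY = r"<!--.+?-->"           # pattern from the original post

# Hypothetical bodies: one with plain comments, one with markup inside a comment.
simple = '<!--#include virtual="a.html"--><p>Keep me</p><!--#include virtual="b.html"-->'
nested = "<!-- old <b>markup</b> --><p>Keep me</p>"

# On plain comments both patterns strip the same text.
same_on_simple = re.sub(WORKAROUND, "", simple) == re.sub(LAZY, "", simple)

# A comment containing "<" or ">" is skipped by the workaround.
workaround_on_nested = re.sub(WORKAROUND, "", nested)
lazy_on_nested = re.sub(LAZY, "", nested)

print(same_on_simple)        # True
print(workaround_on_nested)  # <!-- old <b>markup</b> --><p>Keep me</p>
print(lazy_on_nested)        # <p>Keep me</p>
```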
Comment #3
markabur commented
Probably stuck with POSIX-style matching, which has no ungreedy quantifiers, because the regex runs in MySQL.
Comment #4
froboy
This error still pops up in 7.x. My regex search was
<h3>(.*?)<\/h3>
and I can confirm that removing the lazy operator got rid of the error.
Comment #5
eclecto commented
I've looked this over and I second adding some kind of disclaimer to the documentation or on the module's forms, maybe even detecting attempts to use lazy quantifiers and showing an error message instead of running the query.
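A rough sketch of what such detection could look like, written in Python for illustration rather than in the module's own code. The function name `has_lazy_quantifier` is hypothetical, and the scan is deliberately simple: it skips escaped characters and bracket expressions, but it does not handle every corner case (for example, a literal "]" placed first inside a bracket expression, or a literal "}" followed by "?").

```python
def has_lazy_quantifier(pattern: str) -> bool:
    """Return True if a quantifier (*, +, ?, or {...}) is directly followed by '?'."""
    in_class = False         # inside a [...] bracket expression
    prev_quantifier = False  # previous token was a quantifier
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            # Escaped character: skip it and the character it escapes.
            i += 2
            prev_quantifier = False
            continue
        if in_class:
            if ch == "]":
                in_class = False
            prev_quantifier = False
        elif ch == "[":
            in_class = True
            prev_quantifier = False
        elif ch == "?" and prev_quantifier:
            return True
        else:
            prev_quantifier = ch in "*+?}"
        i += 1
    return False


print(has_lazy_quantifier(r"<!--.+?-->"))       # True
print(has_lazy_quantifier(r"<h3>(.*?)<\/h3>"))  # True
print(has_lazy_quantifier(r"<!--[^<>]+-->"))    # False
```

A form validator could call a check like this before building the query and show a message pointing to the negated character class alternative.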
It's definitely the MySQL/POSIX limitation as mentioned in #3, and the only workaround I can think of is literally pulling every possible field value to a PHP string and running it through PCRE instead of running the regex on the database side. I don't even want to think about the overhead.
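To make that trade-off concrete, here is a minimal sketch of the application-side approach, using Python and an in-memory SQLite table as stand-ins for PHP/PCRE and the real database. The two-column `node_revisions` table and its rows are simplified, hypothetical data. Note that every row is fetched and scanned in the application, which is exactly the overhead mentioned above.

```python
import re
import sqlite3

# Simplified, hypothetical stand-in for the real table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node_revisions (nid INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO node_revisions VALUES (?, ?)",
    [
        (1, '<!--#include virtual="a.html"--><p>Keep me</p><!--#include virtual="b.html"-->'),
        (2, "<p>No comments here</p>"),
    ],
)

# The lazy pattern is evaluated by the application's regex engine, not the database.
pattern = re.compile(r"<!--.+?-->")

matches = {}
for nid, body in conn.execute("SELECT nid, body FROM node_revisions"):
    found = pattern.findall(body)
    if found:
        matches[nid] = found

print(matches)
# {1: ['<!--#include virtual="a.html"-->', '<!--#include virtual="b.html"-->']}
```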
Comment #6
smustgrave commented
With D7 shutting down in a few weeks, I'm triaging the D7 queue of scanner. Will leave bugs open for a bit longer and maybe, if the other committers want to, do a final D7 version release of scanner.