Ignore pattern

Crashtest - July 15, 2009 - 21:37
Project: Custom filter
Version: 6.x-2.x-dev
Component: Miscellaneous
Category: support request
Priority: normal
Assigned: Unassigned
Status: closed
Description

Hey,

I need ignore patterns that deactivate a rule.

Here is the scenario. **foo** gets replaced with --bar--. But if I write {{ **foo** }} the **foo** inside should not be replaced. So in other words {{ ... }} should protect its content.

Is there a simple way to achieve this with the module?

greez

#1

kiamlaluno - July 16, 2009 - 06:32

I think it is possible by using three rules:

  • The first rule changes {{ **foo** }} into \FE\FFfoo\FE\FF (where \FE\FF are two characters with those hexadecimal codes).
  • The second rule replaces **foo** with --bar--.
  • The third rule replaces \FE\FFfoo\FE\FF with {{ **foo** }}.

Those unusual character values are used to make sure that they are not changed by a different rule or used by a different filter.
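The three rules above can be sketched in Python as three successive substitutions. This is only an illustration, not Custom filter code: the function name `apply_three_rules` is hypothetical, and reading \FE\FF as the two characters U+00FE and U+00FF is an assumption.

```python
import re

# Assumed reading of the thread's \FE\FF marker: the characters U+00FE and U+00FF.
MARK = "\u00fe\u00ff"


def apply_three_rules(text: str) -> str:
    """Apply the three rules in order: hide, replace, restore."""
    # Rule 1: hide the protected form behind the marker.
    text = re.sub(r"\{\{ \*\*foo\*\* \}\}", MARK + "foo" + MARK, text)
    # Rule 2: the ordinary replacement, which no longer sees the hidden form.
    text = re.sub(r"\*\*foo\*\*", "--bar--", text)
    # Rule 3: restore the protected form.
    text = re.sub(re.escape(MARK + "foo" + MARK), "{{ **foo** }}", text)
    return text


print(apply_three_rules("a **foo** b {{ **foo** }}"))
```

The order matters: if the second rule ran first, the **foo** inside the braces would already be replaced before it could be hidden.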

#2

kiamlaluno - July 16, 2009 - 15:39

There is no need to create three different rules; with assertions it is possible to check the characters present before or after the current matching point.
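For the simple {{ **foo** }} case, a single rule with fixed-width assertions might look like the following sketch. It uses Python's `re` module rather than PHP, and the name `replace_unprotected` is hypothetical. The pattern assumes exactly one space between the braces and **foo**, as in the original scenario.

```python
import re

# Negative lookbehind for "{{ " and negative lookahead for " }}".
# Both assertions have a fixed width, so the pattern compiles.
RULE = re.compile(r"(?<!\{\{ )\*\*foo\*\*(?! \}\})")


def replace_unprotected(text: str) -> str:
    """Replace **foo** unless it is directly preceded by '{{ ' or followed by ' }}'."""
    return RULE.sub("--bar--", text)


print(replace_unprotected("**foo** and {{ **foo** }}"))
```

Note that this only covers the case where the braces sit directly next to **foo**; it does not handle arbitrary text between the braces and the match.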

The question is only minimally related to Custom filter; it is rather a question about regular expressions. Such questions should be asked in Drupal or PHP forums, where more people would read them and it would be easier to get an answer.

#3

Crashtest - July 16, 2009 - 20:03

Okay, assertions were indeed a good tip. But to be exact: {{{ some text **foo** some other text }}} should also be left untouched. Unfortunately, PHP doesn't yet support look-behind assertions with variable length, and I can't think of a regex-only solution that wouldn't need this feature.

So maybe I'll use your first tip... although I don't like it much ;).

thx

#4

kiamlaluno - July 16, 2009 - 21:03

Assertions can be used even when they are followed / preceded by ".*"; I don't think there is a difference between the first case you mentioned, and the last.

#5

Crashtest - July 16, 2009 - 21:32

If I understood you correctly, you propose something like this:
(?<!\{{3}).*\*{2}.+?\*{2}.*(?!\}{3})

But that doesn't work, because
.*\*{2}.+?\*{2}.*

would match the whole string when a **foo** is in it, and there is never a {{{ in front of it or a }}} at the end of it.

Example:
This is just a example text to {{{show **what** I mean}}}. Foobar.

So the whole string would be matched.
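The claim can be checked with a short sketch. It uses Python's `re` module as a stand-in for PHP's PCRE (an assumption; the two engines behave the same way for this pattern as far as I can tell), with the pattern and sample text copied from the post.

```python
import re

# Pattern and sample text copied from the post above.
PATTERN = re.compile(r"(?<!\{{3}).*\*{2}.+?\*{2}.*(?!\}{3})")
TEXT = "This is just a example text to {{{show **what** I mean}}}. Foobar."

match = PATTERN.search(TEXT)

# The greedy .* on both sides stretches the match to the string boundaries,
# so the assertions are only ever tested at the start and end of the string.
matched_whole_string = match is not None and match.group(0) == TEXT
print(matched_whole_string)
```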

If variable quantors were allowed in a lookbehind assertion, you could do something like
(?<!\{{3}.*)\*{2}.+?\*{2}(?!.*\}{3}.*)

This would work with some restrictions but it covers my point.
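The restriction is easy to reproduce in Python's `re` module, which the second reference below also lists as a fixed-length-only flavor. The following sketch (the helper name `compiles` is hypothetical) only checks whether a pattern is accepted; it says nothing about PHP itself.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# The pattern from the post: the lookbehind contains .*, so its width varies.
VARIABLE = r"(?<!\{{3}.*)\*{2}.+?\*{2}(?!.*\}{3}.*)"
# The same idea with a fixed-width lookbehind of exactly three characters.
FIXED = r"(?<!\{{3})\*{2}.+?\*{2}(?!\}{3})"

print(compiles(VARIABLE), compiles(FIXED))
```

Only the lookbehind is restricted; the lookahead may contain .* without complaint.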

Or did I get you wrong?

A general question: Is this the proper place to discuss this? ;)

Reference:
http://www.php.net/manual/en/regexp.reference.assertions.php
(?<=bullock|donkey) is permitted, but (?

http://www.regular-expressions.info/lookaround.html
Therefore, many regex flavors, including those used by Perl and Python, only allow fixed-length strings.

#6

kiamlaluno - July 16, 2009 - 21:45

Why are you using "{2}", and "{3}" inside the regular expression?

#7

kiamlaluno - July 16, 2009 - 21:52
Status: active » postponed (maintainer needs more info)

What I meant before is that if you ask the question on a Drupal forum or a PHP forum, you have a better chance of getting it answered. So far, the only people who write in this issue queue are people who have problems with the module (but who don't seem interested in replying to questions asked by other people when the question is not about a problem they are having or have had), and me. I will do my best to reply to any question reported here, as far as my knowledge allows me to give an answer that is not too vague.
I didn't mean that such questions should not be asked here.

#8

Crashtest - July 16, 2009 - 22:03

Eh, why not? It's a legal quantor.

I want to match for example {{{ so I use \{{3}

P.S.: I've asked in a PHP forum, which has given me no result so far. So I'm happy to discuss with anybody ;).

#9

kiamlaluno - July 16, 2009 - 22:40
Status: postponed (maintainer needs more info) » active

(?<!\{{3}).*\*{2}.+?\*{2}.*(?!\}{3})

But that doesn't work, because

.*\*{2}.+?\*{2}.*

would match the whole string when a **foo** is in it.

In that case, just change it to (?<!\{{3}).*?\*{2}.+?\*{2}.*?(?!\}{3}). That would make the .* non-greedy.

I got confused by the term you used; what you call a quantor is called a quantifier in the PHP documentation.

You may also be interested in conditional subpatterns.

I think that you must also include the regular expression delimiters in the regular expression; the one you wrote should be /(?<!\{{3}).*\*{2}.+?\*{2}.*(?!\}{3})/.

I suggest always starting with a more generic regular expression that you then modify until you obtain the expected result; trying to get the final result in a single step can lead you to an incorrect expression.

#10

Crashtest - July 17, 2009 - 10:42

This wouldn't work either, because the pattern without the assertions is matched first and then the assertions are evaluated.

So the pattern .*?\*{2}.+?\*{2}.*? would match the bold in the following example:
This is just a example text to {{{show **what** I mean}}}. Foobar.

And of course there is never a {{{ before the match, unless the string starts with it.
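A quick check of the non-greedy variant, again using Python's `re` module as a stand-in for PHP's PCRE (an assumption), with the pattern and sample text taken from the thread:

```python
import re

# Non-greedy variant suggested in the previous reply.
PATTERN = re.compile(r"(?<!\{{3}).*?\*{2}.+?\*{2}.*?(?!\}{3})")
TEXT = "This is just a example text to {{{show **what** I mean}}}. Foobar."

match = PATTERN.search(TEXT)

# The match still starts at the beginning of the string, so the lookbehind
# is tested there and not directly in front of the bold text.
print(match.start(), repr(match.group(0)))
```

The protected **what** is still part of the match, which is the behavior described in the post.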

I've found a pattern that works more or less:
\*{2}.+?\*{2}(?![^\{]*?\}{3})

But it breaks if there is at least one { between **foo** and }}}. So, for example, the **foo** in {{{ **foo** { }}} would be matched.
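The behavior of this pattern, including the failure case, can be reproduced with the sketch below. It uses Python's `re` module instead of PHP (an assumption), and the helper name `find_unprotected` is hypothetical.

```python
import re

# Pattern copied from the post: a bold span that is not followed by }}}
# with only non-{ characters in between.
PATTERN = re.compile(r"\*{2}.+?\*{2}(?![^\{]*?\}{3})")


def find_unprotected(text: str) -> list[str]:
    """Return every bold span the pattern treats as unprotected."""
    return PATTERN.findall(text)


# Protected bold text is skipped.
print(find_unprotected("This is just a example text to {{{show **what** I mean}}}. Foobar."))
# Unprotected bold text is found.
print(find_unprotected("plain **foo** text"))
# Failure case from the post: a { between **foo** and }}} defeats the lookahead.
print(find_unprotected("{{{ **foo** { }}}"))
```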

quantor is called quantifier

Yep. You're right.

#11

kiamlaluno - July 17, 2009 - 14:38

It seems that the first idea, creating three different rules, is the one that works so far. I would keep trying to make the solution with assertions work, as it should be the solution to adopt (if it worked :-)).

After reading your support request and all the follow-ups, I thought that Custom filter could have a page for testing a regular expression against a string. This would help in defining the regular expression without having to create or change the rule, test the filter on a test node, and go back to edit the rule.
I think that it would save you some time while testing a new regular expression. What do you think?

The next change I thought of is to make it possible to use some prebuilt regular expressions that could be used to build a more complex expression.

#12

Crashtest - July 18, 2009 - 14:43

I thought that Custom filter could have a page for testing a regular expression against a string. This would help in defining the regular expression without having to create or change the rule, test the filter on a test node, and go back to edit the rule.
I think that it would save you some time while testing a new regular expression. What do you think?

This would save a whole lot of time and would be a sweet feature. If it is ajaxified (so you don't have to reload the page every time), it will be really pleasant.

Or a slightly different approach: on the same page where you define the rule, you could select a node to which the rule is applied, and the result is shown.

I've tested my pattern with this German site: http://www.regex-tester.de/regex.html But there is probably something similar in English too. Just for inspiration.

Your 2nd idea: Do you mean something like a wysiwyg regex editor? Where you put patterns together?

At the moment I use an adapted algorithm like the one in your first post.

1. Put a special string (like ) after every line in the {{{}}}-container
2. All rules use a look-ahead with .*?
3. Remove the special string

But I still don't like it much ;).
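The three steps in the post above might look roughly like this in Python. Everything here is a sketch under assumptions: the special string used in the thread was lost, so `MARK` is a hypothetical stand-in, the function names are invented, and the bold-to-dashes rule is only an example rule.

```python
import re

# Hypothetical marker; the special string actually used in the thread is unknown.
MARK = "\u00fe\u00ff"


def protect_lines(text: str) -> str:
    """Step 1: append MARK to every line inside a {{{ ... }}} container."""
    def mark_block(m: re.Match) -> str:
        return "\n".join(line + MARK for line in m.group(0).split("\n"))
    return re.sub(r"\{\{\{.*?\}\}\}", mark_block, text, flags=re.DOTALL)


def bold_rule(text: str) -> str:
    """Step 2: an example rule that skips matches when MARK follows on the same line."""
    pattern = r"\*\*([^*\n]+)\*\*(?!.*?" + MARK + ")"
    return re.sub(pattern, r"--\1--", text)


def unprotect(text: str) -> str:
    """Step 3: remove the markers again."""
    return text.replace(MARK, "")


sample = "top **a**\n{{{ keep **b**\nand **c** }}}\nend **d**"
result = unprotect(bold_rule(protect_lines(sample)))
print(result)
```

One limitation of this sketch: bold text that sits on the same line as a container but before it is also skipped, because the marker follows it on that line.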

#13

kiamlaluno - July 18, 2009 - 18:21

Your 2nd idea: Do you mean something like a wysiwyg regex editor? Where you put patterns together?

I am referring to something similar to the select field that appears next to the button labeled "ausgewähltes Feld an Mausposition einfügen" ("insert selected field at mouse position"). Unlike that web site, I would add a short explanation of the purpose of each regular expression; also, third-party modules would be allowed to add additional presets to the select field.

#15

kiamlaluno - August 1, 2009 - 20:05
Status: active » fixed

As this support request already got an answer, I am setting it as fixed.

#16

System Message - August 15, 2009 - 20:10
Status: fixed » closed

Automatically closed -- issue fixed for 2 weeks with no activity.

 
 
