while developing locally, I entered both the eventual domain as my localhost (127.0.0.1) in the domain field and got presented with this

so I had a lil looksie and noticed that the domain validation is brutally strict and entirely not future-proof

<?php
 
foreach ($nofollow_domains as $nofollow_domain) {
    if (!
preg_match('#^([a-z0-9]([-a-z0-9]*)?\.)+([a-z]+)$#i', $nofollow_domain)) {
     
form_set_error('wysiwyg_filter_nofollow_domains_'. $format, t('Invalid domain %domain. Please, enter a comma separated list of valid domain names.', array('%domain' => $nofollow_domain)));
    }
  }
?>

the following domains would be perfectly valid, but are denied by this lil preg_match:
localhost, 127.0.0.1 (or any other IP), íntérnàtîõnàlízéd.com (http://www.icann.org/en/topics/idn/)

so I would like to ask the maintainer what the reason is to even do a check on this? who cares if they entered an invalid domain, it'll just allow that invalid domain to be used, where's the harm? I really only see a downside to this, as the range of valid domains will only expand in the future...

attached patch gets rid of the domain validation

Comments

Category:bug» feature
Status:Needs review» Needs work

Rather than removing, the validation logic could be enhanced.

The filter also add the nofollow directive to internal relative and absolute path with the Whitelist. They are obviously on the same domain and should never have that directives;

<a href="/path/to/page.html>Page</a>
become
<a href="/path/to/page.html" rel="nofollow">Page</a>

and
<a href="path/to/page.html">Page</a>
become
<a href="path/to/page.html" rel="nofollow">Page</a>

I think the filter should check for the protocol (http://, https://, ftp://, etc.). If there is none, the path is considered as internal and the nofollow should not be added.

some new domain TLDs were added, extension امارات for example

pretty sure this fails, as it doesn't consist of a-z

@gaellafond: AFAICT, the parser checks for '://' + domain pattern, so internal paths should never match with domains specified here.

@all: I've been looking at the code, and now I see how we can get rid of this 'strict validation'. It could be completely free, but we need to do something more than just removing the validation code. We need:

1) Remove the validation code (as in the patch above), but also...
2) The domain patterns are now stored with dots scaped. Well, we should not scape anything now at storage time (when the settings form is submitted).
3) Since we would not store the patterns escaped, we need to escape them during text parsing, and we need to use preg_quote() for this job.
4) Siince we will be changing the way this setting is stored, we need to implement hook_update_N() to migrate existing installations.

It would be much appreciated if someone can come up with a patch for review. Sadly, I'm too busy to work on this. Thanks

Category:feature» bug

Changed category. This is definitely a bug. Imagine any intranet service trying to whitelist e.g. a server named "docserver" here. Even if this is probably a rather academic thought.

Sorry, but I really don't see a reason to validate what the site admin considers being a domain. And after all even aside consideration, domains may consist of /[a-z0-9\-\.]/. Why do you stick to it?

Hi, I am having this same issue with Drupal 7. Should I open a new issue since this one is marked for Drupal 6? Thanks.

Has this ever been resolved? I cannot edit my text filters on Drupal 7. I have installed the flexifilter module, which I believe may have reintroduced this issue.

I figured out a work around for this. Instead of using "mysite" as the domain (as configured in MAMP Pro), I changed it to "mysite.dev". This fooled Drupal into thinking it was a real domain and it solved the issue here.

Had to remove WYSIWYG filter entirely as a result of this bug... might be worth addressing...

Same issue here. This affects my entire team who develop locally with different conventions for local environments. For now my solution is to comment out the validation locally and make sure I don't commit my changes upstream; less than ideal.

@brentratliff & @xtfer -- did you see my comment above? I simply changed mysite to mysite.dev and it fixed this issue albiet a workaround...

@highrockmedia Thanks for the suggestion, but it fails on a path like mysite.local, which is also a legitimate domain.

On Drupal 7, It also blocks saving the Input Format admin form, even if WYSIWYG Filter is disabled! To save e.g. the Full HTML format with a disabled WYSIWYG Filter, you need to check the WYSIWYG Filter box to enable it and show the config, change the host name to something more acceptable, uncheck the WYSIWYG Filter box to disable it, and then save.

Version:6.x-1.4» 7.x-1.x-dev

@jakew, see #1687794: Validation occurs on disabled filter. I just opened that

Moving to the latest branch.

Personally, I think people will need to stop domain validation with the i18n changes, or at least allow users to bypass this.

In the link module, the is the instance setting 'validate_url' to bypass, that may be an option here. And it's own "lenient" validation function is:

<?php
/**
* A lenient verification for URLs. Accepts all URLs following RFC 1738 standard
* for URL formation and all email addresses following the RFC 2368 standard for
* mailto address formation.
*
* @param string $text
* @return mixed Returns boolean FALSE if the URL is not valid. On success, returns an object with
* the following attributes: protocol, hostname, ip, and port.
*/
function link_validate_url($text) {
 
$LINK_ICHARS_DOMAIN = (string) html_entity_decode(implode("", array( // @TODO completing letters ...
   
"&#x00E6;", // æ
   
"&#x00C6;", // Æ
   
"&#x00C0;", // À
   
"&#x00E0;", // à
   
"&#x00C1;", // Á
   
"&#x00E1;", // á
   
"&#x00C2;", // Â
   
"&#x00E2;", // â
   
"&#x00E5;", // å
   
"&#x00C5;", // Å
   
"&#x00E4;", // ä
   
"&#x00C4;", // Ä
   
"&#x00C7;", // Ç
   
"&#x00E7;", // ç
   
"&#x00D0;", // Ð
   
"&#x00F0;", // ð
   
"&#x00C8;", // È
   
"&#x00E8;", // è
   
"&#x00C9;", // É
   
"&#x00E9;", // é
   
"&#x00CA;", // Ê
   
"&#x00EA;", // ê
   
"&#x00CB;", // Ë
   
"&#x00EB;", // ë
   
"&#x00CE;", // Î
   
"&#x00EE;", // î
   
"&#x00CF;", // Ï
   
"&#x00EF;", // ï
   
"&#x00F8;", // ø
   
"&#x00D8;", // Ø
   
"&#x00F6;", // ö
   
"&#x00D6;", // Ö
   
"&#x00D4;", // Ô
   
"&#x00F4;", // ô
   
"&#x00D5;",    // Õ
   
"&#x00F5;",    // õ
   
"&#x0152;", // Œ
   
"&#x0153;", // œ
   
"&#x00FC;", // ü
   
"&#x00DC;", // Ü
   
"&#x00D9;", // Ù
   
"&#x00F9;", // ù
   
"&#x00DB;", // Û
   
"&#x00FB;", // û
   
"&#x0178;", // Ÿ
   
"&#x00FF;", // ÿ
   
"&#x00D1;", // Ñ
   
"&#x00F1;", // ñ
   
"&#x00FE;", // þ
   
"&#x00DE;", // Þ
   
"&#x00FD;", // ý
   
"&#x00DD;", // Ý
   
"&#x00BF;", // ¿
 
)), ENT_QUOTES, 'UTF-8');
 
$LINK_ICHARS = $LINK_ICHARS_DOMAIN . (string) html_entity_decode(implode("", array(
   
"&#x00DF;", // ß
 
)), ENT_QUOTES, 'UTF-8');
 
$allowed_protocols = variable_get('filter_allowed_protocols', array('http', 'https', 'ftp', 'news', 'nntp', 'telnet', 'mailto', 'irc', 'ssh', 'sftp', 'webcal'));
 
// Starting a parenthesis group with (?: means that it is grouped, but is not captured
 
$protocol = '((?:'. implode("|", $allowed_protocols) .'):\/\/)';
 
$authentication = "(?:(?:(?:[\w\.\-\+!$&'\(\)*\+,;=" . $LINK_ICHARS . "]|%[0-9a-f]{2})+(?::(?:[\w". $LINK_ICHARS ."\.\-\+%!$&'\(\)*\+,;=]|%[0-9a-f]{2})*)?)?@)";
 
$domain = '(?:(?:[a-z0-9' . $LINK_ICHARS_DOMAIN . ']([a-z0-9'. $LINK_ICHARS_DOMAIN . '\-_\[\]])*)(\.(([a-z0-9' . $LINK_ICHARS_DOMAIN . '\-_\[\]])+\.)*('. LINK_DOMAINS .'|[a-z]{2}))?)';
 
$ipv4 = '(?:[0-9]{1,3}(\.[0-9]{1,3}){3})';
 
$ipv6 = '(?:[0-9a-fA-F]{1,4}(\:[0-9a-fA-F]{1,4}){7})';
 
$port = '(?::([0-9]{1,5}))';
 
// Pattern specific to external links.
 
$external_pattern = '/^'. $protocol .'?'. $authentication .'?('. $domain .'|'. $ipv4 .'|'. $ipv6 .' |localhost)'. $port .'?';
 
// Pattern specific to internal links.
 
$internal_pattern = "/^(?:[a-z0-9". $LINK_ICHARS ."_\-+\[\]]+)";
 
$internal_pattern_file = "/^(?:[a-z0-9". $LINK_ICHARS ."_\-+\[\]\.]+)$/i";
 
$directories = "(?:\/[a-z0-9". $LINK_ICHARS ."_\-\.~+%=&,$'#!():;*@\[\]]*)*";
 
// Yes, four backslashes == a single backslash.
 
$query = "(?:\/?\?([?a-z0-9". $LINK_ICHARS ."+_|\-\.~\/\\\\%=&,$'():;*@\[\]{} ]*))";
 
$anchor = "(?:#[a-z0-9". $LINK_ICHARS ."_\-\.~+%=&,$'():;*@\[\]\/\ ? ]*)";
 
// The rest of the path for a standard URL.
 
$end = $directories . '?' . $query . '?' . $anchor . '?' . '$/i';
 
$message_id = '[^@].*@'. $domain;
 
$newsgroup_name = '(?:[0-9a-z+-]*\.)*[0-9a-z+-]*';
 
$news_pattern = '/^news:('. $newsgroup_name .'|'. $message_id .')$/i';
 
$user = '[a-zA-Z0-9'. $LINK_ICHARS .'_\-\.\+\^!#\$%&*+\/\=\?\`\|\{\}~\'\[\]]+';
 
$email_pattern = '/^mailto:'. $user .'@'.'(?:'. $domain .'|'. $ipv4 .'|'. $ipv6 .'|localhost)'. $query .'?$/';
  if (
strpos($text, '<front>') === 0) {
    return
LINK_FRONT;
  }
  if (
in_array('mailto', $allowed_protocols) && preg_match($email_pattern, $text)) {
    return
LINK_EMAIL;
  }
  if (
in_array('news', $allowed_protocols) && preg_match($news_pattern, $text)) {
    return
LINK_NEWS;
  }
  if (
preg_match($internal_pattern . $end, $text)) {
    return
LINK_INTERNAL;
  }
  if (
preg_match($external_pattern . $end, $text)) {
    return
LINK_EXTERNAL;
  }
  if (
preg_match($internal_pattern_file, $text)) {
    return
LINK_INTERNAL;
  }
  return
FALSE;
}
?>

I will lend my voice to this. It is creating a time-suck for me trying unsuccessfully to set up a local mail infrastructure when all we really need is for the validation to allow a perfectly valid ip mail server name.

Have same problem with Wysiwyg Filter - when enabled it:

1. Produced a red message 'Invalid domain localhost. Please, enter a comma separated list of valid domain names', and

2. Obstructed the 'save changes' function in the filtered HTML text format config making it impossible to save any changes to settings

Changing the domain name from localhost to localhost.dev in the Wysiwyg Filter settings appears to have worked for me having only one site. Thanks highrockmedia.

the #13 works for me in my case, this time. thx :) @jakew - to prevent it by "system" working on this "issue" en details would be sthg to do otherdays.
(edit: drupal 7.22 - CKEditor 3.6.5.7647 - Wysiwyg 7.x-2.2)

StatusFileSize
new986 bytes

Fixed for my situation by just modifying the conditional to also allow a value of 'default'

Honestly. Why validate a domain at all. What has this got to do with a WYSIWYG filter at all? Is it intended to be some security snake oil? If so, you're trying to reinvent the wheel although you are not even automotive. If not, is it a symptom of vanity ("look how much we care")? In that case it's even more evitable. No offense, but the world's best-performing tools have always been the ones to focus on core competence.

This module is intended to validate HTML syntax, not URL syntax. Validating a domain name ist best done by a) the guy to enter the markup (it's him who wants to link somewhere!) and b) the browser rendering the site, which is besides much more likely to happen correctly as browsers are maintained by far more than a handful of people and (hopefully) by the ones always up to date with continuous changes in such things as "what is a valid URL".

Sorry for the little sidestep, let's get back to the point: Drop it, move it into a separate plugin so site-builders are free to decide or make it optional. I really don't understand the big deal about it. No offense, again, but three years for a ridiculous issue like this? Without any real feedback? Has the module been dropped? Wouldn't it be time to at least invite an additional maintainer? Someone to fork it if nothing else helps?

Sad sad, because it's an important and helpful module after all. :-(

Please, read #4. No one provided a patch in that direction since then. In addition to that, I may not have the time to deal with it, not even to keep track of follow ups nor the rest of the issues queue, so I have updated the project page as looking for additional co-maintainers. This should state more clearly that others may chime in here to help. ;)

Hell, I overlooked that. Sorry. Well I would at least be able to free some minutes and add a "no validation" option in a first step.

am still getting this error - tried several patches (#1 here as well as #1687794: Validation occurs on disabled filter - no luck, am disabling

Find a patch attached which adds a checkbox for each filter format. Now, the domain validation will only take place if this checkbox is checked.

Hope this helps and finds its way into the module quickly. A second step (but not necessary for this quick solution) could be a rework of the validation itself - to meet latest W3C best practices. If this is still an option, how would you want to validate strings that may consist of almost everything now, from /[a-z]+/ to /[^\.]+(\.[^\.]+)+/?

I would still vote for removing this validation completely. It does not make any sense for those who really would use it as they will know theirselves how their local domains are validly spelled, and for all others who do not want to bother with nofollow options, it is useless anyway.

Either way: HTH!

dave

Status:Needs work» Needs review

...

I had this problem (using Panopoly distro from Pantheon, has WYSIWYG filter). I tried the patch in #23. Got an error the first time I went to /admin/config/content/formats/panopoly_wysiwyg_text:

Notice: Undefined index: nofollow_url_validation in wysiwyg_filter_filter_wysiwyg_settings() (line 151 of /home/me/mysite/profiles/panopoly/modules/contrib/wysiwyg_filter/wysiwyg_filter.admin.inc).

It looks like there's no default set already for $settings['nofollow_url_validation']. When I checked the new checkbox created by this patch and saved it, the error went away.

Also, upon clicking "Add filter" to create a new content filter, I get the same error. I also get the error upon submitting that form, though the new filter appears to have been created. (The latter may be attributable to Drupal's error message lag I've seen?) When I go back to configure the new filter and save it, the error is gone.

Again, this occurs whether "WYSIWYG Filter" is checked or not in any case.

Panopoly is currently using 7.x-1.6-rc2, not the dev version.

EDIT: I also vote for turning off this particular validation.

I also installed the patch from #23.
It works fine, but I also got the error message mentioned in #26.

I'm not using Panopoly btw.

And I also vote for turning off the validation, or at least to make it optional.

Fixed the missing initialization check mentioned in #26, thanks for the hint. Here is a new patch checking this, although I barely believe that anything will happen here after all. Sorry to say that.

+++ b/wysiwyg_filter.admin.incundefined
@@ -144,6 +144,13 @@ This option allows you to specify which HTML elements and attributes are allowed
+    '#description' => t('If enabled, domains you enter in the field below will run through a simple, but sometims too restrictive check.'),

There is a minor typo in the nofollow_url_validation form field description. Despite the typo the patch worked fine. Thanks!

Status:Needs review» Reviewed & tested by the community

Changing the status.

Ping.

Updating patch per #29 comments.

Went the workaround route, enabling wysywig filter changing the hostname(s) to a valid one.

I think the best idea for a patch to this would be to simply disable this filters form_alter for the input format form so that it only tries to validate if the filter is enabled.

maybe...

<?php
if($form_state['values']['filters']['wysiwyg']['status'] == 1){
 
// *** validate nofollow_domains ***
 
$nofollow_domains = array_filter(explode(',', preg_replace('#\s+#', ',', $values['nofollow_domains']))); // form2db
 
foreach ($nofollow_domains as $nofollow_domain) {
    if (!
preg_match('#^([a-z0-9]([-a-z0-9]*)?\.)+([a-z]+)$#i', $nofollow_domain)) {
     
form_set_error('nofollow_domains', t('Invalid domain %domain. Please, enter a comma separated list of valid domain names.', array('%domain' => $nofollow_domain)));
    }
  }
}
?>

- or -
wrap the entire contents of the validate function in an if statement ...should'nt it be standard for all _validate functions to check if their form should even expect input or not?

<?php
/**
* Validate filter settings form.
*
* @ingroup forms
*/
function wysiwyg_filter_filter_wysiwyg_settings_validate($form, &$form_state) {
    if(
$form_state['values']['filters']['wysiwyg']['status']==1){
     
$values =& $form_state['values']['filters']['wysiwyg']['settings'];
     
// *** validate valid_elements ***
      // Check elements against hardcoded backlist.
     
$elements_blacklist = wysiwyg_filter_get_elements_blacklist();
     
$valid_elements = trim($values['valid_elements']);
     
$valid_elements = wysiwyg_filter_parse_valid_elements($valid_elements);
     
$forbidden_elements = array();
      foreach (
array_keys($valid_elements) as $element) {
        if (
in_array($element, $elements_blacklist)) {
         
$forbidden_elements[] = $element;
        }
      }
      if (!empty(
$forbidden_elements)) {
       
form_set_error('valid_elements', t('The following elements cannot be allowed: %elements.', array('%elements' => implode(', ', $forbidden_elements))));
      }
     
// *** validate nofollow_domains ***
     
foreach (wysiwyg_filter_get_advanced_rules() as $rule_key => $rule_info) {
       
$field_name = "rule_$rule_key";
       
$expressions = array_filter(explode(',', preg_replace('#\s+#', ',', trim($values[$field_name])))); // form2db
       
$errors = array();
        foreach (
$expressions as $expression) {
          if (
preg_match('`[*?]\*|\*\?`', $expression)) {
           
$errors[] = t('Invalid expression %expression. Please, do not use more than one consecutive asterisk (**) or one that is next to a question mark wildcard (?* or *?).', array('%expression' => $expression));
          }
          if (!
preg_match($rule_info['validate_regexp'], $expression)) {
           
$errors[] = t('Invalid expression %expression. Please, check the syntax of the %field field.', array('%expression' => $expression, '%field' => $rule_info['title']));
          }
        }
        if (!empty(
$errors)) {
         
form_set_error($field_name, implode('<br />', $errors));
        }
      }
     
// *** validate nofollow_domains ***
     
$nofollow_domains = array_filter(explode(',', preg_replace('#\s+#', ',', $values['nofollow_domains']))); // form2db
     
foreach ($nofollow_domains as $nofollow_domain) {
        if (!
preg_match('#^([a-z0-9]([-a-z0-9]*)?\.)+([a-z]+)$#i', $nofollow_domain)) {
         
form_set_error('nofollow_domains', t('Invalid domain %domain. Please, enter a comma separated list of valid domain names.', array('%domain' => $nofollow_domain)));
        }
      }
    }
}
?>

Issue summary:View changes
StatusFileSize
new1.13 KB

created a patch for this very simple approach...

StatusFileSize
new6.87 KB

okay now it's patched so that wysiwyg doesn't do anything with it's form unless you enable it by checking the 'Wysiwyg Filter' checkbox under 'Enabled Filters' this should take some headaches away...

Both patches are not working when changing lines manually :
# 35: Unexpected if in line 171
# 36: unexpected if in line 234

Just for you information. When there's someone who could send the working wyswig_filter.admin.inc file to me I'd be very grateful!

For those who want to create a new text format to use full html (like for a paypal button-block):

I found a solution by changing the CKEditor profile settings like here. And disabling the Wysiwyg-Module. So the new text format was not necessary anymore.

#13 is the most helpful.

Thanks HeathN and jakew, I read through this issue several times before I saw your comment and read #13. There is a workaround without any code changes:
If you are just trying to save the settings:

  • Enable the WYSWIYG filter (even though you don't need it) and update the domain options, either enter the valid ones or just delete the default and leave it blank
  • Save the updated settings
  • Disable the WYSIWYG filter and save again