Is there any way to limit taxonomy terms to alphanumeric characters? This is an inherent problem with user-generated tags that will be operated on as data. In my case, I use taxonomy terms with Views arguments. These arguments get passed to the URL, where certain characters (*, &, -, %, $) have special functions. From searching the forums, I can see that other users have similar problems, but I haven't found a solution. Is there any existing way to sanitize the user input for taxonomy terms?

One possibility would be to apply input formats to taxonomy terms and then create an input format that strips all non-alphanumeric characters.
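To make the idea concrete, here is a rough sketch of just the stripping step in plain PHP. It is not wired into Drupal's input format system, and the function name is hypothetical.

```php
<?php
/**
 * Hypothetical helper: removes every character that is not an ASCII
 * letter or digit from a term.
 */
function example_strip_non_alphanumeric($term) {
  return preg_replace('/[^A-Za-z0-9]/', '', $term);
}

echo example_strip_non_alphanumeric('rock & roll-100%'), "\n"; // prints "rockroll100"
```

Note that stripping silently changes what the user typed, so two different tags could collapse into the same term.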

Comments

cwgordon7

This would be a very simple module. Just create a folder called "taxonomy_alphanumeric" in your sites/all/modules directory, and add the following files:

taxonomy_alphanumeric.info

; $Id$
name = Taxonomy alphanumeric
description = Restricts taxonomy terms to alphanumeric characters.
dependencies[] = taxonomy
core = 6.x

taxonomy_alphanumeric.module

<?php
// $Id$

/**
 * Implementation of hook_nodeapi().
 */
function taxonomy_alphanumeric_nodeapi(&$node, $op, $a3 = NULL, $a4 = NULL) {
  if ($op == 'validate') {
    // Replace with the ID of the vocabulary whose terms must be alphanumeric.
    $vid = 1;
    // Nothing to check if no free-tagging terms were submitted for this vocabulary.
    if (empty($node->taxonomy['tags'][$vid])) {
      return;
    }
    $terms = drupal_explode_tags($node->taxonomy['tags'][$vid]);
    foreach ($terms as $term) {
      // Change this regular expression to allow more characters.
      if (preg_match('/[^A-Za-z0-9]/', $term)) {
        form_set_error("taxonomy][tags][$vid", t('The term %term is invalid. Terms may only consist of alphanumeric characters.', array('%term' => $term)));
      }
    }
  }
}

// Note: the ?> here is so the code filter works. Remove this in your actual module.
?>

Note that the code I listed is untested and may not work perfectly, if at all. It should, however, at least give you enough to get started. Feel free to post back to this thread if you encounter any bugs.

nirad

Thanks so much. This works perfectly. I modified it slightly to allow spaces (I can use spaces but not dashes, as dashes substitute for spaces in the URL when the tag is used as an argument).
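For anyone following along, allowing spaces should only require adding a space to the character class in the regular expression. A minimal sketch in plain PHP, with a hypothetical helper name (the modified module code itself was not posted, so this is an assumption about what the change looks like):

```php
<?php
/**
 * Hypothetical helper: TRUE if $term contains only ASCII letters,
 * digits, and spaces.
 */
function example_term_is_valid($term) {
  // The space inside the brackets is the only change from the original pattern.
  return !preg_match('/[^A-Za-z0-9 ]/', $term);
}

var_dump(example_term_is_valid('new york')); // bool(true)
var_dump(example_term_is_valid('new-york')); // bool(false)
```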

I also found a project called tagtrap that does something similar (restricts certain tags from a vocabulary). I am going to ask the maintainer if he wants to add this functionality, otherwise I think this could be submitted as a separate module. From searching I know there are other users who want this functionality.

Thanks again.

-Nirad
