http://www.whatwg.org/specs/web-apps/current-work/multipage/common-input...

Let's add native #pattern FAPI property to provide easy validation Form API fields. Browsers could also use this property to perform client-side validation, but we should ensure it is performed server-side as well.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Dave Reid’s picture

Issue tags: +html5
Dave Reid’s picture

Assigned: Dave Reid » Unassigned
ericduran’s picture

Status: Active » Needs review
FileSize
1.94 KB

Status: Needs review » Needs work

The last submitted patch, 1174766-html5-pattern-property.patch, failed testing.

ericduran’s picture

Ooops.

ericduran’s picture

Status: Needs work » Needs review
ericduran’s picture

Title: Support the #pattern FAPI property for native HTML5 placeholder support » Support the #pattern FAPI property for native HTML5 pattern support

Fixing title.

Everett Zufelt’s picture

Status: Needs review » Needs work

Looks good, noticed one typo:

+function element_validate_patten($element, &$form_state) {

I assume you wanted element_validate_pattern. This needs to be corrected in two places.

ericduran’s picture

Status: Needs work » Needs review
FileSize
1.94 KB
aspilicious’s picture

Status: Needs review » Needs work
+++ b/includes/form.incundefined
@@ -3058,6 +3058,50 @@ function form_process_actions($element, &$form_state) {
+ * Validate a form elements pattern.

"ValidateS" in case this needs a reroll :)

Powered by Dreditor.

ericduran’s picture

Status: Needs work » Needs review
FileSize
1.94 KB
Dave Reid’s picture

1. Do we know why we're using mb_ereg vs preg? From the specs, the JavaScript Pattern production is PERL-based, and for PHP that is the preg* functions.
2. I thought ereg() was deprecated? mb_ereg() is not? That's odd.
3. Should pattern match the value from beginning to end? If so we may need to add boundries to the regex expression.

aspilicious’s picture

1)
Thats for eric :)
2)
deprecated: http://php.net/manual/en/function.ereg.php
not deprectaed: http://php.net/manual/en/function.mb-ereg.php

3)
"This implies that the regular expression language used for this attribute is the same as that used in JavaScript, except that the pattern attribute must match the entire value, not just any subset (somewhat as if it implied a ^(?: at the start of the pattern and a )$ at the end)."

Yes it should match the entire value

ericduran’s picture

Status: Needs review » Needs work

1. Yea, I tried to figure out which one would be closest to the js one -- clearly I #failed.
2. I also found it weird that it wasn't deprecated :-/
3. I think that should probably be left up to the pattern

I'll rerolled a patch with @daves feedback.

ericduran’s picture

Note, my comment on #14 was a cross post with #13. So I didn't see that comment before replying. FYI.

yched’s picture

Subscribe.

Once this is in, we'll want to integrate this to Field API text widgets.

attiks’s picture

Jelle_S’s picture

+++ b/includes/form.incundefined
@@ -3058,6 +3058,50 @@ function form_process_actions($element, &$form_state) {
+    $element['element_validate'][] = 'element_validate_pattern';

missing # before element_validate

22 days to next Drupal core point release.

ericduran’s picture

Issue tags: +HTML5 Sprint: August 2011 - 2

Updating tag

Jacine’s picture

Title: Support the #pattern FAPI property for native HTML5 pattern support » Support the #pattern FAPI property for native HTML5 pattern attribute
Issue tags: -HTML5 Sprint: August 2011 - 2 +Needs tests

I'm pretty sure this will need tests (?), so tagging for that and cleaning up tags.

Also, here's a handy little site with a bunch of examples to test with (for those of us that are regex stupid):

http://html5pattern.com/

Niklas Fiekas’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests
FileSize
3.96 KB
5 KB

Added testcase, fixed #18 and one more typo.

Remarks:

  • We must support other input-elements than only textfields. Not sure if this is included yet, but the test also tries using #1174634: Add new HTML5 FAPI element: telephone. If special code is nescessary to support more element types, we must add it here when they get in first or add it to the elements, if this gets in first.
  • The spec recommends using title="..." on the input fields, explaining the input format. Otherwise it would be barely accessible. So we should make it clear to developers, that attributes => title should be provided.
  • Didn't touch the mb_ereg thing. If we must do that, how can we best replace it with preg_match?
ericduran’s picture

Status: Needs review » Needs work

yea, the patten process should be added to the other elements.

Also as mention in other comments we should switch to using preg_match.

That looks like all that's needed :)

Niklas Fiekas’s picture

Status: Needs work » Needs review
FileSize
1.99 KB
5.67 KB

Using preg_match, added support for tel type element, fixed typo.

ericduran’s picture

So the thing that's weird is that webkit browser do this to the pattern:

    String pattern = "^(" + rawPattern + ")$";

Which is similar to what @Niklas Fiekas has except they added the Parentheses my guess is for some cases we haven't figure out yet not sure. Any comments welcome.

We'll see what the test says. These are the kinds of issues that makes me which something like qunit was in core ;-)

sun’s picture

Status: Needs review » Needs work

The spec recommends using title="..." on the input fields, explaining the input format. Otherwise it would be barely accessible.

#attributes][title or #description. The element is also barely usable for sighted users without explanation of what exactly can be entered or not. The 'title' only appears as tooltip on hover (and hence, not at all for users on touch devices). In turn, I guess we want to recommend a short user input value syntax pointer in #description, preferred over #attributes][title, but suggest the latter for special scenarios where no #description is possible (compact layout/design, floating input elements, tables, etc).

+++ b/core/includes/form.inc
@@ -3119,6 +3119,50 @@ function form_process_actions($element, &$form_state) {
+ * Processes a form elements pattern.
...
+ * Validates a form elements pattern.

1) "#process callback for #pattern form element property."

2) "#element_validate callback for #pattern form element property."

Please also add @see pointers between the two.

+++ b/core/includes/form.inc
@@ -3119,6 +3119,50 @@ function form_process_actions($element, &$form_state) {
+ * @param $element
+ *   An associative array containing the properties and children of the
+ *   form actions container.
...
+ * @param $element
+ *   An associative array containing the properties and children of the
+ *   form actions container.

Stale copypasted phpDoc.

+++ b/core/includes/form.inc
@@ -3119,6 +3119,50 @@ function form_process_actions($element, &$form_state) {
+function element_process_pattern($element, &$form_state) {
...
+function element_validate_pattern($element, &$form_state) {

s/element_/form_/

+++ b/core/includes/form.inc
@@ -3119,6 +3119,50 @@ function form_process_actions($element, &$form_state) {
+    $element['#attributes']['pattern'] = $element['#pattern'];

We need to decide what happens if there is a 'pattern' in #attributes already.

To me it looks like we want to use !isset($element['#attributes']['pattern']) as an additional pre-condition in this #process callback. That is, because this condition would mean that the element only wants a browser-based validation, without server-side validation.

+++ b/core/includes/form.inc
@@ -3119,6 +3119,50 @@ function form_process_actions($element, &$form_state) {
+  if (isset($element['#pattern'])) {

The validate callback should not check whether #pattern is set. If it's invoked, then either the #process callback or something else added it.

+++ b/core/includes/form.inc
@@ -3119,6 +3119,50 @@ function form_process_actions($element, &$form_state) {
+    $value = $element['#value'];

No need for this.

+++ b/core/includes/form.inc
@@ -3119,6 +3119,50 @@ function form_process_actions($element, &$form_state) {
+    if (!mb_ereg($element['#pattern'], $value)) {

1) Use preg_match() instead.

2) As stated and referenced in #13 the actual #pattern value needs to be wrapped:

!preg_match('/^(?:' . $element['#pattern'] . ')$/', $element['#value'])

Furthermore, we need to decide whether to use preg_quote($element['#pattern']) here, or whether we require code to supply properly escaped/quoted regular expressions already. (I guess the latter)

+++ b/core/includes/form.inc
@@ -3119,6 +3119,50 @@ function form_process_actions($element, &$form_state) {
+      form_error($element, t('%name is not valid.', array('%name' => $element['#title'])));

The user-facing error message could be improved à la

"%name field contains invalid characters."

+++ b/core/modules/simpletest/tests/form_test.module
@@ -30,6 +30,13 @@ function form_test_menu() {
+    'access arguments' => array('access content'),

Replace with:

'access callback' => TRUE

+++ b/core/modules/simpletest/tests/form_test.module
@@ -30,6 +30,13 @@ function form_test_menu() {
+    'type' => MENU_CALLBACK,

Can be removed.

+++ b/core/modules/simpletest/tests/form_test.module
@@ -429,6 +436,27 @@ function form_test_limit_validation_errors_form_partial_submit($form, $form_stat
+    '#title' => t('One digit followed by lowercase letters'),
...
+    '#title' => t('Everything except numbers'),
...
+    '#value' => t('Validate'),

1) No need for t() here.

2) Rename the submit button to 'Submit'.

+++ b/core/modules/system/system.module
@@ -361,7 +361,7 @@ function system_element_info() {
-    '#process' => array('ajax_process_form'),
+    '#process' => array('ajax_process_form', 'element_process_pattern'),

This only touches #type textfield?

Niklas Fiekas’s picture

Status: Needs work » Needs review
FileSize
6.69 KB
5.4 KB

Thank you sun.

sun’s picture

Status: Needs review » Needs work
+++ b/core/includes/form.inc
@@ -3119,6 +3119,49 @@ function form_process_actions($element, &$form_state) {
+ *
+ * @return
...
+ *
+ * @return

oink! :) I meant to correct the stale phpDoc @params, not remove them. ;)

+++ b/core/includes/form.inc
@@ -3119,6 +3119,49 @@ function form_process_actions($element, &$form_state) {
+  if (isset($element['#pattern'])) {
+    if (!isset($element['#attributes']['pattern'])) {

The second condition needs to be ANDed to the first. If there's a 'pattern' attribute already, then we can only assume it's different to the one in #pattern, and thus, the server-side validation won't pass.

+++ b/core/includes/form.inc
@@ -3119,6 +3119,49 @@ function form_process_actions($element, &$form_state) {
+ * @see form_process_pattern().

No punctuation in @see

+++ b/core/includes/form.inc
@@ -3119,6 +3119,49 @@ function form_process_actions($element, &$form_state) {
+  if ($element['#value'] != '') {

Needs a type-agnostic match:

$element['#value'] !== ''
+++ b/core/includes/form.inc
@@ -3119,6 +3119,49 @@ function form_process_actions($element, &$form_state) {
+    // The browser expects the slash (/) unescaped, but we need it escaped for
+    // preg_match. Aditionally the whole string must match, so we have to
...
+    $pattern = '/^(?:' . str_replace('/', '\/', $element['#pattern']) . ')$/';

ugh, this doesn't look right to me.

The html5 spec needs to define the escape character for the regular expression somewhere. If it's not defined, then it's a typically a slash.

But that merely means that modules need to properly escape their regexes. The enforced escaping here makes the most simple expressions invalid; e.g.

'\d+' becomes '\\d+'

Just use the code of my previous comment, please.

+++ b/core/modules/simpletest/tests/form.test
@@ -584,6 +584,41 @@ class FormValidationTestCase extends DrupalWebTestCase {
+    $this->drupalPost('form-test/pattern', $edit, t('Submit'));
...
+    $this->drupalPost('form-test/pattern', $edit, t('Submit'));
...
+    $this->drupalPost('form-test/pattern', $edit, t('Submit'));

+++ b/core/modules/simpletest/tests/form_test.module
@@ -429,6 +435,27 @@ function form_test_limit_validation_errors_form_partial_submit($form, $form_stat
+    '#value' => t('Submit'),

Please remove t() here.

+++ b/core/modules/simpletest/tests/form_test.module
@@ -429,6 +435,27 @@ function form_test_limit_validation_errors_form_partial_submit($form, $form_stat
+function form_test_pattern_form($form, &$form_state) {

Super-minor: We usually specify #type in elements first.

Niklas Fiekas’s picture

Status: Needs work » Needs review
FileSize
3.8 KB
5.4 KB

:)

I am not sure how to fix the doxygen thing. What was stale about the old comments?

And:

The html5 spec needs to define the escape character for the regular expression somewhere. If it's not defined, then it's a typically a slash.

But that merely means that modules need to properly escape their regexes. The enforced escaping here makes the most simple expressions invalid; e.g.

'\d+' becomes '\\d+'

Just use the code of my previous comment, please.

That's not entirely true, because we're talking about the forward slash. \d stays \d and the test verifies that. The point is that / is in no way special to the browser, because no / prefix is used. We make it special by doing that and must escape it correctly.

What we need:

Browser preg_match with / prefix
\d \d
/{3} \/{3}
\/foo/ \/foo\/

str_replace(array('\/', '/'), array('/', '\/'), $pattern) would do that pre-escaping, normalizing \/ to / and then escaping all / to \/.

Niklas Fiekas’s picture

Talked with sun in IRC. Using generic form element instead of the c&p form action container now. Pattern escaping should be fine, as well.

sun’s picture

Status: Needs review » Needs work

+++ b/core/includes/form.inc
@@ -3119,6 +3119,59 @@ function form_process_actions($element, &$form_state) {
+    // The browser expects the slash (/) unescaped, but we need it escaped for
+    // preg_match. Aditionally the whole string must match, so we have to
+    // enclose the pattern in ^ and $. We don't have to escape those two
+    // characters, because the browser expects them to be escaped as well.
+    $pattern = '/^(?:' . str_replace(array('\/', '/'), array('/', '\/'), $element['#pattern']) . ')$/';

The str_replace() is what preg_quote() is for.

This also means we can heavily shorten that very long comment.

Niklas Fiekas’s picture

Status: Needs work » Needs review

No, as discussed in IRC:

php > print preg_quote('[a-z]{3}\d');
\[a\-z\]\{3\}\\d
sun’s picture

Hm. That manual, partial escaping of forwardslashes is still debatable.

While I can see the point of wanting to do the str_replace(), programmers don't really know whether their string is going to be PCRE-quoted afterwards, or whether they're responsible for that in the string they're supplying. It's a custom mixture of both, which is a bit weird.

Let's wait for feedback and reviews from others.

Niklas Fiekas’s picture

So ... it's like that:

  • The developer should be able to use whatever pattern is aloud in <input pattern="..." />. That is a Perl-Regex. It must match the whole string, not a part.
  • So how do we do that in PHP? Because it's a Perl-Regex we must use preg_match(). Because it must match the whole string we must enclose it in ^(?: and )$. Any problems?
  • Yes. preg_match() forces us to decide for one magic pattern delimiter to enclose the string in. Which one do we choose?
  • Whatever we choose (and often the default is indeed /), that character is now a special character - and it is not special in the browser. Because the browser must not use preg_match() or a delimiter. So that's the reason why we must escape this one character, nothing else.

This is what I am trying to explain in the comment. We can probably reword that to make it clearer.

sun’s picture

Niklas Fiekas’s picture

Actually it's quite clear, isn't it?

It's just Perl-regexes that we have. \ for escaping. No new special role for any other character.

foo/bar matches foo/bar. / isn't a special character (just for preg_match(), if we use it as a delimiter).
foo\/bar matches foo/bar, just like foo\!bar is the same as foo!bar.

cosmicdreams’s picture

a bit lost with this issue: what more needs to be done?

Dave Reid’s picture

Based on http://html5pattern.com/ it seems that '\' is the control character and I think we should expect people should be writing PRCE expressions and not un-escaped patterns.

Dave Reid’s picture

I hate the W3C. I wish they could clarify the standard to show that the pattern should be in PRCE form which works for both JavaScript and for PHP.

Dave Reid’s picture

Related StackOverflow question: http://stackoverflow.com/questions/9526814/

ericduran’s picture

I actually don't think is that bad.

We just needed to find a character that would work. I took chx answer and added it to this patch.

The only differences is

+    $pattern = chr(1) . '^(?:' . $element['#pattern'] . ')$' . chr(1);

instead of

+     $pattern = '/^(?:' . str_replace(array('\/', '/'), array('/', '\/'), $element['#pattern']) . ')$/';

Uploading here for review.

Also should probable add a test with some random characters.

I which UTF-8 was allowed so we could use a snowman -> ☃

Damien Tournoud’s picture

Status: Needs review » Needs work

HTML5 explicitly states:

If specified, the attribute's value must match the JavaScript Pattern production. [ECMA262]

and ECMA-262 explicitly states (in the description of new Regexp():

The characters / occurring in the pattern shall be escaped in S as necessary to ensure that the String value formed by concatenating the Strings "/", S, "/", and F can be parsed (in an appropriate lexical context) as a RegularExpressionLiteral that behaves identically to the constructed regular expression.

So in HTML5 and ECMA-262, there is no such thing as a "unescaped pattern". The pattern is always escaped, and the escape character is always /.

So as far as I know there is no magic necessary... '/' . $element['#pattern'] . '/' is correct. Also, it is *not* correct to add the ^ prefix and the $ suffix to the regular expression.

What HTML5 says is that:

[If] the element has a compiled pattern regular expression but that regular expression does not match the entirety of the element's value, then the element is suffering from a pattern mismatch.

Which means that we should preg_match() on the value and check if the match is equal to the value.

Damien Tournoud’s picture

So, supposing that browsers do the right thing and accept pattern="a\/b", there is no reason to care that some of them also accept the broken pattern="a/b". The escaped pattern is more restrictive, so there is no issue in enforcing it in Drupal.

Niklas Fiekas’s picture

Status: Needs work » Needs review

The paragraph says what browser should do in their implementation, not what users of the API should do. P is basically the user input, S is what is stored internally, here:

If a SyntaxError exception is not thrown, then:

Let S be a String in the form of a Pattern equivalent to P, in which certain characters are escaped as described below. S may or may not be identical to P or pattern; however, the internal procedure that would result from evaluating S as a Pattern must behave identically to the internal procedure given by the constructed object's [[Match]] internal property.

The characters / occurring in the pattern shall be escaped in S as necessary to ensure that the String value formed by concatenating the Strings "/", S, "/", and F can be parsed (in an appropriate lexical context) as a RegularExpressionLiteral that behaves identically to the constructed regular expression. For example, if P is "/", then S could be "\/" or "\u002F", among other possibilities, but not "/", because /// followed by F would be parsed as a SingleLineComment rather than a RegularExpressionLiteral. If P is the empty String, this specification can be met by letting S be "(?:)".

The following properties of the newly constructed object are data properties with the attributes that are specified in 15.10.7. The [[Value]] of each property is set as follows:

The source property of the newly constructed object is set to S.

Before that:

If the characters of P do not have the syntactic form Pattern, then throw a SyntaxError exception.

If you follow along the syntax tree described before that, you will see that P here can very well be valid with a forward slash.

I.e. if the user gives you '' then store it as '(?:)', if the user gives you 'foo/something' then store it as 'foo\/something'. So that you can later safely use it in the /-delimiter syntax.

Needs review for more discussion.

Damien Tournoud’s picture

Indeed, I read that part wrong. So the content of pattern= is the Pattern parameter of the Regexp object, so it is not escaped. We do need to escape it following the procedure outlined in the spec.

The rest of #41 holds.

Damien Tournoud’s picture

Status: Needs review » Needs work

An alternative would be to use bracket-style delimiters that do not actually need escaping.

$pattern = '{^' . $element['#pattern'] . '$}';

Back to needs work for this and the full-match described in #41.

Niklas Fiekas’s picture

Status: Needs work » Needs review
FileSize
6.15 KB

Thanks to @sun for bringing this up. Thanks to @Davereid for talking about this in IRC, to @Damien Tournoud for finding the related paragraph in ECMA and suggesting bracket-style delimiters - I didn't even know about them. They are awesome here. To @xjm for fixing the docs and the error message. To @chx for discussing whether the brackets around the pattern are needed.

Also: #1174620: Add new HTML5 FAPI element: email got in. Yay! Adding pattern support there, too.

Niklas Fiekas’s picture

Issue tags: -html5

#46: 1174766-html5-pattern-46.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +html5

The last submitted patch, 1174766-html5-pattern-46.patch, failed testing.

Niklas Fiekas’s picture

Status: Needs work » Needs review
FileSize
575 bytes
6.51 KB

Reroll + added pattern support for URLs, since the URL widget got in: #1174630: Add new HTML5 FAPI element: url.

Status: Needs review » Needs work
Issue tags: -html5

The last submitted patch, 1174766-html5-pattern-49.patch, failed testing.

Niklas Fiekas’s picture

Status: Needs work » Needs review
Issue tags: +html5

Dries forgot to commit some files of #1479652: Convert image style config filenames/keys from plural to singular, breaking HEAD. Retesting after jhodgdon reverted that. git apply --index++
#49: 1174766-html5-pattern-49.patch queued for re-testing.

Niklas Fiekas’s picture

cosmicdreams’s picture

Patch from #52 looks good. I just have one question:

What is supposed to happen if $element['#attributes']['pattern'] is set? Is form_process_pattern supposed to be recursive? Seems like everything blows up in that case.

Niklas Fiekas’s picture

@cosmicdreams: Mhh ... no, I am not sure what you mean. Here is testcoverage for that: If #attributes => pattern is given, only client side validation will be performed (if the browser supports it) and #pattern will be ignored.

Niklas Fiekas’s picture

Issue tags: -html5

#54: 1174766-html5-pattern-54.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +html5

The last submitted patch, 1174766-html5-pattern-54.patch, failed testing.

Niklas Fiekas’s picture

Status: Needs work » Needs review
Niklas Fiekas’s picture

sun’s picture

Status: Needs review » Reviewed & tested by the community

Looks good to go for me. Thanks for the hard work here, guys!

ericduran’s picture

Status: Reviewed & tested by the community » Needs work

This is almost perfect. The one thing I can see that's missing is the password element. It also supports the pattern attribute. I know core would probably never use it but I just needed it for a project :)

After that this should be good to go.

Niklas Fiekas’s picture

Status: Needs work » Needs review
FileSize
3.2 KB
8.41 KB

Thanks. Yes ... valid point. Should we add it to password_confirm, too? If yes: Only to the first element, or to both?

Status: Needs review » Needs work

The last submitted patch, 1174766-html5-pattern-61.patch, failed testing.

Niklas Fiekas’s picture

Status: Needs work » Needs review
FileSize
593 bytes
8.41 KB
ericduran’s picture

Status: Needs review » Reviewed & tested by the community

Sweet, nice work everyone :)

catch’s picture

Status: Reviewed & tested by the community » Fixed

Thanks folks. Committed/pushed to 8.x.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Dave Reid’s picture

For anyone interested in the very late follow-up for field widgets: #1942914: Add support for #pattern to relevant Field API widgets. Apologies in advance for the ping on a closed issue. *slaps hand*

jibran’s picture

Title: Support the #pattern FAPI property for native HTML5 pattern attribute » Change notice: Support the #pattern FAPI property for native HTML5 pattern attribute
Category: feature » task
Priority: Normal » Critical
Status: Closed (fixed) » Needs review
Issue tags: +Needs change record

I went ahead and created the change notice http://drupal.org/node/1953036. IMHO it required change notice.
Sorry for the noise.

amateescu’s picture

Title: Change notice: Support the #pattern FAPI property for native HTML5 pattern attribute » Support the #pattern FAPI property for native HTML5 pattern attribute
Category: task » feature
Priority: Critical » Normal
Status: Needs review » Closed (fixed)
Issue tags: -Needs change record

The change notice looks fine to me.