Right now, core uses form_clean_id() in a few places to clean perspective class names before inserting them into the html. But form_clean_id() is the wrong function to clean class names because it enforces a uniqueness to them by appending _0, _1, etc if form_clean_id() has been previously called with the same input.
http://www.w3.org/TR/CSS21/syndata.html#characters shows the syntax for valid class names and IDs.
In CSS, identifiers (including element names, classes, and IDs in selectors):
- can contain only the characters [a-zA-Z0-9] (U+0030 - U+0039, U+0041 - U+005A, U+0061 - U+007A) and ISO 10646 characters U+00A1 and higher, plus the hyphen (U+002D) and the underscore (U+005F)
- cannot start with a digit, or a hyphen followed by a digit.
- can also contain escaped characters and any ISO 10646 character as a numeric code (see next item). For instance, the identifier "B&W?" may be written as "B\&W\?" or "B\26 W\3F".
I'm all for ignoring the 3rd part of that spec (too messy), but we should definitely implement the first 2 parts.
In addition to the validity of classes, we should also look to Drupal’s code style. We currently don't use underscores in class names, so this new function should replace them with dashes by default, but allow callers to relax that restriction if they have a use case (like integrating with 3rd party code.)
Comment | File | Size | Author |
---|---|---|---|
#53 | 464862-53-drupal-css-class.patch | 6.51 KB | JohnAlbin |
#51 | 464862-51-drupal-css-class.patch | 6.54 KB | JohnAlbin |
#50 | 464862-50-drupal-css-class.patch | 6.35 KB | JohnAlbin |
#47 | b69d64b4.patch | 14.03 KB | kkaefer |
#43 | b69d64b4.patch | 11.13 KB | kkaefer |
Comments
Comment #1
sunsubscribing
Handle renaming of form_clean_id() in here as well?
Comment #2
JohnAlbinSure! +1 to consistent api names.
Comment #3
dvessel CreditAttribution: dvessel commentedsubscribing
Comment #4
JohnAlbinHere's a completely untested patch (I mean it!) that implements a rough-draft approach to drupal_css_class() and drupal_css_id().
Comments/improvements on the implementation are appreciated.
Comment #6
JohnAlbinbah. \u is only supported in Perl and not PHP. :-p
New patch attached. Still mostly untested.
Comment #8
JohnAlbinMissed some simpletests for drupal_css_id().
I still need to add some tests that verify that drupal_css_class() works properly.
Comment #9
JohnAlbinre-rolled.
Comment #10
JohnAlbinHere's what this patch does:
The patch is now feature complete. But implementation details are still open for debate, as always! :-)
Comment #11
JacineSome of the code in the patch is over my head, but based on the summary in #10, +1 ;)
Comment #12
JohnAlbinAfter reviewing this, I realized that
drupal_css_id()
should have the same params asdrupal_css_class()
. Also, usingdrupal_css_class()
as a helper function fordrupal_css_id()
started to feel kind of weird, so I've introduced a proper helper function,drupal_clean_css_identifier()
.So that we can get some more reviews of these 3 new functions (from people who don't necessarily know how to apply patches), here's the docblock docs for them:
drupal_css_id()
is almost identical to the oldform_clean_id()
but with 2 lines changed as I noted in #10 above.Comment #13
Zarabadoo CreditAttribution: Zarabadoo commentedThis all looks good to me in terms of the final goal. I can definitely see myself using this set of functions in the future if it is implemented.
Comment #14
dvessel CreditAttribution: dvessel commented@JohnAlbin, do you think there would be any performance issues with preg_replace being hit so often? Having valid output is great but are we being overly cautious? The only time core uses preg_replace is for an attribute string AFAIK is in cleaning out the url to be converted to a body class. It never seemed to be an issue for form_clean_id(). Anything invalid was predictable and we filtered for that.
About the use of the $coding_standard parameter. Instead of using a binary switch, what if you could pass in a string or an array to be filtered.
example:
With that, you can have any string converted not just underscores. The default behavior would result in the same output. It's just more flexible.
This could possibly replace the need for preg_replace() by feeding the filter with expected "invalid" values as they are encountered.
What the id cleaner could look like:
Lastly, is $safe_prefix necessary? I've never encountered a situation where I had to be careful of an invalid prefix but maybe others have.
Comment #15
sunWe badly and definitely need this.
However, some clean-up is necessary:
'class' is special-cased here, but $attributes can as well contain a CSS ID. We want to ensure both are sanitized.
Additionally, that isset() and following unset() is a bit clumsy. Please move that as checks for the two special values in $key into the foreach.
That is a pretty nifty idea. The parameter/variable name might use a shorter name though. And a link to the handbook page where these are defined would be great (if existent).
Please remove this entire option. With or without it, developers and themers won't get the expected result, so there is no point in putting extra logic in there. We just want to strip anything invalid at the beginning in all cases.
A negated, positive char list would seem more natural to me here, no? I.e.
...
Should use strtr() here.
That probably needs another preg_replace(), i.e.
However, in general, I'm not completely sold on the idea that Drupal should all of this extra work due to sluggish developers. AFAIK, Drupal does not do this elsewhere. CSS IDs were the only special case, because those can't appear twice on a page - while a single developer is not able to control with other CSS IDs are on a page.
So. Bearing that, I would suggest to remove that extra validation for starting numbers/dashes.
The remaining clean-up of form_clean_id() looks just awesome!
Comment #16
sunI wonder whether we should move the code from #550572: CSS+JS regressions related to form-item-[name] into drupal_css_class() after that has been committed.
Extract:
The not-so-obvious trick lies in ] being replaced with an empty string, and spaces, underscores, and [ being replaced with hyphens. That means:
book_parent[bid][title] becomes book-parent-bid-title
Comment #17
JohnAlbinHmm… I really don’t know.
That's true, but the end of your pattern doesn't work.
\x{00A1}-
doesn't mean character A1 and higher, it means\x{00A1}
and-
. That was why I had originally written it the more awkward way. But thinking about this again, we could use\x{00A1}-\x{FFFF}
.Really? They are functionally equivalent in this instance, but strtr is slower. See http://www.simplemachines.org/community/index.php?topic=175031.0;wap2 Oh, wait. I guess not when using single characters. See http://www.cznp.com/blog/3/strtr-vs-str_replace-a-battle-for-speed-and-d... too. :-p So confusing! I'll replace it with strtr().
Yeah, ok. I agree. I tend to get all “feature complete-y” when coding. :-)
Ok, that makes me think we should implement Joon’s idea of a $filter parameter using that format.
This new patch simplifies
drupal_css_class()
anddrupal_css_id()
. They only have the $identifier paramter and always use Drupal’s coding standards as a filter. If someone needs to implement a non-Drupal, 3rd-party class or ID, they can access the underlieingdrupal_clean_css_identifier()
which uses the $filter parameter.Also, I realized we can't really run drupal_clean_css/id from inside
drupal_attributes()
. Because that would prevent developers from being able to go around Drupal coding standards if they needed to. So we just need to run drupal_css_class/drupal_css_id before strings are added to a $attributes array.Comment #19
JohnAlbinFixed broken test.
Comment #20
sunComma before i.e.
We should consider to remove some of those auto-generated IDs. They cannot be used safely anyway. And since we have shiny new + clean form element CSS classes now, they are even more obsolete.
Different issue though. ;)
We no longer wrap these in t(). Also, missing trailing comma after last element.
Beer-o-mania starts in 4 days! Don't drink and patch.
Comment #21
JohnAlbinRe-rolled with Sun's suggestions.
Comment #22
sunThanks! Let's get this wonderful patch in! :)
Comment #23
pwolanin CreditAttribution: pwolanin commentedtagging
Comment #25
lilou CreditAttribution: lilou commentedHEAD is broken.
Comment #26
webchickThat bit about formatting stuff according to Drupal coding standards is cruft, IMO. Let's strip it out; this is the domain of Coder module. Core doesn't babysit broken code.
It would also be good to get some before/after benchmarks for that preg_replace().
Other than that, this looks great!
Comment #27
JohnAlbinAfter talking to webchick in the hallway, I think she was reviewing the patch in #12. Yes, the $coding_standards parameter in that patch was goofy. That's why the patch in #21 nixed it.
Setting back to RTBC to let webchick have another look.
Comment #29
JohnAlbin“Failed: Failed to run tests.”? Hmm… the patch in #21 still applied (with fuzz).
Re-rolling and letting the testbot have another try.
Comment #30
sunI confirm this is RTBC.
Comment #32
sunStraight re-roll.
Comment #35
dawehnerHere was the problem.
Comment #36
sunyay, thanks! :)
Reverting status.
Comment #37
sunTagging.
Comment #38
sun*bump*
Comment #39
sunIs it because of the missing benchmarks, maybe?
Attached. No noticeable difference.
Comment #40
Dries CreditAttribution: Dries commentedReviewed. Looks good and convenient. Committed to CVS HEAD. Thanks all!
Comment #41
kkaefer CreditAttribution: kkaefer commentedMinor nitpick: It's technically not a "CSS" class or ID but rather a "HTML" class or ID, so drupal_css_class() and drupal_css_id() are kind of a misnomer. IDs and classes are not inherently tied to CSS but are HTML attributes (http://www.w3.org/TR/html4/struct/global.html#h-7.5.2).
Comment #42
sun@kkaefer: Your nitpick makes sense -- do we need to fix it or would the current function names be more grokable than drupal_html_class() and drupal_html_id()?
Comment #43
kkaefer CreditAttribution: kkaefer commentedPatch changes drupal_css_* to drupal_html_*
Comment #44
Damien Tournoud CreditAttribution: Damien Tournoud commentedSupporting that.
Comment #45
sunBut why did we leave out drupal_clean_css_identifier() ?
Comment #46
RobLoachSeems reasonable.
Comment #47
kkaefer CreditAttribution: kkaefer commentedalso rename drupal_clean_css_identifier().
Comment #48
RobLoachYup! Test bot likes it too.
Comment #49
webchickHm. I could go either way on this. I like "css" being in the name since that's a good clue to people about when to use those functions, but at the same time kkaefer's right that technically these are not inherent at all to CSS and can be used for other things as well.
I decided to go with accuracy/future-proofing and committed to HEAD.
Needs documentation (again). :)
Comment #50
JohnAlbinWell, if we’re going to nitpick… ;-)
Agreed.
However, (annoyingly) the HTML spec has different constraints on what are valid classes or IDs than what the CSS spec has. The CSS spec is described in this issue’s original description above.
The HTML spec for HTML identifiers shows that:
So, IMO, we have a few reasonable solutions to this:
I'm leaning to #2 and, since its easy to roll a patch for it, I've done so. But, consensus would be nice. My patch also fixes up some comments/naming in the tests.
Comment #51
JohnAlbinHere's a new patch with an update to testDrupalHTMLId() that adds a "Strip invalid characters" check.
Comment #53
JohnAlbinFixed the testDrupalHTMLId() test to work with the updated drupal_html_id().
Comment #54
RobLoachReasonable!
In summary:
Comment #55
webchickHaha. You guys are hilarious. :)
Cool, works for me. Committed to HEAD once more. Marking needs work for documentation.
Comment #56
sun.
Comment #57
jhodgdonUmmm... Looks like it needs documentation in the module update guide? changing tag scheme.
Comment #58
jhodgdonOK. I'm looking through the 3 patches above that were committed. What I think happened in this issue:
First patch http://drupal.org/files/issues/drupal.css-class.33.patch
a) form_clean_id() [D6 function] was removed and replaced by drupal_css_id()
b) drupal_css_class() was added.
c) drupal_clean_css_identifier() was added.
Second patch http://drupal.org/files/issues/b69d64b4_1.patch
a) drupal_clean_css_identifier -> drupal_clean_html_identifier [later reverted see below]
b) drupal_css_class -> drupal_html_class
c) drupal_css_id -> drupal_html_id
Third patch http://drupal.org/files/issues/464862-53-drupal-css-class.patch
a) drupal_clean_html_identifier -> drupal_clean_css_identifier
Net result:
a) form_clean_id() [D6 function] -> drupal_html_id()
b) drupal_css_class() was added.
c) drupal_clean_html_identifier() was added.
(a) needs to be documented in the Update Guide. Actually, it was documented, but they didn't get the later change, so I fixed it:
http://drupal.org/update/modules/6/7#form_clean_id
Comment #61
codebymikey CreditAttribution: codebymikey at Zodiac Media commentedCurrently using a govuk based theme, and it requires support for the exclamation mark within the class names.
Attaching a relevant patch here for reference purposes.