After some CSS files containing a charset declaration are processed by CSS aggregate and compression - Safari will ignore everything after the @charset "UTF-8";. This break all my themes and should be fixed in core very soon.

This piece of code fixes the problem, by completely removing all charset declarations. After CSS compression the declaration is not really required, while only English chars are inside the CSS. Before compression you may have documentation containing German Sonderzeichen or other non English chars inside the file and they are going to be destroyed by Editors that do not understand the file encoding without charset line, like suxxx Eclipse.

However - it is possible that such a tag is inside a CSS file and this is allowed by reference and therefor this is a critical bug.

Example solution for drupal_build_css_cache:

// Remove charset declarations for standards compliance (and Safari problems)
$regexp = '/@charset[^;]+;/i';
$data = preg_replace($regexp, '', $data);

I'm not a regexp guru and tried some hours to get this inside the other replacement rules... maybe someone else more familiar with regexp is able to integrate? then i will be able to create a better patch...

The above should be integrated into this replacement rule

    $data = preg_replace('<
      \s*([@{}:;,]|\)\s|\s\()\s* |  # Remove whitespace around separators, but keep space around parentheses.
      /\*([^*\\\\]|\*(?!/))+\*/ |   # Remove comments that are not CSS hacks.
      [\n\r]                        # Remove line breaks.
      >x', '\1', $data);
CommentFileSizeAuthor
#12 common_21.patch971 byteshass
#10 common_20.patch986 byteshass
#4 common_19.patch633 byteshass
#1 common_18.patch861 byteshass

Comments

hass’s picture

Status: Active » Needs review
StatusFileSize
new861 bytes

this could be a patch with the above example

Steven’s picture

I'm a bit on the fence about this. This patch doesn't actually add support for @charset definitions; it still requires all CSS to be UTF-8 encoded. So the only reason anyone would use @charset in a Drupal-related CSS file was if their editor was buggy, and the 'standards compliance' argument is a bit off.

In any case, the specs say that @charset must appear at the very beginning of the CSS file (no exceptions), while currently, the regexp scans all CSS styles. If you change your global replacement to an anchored regexp per file (don't forget @imports), you'd only have to inspect a fraction of the CSS data that the patch does now. Since the regexp itself wouldn't change, it'd still be cacheable by PCRE.

hass’s picture

Drupal is UTF-8 only, isn't it? I have some @charset with UTF8 on top of the CSS files, but other people may have CP1252, UTF16, UTF32 or something else to make sure their documentation is editable and readable in browsers. Whatever there will be on the end of the day, the CSS declaration themself without comments doesn't require anthing more then ISO 8859-1, isn't it? And this is what we get after we removed all comments.

Only outstainding problem are lines of @charset "UTF-8"; that maybe inside the compressed file more then once, what will create big problems in Safari and is not standard compliant. So the compressor does not create standard compliant code or am i wrong?

I'm a bit on the fence about this. This patch doesn't actually add support for @charset definitions; it still requires all CSS to be UTF-8 encoded. So the only reason anyone would use @charset in a Drupal-related CSS file was if their editor was buggy, and the 'standards compliance' argument is a bit off.

Should we add charset support? For Drupal we know the file is UTF8 so it shouldn't be a problem to add this line on top... The Standard Argument is that Themes like Zen or YAML may have such lines in the CSS.

In any case, the specs say that @charset must appear at the very beginning of the CSS file (no exceptions), while currently, the regexp scans all CSS styles. If you change your global replacement to an anchored regexp per file (don't forget @imports), you'd only have to inspect a fraction of the CSS data that the patch does now. Since the regexp itself wouldn't change, it'd still be cacheable by PCRE.

i know there is much more work required then my patch does, but this requires big changes to the compression function. It should follow all @imports and keep them sorted in the original sort order and so on... there is another patch in http://drupal.org/node/113607 that tries to solve this, but haven't solved it yet and it seems like noone is working on it :-(((.

The big question for me is therefor - should i wait for this bigger patch to get in or not - or add only one line to get this important problem solved.

hass’s picture

StatusFileSize
new633 bytes

This patch integrates 100% in the current regex and is therefor better.

Steven’s picture

Status: Needs review » Needs work

That regexp is buggy. Whenever it encounters a sequence @charset, the first OR member (remove charsets) will fail, while the second OR member will match (remove whitespace around operators) since it starts earlier in the string.

Such a sequence can occur when concatenating CSS ending in a space with CSS beginning with a @charset rule.

Steven’s picture

Ack, codefilter trims. The sequence is " @charset", with leading space.

hass’s picture

do you know how to fix this? i'm not the big regex guru and found this line works for me...

drumm’s picture

Maybe do a separate regexp instead of cramming it into the existing one?

hass’s picture

Status: Needs work » Needs review

the patch in #1 doing this and is a well working solution...

hass’s picture

StatusFileSize
new986 bytes

@Steven: is this the patch you like more? this could be more speedy, while it searches in every file and not on the final big one.

drumm’s picture

Priority: Critical » Normal
Status: Needs review » Needs work

I think Steven meant you should check only the beginning of each file since we can assume it does not appear elsewhere.

There is no need for a separate $regexp variable that is only used once.

hass’s picture

Status: Needs work » Needs review
StatusFileSize
new971 bytes

hopefully, this patch is correct...

1. reworked regex with the http://www.w3.org/TR/CSS21/syndata.html #4.4 rules
2. added start of line character

I tested to add more than one @charset line in one file and the invalid extra lines stay after compression - except the valid first line in the very beginning of the file that has been removed correctly. So i assume the regex reads only the first line now and nothing more, but i'm not sure about this.

hass’s picture

Someone found a few minutes to review this small, but important patch, please?

david strauss’s picture

Version: 5.1 » 5.2
Status: Needs review » Reviewed & tested by the community

This patch fixes the Safari CSS aggregation issue for me. Thanks hass!

drumm’s picture

Version: 5.2 » 6.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

Committed to 5.x.

A updated patch needs to be made for 6.x.

hass’s picture

gábor hojtsy’s picture

Version: 6.x-dev » 5.x-dev
Status: Patch (to be ported) » Fixed

Ported in the other issue. Setting this back to fixed and for 5.x.

Anonymous’s picture

Status: Fixed » Closed (fixed)