drupal_load_stylesheet_content() removes whitespace improperly, creating invalid CSS
rfay - May 26, 2009 - 03:48
| Project: | Drupal |
| Version: | 6.x-dev |
| Component: | base system |
| Category: | bug report |
| Priority: | critical |
| Assigned: | Unassigned |
| Status: | closed |
| Issue tags: | CSS aggregation, Quick fix |
Description
drupal_load_stylesheet_content(), in common.inc line 2574, is in charge of aggregating and compressing CSS. It attempts to remove whitespace, but goes too far.
The CSS fragment:
div.this
div.is
div.an
div.error { display: inline; }is compressed completely inappropriately into
div.thisdiv.isdiv.andiv.error{display:inline;}where it should have been
div.this div.is div.an div.error {display:inline;}The attached patch attempts to solve this by first replacing newlines with spaces. After that the whitespace can be compressed.
This bug also exists in Drupal 6.x
| Attachment | Size | Status | Test result | Operations |
|---|---|---|---|---|
| css_compression_patch.patch | 689 bytes | Idle | Failed: Failed to install HEAD. | View details | Re-test |

#1
according to the CSS specs:
So separating the selectors by just a newline is invalid CSS IMO, so I don't think we should fix this...
#2
This example is not proper for a comma-separated list (sharing the same specification). This is three selectors declaring specificity - only the innermost matches.
div.this div.is div.an div.error
It would match only
<div class="this"> <div class="is"><div class="an"><div class="error">This DIV only would be matched</div></div></div></div>#3
I apologize, the indentation used in the example made me think of multiple selectors. You're right, this does look like an error.
#4
Very simple patch, explanation is good, marking rtbc.
#5
Hm. I'd feel a lot more comfortable committing this if we had tests to prove that we don't break something else in the meantime.
#6
Here is a test for that section of the code. It's a pretty simple approach, but may be adequate for now.
Essentially, this test just takes and loads a set of CSS stylesheets and then compares the result (optimized or not) against a static set of expected results.
The function drupal_load_stylesheet_content() does two basic things: Importing @imports, and compressing the whitespace out of the results (of $optimize is set). So this test attempts to deal with that.
The bad thing that I don't know how to get past with this approach: This bug is about inappropriate removal of linefeeds (which are valid whitespace). So I haven't figured out how to write a test which will work *before* the patch and then also work *after* it, since the patch deliberately changes the output (by replacing linefeeds with spaces instead of removing them). So I'm open to suggestions on this.
Can anyone suggest an approach that will test the old (broken) code as valid and then still test the new (patched) code as valid? I know that one of the objectives of this exercise is to make sure we're not breaking anything existing. On the other hand, writing a test to validate broken code is wrong.
#7
We should be able to use a small core CSS file for testing - no need to duplicate the whole of Garland's here. Not sure what you mean by testing the old broken code as valid - ideally the test should fail without your patch, and pass with it.
#8
From http://www.w3.org/TR/CSS2/syndata.html#whitespace
The patch above is unlikely to handle "form feed" or 0x0c properly.
#9
@catch: the CSS input file could definitely be simpler. However, I'm attempting to catch a variety of problems using a very simple technique. Using a more significant and diverse CSS file is more likely to show unintended change. Since I'm not attempting real validation (which would require parsing both files and comparing them semantically), this technique at least tries to get results on a more significant level.
To reiterate the problem with creating an appropriate test: With this simple test technique, this test only validates that the code behaves the same as it used to (static input producing the same static output). Thus, when you change how the code behaves (by handling linefeeds as whitespace instead of deleting them), the test will fail after the patch is applied. Since I think webchick's idea was to create a test which would validate that the patch was not breaking anything new, this won't accomplish that goal. Because the test will break when the patch is applied.
#10
Per @heine's comment I revisited the patch and re-studied the preg expression.
Since \s does cover formfeed, newlines, tabs, and spaces, the patch can be expressed more simply and with fewer changes and probably no side-effects at all.
The attached re-rolled patch does nothing except to delete the portion of the regex that was removing \r\n. Since they're legitimate whitespace, that's the error in this; they can't be removed. They *could* be optimized further, but that should probably be addressed separately.
#11
After discussion with webchick, I've rerolled this
#12
The last submitted patch failed testing.
#13
Submitting for re-testing.
#14
Now has tests, so I removed the "Needs test" tag.
#15
Pushing to critical as this can break some CSS files.
Thank you for noticing this. I like what you're doing here but the optimized CSS files now have a bunch of line breaks that they don't need to have. Before patch:
.field .field-label,.field .field-label-inline,.field .field-label-inline-first{font-weight:bold;}.node-unpublished{background-color:#fff4f4;}.preview .node{background-color:#ffffea;}After patch:
.field .field-label,.field .field-label-inline,.field .field-label-inline-first{font-weight:bold;}
.node-unpublished{background-color:#fff4f4;}.preview .node{background-color:#ffffea;}
As you can see, we have two line breaks here that we don't need to have. Is there anyway we could remove the line breaks, but add an extra space instead for the use case you provided so that the CSS still stays valid?
#16
@Rob Loach: Thanks for testing.
The reality is that linefeeds are valid whitespace in CSS, so removing them breaks the spec. In an earlier version of the patch, I replaced them with spaces, but decided later that that added risk to the patch.
Since we're only doing a minimal big of CSS compression here, without parsing the CSS, it's my opinion that the linefeeds, as valid CSS, must be treated with respect and not removed.
#17
That sounds reasonable. This is aggregation, not compression. It seems like if we want to compress it further, we'd want a pluggable framework for the CSS. Setting it to RTBC!
#18
My nitpicking side got the better of me.
#19
The last submitted patch failed testing.
#20
OK, one? more time?
This patch has the same fix, no change, to the bug at hand. It has not changed since #11.
What has changed:
Hoping we can put this to bed. It's just one line of code!
#21
Wheee!
#22
Wow, sorry. It's apparently taken me a very long time to get back to this issue! :(
I've reviewed this and mostly found minor formatting stuff, and a couple of things that look like bugs.
@rfay, could you explain this?
We should maybe have a reference off to said spec in the PHPDoc comments of drupal_load_stylesheet_content() because atm output like:
...color:#494949;}.this+.is
+.a
+.test{font:1em/100%...
...looks like a bug in our optimizer.
Here's the review. Note that since this doesn't change APIs, I probably won't get back to it again before code freeze. But it's marked as a critical bug so it will get fixed before D7 ships.
---
+++ includes/common.inc 17 Jul 2009 05:43:44 -0000@@ -2654,8 +2654,7 @@
\s*([@{}:;,]|\)\s|\s\()\s* | # Remove whitespace around separators, but keep space around parentheses.
- /\*([^*\\\\]|\*(?!/))+\*/ | # Remove comments that are not CSS hacks.
- [\n\r] # Remove line breaks.
+ /\*([^*\\\\]|\*(?!/))+\*/ # Remove comments that are not CSS hacks.
Looks like that # comment needs to be indented to be inline with the others.
+++ modules/simpletest/tests/common.test 17 Jul 2009 05:43:45 -0000@@ -282,6 +282,47 @@
+ * CSS Unit Tests
Comments need to end in a period.
+++ modules/simpletest/tests/common.test 17 Jul 2009 05:43:45 -0000@@ -282,6 +282,47 @@
+ 'name' => 'CSS Unit Testss',
"Tests" ;)
+++ modules/simpletest/tests/common.test 17 Jul 2009 05:43:45 -0000@@ -282,6 +282,47 @@
+ /**
+ * Tests basic CSS loading with and without optimization (drupal_load_stylesheet())
+ */
Indentation is off here. Must end with a period as well.
+++ modules/simpletest/tests/common.test 17 Jul 2009 05:43:45 -0000@@ -282,6 +282,47 @@
+ // Array of files to test. Original = .css,
+ // unoptimized expected output = .css.unoptimized.css,
+ // optimized expected = .css.optimized.css
+ // They live in the simpletest/files/css_test_files directory
Indentation/spacing is off here. These comments should be indented to the same width as the body of the function.
That said, these are probably slightly too detailed comments (it's spending a lot of time explaining stuff that's easily deduceable by reading the code). I'd go with something like:
// Test with two sample CSS files; one using the @import tag, and one without.
+++ modules/simpletest/tests/common.test 17 Jul 2009 05:43:45 -0000@@ -282,6 +282,47 @@
+ $unoptimized_output = drupal_load_stylesheet_content($content, FALSE);
+
+ $expected = file_get_contents("$path/$file.unoptimized.css");
Trailing whitespace in the middle there.
+++ modules/simpletest/files/css_test_files/css_input_with_import.css 1 Jan 1970 00:00:00 -0000@@ -0,0 +1,34 @@
+/**
+ * @file Basic css that uses @import
+*/
Indentation off here.
+++ modules/simpletest/files/css_test_files/css_input_with_import.css 1 Jan 1970 00:00:00 -0000@@ -0,0 +1,34 @@
+body {
+ margin: 0;
+ padding: 0;
+ background: #edf5fa;
+ font: 76%/170% Verdana, sans-serif;
+ color: #494949;
+}
...
+textarea, select {
+ font: 1em/160% Verdana, sans-serif;
+ color: #494949;
+}
Is this test data relevant at all? If not, let's remove it. Keep the test data as simple as possible.
+++ modules/simpletest/files/css_test_files/css_input_without_import.css.unoptimized.css 1 Jan 1970 00:00:00 -0000@@ -0,0 +1,6 @@
+/* $Id: style.css,v 1.57 2009/05/31 00:51:41 webchick Exp $ */
+
+/**
+ * @file Basic css that does not use
+}
+
Is this file intentionally all messed up? I can't see how that could possibly be valid CSS.
If it is intentional, please add more comments here so someone doesn't try and "fix" it later.
+++ modules/simpletest/files/css_test_files/css_input_without_import.css 1 Jan 1970 00:00:00 -0000@@ -0,0 +1,33 @@
+/**
+ * @file Basic css that does not use @import
+*/
Indentation.
+++ modules/simpletest/files/css_test_files/css_input_with_import.css.unoptimized.css 1 Jan 1970 00:00:00 -0000@@ -0,0 +1,34 @@
+/**
+ * @file Basic css that uses @import
+*/
+
+
+
+
Indentation.
Also, what's up with all of those blank lines? Part of the test case? If so, maybe document that in the @file comment? I could see someone "fixing" this later.
23 days to code freeze. Better review yourself.
#23
@webchick, thanks for the detailed review! You are a stickler!
First, you asked me to explain:
That's the essence of what this patch is about. The old code assumed that any linefeeds or carriage returns could just be gratuitously removed. In reality, they are whitespace per the spec (http://www.w3.org/TR/CSS2/syndata.html#whitespace), so a return between two parts of a CSS selector is as valid as a space. That's why the classic example is
.this
.is
.an
.example { display: none; }
which in CSS terms is completely equivalent to
.this .is .an .example { display: none; }
and is the type of case that the old code broke. (It "compressed" the code into .this.is.an.example { display: none })
Second, I believe that the formatting issues you pointed out are resolved in the attached patch.
Third, I want to mention that the test developed here is intended to be a generic and extensible test of CSS aggregation. It's not just a test for this bug, although I did try to slip that in. That's the reason for some of the extra comments, for example. It's also the reason there's some generic CSS in the test files - it's not just about this bug.
Fourth, the truncated test css file that you noticed was another bug. I've never been much of a user of @import, so I just assumed everything was fine. But our CSS aggregator unfortunately eats "@import" wherever it appears. Since it was in the comment at the top of the file, and didn't have the file argument, our aggregator just ate the rest of the file. When I sorted this out thanks to your keen eye, I posted #544568: CSS aggregation attempts to process @import in comment about it. It's relatively minor, but more of an indication that our CSS techniques in general are pretty crude.
In this case I took the (I think) fairly reasonable step of removing the word "@import" from the css comment, #544568: CSS aggregation attempts to process @import in comment has nothing to do with the issue we're working on here.
Fifth, I did have to make a couple of changes to the test code. It now calls drupal_load_stylesheet() instead of drupal_load_stylesheet_content(), which doesn't handle @import correctly. I switched back to DrupalWebTestCase from DrupalUnitTestCase as the @import didn't work correctly at all in the unit test environment.
Anyway, thanks. This sure is an exhausting one-line fix :-)
#24
The last submitted patch failed testing.
#25
#26
The last submitted patch failed testing.
#27
OK, here's the patch re-rolled after the commit of #584370: Enable Optimize CSS files on performance page & Drupal dies. I did scale back the test to only test the issue in this bug, rather than trying for more. No change the the actual fix at all, nor has there been forever. Let's get this in and get a few more people out of unpredictable css hell. This needs to go into 6.x too, which is where I originally studied it.
#28
Found a couple other small indentation problems. Other then that it looks really good. Here's the updated patch.
#29
This looks great. I keep wanting to remove the extra blank linkes in test.css - but it's a test, so it ought to be realistic.
#30
Great, thanks for the fixes. Let's maybe tweak our whitespace algorithm in the future so that it strips those because I agree it does look very odd.
Committed to HEAD. Marking needs porting to 6.x, although it's possible one of the other patches has it; I kinda lost track.
#31
Doing what I said I was doing. :P
#32
The test bot doesn't like this now it's committed - 1 fail being reported from 'CSS unit tests' for every new test result.
#33
I can reproduce locally.
#34
The first lines of the files do not match:
< /* $Id: css_input_without_import.css,v 1.1 2009/10/05 02:48:38 webchick Exp $ */---
> /* $Id: css_input_without_import.css.unoptimized.css,v 1.1 2009/10/05 02:48:38 webchick Exp $ */
#35
Something like this will get the test passing.
#36
That makes sense, looks a bit odd but I can't think of anything better either, and it's an odd bug in the first place.
#37
Almost. The file name was changed in one line but not the other.
However, that test needs an overhaul anyway. So take this.
#38
Wow, that's totally bizarre. I have no idea why this wasn't failing tests while it was in the queue. Sorry, guys. :(
Anyway, committed to HEAD.
#39
As an added bonus, Sun switched it to a DrupalUnitTestCase, which will improve testing performance! YAY!
#40
Backport to D6
#41
This is the same one we already hashed through for D7, and we should get it into D6.
#42
Still yup.
#43
Committed to Drupal 6 too, thanks.
#44
Automatically closed -- issue fixed for 2 weeks with no activity.