CSS aggregation fails on many variations of @import [#2936067]

Comment	File	Size	Author
#73	2936067-73-backport.patch	925 bytes	pfrenssen
9.2.x: PHP 7.3 & MySQL 5.7 28,647 pass, 1 fail 8.9.x: PHP 7.1 & MySQL 5.6 28,641 pass
#67	2936067-67-backport.patch	2.04 KB	pfrenssen

#59	2936067-backport-to-older-drupal-versions.patch	2.14 KB	pfrenssen

#43	css_aggregation_breaks_import-2936067-43-d7-do-not-test.patch	638 bytes	anrikun

#29	2936067-test-only.patch	7.2 KB	ravi.shankar
9.2.x: PHP 7.3 & MySQL 5.7 28,190 pass, 1 fail
#29	interdiff_27-29.txt	663 bytes	ravi.shankar
#27	2936067-test-only.patch	7.2 KB	eiriksm
9.2.x: PHP 7.3 & MySQL 5.7 28,184 pass, 1 fail
#26	2936067-test-only.patch	7.2 KB	eiriksm
9.2.x: PHP 7.3 & MySQL 5.7 Custom Commands Failed
#25	2936067-25.patch	835 bytes	patrickkrueger
8.9.x: PHP 7.3 & MySQL 5.7 28,638 pass
#15	0001-Issue-2936067-by-ClemensSahs-add-tests.patch	3.24 KB	Clemens Sahs
9.2.x: PHP 7.3 & MySQL 5.7 28,026 pass
#5	2936067-5.patch	829 bytes	DuaelFr
8.4.x: PHP 7 & MySQL 5.5 22,332 pass 9.1.x: PHP 7.4 & MySQL 5.7 27,376 pass

Comment #1

11 January 2018 at 21:47

DuaelFr created an issue. See original summary.

Comment #2

DuaelFr

he/him

French

Montpellier, France

CreditAttribution: DuaelFr as a volunteer and at Happyculture commented 11 January 2018 at 21:49

Issue tags:

+CSS aggregation

Comment #3

agentrickard

he/him

English

Georgia (US)

CreditAttribution: agentrickard at Palantir.net commented 17 January 2018 at 14:48

Status:

Active

» Closed (won't fix)

This is an invalid URL. The ampersand and semicolon are reserved characters with special meaning and need to be encoded.

The & should be recoded as %26 using a function like http://php.net/manual/en/function.urlencode.php

See https://en.wikipedia.org/wiki/Percent-encoding#Percent-encoding_reserved... and https://tools.ietf.org/html/rfc1738

Comment #4

agentrickard

he/him

English

Georgia (US)

CreditAttribution: agentrickard at Palantir.net commented 17 January 2018 at 14:51

The '&' here is being used to pass query arguments, and needs to be an actual & and not &amp;

https://fonts.googleapis.com/css?family=Sedgwick+Ave+Display&subset=lati...

Comment #5

DuaelFr

he/him

French

Montpellier, France

CreditAttribution: DuaelFr as a volunteer and at Happyculture commented 17 January 2018 at 20:58

Status:

Closed (won't fix)

» Needs review

File	Size
2936067-5.patch	829 bytes
8.4.x: PHP 7 & MySQL 5.5 22,332 pass 9.1.x: PHP 7.4 & MySQL 5.7 27,376 pass

@agentrickard it was a real life example. Yes, this one has an easy workaround but it is real and it is working in all browsers so it's strange that Drupal breaks it when aggregation is enabled.

The thing is that, according to RFC 3986, semicolons can be used as "sub-delims". It's not very common but it exists. If you read the "Notes" section of the PHP urlencode function documentation, you'll see that "&", ";" and "&amp;" can all be found as separators.
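
To make the RFC 3986 point concrete, here is a rough sketch in Python (chosen only for illustration; it is not part of any patch). It checks the query string of the Google Fonts URL reported later in this thread against the characters RFC 3986 allows in a query component. The names `UNRESERVED`, `SUB_DELIMS` and `QUERY_ALLOWED` are made up for this example, and `%` is allowed wholesale as a simplification for percent-encoded octets.

```python
import string
from urllib.parse import urlsplit

# Character sets from RFC 3986, section 2.2 and 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
SUB_DELIMS = set("!$&'()*+,;=")  # note that ";" is a sub-delim
# query = *( pchar / "/" / "?" ); "%" stands in for pct-encoded (simplified).
QUERY_ALLOWED = UNRESERVED | SUB_DELIMS | set(":@/?%")

url = "https://fonts.googleapis.com/css2?family=Roboto:wght@400;500;700;900&display=swap"
query = urlsplit(url).query

# Any character in the query that RFC 3986 would not allow unencoded.
illegal = sorted(set(query) - QUERY_ALLOWED)
print(query)
print(illegal)
```

If the reading above is right, `illegal` comes out empty, meaning the unencoded semicolons are legal and the URL does not need to be rewritten before it goes into an @import.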

I wasn't able to start writing a test for \Drupal\Core\Asset\CssCollectionOptimizer yet but I think it's going to be needed.

The attached patch was inspired by \Drupal\Core\Asset\CssOptimizer::processCss() and the new regex can be tested here. It fixes the issue mentioned in the summary.

Comment #6

bradjones1

English

CreditAttribution: bradjones1 at FRUITION for Sport Obermeyer commented 25 June 2020 at 03:28

Version:

8.4.x-dev

» 9.1.x-dev

Indeed this is very much in the wild; e.g., fonts from Google Fonts produce such an embed code.

It is definitely uncommon to see a semicolon in a URL, but my read of RFC 3986 comports with the above comment.

Let's bump the version and see if this still passes.

Comment #7

bradjones1

English

CreditAttribution: bradjones1 at FRUITION for Sport Obermeyer commented 25 June 2020 at 05:17

Pass! I think this just needs a solid addition to the test suite.

Comment #8

proweb.ua CreditAttribution: proweb.ua commented 20 September 2020 at 23:21

#5 works

Comment #9

webdrips CreditAttribution: webdrips commented 24 September 2020 at 21:27

#5 works for me too.

The line that was breaking our site:

@import url('https://fonts.googleapis.com/css2?family=Roboto:wght@400;500;700;900&display=swap');

Comment #10

tanubansal CreditAttribution: tanubansal at Salsa Digital commented 29 September 2020 at 12:28

Tested #5
RTBC + 1

Comment #11

29 September 2020 at 12:28

Version:

9.1.x-dev

» 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Comment #12

Clemens Sahs CreditAttribution: Clemens Sahs commented 9 December 2020 at 08:50

Tested #5
RTBC + 1

needed for 8.x too!

Comment #13

bradjones1

English

CreditAttribution: bradjones1 at FRUITION for Sport Obermeyer commented 9 December 2020 at 19:44

Still needs tests :-)

Comment #14

Cosmin Hodis-Mindras CreditAttribution: Cosmin Hodis-Mindras commented 3 January 2021 at 12:41

Issue summary:

View changes

Tested #5
RTBC + 1

Comment #15

Clemens Sahs CreditAttribution: Clemens Sahs commented 4 January 2021 at 10:06

File	Size
0001-Issue-2936067-by-ClemensSahs-add-tests.patch	3.24 KB
9.2.x: PHP 7.3 & MySQL 5.7 28,026 pass

I think this also works when backported to Drupal 8.x.

Comment #16

bradjones1

English

CreditAttribution: bradjones1 at FRUITION for Sport Obermeyer commented 5 January 2021 at 17:21

Status:

Needs review

» Needs work

+++ b/core/tests/Drupal/Tests/Core/Asset/css_test_files/css_input_with_import.css
@@ -4,6 +4,7 @@
+@import url("https://fonts.googleapis.com/css2?family=Roboto+Mono:wght@300;400&family=Roboto:ital,wght@0,300;0,400;1,300;1,400&display=swap");

Concerned about including an external URL (even from somewhere as "stable" as you'd expect Google Fonts to be) in tests - even if there's a temporary network failure in retrieving, this would be more fragile than we probably need. I imagine a text fixture could serve just as well.

Comment #17

Clemens Sahs CreditAttribution: Clemens Sahs commented 6 January 2021 at 10:32

Quoting #16: "even if there's a temporary network failure in retrieving, this would be more fragile than we probably need."

If we were doing any network interaction, yes, you would be right and I would fully agree.

But in this case (CssOptimizer) we only do simple string manipulation.

By the same argument we would have to edit the following, too.

@import url("http://example.com/style.css");
@import url("//example.com/style.css");

Please correct me if I'm missing something.

Comment #18

bradjones1

English

CreditAttribution: bradjones1 at FRUITION for Sport Obermeyer commented 6 January 2021 at 18:00

Ah, yes, sorry - brain fart. I think then let's just change the fonts.googleapis.com to fonts.example.com, per the RFC on example domains and so as to not "endorse" any particular font vendor by reference?

Comment #19

AndyF CreditAttribution: AndyF at Fabb for FRUITION commented 7 January 2021 at 09:19

Also I think the regex as it stands might fail to match some (edge!) cases correctly?

@import "cu'st;om.css";
@import 'cus(t;om.css';
@import 'cus)t;om.css';
@import 'cus(t);om.css';
@import url(http://example.com/cu'st;om.css);
@import url(http://example.com/cus(t;om.css);
@import url(http://example.com/cus)t;om.css);
@import url(http://example.com/cu(s)t;om.css);

See https://regex101.com/r/97dCNQ/2.

Thanks!

Comment #20

11 January 2021 at 04:49

bradjones1 opened merge request !241

Comment #21

bradjones1

English

CreditAttribution: bradjones1 at FRUITION for Sport Obermeyer commented 11 January 2021 at 05:12

Status:

Needs work

» Needs review

Hi Andy!

You raise a good point - in recent years internationalization (IRIs) and things like funky Google Fonts URLs have clouded the question of "what is a valid URL" to the point where it's rather difficult to regex it. I went back and looked at the current code, which basically boils down to matching any text starting with @import through to a semicolon:

https://regex101.com/r/ybSIaY/1/

As you can see, this matches all the existing test cases but incorrectly truncates the URL containing a semicolon.

I have updated this regex to '/(*ANYCRLF)(*BSR_ANYCRLF)@import.*;(\R|$)/i', which now matches from the start of the import statement to the end of the line (or to the end of the file if, for instance, the file doesn't end with a newline; this shouldn't happen per the spec but might be worth trying to catch anyway).

https://regex101.com/r/5YE5Rr/1

So far it looks like this helps clean things up, passes the expanded test coverage and doesn't have any regression on the existing aggregation unit test. The thing this will not capture is any import statements spanning multiple lines. I imagine this was the intent of the [^;] in the existing regex. AFAICT this is not a common pattern and I'm not quite sure there's a way to write a regex that does all the things we need here and handles the newlines in a platform agnostic way, but I could be wrong. We don't have this multiline condition in the current test coverage and honestly I'm not sure it's valid per the CSS spec, so this may be worth a possible regression to get Google Fonts with semicolons working, which is a way more common condition.
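
For readers who want to see the trade-off described above, here is a small Python sketch. It is an approximation under stated assumptions: `LEGACY` is my reading of the existing "from @import to the first semicolon" behaviour, and `LINE_BASED` is only a stand-in for the PCRE pattern quoted above, because Python's `re` module has no `\R` or `(*ANYCRLF)` and so the line ending is spelled out in a lookahead. The `fonts.example.com` URL is a placeholder.

```python
import re

# Assumed shape of the existing pattern: stop at the first semicolon.
LEGACY = re.compile(r"@import[^;]+;", re.IGNORECASE)
# Stand-in for the line-based idea: run to the last semicolon on the line.
LINE_BASED = re.compile(r"@import.*;(?=\r?\n|$)", re.IGNORECASE)

css = (
    '@import url("https://fonts.example.com/css2?family=Roboto:wght@400;700&display=swap");\n'
    '@import "local.css";\n'
    "body { color: red; }\n"
)
legacy_matches = LEGACY.findall(css)    # first entry is cut off after "wght@400;"
line_matches = LINE_BASED.findall(css)  # both statements come back whole

# The possible regression: an import statement spread over two lines.
multiline = '@import url("a.css")\n  screen;\n'
print(LEGACY.findall(multiline))      # still matched, because [^;] crosses newlines
print(LINE_BASED.findall(multiline))  # no match, because "." stops at the newline
```

This is only meant to illustrate the behaviour being discussed, not to stand in for the merge request: the line-based form fixes the semicolon truncation and gives up multi-line statements in exchange.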

Comment #22

11 January 2021 at 06:26

bradjones1 opened merge request !242

Comment #23

bradjones1

English

CreditAttribution: bradjones1 at FRUITION for Sport Obermeyer commented 11 January 2021 at 14:42

Status:

Needs review

» Needs work

Comment #24

bradjones1

English

CreditAttribution: bradjones1 at FRUITION for Sport Obermeyer commented 11 January 2021 at 15:44

Does the CSS collection optimizer not have tests? Looks like it might not?

Comment #25

patrickkrueger

Berlin

CreditAttribution: patrickkrueger at patrickkrueger.com commented 28 January 2021 at 13:43

File	Size
2936067-25.patch	835 bytes
8.9.x: PHP 7.3 & MySQL 5.7 28,638 pass

~~Rerolling 2936067-5.patch against Drupal 8.9.x~~

Update: 2936067-5.patch is applying successfully against Drupal 8.9.x – there was some misconfiguration in our build process causing this irritation.

Comment #26

eiriksm

he/him

Norwegian Bokmål

Norway

CreditAttribution: eiriksm at Violinist, Foreningen Drupal Norge, Ny Media AS commented 18 March 2021 at 18:26

Status:

Needs work

» Needs review

File	Size
2936067-test-only.patch	7.2 KB
9.2.x: PHP 7.3 & MySQL 5.7 Custom Commands Failed

Not sure how polite it is to just take over someone's merge request, but here is a test-only patch that fails for me.

I kept the changes to the optimized css files, even though we could use css strings directly. Seems practical, if we changed something that in turn ended up changing those, somehow.

Comment #27

eiriksm

he/him

Norwegian Bokmål

Norway

CreditAttribution: eiriksm at Violinist, Foreningen Drupal Norge, Ny Media AS commented 19 March 2021 at 07:27

File	Size
2936067-test-only.patch	7.2 KB
9.2.x: PHP 7.3 & MySQL 5.7 28,184 pass, 1 fail

ok here is one without that phpcs error

Comment #28

19 March 2021 at 08:27

Status:

Needs review

» Needs work

The last submitted patch, 27: 2936067-test-only.patch, failed testing. View results

Comment #29

ravi.shankar CreditAttribution: ravi.shankar at OpenSense Labs commented 25 March 2021 at 09:57

Status:

Needs work

» Needs review

File	Size
interdiff_27-29.txt	663 bytes
2936067-test-only.patch	7.2 KB
9.2.x: PHP 7.3 & MySQL 5.7 28,190 pass, 1 fail

Just fixed failed test of patch #27.

Comment #30

25 March 2021 at 10:59

Status:

Needs review

» Needs work

The last submitted patch, 29: 2936067-test-only.patch, failed testing. View results

Comment #31

bradjones1

English

CreditAttribution: bradjones1 at FRUITION for Sport Obermeyer commented 26 March 2021 at 22:32

Title:	CSS aggregation fails if an @import contains a semicolon	» CSS aggregation fails on many variations of @import
Issue summary:	View changes

Comment #32

bradjones1

English

CreditAttribution: bradjones1 at FRUITION for Sport Obermeyer commented 26 March 2021 at 22:41

Issue summary:

View changes

Updating the title to handle some additional edge cases that are material to our handling of @import; might as well knock them out since this ticket has expanded to cover the collection optimizer, as well.

Comment #33

bradjones1

English

CreditAttribution: bradjones1 at FRUITION for Sport Obermeyer commented 30 March 2021 at 02:15

Status:

Needs work

» Needs review

Would be interesting to see if this resolves any of the referenced issues; some propose changes in behaviour rather than bugfix but this is the only one with tests AFAIK.

Comment #34

30 March 2021 at 02:15

Version:

9.2.x-dev

» 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Comment #35

majorrobot CreditAttribution: majorrobot at CivicActions for Global Game Jam commented 6 July 2021 at 17:41

The patch in #25 worked for us. Would be great to see this merged!

Our pre-aggregated CSS:
@import url(https://fonts.googleapis.com/css2?family=Nunito+Sans:wght@200;400;700&display=swap);@import url(https://fonts.googleapis.com/css2?family=Noto+Sans+TC:wght@400;700&display=swap);

And after aggregation:
@import url(https://fonts.googleapis.com/css2?family=Nunito+Sans:wght@200;@import url(https://fonts.googleapis.com/css2?family=Noto+Sans+TC:wght@400;

It appears the optimizer saw the semicolon, stopped, and went to the next rule.
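
The before/after above can be reproduced outside Drupal with a few lines of Python. This is a sketch under one assumption: that the optimizer extracts imports with a pattern equivalent to `@import[^;]+;` (the name `LEGACY_IMPORT` is made up here). It is not the actual Drupal code path.

```python
import re

# Assumed equivalent of the pre-patch extraction pattern.
LEGACY_IMPORT = re.compile(r"@import[^;]+;", re.IGNORECASE)

css = (
    "@import url(https://fonts.googleapis.com/css2?family=Nunito+Sans:wght@200;400;700&display=swap);"
    "@import url(https://fonts.googleapis.com/css2?family=Noto+Sans+TC:wght@400;700&display=swap);"
)

# What gets hoisted as "the imports": each one ends at its first semicolon.
hoisted = "".join(LEGACY_IMPORT.findall(css))
# What is left behind in the stylesheet body after the imports are stripped.
leftover = LEGACY_IMPORT.sub("", css)

print(hoisted)
print(leftover)
```

With that assumption, `hoisted` is exactly the "after aggregation" string quoted above, and `leftover` holds the orphaned tails of both URLs, which matches the observation that the optimizer stops at the first semicolon and moves on.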

Comment #36

bradjones1

English

CreditAttribution: bradjones1 for Sport Obermeyer commented 6 July 2021 at 20:41

@majorrobot if you think it's RTBC can you mark it as such?

Comment #37

majorrobot CreditAttribution: majorrobot at CivicActions for Global Game Jam commented 6 July 2021 at 20:44

@bradjones1 I haven't verified the tests. I'm not clear if that is a blocker to RTBC?

Comment #38

bradjones1

English

CreditAttribution: bradjones1 for Sport Obermeyer commented 6 July 2021 at 23:01

@majorrobot Not sure what you mean by "verified" but if you have functionally validated this (your comment above) and taken a look at the patch and think it's ready for framework maintainers to review, RTBC it.

Comment #39

majorrobot CreditAttribution: majorrobot at CivicActions for Global Game Jam commented 7 July 2021 at 16:15

Status:

Needs review

» Reviewed & tested by the community

Thanks @bradjones1. Done!

Comment #40

bradjones1

English

CreditAttribution: bradjones1 for Sport Obermeyer commented 7 July 2021 at 16:22

Issue tags:

-Needs tests

Comment #41

szeidler CreditAttribution: szeidler at Ramsalt Lab commented 7 July 2021 at 18:38

We're using the merge request changes successfully on a couple of projects, to make the Google Fonts import work as expected.

Comment #42

alexpott

he/they

English

🇪🇺🌍

CreditAttribution: alexpott at Acro Commerce, Thunder commented 9 July 2021 at 08:39

Priority:	Normal	» Major
Status:	Reviewed & tested by the community	» Needs work

I think this is a pretty major bug: if CSS aggregation is failing on snippets provided by Google Fonts, that's quite unexpected.

While reviewing the code I think there is one part of the change that's not necessary - or if it is then it needs test coverage to prove it.

Comment #43

anrikun CreditAttribution: anrikun commented 4 August 2021 at 08:27

File	Size
css_aggregation_breaks_import-2936067-43-d7-do-not-test.patch	638 bytes

A quick fix for D7 users (port of patch at #25)

Comment #44

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 17 September 2021 at 13:18

Issue tags:

+Needs reroll

This needs a reroll now that #2669074: Convert file_create_url() & file_url_transform_relative() to service, deprecate it is in.

Comment #45

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 17 September 2021 at 13:46

Version:	9.3.x-dev	» 9.2.x-dev
Issue tags:	-Needs reroll

This is a bugfix so it can stay on 9.2.x for the time being. Then we have no merge conflict since #2669074: Convert file_create_url() & file_url_transform_relative() to service, deprecate it is a feature request for 9.3.x.

Comment #46

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 17 September 2021 at 19:39

I have updated the test so that inline and external imports are now mixed. I will now revert to the original approach using preg_replace() to address comment #2936067-42: CSS aggregation fails on many variations of @import.

Comment #47

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 18 September 2021 at 08:23

Status:

Needs work

» Needs review

OK it is working as expected. Remark #42 is addressed. Ready for next review!

Comment #48

idimopoulos CreditAttribution: idimopoulos for European Commission and European Union Institutions, Agencies and Bodies commented 21 September 2021 at 06:23

Status:

Needs review

» Reviewed & tested by the community

It worked fine for us. Remarks addressed. RTBC +1.

Comment #49

lauriii

he/him

Finnish

Finland

CreditAttribution: lauriii at Acquia commented 1 October 2021 at 11:14

Status:

Reviewed & tested by the community

» Needs work

@nod_ pointed out #19, which led to discovering that there are at least some advanced variations where the previous regex would have recognized an import statement but the new regex wouldn't:

@import url(http://example.com/cus(tom.css);
@import url(http://example.com/cu(s)tom.css);

Discussed with @nod_ and @alexpott the use cases where the new regex will recognize invalid CSS syntax. We thought that would be acceptable because that's essentially a pre-existing problem, and this issue won't make it worse. Also, the new regex decreases the likelihood of syntax errors created by the CSS aggregator.

Comment #50

bradjones1

English

CreditAttribution: bradjones1 at Not Vanilla, Inc. commented 1 October 2021 at 17:00

@lauriii thanks for the review. Can you be more specific what you mean by the "new" regex? I had proposed a change to the regex in https://git.drupalcode.org/project/drupal/-/merge_requests/241/diffs?com... which I think later got backed out. Can you be more specific what you think needs to be changed here? It's not clear to me from your comment. Do we also need to include any additional test (not valid, but common mistake) patterns you mention, a la those in #19? Thanks.

Comment #51

lauriii

he/him

Finnish

Finland

CreditAttribution: lauriii at Acquia commented 2 October 2021 at 08:39

I was referring to the regex that has been changed as a new regex since the pattern changed significantly from the regex that we use currently. I'm sorry because that could have been clearer.

I think what needs to be done is add test coverage that proves that the two examples I posted in #49 work with the changes.

Comment #52

bradjones1

English

CreditAttribution: bradjones1 at Not Vanilla, Inc. commented 2 October 2021 at 23:38

Version:	9.2.x-dev	» 9.3.x-dev
Status:	Needs work	» Needs review

OK thanks for the clarification.

See https://regex101.com/r/QuK3Pp/1 which is based on the set presented in #19 but with an updated regex to:

/@import\s*(?:url\(\s*)?[\'"]?([^\'"\]+)([\'")]+\)?.*;)/ig

The difference being the removal of the U (ungreedy) flag as well as removing ( and ) as invalid characters inside of the URL.

Pushed this in https://git.drupalcode.org/project/drupal/-/merge_requests/241/diffs?com...

I'll admit it's been a while since I last looked at this issue so I'm not sure if making this greedy will have undesired effects, but I suppose that's what the test suite is for!

Rebased to 9.3.x.

Comment #53

bradjones1

English

CreditAttribution: bradjones1 at Not Vanilla, Inc. commented 2 October 2021 at 23:50

Status:

Needs review

» Needs work

Realized I included some new URL patterns to the test but these also need to be included in the aggregated fixtures.

Comment #54

AndyF CreditAttribution: AndyF at Fabb commented 3 October 2021 at 11:44

Thanks for all your work on this @bradjones1!

The difference being the removal of the U (ungreedy) flag as well as removing ( and ) as invalid characters inside of the URL.

This is probably a silly question, but if we have a greedy .*; on the end, does the new regex buy us anything over just matching @import.+;?

Comment #55

nod_

French

Lille

CreditAttribution: nod_ at Acquia commented 3 October 2021 at 12:29

Seems a regex will be tricky, we need to catch those things to have a complete solution:
https://developer.mozilla.org/en-US/docs/Web/CSS/string#syntax
https://developer.mozilla.org/en-US/docs/Web/CSS/url()#values

Comment #56

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 4 October 2021 at 08:06

Looks like the regex posted above by @bradjones1 in #52 covers those edge cases?

https://regex101.com/r/QuK3Pp/1

Is there a case that is not covered by this example? I see that both the regular strings and url() declarations are covered with semicolons and parentheses inside them.

Comment #57

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 4 October 2021 at 08:10

OK I see that the escaped quotes were not covered, but they are working with the same regex. Here I have added additional test cases for escaped quotes and double quotes as described on https://developer.mozilla.org/en-US/docs/Web/CSS/string#syntax

https://regex101.com/r/ThvAX6/1

Comment #58

nod_

French

Lille

CreditAttribution: nod_ at Acquia commented 4 October 2021 at 08:19

Oh nice :) Regex seems to have problems when several @import statements are on the same line
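
A quick way to see the same-line problem, sketched in Python with two deliberately simple patterns (`GREEDY` and `LAZY` are illustrative names, not the patterns from the merge request, and `fonts.example.com` is a placeholder):

```python
import re

GREEDY = re.compile(r"@import.+;", re.IGNORECASE)  # runs to the LAST ";" on the line
LAZY = re.compile(r"@import.+?;", re.IGNORECASE)   # stops at the FIRST ";"

# Minified CSS: two imports and a rule, all on one line.
css = (
    '@import "a.css";'
    '@import url("https://fonts.example.com/css2?family=Roboto:wght@400;700");'
    "body{margin:0;}"
)

greedy_matches = GREEDY.findall(css)
lazy_matches = LAZY.findall(css)

print(len(greedy_matches))  # one match that swallows both imports and part of the rule
print(lazy_matches)         # two matches, the second cut off inside the URL
```

Neither form is right for minified input: the greedy one merges everything up to the last semicolon on the line, and the lazy one reintroduces the original truncation. That is the tension the following comments try to resolve.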

Comment #59

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 4 October 2021 at 08:53

File	Size
2936067-backport-to-older-drupal-versions.patch	2.14 KB

Since this has been rebased on 9.3.x the patch no longer applies on 9.2.x and below. Here is a quick reroll of the patch for people that need to unbreak their sites right now.

This patch doesn't include the test coverage but can be applied on a wide range of Drupal versions: 8.8.x, 8.9.x, 9.0.x, 9.1.x, 9.2.x and 9.3.x with varying degrees of offset.

Comment #60

AndyF CreditAttribution: AndyF at Fabb commented 4 October 2021 at 09:37

Looks like the regex posted above by @bradjones1 in #52 covers those edge cases?

I wonder if it's unnecessarily complicated though? It mostly optionally matches at the front and then has a .*; at the end. If we remove the optional parts, I think the regex reduces to @import[^\'"\]+)([\'")]+.*; (https://regex101.com/r/ESu2qg/1). At that point it's not clear to me that it offers anything over a simple @import.+; (https://regex101.com/r/EfhPdD/1). Sorry if I'm missing something already discussed!

Comment #61

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 4 October 2021 at 10:04

Neither of those work with inline statements as mentioned by @nod_ in #58.

The inlining makes it a whole lot more complex. I have been experimenting with it but I'm stumped on the parentheses and semicolons that can be part of a URL inside url(), for example @import url(http://example.com/cu(s)t;om.css);. I have no idea how we can reliably detect the ending parentheses and semicolon.

This is already detecting inline @import statements that do not use url():

https://regex101.com/r/Az82S0/1

Comment #62

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 4 October 2021 at 10:30

OK we need to narrow down what is considered to be a valid URL. It is not fully documented on https://developer.mozilla.org/en-US/docs/Web/CSS/url()

I've been experimenting in the Firefox CSS engine, and these URLs are NOT valid:

url(http://example.com/cu'stom.css);
url(http://example.com/cu"stom.css);
url(http://example.com/cus(t;om.css);
url(http://example.com/cus)t;om.css) ;
url(http://example.com/cu(s)t;om.css);
url(http://user:pass@example.com/cu(s)t;om.css);
url (http://example.com); - note the space between "url" and the opening parenthesis. This is invalid

To make the above valid they can either be wrapped in quotes or escaped with backslashes. These test cases are valid:

url( " http://user:pass@example.com/c'u(s)t;o\"m.css " ) ;
url( ' http://user:pass@example.com/c"u(s)t;o\'m.css ' ) ;
url( http://user:pass@example.com/c\'u\(s\)t;om.css ) ;

This makes a lot more sense :)

Comment #63

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 4 October 2021 at 10:49

I have added the url() matching on quoted and double quoted strings: https://regex101.com/r/nNj8Sn/1

Still missing is the quoteless but escaped case url( http://user:pass@example.com/c\'u\(s\)t;om.css ) ;

I have no more time unfortunately to continue on this today.

Comment #64

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 4 October 2021 at 11:14

I settled on this /@import\s*(?:(['"])?(?:\\\1|.)*\1.*|url\(\s*(?:(['"])?(?:\\\2|.)*\2|(?:\\[\)\'\"]|[^'")])*)\s*\).*);/gUi

Passes all the test cases, on separate lines or inline: https://regex101.com/r/X6B10k/1

Comment #65

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 4 October 2021 at 12:49

I am getting errors when trying it in a real site :-/

On PHP 8.0 this throws JIT stack limit exhausted. It works though when I disable pcre.jit and increase the pcre.backtrack_limit to 10000000 but that is not a good idea.

I'm guessing it is caused by having 2 option groups in combination with non-greedy matching. However this seems to be necessary, we need to be able to identify matching quote pairs.

Possibly we can greatly reduce the number of recursions if we split this in 2 separate regexes? One for the string URLs and one for the url() pattern?

Comment #66

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 4 October 2021 at 14:22

I did some more testing, but matching quoted URLs using option groups and backtracking is out of the question. It works fine on individual test cases, but it is too heavy when testing with 100 kB minified CSS files.

There are 5 possible cases of import statements. I added a simple regex for each case and execute them in series. This is fast and memory efficient even with large files.

We still need to repair the tests and extend them with some of the cases discovered above.
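
To illustrate the "one simple regex per case, run in series" idea, here is a hedged Python sketch. The five patterns below are hypothetical and written for this example; they are not the ones in the merge request. Each body uses the form "escaped character or one ordinary character", which avoids backreferences and nested backtracking. Matches are sorted by offset afterwards so that running the patterns one after another does not reorder the imports.

```python
import re

# Hypothetical per-case patterns, not the ones from the merge request.
TAIL = r"[^;]*;"  # optional media query list, then the closing semicolon
CASES = [
    r"@import\s*'(?:\\.|[^'\\])*'",                      # 'file.css'
    r'@import\s*"(?:\\.|[^"\\])*"',                      # "file.css"
    r"@import\s*url\(\s*'(?:\\.|[^'\\])*'\s*\)",         # url('...')
    r'@import\s*url\(\s*"(?:\\.|[^"\\])*"\s*\)',         # url("...")
    r"""@import\s*url\(\s*(?:\\.|[^'"()\\\s])*\s*\)""",  # url(unquoted, escapes allowed)
]
PATTERNS = [re.compile(case + TAIL, re.IGNORECASE) for case in CASES]


def find_imports(css):
    """Run each pattern in turn, then restore source order by match offset."""
    found = []
    for pattern in PATTERNS:
        for match in pattern.finditer(css):
            found.append((match.start(), match.group(0)))
    return [text for _, text in sorted(found)]


# Three imports and a rule on a single line, using cases from this thread.
css = (
    r"@import url(https://fonts.example.com/css2?family=A:wght@200;400&display=swap);"
    r'@import "cu\"st;om.css";'
    r"@import url( http://user:pass@example.com/c\'u\(s\)t;om.css ) ;"
    r"body{margin:0;}"
)
for statement in find_imports(css):
    print(statement)
```

In this sample the double-quoted pattern fires before the unquoted url() one, so without the sort the second statement would be listed first. Whether the real implementation needs such a step depends on how it reassembles the imports.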

Comment #67

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 4 October 2021 at 14:28

File	Size
2936067-67-backport.patch	2.04 KB

Backport for Drupal 9.2.x and below.

Comment #68

nod_

French

Lille

CreditAttribution: nod_ at Acquia commented 4 October 2021 at 14:44

I like the solution, this might change the source order of @import though, depending on what is matched when.

Comment #69

nod_

French

Lille

CreditAttribution: nod_ at Acquia commented 4 October 2021 at 14:48

Looks like we can combine things:

@import\s*(\'(?:\\\'|.)*\'|"(?:\\"|.)*"|url\(\s*(?:\\[\)\\\'\"]|[^\'")])*\s*\)|url\(\s*\'(?:\\\'|.)*\'\s*\)|url\(\s*"(?:\\"|.)*"\s*\)).*;

Comment #70

nod_

French

Lille

CreditAttribution: nod_ at Acquia commented 5 October 2021 at 10:31

Looks like it's working. One fail; I haven't looked to see whether the test or the code is "wrong".

Comment #71

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 6 October 2021 at 14:02

Assigned:

Unassigned

» pfrenssen

I'm going to work on this for a few hours, take a look at the test coverage. I will unassign as soon as I am ready.

I will also create a downloadable patch compatible with older D9 versions to help out @ultrabob.

Comment #72

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 6 October 2021 at 18:58

There is a bug in the regex, it is not detecting some import statements correctly, such as the following:

@import url(http://example.com/c\"us\(t;o\'m.css);
@import "cu\"st;om.css";
@import url( http://user:pass@example.com/c\'u$s$t;om.css ) ;

I'll see if I can reconstruct it from the 5 separate ones. It looks like something went wrong, possibly in the escaping of the quotes. Perhaps we should use NOWDOC to define the expression.
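
To make the three failing statements concrete, here is a minimal Python sketch that matches them with an escape-aware pattern. It is an illustration only: it is not the regex from the patch, it uses Python's `re` rather than PHP's PCRE, and the names `IMPORT_RE` and `find_imports` are made up for this example. The idea is that a backslash always consumes the character after it, so an escaped quote, parenthesis or semicolon cannot end the match early.

```python
import re

# Hypothetical, simplified pattern (NOT the one committed to Drupal core).
# Each branch treats "backslash + any character" as a single unit.
IMPORT_RE = re.compile(r'''
    @import\s*
    (?:
        "(?:\\.|[^"\\])*"             # double-quoted string
      | '(?:\\.|[^'\\])*'             # single-quoted string
      | url\(\s*
          (?:
              "(?:\\.|[^"\\])*"       # url("...")
            | '(?:\\.|[^'\\])*'       # url('...')
            | (?:\\.|[^\\)'"\s])*     # unquoted url(...)
          )
        \s*\)
    )
    [^;]*;                            # optional media query, then the terminator
''', re.VERBOSE)

# The three statements reported above, plus an ordinary rule.
CSS = r'''
@import url(http://example.com/c\"us\(t;o\'m.css);
@import "cu\"st;om.css";
@import url( http://user:pass@example.com/c\'u$s$t;om.css ) ;
body { color: red; }
'''


def find_imports(css):
    """Return every @import statement found in the given CSS text."""
    return [m.group(0) for m in IMPORT_RE.finditer(css)]


for statement in find_imports(CSS):
    print(statement)
```

The sketch only checks that each statement is recognised as a whole; it does not try to reproduce the rest of the aggregation logic.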

Comment #73

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 6 October 2021 at 20:59

Assigned:

pfrenssen

» Unassigned

File	Size
2936067-73-backport.patch	925 bytes
9.2.x: PHP 7.3 & MySQL 5.7 28,647 pass, 1 fail 8.9.x: PHP 7.1 & MySQL 5.6 28,641 pass

4 files were hidden/shown/deleted

File	Size
2936067-test-only.patch	7.2 KB
9.2.x: PHP 7.3 & MySQL 5.7 28,190 pass, 1 fail
css_aggregation_breaks_import-2936067-43-d7-do-not-test.patch	638 bytes

2936067-backport-to-older-drupal-versions.patch	2.14 KB

2936067-67-backport.patch	2.04 KB

Stopping for today. Using NOWDOC is indeed a good way to define the regular expression. It is also possible to define it in quoted form, but this involves tricky triple backslashes. If we keep it as a NOWDOC, it can also be trivially copied and pasted into an online regex builder tool for future work. Otherwise we'll have to strip out the escapes again.
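
The trade-off between a NOWDOC and a quoted string can be sketched with a rough Python analogy. This is an assumption for illustration: Python raw strings are not the same thing as PHP NOWDOC and the escaping rules differ in detail, but the effect is similar. A raw literal can be pasted into a regex tool as is, while an ordinary literal needs every regex backslash doubled. The names `RAW`, `QUOTED` and `matches_quoted` are hypothetical.

```python
import re

# Raw literal: backslashes reach the regex engine exactly as typed,
# so the pattern can be copied to and from a regex tool unchanged.
RAW = r'"(?:\\.|[^"\\])*"'

# Ordinary literal: every backslash meant for the regex engine has to be
# escaped once more for the string parser, which is easy to get wrong.
QUOTED = '"(?:\\\\.|[^"\\\\])*"'


def matches_quoted(text):
    """Return True if text is one double-quoted string with escapes."""
    return re.fullmatch(RAW, text) is not None


# Both literals describe the same pattern; only the source form differs.
assert RAW == QUOTED
```

Under these assumptions the raw form is the one that stays readable when the pattern grows.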

While working on the tests I noticed that there are some bugs in the code that strips out duplicate whitespace. There are some regexes that will detect quoted strings, but they get tripped up by unquoted import statements that contain escaped quotes. For example, something like @import url(some\'file.css); will trip up whitespace deduplication in the rest of the line. This is out of scope for this issue though. I will file a follow-up.
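
The effect described above can be reproduced with a toy whitespace collapser. This is a hypothetical Python sketch, not Drupal's actual optimizer code, and the function names are invented for the example. The naive version protects quoted strings but does not know about backslash escapes, so the escaped quote in the unquoted URL is taken for the start of a string.

```python
import re

# Naive tokenizer: a quoted string, or a run of whitespace.
NAIVE = re.compile(r"""'[^']*'|"[^"]*"|\s+""")

# Escape-aware tokenizer: "backslash + character" is consumed first,
# so an escaped quote can never open or close a string.
ESCAPE_AWARE = re.compile(
    r"""\\.|'(?:\\.|[^'\\])*'|"(?:\\.|[^"\\])*"|\s+"""
)


def _collapse(pattern, css):
    # Replace whitespace runs with one space; keep every other token as is.
    return pattern.sub(
        lambda m: " " if m.group(0).isspace() else m.group(0), css
    )


def collapse_naive(css):
    return _collapse(NAIVE, css)


def collapse_escape_aware(css):
    return _collapse(ESCAPE_AWARE, css)


css = r"@import url(some\'file.css);  a   {  content: 'x  y'; }"
print(collapse_naive(css))
print(collapse_escape_aware(css))
```

In the naive run the text between the escaped quote and the next real quote is treated as a string, so the whitespace after the import is left alone while the whitespace inside the genuine 'x  y' string is collapsed. The escape-aware run keeps the string intact and collapses only the whitespace outside it.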

Here is also a version of the patch without test coverage for people who want to use this on older versions of Drupal 8/9.

Comment #74

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 6 October 2021 at 21:25

Created followup for the whitespace deduplication issue: #3241196: Whitespace deduplication tripped up by escaped quotes in URLs

Comment #75

pfrenssen

Sofia

CreditAttribution: pfrenssen at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented 7 October 2021 at 12:59

Status:

Needs work

» Needs review

Awesome, looking good from my side. I also tested it using a real-life project that was affected by this bug and it works fine now. No more problems with the stack limit. Setting back to NR.

Comment #76

nod_

French

Lille

CreditAttribution: nod_ at Acquia commented 7 October 2021 at 13:30

Same, I'm happy with it now too.

Comment #77

7 October 2021 at 14:01

Status:

Needs review

» Needs work

The last submitted patch, 73: 2936067-73-backport.patch, failed testing. View results

Comment #78

nod_

French

Lille

CreditAttribution: nod_ at Acquia commented 7 October 2021 at 14:10

Status:

Needs work

» Needs review

Comment #79

nod_

French

Lille

CreditAttribution: nod_ at Acquia commented 11 October 2021 at 14:53

Looks good to me. We need someone to RTBC this :)

Comment #80

ultrabob

CreditAttribution: ultrabob commented 12 October 2021 at 08:37

Status:

Needs review

» Reviewed & tested by the community

I spun up a copy of a site that is known to have this issue, upgraded it to 9.3.x, and confirmed that CSS was aggregating and the issue persisted. I then applied the patch; CSS was still aggregating and the problem was fixed.

I went through the code as well, and while the regex is a monster that I'm not sure I fully grasped, the code and tests both look good.

Comment #81

alexpott

he/they

English

🇪🇺🌍

CreditAttribution: alexpott at Acro Commerce, Thunder commented 12 October 2021 at 10:56

Issue summary:

View changes

Updating the link in the IS to the current regex and using @pfrenssen's examples...

Comment #82

alexpott

he/they

English

🇪🇺🌍

CreditAttribution: alexpott at Acro Commerce, Thunder commented 12 October 2021 at 11:06

Sorted out commit credit. I credited people who reviewed the patch and commented on the code via gitlab. I didn't credit @ravi.shankar because their fix was not a fix. I didn't credit @patrickkrueger since the upload was for Drupal 8.9 and the patch for that existed and applied already. And I didn't credit @anrikun because D7 patches need a separate issue as per policy.

Comment #83

alexpott

he/they

English

🇪🇺🌍

CreditAttribution: alexpott at Acro Commerce, Thunder commented 12 October 2021 at 11:21

Version:	9.3.x-dev	» 9.2.x-dev
Status:	Reviewed & tested by the community	» Fixed

Committed 7374845 and pushed to 9.3.x. Thanks!
Committed 5aa154d and pushed to 9.2.x. Thanks!

I backported the change and fixed the conflict in core/tests/Drupal/Tests/Core/Asset/CssOptimizerUnitTest.php and ran all the tests in core/tests/Drupal/Tests/Core/Asset - everything looks good.

Comment #84

12 October 2021 at 11:22

alexpott committed 7374845 on 9.3.x

Issue #2936067 by pfrenssen, bradjones1, nod_, eiriksm, DuaelFr, Clemens...

Comment #85

12 October 2021 at 11:22

alexpott committed 5aa154d on 9.2.x

Issue #2936067 by pfrenssen, bradjones1, nod_, eiriksm, DuaelFr, Clemens...

Comment #86

26 October 2021 at 11:24

Status:

Fixed

» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.
