Closed (fixed)
Project:
Parallel CSS - AdvAgg Plugin
Version:
6.x-1.x-dev
Component:
Code
Priority:
Normal
Category:
Feature request
Assigned:
Unassigned
Reporter:
Created:
8 Jun 2011 at 17:34 UTC
Updated:
26 Jun 2011 at 03:41 UTC
Sweet project! Have you thought about moving over to use the CDN module?
Comments
Comment #1
chriscalip commented
Well, I thought about it, and this one really complements the CDN module. CDN rewrites the links inside the HTML page, and this module (in conjunction with advagg) rewrites the links in the aggregated CSS files.
And I could use some opinions on this:
I could be wrong about this, but the CDN admin panel at admin/settings/cdn/details has the logic set up by file type. Usually people would do:
CDN Mapping:
http://img1.drupal.org|jpg
http://img2.drupal.org|gif
http://img3.drupal.org|png
http://img4.drupal.org|ico
This module, on the other hand, only has the sequential replacement part of the logic:
http://img1.drupal.org
http://img2.drupal.org
http://img3.drupal.org
http://img4.drupal.org
So, with a data set of:
background:url('sites/all/themes/do/1.png');
background:url('sites/all/themes/do/2.png');
background:url('sites/all/themes/do/3.png');
background:url('sites/all/themes/do/4.png');
background:url('sites/all/themes/do/5.png');
background:url('sites/all/themes/do/6.png');
background:url('sites/all/themes/do/7.png');
background:url('sites/all/themes/do/8.png');
The result would be:
background:url('http://img1.drupal.org/sites/all/themes/do/1.png');
background:url('http://img2.drupal.org/sites/all/themes/do/2.png');
background:url('http://img3.drupal.org/sites/all/themes/do/3.png');
background:url('http://img4.drupal.org/sites/all/themes/do/4.png');
background:url('http://img1.drupal.org/sites/all/themes/do/5.png');
background:url('http://img2.drupal.org/sites/all/themes/do/6.png');
background:url('http://img3.drupal.org/sites/all/themes/do/7.png');
background:url('http://img4.drupal.org/sites/all/themes/do/8.png');
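To make the sequential logic concrete, here is a minimal PHP sketch of that kind of replacement. It is not the module's actual code: `example_sequential_rewrite()` and its regular expression are hypothetical, and it assumes relative url() paths and PHP 5.3+ closures.

```php
<?php
// Hypothetical helper (not from parallel_css): prefix each relative url()
// with the next server in the list, wrapping around at the end.
function example_sequential_rewrite($css, array $servers) {
  $i = 0;
  return preg_replace_callback(
    // Group 1: optional quote. The lookahead skips absolute URLs
    // (scheme: or //). Group 2: the relative path.
    '/url\(\s*([\'"]?)(?![a-z][a-z0-9+.-]*:|\/\/)([^\'")]+)\1\s*\)/i',
    function ($m) use ($servers, &$i) {
      // The server is chosen purely by position in the file.
      $server = $servers[$i++ % count($servers)];
      return 'url(' . $m[1] . $server . '/' . ltrim($m[2], '/') . $m[1] . ')';
    },
    $css
  );
}
```

With this approach the chosen server depends only on where a url() sits in the file, so the same image can land on a different domain if the order of the rules changes.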
Comment #2
chriscalip commented
Having said that, what would the user experience be for managing this? I can't picture a way to consolidate these 2 different requirements into just one admin interface, i.e. the mapping text area field. What do you think?
Comment #3
mikeytown2 commented
I helped Wim Leers with some prototype code that deals with this exact situation. It is available in the CDN module. See readme.txt for details.
Your mapping on admin/settings/cdn/details would look like
And then the "PHP code for cdn_pick_server()" on admin/settings/cdn/other would look like
This will spread all CDN requests fairly equally across the 4 different img* domains.
Comment #4
chriscalip commented
Integration with CDN is very possible, because both $cdn_basic_mapping and $parallel_css_settings are just strings of URLs.
$parallel_css_settings is always:
http://img1.drupal.org
http://img2.drupal.org
http://img3.drupal.org
http://img4.drupal.org
$cdn_basic_mapping, on the other hand, can be either this:
http://img1.drupal.org
http://img2.drupal.org
http://img3.drupal.org
http://img4.drupal.org
With:
$CDN_PICK_SERVER_PHP_CODE_VARIABLE:
$filename = basename($servers_for_file[0]['url']);
$unique_file_id = hexdec(substr(md5($filename), 0, 5));
return $servers_for_file[$unique_file_id % count($servers_for_file)];
or this:
http://img1.drupal.org|png
http://img2.drupal.org|gif
http://img3.drupal.org|jpg
http://img4.drupal.org|ico
The devil is in the details
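One way to see the difference between the two mapping styles is to parse them and ask which servers are eligible for a given file. The sketch below is only an illustration: `example_parse_mapping()` and `example_servers_for_file()` are hypothetical names, and the "url|ext" notation is the simplified form used in this thread, not necessarily the CDN module's exact syntax.

```php
<?php
// Hypothetical parser: one "url" or "url|ext" entry per line.
function example_parse_mapping($text) {
  $mapping = array();
  foreach (preg_split('/\r\n|\r|\n/', trim($text)) as $line) {
    $line = trim($line);
    if ($line === '') {
      continue;
    }
    $parts = explode('|', $line, 2);
    $mapping[] = array(
      'url' => rtrim($parts[0], '/'),
      // An empty extension list means "any file type".
      'extensions' => isset($parts[1])
        ? preg_split('/[\s,]+/', strtolower(trim($parts[1])), -1, PREG_SPLIT_NO_EMPTY)
        : array(),
    );
  }
  return $mapping;
}

// Returns the server URLs whose extension filter accepts $path.
function example_servers_for_file(array $mapping, $path) {
  $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
  $servers = array();
  foreach ($mapping as $entry) {
    if (empty($entry['extensions']) || in_array($ext, $entry['extensions'], TRUE)) {
      $servers[] = $entry['url'];
    }
  }
  return $servers;
}
```

With the plain list, every server is a candidate for every file, so a pick function has something to balance. With the per-type list, a .png file has exactly one candidate and no balancing takes place.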
Comment #5
chriscalip commented
I can assume that during the advagg process of parallel_css, whatever the selected mapping URL(s) are, we want to balance across them as evenly as possible.
Pretty much the same formula:
With that said, I can do it like this:
So I either make a separate Parallel CSS mapping admin module, just in case folks don't want to use the CDN module but still want load balancing on their CSS aggregates, or just remove the admin aspect of Parallel CSS and use CDN.
What do you think?
Comment #6
Peter Bowey commented
@chriscalip
I love the idea = +1.
Load Balancing => 'yes'
Notes: I have not started using this module yet, I prefer to read the source and see where it is going...:)
Well done!
Comment #7
mikeytown2 commented
You want some sort of hash on the filename; that way the same file will always come from the same server, and your browser will always have it cached. I don't think your current code does that. Also, set the weight of this to be heavier than css_emimage.
Comment #8
Peter Bowey commented
Referring to #7
In the 'ancient' non CMS days .... :)
I used a 'parallel' URI CDN 'hash' like this: (0-2)
eg: ('old timer' HTML method sample):
DNS:
Comment #9
mikeytown2 commented
@peter bowey
In regards to #8, that works great until the order of your link tags changes; once it changes, you have to re-download the same CSS file from a different domain instead of getting it from your browser cache. Or in this case, if you add/remove a url() link at the top of a CSS file, then all the url() references will be pointing to a different server.
The code below shows how the filename hash thing works. If you change the number of servers, then the modulus will be different. This isn't perfect by any means, but in terms of code complexity vs. getting it right, it's a pretty good tradeoff. The url() changes when the number of available servers changes, which makes sense.
Outputs
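A minimal sketch of the filename-hash idea, modeled on the cdn_pick_server() snippet quoted in comment #4. `example_pick_server()` is a hypothetical name and the paths are illustrative only.

```php
<?php
// Hypothetical picker: hash the file name (not its position in the CSS)
// and reduce it modulo the number of servers.
function example_pick_server($path, array $servers) {
  // First 5 hex digits of the md5 digest, at most 0xfffff.
  $unique_file_id = hexdec(substr(md5(basename($path)), 0, 5));
  return $servers[$unique_file_id % count($servers)];
}

$four = array(
  'http://img1.drupal.org',
  'http://img2.drupal.org',
  'http://img3.drupal.org',
  'http://img4.drupal.org',
);
// Dropping one server changes the modulus, so some picks may move.
$three = array_slice($four, 0, 3);

foreach (array('1.png', '2.png', '3.png') as $file) {
  $path = 'sites/all/themes/do/' . $file;
  echo $file, ': ', example_pick_server($path, $four),
    ' (4 servers), ', example_pick_server($path, $three), " (3 servers)\n";
}
```

The pick depends only on the file name and the server list, so reordering link tags or url() rules leaves it unchanged. Only a change in the number of servers can move a file to another domain.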
Comment #10
Peter Bowey commented
@mikeytown2
Many thanks!
That is an 'acceptable' method!
I will plan to integrate it into the advagg + parallel interface 'thingy'.
I appreciate your will and time to encourage an 'old dog non-cms coder'.
* I am still learning the correct Drupal 'bark' - it is not 'woof - woof' - more like 'callback sometime grrrr' * :)
Comment #11
chriscalip commented
Hey mikey,
I made you a co-maintainer if you want to handle the hash code; I'll whip up the CDN integration thing. Sorry, talking with a client. Can't respond for a time.
Comment #12
Peter Bowey commented
@mikeytown2 project support count = +1
Mike, must be about 22+ projects you love + support :)
I elect that you have 26 hours per day, the rest of us 24....
Comment #13
chriscalip commented
Yikes, I thought about #11 some more. It's just that I wasn't aware of the concept. I can quickly research and implement it, but if you want to take care of it (at least that part of the module), that's okay too :)
Comment #14
mikeytown2 commented
I'll be busy over here for a little while so the ball is in your court :)
http://groups.drupal.org/node/154564
Comment #15
chriscalip commented
I got this; it should be finished by tomorrow. Need to sleep and all.
Comment #16
Peter Bowey commented
@chriscalip
I think @mikeytown2 has planted enough 'good seed' to get this 'hash code' rose 'in bloom' :)
Comment #17
chriscalip commented
I could not sleep. This is interesting.
I re-read the messages and realized that I am not getting the big picture here.
Picking up on the clues "asset collective" and "to be heavier than css_emimage", I started reading the issue queues of several modules, including advagg and css_emimage.
Having said that, I just want to be clear on what we are trying to pull off here.
Drupal site http://www.example.com
has several css files including the following
site admin installs cdn, advagg, parallel_css, and css_emimage.
CDN mapping url:
Three scenarios:
Scenario A: parallel_css, advagg compress css, and core advagg css/js are enabled; css_emimage is not.
During the CSS aggregation process, because parallel_css has a weight of -10 (see parallel_css.install), it gets first dibs on hook_advagg_css_alter. parallel_css gets the mapping URL array from cdn_basic_mapping and then proceeds to the replacement process. After the replacement process, $content gets passed to the other implementers of hook_advagg_css_alter, and at the end of the process we get an aggregated file css_0f8107b462965cd0d36e3ad9a51359e7_0.css containing among its contents:
--- So why does parallel_css need to implement the hash code if the other modules are doing it?
Scenario B: parallel_css, advagg compress css, core advagg css/js, and css_emimage are enabled.
parallel_css is lightest.
We get an aggregated file css_0f8107b462965cd0d36e3ad9a51359e7_1.css containing among its contents:
--- Is CSS Embedded Images able to handle a domain name?
Scenario C: parallel_css, advagg compress css, core advagg css/js, and css_emimage are enabled.
parallel_css is heavier than css_emimage.
We get an aggregated file css_0f8107b462965cd0d36e3ad9a51359e7_2.css containing among its contents:
--- Are these strings valid?
Comment #18
mikeytown2 commented
XXXX-CSS-EmbeddedString-XXXXX is a base64-encoded version of that file. You get the benefits of an image sprite without some of the hassles that come with it. So this module (Parallel CSS) needs to check that the url() is not base64 encoded and is a file. css_emimage will only drop up to 32 KB of image data into the CSS file, so anything larger will then be processed by this module.
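A rough sketch of that check, assuming embedded images show up as data: URIs inside url(). `example_is_rewritable_url()` is a hypothetical helper, not something from parallel_css or css_emimage.

```php
<?php
// Hypothetical check: should this url() value get a server prefix?
function example_is_rewritable_url($url) {
  // Strip surrounding whitespace and quotes.
  $url = trim($url, " \t\n\r'\"");
  if ($url === '') {
    return FALSE;
  }
  // Embedded data (for example base64 images) must be left alone.
  if (preg_match('/^data:/i', $url)) {
    return FALSE;
  }
  // Already absolute or protocol-relative: nothing to prefix.
  if (preg_match('/^([a-z][a-z0-9+.-]*:|\/\/)/i', $url)) {
    return FALSE;
  }
  return TRUE;
}
```

A real implementation might additionally confirm that the path points at an existing file before rewriting it; that step is left out here because it depends on the site's file layout.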
Comment #19
chriscalip commented
Mikeytown2 and Peter Bowey, you guys were pretty deep; I could not get what you were saying. mbutcher and I figured it out and even made some improvements.
First, I wanted to make sure that this is the concept that we are trying to achieve:
1.png from 1.css is loaded the first time as img1.d.o/1.png.
On the next page, 1.png from 2.css appears as img2.d.o/1.png.
What we want to make sure of is that 1.png is always attached to the same server:
1.png is always http://img1.d.o/1.png from any aggregated CSS.
Although the distribution is not always optimal, I agree that this is the best way in the long run.
Matt Butcher made some suggestions on how to speed it up, from:
md5
Overall Summary
Total Incl. Wall Time (microsec): 4,401 microsecs
Total Incl. MemUse (bytes): 102,348 bytes
Total Incl. PeakMemUse (bytes): 199,084 bytes
Number of Function Calls: 466
TO:
crc32
Overall Summary
Total Incl. Wall Time (microsec): 3,173 microsecs
Total Incl. MemUse (bytes): 101,664 bytes
Total Incl. PeakMemUse (bytes): 181,248 bytes
Number of Function Calls: 420
Comment #20
Peter Bowey commented
@chriscalip
Many thanks for working this through.
The use of crc32() may not be unique enough in some cases, hence the reasoning for using md5().
Given a file and a CRC32 checksum, it is relatively simple to make small modifications to the file so that it has the desired checksum. There is no easy way to do this with md5 sums.
CRC32 is useful for, say, a communications checksum, because it's fast, efficient and effective at catching the kinds of errors that happen over a communications line (short bursts of errors, at most, in relatively small block sizes). It's easy to implement and long predates MD5.
But if you're using it for anything other than a simple communications checksum, 'it's being abused'.
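For comparison, the two hash choices discussed in #19 and #20 can be sketched side by side. These helpers are hypothetical and are not the committed module code. Either hash is reduced modulo the server count, so many files share a server by design.

```php
<?php
// md5 variant: first 5 hex digits of the digest, as in the snippet
// quoted in comment #4.
function example_pick_md5($path, array $servers) {
  $id = hexdec(substr(md5(basename($path)), 0, 5));
  return $servers[$id % count($servers)];
}

// crc32 variant: crc32() can return a negative integer on 32-bit PHP
// builds, so mask off the sign bit before taking the modulus.
function example_pick_crc32($path, array $servers) {
  $id = crc32(basename($path)) & 0x7fffffff;
  return $servers[$id % count($servers)];
}

// Counts how many of $paths each server receives for a given picker.
function example_distribution($picker, array $paths, array $servers) {
  $counts = array_fill_keys($servers, 0);
  foreach ($paths as $path) {
    $counts[call_user_func($picker, $path, $servers)]++;
  }
  return $counts;
}
```

Running `example_distribution()` with each picker over a site's real image list is one way to check how evenly either hash spreads files before choosing between them.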
Comment #21
chriscalip commented
@peter bowey
My pleasure; it's a fun project for me. http://drupalcode.org/project/parallel_css.git/commit/18974e3 Done.
Comment #22
Peter Bowey commented
Refer #19:
See http://brainspl.at/articles/2006/12/29/speed-up-page-loads
The above 'quote' is only meant as an idea 'template' and 'brain food' :)
Comment #23
chriscalip commented
I think we have achieved this now. :) 1.png will always be assigned to the same domain.
@TODO if cdn_basic_mapping exist use that instead of the parallel_css_mapping
@TODO make parallel_css weight more heavy than css_emimage
Comment #24
Peter Bowey commented
Refer #23
@chriscalip
Looking through the latest code @ http://drupalcode.org/project/parallel_css.git/blob_plain/refs/heads/6.x...
The above code methods look good to me.
I will test this 'real-time' today! :)
+1
Many thanks for contributing to Drupal projects!
Comment #25
Peter Bowey commented
Refer #23
It is also interesting reading through other projects / ideas that used this parallel asset method:
See -> http://statichtml.com/2010/use-unique-ips-for-sharded-asset-hosts.html
Overloading of brain food (sorry!)... :)
Comment #26
chriscalip commented
@TODO make parallel_css weight more heavy than css_emimage
http://drupalcode.org/project/parallel_css.git/commit/32c84d8 Done.
Comment #27
chriscalip commented
Refer #25. Oh joy! My company website is like that: http://www.straightnorth.com
We are pretty much using (img1.straightnorth.com,img2.straightnorth.com,img3.straightnorth.com,img3.straightnorth.com,img4.straightnorth.com,css.straightnorth.com) all pointing to the same ip :(
Comment #28
Peter Bowey commented
Refer #27
@chriscalip
*smile* That is only meant to be a 'heads up' about some 'possible issues' + how some 'typically older consumer' grade ADSL routers offer 'crude' 'firewall' protection... eg: "SYN Flood to Host" :)
Personally, I use a dual-wan Linksys RV082 ADSL2+ on two active ADSL2+ lines - with two static IP's... feeding a dedicated Linux Server (3 x Ethernet Ports / Gateway). In this event, I have 'disabled' the Linksys RV082 WAN firmware 'crud protection' and use Linux 'packet stateful' firewall..
Comment #29
Peter Bowey commented
Refer #26
@chriscalip
Good work Chris! +1
Just one to go: :)
Of interest see the following Drupal CDN links:
http://drupal.org/node/962266
http://drupal.org/node/956164
Notes: Google is pushing a growing number of hits for your module:
+1:)
Comment #30
chriscalip commented
This is a bit tricky. I am troubled by CDN's approach, where only those who know PHP will be able to pull this off.
http://drupal.org/node/962266
We need a better approach here:
What do you think of this:
@ /admin/settings/advagg/parallel-css
[X] Use Available CDN Mapping and CDN pick-server
----------------------------------------------------------------
Be sure to read: http://drupal.org/node/962266
----------------------------------------------------------------
URL:
----------------------------------------------------------------
Enter the domain URLs you want included, one per line. Warning: don't include a '/' at the end of the domain URL.
* For example http://img1.drupal.org
* http://img2.drupal.org
* http://img3.drupal.org
* http://img4.drupal.org
* https://s1.amazonaws.com/drupal_cdn
In addition, for SEO purposes (to prevent duplicate content), please update the .htaccess file
In between these two lines:
# RewriteBase /
# Rewrite URLs of the form 'x' to the form 'index.php?q=x'.
* # Parallel CSS - Start
* RewriteCond %{HTTP_HOST} img1.drupal.org [NC]
* RewriteCond %{REQUEST_URI} !\.(png|gif|jpg|jpeg|ico)$ [NC]
* RewriteRule ^(.*)$ http://www.drupal.org/$1 [L,R=301]
*
* RewriteCond %{HTTP_HOST} img2.drupal.org [NC]
* RewriteCond %{REQUEST_URI} !\.(png|gif|jpg|jpeg|ico)$ [NC]
* RewriteRule ^(.*)$ http://www.drupal.org/$1 [L,R=301]
*
* RewriteCond %{HTTP_HOST} img3.drupal.org [NC]
* RewriteCond %{REQUEST_URI} !\.(png|gif|jpg|jpeg|ico)$ [NC]
* RewriteRule ^(.*)$ http://www.drupal.org/$1 [L,R=301]
*
* RewriteCond %{HTTP_HOST} img4.drupal.org [NC]
* RewriteCond %{REQUEST_URI} !\.(png|gif|jpg|jpeg|ico)$ [NC]
* RewriteRule ^(.*)$ http://www.drupal.org/$1 [L,R=301]
*
* RewriteCond %{HTTP_HOST} s1.amazonaws.com/drupal_cdn [NC]
* RewriteCond %{REQUEST_URI} !\.(png|gif|jpg|jpeg|ico)$ [NC]
* RewriteRule ^(.*)$ http://www.drupal.org/$1 [L,R=301]
# Parallel CSS - End
----------------------------------------------------------------
Comment #31
mikeytown2 commented
Instead of htaccess rules, there is an issue for CDN in regards to SEO: #1060358: CDN and SEO. It's fairly high on my priority list, as in it might get done in 2 weeks.
Comment #32
Peter Bowey commented
Refer #30
"Oh No", not Apache .htaccess rules 'again'.... :(
"tongue-in-cheek"
I use exclusively Nginx, that poor Apache 2.x 'sod' died for me 2 years past (R.I.P.)
Research Reference: http://drupal.org/node/1060358#comment-4333802
Comment #33
chriscalip commented
#32
I mean .... I am giving an option for people to use the CDN mapping and cdn_pick_server instead of using parallel_css mapping and logic.
Pretty much a checkbox in the admin settings page of parallel_css
[ YES OR NO ] [X] Use Available CDN Mapping and CDN pick-server
----------------------------------------------------------------
Be sure to read: http://drupal.org/node/962266
-----------------------------------------------------------------
Comment #34
Peter Bowey commented
Refer #33
@chriscalip
Sounds good to me Chris! +1
I got the 'shakes' when I saw that .htaccess 'thingy' :)
Comment #35
chriscalip commented
Ok dokes. Going with that option.
Comment #36
Peter Bowey commented
Refer #30 + #31
For those setups affected by CDN 'duplicate' SEO, a partial solution exists here ->
http://drupal.org/project/files_proxy
Comment #37
mikeytown2 commented
@peter bowey
Not the right solution. We need to send out a 404 at a minimum or a 301 ideally if someone tries to access html content on your server through the CDN.
Comment #38
Peter Bowey commented
Refer #37
@mikeytown2
Thanks, I misunderstood the 'doc' reading @ http://drupal.org/project/files_proxy
In nginx.conf, I use something like this for the CDN private back-channel URI path (what the CDN pulls from):
Then something like this on the Drupal PHP side:
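As a rough illustration of the idea from #37 (send a 301 when someone requests HTML content through a CDN host), a PHP-side check might look like the sketch below. This is not the code originally posted here: `example_cdn_redirect_target()` is hypothetical, the host names are placeholders, and the extension list is borrowed from the rewrite rules in #30.

```php
<?php
// Hypothetical check: returns the canonical URL to redirect to, or NULL
// when the request should be served as-is.
function example_cdn_redirect_target($host, $request_uri, array $cdn_hosts, $canonical_base) {
  // Requests on the main site are never redirected.
  if (!in_array(strtolower($host), $cdn_hosts, TRUE)) {
    return NULL;
  }
  // Static assets are what the CDN host is for.
  $path = parse_url($request_uri, PHP_URL_PATH);
  if (is_string($path) && preg_match('/\.(png|gif|jpg|jpeg|ico)$/i', $path)) {
    return NULL;
  }
  // Anything else (HTML pages) belongs on the canonical domain.
  return rtrim($canonical_base, '/') . '/' . ltrim($request_uri, '/');
}

// Possible usage early in the request, with placeholder hosts:
// $target = example_cdn_redirect_target($_SERVER['HTTP_HOST'],
//   $_SERVER['REQUEST_URI'], array('img1.drupal.org'), 'http://www.drupal.org');
// if ($target !== NULL) {
//   header('Location: ' . $target, TRUE, 301);
//   exit;
// }
```

Keeping the decision in a pure function makes it easy to test without a web server; the header() call stays in the few lines that use it.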
Comment #39
chriscalip commented
Hey, how expensive is it to get several IPs and host accounts and just have them rsync? I'm trying to solve my straightnorth.com problem of the imgX hosts pointing to the same IP.
Comment #40
chriscalip commented
BTW, CDN makes use of hook_file_url_alter via cdn_file_url_alter. That function is a beast, with user access checks, CDN testing checks and cdn_devel_page_stats stuff. I am currently pretty much copying and pasting the important parts of cdn_file_url_alter, or I could go the route of calling cdn_file_url_alter. What do you guys think?
Comment #41
Peter Bowey commented
Refer #39
*Same IP's*
That should only be a 'problem' if the router starts blocking packets. See #28
Comment #42
chriscalip commented
#41
Router being the router of the users looking at the site or the router of the hosting company of the site?
Comment #43
Peter Bowey commented
Refer #42
A) = Host / Server router
Side-note: Obviously, you 'hire' hosting. I run my own dedicated server -'in-house' :)
All I pay for, is 2 x ADSL2+ 'public' lines / connections (100Gb x 2 - per-month use)...
In your case, I do not think that a professional 'host' company would have 'that issue' with their modern routers!
Comment #44
chriscalip commented
#43
thank you.
Comment #45
chriscalip commented
First working prototype of CDN integration, very basic.
http://drupalcode.org/project/parallel_css.git/commit/6f62c02
pretty much we still have to go to
@ /admin/settings/advagg/parallel-css
Check the box [X] use available cdn mapping and cdn_pick_server of cdn
This doesn't support the following CDN features:
a.) CDN supports HTTPS
b.) Drupal paths entered in this blacklist will not serve any files from the CDN. This blacklist is applied for all users.
c.) Drupal paths entered in this blacklist will not serve any files from the CDN. This blacklist is applied for authenticated users only.
Comment #46
Peter Bowey commented
Refer #45
http://drupalcode.org/project/parallel_css.git/commit/6f62c02
Looking good so far! +1
Comment #47
chriscalip commented
This doesn't support the following CDN features:
a.) CDN supports HTTPS
b.) Drupal paths entered in this blacklist will not serve any files from the CDN. This blacklist is applied for all users.
c.) Drupal paths entered in this blacklist will not serve any files from the CDN. This blacklist is applied for authenticated users only.
These will probably have to be separate issues. Right now, I don't know how to pull these off.
Comment #48
mikeytown2 commented
Why are you copying the cdn_file_url_alter function? Just require the CDN module and be done with it. Or am I missing something? Run the image references in the CSS through file_create_url, or if they are running CDN on a non-patched Drupal, detect it by stealing the first part of cdn_init() (variable_get(CDN_THEME_LAYER_FALLBACK_VARIABLE, FALSE) == TRUE) and then call cdn_file_url_alter directly.
Have it look something like this
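A minimal sketch of that fallback, under stated assumptions: `example_build_css_image_url()` is a hypothetical wrapper, and the function_exists()/defined() guards are only there so the snippet also runs outside Drupal, where it returns the path unchanged. The Drupal and CDN names used are the ones mentioned in this comment.

```php
<?php
// Hypothetical wrapper around the two code paths described above.
function example_build_css_image_url($path) {
  // CDN module on an unpatched core: the fallback variable is set, so
  // call the alter function directly.
  $cdn_fallback = function_exists('variable_get')
    && function_exists('cdn_file_url_alter')
    && defined('CDN_THEME_LAYER_FALLBACK_VARIABLE')
    && variable_get(CDN_THEME_LAYER_FALLBACK_VARIABLE, FALSE) == TRUE;
  if ($cdn_fallback) {
    // Assumed to take $path by reference, as an alter hook would.
    cdn_file_url_alter($path);
    return $path;
  }
  // Patched core (or no CDN): let file_create_url() build the URL.
  if (function_exists('file_create_url')) {
    return file_create_url($path);
  }
  // Outside Drupal: nothing to do.
  return $path;
}
```

Delegating to the CDN module this way would avoid copying its access checks and mapping logic into parallel_css.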
Comment #49
chriscalip commented
Doh! Or even better
Incidentally this is the one prone to let the relative urls in ../.. which causes a bug like http://drupal.org/node/1183062
I am hoping that somewhere in the process advagg_build_css_bundle always gets run.
Comment #50
mikeytown2 commented
Good idea!
I've added in the fallback logic on my end so advagg_build_uri() looks a lot like #48 (#1185786: allow for URLs to get CDN-ed even if cdn patch is not applied). As for #1183062: Support for URI (path) rather than Domain, how that works is configurable in the CDN module. The reason why it wasn't working is that by default CDN disables itself on all paths that start with admin/*; I have a special case to handle those now.
Comment #51
chriscalip commented
http://drupalcode.org/project/parallel_css.git/commit/1dcee3d Done.
Nice one on the CDN fix!
Comment #52
Peter Bowey commented
@chriscalip
@mikeytown2
Nice teamwork Mike and Chris!
@chriscalip, your module may have possibly saved me from moving my D6 Core to D7 (a long story within...) .. :)
@mikeytown2, the updated advagg_build_uri() you applied has made the 'great Code Sun' shine here :)
see http://drupal.org/node/1185786. Great, see my comment above to Chris.
Many thanks for a useful module Chris, additionally - we have learned some 'cool stuff'.
Teamwork = Cool!
Comment #53
chriscalip commented
It was! Let's do it again sometime.
Comment #54
chriscalip commented