hi
This weekend I went ahead and upgraded Drupal from 4.4 to 4.5 on a machine; I didn't have the time/priority (in Dutch these words are almost the same :-) sooner.
The core upgrade went fine, followed as usual by lots of unpacking and configuring for all the modules I want to experiment with. After some time I tried to access the site, in a just-to-be-sure moment, from a Windows machine with an IE browser without being logged in. I found some strange behavior. Sometimes IE wanted to download the file, sometimes it showed the file (being a binary), and sometimes I got a valid page. I never got this from Firefox on either Windows or Linux while being logged in. Later I could reproduce it in Firefox without being logged in.
After some time and thinking, I believe I found the cause: both Apache (my web server) and Drupal were compressing (gzipping) the page, while the browser uncompresses a page only once. So if it was a cached page, and Drupal gzipped it and Apache gzipped it as well, the browser just displayed a gzipped page.
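A minimal sketch of the suspected double compression, assuming both layers use plain gzip and the browser undoes exactly one layer. The sample page and variable names are made up for illustration.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream

html = b"<html><body>cached page</body></html>"

# First layer: the application gzips the cached page.
once = gzip.compress(html)
# Second layer: something downstream gzips the already-gzipped body.
twice = gzip.compress(once)

# The browser removes one layer, as announced by "Content-Encoding: gzip".
seen_by_browser = gzip.decompress(twice)

# What is left is still a gzip stream, so it shows up as binary garbage
# or as a file download instead of a page.
looks_binary = seen_by_browser.startswith(GZIP_MAGIC)

# Unpacking a second time by hand gives back the HTML.
recovered = gzip.decompress(seen_by_browser)
```

This matches the symptom of a downloaded file that yields the HTML only after being renamed to .gz and unpacked.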
I have a rather old setup with an ancient version of Apache, but does this sound possible? Since I am hosting (some) other content outside the Drupal directory, I would like to keep Apache gzipping pages, so is there an option to stop Drupal from compressing pages?
Comment | File | Size | Author |
---|---|---|---|
#47 | gzip_3_0.patch | 1.92 KB | killes@www.drop.org |
#46 | gzip_3.patch | 2.05 KB | Richard Archer |
#44 | gzip_htaccess.patch | 436 bytes | Richard Archer |
#43 | gzip_2.patch | 1.62 KB | Richard Archer |
#39 | gzip_1.patch | 2.35 KB | killes@www.drop.org |
Comments
Comment #1
killes@www.drop.org commented:
Which method does your Apache use to gzip the pages? Which version is it?
I find it strange that we get this report only several weeks after the release. I suspect that your particular setup is borken.
Comment #2
bertboerland commented:
The reason I waited so long was that I wanted others to find this out :-) Mind you, I *think* it is the fact that they are both compressing the cached pages. You might want to try it out for yourself by going to http://willy.boerland.com/myblog; in case there is a cached page, you will end up downloading a binary that, once renamed to .gz and unpacked, will give you the HTML file.
drupal 4.5
apache: 1.3.26 (with *all* security patches)
php: 4.2.3
Content-Encoding: gzip
And yes, since I guess 100+ sites are already on 4.5, I think it is related to my setup. However, do I understand correctly that there is no way of having Drupal stop sending out compressed pages? There is no configuration option?
Comment #3
killes@www.drop.org commented:
Yes, I get funny garbage, too.
The apache version you run is the one Debian provides. It is still used very often, but still nobody but you seems to experience the problem, so I still think that somehow your setup is broken. Apache should recognize that the page is already gzipped. Do you use mod_gzip or libz?
Comment #4
bertboerland commented:
Since this one is more or less critical for me but not for the RotW, I am resetting the priority to "normal". I am using zlib, btw. Did you try to save the file and unzip it? And am I correct that there is no option to disable cached pages from being zipped?
Comment #5
bertboerland commented:
I think I "solved" this for myself by changing bootstrap.inc.
Setting this to postponed; if no one else experiences this problem within 3 weeks I will close this issue.
Comment #6
killes@www.drop.org commented:
I tried to download it now, but you must have changed something; the page renders fine.
You are right, there is no option to disable the gzipped cache. I'd really like to discover the reason for this problem. Normally, zlib should check for gzipped content and not gzip it again. Drupal sends the appropriate header.
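The check described here can be sketched as follows. This is only an illustration of the idea (skip compression when a Content-Encoding header is already present), not the actual zlib or Apache code; `compress_response` is a hypothetical helper.

```python
import gzip


def compress_response(headers, body):
    """Gzip the body unless the headers say it is already encoded.

    headers is a plain dict of response headers (hypothetical shape).
    Returns a (headers, body) pair.
    """
    already_encoded = any(name.lower() == "content-encoding" for name in headers)
    if already_encoded:
        # Someone upstream compressed the body; pass it through untouched.
        return headers, body
    new_headers = dict(headers)
    new_headers["Content-Encoding"] = "gzip"
    return new_headers, gzip.compress(body)
```

Running a response through this filter twice compresses it only once, which is the behaviour expected from a well-behaved output compressor.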
Comment #7
killes@www.drop.org commented:
Locking could have prevented this. ;)
Comment #8
axel@drupal.ru commented:
> I found some strange behavior. Sometimes IE wanted to download the file, sometimes it showed the file (being a binary) and sometimes I got a valid page.
Sometimes I get the same thing in Mozilla Firefox with different sites (not sure, but not only with Drupal) when I work through my local proxies (I use a chain of two proxies on my Debian box: one for banner removal (privoxy) and a second for offline page caching (wwwoffle)). I haven't dug into the details of this behaviour, but simply switching off the proxies and reloading the page in the browser solves the problem; then I switch the proxies on again and the page loads OK.
Maybe you also access the site through a proxy? Then you need to check the proxy's settings, I think.
Comment #9
bertboerland commented:
Axel,
no, it was not related to proxies or even the browser. It was related to cached pages being zipped by Drupal and sent to Apache, which was zipping the page as well. The result was a doubly zipped page for the client, which unzipped the page only once and hence displayed a binary.
Comment #10
bertboerland commented:
It seems I was not the only one. There is not enough information about the others' setups (please post here!) to find a generic cause, but it is certain that some setups will cause zipped pages to be zipped again.
Comment #11
ixis.dylan commented:
Has anybody found a better way to fix this problem? Disabling the storage of compressed cache pages in bootstrap.inc works, but it's ugly.
Is an uncompressed cache likely to take up a lot of space in the database? If not, perhaps a global "gzip/zlib compression" option for Drupal would be simpler.
Comment #12
(not verified) commented:
As author of the gzipped cache patch, I'd really appreciate it if the people who experience this problem could supply detailed(!) info on their setup and the headers sent by both the browser and the server. I'd like to fix the problem (if there is one on Drupal's side) or at least document correct server settings if the problem is there.
Comment #13
bertboerland commented:
Here is some info; please ask if you need more:
apache 1.3.26
mmcache (version unknown)
drupal 4.5.1
Note: I didn't post about (or think about!) the MMcache between Drupal and Apache.
The configuration of mmcache is
extension="mmcache.so"
mmcache.shm_size="16"
mmcache.cache_dir="/tmp/mmcache"
mmcache.enable="1"
mmcache.optimizer="1"
mmcache.check_mtime="1"
mmcache.debug="0"
mmcache.filter=""
mmcache.shm_max="0"
mmcache.shm_ttl="0"
mmcache.shm_prune_period="0"
mmcache.shm_only="0"
mmcache.compress="1"
headers:
http://willy.boerland.com/myblog/
GET /myblog/ HTTP/1.1
Host: willy.boerland.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: PHPSESSID=307f07c169e3bbcbe92d150026c97583
HTTP/1.x 200 OK
Date: Sat, 05 Feb 2005 16:45:49 GMT
Server: Apache-AdvancedExtranetServer/1.3.26
X-Powered-By: PHP/4.2.3
Content-Encoding: gzip
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
----------------------------------------------------------
Comment #14
coma-1 commented:
I have exactly the same problem. My setup is currently Apache 2.0.52, PHP 4.3.10 and Drupal 4.5.2.
The relevant info is:
from apache:
In php.ini:
And I can assure you that my problem was that last element: changing zlib.output_compression to Off solved the problem, and the output of all PHP scripts (not only Drupal) is compressed by either Drupal or Apache.
You can test these things with wget:
The file wget drops should be gzipped HTML. If you run zcat on it and get plain HTML, all is good; if you get another gzipped file, it has been double-compressed.
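The same check can be scripted. Below is a rough sketch that counts how many gzip layers wrap a downloaded body by looking for the gzip magic bytes; `gzip_layers` is a hypothetical helper, not part of any tool mentioned in this thread.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def gzip_layers(payload, limit=5):
    """Count nested gzip layers around payload (0 means plain content)."""
    layers = 0
    while layers < limit and payload.startswith(GZIP_MAGIC):
        try:
            payload = gzip.decompress(payload)
        except (OSError, EOFError, zlib.error):
            # Looked like gzip but is not a complete, valid stream.
            break
        layers += 1
    return layers
```

For a body fetched with gzip accepted, a result of 1 would be the healthy case and 2 would indicate double compression.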
Comment #15
moshe weitzman commented:
I too am seeing strange behavior here. Specifically, PHP is simply dying after printing out the $cache->data in drupal_page_header(). Truly, I don't think Drupal should be storing a gzipped cache, nor do I think it should perform gzip at all. These are better handled at the PHP or Apache layers.
Comment #16
beate_r commented:
Apparently I am observing a new variant of that problem in one of the subsites of my 4.6 test installation: after enabling caching, lynx, w3m and the Gecko-based browsers display garbage. lynx tells me it wants to load a file named index.html.gz.
wget http://beate/drupal46/ on that site gives a readable html page.
w3m -dump_head tells me:
Apparently, the site sends uncompressed pages but tells the browser it is compressed.
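A mismatch like this, where the header and the body disagree, can be detected mechanically. The sketch below assumes the headers are available as a dict keyed by "Content-Encoding" and only inspects the gzip magic bytes; `encoding_mismatch` is a made-up name.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def encoding_mismatch(headers, body):
    """Return True when the Content-Encoding header and the body disagree."""
    declared = headers.get("Content-Encoding", "").strip().lower()
    claims_gzip = declared in ("gzip", "x-gzip")
    is_gzip = body.startswith(GZIP_MAGIC)
    return claims_gzip != is_gzip
```

It flags both directions: a plain body announced as gzip, and a gzipped body sent without the header.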
BTW, links2 and dillo do display the page but will not allow me to log in and disable caching.
Any clues?
Thanks
Michael
Comment #17
beate_r commented:
Perhaps I should mention that
zlib.output_compression = Off
in my php.ini
Comment #18
bertboerland commented:
> Apparently I am observing a new variant of that problem
Please open a new bug report, since your problem is likely unrelated to this one. You might want to link to this issue, but don't use this issue for anything other than what is described.
Comment #19
beate_r commented:
No, it is not unrelated. After some investigation, I found out that the garbage is being stored in Drupal's cache. This has been reported in this thread.
My present workaround is to completely disable Drupal's caching (and to manually delete the compressed pages from the database).
What's the use of letting Drupal compress pages and store them in its cache table? Just saving a few cycles of CPU time? At the cost of not letting the web server and its client decide whether to transfer compressed or uncompressed data: they have the protocol to do that, Drupal has not.
Michael
Comment #20
killes@www.drop.org commented:
We save CPU cycles and storage space. Apache only needs to check the Content-Encoding header, which is easy. I do not know why zlib output compression doesn't check those headers for some (or all?) people.
Comment #21
onedrupaluser commented:
I have the same problem with Drupal 4.6.1: if the site delivers a cached page, it is gzipped twice and the result is garbage in my browser window. I tried two different browsers, Firefox and Opera, and the problem is the same with both of them.
I recorded the HTTP headers:
- Cache turned on
- zlib-compression turned on
- Result: Garbage
Now I turned off compression by setting in my .htaccess
and the result is OK:
Turning off the cache is a solution as well. As soon as cache and zlib are operating the result is garbage.
If you need any other information just let me know. I can turn on compression at any time and test other solutions.
Chris
Comment #22
aries commented:
The same here. Interestingly, only one page is doing this; all the others are not.
--
Aries
http://aries.mindworks.hu
Comment #23
fago commented:
Quite strange.
I never had problems with this; then I upgraded from 4.6.1 to Drupal 4.6.2... now I must turn off the cache, or I have the troubles mentioned here... (I haven't touched the Apache or PHP config in the meantime)
Comment #24
bertboerland commented:
Yet another one; closed that one, raised importance here.
Comment #25
bertboerland commented:
killes suggested that it wasn't Apache that compressed the page; Drupal and PHP were the ones doing it. After some testing, I found out he was correct!
To solve this:
1) either comment out the gzip lines in bootstrap.inc
2) or change "zlib.output_compression = On" to Off in /etc/php.ini
A (real) fix is needed. Drupal should check whether the page was already zipped in the first place....
Comment #26
killes@www.drop.org commented:
Here's a patch that needs some testing.
Comment #27
Dries commented:
Why do we set $do_cache to TRUE in the nested if's? $do_cache is already set to TRUE higher up so we're just overwriting TRUE with TRUE?
Also, can we rename $do_cache to $cache? That is more Drupal-ish.
Does that solve all Apache-Drupal-caching problems or only some? I vaguely remember we agreed to add a setting, because that was perceived as the only True Solution.
Comment #28
killes@www.drop.org commented:
I've updated the patch.
The patch is intended to cure all the problems we had. It needs testing, though.
I don't recall agreeing that a setting would be needed.
Just for the record: Steven measured the ratio of serving gzipped cached pages to re-unzipped pages on drupal.org to be about 5.1:1. That is, of every 6 page requests that we serve from the cache, only one needs the cached page to be unzipped. This is probably due to crawlers that only speak HTTP 1.0.
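The two ways of serving a cached page mentioned here can be sketched like this. It is an illustration only, assuming the cache holds gzip data and that the client's Accept-Encoding header is available as a string; `serve_cached` is a hypothetical name, and q-values in the header are ignored for brevity.

```python
import gzip


def serve_cached(accept_encoding, cached_gzip):
    """Return (headers, body) for a page stored gzipped in the cache."""
    accepted = [
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    ]
    if "gzip" in accepted:
        # Common case: hand the stored bytes straight to the client.
        return {"Content-Encoding": "gzip"}, cached_gzip
    # Client did not ask for gzip: unzip the cached copy first.
    return {}, gzip.decompress(cached_gzip)
```

At the 5.1:1 ratio quoted above, about 1 in 6.1 cached responses (roughly 16%) would take the decompressing branch.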
Comment #29
bertboerland commented:
People who experienced this problem, please try killes' patch and report here whether it was successful or not. I will patch as well.
Comment #30
moshe weitzman commented:
In my opinion, this is a case of misplaced optimization. Very little CPU is required to gzip a document. Why else would Apache and PHP provide an option to do this *for every page*? PHP knows that all its pages will be dynamic, yet it still offers this option.
My recommendation is to remove this feature entirely, and take the win in simplicity/code reduction. Furthermore, we won't have any more issues like this one, which has lingered unfixed for 9 months.
Admins who want gzip will elect to do so in their php.ini or .htaccess.
Comment #31
Arto commented:
If the caching mechanism is to be patched, that'd be a good time to also tackle the problem that it's currently unusable on PostgreSQL: http://drupal.org/node/26369
Comment #32
fago commented:
I don't have this issue any more.
Since I'm using a new server running Apache 2 with deflate instead of Apache 1 with mod_gzip, everything is fine.
Unfortunately, I no longer have access to the old server (for testing).
Comment #33
Dries commented:
We talked about this at OSCON and it was considered a good idea to remove the gzip compression from core. This should fix all gzip-related issues. People with big Drupal sites are likely to have access to Apache settings. For small Drupal sites, it is up to the hosting company to tune their machines.
Comment #34
killes@www.drop.org commented:
No idea who you talked to, Moshe wasn't there. ;)
Apparently those people don't understand that the Apache functionality and this feature aren't the same.
Anyway, since I do not need this feature (no high traffic sites), I've marked this won't fix. Somebody should open a task to remove it.
Anybody who finds me working on features that I don't need or don't get paid for should kick me really hard.
Comment #35
Gábor Hojtsy commented:
The devel list shows interest in this patch; review required.
Comment #36
killes@www.drop.org commented:
Positive test over here:
http://drupal.org/node/32494
Comment #37
kbahey commented:
Reported to work in http://drupal.org/node/32494 (which is a duplicate of this).
Comment #38
Dries commented:
Patch no longer applies against HEAD.
Comment #39
killes@www.drop.org commented:
There was some trailing cruft.
Comment #40
smorrey commented:
I just installed Drupal last night.
I installed it on top of a site already running Zend with full optimization in place.
My whole website was garbled, looked like a binary had opened right in my browser.
Found a db table called cache and emptied it, problem solved for 1 anonymous page view (problem doesn't occur at all on folks who are logged in).
Cleared cache again, got myself logged in with that single page view, went to the settings disabled cache.
No more problems.
Was going to file a new bug report last night and got too tired to think straight, so I waited until today.
Did a google search on Garbled Output Drupal, found this thread. So I figured I would post my experiences.
Turning off caching in Drupal fixes the problem. Honestly Drupal should not be trying to handle caching on my setup anyways.
Comment #41
killes@www.drop.org commented:
Please only change the status with a good reason.
If you don't need caching, then you should be fine with or without that patch.
Comment #42
Dries commented:
Looking at the code, it seems like we're storing non-gzip'ed data when gzip is not available, and that we're storing gzip'ed data when gzip is available.
How does the client know that the data is gzip'ed? Does it look for a header at the beginning of the data/payload? I guess it does.
With that in mind, if zlib_get_coding_type() evaluates to 'deflate', we don't cache at all. Why is that? Can't we cache the non-gzip'ed version of the data?
Comment #43
Richard Archer commented:
If zlib_get_coding_type() evaluates to 'deflate' then the page contents have already been compressed by PHP using the deflate algorithm. For this to happen PHP would have the zlib.output_compression directive enabled and the browser would have requested deflate in preference to gzip. The ob_get_contents call then returns the contents of the page in deflated form.
If this happens, the data is not stored in the cache, but just sent to the client in that format.
Data in the cache has to be either all compressed with gzip if gzencode is available or uncompressed if it is not. This is because when data is retrieved from the cache it is assumed to be in gzip format if gzencode is available.
Two little problems in this patch:
Introduction of the $do_cache/$cache variable is unnecessary. All the logic can be handled by the conditional structures.
I also notice that in the case where the gzencode call fails and returns false $data is still stored in the cache. In the event of a zlib error the $data should not be stored.
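The storage rule described in this comment can be sketched as below. This is an illustration in Python, not the patch itself: `data_for_cache` and its parameters are hypothetical, `output_coding` stands in for the value of zlib_get_coding_type(), and `compress` stands in for gzencode (returning a falsy value on failure). Treating an output coding of 'gzip' as "already compressed, store as is" is an assumption by analogy with the 'deflate' case.

```python
import gzip


def data_for_cache(page, output_coding, gzip_available=True, compress=gzip.compress):
    """Return the bytes to store in the page cache, or None to skip caching."""
    if output_coding == "deflate":
        # PHP already deflated the buffer; deflate data must not be mixed
        # with the gzip entries the cache is assumed to hold.
        return None
    if not gzip_available:
        # Without gzencode the cache holds uncompressed pages only.
        return page
    if output_coding == "gzip":
        # Assumption: the buffer is already gzip data, so store it unchanged.
        return page
    data = compress(page)
    # A failed compression must not end up in the cache.
    return data if data else None
```

Keeping every entry in one format is what lets the read side assume gzip whenever gzencode is available.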
Here's a re-rolled patch that addresses these two little issues.
This patch doesn't do anything to solve the problem of double-gzipped data being sent to a browser in the event that Apache is configured to compress pages with mod_gzip or mod_deflate. In this case the problem can be resolved by adding some extra directives to Drupal's .htaccess
# Disable Apache's gzip compression of pages by mod_gzip
mod_gzip_on No
# Disable Apache's gzip compression of pages by mod_deflate
SetEnv no-gzip
Comment #44
Richard Archer commented:
Sorry, the HTML entities got eaten in the .htaccess directives of my last post.
Here they are in a patch :)
Comment #45
Dries commented:
I still think it's worthwhile to disable Drupal's gzip caching, rather than using a .htaccess-based solution. I'd rather have my static CSS files gzip'ed. That said, this patch looks like an improvement so if Gerhard is cool with it, I'll commit it to HEAD. Good work Richard.
Comment #46
Richard Archer commented:
And sorry again... my gzip_2.patch is completely broken. I don't know what I was thinking!
The patch gzip_1.patch is actually very nice.
Here's a new one, almost the same as the original but with a couple of spelling errors fixed and it checks that $data contains data before storing it.
Comment #47
killes@www.drop.org commented:
@Dries: There is another issue about compressing CSS files: http://drupal.org/node/11128
@Richard: Thanks for looking at the patch.
I am OK with the Drupal part of the patch. It is more elegant but contains the same typo in the docs ;p. I am, however, not OK with the .htaccess part of the patch. The problems with gzipped cache pages only occur when running zlib as a PHP component. When running mod_gzip (and presumably mod_deflate) as Apache modules there is no problem. I suppose that those check the headers before zipping anything. Also, we only cache pages for anonymous users. As I see it, the .htaccess change would send uncompressed data to all logged-in users.
Comment #48
killes@www.drop.org commented:
*rofl*
Anybody for node locking in Drupal?
This patch should be committed:
http://drupal.org/files/issues/gzip_3.patch
Comment #49
Dries commented:
Committed to HEAD.
Comment #50
killes@www.drop.org commented:
YaY!
I hope you meant "fixed".
Comment #51
Richard Archer commented:
@Dries:
Storing gzipped data in the cache has several key benefits:
And after all, that's what caching is all about... saving resources like CPU, RAM and disk I/O.
I think this patch will have resolved the instability of the gzipped cache feature of Drupal. This buys some time to consider the "bigger picture" surrounding this issue. I shall give some thought to the other issue of CSS compression. My initial thought is that on a properly optimized Drupal installation, Apache compression would be off, PHP's zlib.output_compression would be off and Drupal's compressed caching would be used exclusively.
@killes:
I like your @ notation... very slick.
I spent a lot of time testing different Apache/PHP/Drupal gzip configurations yesterday and I'm sure that Apache/PHP can be (mis)configured to double-compress. For example, on my apache-2.0.53/php-4.3.11 system the following entries in a .htaccess file result in all PHP script output being double-compressed:
But this is not going to be a problem for Drupal because it would be very rare that Apache's compression would be enabled on a system on which the Drupal admin had no control over this directive.
I posted my .htaccess config to this thread just so the solution to that side of the problem is in the archives in case someone searches. (Do people still search for solutions to problems before posting a support request? ;)
Comment #52
(not verified)
Comment #53
ixis.dylan commented:
This isn't fixed.
Comment #54
killes@www.drop.org commented:
This is fixed in the CVS version that will become Drupal 4.7.
Comment #55
(not verified)
Comment #56
nbd commented:
After gzip_3.patch is applied, what are the correct settings for:
PHP.ini -
output_handler = ob_gzhandler ?
zlib.output_compression = Off/On ?
httpd.conf -
AddEncoding x-compress .Z ?
AddEncoding x-gzip .gz .tgz ?
AddType application/x-compress .Z ?
AddType application/x-gzip .gz .tgz ?
Any other settings ?
Don't the compression settings outside of Drupal have an effect on images etc.?
- Nir
Comment #57
nbd commented:
I've implemented the patch and I still get double-gzipped files (corruption on screen for some nodes, and IE wants to download file.gz).
There is a small chance that it's caused by changes I made to the cache code, but more probably it is a problem with the patch.
Is there something about the settings that may affect the patch? I tried both (each separately):
output_handler = ob_gzhandler
and
zlib.output_compression = Off
zlib.output_compression_level = -1
And I see the problem with both.
Thanks,
Nir
Comment #58
deadmalc commented:
I'm using Drupal 4.7.2 and I am still having the same problem. I'm sure this was working in 4.7.0; has the patch been reverted?
Thanks,
Malcolm
Comment #59
beginner
Comment #60
jacauc commented:
Aaah OK, excellent... I only stumbled upon this now.
I was playing around with the "output_handler = ob_gzhandler" setting and all my pages also came down as .gz files.
I am using a CVS version which might be a week old. So the issue is definitely still there.
Thanks
jacauc
Comment #61
cobrasound commented:
I am experiencing this issue too. If I disable the cache in Drupal it seems to go away.
Gentoo Linux
Apache v2
PHP5
Drupal 4.7
MySQL
Comment #62
killes@www.drop.org commented:
Anybody reporting this issue should look at:
SetOutputFilter
and
php_value zlib.output_compression
and also at the value of the php output handler.
and report those here.
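To collect these settings for a report, something like the following sketch could be run over the text of an .htaccess, httpd.conf or php.ini file. The helper name and the regular expression are illustrative assumptions; the pattern only covers the three settings named above and skips commented-out lines.

```python
import re

# Matches uncommented lines that set one of the three settings, with an
# optional php_value/php_flag prefix as used in Apache configuration files.
SETTING = re.compile(
    r"^[ \t]*((?:php_(?:value|flag)[ \t]+)?"
    r"(?:SetOutputFilter|zlib\.output_compression|output_handler)\b.*)$",
    re.IGNORECASE | re.MULTILINE,
)


def compression_settings(config_text):
    """Return the lines of config_text that configure output compression."""
    return [match.group(1).strip() for match in SETTING.finditer(config_text)]
```

The returned lines can be pasted into the issue together with the Drupal, PHP and Apache versions.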
Comment #63
deadmalc commented:
I've just tried this on FC5, using Drupal 4.7.3. I cannot get it to fail, no matter what I set the compression to (either zlib_compression or ob_gzhandler); it works fine. This is with editing the php.ini file and restarting Apache.
Maybe this has something to do with using virtual hosts or a .htaccess override.
I'll have a look into this and post my results.
I can't actually use the caching as my primary website uses ecommerce and has a shopping cart which needs to be updated (obviously) and using the caching setting causes the shopping cart to be always cached (as I allow anon. purchases), but I have plenty of other sites I can test on.
It would be nice if you could cache only certain pages, and also have a cache setting on blocks and content which would prevent certain pages from being cached (e.g. if the shopping cart is active then do not cache pages)
Thanks,
Malcolm
Comment #64
jacauc commented:
Is that Fedora?
I've also had the doubly zipped pages experience before and the major problem was with IE.
Comment #65
dopry commented:
This issue was previously fixed and closed. If the issue emerges again, please open a new issue and include your php.ini settings, Drupal version, and Apache version. This seems to be installation-specific, as it is not happening for many users, so it is probably a local misconfiguration if it is still occurring. Again, please do not reopen closed issues. Create a new one and indicate if you believe it is a regression of a previous issue.
Comment #66
piotrdesign commented:
I turned my compression off and www.granturismo.pl still doesn't work. I got redirected from this thread: http://drupal.org/node/86360
Comment #67
piotrdesign commented:
I turned my compression off and www.granturismo.pl still doesn't work. I got redirected from this thread: http://drupal.org/node/86360
Comment #68
piotrdesign commented:
My server got updated, and my site started loading once again. :)
Comment #69
Caleb G2 commented:
I've made a new post, per the request above, in order to reopen this issue, which is still present in Drupal 5.1 as of today. Please add any new comments about this issue there:
http://drupal.org/node/121820