I am running with Postges (which seems to be the reason behind many of my problems) and the standard template. After some while doing things (not sure what causes it entirely yet - see below) I try and display the home page as an anonymous user, I get a couple of non ascii characters appear in the browser and nothing else.
Deleting the contents of the cache table clears the problem.
Once it has started to occur - any display of the home page causes it to occur other than immediately after deleting the cache. Not sure if this is a timing thing or not.
At the time of simplest occurance (just selecting home from the primary menu when set up as an anonymous user), the following three 'cid' records are in the cache.
filter:1:9dd69af2aee6060e935bf68b4d5e6457
archive:calendar:28-8-2004
http://drupal.home/
Comment | File | Size | Author |
---|---|---|---|
#53 | drupal-head-cache-10407-cb_9.diff | 20.45 KB | Cvbge |
#51 | drupal-head-cache-10407-cb_8-define.diff | 20.44 KB | Cvbge |
#48 | drupal-head-cache-10407-cb_7.diff | 20.13 KB | Cvbge |
#43 | drupal-head-cache-10407-cb_6.diff | 19.46 KB | Cvbge |
#42 | drupal-head-cache-10407-cb_5.diff | 26.62 KB | Cvbge |
Comments
Comment #1
AlanChandler CreditAttribution: AlanChandler commentedI think I have a handle on what the problem is, although I don't yet know the solution.
The "data" column of table "cache" has the data in it stored in compressed form. At the moment is declared as type TEXT. This seems to be only a character oriented type and might be causing postgres to do some form of conversion back to a string. At least some of the data in the database appears to be uncompressed strings.
I thought the solution to this was to convert the data column to type bytea, but just tried this, and although I got a different result it was not correct. Since the manual talks about potentially escaping some bytes in this form too - I am not so sure.
The result I got, was a blank screen with the following (document source and what was seen in the browser).
\037\213\010
Comment #2
AlanChandler CreditAttribution: AlanChandler commentedI have changed bootstrap.inc to store the data on disk in an uncompressed form, and only compress it for transmission to the browser just before outputing it.
I have no idea whether this is an acceptable approach - but it now seems (with a few minutes testing to see that it works) to work fine under Postgres.
Patch attached.
Comment #3
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedI don't like your patch because I put some effort into getting the gzipped cache into core. ;)
The idea is not to save bandwidth (we could use mod_gzip then) but to remove the overhead in creating the gzipped pages anew for each page view.
Can you give me information on both the browser and the server setup? Especially interesting would be anything that uses gzipping techniques itself.
When trying my patch I discovered that I would get problems when changing the data column from "text" to "blob" in MySQL.
Comment #4
AlanChandler CreditAttribution: AlanChandler commentedThis is all at home.
I am running the debian linux (sarge) with apache2, php4 and postgres as the server. Client end is debian unstable. I generally am using konqueror as the browser, although on this one I also tried it with mozilla firefox. I have access to IE under windows 98 or XP if needed.
Comment #5
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedI've prepared a patch which excludes PostGreSQL users from this feature. I'd welcome a better solution though.
Comment #6
AlanChandler CreditAttribution: AlanChandler commentedWhy don't you just merge my patch and yours. That way, we still get to transmit compressed stuff when we can. Here is an attached patch that shows that.
Comment #7
AlanChandler CreditAttribution: AlanChandler commentedI just realised the real implications of your previous comments. I guess what you are saying is that it is more important to save processing time on the server than to save bandwidth by zipping up the contents on the fly just before transmission (which could be an expensive operation if the server has to do it for all users).
In which case, ignore my last patch.
Comment #8
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedI had a similar proble with mysql yesterday. This wasn't a php or mysql error, but a faulty database. The anonymous user was missing. Can you check if you have a uid = 0 user in your users table and that he has in the right role "anonymous user" (rid = 1)?
For some reason people lose this anonymous user occassionally. I'd really like to know why.
Comment #9
AlanChandler CreditAttribution: AlanChandler commentedYes, I have uid 0 in my database - although he doesn't have a name. users_roles links uid=0 to rid=1
Comment #10
adrian CreditAttribution: adrian commentedAssigning to myself.
Comment #11
adrian CreditAttribution: adrian commentedI have a working fix for this, but it's based on my latest postgres compatibility patch (most notably the new includes/database.pgsql.inc). I think i'll roll it into the other patch , but I first want to get the ok from dries.
Firstly, i changed the field to bytea, then what I did was to create a new set of db_api functions :
And then i updated includes/bootstrap.inc to run the data through those functions on save / output.
I also added the functions to database.mysql.inc, but they just return the input. I still need to do the updates.inc change for this, and there prolly needs to be more testing with various versions of postgres/php.
Comment #12
adrian CreditAttribution: adrian commentedlooks like i spoke to soon. i am getting close to the solution tho.
Comment #13
adrian CreditAttribution: adrian commentedThere is a patch that fixes this on the Postgres Compatibility Update issue.
Comment #14
JonBob CreditAttribution: JonBob commentedComment #15
adrian CreditAttribution: adrian commentedThis part of the update patch didnt make it in completely since it caused problems with mysql.
Here is a fix that properly escapes the blob field for mysql aswell as postgres.
I also neglected to add the new updates.inc function that recreates the cache table with a bytea type on the data field.
Comment #16
Dries CreditAttribution: Dries commentedReally, I don't like the fact you have to avoid using the %s and %d directives. It is prone to errors.
Comment #17
adrian CreditAttribution: adrian commentedThe problem with using %s and %d, is that the already escaped data gets escaped again, which kills postgres.
When attempting to use %s and %d for everything BUT the data field, i ran into cases where there was a '%' in the data field, which once again .. crashed sprintf completely.
I was discussing this on #drupal, and steven suggested i run the db_encode_blob through str_replace('%', '%%', $data) to make it sprintf safe. Hence, apart from the data field, all the other fields are using %s/%d in my latest patch.
There was a suggestion that we implement our own sprintf for use in db_query , whereby we can have %s , %d and %b (blob) types, and then we can even add the "'"s to strings automagically. This would however require us to update every db_query statement , so it was shelved as an idea to fix this for drupal 4.6.
Comment #18
adrian CreditAttribution: adrian commentedComment #19
Dries CreditAttribution: Dries commentedStill, it is this kind of special cases that - when added up - make it impossible to understand the code. I'll give it some more thought.
Comment #20
Steven CreditAttribution: Steven commentedI've coded a replacement db_query here:
http://www.acko.net/dumpx/dbquery.phps
As Adrian already said, it automatically adds quotes to blobs and strings, which means we could make queries a bit simpler and avoid double-quote usage. Making quotes part of the %s/%b placeholders really makes sense from a db abstraction point of view: you just signify "I want a string" without worrying about how it gets included.
It doesn't do any of sprintf's advanced formatting (like '%02d'), but I don't think they're used anywhere.
Given that it's 100% sure this code will be cleaned up for 4.6, I think it's okay to include that messier code for now.
Comment #21
adrian CreditAttribution: adrian commentedYes, what steven said. Essentially, I don't like the solution either, but it's a critical fix, and it's a special case of exactly 2 queries.
Here is an updated patch with a comment stating that.
Comment #22
adrian CreditAttribution: adrian commentedThis patch created the http://drupal.org/node/10407 bug, when using mysql.
Hence it's not ready for primetime yet.
Comment #23
joemc91 CreditAttribution: joemc91 commentedI just ran into this problem tonight and have not been able to diagnose it. My website (www.joepilot.net) is not updating the cache properly. It updates it, but unserializing the data fails, so somewhere in the encoding process, the data is broken.
My assumption is that the addslashes command in check_query breaks it, but I cannot imagine why since I've been using the program for a few days without error till this afternoon.
This problem only began occuring earlier today. It also may be that I added a bad (read single quote) into the variables section. This is actually the only cache item that is broken right now. I can also add nodes, but I cannot delete them (?). I had this problem a few weeks ago, but I don't remember how I fixed it. Thanks for your help.
Comment #24
adrian CreditAttribution: adrian commentedThis has been fixed in the latest postgres update.
the problem was that serialized data was being butchered when inside the query , due to table prefixing (ie: s:{1:'string'} was turned into s:1:'string'. ... thanks steven).
This doesnt place the query in there, but uses addcslashes to make the binary data safe for a text field (not bytea anymore).
Some storage space is being wasted, but until we can implement a proper solution (for 4.6), i feel this is the best we are going to get.
Comment #25
(not verified) CreditAttribution: commentedI have been experiencing this problem on my setup as well. Although it is a little different configuration than yours.
I am running:
MacOS X 10.3.6
Apache 1.33 (w/ PHP 4.3.9)
MySQL 4.0.18
Drupal 4.5.1 (happened LESS frequently in 4.5.0 btw)
Every once in a while, (I believe) it caches a blank page, and the only solution at this moment is to run a "delete from cache;" statement in the mysql console.
Also, when I had zlib enabled (via php.ini), it would cache a page, and display wierd ascii characters (as if editing a compiled program with a text editor). I had disabled zlib (via php.ini) in order to prevent this, but it seems the real problem is with the cache system.
This morning, I disabled the cache in hopes to prevent this further outage. If anyone has any suggestions, please make them 8) ..
Comment #26
Cvbge CreditAttribution: Cvbge commentedHello, the problem with cache still exists, both in 4.6 and cvs :(
I'll be talking about postgres of course :)
Let me describe the problem again: function page_set_cache() compresses the page and stores it in the cache table. The data stored is a normal binary data and as such it has to be stored in a 'bytea' type column.
Before storing it has to be escaped, for example using pg_escape_bytea() function. After retriving from a db it needs to be unescaped with e.g. pg_unescape_bytea().
page_set_cache() uses cache_set() to save the data, which in turn uses normal db_query() to insert it... And here's the problem:
INSERT ... VALUES ('$data')
because db_query() runs db_prefix_tables() on the query and {} will be removed thus breaking the datadb_query("INSERT ... VALUES('$s'), $data)
because db_query() runs db_escape_string() on it's arguments which will also break the dataSo, how can this be overcome? I see following solutions:
As I read this caching mechanism was intended to save cpu cycles by not compressing the page every time. So I don't know if encoding the data won't defeat the purpose of the cache...
What do you think?
Comment #27
Cvbge CreditAttribution: Cvbge commentedI'm not sure if you can do such conversion....
Comment #28
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedCvbge: I'd prefer to allow %b for db_query. What are the difficulties you see?
Comment #29
Cvbge CreditAttribution: Cvbge commentedHere's a patch implementing the first way (putting $data in the query body). Probably mysql-part needs some changes so it escapes the $data correctly.
After applying the patch everything works for not anonymous users, as previously, because cache is not used for them.
Now for anonymous users:
a) client does not support compressed content (e.g. wget)
- When getting a page for the first time I get
<b>Notice</b>: Undefined index: messages in <b>.../includes/bootstrap.inc</b> on line <b>772</b>
at the begining of the page, then normal page- When getting a page for the 2nd or further time (a page is cached) I get:
And also after the page:
<b>Notice</b>: Undefined property: cache in <b>.../includes/session.inc</b> on line <b>45</b><br />
b) if client does support compressed content:
- when getting a page for the first time: everything is ok
- when getting for the 2nd or further:
Then goes compressed html page
Comment #30
Cvbge CreditAttribution: Cvbge commentedThis one seems to work, for postgresql at least.
Comment #31
Cvbge CreditAttribution: Cvbge commentedFixes: added %f (search.modules uses it) and %%.
Comment #32
Cvbge CreditAttribution: Cvbge commentedThis version uses preg_replace_callback. It looks nicer then the previous one but have ugly $args initializing...
Comment #33
Cvbge CreditAttribution: Cvbge commentedA similar problem, it seems, exists in mysql: http://drupal.org/node/16998
Comment #34
Cvbge CreditAttribution: Cvbge commentedI've done some speed tests, similar to those for "extend db_query".
My interpretation of the tests is:
1. there is not much difference between callback and foreach versions. Thus I'd suggest callback version as it's nicer to read.
2. patched version is a bit slower then not-patched, but it's not much and, as I wrote in http://drupal.org/node/17656#comment-47980, should not be a reason to drop this patch.
Results:
callback version, mysql:
callback version, pgsql:
foreach version, mysql:
foreach version, pgsql:
original version, mysql:
original version, pgsql:
Comment #35
Cvbge CreditAttribution: Cvbge commentedImproved documentation, fixed db_queryd().
Comment #36
Cvbge CreditAttribution: Cvbge commentedClicked submit too fast :(
Comment #37
Cvbge CreditAttribution: Cvbge commented_db_query_cb() was missing return; for init code. This should fix "unexplained" default calls.
Comment #38
Cvbge CreditAttribution: Cvbge commentedThe weird notices I was getting occur also on fresh cvs with mysql. No patches, no postgres. Just vanilla drupal HEAD with MySQL.
I'm not sure how to reproduce this, but I have with cache and clean urls enabled and I access front page with Links or Opera as anonymous user I get:
When I clicked on a node I didn't get any notices for the first time, when I refreshed I got the notice about time index. The same when clicking on "Create new account" (user/register). This suggests that this happen when the page is loaded from cache.
What is weird is that I don't see this messages when I access the page from Firefox!
Disabling clean urls does not change anything. Applying my patch does not change anything.
Disabling cache of course help.
As I have proven 'weird notices' are not connected with postgresql nor with my patch.
Comment #39
Cvbge CreditAttribution: Cvbge commenteddb_query_temporary() might also need to be changed to support %b, but that should be easy, it's another clone of db_query().
I'll wait with the patch untill prefixing temporary tables is decided, as those 2 patches would change the same lines.
Comment #40
dopry CreditAttribution: dopry commentedI did a cursory test using mysql with a clean head install... Set cache on, loaded some lorem ipsum with devel module... logged out... didn't see any problems as anonymous, or logged in as an autheticated or super user..
did not that I the login block doesn't change after login, until another refresh occurs, but was consistant after removing patch so its probably user.module related... I'll head off that way now...
.darrel.
Comment #41
Cvbge CreditAttribution: Cvbge commentedI've tested that this bug also happens on fresh cvs with mysql as a db... also I've seen previous bug reports about this bug.
Comment #42
Cvbge CreditAttribution: Cvbge commentedFinal patch I hope.
Changes:
Becaus of new files you need to apply the patch with
patch -p1 < thepath.diff
I've done some simple tests for mysql and postgresql and I'm going to test it with generate scripts from devel module, but would be good if mysql people could test it too.
Comment #43
Cvbge CreditAttribution: Cvbge commentedPut common functions back in separate files.
Comment #44
Cvbge CreditAttribution: Cvbge commentedI've done some performance testing for 4.7: with cache/without cache/with patch/withoutpatch with concurncy 1 and 5. Raw results are at http://cvbge.org/mb/
The test was done on my local machine, linux, 256M memory, Celeron 850MHz, Apache 1.3.33, MySQL 4.1.11, no other activity, 3 runs of each test.
Preparation:
1. download fresh cvs, put into 2 dirs, patch one of them
2. create 1 admin user
3. add, enable devel module
4. run generate-taxonomy.php, generate-contants.php
5. enable all modules (except legacy)
6. enable all blocks
7. grant all rights for anonymous user (except rights for devel module)
Quick comparison, only concurency 1:
Comment #45
Cvbge CreditAttribution: Cvbge commentedForgot to add: only front page was tested; PHP version 4.3.10.
Comment #46
Dries CreditAttribution: Dries commentedAnyone that disagrees with this patch, or the numbers reported. I think we have no choice but to commit this patch, unless the performance regression is not acceptble.
Comment #47
Dries CreditAttribution: Dries commentedI'd reorder the case statements in the switch-block such that the most common path is the fastest path. %d should probably at the top, followed by %s. %% shouldn't be first. It can cut down the number of comparisons, and slightly improve performance.
What is this code for?
Comment #48
Cvbge CreditAttribution: Cvbge commentedSetting to RTBC, maybe more people will look at it.
I've written small script and counted all %foo modifiers, here are the results:
If anyone wants the script:
I've changed the order, but I don't expect big performance boost... I think I'll do new tests again tonight.
The code you cited allows to pass an array of all arguments. The same behaviour exists in current code, only coded differently, i.e.:
I've changed some documentation to be like the old one (different functions had differently worded documentation; not all has been updated)
Comment #49
Cvbge CreditAttribution: Cvbge commentedI've compared the cb_6 and cb_7 performance and seen no difference between them... Raw results are on http://cvbge.org/bm2/ if anyone is interested.
Comment #50
Dries CreditAttribution: Dries commentedI guess that means the overhead is due to preg_replace_callback.
Instead of writing
We can probably write:
It's not tested but it looks like it might be slightly faster, and a tad more elegant? Depending on PHP's internal implementation of
count()
(does it store the size in the array header?), this might or might not make a small difference.Maybe
"/(%%|%s|%d|%f|%b)/"
(double quotes) can be written as'/(%%|%s|%d|%f|%b)/'
(single quotes)?Comment #51
Cvbge CreditAttribution: Cvbge commentedStupid httperf...
Comment #52
Cvbge CreditAttribution: Cvbge commentedRaw ab logs at http://cvbge.org/bm3/ (strpos and reference - additional versions that check for % before calling preg_replace and pass $args to preg_replace as reference)
Comment #53
Cvbge CreditAttribution: Cvbge commentedDRUPAL_DB_QUERY_REGEXP -> DB_QUERY_REGEXP, undo one-liners
Comment #54
Dries CreditAttribution: Dries commentedCommitted. Thanks.
Comment #55
(not verified) CreditAttribution: commentedComment #56
chx CreditAttribution: chx commentedIt's more than a bit late, but we have sacrificed quite some speed here -- for MySQL users, the majority of Drupal users, there is no gain from the new system.
I will submit a patch which removes %b support and moves the necessary SQL statements cache_set / cache_get to ??sql.inc.
Comment #57
beginner CreditAttribution: beginner commentedComment #58
magico CreditAttribution: magico commented@chx: any news about this? Is this a 4.7 only bug? We have some new cache system things in HEAD...
Comment #59
magico CreditAttribution: magico commentedstrange... this was already set to HEAD! But the submit failed...
Comment #60
Gábor HojtsyThe performance issues are handled here for 6.x: http://drupal.org/node/171728
I set this fixed because the original issue was fixed here.
Comment #61
(not verified) CreditAttribution: commented