With the upcoming handlers patch we will be able to delegate page caching to a different mechanism. But even now, if you use, say, memcached you do not need the database for page caching. That, combined with the already-in session cookie trickeries allows us some simplification. Note that I took away the skipping of boot/exit hooks from the admin UI, those who know enough to make that choice can remove a single # from settings.php.
Comment | File | Size | Author |
---|---|---|---|
#156 | page.patch | 666 bytes | floretan |
#153 | page.patch | 5.34 KB | chx |
#151 | page.patch | 4.3 KB | chx |
#149 | page.patch | 16.6 KB | chx |
#144 | page.patch | 15.46 KB | chx |
Comments
Comment #2
catchsubscribe, looks nice on first thought. Can we have an upgrade path for the aggressive caching variable?
Comment #3
Dave ReidComment #4
sunchx_: TBH, I never used CACHE_AGGRESSIVE on any production site, because the code in bootstrap looked scary to me (though I used Boost module, which is probably more or less the same), so I really like the removal/replacement of it.
chx: I guess the default should be FALSE here, right? function page_get_cache($retrieve) {
chx: That's the reason why HEAD fails to install
Is it the patch I've applied, or does the password strength validator not work on HEAD during install?
wooohooo! Fatal error: Call to undefined function module_load_all() in \includes\common.inc on line 2979
Why on earth would module.inc not be loaded during install?
chx: Something else must go wrong. Cached pages for anonymous users take 400-500 ms per request. Without the patch, it's ~100 ms.
chx: The more I work on this patch and face *awkward* fatal error messages, the more I think we need to load module.inc unconditionally in D7 now. module_load_all(), module_implements(), etc. all lead to fatal errors in unexpected scenarios/failures.
chx: Luckily, simply adding require_once DRUPAL_ROOT . '/includes/module.inc'; to index.php seems to fix all of this awkwardness.
And, yes, I'm continueing this monologue until chx reappears.
chx: It seems we still need something like early_page_cache, since the order of bootstrap phases is wrong now. variable_get('cache') does not work in second phase, since variables are not yet set up. It probably would, if $conf['cache'] was defined in settings.php, on high-performance sites.
chx: For whatever reason, my anonymous user still gets a $_COOKIE[session_name()] after logging out.
Status quo:
- Introduced page_deliver_cache(), since it's needed in more than one bootstrap phase now.
- Cached pages are served not from cache. 500ms instead of 100ms. Most probably because my anonymous user gets a session_name() cookie for whatever reason. Deleting it does not help.
- Very nice patch.
Comment #5
sunComment #7
Dave ReidDon't think sun meant to remove the tags, we just cross-posted.
Comment #8
chx CreditAttribution: chx commentedsun, sorry, i worked from mine -- but thanks for pointing out my failures. I have moved more code around: the DRUPAL_BOOTSTRAP_VARIABLES stage is now before SESSION, and the access denied code has been moved to drupal_is_denied. I took care to initialize variables as well and made sure module.inc is loaded.
Comment #9
chx CreditAttribution: chx commentedBad status fixed.
Comment #11
chx CreditAttribution: chx commentedThe testing bot is right. However, I have session and bootstrap tests passing with this version. I now use a drupal_loggedin cookie instead of messing with the session cookie. Also, based on Moshe's review I cleaned up drupal_is_denied and a bit page_get_cache .
Comment #13
chx CreditAttribution: chx commentedNow, the patch above forgot to swap the bootstrap order for real. It was not easy to get that fixed. But, now we skip the session system for real when serving a cached page. Win. I needed to love quite a bit the session test to reflect the changed situation. I have pulled page_get_cache apart a bit so that you do not need to re-call page_get_cache to figure out whether page caching is on or not.
Comment #16
chx CreditAttribution: chx commentedNote that this patch is smaller, cleaner and yet, I think, now all tests will pass. At least bootstrap, session and comment pass for me.
Comment #17
chx CreditAttribution: chx commentedRemoved another unnecessary new construct from bootstrap.inc
Comment #18
moshe weitzman CreditAttribution: moshe weitzman commentedQuite a nice cleanup. We had not really dusted off drupal_bootstrap() in a couple of releases. I only found nits ...
Comment #19
chx CreditAttribution: chx commented1,2,3 are fixed. The last I disagree with, the purpose of the cookie is to forbid caching. The value is irrelevant. It's even misleading as you obviously wont have this cookie set to 1 when you first visit the site and receive a cached page.
Comment #20
moshe weitzman CreditAttribution: moshe weitzman commentedBot is happy, so RTBC.
Just to summarize. There is no new functionality in this patch - just a cleanup of bootstrap. The new cookie matches what boost and file based caching do already. It will be good for all these modules to rely on a cookie set by core (i.e. they can remove code when this goes in). I also like the UI improvement where we get rid of an inaccurate and scary warning about aggressive caching. Enabling of that feature is now left for settings.php. I say inaccurate because it says "The following enabled modules are incompatible ...". Thats not true. Some of their features are incompatible but they can still deliver a lot of value (e.g. devel). The whole situation is too complex to explain in the admin UI so I approve of the move of aggressive to settings.php
Comment #21
catchStill no upgrade path for the aggressive caching variable as mentioned in #2.
What happens if I'm logged in to a site, then it upgrades to Drupal 7, and I visit the site gain? As far as I can see, I won't have the cookie set, because I haven't logged in recently, so I'll be served cached pages. When I tried boost before I had the same problem, and the only workaround I found was to TRUNCATE sessions in order to force a new login for everyone. I looks like we're setting the cookie whether page caching is enabled or not, so this looks like only an upgrade issue though.
Comment #22
chx CreditAttribution: chx commentedI have changed the session cookie name ensuring noone who logged in during D6 has a session cookie thus noone is logged in. I have added a system update. Thanks catch!
Comment #23
chx CreditAttribution: chx commentedOh my, i forgot to return $ret.
Comment #24
chx CreditAttribution: chx commentedFixed a typo: "noone" should be "no one". English language and logic does not mesh (observe that someone and anyone is one word).
Comment #25
catchI've read through this and have no further nitpicks.
One important point which needs recording - is that with this patch applied, simply being logged in as user/1 won't allow you to update your site because your session will be invalid, you also need to set $update_free_access = TRUE in settings.php - however, in #304163: Allow update.php to regenerate settings.php we'll need a writable settings.php for the update to work at all (without manually editing the $databases array), so chx suggested we can set the variable there and then re-set once the upgrade is complete, which works for me - and would mean the permissions on the file and that setting are only done for the course of upgrading the site, and we're not asking people to manually change that particular setting themselves.
So, if this patch is applied before the settings.php fixes, we need to incorporate that logic in there. If that one goes in first, we need to do it here. But since they're drastically different issues, marking RTBC. Overall this is a great cleanup, will make contrib modules like boost a lot cleaner, very nice.
Comment #26
sunMinor clean-ups and fixes. Also renamed some stuff.
Comment #27
Dave Reid_page_cache_set_cookie() does not seem like a good name for what the function does. page_disable_cache() perhaps is better? I had implemented a simliar solution in #54238: page cache might show messages with page_cache_allowed(). Putting back for CNR until testing bot comes back.
Comment #28
sunRenamed to _page_cache_set_disable_cookie()
Comment #29
rszrama CreditAttribution: rszrama commentedMinor issue at the bottom of the patch -
$conf['page_cache_fire_hooks']
was renamed to$conf['page_cache_invoke_hooks']
, but the comment above it still references page_cache_fire_hooks.Had a chat with chx in IRC about modules that need to force cache exclusion for anonymous users on various pages. An example is Ubercart currently clears cached cart, cart/checkout, and cart/checkout/review pages to preserve a dynamic checkout process for anonymous users. It's doing that using the (old) CacheExclude method of clearing the cached page, tho chx pointed out I should update to the newer $conf manipulation... but I digress. This patch looks great for this use case, since you could replace that with the
_page_cache_set_disable_cookie();
call like drupal_set_message() and user_authenticate_finalize().The only thing I'd wonder is since this function is being called by various modules, should it keep the _ prefix? I was under the impression the function naming conventions would recommend that prefix only for functions that aren't intended to be used by other modules (though I know it was mostly a faux-namespacing thing).
Other than that, looks great. : )
Comment #30
sunFixed the PHPDoc in default.settings.php.
Comment #31
chx CreditAttribution: chx commentedI changed the name of the _page_cache_set_disable_cookie to page_cache_set_disable_cookie. No change otherwise.
Comment #32
catchI had a look through and making that a public function appeared to be the only thing missing, so RTBC (again).
Comment #33
c960657 CreditAttribution: c960657 commentedI may not fully understand the patch, so please bear with me. I have some questions:
1. AFAICT the drupal_nocache cookie is only relevant for anonymous users and means "do not cache the next request". If that is the case, it suggest stating this explicitly in the Doxygen comment for page_cache_set_disable_cookie(). Why is the cookie set from user_authenticate_finalize() that is only called when the user has just become authenticated? And why it it only deleted for anonymous users in _drupal_bootstrap()? Some elaborating code comments would be nice.
2. If drupal_set_message() is called on an uncached anonymous page view prior to the call to drupal_render_page() in index.php, the page is saved to the cache with the message in it. I assume this isn't intended? page_cache_set_disable_cookie() currently disables the cache for the next request, but it would be useful if there was a way to disable the cache for the current request as well.
3. Now that the drupal_nocache cookie has a drupal_ prefix, why not add this to the session cookie too now that it needs to be renamed?
4. If $conf['cache'] is set to CACHE_NORMAL in settings.php, the user must also specify $conf['cache_inc'], right? And he must use $conf['blocked_ips'] instead of the {blocked_ips} table, if he wants block IP addresses? If that is the case, I think the implications of setting $conf['cache'] should be elaborated in default.settings.php, or the setting should simply be removed from the file (it is a superuser feature).
5. There is still a textual reference to “The aggressive cache” on admin/settings/performance.
Comment #34
sun1) Not sure.
2) Sounds sane.
3) The session cookie does not need to be renamed. Otherwise, uid 1 would be logged out during a Drupal upgrade, I guess.
4) No, Drupal will use its default implementation in includes/cache.inc by default. Regarding blocked_ips I'm not sure.
5) Removed in this patch.
Leaving PNW due to 2)
Comment #35
chx CreditAttribution: chx commentedI will get back to 2) but I wanted to address 1) -- with the new page flow the session system is not on when page caching is decided so you need the cookie permanently while you are logged in otherwise you will get cached anonymous pages.
Comment #36
catchWe should also change the text in admin/settings/performance to say on/off instead of 'normal'. Maybe a checkbox instead of radios?
Comment #37
sunI thought the same - but turning it into a checkbox would mean that no other module could add another cache method? (dunno, if that still makes sense, but I think that is what modules like Boost depend on currently)
Comment #38
catchDid some benchmarks of this yesterday.
It's about 3% faster than normal page caching in HEAD, and it gets us back to where we were in Drupal 6.
D6
200 requests/second
D7 HEAD
195 requests/second
D7 patched:
200 requests/second
However I got a bit of variation in the results for all of these (the above are the median values from multiple benchmarks) - so it could really use someone else running the same thing. When dbtng got in, page caching was somewhere < 5% slower in benchmarks, so it seems somewhat consistent with that anyway.
Comment #39
chx CreditAttribution: chx commentedAddressed 2)
Comment #40
chx CreditAttribution: chx commentedMeh.
Comment #42
chx CreditAttribution: chx commentedAdded a comment, removed an unused static and fixed system_update_7019 which caused the bot to fail.
Comment #43
chx CreditAttribution: chx commentedEnhanced a comment in bootstrap.inc because Damien thought the code flow is not clear enough.
Comment #44
Damien Tournoud CreditAttribution: Damien Tournoud commentedCleared by the testing bot, carefully reviewed with chx on the IRC (which resulted in #42 and #43). That's a green light from me.
Comment #45
c960657 CreditAttribution: c960657 commentedIf I add
$conf['cache'] = CACHE_NORMAL
to settings.php as suggested in _drupal_bootstrap(), Drupal dies with Fatal error: Call to undefined function db_query() in /home/chsc/www/drupal7/includes/cache.inc on line 32. Here is the stack trace:You didn't comment on #33, item no. 3. Is there any reason the naming of the two cookies (drupal_nocache and SESS7xxxxx) is so different?
Comment #46
chx CreditAttribution: chx commentedWell _drupal_bootstrap does not say that. default.settings.php said "To use page caching without touching the database...". I amended that comment to elaborate on "without touching the database".
Comment #47
chx CreditAttribution: chx commentedAnd the session name, PHP calls its sessions SESSfoo I followed that. I am not particular to cookie names, really.
Comment #48
c960657 CreditAttribution: c960657 commentedI talked to chx on IRC about the $conf override in settings.php. I understand it better now, so never mind my comment.
Comment #49
chx CreditAttribution: chx commentedTo make sure it's in the issue, c960657 was concerned about possible breakage but I made it clear that there are many ways even with D6 to break Drupal with a careless override (make a typo in the name of session_inc...). Also found that I made a typo when rerolling against HEAD, I wrote "CLI" instead of "PHP CLI".
Comment #50
Damien Tournoud CreditAttribution: Damien Tournoud commentedThe last changes are minor, I don't revise my opinion of the patch: still ready to go in.
Comment #51
Dries CreditAttribution: Dries commentedI'm not sure this is a clean-up. It needs more work.
1.
Please rename to SESSION7. We don't usually abbreviate variable names. Also extend the code comment describing _why_ and maybe add a TODO if this is something we need to remove in the future.
2. I think drupal_is_denied() just got a lot messier. It is now doing multiple things, i.e. it is optionially generating a header. This is unexpected for an "is"-function.
3. I'm not sure but I _was_ using the aggressive page caching.
Comment #52
chx CreditAttribution: chx commentedComment #54
chx CreditAttribution: chx commentedUsual update function clash fixed by in-place patch editing ie absolutely no change just keeping up w/ HEAD.
Comment #56
chx CreditAttribution: chx commentedKeepin ' up with HEAD.
Comment #57
Dave ReidComment #58
chx CreditAttribution: chx commentedIt was RTBC before Dries set it to CNW and I addressed his concerns and passed the testing bot.
Comment #59
Dave ReidSorry, I was just trying to help since you set it to closed when it should not have been.
Comment #60
Dries CreditAttribution: Dries commented- "Note: We" should be "Note: we".
- "If set, then one of CACHE_DISABLED or CACHE_NORMAL. If NULL, then it will be looked up in the database." These 3 states are not very consistent. NULL is different from CACHE_DISABLED, which is unexpected, and could be mode more clear with better variables and better naming. If there is a CACHE_NORMAL, is there a CACHE_SPECIAL too? This looks like a partial clean-up.
- Personally, I think the use of 'drupal_nocache' is a bit messy. I'm not sure I agree that it is a clean-up so I'd like to understand why you think it is better. (By the way, I'm not sure I fully grok this clean-up. The description in the issue is rather brief and cryptic so I'd benefit from a good description/explanation.)
Comment #61
catchComment #62
moshe weitzman CreditAttribution: moshe weitzman commentedPatch does not apply cleanly anymore. Hope someone can fix and reroll.
Expanding on my prior summary. Hope I capture it correctly. Please improve upon it ...
Comment #63
chx CreditAttribution: chx commentedThat was not easy to reroll...
Comment #65
moshe weitzman CreditAttribution: moshe weitzman commentedTest bot unhappy. chx to the rescue?
Comment #66
chx CreditAttribution: chx commentedLet's try this.
Comment #67
catchWhy not truncate the sessions table instead of renaming SESS? Or if that won't work, we could add an update to truncate the sessions table anyway since it's a good excuse to clear it out.
I'll try to do a thorough review later.
Comment #69
chx CreditAttribution: chx commentedGood idea about truncating session.
Comment #71
chx CreditAttribution: chx commentedComment #72
Dries CreditAttribution: Dries commentedMy comments have not been addressed yet. The following is still confusing me:
I'd still like to understand why we want to use that cookie? It seems like a hack.
I don't necessarily consider this a UI improvement either.
Comment #73
catchI don't really get the NULL comment either. It looks to me like it allows you to set CACHE_NORMAL in $conf in settings.php - then you can serve the cached page without ever needing to do a variable_get()? If that's the case, it definitely needs a more verbose comment.
The cookie is used for modules like boost to check for anonymous vs. authenticated users in mod_rewrite rules so they can skip PHP entirely before serving cached pages from files.
Looks like this:
Having it in core saves a lot of code duplication with similar alternative caching backends. With our default database caching it means we can avoid loading session.inc on cached page views - as far as I can see the overall code footprint here should be a lot smaller. If we move hook_boot() and hook_exit() to their own includes, then it'd be only a handful of includes + boot/exit to serve the entire page in terms of code weight.
Comment #75
sunI don't really see why this patch should fail. Works in manual testing.
Re-rolled and (minimally) reworked some comments. Also removed changes to lines unrelated to this patch.
Comment #77
Damien Tournoud CreditAttribution: Damien Tournoud commentedNow that I think about it, the drupal_nocache cookie seems like a hack. What we need is to make sure that Drupal is fully compliant with standard HTTP caching mechanisms.
The current solution seems wrong and unnecessary:
- wrong because _page_cache_can_be_set() says that any page served to an anonymous user can be cached (regardless of the session data this anonymous user can have)
- unnecessary because we now start a session on demand: that means that's it's enough to check the existence of the session cookie instead of
drupal_nocache
when determining whether we can serve a page from cacheBy the way: this caching strategy also breaks language negotiation (even if it was also the case in D6), as described in #339958: Cached pages returned in wrong language when browser language used.
The fact is we need to store several versions of a page at a given URL. Both the cached page and the specific headers that were used to generate the page in the cache (as sent by the
Vary
header) needs to be there:Vary
headers we stored from the request URLVary
header combinationsComment #78
chx CreditAttribution: chx commentedDamien, this is fine but then whenever we destroy the session we need to destroy the session cookie too.
Comment #79
chx CreditAttribution: chx commentedLets see that after the mod_deflate patch this passes.
Comment #80
chx CreditAttribution: chx commentedI think re-testing is broken.
Comment #82
Dries CreditAttribution: Dries commentedI spent some more time with this patch tonight, studied Boost a bit more, and dug into the archives to refresh my knowledge about the file caching that was supported by core (originally worked on my Jeremy). At the end of the day (literally), I like what this patch does. However, it seems like DamZ's comments in #77 are not addressed (and neither is mine).
I also wonder if we should add a simple file cache backend in core. This would allow us to write tests against this, and make sure that things like language negotiation actually works. I think everyone, including, memcached backends would benefit from these additional tests. Furthermore, it would also help convince people that Drupal scales, which a lot of people that I talk to are still concerned about. Having basic file caching in core could be strategic from that point of view. File caching could provide a tremendous performance improvement and making it available to all sites could help Drupal from being hosted on shared hosting accounts to large multi-server setups.
Comment #83
catchJust to clarify - language negotiation is broken in both Drupal 6 and HEAD and always has been - this patch wouldn't break it, it just makes one of the approaches on that issue a bit harder to do. The latest version of that patch (skip caching pages when negotiation is used in the first place) will work fine for this - but it's not a perfect solution to the problem - the 'perfect' solution requires massive bootstrap refactoring to fix a very small percentage of page requests and I no longer think that's worth doing. We should probably continue discussion on that issue.
Basic file caching in core seems really interesting.
Comment #84
Damien Tournoud CreditAttribution: Damien Tournoud commentedI personally believe that it should not be Drupal's job to do the page caching. It's Drupal job to output correct headers (because it is the only judge of the freshness and cacheability of a piece of data), but the actual caching is more the job of the proxy servers (both forward and reverse) and of the client.
Drupal shouldn't even have to cache the whole page... only caching the last modification date of a particular (url, vary headers) should be more then enough. Reverse proxy servers like Varnish and nginx handle storing and retrieving cached data better then Drupal ever will. Shared hosting providers actually have (or should have) reverse proxy servers in front of their web server farms.
Removing the unconditional session starting for anonymous users was a great switch. Now we just need to clean that up. I don't believe we need anything else.
By the way: we are running a modified backport of #201122: Drupal should support disabling anonymous sessions on paris2009.drupalcon.org. With a very small max age of 10 seconds, we have more then 60% cache hit.
Comment #85
andypostAbout #73 look at http://drupal.org/node/101227#comment-1597328
only Konqueror this moment broken with gzip
maybe it's a good idea to store gziped content in one place
Comment #86
moshe weitzman CreditAttribution: moshe weitzman commentedAnyone know which web hosts offer reverse proxy servers? Those will be very desirable for D7 sites. I doubt that this practice is prevalent enough for us to rely on but I would love to be wrong about that.
Comment #87
Damien Tournoud CreditAttribution: Damien Tournoud commentedEarly release: this patch depends on #477038: Add session and security token support to DrupalWebTestCase.
This patch reverts part of #201122: Drupal should support disabling anonymous sessions: thanks to the simplification of the caching workflow we don't need the drupal_(get|set)_session() wrappers anymore.
Comment #88
mikeytown2 CreditAttribution: mikeytown2 commentedOnce I get a beta release of boost out the door I'm willing to take a peak at this issue; including converting boost to D7 #325813: Boost Port to 7.x.
Comment #89
chx CreditAttribution: chx commentedwell, this first, file caching second, boost third.
Comment #90
David StraussI'm generally against file caching in core. Boost-style caching is a poor substitute for real reverse proxy caching, and it has its own extensive set of problems, especially related to cache pruning and invalidation.
Comment #91
Dries CreditAttribution: Dries commentedRe @DamZ in 84 and re @David in 90: while I agree with you in theory, the reality is quite different. First, the majority of shared hosting accounts is not using reversed proxy caches. If they were, we'd have had much better support in core for a long time. Second, when you're running Drupal on a VPS, it is still easier to enable file caching (or database caching) than it is to setup and configure a reverse proxy. Third, when people evaluate different CMSes, they do some high-level testing -- very few setup reverse proxy caches before picking a technology. Drupal's database caching was (and still is) _critical_ for Drupal's adoption -- the fact that an out-of-the-box Drupal was faster than virtually any other FOSS CMS has been very, very important. So while I agree with your points, it doesn't change my opinion. This is not a technical decision, it's a strategic decision. What could change my opinion is the complexity of file caching.
The reality is that both of you are 100 times more educated about performance and scalability -- you're not like most people. I wish more people were like you guys though. ;-)
Re @chx in #89: I agree that we can focus on this patch first. However, if someone wants to provide basic file caching in a parallel issue, I don't think it would hurt -- quite the contrary, it would help us test and evaluate the changes in this patch).
Would love to get @c960657's take on this patch at some point. In the mean time, I'll have a look at #477038: Add session and security token support to DrupalWebTestCase.
Comment #92
David StraussTo clarify my concerns in #90, I've seen sites using Boost get crippled by the load of traversing and managing the files in the cache. Building an effective file cache that doesn't often become more of a problem than solution is a significant engineering challenge.
Comment #93
Dries CreditAttribution: Dries commentedWell, so #477038: Add session and security token support to DrupalWebTestCase was trivial.
@catch, do you think you could try @DamZ's patch in #87 and do some benchmarking to make sure there is no regression for database caching? I'm plan to dive into this issue some more later today.
Comment #94
Damien Tournoud CreditAttribution: Damien Tournoud commentedTime for a little bit of summary of what this patch is, now:
- first, it starts an output buffer unconditionally (for both cached and non-cached workflows)
- this allows us to revert the ugly part of #201122: Drupal should support disabling anonymous sessions: the session is started and / or destroyed at the very end of the process (in drupal_page_footer()) so we don't need the ugly drupal_(get|set)_session() wrappers anymore. Just write stuff into $_SESSION, the session will be automatically started for you
- in last resort, drupal_page_is_cacheable() is responsible for determining if a page can be cached or not. We now have a drupal_page_is_cacheable(FALSE), so we don't have to resort to hacks to prevent Drupal from caching a page (including my patented
$_SERVER['REQUEST_METHOD'] = 'NO-CACHE'
...)- drupal_start_session() is gone, and all functions now have consistent names and precise responsibility. For example: page_get_cache() is now named drupal_page_get_cache(). It used to "Retrieve the current page from the cache, if not found and the page is cacheable, start an output buffer", now it just retrieve the current page from the cache if possible.
- clean-up and unbreak the session tests, that made strictly no sense in some places
Here is a cleaned-up version. Let's see what the test bot thinks.
Comment #96
Damien Tournoud CreditAttribution: Damien Tournoud commentedSecurity tokens are broken, because the session_id() can change during a request lifetime. Let's set a fake session_id unconditionally when we decide not to start a session during bootstrap.
Comment #98
catchedit: cross-posted with Damz, these are from #87
ab -c2 -n10000 - normal page caching.
HEAD:
161.23 / 155.29 / 153.85 [#/sec]
Patch:
157.30 / 157.40 / 157.20 [#/sec]
While were' here, Drupal 6.
143.10 / 144.68 / 142.81 [#/sec] [#/sec]
When I did benchmarks in March, D7 came out a touch slower (while chx's patch restored the 5% or so drop), so not sure what we did there.
If someone's got a cleaner benchmarking environment than my laptop would be good to verify on that, but looks like there's not much in it in either direction.
Let's also see what the bot says. I'd really like a benchmarking bot by the way.
Comment #99
Dries CreditAttribution: Dries commented@catch, so if I interpret your numbers correctly, this patch (i) simplifies the page caching code (i.e. less code and cleaner code), (ii) advances the cache system (i.e. better support for file and memcached backend), (iii) makes for an easier UI (i.e. no scary aggressive cache), but (iv) comes at the cost of a small 2% performance drop. I think (i), (ii) and (iii) make (iv) worth it.
We don't have "real" proof for (ii) yet ... and we did loose the aggressive page caching in core. People using the aggressive page cache, have no good upgrade path built-in at this point.
PS: I thought you were our benchmarking bot? ;-) Thanks for running some benchmarks.
Comment #100
Damien Tournoud CreditAttribution: Damien Tournoud commentedThe test bot should like this version better.
Comment #101
Damien Tournoud CreditAttribution: Damien Tournoud commentedTo clarify: my latest patch attempts do not remove aggressive page caching, nor early cache. I think we can remove the early cache hook, which make little sense, but that's out of the scope of this patch.
Comment #102
Damien Tournoud CreditAttribution: Damien Tournoud commentedAlso performance is not the primary concern of those patches. Removing ugly code that got introduced recently and streamlining the cache process are the main priorities.
Comment #104
Damien Tournoud CreditAttribution: Damien Tournoud commentedThere was some debugging leftovers in the previous patch. Let's try this one instead ;)
Comment #106
Dries CreditAttribution: Dries commentedRough day, DamZ? ;-)
Comment #107
Damien Tournoud CreditAttribution: Damien Tournoud commentedOk, here is something that could actually work.
Comment #109
Damien Tournoud CreditAttribution: Damien Tournoud commentedAdd batch-specific tweak: we need to store the batch id in the session in order to guarantee that it will stay alive. I'm thinking we should drop the {batch} table and move everything to the session instead (that will be another patch).
By the way, it seems like the (limited) BatchAPI tests in HEAD succeed by accident: I'm pretty sure batches as an anonymous users are not really supported right now.
Comment #110
Damien Tournoud CreditAttribution: Damien Tournoud commentedWe discussed that with chx and agreed that my patch is a little bit different in its objectives then chx'. I moved the above patch to #477944: Fix and streamline page cache and session handling. Sorry for hijacking this issue, chx.
Comment #111
catchSeems like the status of this is postponed on that issue, so marking as such.
Comment #113
catchThat patch is in.
Comment #114
chx CreditAttribution: chx commentedhttp://drupal.org/node/370454#comment-1608560 i will revisit this later today.
Comment #115
Damien Tournoud CreditAttribution: Damien Tournoud commentedJust be sure not to remove the ability to set a
max-age
HTTP header to cached pages. This is an essential for reverse-proxy to serve some content for anonymous users bypassing completely the HTTP server.Comment #116
mikeytown2 CreditAttribution: mikeytown2 commentedI've been thinking about caching JSON AJAX. It's perfectly feasible, I just need all ajax requests that actually want the server to save the info to be a POST instead of a GET (makes since right?). Just want to make this clear since the default (see type) is a GET. I couldn't identify any places in core (7.x) where one uses AJAX to save data.
In terms of the performance page, I don't think its clear that the core page cache is for anonymous users. In Boost I went ahead and made it more usable IMHO. See screenshot for the changes made.
Comment #117
moshe weitzman CreditAttribution: moshe weitzman commentedSadly, chx' original patch never got to the finish line. damien usurped the issue and then kindly moved aside. anyone willing to revive this?
Comment #118
chx CreditAttribution: chx commentedI pretty much needed to restart the patch because of the many session changes. However, this is small and neat!
Comment #120
chx CreditAttribution: chx commentedIt helps if there is always a user object and if the boot hook is called when not caching the page. BTW the hook_boot test is not needed because session test already fails if hook_boot fails.
Comment #122
chx CreditAttribution: chx commentedRemoving the agressive caching test.
Comment #123
catchI liked this patch before, and it looks even better, a lot of stuff cleaned up. system_update_7033() needs a one-line description.
Comment #124
chx CreditAttribution: chx commentedBetter and more comments and a message on update for agressive caching. No functionality change.
Comment #125
chx CreditAttribution: chx commentedMore text fixes. catch was unable to find any more nitpicks. I am sure Angie will but oh well :)
Comment #126
mikeytown2 CreditAttribution: mikeytown2 commentedHow does this effect user scripts that only want to boot up the database?
Right now it seems like it will also boot
DRUPAL_BOOTSTRAP_ACCESS
DRUPAL_BOOTSTRAP_SESSION
DRUPAL_BOOTSTRAP_VARIABLES
Beer-o-mania starts in 16 days! Don't drink and patch.
Comment #127
chx CreditAttribution: chx commentedComment #128
chx CreditAttribution: chx commentedThat's ugly. We can do better and still have a negative code impact on core.
Comment #129
chx CreditAttribution: chx commentedcatch made the new comment better. Yay, feedback!
Comment #130
Damien Tournoud CreditAttribution: Damien Tournoud commented- 'page_cache_without_database' should be 'page_cache_invoke_hooks'
- 'cache_lifetime' should be renamed to 'page_cache_max_age' (I think I already did that in another patch, that apparently failed to be committed), and documented as such in default.settings.php
I suggest to move the $cache_mode = variable_get('cache'); to the else block. I'm not sure this even works right now.
The final_phase stuff looks like a hack. I understand what it is for, but it still looks like a hack.
I would rather see the first part of the check implemented as
if (isset($_COOKIE[session_name()])) { drupal_page_is_cacheable(FALSE); }
. I'm not quite sure what do to with the second part, maybe we could do something similar?Setting cache mode to agressive in Drupal 6 is exactly setting 'page_cache_invoke_hooks' to FALSE. Why not properly migrating those settings?
^ Typo in the last variable name.
Comment #131
chx CreditAttribution: chx commentedComment #132
chx CreditAttribution: chx commentedAh, we had a misunderstanding on the first point.
Comment #133
chx CreditAttribution: chx commentedAnd an issue summary: the patch removes early and aggressive caching. Aggressive caching lives on as a settings.php override because it's too complex for users to decide whether things works w/o a hook_boot or not. Early and late page caching is now just caching however because a) now session handling gives us a reliable indication whether we want to serve cached content or not b) per bin caching allows site builders to serve pages from something that's not database, like memcached, file etc.
Comment #135
chx CreditAttribution: chx commentedThat's one silly mistake I made there.
Comment #136
Gerhard Killesreiter CreditAttribution: Gerhard Killesreiter commentedIt looks good to me. I've also never used aggressive caching, so I don't mind seeing it go.
I slightly wonder why we keep banning by IP.
Comment #138
chx CreditAttribution: chx commentedSniff!
Comment #139
mikeytown2 CreditAttribution: mikeytown2 commentedIs there a way to use what was originally passed in drupal_bootstrap() and not go above that, to prevent unneeded code loading? That was one of the "side effects" of the old way with 2 bootstraps. Right now, another way to get around this would be.
16 days to code freeze. Better review yourself.
Comment #140
moshe weitzman CreditAttribution: moshe weitzman commentedNice patch. I'd say it is RTBC.
I came up with one nit below because apparently it isn't a drupal review if you don't exercise your inner perfectionist.
Perhaps add a comment about this line in particular. It is pretty dense.
This review is powered by Dreditor.
Comment #141
Dries CreditAttribution: Dries commentedI agree with Moshe in #140.
Also, in settings.php, 'Page Caching' should be 'Page caching'.
Otherwise this looks sane to me. Nice job, chx.
I'd also like to ask DamZ to approve or reject the latest version of this patch because he did a deep review in #130.
Comment #142
chx CreditAttribution: chx commentedIs there a way to use what was originally passed in drupal_bootstrap() and not go above that, to prevent unneeded code loading? <= see that
drupal_bootstrap(DRUPAL_BOOTSTRAP_VARIABLES, FALSE);
the FALSE in there prevents overgoing the originally passed in phase. Also see$final_phase
in drupal_bootstrap.#140, there is a novel above this line already. It was referencing to aggressive caching, however, I fixed that.
Comment #143
chx CreditAttribution: chx commentedDamien said I should revert the test, he is quite right.
Comment #144
chx CreditAttribution: chx commentedFixed that test.
Comment #145
mikeytown2 CreditAttribution: mikeytown2 commentedRTBC
Comment #146
Damien Tournoud CreditAttribution: Damien Tournoud commentedThis is a very nice clean up. Some parts of it are not that elegant yet, but we will be able to refactor some of it once Adrian's work on the variable system land (#145164-227: DX: Use hook_variable_info to declare variables and defaults).
Anyway, this looks ready to go in.
Comment #147
Dries CreditAttribution: Dries commentedCommitted to CVS. Thanks!
Comment #148
webchickGreat patch! Couple of things, mostly related to documentation.
Oh, +4,203,238.5 for removing this confusing and awful setting. Always so much fun to explain to end users: "Well, if it says some of your modules are incompatible, the way to tell what functionality you're going to miss is to simply grep for hook_boot or hook_exit implementations and read the code therein..." :P~~~
This seems like it causes a regression in the case that this function is called before the database is active, and a settings.php override is not present. It's going to default to not blocking anyone.
I spoke to chx about this and basically the only time this could happen is if someone has explicitly told Drupal to NOT use the database for caching, and they do not have a settings.php override of blocked IPs. Usually this will happen if they are handling the blocking themselves as a lower level, e.g. firewall or Apache.
Let's make that one-liner comment into a couple-liner comment so this is more clear that this isn't an oversight.
drupal_handle_denied() is a weird function name. How about drupal_block_denied()?
Also, shouldn't this be in t() or st()? Oh, nevermind... I see the comment now. Still though, we use the word "block" rather than "ban" (I have no idea why, but consistency++).
I do not understand what these first few lines in drupal_bootstrap_phase() are doing, and the PHPDoc didn't help me.
I'd change the PHPDoc to:
A boolean indicating whether a phase transition is occurring.
And add an inline comment above those lines that says something like:
If transitioning from one phase to another, store the phase name so it's not forgotten while recursing.
Hm. DRUPAL_BOOTSTRAP_ACCESS seems like a useful phase to have. Why is this being removed?
One-liner summary here? My eyes glazed over the 500th or 600th time I read the word "phases" on this line of code. ;)
This whole section of code is quite dense. It looks like we lost the comments that broke it up from the code before. Let's add those back in here.
I've been confused throughout the patch about what this variable is for, and now that I've read its description, I think it needs to be more clear. I figured that it was an array of hooks not to fire, but instead it is a boolean TRUE/FALSE about whether to fire hook_boot() and hook_exit().
So how about "...can be set to FALSE to skip calling all hooks, including hook_boot() and hook_exit(), which would normally be called on every page request."
I'm on crack. Are you, too?
Comment #149
chx CreditAttribution: chx commentedDRUPAL_BOOTSTRAP_ACCESS seems like a useful phase to have. Why is this being removed? because there is no place it can live, we would need to split it into two -- that happily can be a followup.
Tons o comments added back, fixed and added new ones too.
Comment #150
webchickLOL, oops. :) I need to be faster in my patch reviews. ;)
This will need a quick re-roll.
Comment #151
chx CreditAttribution: chx commentedComments.
Comment #152
drewish CreditAttribution: drewish commentedI think the committed parts of this issue are causing the following warnings:
Comment #153
chx CreditAttribution: chx commentedRipping the rest from performance.
Comment #154
drewish CreditAttribution: drewish commentedfixed the warnings i was seeing.
Comment #155
webchickThanks! Committed to HEAD.
Comment #156
floretan CreditAttribution: floretan commentedThe nice message that we want to return from the update function never actually goes anywhere. This should help.
Comment #157
Dave ReidGood followup find flobruit.
Comment #158
webchickCommitted to HEAD. Thanks!
Comment #161
c960657 CreditAttribution: c960657 commentedFollow-up: #804864: External page caching does only work when "page_cache_invoke_hooks" is set to FALSE. (Old "Aggressive Mode" is on)