Closed (outdated)
Project:
Drupal core
Version:
7.x-dev
Component:
base system
Priority:
Normal
Category:
Task
Assigned:
Unassigned
Issue tags:
Reporter:
Created:
22 May 2012 at 18:41 UTC
Updated:
26 Jul 2016 at 10:21 UTC
Comments
Comment #2
unitedwallabies commented
Attaching the version of the patch that conforms to the `-p1` flag.
Comment #3
unitedwallabies commented
Comment #4
brockjo commented
Patch tested on:
-Drupal 7.14
Browsers Tested:
-Firefox 12.0
-Google Chrome 19.0.1084.52
-Safari 5.1.7
Themes Tested:
-Default "Bartik" theme 7.14
-Two of my custom subthemes (Zen based)
There were no negative effects observed from the patch, and I haven't noticed anything different since implementation. Because variable_initialize() runs on every Drupal request, the patch gets exercised constantly: I threw it on a site I am building, and it has been stable for 3 days now. Due to the nature of the patch, I haven't really noticed any differences either, so it is hard to test completely.
Comment #5
marcingy commented
This needs to be implemented in D8 and backported.
Comment #6
unitedwallabies commented
Thank you very much for the testing/confirmation, brockjo.
marcingy, what is the customary workflow — is the original submitter supposed to (re-)implement for D8, or will someone (you?) do that? Thanks!
Comment #7
tstoeckler commented
This could use some parentheses to make the order of operations clear.
Why are you removing the lock_wait()? That seems to be point 2. in your original post, but you should provide some reasoning for this.
This is using a tab, please use spaces. Also this should be on one line.
In general this could use some benchmarks. If you can prove that this is an improvement for people using DatabaseCache I can see at least point 1. going in. (Again, I don't get what you're trying to achieve with 2.)
EDIT: Removed Dreditor double-post.
Comment #8
unitedwallabies commented
I could never understand how the order of `and` (the multiplication) and `or` (the addition) could be confused — by anyone... These are such basic operations of elementary middle-school calculus...

Because, if this instance could not acquire the lock on the first attempt, it means another instance got it and is already updating the cache — and the current instance need not bother. This not only makes sense at all times, but is also safer: when the locking begins to fail repeatedly (such as due to some database-server problem making it only partially available, or memcached being restarted), the current code will aggravate the problem by sending PHP into a tight loop, repeatedly trying to obtain a lock and recursively invoking variable_initialize() — we have seen it here. The proposed implementation will react to such degradation more gracefully — returning variables even if placing them into the cache fails.

Sorry, I was unaware of Drupal's coding style guide — and used that of FreeBSD by default :-) Easy to change, of course. If there is a formal style guide for Drupal, could you please post a link? I'll try to make this and future patches compliant.
Well, the improvements will certainly be there, but will be hard to measure. Consider the DatabaseCache case. Currently the data is obtained from {cache} for most hits (one SELECT). When the cache is considered stale, the same data is obtained from {variable} (a simpler SELECT) and then stuffed into {cache} (a hundred or so INSERTs). But this refreshing of the cache does not happen very often. So one would need to pound the server for a while for the difference to be noticeable — and even then it may not exceed the margin of error. Also, the unqualified SELECT from {variable} is, of course, cheaper than a more complicated one (with a WHERE clause) from {cache}. But modern machines are so fast, it may be hard to measure the difference. I'll try to set things up, but I cannot afford to build a giant collider to confirm what is obviously there :-)
Comment #9
tstoeckler commented
The lock_wait() point is, as far as I can tell, moot. The lock_wait() pattern was introduced exactly to allow multiple processes not to fail while one process is rebuilding the cache. Please open that as a separate issue, if at all.
Without some benchmarks I don't see any point in doing this at all.
If this makes something faster, then you can measure it; it's as simple as that. Stating the opposite is quite simply an excuse. For this patch, you could for instance insert a couple hundred variables and then measure how long rebuilding the variable cache takes.
That is quite condescending, but I won't fight it. I just think parentheses make these operations easier to read, but I'm not really attached to that.
Comment #10
marcingy commented
Moving to postponed. This needs benchmarks before it can be considered further.
Comment #11
unitedwallabies commented
In the current implementation, all of the processes that encounter a stale cache will rebuild it — one after another. In my implementation, only the first such process will rebuild the cache. The subsequent ones — those that find the cache stale but are unable to acquire the lock on the first attempt — will proceed without trying to rebuild the cache (with the values obtained directly from the database).
This saves some time in all cases (though it is still hard to measure, because cache rebuilding is very infrequent). In unusual cases, where the locking itself might be failing for a while, it prevents more serious problems, as each process only attempts to obtain the lock once — better not to use the cache at all than to get stuck forever trying to rebuild it.
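The flow being described can be sketched as follows. This is a hedged reconstruction of the idea from the comments in this thread, not the actual patch; it assumes a bootstrapped Drupal 7 site where cache_get(), cache_set(), lock_acquire(), lock_release() and db_query() are available.

```php
<?php
// Sketch of the proposed behavior: try the lock exactly once. The winner
// refreshes the cache; everyone else proceeds with values read straight
// from the database. Reconstructed from the issue description.
function variable_initialize_proposed($conf = array()) {
  if ($cached = cache_get('variables', 'cache_bootstrap')) {
    $variables = $cached->data;
  }
  else {
    // Every process reads the origin table directly; a plain SELECT
    // needs no lock protection.
    $variables = array_map('unserialize',
      db_query('SELECT name, value FROM {variable}')->fetchAllKeyed());
    // Only the single process that wins the lock refreshes the cache.
    // Losers skip it entirely: the winner is already doing that work.
    if (lock_acquire('variable_init', 1)) {
      cache_set('variables', $variables, 'cache_bootstrap');
      lock_release('variable_init');
    }
  }
  foreach ($conf as $name => $value) {
    $variables[$name] = $value;
  }
  return $variables;
}
```

The key design point is that lock_acquire() is non-blocking here: a failed attempt simply means another request is rebuilding the cache, so the current one carries on with the freshly read values.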
Light travels at different speeds in vacuum and in denser environments (such as gas or liquid). But measuring the difference is very difficult...
I've devised the new code and offered a theoretical proof that it is no slower and usually faster than the existing implementation. I don't understand why you are placing this burden of proof on me. Is it the coding style, or the attitude of my comments, or some other non-technical reason?
Could you elaborate? How can I measure the time of this one function? What I am trying to do now is hit the freshly-installed Drupal site on my server with the Apache benchmarking tool (`ab`). But there are so many other variables (I can't even ensure the server is not used by anything else during the benchmark) that the timings differ even from run to run — without any changes to the code. As I suspected, the margin of error is higher than the theory-suggested improvement.

The server is a 4-core machine running FreeBSD/amd64. The web server is Apache 2.2.21. PHP is 5.3.8. Drupal is 7.14. Vanilla install (no memcache, no APC). Benchmarking from a separate host on the same LAN with the following command line:

ab -r -t 90 -k -n 100000 -c 7 -H 'Host: dp7.example.net' http://narawntapu/

Without the patch:
With the patch:
So there is some improvement, but I don't think it is conclusive... Until I started using the `-k` flag ("keep-alive"), the overhead of establishing a new TCP connection for every hit was hiding all improvements completely.

I then installed PHP's APC module and got a major improvement out of that, as expected. My patch still makes some difference, but is it conclusive? I would not be sure if I did not know the theory behind it already...
Without my patch:
With my patch applied:
In addition, here are some benchmarks talking to the MySQL server directly. Using the server's own BENCHMARK function, I demonstrate that my approach is not any slower even when the cache is tiny ({cache_bootstrap} only has 6 rows here). Accessing the {variable} table directly:

Obtaining the same data from {cache_bootstrap}, as the code in cache.inc is doing:

So, why bother caching the data if the retrieval from cache is no faster than the retrieval from origin?
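The exact statements used for this comparison were not preserved in this archive. A comparable timing of the two reads can be sketched in PHP; this is a hypothetical harness, assuming a bootstrapped Drupal 7 site, the default {variable} and {cache_bootstrap} tables, and the column list DatabaseCache reads (the precise column set is approximated here).

```php
<?php
// Hypothetical sketch: time N repetitions of the two reads being compared.
// Requires a bootstrapped Drupal 7 site (db_query() available).
function time_query($sql, $reps = 1000) {
  $start = microtime(TRUE);
  for ($i = 0; $i < $reps; $i++) {
    db_query($sql)->fetchAll();
  }
  return microtime(TRUE) - $start;
}

// The unqualified read from the origin table.
$t_variable = time_query('SELECT name, value FROM {variable}');

// The qualified read from the cache table, approximating what
// cache.inc's DatabaseCache backend performs.
$t_cache = time_query(
  "SELECT data, created, expire, serialized FROM {cache_bootstrap} WHERE cid = 'variables'");

printf("variable: %.4fs  cache_bootstrap: %.4fs\n", $t_variable, $t_cache);
```

Note that with MySQL's query cache enabled, repeated identical SELECTs will mostly measure cache-hit latency, which is exactly the scenario discussed later in this thread.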
Comment #12
marcingy commented
Why bother changing it if there is no considerable performance improvement?
Comment #13
tstoeckler commented
The reason I am placing this "burden of proof" on you is that this patch introduces less abstract and somewhat "smelly" code. That is because variable_init() should really not know anything about cache backends, and this introduces a tight coupling on a specific backend.
Now if this would lead to a measurable improvement for users of the default cache backend then in this case the performance benefit would definitely outweigh the small amount of code smell, but if not there really is no point. So if you cannot show that the rebuild operation happens often and is significantly faster, then this patch is "won't fix".
Additional notes:
- The benchmarks you did, I think, simply accessed the home page and so did not measure the performance impact of rebuilding the cache. To benchmark I would use a simple script such as this one (untested):
Again, I don't think it would even be worth it to write such a script, because variable rebuilding happens so seldomly that the improvement would have to be huge for this patch to be worth it.
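The script referenced above did not survive in this archive. A hypothetical harness along the lines suggested (insert a couple hundred variables, then time the rebuild) might look like this; it assumes a bootstrapped Drupal 7 site, and the variable names and repetition counts are made up for illustration.

```php
<?php
// Hypothetical benchmark sketch: populate a few hundred variables, then
// repeatedly invalidate the variable cache and time its rebuild.
// Requires Drupal 7 to be bootstrapped (variable_set(), cache_clear_all()
// and variable_initialize() all available).
for ($i = 0; $i < 500; $i++) {
  variable_set('bench_dummy_' . $i, str_repeat('x', 100));
}

$total = 0;
$runs  = 50;
for ($i = 0; $i < $runs; $i++) {
  // Drop the cached copy so the next call has to rebuild it.
  cache_clear_all('variables', 'cache_bootstrap');
  $start = microtime(TRUE);
  variable_initialize();
  $total += microtime(TRUE) - $start;
}
printf("Average rebuild: %.2f ms\n", $total / $runs * 1000);
```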
- That is not true; please read the current implementation again. As I said, if we want to rework the lock pattern, please open a new issue, although I don't think you will get far with that, to be honest.
Leaving at "won't fix" per the previous post.
Comment #14
tstoeckler commented
...
Comment #15
unitedwallabies commented
Actually, come to think of it, my benchmarks did show an important improvement. While in most cases there is no difference, the maximum times decreased substantially after my patch. I explain this by the code no longer copying the data into the cache — the current implementation does not do it every time, but, when it does, it takes measurable time. Here are the steps that are currently done in this case:
1. SELECT from {cache_bootstrap} — happens for each hit
2. INSERT into {semaphore} — create the lock
3. SELECT from {variable}
4. INSERTs into {cache_bootstrap}
5. DELETE from {semaphore} — remove the lock

The proposed implementation would simply do a single SELECT at all times — and that may be why the "max" timing is consistently lower in my benchmarks when my version of the function is used. Thus, my implementation is just as fast in the normal case, but much faster (14% without APC, 76% with APC enabled) occasionally.
I agree with you on the "smelly" sentiment — I just do not know the API well enough to make it better. But you do... Perhaps it should be possible to set the caching provider to "None" — either for the entire bootstrap bin or just for variables somehow?

And then there is the other aspect — in the current implementation there is no check for why lock_wait() has failed. If the failure is due to a database problem or (if memcache is used to implement locking) due to memcached being restarted, the existing recursion becomes very deep and, if the server continues to process hits, can bring the server to its knees. We have seen it here... My implementation is not recursive and will only attempt locking once — providing for more graceful degradation in case of a problem.

There is no need to hold the lock while loading the values anew — MySQL can handle thousands of simple SELECT queries without a hitch. It is only when updating the cache that locking (to prevent a "stampede") is justified...
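For reference, the recursion being described is the existing Drupal 7 pattern, which in simplified, paraphrased form (treat the details as approximate, not a verbatim copy of bootstrap.inc) looks roughly like this:

```php
<?php
// Simplified paraphrase of Drupal 7's variable_initialize() recursion;
// approximate, not a verbatim copy of core's bootstrap.inc.
function variable_initialize_current($conf = array()) {
  if ($cached = cache_get('variables', 'cache_bootstrap')) {
    $variables = $cached->data;
  }
  else {
    $name = 'variable_init';
    if (!lock_acquire($name, 1)) {
      // Another request is rebuilding the cache: wait for it, then call
      // ourselves again. If locking keeps failing (DB trouble, memcached
      // restart), this recursion is what deepens on every retry.
      lock_wait($name);
      return variable_initialize_current($conf);
    }
    $variables = array_map('unserialize',
      db_query('SELECT name, value FROM {variable}')->fetchAllKeyed());
    cache_set('variables', $variables, 'cache_bootstrap');
    lock_release($name);
  }
  foreach ($conf as $name => $value) {
    $variables[$name] = $value;
  }
  return $variables;
}
```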
I had to modify your script a little:
The results:
Comment #16
marcingy commented
Your tests were based on Apache Benchmark; those are not valid tests for benchmarking code. We need real benchmarks against the PHP code, as suggested in #13. Also, this approach does indeed fail the smell test. And the lock prevents a cache stampede, which is why it was added to this function; remove the lock and you introduce a stampede. Back to needs info.
Comment #17
unitedwallabies commented
I used the method suggested in #13 — and appended the results to #15. I think the results are rather impressive (a 3.5x win!) — not that I expected anything less :-). Try it yourself...
The cost of stampede prevention is higher than the stampede itself in this case :-) Consider the situation where N processes encounter cache staleness at the same time.
My way:
1. N SELECT queries to get the expired values anew from {variable}.
2. N attempts at the lock's INSERT (N-1 of the attempts failing).
3. The winner DELETEs the lock, while the other N-1 process their requests.

The current way:
1. N attempts at the lock's INSERT (N-1 of the attempts failing).
2. 1 SELECT query to get the expired values anew, while N-1 processes sleep for 25 milliseconds (in lock_wait()).
3. N-1 SELECT queries initiated by lock_may_be_available() — invoked from lock_wait().
4. The winner stuffs the values into the cache (the INSERTs).
5. The winner DELETEs the lock.

I posit that the N SELECTs of my implementation are not any more expensive than the N-1 SELECTs initiated by lock_wait() (and a certain K of these N-1 may have to repeat the SELECT again)... Moreover, while waiting, these N-1 processes aren't serving the clients — they are sleeping in the current implementation. In my implementation, they all go on processing the request with the values obtained from the database directly (while the very first one goes on to rebuild the cache under lock).

And then, after step 5, once the cache is rebuilt, these N-1 unfortunates still need to read the data from it — which means N-1 more SELECTs from {cache_bootstrap} (or something faster, depending on the caching implementation used) — my way, they all have read the data already in step 1.

To rephrase: while the cache rebuilding should, indeed, be done by just one process, reading of the original data straight from the database (with a single SELECT) needs no such protection.

Although it may be possible that some alternative implementation of lock_wait() is fantastically faster than the default one, N identical SELECT queries cost most databases (with things like query caching enabled) just as much as a single one. In other words, the N SELECTs from my step 1 are not any costlier than the 1 SELECT from the current step 2.
Comment #18
unitedwallabies commented
Come to think of it, it is not even immediately obvious that protecting even the cache rebuilding from a stampede is always worthwhile... For example, when caching is done by memcache, but locking is still done by the default lock.inc (using the {semaphore} table), it may be cheaper to modify the cache by N processes simultaneously than to have N INSERTs and one DELETE...

But this may be too radical, and some caching implementations may even get corrupted by such simultaneous updating, so I still agree that cache rebuilding should be done under a lock. Yet I maintain that there is no need for a lock to cover the reading of data from the DB...
Comment #19
damien tournoud commented
The theory (which was definitely true at one point) is that retrieving a single row from the database + deserializing it is faster than retrieving all the rows and deserializing them one by one.
Not sure if it is still true (given that MySQL buffers result sets), but anything that changes this should be heavily benchmarked.
Comment #20
unitedwallabies commented
Damien, do you have references to the past benchmarking showing this to be true at some point? I wonder when this was last measured and under what conditions...
I don't agree with the title change — when using faster cache implementations (like APC or memcache), it may still be better to cache variables. But it makes no sense to do it with the default (DB-based) cache...
I'd add more... For a small site with a single database server, it makes fairly little difference either way — on modern (or even last year's) hardware. Even if the response time spikes on occasion (as it does now), no one will notice: despite a thousandfold difference, humans can't distinguish a millisecond from a microsecond anyway :-)
However, for bigger setups — those using multiple database servers with replication (either master->slave or multi-master) — each write (whether an INSERT, an UPDATE, or a DELETE) to the database becomes more expensive, while the reads remain, for most intents and purposes, free.

This means using the database as a caching back-end should be discouraged in general — not just for variables. It is the simplest method for somebody just getting started, but I doubt variable loading is the only place that's unduly pessimized by using it.
Comment #21
marcingy commented
There is nothing to review at the moment.
Comment #22
mpotter commented
On several of our sites I have found that performance *is* improved by using the cache, especially if you do a lot of variable_set calls for any reason. A simple test page is one that makes a for-loop of 100 variable_set calls (with 100 different variable names). It's an edge-case test, but it helps identify the problems caused by this recursion and locking mechanism around variables. Hit that test page with "ab -n 100 -c 100 ..." and you'll start to see the problems.
There are a couple of issue threads around on this whole variable cache and locking recursion, so I didn't want to create "yet another". But since this thread has gone "off the rails" the least (I don't want to get into D8 and Symfony stuff to deal with my problem on D6), I will link to a pull request for Pressflow 6 that implements a similar solution to this problem: https://github.com/pressflow/6/pull/52. It should be pretty easy to port this patch to D6, D7, and D8 without getting into over-complex cache-building schemes.
My approach started with just removing the cache, as in the first patch of this issue. But because I found performance decreases, I added a single check to see if the variable cache lock could be acquired. If the simple lock_acquire is successful, then I go ahead and write through the cache. Otherwise I clear the cache so the next page request will rebuild it.
So most of the time you get the benefit of the variables cache, but it removes all of the recursion that leads to so many locking problems and waiting worker threads.
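The write-through variant described above can be sketched as follows. This is a hedged reconstruction from the comment and the linked pull request's description, not the actual Pressflow code; the function name is made up for illustration.

```php
<?php
// Sketch of the write-through approach: if the non-blocking lock is
// available, refresh the cache ourselves; otherwise just invalidate the
// entry and move on, so a later request rebuilds it without any
// recursion or waiting. Reconstructed from the comment, not the patch.
function variable_cache_write_through(array $variables) {
  if (lock_acquire('variable_init', 1)) {
    // We won the lock: safe to refresh the cache ourselves.
    cache_set('variables', $variables, 'cache_bootstrap');
    lock_release('variable_init');
  }
  else {
    // Someone else holds the lock: clear the cache entry so the next
    // page request rebuilds it, and carry on serving this request.
    cache_clear_all('variables', 'cache_bootstrap');
  }
}
```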
If I get some time, I'll try to post real patches for D6 and D7 here, but hopefully this will give others another place to look for a potential solution to this tough problem. I'm also interested in getting other smart eyes on this potential solution from people who have already been thinking about this issue.
Comment #23
andrea.campi commented
I found this issue while investigating a surprising performance problem.
Some background: multi-node installation with APC, FPM, and both caching and locks on memcache. We are monitoring performance with NewRelic.
What we see from time to time is that a request will spend seconds (tens of seconds at a time) in lock_wait.
The "funny" thing is that once the lock is acquired, the recursive call to variable_initialize will complete in ~100ms.
We are investigating in several directions (what is invalidating cache_bootstrap, and why would the lock be held for that long), but that's beside the point.
At least in our setup, I have to agree with unitedwallabies: letting the requests that "lose" the race go through and read the variables from the DB would be a huge win for us.
Comment #24
mrfelton commented
I'm experiencing severe locking issues, with the MySQL process list showing TRUNCATE cache_field, followed by several entries of SELECT filename, name, type, status, schema_version, weight FROM system WHERE type = 'module', all stuck in the 'Opening tables' state. Searching on those query strings brought me here. This is with D7.16, and it's happening on an Aegir migrate operation.
Comment #24.0
mrfelton commented
Correct the description of the second problem being addressed.
Comment #25
moshe weitzman commented
Variables are gone from D8.
Comment #26
marcingy commented
Reclassifying.
Comment #27
mpotter commented
Here is a patch for D7 that reflects the work done in the D6 Pressflow patch mentioned in #22. This moves the variable cache rebuild to shutdown and removes the bad recursion of variable_initialize().
Comment #28
stefan.r commented