I know the D6 test site is just running on scratchvm with a single small DB server. However, lots of things are very slow on this site. I'm worried that even if we go live on the production hardware, things might still be noticeably slower than what we have now. I'm also concerned that during the sprint in Boston, we didn't make a whole lot of progress on performance/scalability issues -- Jeremy and David spent most of their time on the SVN and patch deployment system (to manage our existing performance patches), and Narayan was mainly dealing with setting up servers and whatnot. We did kill that DB crushing query in cvs.module, and we certainly laid the foundation for managing our code in a sane way, but I don't get the sense we have a really solid picture of what the performance profile of the new site is going to be.
I believe this task should be a blocker before actually trying to go live.
Thoughts? Is there any reasonable way to test this ahead of time, or do we just have to pull the trigger, hope for the best, and be ready to try to tune/optimize once it's under load?
Thanks,
-Derek
Comments
Comment #1
gerhard killesreiter commented
I certainly intend to look at the slow query log before this goes live and to look at the queries generated by any views. Are there other areas that need attention?
Comment #2
dww
Not sure, this isn't my primary area of expertise. ;)
Note: I'm going to make some changes to all the issue queue views in the next day or so that should hopefully simplify / speed up the queries a lot. So, I wouldn't spend too much time agonizing over those at least until #368371: Fix "Project" exposed filters on issue queue views is committed and deployed... It won't necessarily help the JOINs for the data in the tables, but it'll at least remove an expensive WHERE clause from another table in many cases, so it can't hurt.
Comment #3
jeremy commented
I have some memcache integration patches to help with performance. One is waiting to be merged (providing an external pagecache, allowing anonymous pages to be served without hitting the database), another I'm cleaning up so it can be applied cleanly the way our source control system works (a pagecache lock that prevents cache stampedes) -- I hope to have it ready later today or tomorrow. I also need to roll a new memcache module release, as code for a minimum cache lifetime has landed there.
Beyond that, I'm aware of the need for a patch to the menu system, per this comment. I had hoped to have worked on that already; perhaps I can get to it later today or tomorrow.
Note that we did already merge all performance patches that are currently part of Drupal.org -- David and I were working on a clean way to maintain these patches which has been accomplished.
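As an illustration of what a pagecache lock against stampedes does, here is a minimal sketch. It is not the memcache patch described above: the class name StampedeGuardedCache, the in-process dictionary, and the threading lock are all assumptions made for the example, and a real deployment would presumably hold the lock in the shared cache rather than in one process. The idea is only that when many requests miss the cache at once, a single request rebuilds the page while the others wait for the result.

```python
import threading
import time


class StampedeGuardedCache:
    """Hypothetical in-memory page cache; stands in for an external cache."""

    def __init__(self):
        self._data = {}
        self._locks = {}
        self._guard = threading.Lock()

    def _lock_for(self, key):
        # One lock per cache key, created on first use.
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get_or_build(self, key, build):
        page = self._data.get(key)
        if page is not None:
            return page
        with self._lock_for(key):
            # Re-check: another thread may have built the page while we waited.
            page = self._data.get(key)
            if page is None:
                page = build()
                self._data[key] = page
            return page


def demo(workers=8):
    """Fire several concurrent misses and count how often the page is built."""
    cache = StampedeGuardedCache()
    builds = []
    results = []

    def slow_build():
        builds.append(1)
        time.sleep(0.05)  # pretend this is an expensive page render
        return "<html>front page</html>"

    def worker():
        results.append(cache.get_or_build("/", slow_build))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(builds), results
```

Without the per-key lock, every one of the concurrent misses would call slow_build() and hit the database at the same moment.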
Comment #4
Amazon commented
I don't know how much effort it would take, but it seems to me that the association could buy time on a cluster at AWS. We could set up three web-heads and 2 db servers and then have a third server run the load test. This may be more trouble than it's worth, but it seems a viable option worth exploring.
Comment #5
jcfiala commented
Hi,
One thing that's been impressed on me as I've been working to improve the performance of a site we're building for a client is that the order of items in the WHERE clause is important for determining the index that MySQL will use - I had gotten into a habit of doing node queries with "WHERE status = 1 AND type = 'article'", for instance, when "WHERE type = 'article' AND status = 1" will perform better because the type index is more specific.
Related to that, views puts items in the WHERE clause of its queries in the same order that they're in the filter display - pushing the 'Node: published' filter down and other more specific filters up generally helps.
Comment #6
david strauss
@jcfiala That is incorrect.
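The comment above does not spell out why, but the usual reason is that a query planner chooses an index from the set of AND-ed conditions, not from the order in which they are written. A minimal sketch of how one might check this follows. It uses SQLite through Python as a stand-in, which is an assumption: it is not MySQL, and the node table and node_type_status index here are simplified, hypothetical versions made up for the example. On the real site the equivalent check would be running EXPLAIN on both forms of the query.

```python
import sqlite3


def plans_for_both_orders():
    """Return the query plan text for the same filter written in two orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (nid INTEGER PRIMARY KEY, type TEXT, status INTEGER)"
    )
    conn.execute("CREATE INDEX node_type_status ON node (type, status)")
    conn.executemany(
        "INSERT INTO node (type, status) VALUES (?, ?)",
        [("article" if i % 10 == 0 else "page", 1 if i % 3 else 0)
         for i in range(1000)],
    )
    queries = [
        "SELECT nid FROM node WHERE status = 1 AND type = 'article'",
        "SELECT nid FROM node WHERE type = 'article' AND status = 1",
    ]
    plans = []
    for query in queries:
        rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
        # The last column of each row holds the human-readable plan step.
        plans.append([row[-1] for row in rows])
    conn.close()
    return plans
```

If the two plans come back identical, the written order of the conditions made no difference to the index chosen in this setup.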
Comment #7
gábor hojtsy
What David told me in Cambridge, MA was that the D5 patches were ported, so it is really down to (1) how much slower D6 is in itself and (2) how much slower the new project stuff is, since that was changed. Apart from that, the search seems to have improved in speed, even now that we use an external test search server.
Comment #8
pwolanin commented
I have been wondering for the D6 menu system whether we need to stop trying to cache the whole router in cache_menu - this is a very big blob to read/write.
We try to pull it out of the cache (and/or set it) here: http://api.drupal.org/api/function/menu_router_build/6
The only place we are using this blob is: http://api.drupal.org/api/function/menu_link_save
And basically we could rewrite this function: http://api.drupal.org/api/function/_menu_find_router_path/6
to do a query if $menu was not passed in. Really, having the $menu in memory is typically only beneficial when we are rebuilding the navigation links and hence are calling menu_link_save many times in a single page load.
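To make the proposal concrete, here is a rough sketch of a router-path lookup done with a query instead of an in-memory router array. It is written in Python against SQLite purely for illustration and is not the Drupal function: make_router and find_router_path are hypothetical names, the menu_router table is reduced to a single path column, and the candidate list only contains literal path prefixes, which is a simplification of whatever ancestor matching the real menu system does.

```python
import sqlite3


def make_router(paths):
    """Build a tiny stand-in for the menu_router table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE menu_router (path TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO menu_router (path) VALUES (?)", [(p,) for p in paths]
    )
    return conn


def find_router_path(conn, link_path):
    """Find the most specific router path for link_path with one query."""
    parts = link_path.split("/")
    # Candidate paths, longest first: a/b/c, a/b, a.
    candidates = ["/".join(parts[:n]) for n in range(len(parts), 0, -1)]
    placeholders = ", ".join("?" for _ in candidates)
    row = conn.execute(
        "SELECT path FROM menu_router WHERE path IN (%s) "
        "ORDER BY LENGTH(path) DESC LIMIT 1" % placeholders,
        candidates,
    ).fetchone()
    # Empty string when no router item matches (a choice made for this sketch).
    return row[0] if row else ""
```

For example, with router items node, admin, admin/build and admin/build/menu, looking up admin/build/menu/settings returns admin/build/menu. The trade-off is one indexed query per call instead of reading and unserializing the whole router blob, which is why it only pays off when the function is not called many times in a single page load.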
Comment #9
gerhard killesreiter commented
We will use the SUN machine to do some performance tests and collect slow queries and improperly indexed tables.
Comment #10
gábor hojtsy
Gerhard completed extensive testing of the site, and suggested several improvements:
#371475: Reduce left joins for issue views (assigned to: dww)
#372168: Slow query log from updated version of drupal.org
#371521: Missing index in menu_links table
#371458: Missing index menu_router tab_root
#371439: Mixed primary key on profile_values does not work
#371452: Missing index on visibility in profile_fields
#371543: Missing index for status on blocks
#215080: Performance: change {system}.type: alter table system modify column type VARCHAR(32);