Updates

A few updates since the original post was made:

  • MySQL master/slave replication has now been implemented, and the site is more responsive.
  • A newer tracker patch is now in effect. It should behave the same as the tracker in plain vanilla Drupal 5.x, but with less database load.

---------------------------------

The Drupal Association, the infrastructure team, and the Oregon State University Open Source Lab (OSUOSL) team have made several changes to the drupal.org infrastructure. Some of these changes affect how you use certain drupal.org features, so please read on for more information.

Over the last few months, Drupal's web site (drupal.org) has seen explosive growth in the number of users, posts and comments as the project becomes more and more popular. This has put stress on the hosting infrastructure for drupal.org, which is community-donated and generously hosted by the OSUOSL.

Several performance problems have been identified and either resolved or worked around. The changes so far:

  • Added 4GB of RAM (now 8GB in total) to the master database server and 2GB (now 4GB in total) to each of the three web servers that host *.drupal.org. The RAM was paid for by community donations to the Drupal Association.
  • Added two load-balancing servers and a database slave server, all generously loaned by OSUOSL.
  • Installed and configured Squid, a reverse proxy cache server. It serves cached pages and images to anonymous users, so the web servers do not have to process them on every page load.
  • Installed memcached, an in-memory object caching server, with Drupal's memcache API and integration module.
  • Disabled the search module temporarily. Search queries were a significant consumer of database resources, so disabling them reduces database load. As a temporary workaround, you can search drupal.org using Google.
  • Disabled the recent forum posts block in the sidebar after identifying it as the source of the slow queries with the largest cumulative run time.
  • Installed a different tracker patch that hopefully displays better results.
  • Upgraded the database from MySQL 4.0 to MySQL 5.0.
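
To illustrate the idea behind the reverse proxy cache described above, here is a minimal Python sketch. It is not the Squid configuration used on drupal.org. It assumes that a request carrying no cookie whose name starts with "SESS" is anonymous, and the names AnonymousPageCache and backend are hypothetical stand-ins for the proxy and the web servers.

```python
from typing import Callable, Dict


class AnonymousPageCache:
    """Toy reverse-proxy cache: only anonymous requests are served from cache."""

    def __init__(self, backend: Callable[[str], str]) -> None:
        self.backend = backend            # stands in for the web servers
        self.pages: Dict[str, str] = {}   # URL path -> cached HTML
        self.backend_hits = 0             # how often the backend had to render

    def _render(self, path: str) -> str:
        self.backend_hits += 1
        return self.backend(path)

    def handle(self, path: str, cookies: Dict[str, str]) -> str:
        # Assumption: a cookie whose name starts with "SESS" marks a logged-in user.
        logged_in = any(name.startswith("SESS") for name in cookies)
        if logged_in:
            return self._render(path)     # never cache personalised pages
        if path not in self.pages:
            self.pages[path] = self._render(path)
        return self.pages[path]
```

A real proxy would also expire cached pages; that is left out here to keep the anonymous/logged-in split visible.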

Further actions are planned to improve drupal.org's overall performance. You should see improvements as the new changes are tuned over the next few weeks.

  • Install a master/slave configuration of MySQL, to split the read load vs. the write load on two separate machines. This requires some patches to Drupal that David Strauss is working on.
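
The read/write split can be pictured with a small Python sketch. This is not the Drupal patch mentioned above; it only shows the routing decision, under the assumption that every statement starting with SELECT may go to a slave. The class name ReadWriteRouter and the server names are hypothetical.

```python
import itertools
from typing import Sequence


class ReadWriteRouter:
    """Send reads to the slaves in turn and everything else to the master."""

    def __init__(self, master: str, slaves: Sequence[str]) -> None:
        self.master = master
        # Round-robin over the slaves; None means "no slave configured".
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql: str) -> str:
        """Return the name of the server that should run this statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._slaves is not None:
            return next(self._slaves)
        return self.master
```

One caveat with this kind of split: a slave can lag behind the master, so a read issued right after a write may not see that write yet.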

What you can do to help drupal.org:

  • If you have technical skills in the areas of optimization/tuning of MySQL databases, the LAMP stack or Drupal, please review the performance patch spotlight, join the high performance group, and volunteer to help on the Drupal Infrastructure list.
  • You can always donate to the Drupal project. Your donations help pay for infrastructure.
  • If you face problems while using drupal.org, please report them to the infrastructure team.

Comments

MattKelly’s picture

Just FYI,
http://drupal.org/node/155616 - it might be possible to even add the coop search to firefox's search.

Large Animal Games
www.largeanimal.com

ericatkins’s picture

I think that Drupal.org should add a Google Custom Search Engine (http://google.com/coop/cse/). Besides being fast and powerful, it could also generate a little extra $$$ for Drupal.org in the case that Adsense is enabled. CSE will also embed nicely into Drupal.org and use the current theme.

I rarely use Drupal.org's native search engine. I almost always use site:drupal.org -site:lists.drupal.org -site:cvs.drupal.org -site:api.drupal.org at google.com to search through the forums and the rest of Drupal.org.

What is it going to take for me to ever use Drupal's native search? I'm unsure. But Google's search is better and provides more accurate results. And right now, something's wrong with Drupal.org's native search. It's not available.

---
the Hushed Casket | Eric Atkins

kbahey’s picture

We have a contrib module that does that too http://drupal.org/project/google_cse

However, we plan to use Drupal's search on a slave. See Amazon's post below.
--
Drupal development and customization: 2bits.com
Personal: Baheyeldin.com

--
Drupal performance tuning and optimization, hosting, development, and consulting: 2bits.com, Inc. and Twitter at: @2bits
Personal blog: Ba

crbassett’s picture

I don't mind either way what Drupal.org has, but I do almost exclusively use site:drupal.org in Google for searching this website. I don't know if that matters, but I thought the team might want to know that some search that way.

solutionsphp’s picture

I'm all for better performance, but removing the search functionality seems like a move in the wrong direction.

My review of David Mercer's "Drupal: Creating Blogs, Forums, Portals and Community Websites"
It doesn't matter how you get there if you don't know where you're going.
--The Flying Karamasov Brother

pwolanin’s picture

I think this is only a temporary move until new hardware or other changes make it feasible to re-enable it.

---
Work: BioRAFT

jthomas41’s picture

I agree.

pinxi’s picture

Or at least add Google Custom Search Business Edition

http://www.google.com/enterprise/csbe/index.html

Also, the homepage wasn't visible until I signed in.

melon’s picture

I rarely used the Drupal search form as it was so damn slow. I used Google instead, with the site:drupal.org suffix in my search phrases. IMO if it’s faster without search, then so be it.

dnewkerk’s picture

I strongly agree... please consider adding an inline Google search box to the Drupal.org theme. When I saw the search box disappear today, I figured it was toggled off by throttle and would come back in a while, but it didn't. The average user isn't going to know how to do the "site:" search on Google, will look fruitlessly around Drupal.org for how to search, will "not" likely read this news posting, and is going to get frustrated very quickly. And for those who do know how to do a site search on Google - it would still be nice to have it conveniently located on Drupal :D

You don't need to go with a branded or custom solution with the Google search box - just a (free) basic redirection to a Google site search would be enough.

-- Dave

kbahey’s picture

We added a temporary node and aliased the /search path to it: http://drupal.org/search.
On that node there is a link to Google site search for drupal.org.
--
Drupal development and customization: 2bits.com
Personal: Baheyeldin.com

Amazon’s picture

Hello, the plan is not to remove searching altogether but to do some experimenting. Today searches were redirected to the slave database on a test site used for Drupal.org testing.

I don't have the exact number, but imagine the search index tables having 500,000 rows. Then create temporary tables based on those, and pretty soon the huge tables overflow memory and are written to disk for searches. While the database blocks to create those tables, everything slows down.

The additional 4GB of RAM for the database server, plus redirecting search to the slave server, may be one of many combinations that can help tune the databases over the next few weeks.

Cheers,
Kieran

CTO CivicSpace
Try hosted and pre-configured Drupal 5 profile
http://civicspacelabs.org/create

Kieran Lal

killes@www.drop.org’s picture

mysql> select count(*) from search_index;
+----------+
| count(*) |
+----------+
| 13884893 |
+----------+

--
Drupal services
My Drupal services

techczech’s picture

I wouldn't mourn the Drupal search. I almost never use it anymore, preferring the Google Coop solution (http://glottalstart.com/goopal). Furthermore, the Drupal Association could make some money on the ads. The one downside of a Google search is that it doesn't index new posts instantly, which disqualifies it as the main search for a site of Drupal.org's nature. For instance, it still hasn't indexed any of this post or its comments almost a day later.

______________________
Dominik Lukes
http://www.dominiklukes.net
http://www.hermeneuticheretic.net
http://www.czechupdate.com
http://www.techwillow.com

MacRonin’s picture

Does Drupal.org generate a XML Sitemap? It would help Google index the site more quickly and with less overhead since it would only have to read the new pages. I've been using it on my site and have seen people come from google search in response to posts I had made just 30 minutes earlier. Of course it won't always be that quick, but it's gotta help.
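
For readers who haven't seen one, a sitemap is just an XML list of URLs with optional last-modified dates. The Python sketch below builds one in the sitemaps.org 0.9 format. The function name build_sitemap is hypothetical, and the URL used to try it out should be your own site's; none of this describes how drupal.org does or would generate its sitemap.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: Iterable[Tuple[str, str]]) -> str:
    """Build a sitemap document from (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # lastmod uses a W3C date such as 2007-07-01
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")
```

Because each entry carries its own lastmod, a crawler can skip pages that have not changed since its last visit.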

-------------------
http://www.PrivacyDigest.com/ News from the Privacy Front
http://www.SunflowerChildren.org/ Helping children around the world

gopher’s picture

Just throwing my 2 cents in here about search.

I use it a lot and find it VERY useful, especially the advanced search. Just wanted to put my vote in for hoping that google search is just a temporary solution.

I also used the 'Recent forum posts' block a lot.

Keep up the good work!!

JohnFilipstad’s picture

Good day!

I made this plugin with mycroft generator so I can search drupal.org via Google using the Search feature of Firefox and IE7.

Visit:
http://mycroft.mozdev.org/download.html?name=Drupal.org+via+Google&categ...
and just click on "Drupal.org via Google" plugin to install.

Then we can be patient in waiting for the activation of the search module here at drupal.org.

John
---------------------------------
http://www.drupalnorge.no
hvor oversettere, utviklere, leverandører, og brukere av Drupal i Norge samles!

agentrickard’s picture

Love this plugin. Can it have settings? I really want to search:

"myquery string site:drupal.org -site:lists.drupal.org -site:cvs.drupal.org"

--
http://ken.blufftontoday.com/
http://new.savannahnow.com/user/2
Search first, ask good questions later.
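
The query agentrickard asks for above can be assembled programmatically. This Python sketch only builds the Google search URL; it says nothing about whether the Mycroft plugin itself supports settings. The function name google_site_search_url is hypothetical.

```python
from typing import Sequence
from urllib.parse import urlencode


def google_site_search_url(query: str, site: str, exclude: Sequence[str] = ()) -> str:
    """Build a Google URL that searches one site and skips the excluded subdomains."""
    terms = [query, f"site:{site}"] + [f"-site:{s}" for s in exclude]
    # urlencode escapes the spaces and colons in the combined query string
    return "http://www.google.com/search?" + urlencode({"q": " ".join(terms)})
```

For example, google_site_search_url("myquery string", "drupal.org", ["lists.drupal.org", "cvs.drupal.org"]) produces the search quoted in the comment.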

mcrickman’s picture

This plugin works great with Firefox. Now I can search drupal.org at lightning speed.

restyler’s picture

I haven't used Drupal.org search for a year or so; I use Google with site:drupal.org. With all due respect to the Drupal developers, I don't think the integrated search engine is better than Google. It is still important to have this functionality in core, though, e.g. for intranets.

RussianWebStudio: improving the web

Will White’s picture

Server seems much more responsive. Thanks!

kbahey’s picture

Drats! I hope you did not jinx it ...
--
Drupal development and customization: 2bits.com
Personal: Baheyeldin.com

pbarnett’s picture

I've been using the site all day and the response time has improved significantly.

Early this morning, I could go and make a cup of tea waiting for a page to load.

Now, 'My recent posts' loads very quickly, and it's nice to see that the author name is now correct (I was browsing the recent changes to d.o.).

As an Open Source user and advocate since 1995 (and a Drupal user for a year), it gladdens my heart to see what this community can achieve.

Kudos to all involved in fixing the recent performance issues, especially webchick, David Strauss and dww!

cog.rusty’s picture

So, search was such a drain? And the database gets bigger and bigger...

I wonder if it has ever been discussed to move a good chunk of the forum posts, say 18 months old, to a separate "read-only archive" installation, which folks with old versions and historians could search. Even old sleeping issues could go there.

Or is the number of nodes insignificant compared to the number of queries? I don't know.

mfer’s picture

A big part of this is the number of people using d.o. Each time there is a new major release of drupal there is a significant increase in those who are using drupal and in turn d.o. We need additional resources to handle the number of people using the site.

--
Matt
http://www.mattfarina.com

kbahey’s picture

The sheer number of people searching, combined with the two pass nature of the search (two selects using a temporary table), makes it highly resource intensive.
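
To make the "two pass" shape concrete, here is a rough Python/SQLite sketch. It is not Drupal's actual search query. Only the table name search_index appears in this thread; the columns (word, sid, score), the temporary table name, and the function name two_pass_search are simplified assumptions.

```python
import sqlite3
from typing import List, Sequence


def two_pass_search(conn: sqlite3.Connection, words: Sequence[str]) -> List[int]:
    """Pass 1 fills a temporary table with per-item scores; pass 2 ranks it."""
    words = sorted(set(words))
    if not words:
        return []
    placeholders = ",".join("?" for _ in words)
    conn.execute("DROP TABLE IF EXISTS temp_search_sids")
    conn.execute("CREATE TEMPORARY TABLE temp_search_sids (sid INTEGER, relevance REAL)")
    # Pass 1: keep only items that contain every search word, with a summed score.
    conn.execute(
        "INSERT INTO temp_search_sids "
        "SELECT sid, SUM(score) FROM search_index "
        f"WHERE word IN ({placeholders}) "
        "GROUP BY sid HAVING COUNT(DISTINCT word) = ?",
        (*words, len(words)),
    )
    # Pass 2: read the temporary table back, best match first.
    rows = conn.execute(
        "SELECT sid FROM temp_search_sids ORDER BY relevance DESC, sid"
    ).fetchall()
    return [sid for (sid,) in rows]
```

Every search builds and then re-reads its own temporary table, which is why the cost grows quickly when many people search a large index at once.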

For those wanting to help, please file patches, review, test and benchmark existing patches, or donate.
--
Drupal development and customization: 2bits.com
Personal: Baheyeldin.com

Anonymous’s picture

Disabled the recent forum posts block on the side bar after identifying it was causing the most slow queries cumulatively by time.

maybe this should go into memcached with an expiry of 90 seconds or something?

kbahey’s picture

Caching, whether in the database, or in memcached, is for anonymous page views only.

Logged in users still get pages processed by Drupal.

There is an advcache module in the works that may remedy this issue, but it is not yet fully cooked.
--
Drupal development and customization: 2bits.com
Personal: Baheyeldin.com

Anonymous’s picture

Caching, whether in the database, or in memcached, is for anonymous page views only.

maybe i'm missing something, so i'll state my assumptions in case i've got them wrong.

first, the recent forum posts block is not filtered per-user, but just presents the most recent posts and would be the same for all users?

second, that it would be fairly trivial (read - i'd be happy to cook up a patch) to patch forum.module to use the memcache module installed on d.o. to check for a cached copy, then only generate a fresh one from the db on a cache miss?
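
The check-the-cache-first pattern described here, with the 90 second expiry suggested earlier, looks roughly like this in Python. It is a sketch only: TTLCache is a hypothetical in-process stand-in for memcached, and it is not the patch to forum.module.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process stand-in for memcached with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get_or_build(self, key: str, ttl: float, build: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]               # cache hit, still fresh
        value = build()                   # cache miss: run the expensive query
        self._store[key] = (now + ttl, value)
        return value
```

With a 90 second expiry the expensive query runs at most once per 90 seconds per cache, no matter how many pages show the block, at the price of the block being up to 90 seconds stale.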

Amazon’s picture

Hi, one of the reasons we added 2GB extra RAM to each web server was so that we could make better use of the memcache module. If you could provide a patch, we would make use of it.

Cheers,
Kieran

CTO CivicSpace
Try hosted and pre-configured Drupal 5 profile
http://civicspacelabs.org/create

Kieran Lal

Anonymous’s picture

i've created an issue for this with a (simple) patch attached. please review and comment.

robertDouglass’s picture

A solution would be to use the block cache module to cache that block with an expiry time. The nice thing about the block cache module is that it also plays nicely with the memcache setup.

- Robert Douglass

-----
Lullabot | my Drupal book

JohnForsythe’s picture

I'm not sure it helped. I'm still seeing timeouts trying to load the forums. Posting comments still takes 15 seconds.

--
John Forsythe
Need reliable Drupal hosting?

GoofyX’s picture

The 'Advanced search' link in the Contributor links block is still visible and leads to nowhere (http://drupal.org/search/node).
--
... Morpheus: What is "real"? How do you define "real"? If you 're talking about what you can feel, what you can smell, what you can taste and see, then "real" is simply electrical signals interpreted by your brain...

nimazuk’s picture

SEARCH is very useful, so please try to solve the search problem.

Thank you for your work.

N.Mehrabany
Baruzh web design & programming

Nima

clymber’s picture

Anyone else getting a completely blank drupal.org front page when they're not logged in? If I log in, I get the normal page, but when I'm logged out I get nothing. The blank page is not cached in my browser, it's being returned from drupal.org.

Is one of the reverse proxies caching a blank page for some reason?

Or, maybe this is part of the new performance tuning strategy? ;)

JirkaRybka’s picture

Personally, I used the Search previously, to avoid missing new posts not-indexed-yet by Google. But Google is acceptable for me as a temporary solution.

But I noticed that the "My recent posts" link is gone, which is a really big problem for me..... EDIT: I just noticed it still exists as a tab on "Recent posts", so it's only missing from the navigation menu. Would it be possible to fix this, so I don't have to pointlessly go via the page of all recent posts?

I also hope to see the "New forum topics" block come back, however limited (only after login, only on a special page, or the like). For some time now I have been trying to help with support by answering questions and offering help in the forums whenever I have a bit of spare time at the computer. (It gives me a kind of satisfaction to see problems solved with my help; I guess this approach is welcome.) But now it's quite difficult to see which questions are new, which greatly discourages the temptation to help.

Hope the performance problems get solved soon. Fingers crossed.

kbahey’s picture

Re: #2 "My recent posts".

What I did for the longest of time is bookmark the link in my browser.

In your case, it will be http://drupal.org/tracker/106464 (where the last number is your user ID).
--
Drupal development and customization: 2bits.com
Personal: Baheyeldin.com

JirkaRybka’s picture

Unfortunately, bookmarking pages is not an option for me, because I move between quite a few different computers (home, work, friends, internet café), and I can't bookmark on some of these (internet café, work where I'm not supposed to do this stuff...). For some time my communication was disrupted by the missing link, until I found the workaround via the tabs on the Recent posts page. After that the issue became much less critical to me, but the workaround is still neither a fast nor an elegant solution ;-)

newms’s picture

Why don't you use a social bookmark service like del.icio.us or ma.gnolia since you move around quite a bit?

I agree though that the "My Recent Posts" link, the search bar and forum block, were VERY useful. Hopefully the infrastructure will be improved enough that they will be returned soon.

newms

JirkaRybka’s picture

I just noticed: The "My recent posts" page seems broken a bit: There are only just 4 pages worth of my posts, but it shows pager for >9 pages, where #5 and later shows only "No posts available". :-/

kbahey’s picture

We put in a fix today for this. Please check again.
--
Drupal development and customization: 2bits.com
Personal: Baheyeldin.com

JirkaRybka’s picture

Yes, now the pager is perfectly OK. Thanks a lot for your efforts.

Caleb G2’s picture

The lack of a forum block is a bit distressing. Besides just being fun to look at, it also lets people know how active the site is. Without the new forum posts on the side it looks like there's a lot less going on here, imho. Assuming appearances/marketing count for anything...

Can absolutely vouch for the level of assistance that the block cache module can lend to the situation (can even be set to cache for logged in users).

Beyond that, this is a very awesome turn of events. I'm sure many people would love to see some posts about memcache performance and/or how to install squid for Drupal.

- Caleb
=====
Bloggyland.com

Robardi56’s picture

A "simple" block caching would reduce the problem of this forum block.
You can help test here: http://drupal.org/node/80951

Caleb G2’s picture

...so that doesn't help the current situation at Drupal.org at all. There is already a blockcache.module, contributed by Jeff Robbins, that works awesomely for Drupal 5.

=====
Bloggyland

aaron’s picture

At least I think it should be available on the front page. But I leave it to those watching the pulse to make the best decisions.

Aaron Winborn
Advomatic (Web Design for Progressive Advocacy, Grassroots Movements, and Really Cool Causes)

Anonymous’s picture

I had heard something about Drupal 4.x and 5.x loading the entire CMS infrastructure on each and every page load. I figured it was a server issue that made Drupal.org so slow at loading, searching, etc., but since the hardware just got upgraded, is the slowness due to the software itself?

I'm in the process of planning a site built on the Drupal 6 platform, but if I start to get thousands of hits per day, will my site bog down like Drupal.org?

robj

mfer’s picture

This is some software but a lot of hardware. Software can always be more optimized and drupal 6 will be that way. Some things will be found in drupal 7 and so on. There will always be some things. The popularity of drupal has caused there to be many many more people at the site which has spurred some innovation and optimization here.

But, the popularity means more hardware is needed. Any site would need that. Drupal has grown so much in popularity that it needs more hardware. And, they are working on it. When any site gets large it has to grow in the number of servers needed to host it. That's what drupal is doing right now.

--
Matt
http://www.mattfarina.com

Caleb G2’s picture

Drupal.org has been experiencing year-over-year traffic growth of more than 230% for the past few years. It's getting HUNDREDS of thousands of pageviews a day, and to top it off there's a ton of activity on Drupal.org by authenticated users (as opposed to anonymous), which is a much, much harder situation than the average site will ever face (one can serve an exponential amount of anonymous users for every logged in user).

All of the above is held in check by volunteers and shoe string budgets. Drupal.org has needed some hardware thrown at it for a good long time and now it's happening. Hooray. :)

=====
Bloggyland

octima’s picture

I've been a registered user of drupal.org for a few days. Like many others (I suspect), I'm mainly interested in learning about Drupal by browsing the handbooks and forums. This kind of activity does not require me to be logged in. However, I find that when I visit drupal.org, I'm automatically logged in, whether I want to be or not, adding to the burden on drupal's infrastructure. Would it be a better use of resources if users were left to decide for themselves whether they wish to be logged in to the site?

VM’s picture

Eventually, a user's session is reset according to the setting in settings.php, and they have to log back in. If users logged out when they left the site, their session would be terminated immediately. Logged in or not, search would still be affected when it is used.
_____________________________________________________________________________________________
give a person a fish and you feed them for a day but ... teach a person to fish and you feed them for a lifetime

joep.hendrix’s picture

It may not have anything to do with search performance. However, cached content is served to anonymous users.
Removing the automatic login could IMHO improve performance dramatically, since I think there are quite a few users who visit the site for the handbooks, API docs, etc. without needing to log in.

-----------------------------------------
CompuBase, websites and webdesign

-----------------------------------------
Joep
CompuBase, Dutch Drupal full service agency

vph’s picture

It is the software.

As an administrator of a VBulletin forum, a PHPBB forum and a Drupal portal, I know that Drupal is much much slower than the other forum software (all using Apache, PHP, MySQL). I hope 6.0 will be faster, and 7.0 will be even faster.

Take a look at http://phpbb.com and see how many threads and posts and active users they have, you'll see drupal.org is pale in comparison. But their site has always seemed a lot faster to me. Do they have superior hardware? I doubt it.

joep.hendrix’s picture

Please do not compare Drupal to just some kind of Forum software.
Drupal is a CMS and the forum is just one of the modules.

-----------------------------------------
CompuBase, websites and webdesign

-----------------------------------------
Joep
CompuBase, Dutch Drupal full service agency

cog.rusty’s picture

I notice that while they do have a link for their own search system, the text field on their header is a Google search.

Also, they have been careful to keep query pages like "latest discussions" behind a link.

al4711’s picture

Hi,

Which load balancer did you choose?

Do you know where the bottlenecks are (webserver, appserver, dbserver, application, ...)?

Is there a test environment or scenario to measure the problems?

Since there are many ways to deliver content, it would be interesting to know what the infrastructure looks like and where there are possibilities to optimize.

I have done performance optimization for some companies in the past, and it always takes a long time. Do you have a timeline or other restrictions? The site certainly has to stay up during the testing and optimization period.

Cheers

Aleks

killes@www.drop.org’s picture

Please join the infrastructure mailing list and read its archive, where a lot of these issues have been discussed.

--
Drupal services
My Drupal services

Dries’s picture

The slowest queries on drupal.org are the ones for (1) the tracker page, (2) the forum blocks, (3) the forum topics' next/previous links and (4) the search module. Since we disabled most of those features, drupal.org is again more usable. Of course, we want to re-enable all of those features as soon as possible.

To make that happen, we need help from developers and database experts. To provide some focus, I compiled a list of the 6 most important performance patches that you can help with. We hope to get some of these into Drupal 6, and to backport them to Drupal 5 so we can use them on drupal.org until we upgrade drupal.org to Drupal 6.

  1. Database replication: database replication will help us distribute the load among multiple database servers. This will help us with (1), (2), (3), (4) and more. Without this patch, we can't even take advantage of the extra hardware that we're installing. Needless to say, this patch is critical. Some extra background information and thoughts are available at http://buytaert.net/scaling-with-mysql-replication.
  2. Merge {node_comment_statistics} and {node_counter} into {node}: looking at the slow query log on Drupal.org we have reasons to believe that this patch could help us with (1) and (2) and (4). There are some reservations as well so we need people to help benchmark this patch so we can weigh the advantages and the disadvantages. After some good testing and benchmarking, we should be able to drive this patch home.
  3. Block caching: being able to cache expensive blocks would help us with (2) as it eliminates expensive queries.
  4. Tracker query rewrite: would help us with (1) because it rewrites an expensive query.
  5. Link handling in search module: this patch reduces the complexity of the search module and will help with (4). It helps performance and it makes for a better search.
  6. Path lookups: we're brainstorming about how we could reduce the number of look ups required for URL aliasing. I think we need to look at de-normalizing the path alias table, but there are some other ideas that are being proposed, measured and discussed.

(These patches are still allowed to change the Drupal APIs in Drupal 6, and are the _only_ patches that can break our APIs before Drupal 6 beta 1 is released.)
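
As a rough picture of what item 2 is about, the Python/SQLite sketch below compares a "most recently commented" query against two joined tables with the same query against one merged table. The schema is heavily simplified and the merged table name node_merged is hypothetical; the real tables have many more columns, and whether the merge is actually faster on MySQL is exactly what the requested benchmarking would have to show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE node_comment_statistics (nid INTEGER PRIMARY KEY, last_comment_timestamp INTEGER);
-- Denormalized variant: the statistic lives on the node row and can be indexed there.
CREATE TABLE node_merged (nid INTEGER PRIMARY KEY, title TEXT, last_comment_timestamp INTEGER);
CREATE INDEX node_merged_last_comment ON node_merged (last_comment_timestamp);
""")

# Made-up sample data: (nid, title, last comment time)
rows = [(1, "Install help", 300), (2, "Theme question", 100), (3, "Upgrade notes", 200)]
for nid, title, ts in rows:
    conn.execute("INSERT INTO node VALUES (?, ?)", (nid, title))
    conn.execute("INSERT INTO node_comment_statistics VALUES (?, ?)", (nid, ts))
    conn.execute("INSERT INTO node_merged VALUES (?, ?, ?)", (nid, title, ts))

# Sorting on a column from the joined table needs the join before the sort.
JOINED = """SELECT n.nid FROM node n
            JOIN node_comment_statistics s ON s.nid = n.nid
            ORDER BY s.last_comment_timestamp DESC LIMIT 2"""
# The merged table can be sorted through a single index.
MERGED = """SELECT nid FROM node_merged
            ORDER BY last_comment_timestamp DESC LIMIT 2"""


def recent_nids(sql: str) -> list:
    """Run one of the two queries and return the node ids in order."""
    return [nid for (nid,) in conn.execute(sql)]
```

Both queries return the same nodes; the difference is only in how much work the database does to get there.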

al4711’s picture

Hi,

Has anybody looked into Sequoia for these issues in the past?!

David Strauss’s picture

Sequoia is Java-based. Even using the PHP interface would be infeasible on inexpensive shared hosts.

themegarden.org’s picture

Even using the PHP interface would be infeasible on inexpensive shared hosts.

IMHO, it really doesn't matter, because drupal.org isn't hosted on an inexpensive shared host.

There is another reason why Sequoia isn't an option, and that is Drupal's architecture.

---
Drupal Themes Live Preview - themegarden.org

ontwerpwerk’s picture

inexpensive hosts matter - drupal.org runs the same drupal that you will be able to download

that's the whole point, drupal.org is a demo and a testcase of a large scale drupal implementation, but the system should be usable for anyone who knows how to get php hosting with a database

themegarden.org’s picture

I agree with you. Inexpensive hosts matter!
I was talking just about Sequoia, and whether or not it is feasible for drupal.org.
Sequoia is "clustered JDBC" (it is to databases something like what Squid is to web servers), and in some cases it can solve problems associated with large-scale deployments.
However, I don't think it can be a solution for drupal.org (for other reasons, not expensive/inexpensive hosting).

I know that drupal.org is also a "showcase" of Drupal itself, and that it's driven by (almost) unmodified Drupal.
You can run unmodified PHP/MySQL apps over Sequoia (LibMySequoia - http://carob.continuent.org/LibMySequoia).

And there are a lot of small to medium Drupal-based web sites running on inexpensive shared hosts.
But you can't compare inexpensive hosts with large-scale deployments on (multiple) dedicated hosts.
---
Drupal Themes Live Preview - themegarden.org

David Strauss’s picture

If #2 gets completed, the tracker must be rewritten as part of the patch. This tracker query rewrite will, as designed, solve the performance problems raised in #4.

Dries’s picture

Good point, David. This dependency is not unimportant. :)

Shai’s picture

Dries,

I really appreciate your posting on this thread. This has been such a nagging and frustrating issue, given how important drupal.org is to the project. Your posting really communicates a clear commitment to solve the problem.

Others have posted quite knowledgeably on the topic and clearly the infrastructure group folks are working their butts off -- but knowing this issue has your attention, as the lead of the project, is quite comforting to me as I expect it is to many other users at all levels who are taking significant risks (in order to achieve significant rewards) by developing with Drupal.

Drupal is awesome!

Shai Gluskin

http://everydayandeverynight.com/

mlncn’s picture

Since it got no attention...

Slow query number 3, the forum topics' next/previous links, is in my guess (meaning the way I use Drupal.org) not a common way to navigate the forum.

Perhaps a patch could allow these links -- perhaps tied in with throttle.module -- to be disabled?

Probably easier, the table "forum" could be denormalized with next and previous topics' nids.

Sorry for not having the time to offer a patch.

~benjamin, Agaric Design Collective

benjamin, Agaric

killes@www.drop.org’s picture

You can disable the next/prev links by implementing an empty theme function. We just did this, as we think nobody is really using these links.
--
Drupal services
My Drupal services

stdbrouw’s picture

If you're planning on dedicating an entire server to searches, you might as well consider http://www.google.com/enterprise/mini/ - it's google searches without the indexing latencies. Anyway, thanks to all those involved in making drupal.org nice and fast.

jsimonis’s picture

Thanks for the update on everything.

I know that I rarely use the search on Drupal. I pretty much always use Google, since I can get results much faster and I can even see the cached version when Drupal is down or really slow.

I have noticed that things are faster, especially the "track" tab under my account. It used to take several minutes to load. Getting into the forums is still slow, though.

--
Jenni S.
http://www.nu-look.net
Portland, OR metro area
Contact Me

Generallee’s picture

...the Google search mashup really can't become a permanent fix or even a medium-term temporary solution. Regardless of the fact that many of us use Google to do site search, it's very important that Drupal "eat its own dogfood" and have a usable, fast site search function, especially here on Drupal.org, to preserve the immense credibility it has as an open source CMS. The notion that it has to rely on Google for site search doesn't present the best image to people evaluating it against other competitors out there, whereas workable, fast site search on the vast archive of Drupal content says that it can stand up to any FOSS or commercial competition out there.

(My first post! YAY! :-)

Lee
--
Websites that work hard for you: MorganAlley.com
Personal opinions: Alley.me.uk

restyler’s picture

What is 'explosive growth' in numbers? Are there any stats on users/nodes for 2005-2007 on the drupal.org website?

RussianWebStudio: improving the web

mfer’s picture

hass’s picture

These stats are more than 6 months old...

alexrayu’s picture

Thanks for taking the time to improve the infrastructure, guys! Thanks for faithfully working on the projects and developing and supporting Drupal! Lord bless.

leotemp’s picture

Just my two cents: I find the "Newest forum post" block on the right aggravating anyway. It causes a lot of pages to come up on Google that have no relevance other than (I'm guessing here) that the title of the post I just searched Google for was in the block when Google indexed it.

If you do put it back, is there any way to avoid this problem? If so, please share.

MacRonin’s picture

I know the following works for Google AdSense, but I'm not sure if the regular Google search bots understand/follow the same rules.

Section targeting allows you to suggest sections of your text and HTML content that you'd like us to emphasize or downplay when matching ads to your site's content. By providing us with your suggestions, you can assist us in improving your ad targeting. We recommend that only those familiar with HTML attempt to implement section targeting.

To implement section targeting, you'll need to add a set of special HTML comment tags to your code. These tags will mark the beginning and end of whichever section(s) you'd like to emphasize or de-emphasize for ad targeting.

The HTML tags to emphasize a page section take the following format:

https://www.google.com/adsense/support/bin/answer.py?hl=en&answer=23168
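
The tag format itself appears to be missing from the quoted text, presumably because the tags are HTML comments. As documented by AdSense at the time, they were `google_ad_section_start` and `google_ad_section_end` comments, with a `(weight=ignore)` variant to de-emphasize a section. A minimal Python sketch of wrapping a block's markup in them follows; the `wrap_section` helper is hypothetical, and the markers are only documented for ad targeting, so there is no guarantee the regular search crawler honours them.

```python
# Section-targeting markers as documented for AdSense ad matching.
START_EMPHASIZE = "<!-- google_ad_section_start -->"
START_IGNORE = "<!-- google_ad_section_start(weight=ignore) -->"
END = "<!-- google_ad_section_end -->"


def wrap_section(html, ignore=False):
    """Wrap a fragment of HTML in section-targeting comments.

    Hypothetical helper: ignore=True asks for the fragment to be
    de-emphasized, ignore=False asks for it to be emphasized.
    """
    start = START_IGNORE if ignore else START_EMPHASIZE
    return f"{start}\n{html}\n{END}"


# Example: mark a sidebar block as something to downplay.
print(wrap_section("<ul><li>Newest forum post title</li></ul>", ignore=True))
```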

-------------------
http://www.PrivacyDigest.com/ News from the Privacy Front
http://www.SunflowerChildren.org/ Helping children around the world

Gerhard Killesreiter’s picture

Update:

The search is back.

Most of the other features are also back.

The tracker query is a bit broken.

Memcached is also not working too well.

We now have a master/slave setup.

GoofyX’s picture

What about the 'Advanced search' link in the Contributor links block?

Good work guys.
--
... Morpheus: What is "real"? How do you define "real"? If you 're talking about what you can feel, what you can smell, what you can taste and see, then "real" is simply electrical signals interpreted by your brain...


JohnForsythe’s picture

Now that's more like it, things actually seem faster now :)

--
John Forsythe
Need reliable Drupal hosting?

Moonshine’s picture

Things seem very snappy this morning. :)

I think you guys are on the right track w/ Squid and multiple MySQL servers. Having a master MySQL server for inserts/updates/deletes and slave MySQL servers for selects is a popular layout for high traffic sites like Yahoo's Flickr. This is a PowerPoint presentation (yeah, sorry) that I bookmarked a while back that has some solid information that might be useful. It's by John Allspaw (Flickr) at the '05 Zend PHP conference in San Francisco and discusses high traffic layouts very similar to what you're moving towards.

http://www.ludicorp.com/flickr/zend-talk.ppt
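
To make the master/slave split concrete, here is a minimal sketch in Python of routing statements by type. The `QueryRouter` class and the server names are hypothetical, and this is not how Drupal or Flickr implement it; it only illustrates the idea that SELECTs can go to any slave while everything else goes to the master.

```python
import itertools


class QueryRouter:
    """Send SELECTs to the slaves in turn and everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the slaves; with no slaves, all traffic hits the master.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql):
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT" and self._slaves is not None:
            return next(self._slaves)
        # INSERT/UPDATE/DELETE and anything unrecognised go to the master.
        return self.master


router = QueryRouter("db-master", ["db-slave1", "db-slave2"])
for statement in [
    "SELECT nid FROM node WHERE uid = 7",
    "INSERT INTO comments (nid, uid) VALUES (1, 7)",
    "SELECT cid FROM comments WHERE nid = 1",
]:
    print(router.route(statement), "<-", statement)
```

One caveat with any such split: replication is asynchronous, so a read sent to a slave immediately after a write may not see that write yet. Reads that must be current are usually sent to the master.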

cog.rusty’s picture

About the tracker query patch which is currently installed, it should not go unnoticed that one of its performance benefits probably comes from a usability fix.

The previous tracker query had a bug with the node order which made me check the first *three pages* every time, to find the updated threads where I was involved. Now I can safely check only the first page.

kbahey’s picture

The original tracker functionality (like in a plain vanilla 5.x install) was used on Drupal.org (d.o).

The sheer number of nodes, comments and users on d.o caused these queries to be very very slow and hard on the database.

So, a patch was devised by David Strauss to reduce this. That patch changed the behaviour of the queries, which caused the side effect you mention.

The current patch is basically a UNION variant of the 5.x tracker, so functionality should be the same as that.
It preserves the database schema as it is for 5.x, and hence is compatible with the data all sites have in their databases.
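
To illustrate what a UNION variant means here, the following Python sketch compares an OR-style query with an equivalent UNION on an in-memory SQLite database. The two-table schema is a heavily simplified assumption, and the queries are not the actual tracker patch; they only show that splitting "nodes I wrote OR nodes I commented on" into two halves returns the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-ins for the node and comments tables (assumed columns).
conn.executescript("""
CREATE TABLE node (nid INTEGER PRIMARY KEY, uid INTEGER, changed INTEGER);
CREATE TABLE comments (cid INTEGER PRIMARY KEY, nid INTEGER, uid INTEGER);
INSERT INTO node VALUES (1, 7, 100), (2, 8, 200), (3, 9, 300), (4, 7, 400);
INSERT INTO comments VALUES (1, 2, 7), (2, 2, 7), (3, 3, 8), (4, 4, 7);
""")

# One query with an OR that spans two tables.
OR_QUERY = """
SELECT DISTINCT n.nid AS nid, n.changed AS changed
FROM node n LEFT JOIN comments c ON c.nid = n.nid
WHERE n.uid = :uid OR c.uid = :uid
ORDER BY changed DESC
"""

# The same result as two simpler SELECTs; UNION removes the duplicates.
UNION_QUERY = """
SELECT n.nid AS nid, n.changed AS changed FROM node n WHERE n.uid = :uid
UNION
SELECT n.nid AS nid, n.changed AS changed
FROM node n JOIN comments c ON c.nid = n.nid WHERE c.uid = :uid
ORDER BY changed DESC
"""

or_rows = conn.execute(OR_QUERY, {"uid": 7}).fetchall()
union_rows = conn.execute(UNION_QUERY, {"uid": 7}).fetchall()
print(or_rows)
print(union_rows)
```

The appeal of the UNION form is that each half filters on a single table, so each can use its own index. Whether that is actually faster depends on the database and its optimizer, so it is worth measuring rather than assuming.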

Again, this is a temporary measure. There is a movement to change the schema for that part to make it more efficient, but that would be for Drupal 6 or beyond.
--
Drupal development and customization: 2bits.com
Personal: Baheyeldin.com


J.B-2’s picture

I've been looking at Drupal for a while but have always had a concern that Drupal sites are 'slow' (comparatively). And I was starting to realise that the biggest problem Drupal had was its own website - it is a VERY bad advert for Drupal because it is SO slow. Same with the API site, and then it got completely out of hand with the Barcelona Drupalcon site, which was completely impossible to get to at some times - and I'm going, so that's why I was interested.

I guess the point I'm making is that Drupal has the WORST sites as far as advertising the product. I looked at the hardware list being thrown at this and I laughed. If fundamental things have to be axed given that hardware, then I have NO CHANCE of convincing customers to move to this platform. Guys, I so want to advise people to move here, but it's just, well, not convincing. Convince me...

mfer’s picture

Over the last few days you should have noticed the speed boost. And now all the functionality should be back.

What Drupal managed to do on one database server before it went to two is quite remarkable.

Now, it should be snappy. What's acting slow now?

--
Matt
http://www.mattfarina.com

kbahey’s picture

You are right that drupal.org is not a good advertisement for Drupal.

However, to be fair, the reason for the slowness is that drupal.org is being used like a Bugzilla or JIRA (an issue tracking system) by a large number of logged in users. The contributors to the project (not only committers, but those who review patches, write documentation, ...etc.) all check updates to threads they participated in, followups to their issues, issues they themselves followed up on, ...etc.

That usage is atypical for a normal site, where anonymous users are the bulk of visitors.

Combine the fact that we have a large number of logged in users all the time, with drupal.org having a huge number of nodes (160k), users (160k), and comments (?), and you know why the slowness is happening.
--
Drupal development and customization: 2bits.com
Personal: Baheyeldin.com


andremolnar’s picture

Just thinking out loud - but wouldn't it make sense to split project into its own Drupal instance on another server and simply share the user table? That might make things a bit more 'typical'.

andre

BryanSD’s picture

Or perhaps use Drupal.com to promote/document Drupal and use Drupal.org for the development/trouble ticket aspects? Then again that may not be the best idea. One of the nice things about the current Drupal.org is the tightness in the relationship between users and developers. In many cases...they're also the same. We're all aware of projects where the developers appear to be "missing" from the site.

Bryan
CMSReport

GoofyX’s picture

Scratch Drupal in homepage? :-)
--
... Morpheus: What is "real"? How do you define "real"? If you 're talking about what you can feel, what you can smell, what you can taste and see, then "real" is simply electrical signals interpreted by your brain...


kbahey’s picture

scratch.drupal.org is a test virtual host that often has a mirror copy of drupal.org for testing purposes.

A human error caused the theme to display the logo of the test site.

It has been fixed. Nothing to worry about.
--
Drupal development and customization: 2bits.com
Personal: Baheyeldin.com


joep.hendrix’s picture

Thanks all for the great improvement!
I'm glad that the search and the recent forum topics are back!

Well done!

Joep

-----------------------------------------
CompuBase, websites and webdesign

-----------------------------------------
Joep
CompuBase, Dutch Drupal full service agency

rdwest2005’s picture

Hello guys,
My ENGLISH is very bad, please overlook it.. I'm afgany and from a hut :) - -

I'm not trying to be a smart arss here either guys... Just do the test and see what you get...

I'm sure I can put you guys on the right track.
I joined the mailing list and sent a couple mails, but I left the list when I found OSU was running Gentoo.

A couple questions?...
Are the installs from the liveCD ?
Are the systems running Genkernel ?

Here is a real test for you guys!
I'm sure I know how to make script execution so fast that Drupal.org would load in the blink of an eye!

My laptop is a Sony Vaio - XP/Gentoo dual boot - Centrino Core 2 Duo T5600 1.83 GHz - 1 GB RAM - 120 GB drive

MySQL : Distrib 5.0.44
Apache : Apache/2.0.58
PHP : 5.2.3-pl3-gentoo

I'll do a dump from my laptop...

#CREATE DATABASE `linksdb`;
#USE `linksdb`;
#CREATE TABLE `links` (
# `id` int(11) NOT NULL auto_increment,
#  `link_name` varchar(100) NOT NULL,
# `link` varchar(200) NOT NULL,
#  `link_descr` text NOT NULL,
#  `name` varchar(50) NOT NULL,
#  PRIMARY KEY  (`id`)
#) ENGINE=MyISAM  DEFAULT CHARSET=utf8 ;

<?php

// Record the start time as a float (seconds, with microseconds).
$starttime = microtime(true);

// Connect and select the test database in one call.
$db = mysqli_connect("localhost", "user", "pass", "linksdb");

$link_name = "Link Name ";
$link = "http://www.Gentoo.org/";
$link_descr = "Worlds Fastest Linux Distro---Beeuu-tion";
$name = "RD West Sr.";

// Counter appended to each link name; change it for later runs.
$num = 1;

/* Insert 100,000 rows. The id column is omitted so auto_increment assigns it. */
for ($i = 0; $i < 100000; $i++) {
    mysqli_query($db, "INSERT INTO `links` (link_name, link, link_descr, name) VALUES ('$link_name $num', '$link', '$link_descr', '$name')");
    $num++;
}

echo "Done ...<br><br>";

// Report the elapsed wall-clock time.
$totaltime = microtime(true) - $starttime;
echo 'This page was created in ' . $totaltime . ' seconds.';

mysqli_close($db);
?>

Create a test user / pass for this...

My laptop loops through the insert of 100,000 rows, with 5 columns of text each, in........

----> 7 seconds flat :)

Run the test...

Also, change the script to insert 500,000 records ($i<500000), but clear out the first 100,000 test rows beforehand.
Do this 4 times (4 separate runs so you don't exceed the ini execution time limit), changing $num to 500001 for the second run, 1000001 for the third, and 1500001 for the fourth.

So you should have 2 million rows - k?
Then paginate 30 records per page.
Now you have about 66,667 pages - k?

Do select * and go from the first 30 results to the last (page 1 to the last page)

My Sony laptop does this in 1.6 seconds!
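
For reference, the pagination arithmetic and a page fetch can be sketched as follows, in Python with an in-memory SQLite table standing in for the MySQL `links` table. The helper names are hypothetical and the table is cut down to two columns and 100 rows to keep the example quick. At 30 rows per page, 2 million rows come to 66,667 pages.

```python
import sqlite3


def page_count(total_rows, per_page):
    """Number of pages needed, rounding up (integer ceiling division)."""
    return -(-total_rows // per_page)


def fetch_page(conn, page, per_page=30):
    """Fetch one page of rows with LIMIT/OFFSET; pages are numbered from 1."""
    offset = (page - 1) * per_page
    return conn.execute(
        "SELECT id, link_name FROM links ORDER BY id LIMIT ? OFFSET ?",
        (per_page, offset),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (id INTEGER PRIMARY KEY, link_name TEXT)")
conn.executemany(
    "INSERT INTO links (link_name) VALUES (?)",
    ((f"Link Name {n}",) for n in range(1, 101)),
)

print(page_count(2_000_000, 30))
print(len(fetch_page(conn, 4)))
```

Note that with LIMIT/OFFSET the database still has to step past all the skipped rows, so the last pages of a large table are slower to fetch than the first. A timing for "page 1 to the last page" depends heavily on which pages are actually requested.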

Do the tests...
See if the guys at OSU have Gentoo performing at its full potential!

I think 1 master and 2 slaves that I've built will outperform the future infrastructure!

Let me know some results [ rdwest2005 [ at ] gmail [ d0t] com

[EDIT]
PS:
When I clicked this edit, it took like 10 seconds to respond for me here on a Charter DOCSIS broadband 3 Mbit account.
I've done complete DMOZ MySQL imports that take approx. 3 or 4 hours, and my servers never glitch and are always instantly responsive...

Richard West

mdekkers’s picture

rdwest2005’s picture

The world's fastest OS at the moment, from any results I've seen.

LOL

I just sent this mail to the list...
I got to get to work - l8r guys

Kieran,
Sorry, but that one guy, David, about got me p-o-ed...
You guys are way past the loads I've encountered, for sure...
If you ever get the resources, get the servers compiled 64-bit with all optimizations and you'll see...

but for help with the present...
compensate for low entropy on headless boxes
also drop Apache's nice value to -10 and MySQL's to -5 (should get at least a 25% gain)
It's really the CMS, so you must have the fastest OS to get the fastest response times
So every single wee lil optimization is crucial under load
~R

Talk is cheap... Results are winners! NooB's are cry baby-ies

LOL

Richard West

joep.hendrix’s picture

Remove redundant indexes!

see issue http://drupal.org/node/172724
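
As a general illustration of what "redundant" usually means for an index (not a description of what the linked issue changes), an index is a candidate for removal when its columns are a leading prefix of another index on the same table. A small Python sketch, using hypothetical index names:

```python
def redundant_indexes(indexes):
    """Return (redundant, covering) pairs of index names.

    `indexes` maps an index name to its ordered tuple of column names.
    An index counts as redundant when its columns are a strict leading
    prefix of another index's columns.
    """
    found = []
    for name, cols in indexes.items():
        for other, other_cols in indexes.items():
            if (
                name != other
                and len(cols) < len(other_cols)
                and other_cols[: len(cols)] == cols
            ):
                found.append((name, other))
                break  # one covering index is enough to flag it
    return found


# Hypothetical indexes on a single table.
example = {
    "nid": ("nid",),
    "nid_uid": ("nid", "uid"),
    "uid": ("uid",),
}
print(redundant_indexes(example))
```

Every extra index has to be updated on each insert, update, and delete, so dropping prefix duplicates reduces write cost. Unique and primary key indexes are a different matter, since they enforce a constraint and should not be dropped on this basis alone.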

-----------------------------------------
Joep
CompuBase, Drupal websites and design

-----------------------------------------
Joep
CompuBase, Dutch Drupal full service agency