As many of you are no doubt aware ;) drupal.org has been pretty slow to respond lately. And while there has been a lot of work going on to resolve this problem by various members of the infrastructure team, there has not been a lot of information in terms of forum posts and such to keep the community at large informed.

So here's to changing all of that. :)

NOTE: Most of this content comes from a recent e-mail to the infrastructure mailing list (requires registration) from Eric Searcy of OSU Open Source Lab (OSL), drupal.org's generous hosting provider.

What's the deal?

The root of the problem is that our database server is overloaded. This is causing table lock contention:

Table locking is also disadvantageous under the following scenario:
* A client issues a SELECT that takes a long time to run.
* Another client then issues an UPDATE on the same table. This client waits until the SELECT is finished.
* Another client issues another SELECT statement on the same table. Because UPDATE has higher priority than SELECT, this SELECT waits for the UPDATE to finish, and for the first SELECT to finish.

This is due, at least in part, to the massive size of drupal.org's database, which houses over 6 years' worth of content. The most expensive query is the "my recent posts" page, which is also (of course) one of our most popular pages.
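For the curious, here is a rough sketch of why that tracker query is expensive. The tables below are an invented, much-simplified stand-in for Drupal's node and comments tables (SQLite here, not the MySQL setup drupal.org runs): finding everything one user authored *or* commented on forces a join across the two biggest tables, and the OR makes the query hard to satisfy from any single index.

```python
import sqlite3

# Hypothetical, heavily simplified stand-ins for Drupal's tables --
# not the real schema, just enough to show the query's shape.
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, uid INTEGER, title TEXT, changed INTEGER)")
c.execute("CREATE TABLE comments (cid INTEGER PRIMARY KEY, nid INTEGER, uid INTEGER, timestamp INTEGER)")
c.execute("INSERT INTO node VALUES (1, 42, 'first post', 100)")
c.execute("INSERT INTO node VALUES (2, 7, 'another post', 200)")
c.execute("INSERT INTO comments VALUES (1, 2, 42, 300)")  # user 42 commented on node 2

# "My recent posts" for uid 42: every node they authored OR commented on,
# ordered by most recent activity.  The OR plus the join over a user's
# entire posting history is what makes this expensive on a big site.
rows = c.execute("""
    SELECT n.nid, n.title, MAX(COALESCE(cm.timestamp, n.changed)) AS last_activity
    FROM node n LEFT JOIN comments cm ON cm.nid = n.nid
    WHERE n.uid = 42 OR cm.uid = 42
    GROUP BY n.nid
    ORDER BY last_activity DESC
""").fetchall()
print(rows)  # both nodes user 42 touched, newest activity first
```

The real tracker query differs in detail; only the shape of the problem is shown here.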

What's been done so far?

  1. OSL has performed some database server optimizations, which have helped decrease the load some.
  2. A caching proxy (Squid) has been installed to help divert some read-only DB traffic by caching frequently accessed pages like the home page and RSS feeds. This saves about half the hits to the database.
  3. Installed a variation of the "move users.access into its own table" patch, to reduce user table locking.

What are we doing about it?

  1. OSL is working on moving drupal.org over to a new database cluster with more RAM that can run more concurrent threads. Currently, there are some issues with character encoding and performance.
  2. We are pursuing changing over drupal.org's tables to use InnoDB rather than MyISAM. This allows us to do *row* level locking rather than *table* level locking.
  3. Working to improve the tracker query. See below.

How can I help?

  1. Donate to the Drupal Association, which gives us more resources to throw hardware at the problem.
  2. Help test the new database setup on scratch.drupal.org by poking around, doing some of your normal stuff, and reporting any errors to the infrastructure issue queue (remember to search first!).
  3. Join efforts to try to improve tracker query performance and help get a patch into core for Drupal 6.
  4. Join efforts to move users.access into its own table and help get a patch into core for Drupal 6.
  5. If you know a lot about tuning servers and databases and whatnot, join the infrastructure mailing list and lend your expertise.

Comments

Robardi56’s picture

Nice post chick, but it won't replace the need for a direct "Donate" link on drupal.org's homepage.
This direct link would need to point to a page explaining how donations are used (like for improving infrastructure); it could list the latest contributors, the amount raised, etc. (these are just ideas).

On Wikipedia you have a link: "Your continued donations keep Wikipedia running!".
I bet a "Your continued donations keep drupal developing" would certainly help people feel more involved.

JohnForsythe’s picture

Speaking of a direct donation link.. here it is:

http://association.drupal.org/donate

--
John Forsythe
Need reliable Drupal hosting?

jwalling’s picture

I checked the donation link. It required me to be logged into an association account to make a PayPal donation. That is an impediment to making a donation. Donations should be easy and convenient, i.e., a direct link from the main site.

AbeLincoln’s picture

Giving should be quick and easy. I wonder why they make you verify your email address etc. to give them money. They could even just give an email address to send donations to (donate (A) drupal.org). That would be quicker and more likely to get me (though I'm a small-dollar type, it would be worth it for others too). They could maybe let me "subscribe" and give a few pounds/month.
___________________
Jim Jam and Sunny Fan Forum

NancyDru’s picture

The most expensive query is the "my recent posts" page, which is also (of course) one of our most popular pages.

As I asked somewhere once upon a time, and got no answer, "How recent is recent?" It appears "recent" is an incorrect term, unless one thinks in Biblical time. ;-) Apparently, it really means everything I've ever posted and mostly forgotten about.

It might help to reduce the load if we (i.e. an individual user) could set an interval (say 30 days) so Tracker only has to go that far back.

Plus, there are many posts in my "recent posts" that I no longer wish to track at all. If there were some way to remove those, it might help a bunch. It would certainly make me happier to not have those things popping up and slowly moving down my list. I realize that this might mean removing my uid somehow from the posts, but I'm willing to have that happen. (I guess, to retain some audit trail, you could always just set the uid negative so it wouldn't match the query.)

Nancy W.
Drupal Cookbook (for New Drupallers)
Adding Hidden Design or How To notes in your database

Robardi56’s picture

What about caching this page, and refreshing it only when a new post has been made in the meantime?

Walt Esquivel’s picture

I may be naive about this and it may already be a planned feature for a future Drupal release but...

Why doesn't drupal.org simply add a "My unread" tab similar to that found on groups.drupal.org's "Unread posts in my groups" (http://groups.drupal.org/unread) page?

The latter on g.d.o only pulls up my unread items that pertain to the groups I've joined.

Just a thought.

Walt Esquivel, MBA; MA; President, Wellness Corps; Captain, USMC (Veteran)
$50 Hosting Discount Helps Projects Needing Financing

styro’s picture

But I think that feature on g.d.o is a View - so it is probably even more expensive.

The expensiveness of these queries isn't so much related to how many results they match, but to what they need to check/calculate to get those results: e.g. how many tables they need to look in (joins), what tables get locked, how many columns get checked and the indexes on those columns, how many rows they need to sift through, etc.

I don't think limiting the result sets would speed anything up much - after all the limit then becomes another check that the query needs to take account of. And I think the tracker is already doing that to show you the 1st or nth page of results anyway.

Corrections welcome from more knowledgeable folks though...

--
Anton
New to Drupal? | Troubleshooting FAQ
Example knowledge base built with Drupal

Island Usurper’s picture

From what I remember of my databases class, things like WHERE and LIMIT clauses happen after the database engine has collected the entire result set from the disk. This would also include JOIN ... ON from what I understand, because it matches every row of one table with every other row, then throws away the rows that don't match. I'm sure that SQL optimizes things where it can, but I sure haven't taken the time to figure out exactly how it does that.

-----
Übercart -- One cart to rule them all.

David Strauss’s picture

That's incorrect. Databases always try to restrict the result sets to the WHERE criteria before processing JOINs and LIMITs.

syscrusher’s picture

You can learn a lot about how the queries work internally by using the EXPLAIN verb in MySQL. Just type "EXPLAIN ...." where "...." is the query you would have normally executed. I'm about 98% sure EXPLAIN works in PostgreSQL also. I know for a fact there's an equivalent if not, because I've used it...I just don't remember the syntax because I don't use Postgres every day. :-)
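As a concrete illustration, here is the same idea using SQLite's EXPLAIN QUERY PLAN, which is the rough analogue of the MySQL EXPLAIN described above (the exact syntax and output differ between databases; the table is invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE example_table (id INTEGER PRIMARY KEY, field_a INTEGER)")

# Prefix the query you would normally run to get the access strategy
# back instead of the result rows (in MySQL: plain EXPLAIN).
plan = c.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM example_table WHERE field_a = 3"
).fetchall()
print(plan[0][3])  # the detail column reports a full table SCAN: no index on field_a
```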

Have fun!

Syscrusher

peterx’s picture

    select * from example_table where field_a = 3

If field_a is indexed, MySQL and other databases can select the right rows in the index before reading the rows. When field_a is not indexed, all databases have to scan through every row to find rows containing the right value. Indexing is important. The better databases have ways to show you what happens with and without indexes. MySQL and some other databases have the EXPLAIN statement to indicate how the database will search for the right rows.

Some joins can use the indexes on each table to create the result set without reading a row.

When databases have a prepare step separate from the execute SQL step, the prepare step works out the search strategy. You effectively precompile the SQL. That saves you time if you execute the SQL repeatedly.

Some databases have optimize type functions that analyze the indexes and save the results to help work out the best strategy during the prepare step. You might set a cron job to run the optimize once a week.

In a rapidly changing database, EXPLAIN works better after optimize.

Indexes slow down updates and speed up reads. When you have far more selects than updates or inserts, index every field used in a WHERE, ON or SANTA clause.
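The effect peterx describes can be seen directly from the query planner. This sketch uses SQLite as a stand-in for MySQL (the planner wording differs, but the principle of "index the WHERE column and the scan becomes a seek" is the same):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE example_table (id INTEGER PRIMARY KEY, field_a INTEGER)")

query = "SELECT * FROM example_table WHERE field_a = 3"

# Without an index, the planner has no choice but to scan every row...
before = c.execute("EXPLAIN QUERY PLAN " + query).fetchall()[0][3]

# ...but after indexing the column used in the WHERE clause, it can
# seek straight to the matching rows in the index b-tree.
c.execute("CREATE INDEX idx_field_a ON example_table (field_a)")
after = c.execute("EXPLAIN QUERY PLAN " + query).fetchall()[0][3]

print(before)  # a SCAN of the whole table
print(after)   # a SEARCH using idx_field_a
```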

petermoulding.com/web_architect

erika’s picture

Yes that would be a good idea...
The user could have his own choice to either keep or not keep posts in his recent posts list,
but the posts must always remain available online to support newbies.
Users should be able to remove recent posts only from their own tracking;
for example, you could remove your recent posts, but if I want to track them I should still be able to see your posts, because I'm a newbie and they would definitely be useful to me.
Makes sense?

drupalnesia’s picture

I hope you put the S/W optimization higher than H/W, because this problem affects not only the Drupal.org site but our sites too, and it's impossible to ask our hosting providers to replace their servers any time soon. Maybe this is the time to re-enable the Archive module (which was disabled in Drupal 5.x) with some additional features, such as archiving certain postings after a certain period.
The difficult part is the native Drupal DB structure (compared to other CMSs), which stores almost EVERYTHING in the node table, i.e. pages, blogs, and forum posts all in ONE table.

CAUTION!
Moving from MyISAM to InnoDB is not always a good solution. Why?
1. Most Drupal users/webhosts use MyISAM, not InnoDB.
2. For DB SELECT operations, MyISAM is faster than InnoDB.
3. "* Another client then issues an UPDATE on the same table.": because Drupal puts Page, Blog, Story and Forum in the same table. What you could do is separate out the Forum table, while Page, Blog and Story stay in one table. Most DB UPDATE activity is in Forum, not in Page, Blog or Story (except when the website is used for blogging, so maybe Blog should go in a new table as well).

Just my opinion.

magico’s picture

Agreed and...

1. do not throw hardware at it.
2. do not use any SQUID or other proxy systems
3. redefine the table structure to separate the more frequently read fields into one read-only table and the more frequently written fields into another
4. reduce, reduce, reduce number of queries into the database
5. implement Drupal aggressive cache into filesystem (for both pages, menus and blocks)
6. break "node" table into "node_nodetype" tables, using "node" table only for aggregation
7. I'll take a look at tracker SQL

Regards,
----------
Fernando Silva
Openquest - Sistemas de Informação, Lda

NancyDru’s picture

3 and 4 are often in conflict. This is why I am not always a fan of complete normalization. For example, the comments table includes uid to track who made the comment. In a site where user names don't change, why not include the whole user name in the row, rather than having to do another query to get it? Accessing data on a disk drive is much slower than transferring it once found.

Nancy W.
Drupal Cookbook (for New Drupallers)
Adding Hidden Design or How To notes in your database

jeff_’s picture

In a site where user names don't change, why not include the whole user name in the row

In a site where user names don't (or rarely) change, why have a UID at all?

webchick’s picture

It's much faster to do:

"SELECT stuff FROM user WHERE uid = 456"

than:

"SELECT stuff FROM user WHERE name = 'Bob, The \'Magical\' Flying Cupcake!'"
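A sketch of this point, using SQLite as a stand-in for MySQL (Drupal's real table is {users}; this toy schema is just for illustration). The integer key goes straight to the primary-key b-tree, while the un-indexed name forces a row-by-row string comparison:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE user (uid INTEGER PRIMARY KEY, name TEXT)")
name = "Bob, The 'Magical' Flying Cupcake!"
c.execute("INSERT INTO user VALUES (456, ?)", (name,))

# Lookup by integer key: a direct seek into the primary-key b-tree.
by_uid = c.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM user WHERE uid = 456").fetchall()[0][3]

# Lookup by name: with no index on name, every row's string gets compared.
by_name = c.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM user WHERE name = ?", (name,)).fetchall()[0][3]

print(by_uid)   # a SEARCH via the integer primary key
print(by_name)  # a full table SCAN
```

(An index on name would help, but as the discussion below notes, it would be a much larger index than the integer one.)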

jeff_’s picture

How much faster is it, really? Do you think that the server is spending a significant amount of time in strcmp()?

If you mean the index will be bigger, you have a point. But that's really just kind of a weird way of compressing the key before indexing it (by carrying a UID around all the code). In theory, the index could be smart enough to do this for you.

I'm just trying to say that the focus should be on the big performance problems first. If locking is the issue, use something that doesn't lock as aggressively (like InnoDB). Revisit the table design from a higher level if necessary.

If they scrape up a percentage point here or there by making redundant table designs, they'll be worse off. Better to get at the heart of the performance issue. If you see that a trick of some kind is required in a certain area, that's OK.

hass’s picture

Every time - it's bad to compare strings if not required... you need an index, and that index needs to be scanned, cached, updated and so on. Using an auto-increment field is better in all cases. You will not see speed differences with "some" users... but with some thousands you will have an issue. Besides, saving a small number in a session variable will reduce the session scope size, and you win memory; if memory is an issue, that's better. But that's another problem that will only pop up on high-volume sites, not on a private homepage without traffic :-).

syscrusher’s picture

...but rather the disk I/O required to scan through the index data. Also, more data scanned will typically mean more traffic through the database's cache, and this data will displace something else that could have been in that cache instead.

Syscrusher

peterx’s picture

Using an id instead of a name saves time when there are several steps from one piece of data to another and should be looked at more.

If you have example tables ex_a and ex_b joined on an id instead of name, you save time all round.

Take a step up to a many to many join table. example_table_person is indexed on name, example_table_city is indexed on city, and then you create a many to many joining table indexed on name and city. Person "Adolph Blaine Charles David Earl Frederick Gerald Hubert Irvin John Kenneth Lloyd Martin Nero Oliver Paul Quincy Randolph Sherman Thomas Uncas Victor William Xerxes Yancy Wolfeschlegelsteinhausenbergerdorff" could be joined to city "Krungthep Mahanakhon Amonrattanakosin Mahintharayutthaya Mahadilokphop Noppharatratchathani Burirom-udomratchaniwet Mahasathan Amonphiman Awatansathit Sakkathattiya Witsanu Kamprasit". It is much easier to index example_table_person on pid, example_table_city on cid, and example_table_person_city pid-cid.
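peterx's example, sketched in SQLite (invented data, abbreviated names): each long string is stored once under a small integer key, and the many-to-many join table carries only compact (pid, cid) pairs instead of two enormous strings per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE example_table_person (pid INTEGER PRIMARY KEY, name TEXT)")
c.execute("CREATE TABLE example_table_city (cid INTEGER PRIMARY KEY, city TEXT)")
# The joining table is indexed on the integer pair, not the long strings.
c.execute("""CREATE TABLE example_table_person_city (
                 pid INTEGER, cid INTEGER,
                 PRIMARY KEY (pid, cid))""")

c.execute("INSERT INTO example_table_person VALUES (1, 'Adolph Blaine Charles David Earl ...')")
c.execute("INSERT INTO example_table_city VALUES (1, 'Krungthep Mahanakhon Amonrattanakosin ...')")
c.execute("INSERT INTO example_table_person_city VALUES (1, 1)")

# The matching runs over compact integer keys; the long strings are only
# touched when producing the final output.
rows = c.execute("""
    SELECT p.name, ct.city
    FROM example_table_person_city pc
    JOIN example_table_person p ON p.pid = pc.pid
    JOIN example_table_city ct ON ct.cid = pc.cid
""").fetchall()
print(rows)
```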

I set up a payroll system with a 40-character surname and found the company had employees with surnames longer than 40 characters. I expanded the surname to 44 characters to handle the longest name; then the company hired someone with a surname longer than 44 characters.

petermoulding.com/web_architect

Leeteq’s picture

Also here on Drupal.org it is possible to change the user name. That is a very valuable feature. I certainly hope it is not abandoned, at least not in how Drupal core deals with it. That potential is one of the things that makes Drupal much more flexible than other CMS's.

But I often wonder how much "more expensive" it would be to actually store the user name along with the UID (in its own column) only in some selected tables that play a significant role regarding performance.

That way, many of the queries could save "unnecessary" lookups in the users table. I am no db performance expert, though.

Whenever a user name change happens (infrequently, but still important to have that option), then that extra column in those (very) few tables could be updated as well. That could be combined with a "delay/throttle" setting (must then delay the new user name from being stored in the users table as well, or it may lead to confusion), that will postpone such a query for a maximum amount (x) of time. For example timed to run during the low-traffic periods.

Hence, the user name change may not happen immediately, but will at least be possible and get done. This could be relevant if such a table update is very costly. All user name updates of "today" could be queued to be run during the night.

--
( Evaluating the long-term route for Drupal 7.x via BackdropCMS at https://www.CMX.zone )

David Strauss’s picture

We already do that in at least one table, {node_comment_statistics}.

David Strauss’s picture

To clarify, it's a common practice in high-performance database architecture called denormalization.

Database architecture must serve two goals: 1) a clean, non-redundant data layout for easy manipulation and 2) high performance for querying. The first goal pushes the database toward normalization. The second often pushes the database toward denormalization.

To give a real-world analogy, imagine you need to call someone often but can never remember their number or that you need to call them. You could either 1) post a note with just their name all over your house and look up their number in an address book or 2) post their name and number all over the house. The first approach makes it much easier to update their phone number because you only have to do it in one place. The second approach makes it much easier to call them because you don't have to look up their number in the address book. The approach you should take depends on 1) how often their phone number changes and 2) how often you need to call them.

(The ones and twos above correspond to normalization and denormalization.)
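A minimal sketch of the trade-off in code (hypothetical tables, SQLite standing in for MySQL): the comment's author name is copied in next to the uid, the same idea as the last-commenter name kept in {node_comment_statistics}.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, name TEXT)")
# Denormalized: the author's name is copied in alongside the uid, so the
# common read path never needs a join back to users.
c.execute("""CREATE TABLE comments (cid INTEGER PRIMARY KEY,
                                    uid INTEGER, name TEXT, body TEXT)""")
c.execute("INSERT INTO users VALUES (42, 'webchick')")
c.execute("INSERT INTO comments VALUES (1, 42, 'webchick', 'Nice post!')")

# Read path (the win): no join, no lookup in users.
rows = c.execute("SELECT name, body FROM comments").fetchall()

# Write path (the cost): a rename must touch every copy, ideally in one
# transaction so the copies cannot drift apart.
with conn:
    conn.execute("UPDATE users SET name = 'angie' WHERE uid = 42")
    conn.execute("UPDATE comments SET name = 'angie' WHERE uid = 42")
updated = c.execute("SELECT name FROM comments WHERE cid = 1").fetchone()

print(rows)
print(updated)
```

This is the phone-number-on-every-wall approach from the analogy above: cheap to read, costly to update.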

ldsandon’s picture

Denormalization is usually performed in "load once - query many" systems like data warehouses, where denormalized data can't easily get out of sync. In OLTP-oriented systems denormalization can be really risky: you have to coordinate all changes without database support.

LDS

styro’s picture

I hope you put the S/W optimization higher than H/W, because this problem affects not only the Drupal.org site but our sites too, and it's impossible to ask our hosting providers to replace their servers any time soon.

They are doing both because they need to do both. Just doing either one probably won't be enough the way the site is growing.

The difficult part is the native Drupal DB structure (compared to other CMSs), which stores almost EVERYTHING in the node table, i.e. pages, blogs, and forum posts all in ONE table.

Wouldn't splitting it up require more queries and/or joins and/or threads for the tracker listing? I'm no expert, but wouldn't that make the site's main bottleneck even slower?

--
Anton
New to Drupal? | Troubleshooting FAQ
Example knowledge base built with Drupal

alexandreracine’s picture

Currently, there are some issues with character encoding [...]

Really? Does this help?

Upgrading from 2.1 to 2.2.RC1 characters encoding problem [SOLVED with Solution]

I had a blast with those...

Alexandre Racine

www.alexandreracine.com - mon site perso
www.salsamontreal.com La référence salsa à Montréal

Anonymous’s picture

You need to build a distributed web server architecture, with nginx for example, and to replace Squid with the Oops web-caching server, because it's much faster.

http://sysoev.ru/en/
http://zipper.paco.net/~igor/oops/about.html

artlogic’s picture

I am certainly no expert, but it sounds like you are describing the exact conditions that are mentioned on this page under "Table locking is also disadvantageous under the following scenario":

http://dev.mysql.com/doc/refman/5.0/en/table-locking.html

The page has a number of suggestions which involve raising and lowering the priority of certain statements that may be interacting. This would at least speed up the SELECT statements, somewhat at the expense of the UPDATE statements.

Additionally, I've got to wonder if some form of database denormalization or caching might help this problem. I'm not familiar with the Drupal 5 database schema, but if there's not already a recent_posts table that duplicates some data from other parts of the database, I would think there should be.

Shai’s picture

Thanks for sharing the information.

Shai

http://www4.jrf.org

themegarden.org’s picture

A few questions:
- how accurate (up to date) is the http://scratch.drupal.org site?
- where can I find details about drupal.org's hosting (both hardware and software details)?

It sounds to me like a load-balancing cluster could solve the problems (at least some of them).
MySQL offers load-balancing capabilities (almost) out of the box (also in multi-master configurations).

I think moving to InnoDB could be a good solution, too.
From mysql.com:

...
InnoDB does locking on the row level and also provides an Oracle-style consistent non-locking read in SELECT statements. These features increase multi-user concurrency and performance. There is no need for lock escalation in InnoDB because row-level locks fit in very little space. InnoDB also supports FOREIGN KEY constraints. You can freely mix InnoDB tables with tables from other MySQL storage engines, even within the same statement.

InnoDB has been designed for maximum performance when processing large data volumes. Its CPU efficiency is probably not matched by any other disk-based relational database engine. 
...

InnoDB drawbacks: INSERT has some transaction overhead.
But, SELECT is ... just faster (a lot)
---
Drupal Themes Live Preview - themegarden.org

hass’s picture

non-locking read in SELECT statements

...is only possible if you add extra params to the SQL request! I opened a case about read-uncommitted ~8 weeks ago: http://drupal.org/node/128008. MySQL does not do read-uncommitted with InnoDB by default, see http://dev.mysql.com/doc/refman/5.0/en/innodb-transaction-isolation.html. But using read-uncommitted will BOOST performance... no new hardware is required in this case, for sure.

jeff_’s picture

non-locking read in SELECT statements ...is only possible if you add extra param's to the SQL request!

Are you sure? In InnoDB, readers don't block writers and writers don't block readers in read committed mode (the default), if I understand correctly:
http://dev.mysql.com/doc/refman/5.0/en/innodb-consistent-read.html

That's the way PostgreSQL works, too. It reduces the need for locking drastically. It has been called "better than row-level locking".

hass’s picture

I can only say for sure about MSSQL2K: read-uncommitted is the way to boost performance and simply remove all deadlocks. We have never tried other isolation levels, but all our deadlocks are gone, traffic has increased tenfold at minimum, and the machine is mostly under low load. This is a 4-CPU single-core 2.4GHz XEON, 512KB L2 cache, HT activated, 4GB RAM; normal load level ~21%, high-load days ~45%, and the machine is nearly 4 years old! We have over 300 SQL requests per second with many table joins... 60% searches, but this site is not running Drupal...

What hardware is drupal.org running? :-) some data to compare?

jeff_’s picture

I can only say for sure about MSSQL2K: read-uncommitted is the way to boost performance and simply remove all deadlocks.

First of all, Drupal seems to have performance problems related to locking, not problems with deadlocks. A deadlock is when a transaction must be aborted because two transactions are waiting on each other (or, in more complicated situations, you have a cycle).

Second, MSSQL is fundamentally different from InnoDB. InnoDB can avoid many of the locking problems without abandoning ACID (read uncommitted violates the "isolation" part of ACID).

hass’s picture

If you have deadlocks, you possibly have far too many table or row locks. In that situation all the tables are full of locks and each request is waiting for an unlock; if the unlock doesn't come, you get a deadlock, which will happen under very high load. Additionally, yes, you possibly have cycles... but this is a different problem.

Mostly it is not required to be consistent on reads... e.g. for the cache table, a front page that doesn't change often, etc. Therefore read-uncommitted is good enough for all read-only operations. But the table locks made by MyISAM are the same as MSSQL table locks... the row locking made by InnoDB is better, but still not ideal under load.

GoofyX’s picture

I agree with Drupal-id.com's comment above that most hosting providers offer only the MyISAM engine for MySQL (or charge extra for InnoDB), so converting the table type to InnoDB would probably not be the best solution.

Let's focus on the software optimization issue first (SQL, decrease of queries if possible, etc.) and later on the hardware.
--
... Morpheus: What is "real"? How do you define "real"? If you 're talking about what you can feel, what you can smell, what you can taste and see, then "real" is simply electrical signals interpreted by your brain...


themegarden.org’s picture

It doesn't matter what most hosting providers offer. As I understand it, this is about drupal.org's hosting.

It doesn't mean that you have to change anything with your host. Converting the MySQL table type to InnoDB on the drupal.org host doesn't mean that you should do it too (especially on small shared hosts).

And I agree that there is a need to improve the software (SQL optimisation...), but of course it isn't a trivial task.

---
Drupal Themes Live Preview - themegarden.org

GoofyX’s picture

You are right about this. My comment was made from the point of view that whatever optimizations are done on the drupal.org site would probably make it to drupal codebase itself, so this is why I object to the InnoDB table type.

It's probably because of different timezones, but now the site feels a lot quicker than last night (when I posted my first comment).

--
... Morpheus: What is "real"? How do you define "real"? If you 're talking about what you can feel, what you can smell, what you can taste and see, then "real" is simply electrical signals interpreted by your brain...


webchick’s picture

You are right about this. My comment was made from the point of view that whatever optimizations are done on the drupal.org site would probably make it to drupal codebase itself, so this is why I object to the InnoDB table type.

This has already been contributed back to the codebase. There was a patch that went into Drupal 5.0 that removed the hard-coded type=MyISAM on each table definition that was present before. This allows MySQL to use whatever its default is set to, which means that all Drupal sites have the ability to use InnoDB or MyISAM, or HEAP or whatever.

It's a mistake to hard-code this in the code either way, because small sites on shared hosting (which is a huge percentage of our users) will get much faster performance from MyISAM, where large sites with many nodes and users may get better performance from InnoDB.

dfletcher’s picture

I've been through a similar experience on splendora.com and will share the info I've gathered. A blanket change of all tables to InnoDB actually slows things down. Inserts are much slower (InnoDB keeps things ordered) and selects are a shade slower. The primary benefit of course is the row level locking webchick mentioned, but in my mind it makes sense only to use this feature (because of the other slowdowns) on certain tables which are updated very frequently. This is the list of tables I ended up converting to InnoDB:

ALTER TABLE access TYPE=INNODB;
ALTER TABLE accesslog TYPE=INNODB;
ALTER TABLE cache TYPE=INNODB;
ALTER TABLE poll_votes TYPE=INNODB;
ALTER TABLE sessions TYPE=INNODB;
ALTER TABLE watchdog TYPE=INNODB;

I left all other tables (well other than a couple home-brew tables that are not interesting here ;) as MYISAM.

While we're talking about optimizing tables: Search seems to create temporary tables on disk instead of HEAP type tables. The MySQL documentation says MySQL will create tables on disk if there are TEXT or BLOB columns, or if there is not enough memory allocated to temporary tables. I pretty much ruled out the second part by setting the allowed size of temporary tables to 1GB, which is much larger than our entire database. I haven't looked more closely, but I'm guessing there's a TEXT column in temp_search_results which is slowing searches. Not sure if this still affects the latest codebase, I'm a bit out of date, but it's certainly an issue in 4.7.6. Ideally, search should not create temporary tables. Less ideally, we could try to at least make sure we're not creating loads of temporary tables on disk.

modul’s picture

The technical issues in this thread are way beyond me, I'm afraid, but maybe a few hints to avoid queries (the bottleneck, I understand) could be useful.
- The page http://drupal.org/forum is very slow to load, whereas this page only serves as a series of links to the various subforums. The dynamic information on this page is of no use: who cares if there are 824 of 846 new postings? Who cares if the newest message was by WhatChaMaCallit, posted 342 hours and 12 minutes ago? That is information without any use or relevance whatsoever. Leave it out, and simply make that page into a queryless "static" page, just with the links to the subforums. It would save me 20 seconds of delay per visit.
- Same thing with the "New forum topics" in the right-side block. I admit, it can be nice to have a list of new topics, but not on every single node page... Make a separate page just for the new forum postings, but don't show them "everywhere".

And something just for my information, something which I really don't understand in the Drupal philosophy: why is it that registering is severely penalized by this very irritating lack of speed? I thought registered users are the ones who should be given a treat, who should be rewarded for registering. That is not the case at all! It is unregistered users who get the goodies; they get the benefit of caching. I don't know if caching is done for registered users too (I'd assume it is), but the net result is not visible at all. This baffles me. I think this is the world upside down, actually. Could someone please explain to me why this is the way it is, without resorting to irrelevant technicalities? There must be something conceptually wrong behind this; it's simply beyond my comprehension. Reward registered users, don't penalize them.

Ludo

sepeck’s picture

Unregistered users do not get the treat... depending on your setup. Unregistered users do not get /tracker for their recent posts. Unregistered users do not get "new since last visit" indicators. Depending on your permissions in Access Roles, registered users may have additional access that unregistered users do not; even more, different roles may open up additional rights, such as the site maintainer vs. doc maintainer vs. site admin roles on drupal.org. Because the system has to determine what each user sees, caching just isn't something simple to do. Caching is what anonymous users get.

-Steven Peck
---------
Test site, always start with a test site.
Drupal Best Practices Guide -|- Black Mountain


modul’s picture

Thanks for your reply, Steven. Unregistered and registered users are, of course, entitled to different access patterns, but my main point was that unregistered users get speed. As you say it yourself: "Caching is what anonymous users get"... And I understand that caching is a tough issue, but still, the bottom line is: unregistered means speed, registered means lack of speed... That's what I meant by "the world upside down".

Ludo

jlin’s picture

Hi Ludo

Yes, I get where you're coming from. Not caching for registered users isn't really a "philosophy"; it's just that registered users, as you know, have dynamic content available to them. Caching means serving everyone the same pre-generated content, which at first look is at odds with the customized output presented to registered users. That is why anonymous users get the cache: all anonymous users are being sent the same page, whereas the system has to customize each page for registered users.

That said, there are modules being developed that will help with this issue, one of them being "block cache" module http://drupal.org/project/blockcache , which caches blocks within a page, and I think there's another module, but I don't remember the name right now. I can look it up and post back.

I totally agree with you on the extraneous information though. It doesn't really matter that there are 80,000 forum topics in the forum. Most of the posts are from years ago and contain outdated information. That number is only interesting for a person doing statistics such as monitoring the growth of the community and whatnot, and even Dries himself only does that once or twice a year.

NancyDru’s picture

I use the right sidebar recent posts to see if there are any things I can help with. I don't really want to go to all posts. I agree that the stuff in the forums page is unnecessary, but then I almost never go there. I usually enter the site by my own tracker URL. (Yes, that makes me one of the culprits, but that's the main way I can "donate" to the community at the moment.)

Nancy W.
Drupal Cookbook (for New Drupallers)
Adding Hidden Design or How To notes in your database

modul’s picture

Hi Nancy, sure it makes sense to have a list of all recent posts, but my point was that this list should have its own separate page. That way, the underlying query would only be executed for people who really want to see that list.

Ludo

Michelle’s picture

It has nothing to do with any sort of reward system. It's possible to cache everything for anon users because there are fewer variables to deal with. With logged-in users, there's a lot of dynamic stuff that you can't cache. It's just a matter of what is and isn't possible to do. You're reading in an intent that simply isn't there.

Michelle

--------------------------------------
My site: http://shellmultimedia.com

modul’s picture

Hmm, maybe I shouldn't have used the word "reward" here, because that indeed implies some kind of intent, which, obviously, is not there. Still, the net result is that unregistered users do have a (relatively) speedy access to the forum, whereas registered users don't.

Ludo

Michelle’s picture

It's no secret that the caching makes things faster for anon users. But your entire last paragraph was going on about the Drupal philosophy and penalizing registered users and such, which is just nonsense. Yes, it's faster for anon users. We all know that. But that's just because of the nature of the beast, that's all. No philosophy, no intent, no penalizing. Just the way it is.

Michelle

--------------------------------------
My site: http://shellmultimedia.com

J.B-2’s picture

...let's not lose sight of the audience here. Drupal is a framework to deliver websites, here's a small quote from a recent BBC website article I was reading regarding website design etc:

"Research suggests that users of a site split into three groups. One that regularly contributes (about 1%); a second that occasionally contributes (about 9%); and a majority who almost never contribute (90%)."

So, if it's fair to expect that Drupal will be used to deliver a variety of websites, which in general will fit in with the 'norm' across the web, then focusing on improving performance for that 90% (= mostly not registered) is not a bad idea! ;-)

Yes, I know this thread is about D.org in particular which may (or not?) have a much higher proportion of registered users, and anything that improves performance for registered users is to be welcomed. I'm just pointing out that non-registered users are important too, especially if they make up the vast majority of your audience!

Cheers
JB

dww’s picture

the tracker page is a major DB hog in D5. there's already an issue open about fixing this query (via software, not hardware). ;) reviews, testing, and benchmarking of my latest patch at http://drupal.org/node/105639#comment-244263 would be most appreciated.

this is a patch against D6, but it applies to D5, too. the schema update hunk obviously fails against D5, but that's easy enough to fix/re-roll if you wanted to test on a D5 site, instead.

i think it'd be reasonable for d.o to run a patched D5 using this patch until we're ready to upgrade to D6 (which will be months from now), assuming that this patch (or something like it) is committed to D6 before the schema freeze.

comments please:
http://drupal.org/project/comments/add/105639

thanks,
-derek

p.s. no, this won't solve *all* our problems, but could possibly wipe out all of the tablesorts from the tracker page, which is probably one of the primary causes of grief on d.o right now...

___________________
3281d Consulting

dww’s picture

well, my patch failed miserably, due to unforeseen weirdness in MySQL. ;) however, David Strauss came through with a winner:
http://drupal.org/node/105639#comment-245280

after much review, testing and benchmarking, i just put this patch on d.o and the results are a massive improvement. not to say this is the end of the story or the only problem, but it's a huge step forward.

everyone sing it together with me...

"3 cheers for David Strauss!"

;)

p.s. and his company Four Kitchen Studios who paid to let him work on this all day... ;)

___________________
3281d Consulting

tostinni’s picture

Congrats David Strauss, thanks a lot for your work ;)

NancyDru’s picture

What a difference in speed. Unfortunately, "my recent posts" is not updating unless I hit refresh... But it's worth it.

Nancy W.
Drupal Cookbook (for New Drupallers)
Adding Hidden Design or How To notes in your database

modul’s picture

Something has definitely happened to Drupal.org. The speed increase is Real, Tangible, Remarkable. I'm still finding out what is and what is not going faster (the main forum page, where you select one of the different forums is Not, but that's of minor importance), but the overall impression is Good. Congratulations to David!!!! It sounds like he untied the Gordian knot and made an increasingly unusable site into something indispensable. I'm still waiting a little bit before becoming overenthusiastic, but I think this is definitely going to be a good day in the history of Drupal.

Question: does it make sense to apply this patch also in regular, more modest sites, or would that be overkill?

Ludo

David Strauss’s picture

The only thing that's actually faster is the tracker. Everything else is only faster because the tracker is not hogging all the resources anymore.

modul’s picture

So, everything is faster :-). The tracker is useful, but I would hardly call it Drupal's heart. Amazing that such a thing could devour all those costly resources... Bàd tracker, bàd !!

Once again: thanks, David!

Ludo

Veggieryan’s picture

yup.

its WAY WAY WAY FASTER!

AWESOME!

thanks again!

NancyDru’s picture

Is there any way to get "My recent posts" to go back to putting new replies at the top of the list? I really hate having to go through several pages of posts to see if I need to do any follow up.

Nancy W.
Drupal Cookbook (for New Drupallers)
Adding Hidden Design or How To notes in your database

Michelle’s picture

http://drupal.org/node/105639

Michelle

--------------------------------------
My site: http://shellmultimedia.com

catch’s picture

the tracker patch is looking even better now, an amazing difference from that one query.

Along those lines, forum_get_forums and _forum_topics_unread are real killers on the forums index, which, other than /tracker, must be one of the most frequently visited pages.

So I started this issue: http://drupal.org/node/145525

NancyDru’s picture

Mr. Strauss, your next mission, should you choose to accept it:

I just spent an hour trying to submit a bug report. I went straight to the project page (9 mins), then "View pending bug reports" (11 mins), "Submit" new bug report (in excess of 14 mins), Preview (7 mins), Post (long enough to drink a cup of coffee).

At this rate, I'm drinking way too much coffee! Please help.

Nancy W.
Drupal Cookbook (for New Drupallers)
Adding Hidden Design or How To notes in your database

ChrisKennedy’s picture

The drupal.org databases are being migrated today, that's why it's running slowly.

teledyn’s picture

Will David's patch be promoted to the DRUPAL-5-1 CVS?

peterx’s picture

The My Recent Posts page is a pain to read because it has 897456 entries that have not changed. You could replace it with a page that lists only the nodes that have changed.

Instead of reading the whole list of my posts every time I visit the page, set up a table keyed by user id and the id/title of changed nodes. When a node changes, insert the node id and title into the user_node_change table for each relevant user. You do not have to lock anything because you are only inserting, never updating.

When I log in, I see a list of nodes that have changed and click through them. There may be duplicates where a node has changed several times, but the list can weed out the duplicates. There is no table locking.

The user-node pairs are deleted as I view the nodes. There is no need to lock when deleting, because it does not matter if you try to delete a row that was already deleted; just turn off the error checks.

The whole system can work without locks.
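The scheme above can be sketched roughly like this (sqlite/Python stand-in for illustration; the table and column names are invented and are not drupal.org's actual schema):

```python
import sqlite3

# Insert-only change log: when a node changes, INSERT one (uid, nid) row
# per interested user; readers SELECT DISTINCT and then DELETE what they
# have seen. No UPDATEs, so readers and writers never fight over a lock.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE user_node_change (
    uid INTEGER, nid INTEGER, title TEXT)""")

def node_changed(nid, title, subscriber_uids):
    # Insert-only: no lock contention with readers.
    conn.executemany(
        "INSERT INTO user_node_change (uid, nid, title) VALUES (?, ?, ?)",
        [(uid, nid, title) for uid in subscriber_uids])

def my_changed_nodes(uid):
    # DISTINCT weeds out nodes that changed several times.
    rows = conn.execute(
        "SELECT DISTINCT nid, title FROM user_node_change WHERE uid = ?",
        (uid,)).fetchall()
    # Deleting an already-deleted row is harmless, so no lock is needed.
    conn.execute("DELETE FROM user_node_change WHERE uid = ?", (uid,))
    return rows

node_changed(42, "Performance thread", [1, 2])
node_changed(42, "Performance thread", [1])   # duplicate change
print(my_changed_nodes(1))   # [(42, 'Performance thread')]
print(my_changed_nodes(1))   # [] -- already viewed
```

The trade-off versus the real tracker query is that this is a per-user materialized list, so writes fan out to every subscriber instead of being computed at read time.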

petermoulding.com/web_architect

Sharique’s picture

Don't lock tables for SELECT operations. And we could implement row-level locking in place of table locking.

---
Sharique uddin Ahmed Farooqui
IT head, Managefolio.com

Sharique Ahmed Farooqui
http://www.openahmed.com

catch’s picture

another performance patch here to reduce db queries.

http://drupal.org/node/106559

hass’s picture

Having only one path lookup per page would be a very big win. I have no clue how to build a SELECT src FROM {url_alias} WHERE dst IN ('%s') with %s being the list of URLs... but that would merge all alias table lookups into one query...
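A rough sketch of that batched lookup (sqlite/Python stand-in rather than Drupal's PHP database layer; the url_alias table comes from the quoted query, everything else is invented). The idea is simply to expand one placeholder per path and make a single round trip:

```python
import sqlite3

# Collapse N per-path alias lookups into one IN (...) query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE url_alias (src TEXT, dst TEXT)")
conn.executemany("INSERT INTO url_alias VALUES (?, ?)",
                 [("node/1", "about-us"), ("node/2", "contact")])

def lookup_aliases(paths):
    # One round trip: WHERE dst IN (?, ?, ...) instead of one query
    # per path. Assumes a non-empty list of paths.
    marks = ", ".join("?" * len(paths))
    rows = conn.execute(
        f"SELECT src, dst FROM url_alias WHERE dst IN ({marks})", paths)
    return {dst: src for src, dst in rows}

print(lookup_aliases(["about-us", "contact", "missing"]))
# {'about-us': 'node/1', 'contact': 'node/2'}
```

Paths with no alias simply don't appear in the result, so callers can fall back to the system path for them.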

Veggieryan’s picture

Well, this won't make me very popular or hip but I think that single servers are going the way of the buffalo.

I am moving all my sites to www.mosso.com... it is a CLUSTER of servers. Your site can get the power of 10-100 servers if needed, at any time.

they run some pretty big sites like www.hdradio.com and www.uneasysilence.com with no problems.

Basically, Mysql is handled on a CLUSTER of MANY servers. PHP is handled on another CLUSTER of MANY servers.

It is simply impossible to get that kind of performance with a single server or even 4 load-balanced servers. Then there's the reliability and maintenance...

I'm all for more efficient SQL, but let's be honest: Drupal.org is HUGE. There is no way the servers will keep up.

If you have a cage of mice and give them X amount of food, you get Y amount of mice. Give them 2X the food and get 2X the mice.

The current servers are KILLING Drupal. It is EMBARRASSING to bring people to drupal.org and wait 30 seconds for a page to load.

Drupal should be 100x more popular than it is now. We have to shoot for that.

We can't bring in 100x more users if our servers are straining at 1x. We are thinking on the completely WRONG SCALE.

Make drupal.org 100x more powerful and you allow 100x more growth.

Let mosso.com handle the server so that we can all focus on the code.

Sit back and laugh as drupal.org zips along with 10,000 users logged in as if it were 100 users.

NancyDru’s picture

Veggieryan’s picture

it is surely cheaper than maintaining what we have now.

I bet if you broke it down by HOURS spent just keeping up the current server, mosso would come in even, or maybe 10x cheaper, depending on what people would usually charge for the upkeep.

Economically mosso is the clear winner.

NancyDru’s picture

It doesn't work that way when you have volunteer labor and donated server space. Someone would have to fork out the big bucks to reside on Mosso - and keep it up month after month.

I even do my sites on a paying basis, and I couldn't begin to justify $100 per month minimum.

Nancy W.
Drupal Cookbook (for New Drupallers)
Adding Hidden Design or How To notes in your database

Veggieryan’s picture

I think 100 a month is damn cheap.
Considering some drupal techs make 100 an hour it would pay for itself the first time something goes wrong on a typical dedicated host.

also, consider that 100 per month covers as many websites as you can host.
so if you have 10 sites, that's 10 bucks a month per site to never worry about technical issues and just focus on the sites themselves.

web hosting is changing. in 3-5 years EVERYTHING will be clustered.

David Strauss’s picture

$100/mo could not pay for quality clustered hosting. My company pays almost $300/mo for a single server with N+1 redundancy in most important areas. Multiple servers and load balancers for clustering and fail-over would increase that cost. $100/mo wouldn't even pay for us to co-locate with our own hardware.

Veggieryan’s picture

Like I said, a lot has changed.
I have not found a single negative comment on mosso's clustered hosting.

I was paying $300 a month for an enterprise server and it was not nearly as fast and DEFINITELY not as reliable as mosso.

It's impossible for a single server to compete with a cluster.

Colocation is an even worse proposition.

the game has changed.

NancyDru’s picture

Maybe I should look at a drastic increase in my rates.

But, as I mentioned, keep in mind that most people working on this stuff do it for nothing - obviously not even glory.

Nancy W.
Drupal Cookbook (for New Drupallers)
Adding Hidden Design or How To notes in your database

ckeck’s picture

Hello everyone, my name is Chad Keck and I work for Mosso. I am a huge fan of Drupal and have been a part of the community for a while.

We would certainly love to see the Drupal community move to our platform, but I know many of you have some concerns. If anyone could point me to the decision maker for this aspect of Drupal, I would like to discuss some hosting options with them and see what we can do to help out the community.

I will also leave my contact details below:

Chad Keck
Sales, Mosso :: The Hosting System
877.934.0409
ckeck@mosso.com
www.mosso.com

NancyDru’s picture

Robardi56’s picture

Seems good, but it appears mosso doesn't allow shell access... not sure the Drupal people will be happy about that...

Veggieryan’s picture

they still have their development boxes for scratch.drupal.org

there is no need for shell access on the production server. that is what the development server is for.

webchick’s picture

http://drupal.org/node/105639#comment-245280

Drupal.org is now running that patch, and the results speak for themselves. We need help polishing that up so we can get it into core for everyone for Drupal 6 (and possibly Drupal 5, too).

Thank you, David Strauss of Four Kitchen Studios!!

mlle.yotnottin’s picture

What an amazing difference there is in speed in visiting drupal.org tonight! But this presents a problem: I used to go and play a game, or read my email or maybe even take a snooze in between page loads. Heck, once I was even able to cook supper while I waited. Now what will I do? My whole routine is in total disarray. I might actually have to do some *real* work now.

All jesting aside, whatever David did has made a tremendous difference in the speed and responsiveness in drupal.org. Thank you very much David and Four Kitchen Studios!

ChrisKennedy’s picture

hah :P

mcurry’s picture

Yes, please do work on optimizing the software first, if it can be done. Throwing hardware at the problem should be the last resort. The recent changes appear to be helping quite a lot. What can we do to help ensure that the tracker query patches make it back into D5?

sepeck’s picture

Saying 'cluster' is the solution is like saying we need water to put out a fire. Clusters can be a solution but they are not by default always a 'right' solution.

Personally, I don't feel embarrassed. I know people are working on it. The issue has been in the queue, and the people involved have worked on things, yet still very few people took the effort to actually get involved and work toward a solution. Part of that is because not a lot of people run sites with as much traffic as drupal.org; the other part is that it's easier to comment on a forum post than to get on IRC, or the mailing list, etc.

-Steven Peck
---------
Test site, always start with a test site.
Drupal Best Practices Guide -|- Black Mountain

Veggieryan’s picture

first of all, let me clarify that I really respect the work that is done to keep drupal.org running.

it is a MASSIVE and POPULAR site.

the latest patches have made HUGE progress.
I just think it's on a totally different scale.

Say we make drupal.org 2x faster. That's great.

But I'm saying: what if it were 100x faster, so we could have 100x more users?
That can't happen on 1-4 servers, no matter how well you code it.

Just because drupal.org is on a cluster doesn't mean we can't still improve the code and make it more efficient by load testing it.

It just means that drupal.org will be able to keep up with the drupal community as it becomes more and more popular (which is obviously inevitable because nothing else comes close to drupal.)

All this won't matter anyway, as all hosting will be clustered within a few years. It is the logical progression of the technology. At that point the whole open source lab will likely be a super cluster.

David Strauss’s picture

The latest tracker patches improve tracker performance 200 to 400x (not percent) on Drupal.org. You cannot throw hardware at the problem and expect that sort of speedup. Clustering is a valid approach only when your application is already profiled and optimized or if you have enormous amounts of money to throw at the problem. Of course, if you have that kind of money, you should still throw much of it into software improvements.

vivek.puri’s picture

"But Im saying what if it were 100x faster so we could have 100x more users? That cant happen on 1-4 servers no matter how well you code it."

That's just your opinion, not a fact. It may very well be possible to scale (x) times on the same hardware by improving many aspects within drupal itself. Such work would not just benefit drupal.org but everyone who works with and depends on drupal.
btw, Mosso is not the right solution for drupal, though it may be for many sites. One important reason is that drupal.org is not just the home of drupal; it also gives its developers insight into how real-life sites are impacted. Running on Mosso would never give such an experience, and that is very valuable for the future of drupal.

Just for the record, 100x faster will not translate into 100x more users, and 100x more users don't necessarily need drupal to be 100x faster. It would help, but it is not necessary. 10 more contributors can help much more than 100x more users ;)

coupet’s picture

Well, it depends on the site. Software optimization will benefit all sites powered by drupal, whereas hardware optimization can be used to greatly improve performance of highly successful sites.

----
Darly

NancyDru’s picture

There are several places (including node.module) where I've run across node and node_revisions being joined on "vid." The node table has this as an index, but node_revisions does not. Wouldn't this help the speed of those joins? Perhaps the Handbook speed would be slightly improved?

NEVER MIND... just a senior moment

Nancy W.
Drupal Cookbook (for New Drupallers)
Adding Hidden Design or How To notes in your database

hanief84’s picture

I hope this act will boost Drupal's traffic!

"Hello from Malaysia! ^^ "
Website: www.indiecom.net
Skype: ga1984

GoofyX’s picture

The difference is much more than noticeable. Drupal.org has been extremely responsive these last few days!
--
... Morpheus: What is "real"? How do you define "real"? If you 're talking about what you can feel, what you can smell, what you can taste and see, then "real" is simply electrical signals interpreted by your brain...

dami’s picture

Not sure if it helps, but with the current menu setup, I have to click on '(all) recent posts' first before landing on the 'my recent posts' page. Most of the time, all I care about is the latter. From a usability point of view, two clicks is a waste, especially when the query is slow. Not sure how much it contributes to server load... but if we could show both 'all recent posts' and 'my recent posts' in the navigation block, it might save a lot of unnecessary hits?

NancyDru’s picture

catch’s picture

I get to it via browser history normally but given that there's a "my issues" link in the contributor links menu, it'd be quite a useful thing to have.

schwa’s picture

I access Drupal from many different machines and can't store browser bookmarks on them all - a discrete 'My Recent Posts' link would be very beneficial.

dww’s picture

http://drupal.org/node/107934 (specific to d.o)
http://drupal.org/node/146282 (new feature for D6)

___________________
3281d Consulting

nilsfr’s picture

You write that you are using Squid as a caching proxy. I wonder if you have ever heard of Varnish; it is a highly efficient web proxy, allegedly much more efficient than Squid. I would think it's worth a try.

ldsandon’s picture

Unless and until you have spotted a real hardware issue, throwing more hardware at the problem is often not the right solution - it just makes the hardware vendor richer. You have a serialization issue, not a hardware problem. More hardware won't solve it - it could make it even *worse*.

1) Use a database setup which handles concurrency better. In "good" databases, readers do not block writers, and writers do not block readers.
2) Don't use the read-uncommitted transaction level, aka "no transaction isolation at all". Some databases were forced to allow that just because their locking implementation sucks and can't handle many concurrent statements properly. A "good" database implementation does not need the "read-uncommitted" isolation level at all.
3) Test under load - a database could be the fastest of all with 1 (one) user and then fail miserably as soon as more users are added.
4) Spot the hotspots :) Profile the application, identify which queries need tuning, and tune properly. Use the proper database tools to perform this.
...
99) Now you can tune the hardware for your load. You could be surprised that far less hardware is required than originally thought.
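Step 4 above can be illustrated in miniature (sqlite/Python stand-in; the table and column names are invented, and drupal.org runs MySQL, where the equivalent tool is EXPLAIN). The point is to ask the database for its query plan before and after a fix, rather than guessing:

```python
import sqlite3

# Build a toy table large enough that scan vs. index search matters.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER, changed INTEGER)")
conn.executemany("INSERT INTO node VALUES (?, ?)",
                 [(i, i % 1000) for i in range(20000)])

query = "SELECT nid FROM node WHERE changed = 42"

def plan(q):
    # The last column of each EXPLAIN QUERY PLAN row is a readable
    # description of how sqlite will execute that step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + q)]

before = plan(query)   # full table scan ("SCAN ...")
conn.execute("CREATE INDEX node_changed ON node (changed)")
after = plan(query)    # index search ("SEARCH ... USING INDEX node_changed ...")

print(before)
print(after)
```

The same before/after discipline applies to any engine: profile, read the plan, change one thing, read the plan again.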

LDS

ldsandon’s picture

Actually, if you have a serialization issue, clustering can magnify it.

LDS

Veggieryan’s picture

please elaborate..
this is very interesting to me.

thanks,
ryan.

ldsandon’s picture

It depends on the clustering technology underneath, but usually a serialization issue means that something is waiting on something else. In a cluster, these "waits" (locks, etc.) may have to be propagated to the other nodes, slowing down the whole cluster.
For example, in Oracle Real Application Clusters, library cache contention (excessive parsing), excessive I/O, and concentrated locking issues (i.e. a table used like a counter) will perform worse in a cluster, because these are globally coordinated across the cluster; therefore they won't go faster than on a single-node machine, and the added cluster overhead will slow them down.
A properly designed system won't hit these issues, of course, and will take advantage of the cluster, but a poorly designed one will hit them and won't scale.
Of course, with MySQL or other databases the issues may be different - a good knowledge of the system is always required - but a cluster may not be the magic "fast=true" switch, probably a "vendor=richer" switch only.
Therefore, before throwing more hardware at the system, it's better to analyze the current bottlenecks to find out whether it's a real hardware issue or an implementation issue. After proper tuning, I have seen many DB systems running faster on a fraction of the hardware developers thought was needed.

LDS

modul’s picture

For about 2-3 weeks, Drupal.org's speed was more or less comfortable to work with. This thread explains the technical wizardry which led to the much craved-for improvement. Finally, Drupal.org was what it is supposed to be: a tool, a collection of opinions and solutions, in which it was a joy to browse. I enjoyed it, and I thanked those who came up with the solution to the speed problem. I thought we were saved... Alas... We are not. For a couple of days now, it's back to the hideous 30-40 seconds of waiting between clicks, instead of the 2-3 seconds of the past couple of weeks.

What has happened??

Why are we back to those stone-age waiting periods? Is it a temporary problem? When will it be back to "normal" again? I sincerely hope: very soon... And we all know now that it is possible to have speedy response times. So, when will they be back??

OK, I'm going to click "Preview comment" now, do some shopping, and come back to push the "Post comment" button. * Sigh *

Ludo

GoofyX’s picture

The last days (a week or so), I get occasional MySQL too many connections errors and I guess I'm not alone... Any ideas?
--
... Morpheus: What is "real"? How do you define "real"? If you 're talking about what you can feel, what you can smell, what you can taste and see, then "real" is simply electrical signals interpreted by your brain...

joep.hendrix’s picture

What has happened???????

That is what I want to know too!
Was there a sudden, enormous increase in content?
Users?
Modules?
etc.

I think it would be appropriate to read some official comments on this matter because it worries me a lot.

-----------------------------------------
CompuBase, websites and webdesign

sepeck’s picture

more new visitors. CNet mentions, etc.

-Steven Peck
---------
Test site, always start with a test site.
Drupal Best Practices Guide -|- Black Mountain

joep.hendrix’s picture

Was there an explosion of new visitors? What about the caching for anonymous visitors?

Will performance degrade this fast over the next couple of weeks, with load times exceeding 1 minute?
Sorry for being a bit cynical, but I just do not believe that the sudden increase in load times is a result of more visitors.
Like ludootje mentioned, it was great after the MySQL storage engine change, and now it is as bad as before. There must be another reason why the load times are this poor again.

-----------------------------------------
CompuBase, websites and webdesign

sepeck’s picture

It is your right not to believe me. And my right to ignore you.

-Steven Peck
---------
Test site, always start with a test site.
Drupal Best Practices Guide -|- Black Mountain

joep.hendrix’s picture

Sorry, but I did not mean to offend you.
It just seems illogical.
Could you provide some statistics that show the sudden increase of visitors related to the dramatic performance degradation?

And probably more importantly: what steps will be taken to improve it?

Thanks!

-----------------------------------------
CompuBase, websites and webdesign

modul’s picture

Not to question your authority, Sepeck, but I feel you took Joep's words a bit too personally. We're just trying to understand why drupal.org, after a couple of weeks of comfortable working, has again gone into semi-hibernation.

You mention the number of new members. I dunno... I took a look at the user IDs of random users over certain periods of time (numbers rounded):
a user who registered 6 months ago had user ID 101,000
4.5 months ago: 116,000
4 months: 120,000
3 months: 131,000
2 months: 140,000
2 weeks ago: 152,000
2 days ago: 155,000

Yes, there is an increase in numbers, of course there is. But there is nothing really "dramatic" here. And nothing much happened between now and the time when things went more or less smoothly, 2 weeks ago or so. I cannot judge whether more users are logging in at exactly the same moment, but it would surprise me if they did so continuously over a period of 2 or more weeks.

Of course, the CNet mention would have led to an increase in visitors. But judging from these numbers, I would say that most of these visitors were anonymous users.

I haven't checked the number of new texts and comments, but I don't have the feeling that there is a really Sudden and Considerable increase there.

So, the question remains: what happened??? Why did drupal.org run smoothly a couple of weeks ago, and not now?

Again, do Not take this personally. We're just trying to understand. And, more importantly, we are hoping, really really Really hoping, that drupal.org will very soon become usable again.

Ludo

sepeck’s picture

Whenever someone suggests I am lying or concealing information I do actually take it personally.

Both of you are free to believe what you want. Follow my tracker where I have answered this in several other threads.

-Steven Peck
---------
Test site, always start with a test site.
Drupal Best Practices Guide -|- Black Mountain

joep.hendrix’s picture

I never meant to accuse you of lying or concealing information. Sorry that you took it personally.

Not to find an excuse, but English is not my mother tongue, and maybe the following statement is interpreted differently by native English speakers:
"I just do not believe that the sudden increase in load times is a result of more visitors."
It was definitely not my intention to say that you were lying, no way.
I was just trying to help find out what is causing this problem, and to suggest that there might be another issue causing the sudden performance degradation.

I think ludootje just perfectly put into words why the relation between the increase in visitors and the performance degradation is not that obvious.

-----------------------------------------
CompuBase, websites and webdesign

peterx’s picture

Could we start a discussion specifically on caching with a proxy server?

We could make more content cacheable, and so make better use of proxy servers, if all Drupal developers knew the requirements for making pages cacheable. We can make more pages cacheable through Ajax and similar technologies if enough people follow the same pattern.

As an example, assume most of your Web pages have just one block different between anonymous and logged in. You could make the logged in block an Ajax block. The proxy delivers the anonymous page from cache. An Ajax login logs you into the server. Subsequent pages are delivered from the proxy cache and a little Javascript changes the login block from
"Log in"
to
"Hello Webchick, welcome to Drupal. Have a nice day. Would you like fries with that?".

There are other considerations including recording Web page visits for tracking, all of which would fit a separate topic.
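The pattern can be simulated in a few lines (all names invented; in the real site the swap would happen in the browser via Javascript after the proxy serves the cached page). Everyone receives the same cached HTML; the only uncacheable request is a tiny per-user fragment for the login block:

```python
# One cached page for everyone; personalization is a separate fragment.
CACHED_PAGE = '<html><div id="login-block">Log in</div>Page body</html>'

def login_block_fragment(session):
    # The only request that cannot be served from the proxy cache.
    if session.get("user"):
        return f'Hello {session["user"]}, welcome to Drupal.'
    return "Log in"

def assemble(session):
    # In the browser this swap is done by Javascript after page load;
    # here we splice the fragment in to show the end result.
    return CACHED_PAGE.replace("Log in", login_block_fragment(session), 1)

print(assemble({}))                    # anonymous: cached page unchanged
print(assemble({"user": "Webchick"}))  # same cached page, personalized block
```

The win is that the expensive full-page render is shared by everyone, logged in or not, and only the cheap fragment varies per user.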

petermoulding.com/web_architect

tesliana’s picture

Have you tried using PostgreSQL and if not then why not ;?)
___________________________________
Svi smo mi zarobljenici svojih ličnih iskustva.
We are all prisoners of our own experiences.

joep.hendrix’s picture

Have a look at the issue I posted:
http://drupal.org/node/172724

-----------------------------------------
Joep
CompuBase, Drupal websites and design