Hi, I recently switched from phpbb to drupal. (http://drupal.org/node/44368)

Here is my configuration

- web server: Dual Xeon 3.0Ghz (32bit), 2G RAM, SATA HDD
- db server: same as web server (physically different server)
- software: apache 1.3.34, php 5.1.2, mysql 5.0.18
- total number of nodes: 70,000+
- total number of comments: 330,000+
- total number of users: 25,000+
- Usually there are 150+ anonymous users and 50+ registered users (by checking Who's online block)
- Modules: most core modules (4.7) except archive, book, drupal, legacy, ping with contributed modules (adsense, codefilter, favorite_nodes, google_analytics, logintobaggan, nmoderation, subscription, taxomony_access, urlfilter, votingapi, comment_upload, tagadelic)

The problem was... the site was too slow at first. Especially, the 'view recent' was terribly slow. It took almost 10 seconds to display and showing node/posting comment also took more than 5 seconds.

So we modified the mysql database and tracker.module like this:

ALTER TABLE comments ADD INDEX index_comments_count(pid, status);

This was for speeding up "SELECT COUNT(cid) FROM comments WHERE pid = ? AND status = ?" at comment.module. This really boosted the speed of showing node with comments as well as posting comment.

How about modifying the drupal core like this?

And to speed up the 'view recent', we modified the following part:

$sql = 'SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM {node} n INNER JOIN {users} u ON n.uid = u.uid INNER JOIN {node_comment_statistics} l ON n.nid = l.nid WHERE n.status = 1 ORDER BY last_post DESC';

modified:

$sql = 'SELECT n.nid, n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM {node} n INNER JOIN {users} u ON n.uid = u.uid INNER JOIN {node_comment_statistics} l ON n.nid = l.nid WHERE n.status = 1 ORDER BY last_post DESC';

Is it okay to remove DISTINCT?

And we modified the following part also:

$sql = 'SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM {node} n INNER JOIN {node_comment_statistics} l ON n.nid = l.nid INNER JOIN {users} u ON n.uid = u.uid LEFT JOIN {comments} c ON n.nid = c.nid AND (c.status = %d OR c.status IS NULL) WHERE n.status = 1 AND (n.uid = %d OR c.uid = %d) ORDER BY last_post DESC';
$sql_count = 'SELECT COUNT(DISTINCT(n.nid)) FROM {node} n LEFT JOIN {comments} c ON n.nid = c.nid AND (c.status = %d OR c.status IS NULL) WHERE n.status = 1 AND (n.uid = %d OR c.uid = %d)';
$sql = db_rewrite_sql($sql);
$sql_count = db_rewrite_sql($sql_count);
$result = pager_query($sql, 25, 0, $sql_count, COMMENT_PUBLISHED, $uid, $uid);

modified:

$sql = 'SELECT * FROM ( SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM {node} n, {node_comment_statistics} l, {users} u, {comments} c WHERE c.uid= %d AND n.nid=c.nid AND l.nid = n.nid AND u.uid=n.uid UNION SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM {node} n, {node_comment_statistics} l, {users} u WHERE n.uid = %d AND l.nid = n.nid AND u.uid=n.uid)zz ORDER BY last_post DESC';
$sql_count = 'SELECT COUNT(*) FROM ( SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM {node} n, {node_comment_statistics} l, {users} u, comments c WHERE c.uid=%d AND n.nid=c.nid AND l.nid = n.nid AND u.uid=n.uid UNION SELECT DISTINCT(n.nid), n.title, n.type, n.changed, n.uid, u.name, l.last_comment_timestamp AS last_post, l.comment_count FROM {node} n, {node_comment_statistics} l, {users} u WHERE n.uid = %d AND l.nid = n.nid AND u.uid=n.uid )zz';
$sql = db_rewrite_sql($sql);
$sql_count = db_rewrite_sql($sql_count);
$result = pager_query($sql, 25, 0, $sql_count, $uid, $uid);

How about modifying the drupal core like this?

We also modified mysql configuration so that it does not create 45M+ temporary file at /tmp. (Bigger buffer size for sort/join) But the table lock / system lock still happens quite often.

how we can boost the speed more? If you have any good idea for speeding up drupal site, please share your experience or share the link for previous discussion.

Thanks very much.

Comments

morphir’s picture

Thanks for your feedback/review. Maybe you can join #drupal (irc-channel) @freenode ? Because I dont know who could review your hack.

Anyways, I'am kinda into memcached. So...

look here

Cheers!

morphir.com

ksoonson’s picture

Thank you very much for your information. But I don't quite understand how to use memcache. Could you please explain with easier word?

/*
Add these functions to bootstrap.inc and replace the original include_once './includes/session.inc'; with a call to
drupal_memcache_init. $conf['memcache_use'] and $conf['memcache_session'] are switching memcache usage.
$conf['memcache_session_server'] is an array of host:port strings.
*/

The above should be how to apply memcache but I don't understand the word clearly. (Please understand English is not my mother tongue)

If you could give example, that would be very appreciated.

killes@www.drop.org’s picture

Removing the DISTINCTs is probably not an option. You would most likely break Drupal for people who use node access modules. Adding the index might be an option, though, please file an issue.
--
Drupal services
My Drupal services

dries’s picture

Care to provide a patch and/or benchmark numbers? That would be much appreciated.

ksoonson’s picture

We don't have exact b/m result but as I said above, this boosted the performance very much. (view recent: 10 second --> 2 second, view node/post comment: 5 second --> 2 second)

killes says that removing DISTINCT is not good... could you please review the latter sql tweak if it has any potential problem? I did not do the tuning by myself. It was done by mysql/php expert who did not have any experience with drupal before. So he is quite curious if this will break any other drupal function or behaviour even though we did not see any issue so far.

Walt Esquivel’s picture

You might also want to join http://groups.drupal.org/high-performance and post your info there or a link from there to here. Here is its homepage info:

This group is dedicated to solutions and approaches for high traffic, high performing Drupal sites. As such, it will deal with a lot of information around the rest of a typical Drupal "stack" -- the operating system, web server, database, and PHP tweaks that combine to support the Drupal application.

The Performance and Scalability forum on Drupal.org is where else to look.

-----
Walt Esquivel, MBA, MA, Captain - U.S. Marine Corps (Veteran)
President, Wellness Corps, LLC
-----

ksoonson’s picture

Thanks very much for your information. I subscribed and posted as you said at: http://groups.drupal.org/node/577