Add automatic option for optimizing tables
IceCreamYou - September 4, 2009 - 22:18
| Project: | DB Maintenance |
| Version: | 7.x-1.x-dev |
| Component: | Code |
| Category: | task |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | needs review |
Jump to:
Description
Hi,
I like the idea of this module. I've had a similar module I wrote sitting around on my HD that just runs OPTIMIZE (for MySQL) or VACUUM (for PGSQL) to clean up the database every so often. Similar functionality is in the Database Administration module, but you have to explicitly choose which tables to optimize, whereas the module I wrote automatically looks for tables that need optimizing and just optimizes those.
The code is fairly short, and I think it fits in pretty well with the purpose of this module, so I figured I'd post it here. If you want to keep it, great; if not, that's fine too.
<?php
/**
* Implementation of hook_cron().
*/
function dbopt_cron() {
//Optimize every 6 hours. Ideally this would be a setting.
if (variable_get('dbopt_last', 0) < time() - 60 * 60 * 6) {
//This is not in the SQL standards, so PGSQL and MySQL have their own commands for it.
switch ($GLOBALS['db_type']) {
case 'pgsql':
//I don't run any sites on PGSQL so I haven't actually tested this, but the documentation indicates it should work.
db_query("VACUUM");
break;
case 'mysqli':
case 'mysql':
$tables = dbopt_get_tables(TRUE);
$tables = implode(',', $tables);
$result = db_query("OPTIMIZE TABLE $tables");
break;
watchdog('dbopt', 'Database tables optimized');
variable_set('dbopt_last', time());
}
}
/**
* Gets a list of tables and returns them in an array.
*/
function dbopt_get_tables() {
$result = db_query("SHOW TABLE STATUS");
$values = array();
while ($table = db_fetch_array($result)) {
if ($table['Data_free']) {
$values[] = $table['Name'];
}
}
return $values;
}
?>
#1
Er, sorry, that would be DB Maintenance, not DB Administration.
#2
SHOW TABLE STATUS has a data_free column that is the equivalent of overhead in phpmyadmin. You're missing out on the opportunity in your code to just optimize the tables that need it by evaluating the value of that column on each of the tables before optimizing it.
Moreover, optimizing is really only beneficial on tables with concurrent inserts for MyISAM tables, where there are variable length fields like varchar, that have large amounts of frequently deleted content. That's why I recommend watchdog and sessions, but it really all depends on how people use Drupal for their site. A site that's generating a lot of content with only a few users might do better by optimizing node_revisions more often instead.
However, if you divide data_free into data_length from the table status list, you could get a default list of tables that need to be optimized much in the same way that Windows defrag says whether or not your drive should get a defrag.
#3
If that was directed at me, I'm confused, because that's exactly what the code I posted does. If my code is run, only tables that need optimization will get optimized. There's no reason to look at data_length; data_free is the amount of space that would be gained by running OPTIMIZE.
I haven't worked with DBTNG yet, but from what I understand, updating this code for D7 would just consist of changing db_query("do stuff") to db_query("do stuff")->execute().
#4
My bad. I scanned through your code too fast. Maybe the capital D in Data_free threw me off.
On a static query with db_query(), the ->execute() is built in, so the part that would change the condition for looping through the result. The use of db_type also changes, I think.
#5
$db_type stays the same in D7... having read the DBTNG docs now, I think that dbopt_cron() would stay the same in the code above, and here's a revised version of dbopt_get_tables(). I don't have any way to test this right now though.
<?php/**
* Gets a list of tables and returns them in an array.
*/
function dbopt_get_tables() {
$result = db_query("SHOW TABLE STATUS");
$values = array();
foreach ($result as $table) {
if ($table->Data_free) {
$values[] = $table->Name;
}
}
return $values;
}
?>
#6