Posted by IceCreamYou on September 4, 2009 at 10:18pm
| Project: | DB Maintenance |
| Version: | 7.x-1.x-dev |
| Component: | Code |
| Category: | task |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | needs work |
Issue Summary
Hi,
I like the idea of this module. I've had a similar module I wrote sitting around on my HD that just runs OPTIMIZE (for MySQL) or VACUUM (for PGSQL) to clean up the database every so often. Similar functionality is in the Database Administration module, but you have to explicitly choose which tables to optimize, whereas the module I wrote automatically looks for tables that need optimizing and just optimizes those.
The code is fairly short, and I think it fits in pretty well with the purpose of this module, so I figured I'd post it here. If you want to keep it, great; if not, that's fine too.
<?php
/**
* Implementation of hook_cron().
*/
function dbopt_cron() {
//Optimize every 6 hours. Ideally this would be a setting.
if (variable_get('dbopt_last', 0) < time() - 60 * 60 * 6) {
//This is not in the SQL standards, so PGSQL and MySQL have their own commands for it.
switch ($GLOBALS['db_type']) {
case 'pgsql':
//I don't run any sites on PGSQL so I haven't actually tested this, but the documentation indicates it should work.
db_query("VACUUM");
break;
case 'mysqli':
case 'mysql':
$tables = dbopt_get_tables(TRUE);
$tables = implode(',', $tables);
$result = db_query("OPTIMIZE TABLE $tables");
break;
watchdog('dbopt', 'Database tables optimized');
variable_set('dbopt_last', time());
}
}
/**
* Gets a list of tables and returns them in an array.
*/
function dbopt_get_tables() {
$result = db_query("SHOW TABLE STATUS");
$values = array();
while ($table = db_fetch_array($result)) {
if ($table['Data_free']) {
$values[] = $table['Name'];
}
}
return $values;
}
?>
Comments
#1
Er, sorry, that would be DB Maintenance, not DB Administration.
#2
SHOW TABLE STATUS has a data_free column that is the equivalent of overhead in phpmyadmin. You're missing out on the opportunity in your code to just optimize the tables that need it by evaluating the value of that column on each of the tables before optimizing it.
Moreover, optimizing is really only beneficial on tables with concurrent inserts for MyISAM tables, where there are variable length fields like varchar, that have large amounts of frequently deleted content. That's why I recommend watchdog and sessions, but it really all depends on how people use Drupal for their site. A site that's generating a lot of content with only a few users might do better by optimizing node_revisions more often instead.
However, if you divide data_free into data_length from the table status list, you could get a default list of tables that need to be optimized much in the same way that Windows defrag says whether or not your drive should get a defrag.
#3
If that was directed at me, I'm confused, because that's exactly what the code I posted does. If my code is run, only tables that need optimization will get optimized. There's no reason to look at data_length; data_free is the amount of space that would be gained by running OPTIMIZE.
I haven't worked with DBTNG yet, but from what I understand, updating this code for D7 would just consist of changing db_query("do stuff") to db_query("do stuff")->execute().
#4
My bad. I scanned through your code too fast. Maybe the capital D in Data_free threw me off.
On a static query with db_query(), the ->execute() is built in, so the part that would change the condition for looping through the result. The use of db_type also changes, I think.
#5
$db_type stays the same in D7... having read the DBTNG docs now, I think that dbopt_cron() would stay the same in the code above, and here's a revised version of dbopt_get_tables(). I don't have any way to test this right now though.
<?php/**
* Gets a list of tables and returns them in an array.
*/
function dbopt_get_tables() {
$result = db_query("SHOW TABLE STATUS");
$values = array();
foreach ($result as $table) {
if ($table->Data_free) {
$values[] = $table->Name;
}
}
return $values;
}
?>
#6
#7
I have two questions.
IceCreamYou: how do I run or use your code?
Deekayen: what do you mean by: use watchdog and sessions? Can you explain what you have done with it and how?
#8
@Alan.Guggenheim:
I have attached a module that should work for D6. Just enable it and it should optimize your database very 6 hours. Note that you'll need to rename the file to dbopt.tar.gz (drupal.org adds an extra underscore to the file extension).
I think deekayen means that the DB Maintenance module currently has a system that lets you explicitly choose which tables to optimize (regardless of whether they need it) and the module recommends that you optimize just the watchdog and sessions tables.
#9
Thanks a lot.
#10
Yes, I have DB Maintenance, but it does not work as my DB are defined as an array not a string (a requirement of an other module...) so I cannot select any table.
Now do you know of a way or a tool to prune the accesslog table? mine is overgrown.
Thanks
#11
@IceCreamYou: Can you provide a patch file for this module, D6 or D7?