Download & Extend

Add automatic option for optimizing tables

Project:DB Maintenance
Version:7.x-1.x-dev
Component:Code
Category:task
Priority:normal
Assigned:Unassigned
Status:needs work

Issue Summary

Hi,

I like the idea of this module. I've had a similar module I wrote sitting around on my HD that just runs OPTIMIZE (for MySQL) or VACUUM (for PGSQL) to clean up the database every so often. Similar functionality is in the Database Administration module, but you have to explicitly choose which tables to optimize, whereas the module I wrote automatically looks for tables that need optimizing and just optimizes those.

The code is fairly short, and I think it fits in pretty well with the purpose of this module, so I figured I'd post it here. If you want to keep it, great; if not, that's fine too.

<?php
/**
* Implementation of hook_cron().
*/
function dbopt_cron() {
 
//Optimize every 6 hours. Ideally this would be a setting.
 
if (variable_get('dbopt_last', 0) < time() - 60 * 60 * 6) {
   
//This is not in the SQL standards, so PGSQL and MySQL have their own commands for it.
   
switch ($GLOBALS['db_type']) {
      case
'pgsql':
       
//I don't run any sites on PGSQL so I haven't actually tested this, but the documentation indicates it should work.
       
db_query("VACUUM");
        break;
      case
'mysqli':
      case
'mysql':
       
$tables = dbopt_get_tables(TRUE);
       
$tables = implode(',', $tables);
       
$result = db_query("OPTIMIZE TABLE $tables");
        break;
   
watchdog('dbopt', 'Database tables optimized');
   
variable_set('dbopt_last', time());
  }
}

/**
* Gets a list of tables and returns them in an array.
*/
function dbopt_get_tables() {
 
$result = db_query("SHOW TABLE STATUS");
 
$values = array();
  while (
$table = db_fetch_array($result)) {
    if (
$table['Data_free']) {
     
$values[] = $table['Name'];
    }
  }
  return
$values;
}
?>

Comments

#1

Er, sorry, that would be DB Maintenance, not DB Administration.

#2

Title:Optimize tables» Add automatic option for optimizing tables
Project:Cleaner» DB Maintenance
Version:6.x-1.0» 7.x-1.x-dev
Status:needs review» needs work

SHOW TABLE STATUS has a data_free column that is the equivalent of overhead in phpmyadmin. You're missing out on the opportunity in your code to just optimize the tables that need it by evaluating the value of that column on each of the tables before optimizing it.

Moreover, optimizing is really only beneficial on tables with concurrent inserts for MyISAM tables, where there are variable length fields like varchar, that have large amounts of frequently deleted content. That's why I recommend watchdog and sessions, but it really all depends on how people use Drupal for their site. A site that's generating a lot of content with only a few users might do better by optimizing node_revisions more often instead.

However, if you divide data_free into data_length from the table status list, you could get a default list of tables that need to be optimized much in the same way that Windows defrag says whether or not your drive should get a defrag.

#3

SHOW TABLE STATUS has a data_free column that is the equivalent of overhead in phpmyadmin. You're missing out on the opportunity in your code to just optimize the tables that need it by evaluating the value of that column on each of the tables before optimizing it.

If that was directed at me, I'm confused, because that's exactly what the code I posted does. If my code is run, only tables that need optimization will get optimized. There's no reason to look at data_length; data_free is the amount of space that would be gained by running OPTIMIZE.

I haven't worked with DBTNG yet, but from what I understand, updating this code for D7 would just consist of changing db_query("do stuff") to db_query("do stuff")->execute().

#4

Category:feature request» task
Status:needs work» patch (to be ported)

My bad. I scanned through your code too fast. Maybe the capital D in Data_free threw me off.

On a static query with db_query(), the ->execute() is built in, so the part that would change the condition for looping through the result. The use of db_type also changes, I think.

#5

$db_type stays the same in D7... having read the DBTNG docs now, I think that dbopt_cron() would stay the same in the code above, and here's a revised version of dbopt_get_tables(). I don't have any way to test this right now though.

<?php
/**
* Gets a list of tables and returns them in an array.
*/
function dbopt_get_tables() {
 
$result = db_query("SHOW TABLE STATUS");
 
$values = array();
  foreach (
$result as $table) {
    if (
$table->Data_free) {
     
$values[] = $table->Name;
    }
  }
  return
$values;
}
?>

#6

Status:patch (to be ported)» needs review

#7

I have two questions.
IceCreamYou: how do I run or use your code?

Deekayen: what do you mean by: use watchdog and sessions? Can you explain what you have done with it and how?

#8

@Alan.Guggenheim:

I have attached a module that should work for D6. Just enable it and it should optimize your database very 6 hours. Note that you'll need to rename the file to dbopt.tar.gz (drupal.org adds an extra underscore to the file extension).

I think deekayen means that the DB Maintenance module currently has a system that lets you explicitly choose which tables to optimize (regardless of whether they need it) and the module recommends that you optimize just the watchdog and sessions tables.

AttachmentSize
dbopt.tar_.gz 679 bytes

#9

Thanks a lot.

#10

Yes, I have DB Maintenance, but it does not work as my DB are defined as an array not a string (a requirement of an other module...) so I cannot select any table.

Now do you know of a way or a tool to prune the accesslog table? mine is overgrown.

Thanks

#11

Status:needs review» needs work

@IceCreamYou: Can you provide a patch file for this module, D6 or D7?

nobody click here