Does anyone know how Drupal.org (D6) splits their MySQL read and write queries to a slave and master database? I am trying to get all "read" queries go to a slave database and all "write" queries go to a master database.
I was told that there isn't a safe way to split all SELECT queries to go to a slave (read-only) because not all SELECT queries are slave-safe. But, Drupal.org does it successfully so I would greatly appreciate if someone could give me information on this.
I tried this patch by SwampCritter, but I couldn't get it to work: http://msrwd.svn.sourceforge.net/viewvc/msrwd/D6/
Whenever I try to create new content, I get an error message: "Post could not be saved" and the post is not saved.
I have also tried to modify /includes/database.mysql-common.inc on my own in order to split the read/write queries to go to a slave/master db:
function db_query($query) {
$args = func_get_args();
array_shift($args);
$query = db_prefix_tables($query);
if (isset($args[0]) and is_array($args[0])) { // 'All arguments in one array' syntax
$args = $args[0];
}
_db_query_callback($args, TRUE);
$query = preg_replace_callback(DB_QUERY_REGEXP, '_db_query_callback', $query);
/* Added by apersaud
Integrating multiple read/write databases
*/
if(strpos(strtolower($query),"select") === 0){
db_set_active('readonly'); //this will not contain any data from the master (write) database untill replication happens
}
else {
db_set_active('default');
}
return _db_query($query);
db_set_active('default');
}And here's how I set up the read/write datbases in settings.php:
$db_url['default'] = 'mysql://username:password@localhost/databasename';
$db_url['readonly'] = 'mysql://username:password@localhost/databasename';However, this also didn't work. Again, when I try to create new content, I get an error message "Post could not be saved" and also the only row of data in the node_access table is deleted upon this action (which causes access permissions problems for users). Also, in the log entries I get these errors:
Duplicate entry '0' for key 2 query: INSERT INTO node (vid, type, language, title, uid, status, created, changed, comment, promote, moderate, sticky, tnid, translate) VALUES (0, 'story', '', 'Test 9', 6, 0, 1242671928, 1242671928, 0, 0, 0, 0, 0, 0) in .../includes/common.inc on line 3422.Does anyone know what patch Drupal.org applied in order to achieve this functionality successfully in D6?
Comments
Got it working!
I found out drupal.org uses a patch developed by "Four Kitchens" in order to achieve this functionality, so I don't know how they did it.
However, I was able to hack something together that seems to be working so far on Drupal 6. I modified one file '/includes/database.mysql-common.inc':
function db_query($query) {
$args = func_get_args();
array_shift($args);
$query = db_prefix_tables($query);
if (isset($args[0]) and is_array($args[0])) { // 'All arguments in one array' syntax
$args = $args[0];
}
_db_query_callback($args, TRUE);
$query = preg_replace_callback(DB_QUERY_REGEXP, '_db_query_callback', $query);
/* Added on 5/21/2009 by apersaud
Integrating multiple read/write databases
*/
if(strpos(strtolower($query),"select") === 0){
$commits = array('alter', 'insert', 'update', 'delete', 'flush', 'lock','create');
$is_commit = false;
foreach($commits as $type) {
if((strpos(strtolower($query),$type))){
$is_commit = true;
}
}
if($is_commit){
db_set_active('default');
} else{
db_set_active('readonly');
}
}
else {
db_set_active('default');
}
return _db_query($query);
}
It's not the prettiest solution, but it works. I did it this way because not all SELECT read queries are slave-safe, so I tried to catch those non-slave-safe queries and direct them to the master database. I might catch some SELECT queries that should probably go to the slave but I don't think there's any harm in it going to the master instead.
Direct all LAST_INSERT_ID() to master database
Directing queries containing LAST_INSERT_ID() function to the slave returns incorrect results.
Adding one more conidtion:
if(strpos(strtolower($query),"LAST_INSERT_ID()") === FALSE){
....
....
}
else{
db_set_active('default');
}
I combine the both, so the
I combine the both, so the final function is like this
function db_query($query) {
$args = func_get_args();
array_shift($args);
$query = db_prefix_tables($query);
if (isset($args[0]) and is_array($args[0])) { // 'All arguments in one array' syntax
$args = $args[0];
}
_db_query_callback($args, TRUE);
$query = preg_replace_callback(DB_QUERY_REGEXP, '_db_query_callback', $query);
/*
* Integrating multiple read/write databases
*/
if(strpos(strtolower($query), "select") === 0 && strpos(strtolower($query), "last_insert_id") === FALSE){
$commits = array('alter', 'insert', 'update', 'delete', 'flush', 'lock','create');
$is_commit = false;
foreach($commits as $type) {
if((strpos(strtolower($query), $type))){
$is_commit = true;
}
}
if($is_commit){
db_set_active('default');
//drupal_set_message('default');
} else{
db_set_active('readonly');
//drupal_set_message('readonly');
}
}
else {
db_set_active('default');
}
/*
* End read/write router
*/
return _db_query($query);
}
It works well, thank you all.
sub
subscribing
We're using Pressflow since
We're using Pressflow since it has support for slave databases. But, the slave databases are severely under utilized since db_query_slave/db_query_range_slave/etc have to be used to send the query to the slaves. Unless you want to patch every module to use db_query_slave instead of db_query and so on, no more than 10% of query will ever go to the slaves.
So, what we did is patch the include file, 'database.mysqli.inc', specifically the function '_db_query()'.
The patch below basically overwrites the parameter $slave (if false) by checking the query if it is slave save by doing some logic:
The list of slave save tables is defined in the settings.php. For example:
$conf['slavable_tables'] = array("blocks*", "boxes", "cache*", "contact*",
"content_group*", "content_node*", "date*",
"imagecache*", "languages", "locales*", "menu*",
"permission", "role", "system", "term_data",
"url_alias", "variables", "views*", "vocabulary*");
Basically, we chose tables that are fairly 'static' in nature. Also, "*" is the wild card, so that blocks and blocks_roles tables are included.
Here is the patch:
--- database.mysqli.inc 2010-11-18 10:42:13.000000000 -0800
+++ database.mysqli.inc 2010-11-19 16:39:07.000000000 -0800
@@ -91,6 +91,18 @@ function db_connect($url) {
function _db_query($query, $debug = 0, $slave = FALSE) {
global $active_db, $active_slave_db, $queries, $user;
+ $slave = $slave ? $slave : _db_query_is_slave_safe($query); // send to to slave if 'safe'
+
+// static $counter = 0, $a, $b;
+// if (!$slave) {
+// $a++;
+// drupal_set_message("noslave $a: ".$query);
+// } else {
+// $b++;
+// drupal_set_message("toslave $b: ".$query);
+// }
+// $counter++;
+
if (variable_get('dev_query', 0)) {
list($usec, $sec) = explode(' ', microtime());
$timer = (float)$usec + (float)$sec;
@@ -414,3 +426,90 @@ function db_column_exists($table, $colum
* @} End of "ingroup database".
*/
+function _db_query_is_slave_safe($query) {
+ static $slave_safe_tables_pattern = false;
+
+
+ if ($slave_safe_tables_pattern == false) {
+ $slave_safe_tables = variable_get("slavable_tables", array());
+
+ if (count($slave_safe_tables)) {
+ $slave_safe_tables_pattern = '/^('. strtr(implode("|", $slave_safe_tables), array("*" => ".*")) .')$/';;
+// drupal_set_message('$slave_safe_tables_pattern '.$slave_safe_tables_pattern);
+ }
+ }
+
+ if ($slave_safe_tables_pattern == false) { // still nothing, assume slave are not desirable
+ return false;
+ }
+
+ $query = strtolower($query);
+ if (strpos($query,"select ") !== false && !preg_match("/alter|insert|update|drop|delete|flush|lock|create|last_insert_id|found_rows|sql_calc_found_rows|row_count/", $query)) {
+ // a select and contains none of the unwanted key words
+
+ $tokenize_query = _db_query_tokenize($query);
+ $tables = array();
+ $table_count = 0;
+ for ($i = 0; $i < count($tokenize_query); $i++) {
+ $token = $tokenize_query[$i];
+ if (isset($tokenize_query[$i+1]) && in_array($token, array("join", "from"))) {
+ $table = $tokenize_query[$i+1];
+ $tables[$table] = $table;
+ $table_count++;
+ }
+ }
+
+ if ($table_count == 0) { // couldn't detect a table, must be a reeealy complicated query
+ return false;
+ }
+
+ foreach ($tables as $table) {
+ if (!preg_match($slave_safe_tables_pattern, $table)) { // table is not in the list
+ return false;
+ }
+ }
+
+ return true; // if we get here, it's all good to go to slave
+ }
+ return false;
+}
+
+function _db_query_tokenize($query, $cleanWhitespace = true) {
+
+ /**
+ * Strip extra whitespace from the query
+ */
+ if($cleanWhitespace) {
+ $query = ltrim(preg_replace('/[\\s]{2,}/',' ',$query));
+ }
+
+ /**
+ * Regular expression based on SQL::Tokenizer's Tokenizer.pm by Igor Sutton Lopes
+ **/
+ $regex = '('; # begin group
+ $regex .= '(?:--|\\#)[\\ \\t\\S]*'; # inline comments
+ $regex .= '|(?:<>|<=>|>=|<=|==|=|!=|!|<<|>>|<|>|\\|\\||\\||&&|&|-|\\+|\\*(?!\/)|\/(?!\\*)|\\%|~|\\^|\\?)'; # logical operators
+ $regex .= '|[\\[\\]\\(\\),;`]|\\\'\\\'(?!\\\')|\\"\\"(?!\\"")'; # empty single/double quotes
+ $regex .= '|".*?(?:(?:""){1,}"|(?<!["\\\\])"(?!")|\\\\"{2})|\'.*?(?:(?:\'\'){1,}\'|(?<![\'\\\\])\'(?!\')|\\\\\'{2})'; # quoted strings
+ $regex .= '|\/\\*[\\ \\t\\n\\S]*?\\*\/'; # c style comments
+ $regex .= '|(?:[\\w:@]+(?:\\.(?:\\w+|\\*)?)*)'; # words, placeholders, database.table.column strings
+ $regex .= '|[\t\ ]+';
+ $regex .= '|[\.]'; #period
+ $regex .= '|[\s]'; #whitespace
+
+ $regex .= ')'; # end group
+
+ // get global match
+ preg_match_all( '/' . $regex . '/smx', $query, $result );
+
+ // return tokens
+ $arr = $result[0];
+ $new_arr = array();
+ foreach ($arr as $i => $val) {
+ $val = trim($val);
+ if ($val) {
+ $new_arr[] = $val;
+ }
+ }
+ return $new_arr;
+}
The _db_query_is_slave_safe got a problem
I found a problem on
if (strpos($query,"select ") !== false && !preg_match("/alter|insert|update|drop|delete|flush|lock|create|last_insert_id|found_rows|sql_calc_found_rows|row_count/", $query)) {If I send query llike this
SELECT node.title, node.created FROM nodeIt will return false, because the syntax have "create" on "node.created".
So I changed the code to
$query = trim(strtolower($query));if (strpos($query,"select ") === 0 && !preg_match("/alter |insert |update |drop |delete |flush |lock |create |last_insert_id|found_rows|sql_calc_found_rows|row_count/", $query)) {
But I don't know it safe? or there have better solution?