Handle database-specific SQL functions/operators by simple abstraction

hswong3i - August 14, 2007 - 18:51
Project:Drupal
Version:7.x-dev
Component:database system
Category:feature request
Priority:critical
Assigned:hswong3i
Status:patch (code needs review)
Description

a little difficulty that discovered during MSSQL driver development: i hope to remove the use of "LENGTH" since it is a database specific function (in case of MSSQL, it is called as "DATALENGTH"). hope to have some suggestion about this 2 lines:

  1. comment.module:1011
    <?php
    $query
    .= ' ORDER BY SUBSTRING(c.thread, 1, (LENGTH(c.thread) - 1))';
    ?>
  2. locale.module:373
    <?php
    $result
    = db_query("SELECT s.source, t.translation, t.language FROM {locales_source} s LEFT JOIN {locales_target} t ON s.lid = t.lid WHERE t.language = '%s' AND s.textgroup = 'default' AND LENGTH(s.source) < 75", $language->language);
    ?>

#1

hswong3i - August 18, 2007 - 13:05
Title:remove using "LENGTH" as SQL friendly» remove "SUBSTRING" and "LENGTH" as SQL friendly
Priority:normal» critical

@killes: it is also a critical issue for using SUBSTRING: for both Oracle and DB2, it is called as SUBSTR. BTW, comment.module is the only place that we use SUBSTRING, among all core module. maybe we should consider re-factor its programming logic, and implement this sorting by other method?

#2

hswong3i - September 3, 2007 - 08:18
Version:6.x-dev» 7.x-dev
Priority:critical» normal

according to indeed studying about ADOdb, i found that this is something can be overcome by using database specific abstract function. e.g. we can provide a function call as db_substr(), which will return a string for substring function, according to database specific syntax (MySQL: substr() or substring(); Oracle: substr(); MSSQL: substring()). please take these as reference (http://phplens.com/lens/adodb/docs-adodb.htm#substr and http://phplens.com/lens/adodb/tips_portable_sql.htm):

substr

This is not a function, but a property. Some databases have "substr" and others "substring" as the function to retrieve a sub-string. To use this property:

<?php
  $sql
= "SELECT ".$db->substr."(field, $offset, $length) from table";
 
$rs = $db->Execute($sql);
?>

For all databases, the 1st parameter of substr is the field, the 2nd is the offset (1-based) to the beginning of the sub-string, and the 3rd is the length of the sub-string.

Portable Native SQL

ADOdb provides the following functions for portably generating SQL functions as strings to be merged into your SQL statements (some are only available since ADOdb 3.92): ...

and so i will postpone this issue into D7. BTW, i will also include this research result in http://drupal.org/node/172541 as D7 database abstract layer technical preview, so we will able to judge its needs by using code, testing and benchmarking, on the other hand correct this idea as soon as we can. i don't hope to let this become another pity issue for D7 ;p

#3

chx - September 14, 2007 - 17:47
Status:active» won't fix

Core works fine, there is no issue here.

#4

hswong3i - September 14, 2007 - 18:29
Status:won't fix» active

@chx: once and once again: are you core committer? please don't close or set others issue as won't fix as you like, it is not your duty :(

#5

chx - September 14, 2007 - 18:30
Status:active» won't fix

Might not be my duty but I do close inadequate patches wherever i see them.

#6

hswong3i - September 14, 2007 - 18:47
Status:won't fix» active

@chx: please don't be too childish: the need of this patch is listed in my previous research remark (http://edin.no-ip.com/html/?q=node/310, http://phplens.com/lens/adodb/docs-adodb.htm#substr), and proved as function within my latest D7 database abstraction layer research progress result (http://drupal.org/node/172541#comment-305267). please feel free to comment about the correctness of my research result; BTW, striking such little footnote won't help about you overall propose (http://www.drupal4hu.com/node/64)

#7

Pasqualle - September 14, 2007 - 19:57

Would not be it easier to write 2 stored procedures, substring and length for oracle, db2, mssql (or any other) databases?
Do you really plan to rewrite all modules like this?

#8

hswong3i - September 14, 2007 - 20:20

@Pasqualle: i guess i am not proposing something new other than existing ADOdb research result. as they have researched for more than 7 years, with totally 19 different databases supporting. their research result should be very solid.

on the other hand, before i notice about the handling of ADOdb, i try to implement a pre-defined random function for oracle driver once before (integrated with progress of http://drupal.org/node/39260): it is totally functioning:

<?php
    db_query
('CREATE OR REPLACE FUNCTION rand RETURN FLOAT AS
        retval FLOAT;
      BEGIN
        SELECT dbms_random.value INTO retval FROM DUAL;
        RETURN retval;
      END;'
);
?>

BTW, isn't this the BEST handling? seems not to be, when you compare it with ADOdb implementation for oracle:

<?php
 
var $random = "abs(mod(DBMS_RANDOM.RANDOM,10000001)/10000000)";
?>

and let's have a look about pgsql implementation, too:
<?php
 
var $random = 'random()';        /// random function
?>

according to ADOdb's suggestion (http://phplens.com/lens/adodb/tips_portable_sql.htm): the numbers of additional abstraction function are limited. according to my studying on their different backend handling, i found that most of them are very reasonable and meaningful, and easy to implement :)

finally, we are not always using those special and specific SQL level functions: there is only around 5~7 lines existing within our latest drupal-6.x.dev CVS HEAD. as we are talking about a huge project as drupal core, you may simply guess about how many lines should be changed within contribution modules :)

#9

Pasqualle - September 14, 2007 - 21:52

Ok, I searched another php application for reference. I know phpbb support many databases.

so phpbb code snippet
length (only one occurence):

switch ($db->sql_layer)
      {
        case 'mssql':
        case 'mssql_odbc':
          $sql = 'SELECT *
            FROM ' . SMILIES_TABLE . '
            ORDER BY LEN(code) DESC';
        break;
 
        case 'firebird':
          $sql = 'SELECT *
            FROM ' . SMILIES_TABLE . '
            ORDER BY CHAR_LENGTH(code) DESC';
        break;

        // LENGTH supported by MySQL, IBM DB2, Oracle and Access for sure...
        default:
          $sql = 'SELECT *
            FROM ' . SMILIES_TABLE . '
            ORDER BY LENGTH(code) DESC';
        break;
      }

and substring is not used anywhere

I am now convinced, that this issue should be fixed if we want to support more databases.

#10

hswong3i - September 17, 2007 - 09:43
Title:remove "SUBSTRING" and "LENGTH" as SQL friendly» Abstract database specific special SQL functions (e.g. LENGTH, SUBSTRING, RAND, etc)

@Pasqualle: thanks for your help about exploration on phpBB, and your positive supporting :)

According to ADOdb's document (http://phplens.com/lens/adodb/tips_portable_sql.htm), I try to summarize those database specific special SQL functions which required for abstraction:

  1. DB_CONCAT_OPERATOR (defined constant)
  2. DB_LENGTH (defined constant)
  3. DB_UPPER (defined constant)
  4. DB_RANDOM (defined constant)
  5. DB_SUBSTR (defined constant)
  6. db_concat() (function wrapper)
  7. db_if_null() (function wrapper)

Some of them are really unusual, e.g. DB_UPPER and db_if_null(). BTW, as we are now fully implement those abstract functions which are required to be abstract, based on cross database compatibility concern (but not only based on today's needs, which means: trial and error), our developers may fully utilize most powerful database level functions as possibile. This also means that: increase our database abstraction flexibility and encourage our developers creativity.

Progress will integrate with my D7 database abstraction layer technical preview (http://drupal.org/node/172541), which will not duplicate in this issue once again.

P.S. need not to mention, time related handling is also a huge topic for database abstraction: ADOdb provide totally 6 different function wrappers for abstracting those differences (http://phplens.com/lens/adodb/tips_portable_sql.htm). BTW, we shouldn't require for these abstraction, as Drupal always use PHP-level date and time handling, which is very powerful and standardized, and database independence :)

#11

hswong3i - November 21, 2007 - 16:02
Title:Abstract database specific special SQL functions (e.g. LENGTH, SUBSTRING, RAND, etc)» DB compatibility: abstract database-specific SQL functions/operators
Category:feature request» bug report
Priority:normal» critical

Just as example, the following is code snippet from D6 OCI8 implementation, which demo how abstraction is needed for portable code:

<?php
/**
* Indicates the string to use to quote identifiers and names.
*/
define('DB_QUOTE_OPERATOR', '"');

/**
* Indicates the concatenation operator.
*/
define('DB_CONCAT_OPERATOR', '||');

/**
* Indicates the default string placeholder.
*/
define('DB_DEFAULT', '_OCI8_DEFAULT_');

/**
* Indicates the name of the SQL strtoupper function.
*/
define('DB_UPPER', 'UPPER');

/**
* Indicates the SQL to generate a random number between 0.00 and 1.00.
*/
define('DB_RAND', 'ABS(MOD(DBMS_RANDOM.RANDOM,10000001)/10000000)');

/**
* Indicates the name of the SQL strlen function.
*/
define('DB_STRLEN', 'LENGTH');

/**
* Indicates the name of the SQL substr function.
*/
define('DB_SUBSTR', 'SUBSTR');

/**
* Replace all escape characters in a query, based on the database specific
* implementation.
*
* queries sent to Drupal should wrap all identifiers and names in square
* brackets. This function will search for this syntax and replace it as
* corresponding escape characters, based on the database specific
* requirement.
*/
function db_escape_quote($sql) {
  return
preg_replace_callback('/\[([A-Za-z0-9_]+)\]/', '_db_escape_quote_callback', $sql);
}

/**
* Helper function for db_query().
*/
function _db_escape_quote_callback($matches) {
 
$match = $matches[1];
  if (
strlen($match) > 30) {
   
// Try to fetch mapping, if not exists, set it up.
   
$trim = variable_get('oci8_' . $match, NULL);
    if (!
$trim) {
     
// Fetch last counter, and +1 for next mapping.
     
$count = variable_get('oci8_max30', 0) + 1;
     
variable_set('oci8_max30', $count);
     
// Setup mapping and reverse mapping.
     
$trim = substr($match[1], 0, 30 - strlen($count)) . $count;
     
variable_set('oci8_' . $match[1], $trim);
     
variable_set('oci8_' . $trim, $match[1]);
     
$match = $trim;
    }
  }
  return
DB_QUOTE_OPERATOR . $match . DB_QUOTE_OPERATOR;
}

/**
* Return a portably concatenate strings.
*
* @param ...
*   Variable number of string parameters.
* @return
*   Portably concatenate strings.
*/
function db_concat() {
 
$args = func_get_args();
  return
implode(DB_CONCAT_OPERATOR, $args);
}

/**
* Returns a string that is the equivalent of MySQL IFNULL or Oracle NVL.
*
* @return
*   If $expr1 is not NULL, returns $expr1; otherwise it returns $expr2.
*/
function db_if_null($expr1, $expr2) {
  return
" NVL($expr1, $expr2) ";
}
?>

#12

hswong3i - December 22, 2007 - 07:21
Title:DB compatibility: abstract database-specific SQL functions/operators» Siren #5: Handle database-specific SQL functions/operators by simple abstraction

#13

chx - December 23, 2007 - 01:02
Category:bug report» feature request
Priority:critical» normal

#14

chx - December 23, 2007 - 09:20
Title:Siren #5: Handle database-specific SQL functions/operators by simple abstraction» Handle database-specific SQL functions/operators by simple abstraction

As Siren is not a project on Drupal.org but an unofficial fork, I am removing it from issue titles to avoid confusion.

#15

hswong3i - February 5, 2008 - 08:19
Category:feature request» task
Priority:normal» critical
Status:active» patch (code needs review)

This is just a very simple abstraction: replace most LENGTH and SUBSTRING, into DB_STRLEN and DB_SUBSTR. On the other hand, I also collect some other important abstraction, based on above research founding, and progress of my personal research project Siren.

Patch via latest CVS HEAD, and test with both MySQL and PostgreSQL.

AttachmentSize
fn_op_abstract-0.1.patch6.2 KB

#16

Crell - August 21, 2008 - 22:09
Status:patch (code needs review)» won't fix

Function abstraction is something that would need to happen as part of the new query builders. This approach is therefore no longer applicable.

#17

hswong3i - August 28, 2008 - 17:21
Category:task» feature request
Status:won't fix» patch (code needs review)

Thanks for TNG DB layer with OOP style, we can now clone ADOdb handling in more ideal implementation. As mentioned in http://drupal.org/node/167335#comment-636082, these abstraction is very useful for other DB driver development.

This patch handling the following abstraction: quote identifier (e.g. "), concatenate operator (e.g. +), uppercase function, random function, string length function, substring operator, string concatenate function and if NULL benching. Each DB driver will override their required change, where a friendly wrapper function is provided as like as escapeTable => db_escape_table(). 3 set of examples are available (comment.module, locale.module and statistics.admin.inc), too.

Patch via CVS HEAD, tested with MySQL and PostgreSQL. MySQL work fine with either case, where PostgreSQL will buggy for: 1. add comment (but thread view work fine); 2. access top visitors page.

P.S. This patch is mainly target for the missing abstraction which is now missing from TNG DB layer, and try to minimize change to examples if possible. TNG DB syntax change is not its main concern, where we may handle afterward.

AttachmentSize
portable-native-sql.patch8.97 KB
 
 

Drupal is a registered trademark of Dries Buytaert.