Now that we have .install files for modules, I believe it is time we extended our database abstraction layer.

I recommend we create the following extra functions :

1. db_create_table($tablename, $columns); // columns is an associative array with the 'name' => 'type'
3. db_drop_table($tablename);
3. db_create_index($tablename, $columns); // columns is an array of column names, or just a single column.
4. db_drop_index($tablename, $columns);
5. db_add_column($tablename, $column, $type); // PGSQL already has a similar function, but it isn't in both systems.
6. db_drop_column($tablename, $column);

The installer that is coming in, will be converting the database.*sql files into a system.install file, and with these functions, we can collapse ALL THREE SCHEMA FILES, into a single, much simpler file.

This will also _drastically_ simplify updates, and it will give us consistent usage of types across the database.
Certain things will still need to happen only on postgres (like create function calls for instance), but this DRASTICALLY decreases the amount of work we have to do to support multiple db types.

At the moment, there are no create table statements that can be changed in core itself (until we have a system.install) , however, contrib modules can already be ported to use these functions, which would mean that we have (preliminary) postgres support across the board on all contrib modules.

These functions should possibly keep track of what tables / columns / indexes have already been created, that would allow us to check (in a cross db way) what tables are available, and maintaining this information could possibly be useful in the future, such as not creating tables which have already been created.

Additionally, each of these functions can be used to trigger a process for entering additional database credentials, in the case that the specific database user is not allowed to create tables and the like.

This also helps make the patch for the installer a lot simpler, so if you want to help out with the installer, this would be a good thing to get involved with =)

CommentFileSizeAuthor
#12 forum.module_16.patch713 bytesFrando
#8 mysql.inc_0.patch8.16 KBFrando
#3 mysql.inc.patch7.15 KBFrando
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

moshe weitzman’s picture

This is an obvious next step for the installer. The fact that it helps with privelege escalation and postgres compatibility of contrib modules (and eventually other engines) is a real bonus.

webchick’s picture

In a cursory glance through update.php, looks like we also need:

db_alter_column_property($table, $column, $property, $value);

for stuff like:
$ret[] = update_sql("ALTER TABLE {blocks} ALTER COLUMN pages set default ''");

and maybe also:
$ret[] = update_sql("ALTER TABLE {poll} RENAME voters TO polled");

Frando’s picture

Assigned: Unassigned » Frando
Status: Active » Needs review
FileSize
7.15 KB

As I announced in the mailing list, I started working on a implentation in mysql yesterday.
Here's now the first patch for database.mysql.inc.

Everything seems to work well so far, however, the list of drupa ldatatypes is not finished yet, we should discuss which kinds of types we want to include.

As I'm not familiar with postgres, I won't be able to do a postgres implentation as well. But I think, when we're done with with testing and modifying the mysql implentation, there will be someone to do the posgres one.

best regards,
Franz Heinzmann

P.S.
some usage examples:

$array1 = array(
    'nid' => array('type' => 'int', 'key' => 'primary'),
    'link' => array('type' => 'string'),
    'name' => array('type' => 'text', 'suffix' => 'NOT NULL', 'default' => ''),
    'node' => array('type' => 'int', 'suffix' => 'NOT NULL', 'default' => 0, 'key' => 'unique'),
    'version' => array('type' => 'long text', 'suffix' => 'NOT NULL', 'default' => ''),
    'created' => array('type' => 'timestamp' , 'suffix' => 'NOT NULL', 'default' => 0, 'key' => 'foo_bar'),
    'changed' => array('type' => 'timestamp' , 'suffix' => 'NOT NULL', 'default' => 0, 'key' => 'foo_bar'),
    'required' => 'boolean'
);
      
db_create_table('test1',$array1);
db_add_column('test1', 'testc', array('type' => 'string', 'length' => '128', 'suffix' => 'NOT NULL', 'default' => 'foo'));
db_drop_column('test1','changed');
db_create_index('test1','foobar2',array('nid','created'));
db_drop_index('test1', 'foobar2');
db_drop_table('foobar');
chx’s picture

Dries wants Drupal developers to write SQL statements. Adrian proposes and it seems a good propose to make our definitions DB agnostic. That's not so hard as core does not use -- by far -- the full extent of possibilites of CREATE TABLE statements and I think that's an acceptable compromise. So here is some bridging code:

preg_match_all('/CREATE TABLE (.*) \((.*)\)[^\n]*;/Us', $x, $m, PREG_SET_ORDER);
foreach ($m as $definition) {
  $tablename = $definition[1];
  $fields = explode("\n", $definition[2]);
  foreach ($fields as $field) {
    $field = trim($field, ' ,');
    if ($field[0] < 'a') {
      $indexes[$tablename][] = $field;
    }
    else {
      $parts = explode(' ', $field);
      $fieldname = array_shift($parts);
      $type = array_shift($parts);
      $extra = implode(' ', $parts);
      if (strpos($extra, 'auto_increment') !== FALSE) {
        $type = 'serial';
        $extra = '';
      }
      $type = preg_replace('/(int.*)\(\d+\)/', '\1', $type);
    }
  }
}
$typemap = array(
  'mysql' => array('serial' => 'integer unsigned NOT NULL auto_increment'),
  'pgsql' => array('longblob' => "bytea default ''", 'longtext' => 'text'),
);

I checked the remaining $extras and it seems they are support by pgsql alike. aside from unsigned, they seem to be standard SQL. Indexes will need more love. But I think this path worths pursuing.

chx’s picture

I forgot to mention that http://dev.mysql.com/doc/refman/5.0/en/numeric-types.html :

Another extension is supported by MySQL for optionally specifying the display width of an integer value in parentheses following the base keyword for the type (for example, INT(4)). This optional display width specification is used to left-pad the display of values having a width less than the width specified for the column.

The display width does not constrain the range of values that can be stored in the column, nor the number of digits that are displayed for values having a width exceeding that specified for the column.

so I strip (number) from integers so there are less types to support.

markus_petrux’s picture

Hi, sometime ago I wrote a SQL Parser that converts DDL statements using a syntax mostly based on MySQL into other SQL engines. It currently supports MySQL, PostgreSQL and MS-SQL, but there are plans to add support for, at least, all other SQL engines currently supported by phpBB.

Actually, it should support conversion of CREATE TABLE, ALTER TABLE and DROP TABLE. For some kind of ALTER TABLE features I had planned to add some functions to read the system catalog views. I have already some code, but it needs a lot of love... the rest, should more or less work.

I have made a couple of last minute fixes, so it can parse the Drupal MySQL schemas. It generates a bunch of warnings, though. You can check it here:
http://sql.phpmix.com

The syntax is described here:
http://area51.phpbb.com/phpBB/viewtopic.php?t=20911

The code can be found here:
http://easymod.cvs.sourceforge.net/easymod/sql_parser/
It is not up to date, though. You might want to look at the code. It is fully commented and it contains some references to other resources.

If you guys think it can be used for Drupal, I would very glad to make any changes so it supports anything that Drupal might need. However, I would like to keep the library as independent as possible so it can be used with Drupal, phpBB or anything else.

If the current method to invoke the parser doesn't fit Drupal guidelines, maybe it could be defined a set of APIs, but keep the backend as is.

Karoly contacted me about this thread. Please, let me know what you think about it. I'm currently busy, but I would be very glad to work on it.

Cheers

drumm’s picture

Status: Needs review » Needs work

- Non-standard indentation
- Badly formatted comments (capitalize and punctuate)
- No postgres

Frando’s picture

Status: Needs work » Needs review
FileSize
8.16 KB

Sorry for the wrong indentation, forgot to change my editor's preferences from the last project I've been working on ...

Now here's a new patch.

I also changed some more stuff:
- increased independance by adding properties for NULL, NOT NULL, SIGNED/UNSIGNED and automatical incrementation
- added the function proposed by webchick
- improved documentation

I won't be able to create a postgres version cause I'm not familiar enough with postgres. However, the target had been to define an abstract API for table creation/modification, and for some postgres crack, it should be easy enough to port.

best regards,
frando

Frando’s picture

in my last change, the usage examples I posted above aren't working any more, so here are working ones:

$array1 = array(
        'nid' => array('type' => 'int', 'key' => 'primary', 'auto_increment' => TRUE),
        'link' => array('type' => 'string'),
        'body' => array('type' => 'text', 'NULL' => FALSE, 'default' => ''),
        'foo' => array('type' => 'int', 'NULL' => FALSE, 'default' => 0, 'key' => 'unique'),
        'bar' => array('type' => 'long text', 'default' => ''),
        'footime' => array('type' => 'timestamp', 'foo_bar'),
        'bartime' => array('type' => 'timestamp', 'key' => 'foo_bar'),
        'required' => 'boolean'
      );
      
$query = db_create_table('test1',$array1);

$query = db_add_column('test1', 'wohoo', array('type' => 'string', 'length' => '128', 'NULL' => FALSE, 'default' => 'foo'));

$query = db_drop_column('test1','changed');

$query = db_create_index('test1','my_index',array('nid','created'));

$query = db_drop_index('test1', 'my_index');

$query = db_drop_table('test1');

$query = db_column_set_default('test1','foo',25);
Frando’s picture

just spotted a typo in my last example.
In the create table array, it has of couse to be 'footime' => array('type' => 'timestamp', key => 'foo_bar'), instead of 'footime' => array('type' => 'timestamp', 'foo_bar'),

kbahey’s picture

This is still ambiguous. I am not sure what test1 is or wohoo.

$query = db_add_column('test1', 'wohoo', ...  );

$query = db_drop_column('test1','changed');

$query = db_create_index('test1','my_index',array('nid','created'));

$query = db_drop_index('test1', 'my_index');

Like form API, it would be better to say:

$query = db_drop_column('tablename' => 'test1', 'column' => 'changed');

so the order of parameters does not matter.

Frando’s picture

FileSize
713 bytes

Yes, this is an idea .. might think about it some more tomorrow.

For the time being, the code in patch is documented, all parameters are explained there. My examples should not replace the documentation, just be a minor addition to it.

I'm quoting the most important part of the comments in my patch here to make things clear:

/**
* Create a new table.
* 
* @param string $table 
*   Name of the table to be created.
* @param array $columns 
*   Is an associative array with 'name' => 'type',
*     'name'  should be a string containing the name of the column,
*     'type'  can either be a string (containing a drupal datatype, the default properties will be used) or
*             an associative array (to specify more properties).
* 
*     If the latter is used, all default properties specified in the drupal datatype array can be overriden
*     by setting the following keys in the 'type'-array:
* 
*       'type' - The drupal datatype [required].
*       'length' - For varchar or numeric types [optional]
*       'NULL' - FALSE for NOT NULL, TRUE for NULL [optional]
*       'unsigned' - TRUE for UNSIGNED [optional]
*       'auto_increment' - TRUE for 'auto_increment' [optional]
*       'default' - Default value for the column [optional].
*       'key' - [optional]
*         'primary' to make the column the primary key,
*         'unique' for unique key or
*         use any other string $string to create a new key $string. All columns that have type['key'] = $string will be part of it.
*/
function db_create_table($table, $columns) {

and

/**
* Get drupal datatype definitions.
* 
* @return 
*   Array containing all valid drupal datatypes with their default properties
*/
function db_get_types() {
  $types = array (
    'string' => array('type' => 'VARCHAR', 'length' => 255),
    'text' => array('type' => 'TEXT'),
    'long text' => array('type' => 'LONGTEXT'),
    'timestamp' => array('type' => 'INT', 'length' => 11, 'NULL' => FALSE, 'unsigned' => TRUE, 'default' => 0),
    'int' => array('type' => 'INT', 'length' => 10),
    'medium int' => array('type' => 'MEDIUMINT', 'length' => 9),
    'tiny int' => array('type' => 'TINYINT', 'length' => 4),
    'blob' => array('type' => 'BLOB'),
    'long blob' => array('type' => 'LONGBLOB'),
    'boolean' => array('type' => 'TINYINT', 'length' => 1, 'NULL' => 'FALSE', 'default' => 0)
  );
  return $types;
} 

But the best would be just to have a look at the patch itself instead of these excerpts. It should be easy enough to read.

moshe weitzman’s picture

Anyone willing to revive this? It would be nice if we had a full postgres implementation, but i think that can lag a bit if needed. thats how the installer patch progressed.

moshe weitzman’s picture

Version: x.y.z » 6.x-dev

lets revive this for next release. we have very credible mssql, oracle and sqlite database layers out there. but hardly any modules write schemas for them. this patch would live us one set of functions and all db abstractions would thrive. i think we are shooting ourselves in the foot by not having this.

note that we would still write raw SQL everywhere except schema definition, since thats non standard for each DBMS.

Frando’s picture

Yes, I'll definitly revive this for 6.0. I'll reroll a patch as soon as development is open again.

Paul Natsuo Kishimoto’s picture

Eureka! I finally found this feature request after making many passes through the forums, documentation and issues and leaving several despairing comments which I now regret. This looks like progress!

Further to what adrian said initially, requiring unique module SQL for every RDBMS is a pain, and handicaps module developers that don't have access to all the supported databases. Current (4.7.x/5.0-beta) 'support' for Postgres and Oracle is chimerical, because it only lasts until the user wants to use a popular contrib module that hasn't been tested on these RDBMS.

More specific to this feature request/patch, there are several cans of worms opened. Two big ones:

  • What's meant by "raw SQL," and "Dries wants Drupal developers to write SQL statements"? The challenge of DB abstraction lies in differing implementations of "SQL". If developers are free to write SQL, there will inevitably be many RDBMS for which their SQL is invalid. One solution would be to provide complete DB abstraction for core/developers, while allow while allowing SQL passthrough for hacking / throwaway module development. markus_petrus' comment proposes another solution to this issue: write one flavour of SQL, translate to others.
  • Despite the "omg RTFM n00b" file I'm frequently pointed to, it might be time to consider relaxing the total ban on the 'class' keyword. This could allow the partial or complete use of one of many mature, fast PHP DBA layers. I refer to ADOdb, MDB2 (the most mature DBA in PEAR), PHP's PDO functions and possibly others.

I've done lots of wrangling with Drupal installations/setups but never any development. This is a feature that will convince me to start, as long as there isn't ideological resistance to doing it right.

Frando’s picture

Yup, I also think this should be Done Right in Drupal 6. And yes, it *might* be useful to abrogate the ban on the class keyword for this and some other drupal 6 stuff. I'm saying *might* be useful, I'm not saying that we should use classes because of classes' or OOP's sake. We just should not make an ideology out of "Drupal doesn't use classes". If classes prove to be useful for a Data/SQL API, we should use them. If not, we shouldn't.

And, one more point: We might want to do this alltogether and consistent with data models in Drupal 6 (e.g. adrians 'FAPI 3' proposal).

For me, this is one of the main goals for drupal 6, because a proper Data/SQL handling API is one of the things that drupal lacks most, and for me it's the one thing that could ease drupal module developement enormously. World domination would be close then ;)

moshe weitzman’s picture

guys - i appreciate the enthusiasm. i want this for d6 too. however, please consider the following:

- this issue is about a schema creation API. no more.
- there is no ban on classes. don't setup a fight where none exists. just create a patch or an issue describing your proposed enhancement and get some buy in. if a class makes sense, it is accepted.

umonkey’s picture

I think all this mess with db_create_table(), db_create_index() and so on is not much of an abstraction. I'd rather vote for just two functions: db_get_schema() and db_set_schema(), which would work with an array describing the structure of a table, with columns, indexes and whatever else, sort of like the form API.

Please argue.

Paul Natsuo Kishimoto’s picture

I'm not sure if I'm the only one who wasn't aware, but there's extensive discussion on the developer's mailing list relevant to this feature and DB abstraction in general.

Frando’s picture

This issue was created as a consequence of this discussion ;)

pwolanin’s picture

While Moshe's point that the issue should be restricted to an abstract definition for creating schema, I think there is a clear connection to this issue: http://drupal.org/node/79684

Abstracting the CRUD functions suggests to me that it would be very useful to be able to call/get a single array with the entire table definition for an object type so that the insert/update queries could be automatically built.

moshe weitzman’s picture

Dries gave a tentative approval for this proposal on the devel mail list. Nice!

FYI, CCK successfully uses a DDL abstraction layer like this already. It isna bit messy though. See http://cvs.drupal.org/viewcvs/drupal/contributions/modules/cck/content_a.... Perhaps just something to get ideas from.

yched’s picture

That's right. The DDL part is more specifically content_db_add_column and content_db_change_column.
It's not really that part of CCK that's messy - the messy part comes with content_alter_db_field, which is more like pre-FAPI3 stuff.

coreb’s picture

+1, subscribing

paranojik’s picture

+1, this will make life so much easier. I think db_change_column($tablename, $colname, $newdef) is needed, too. Also take a look at cck's date field (date.inc). Maybe you can think of a way to resolve that date_sql() stuff.

Shiny’s picture

+subscribing

if this can be similar vocabulary to the forms api, that'll make the learning curve / uptake for module developers easier.

I'd be keen to see:


  'required => true,
  'default_value'=> CURRENT_TIMESTAMP,
  'unique' => true,

and some way to do foreign keys.. even if it's a second function call. after table creation.

flk’s picture

This seems quite a nice idea to me, but i am bit wary of the drop table....

scenario:
user has somehow messed up a modules code/working and wants to reinstall the module.
module makes use of db_drop_table......oops where has all the data gone?.

I know this question might not belong here, but what kinda safeguard are we putting in place against such situations?

Frando’s picture

Assigned: Frando » Unassigned
Status: Needs review » Closed (duplicate)

I reworked my initial patch - see http://drupal.org/node/136171.
I'd say let's continue the discussion on the new (and IMO much better) approach.
Marking as duplicate.