Right now, Schema API has a few problems.

1) The API for table creation and modification is completely different than the rest of the DB layer, because it's still wacky functions with big arrays rather than fluent method chaining.

2) The API for table alteration doesn't allow multiple alter commands in one query, which in some cases is totally nuts for performance, especially when creating or manipulating indexes. It also leads to some crazy footwork to get around not having an index in places, such as: http://drupal.org/node/159329

So the solution is to make Schema API suck less. I see two possible approaches. (Neither of these affect hook_schema(). That's a layer above all of this.)

1) Introduce DDL query builders, along these lines:

db_alter('tablename')
  ->addIndex('foo')
  ->addField('bar')
  ->addField('blob')
  ->dropField('baz')
  ->execute();

db_table('tablename')
  ->addField('bar')
  ->addIndex('baz')
  ->addField('foo')
  ->execute();

Pros:
- API is similar to the other query builders we have.
- Probably easier to implement than option 2.

Cons:
- Still a query-based approach rather than a DB-state-based approach.

2) Fully objectify Schema API with classes/objects to represent tables and fields.


$schema = $connection->schema();
$table = $schema->getTable('tablename');
$field = new DBField('newfieldname');
$field->setType('blah');
$table->addField($field)l;
$table->dropIndex('bar');
$table->save();

$table = $schema->createTable('tablename');
$table->addField($field);
$array = $table->getArrayVersionLikeHookSchemaUses();
$sql = $table->getSqlCreateStatement();
$table->save();

Pros:
- Richer, more complete API.
- More like other similar systems (I think).

Cons:
- Different than other parts of the DB API at the moment, but less different than it is now.
- More effort required to implement.

Thoughts?

Comments

Crell’s picture

Tagging.

moshe weitzman’s picture

Just by reading those two examples, I think most will agree that #1 has better DX. #2 requires more brainpower and referring to docs. In either case, this could be a very big performance win. InnoDB and Sqlite are very very slow with ALTER TABLE operations and batching them like this is a essential.

Crell’s picture

Note that we certainly could make #2 mostly fluent, and I would rather insist on it. :-) I didn't chain anything for example purposes, but it could still be largely chainable.

bjaspan’s picture

I think both options represent more engineering than is needed to solve Moshe's actual problem: that you cannot perform multiple table alterations in a single query. Larry's suggestion #1 for DDL query builders really boils down to an OO interface to modify a schema structure and invoke the "do what is necessary to make the database table look like this schema structure" function. We always knew we'd want such a function, we just left if out of the initial D6 patch.

So, we could just write that function (per db driver), which the DDL query builder would need anyway, instead of completly redoing Schema API. Or, at least, we could just write the function first and let Larry spend his life on another big API rewrite if he really wants to. :-)

Crell’s picture

No, the DDL query builder (option 1) would not involve a "make the database table look like this somehow" method. (There are no functions in DBTNG that are not just convenience wrappers for methods.) It would still be up to the developer to write the DDL commands for the query builder that did what he wanted to do, just as it's up to the developer to not send nonsensical configuration to SelectQuery.

Method 2 *looks* like that's what it's doing, but it would actually just queue up the changes you direct and issue a similar query to what #1 generates.

In neither case am I looking at the mythical "make the schema look like what hook_schema is now, kthxbye" operation. Either of these methods would be a prerequisite for that.

dgv’s picture

Comment #2 from moshe talks about a performance win, but how would the generated SQL be different between this:

db_alter('tablename')
  ->addIndex('foo')
  ->addField('bar')
  ->execute();

and that:

db_alter('tablename')->addIndex('foo')->execute();
db_alter('tablename')->addField('bar')->execute();

And in what way would it make any difference in performance?

Crell’s picture

@dgv: SQL has the ability to say "ALTER TABLE foo CHANGE X, CHANGE Y, CHANGE Z", and then figure out how to make all of those changes in parallel. If you make it three separate statements, it has to lock the table, alter it, and update all indexes each time separately. There are also some changes, like changing primary keys, that break the table if not done in parallel. Right now we kinda hack around that, but that's not a good solution.

Think of it as the difference between "UPDATE foo SET a=1, b=2 WHERE c=3" and "UPDATE foo SET a=1 WHERE c=3", followed by "UPDATE foo SET b=2 WHERE c=3".

webchick’s picture

subscribe. No time to look at this now.

moshe weitzman’s picture

I read that indexes are not rebuilt until a transaction is finished. Would it make sense to do all of update.php inside a transaction?

bjaspan’s picture

re #9: That would allow update.php to rollback all updates if any fail...

Crell’s picture

MySQL doesn't support transactional DDL, so it commits immediately when you muck about with the schema.

I still don't see us reaching any consensus on the approach to take. We kinda need to decide on that, and if we don't then I'll just decide when I get around to writing it and y'all can suck it up. :-)

dave reid’s picture

Issue tags: +Schema API

Interesting Crell. Adding another tag.

neilnz’s picture

I would love to help and have some ideas around making schema API more robust cross-database (eg. column quoting).

Subscribing for now, but I'll be back.

josh waihi’s picture

subscribing

Crell’s picture

Issue tags: +Needs committer feedback

Adding a new tag as discussed with webchick.

Nick Lewis’s picture

I like #2. Actually, the first example makes no sense to me when I start thinking about how I'd go about setting a fields attributes.

Is this how you'd image actually doing something useful in the first example? (not a rhetorical question)

db_alter('tablename')
  ->addField('bar')
  // i'm guessing that $this does not refer to the field "bar", but to the table named "tablename"?
  ->getField('bar')
  // so I must get the field before I set type ? 
  ->setType('int')
  // when I want to get back to the table, will I do a jquery end() like thingy?
  ->end() 
  // now we are dealing with a table again?
  ->addIndex('foo')
  ->execute();

The first example seems fraught with peril (assuming its possible). Or would we be doing what we always do:

db_alter('tablename')
   // false refers to being unsigned of course
  ->addField('bar', 'int', 'small', false)
// or even better
db_alter('tablename')
// how drupal is that?
  ->addField('bar', 'int', array('size' => 'small', 'unsigned' => false, 'default' => 0))

+1 for #2

double -1 for #1

Crell’s picture

I intended both as viable options. :-) In practice it could be db_table_alter() for the factory. Either way, the idea there would be to model closely to SQL's ALTER TABLE syntax, much the way the other query builders closely model INSERT INTO table_name, SELECT, and so forth.

So in model 1, to add a column to a table you'd do something like:

db_table_alter('mytable')
  ->addColumn('columnname', $info_array_about_column)
  ->execute();

And of course could chain multiple addColumn, removeColumn, renameColumn, addIndex(), and so on commands together just like you can in SQL.

In method 2, you would instead do:

$schema = $connection->schema();
$table = $schema->getTable('tablename');
$column = new DBColumn('newcolumnname');
$column->setType($some_information_about_the_column_here);
$table->addColumn($column)l;
$table->save();

So rather than building a command to run, you load an entity, modify it, and save it again.

Both are, I think, viable approaches. I suspect that #1 will actually be easier to write, but I'm not certain of that.

Nick Lewis’s picture

I edited my comment to make me sound like less of a jackass. :-D

So if in both examples we rely on using the same old drupal arrays, e.g.

db_table_alter('mytable')
  ->addColumn('columnname',array(
    'type' => 'varchar',
    'length' => 88,
    'not null' => true,
    'default' => 'Douglas',
  )),
  ->addColumn('columnname2',array(
    'type' => 'varchar',
    'length' => 88,
    'not null' => true,
    'default' => 'Wombat',
  )),
  ->alterColumn('whatever', array(
   'type' => 'int'
  ))
  ->execute();

If we describe a columns attributes as classic schema api arrays in both examples. then I take back my minus -1 on #1

Something that attracted me to the second was that I wasn't passing these arrays. Usually, my schema updates are pretty damn specific, and I liked the idea of calling a function "e.g. setType, setSize" (which my IDE brings up suggestions for) instead of remembering what array keys.

The second strikes me as being better, but i confess its probably because it reminds me of a lot non-drupal classes I've already worked with. I haven't had a good experience with PHP objects that chain yet...

I also confess I've altered more tables through the existing schema api then straight sql.

Edit
Actually, i'm in #1 camp now. My dream of variable hints in my IDE seems dead, so #1 is at least better than the existing way -- and #2 seems cumbersome in comparison.

Crell’s picture

Well arguably we could do a full set of modifier methods on column objects and table objects. I really haven't thought that far ahead on that approach. #1 would definitely use arrays somewhere at the very least.

Variable hints in the IDE require typed @param and @return statements. I'd like those too, but that's a separate issue. :-)

Drieschick, any input? Beuler?

catch’s picture

Version: 7.x-dev » 8.x-dev

No input, so this will have to wait until D8.

catch’s picture

Priority: Critical » Major
donquixote’s picture

Neither of these affect hook_schema()

Your remark just prevented another sidetracking :)
#1154334: "Fluid" syntax for hook_schema() (method chaining).

larowlan’s picture

subscribe

xjm’s picture

Category: task » feature

Discussed this with webchick in IRC. We agree this is more of a feature request.

webchick’s picture

Version: 8.x-dev » 9.x-dev

For 9.x, we might want to define tables in YAML. hook_schema() is a real PITA to write by hand.

catch’s picture

Priority: Major » Normal

Entity/Field API and 'backends responsible for own storage' make hook_schema() much less necessary, to the point where we could look at removing it entirely in 9.x. Downgrading.

danithaca’s picture

+1 for define tables in YAML! As a contrib module developer, my first instinct was to go to "mymodule/config/install/" to find the db schema yml, but it wasn't there. A temporary work-around solution for D8 is to define an API, such as drupal_schema_yml_to_array(), and then do:

function mymodule_schema() {
  // this function convert a YAML db definition to a hook_schema() array().
  // you could still use the conventional way of define db array in hook_schema().
  return drupal_schema_yml_to_array();
}

I would say use YAML to define data in ContentEntityBase::baseFieldDefinitions() as well. PHP is not Java and I don't see the point of using getters/settings to access data structure. In the "contact_entity_example" module, baseFieldDefinitions() took about 120 lines of code. It feels insane to me to write 120 lines of code just to define such a simple data structure. Even the old way of using hook_entity_info() looks cleaner.

A side note: We as a community should really try to win over developers through better DX. Recently I'm looking at Python-Django per request from a client. I was amazed how simple it was to define tables in django.

webchick’s picture

Version: 9.x-dev » 8.1.x-dev

Actually, unless I'm mistaken, I think this is the kind of thing we could do in 8.1.x, as long as we retain backwards compatibility for the old way of doing things.

webchick’s picture

Status: Active » Postponed

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.0-beta1 was released on March 2, 2016, which means new developments and disruptive changes should now be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.0-beta1 was released on August 3, 2016, which means new developments and disruptive changes should now be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

xjm’s picture

Category: Feature request » Task
Status: Postponed » Active
Issue tags: -Needs committer feedback

No longer needs to be postponed (nor currently in need of committer feedback).

Also more of a task than a feature request since it's about DX and API improvement.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

donquixote’s picture

Are there still any plans to define database table schemas in yaml instead of a hook?
It seems this would work fine for all schemas that are not dynamic, that is, that don't depend on site configuration.

Of course contrib modules can just call Yaml::parseFile() within their hook_schema(), but then every module would have to invent its own convention how to name and where to store this yml file.

catch’s picture

@donquixote there's nothing active, but that would be a good issue to open.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.