So, MySQL can live happy without LOWER() because on default it's case insensitive. PgSQL on the other hand, is happier with LOWER() because though it is case sensitive, it can index on expressions. So, provide a way to replace queries based on the database type. The performance impact should be very small for non-mysql and for mysql, ah well, the patch does not affect mysql at all :)

When this install hook is in, then we can remove LOWER from taxonomy, user etc. and put the needed replacements into .install and add index to postgresql.

CommentFileSizeAuthor
db_query_replacement.patch4 KBchx

Comments

hswong3i’s picture

So, what is the benefit of maintaining database specific implementation, where an universal solution is also available? This just simply increase the code complexity, on the other hand increase the difficulty of maintenance.

Back to this issue itself, what is our business about database abstraction layer, after a hard feature freeze on early Sept.? We even have D6 beta1 and beta2, which already released for almost 1 month. We can do far more improvement about database abstraction layer in D7 with PHP5.2 + PDO, is this the most suitable timing for database abstraction enhancement?

Moreover, what is the benefit of isolating PostgreSQL prepare handling in such hurry? Prepare statement should become a standard with the help of PDO, even there is no meaning for MySQL as it don't really support for query prepare (http://drupal.org/node/134580):

MySQL, being the great software that it is, doesn't support the query cache for prepared statements. Why? I don't know. However, PDO's original author recommends just using the PDO prepared statement emulation so the query cache still works. I am going to work on the assumption that he knows what he's talking about unless someone proves me otherwise. :-) That should still get us a decent performance boost just by eliminating the regexing.

chx’s picture

This so has nothing to with prepared statements. I might have chosen a bad name. This prepares the query in a Drupal sense as documented in the patch already.

chx’s picture

Also please note that if you have ported module X over then you do not need to touch that -- this is optional hook, on install time, if you do not implement, no problems. There is no API change at all unless you happen to write a database driver or similar -- the db_prepare_query is as public facing API as db_prefix_tables was. Hell, I could even roll this replacement into db_prefix_tables if we so want.

So the win, again: clean code in our module solving the LOWER problem with no DB schema change. There is no other solution offering that.

kbahey’s picture

Finally some sanity and creative solutions!

As we can all see in #83738, the use of LOWER in SQL is slow, and MySQL has no problem overcoming that, because it is case insensitive. For PostgreSQL, the = operator is case sensitive. Just replacing LIKE by ILIKE will take care of this WITHOUT any new columns, without a migration path, without denormalization, without more complexity.

Only nit is the use of "prepare", since it has a specific connotation in databases. Perhaps db_engine_specific_query() or something like that?

If you think of it, 100% database abstraction is only possible by using the lowest common denominator. To really use an engine effectively, one has to use DB specific features often. This hook will allow us to do that, and will allow more engines to be accommodated in the future too.

david strauss’s picture

I still prefer #83738. The patch offered here moves the burden of DB portability into the modules. Such an approach is no better than using a switch statement to select your SQL based on the database system, except this method is harder to read and maintain because it hides the Postgres versions in a sort of translation table.

Moreover, this patch solidifies MySQL as the only database system with first-class support in Drupal. That may be true de facto right now, but I'd rather not reinforce that ranking, especially as Postgres continues to improve and challenge MySQL's assumed performance lead.

chx’s picture

David, the 2nd class citizen status of PostgreSQL is not a technology problem but a human resources one http://drupal4hu.com/node/64 and that will not change overnight. I feel rather it will get worse as MySQL continues to improve and get all the features previously only pgsql had. I would say that this human resources unbalance will stay for at least 1-2 years thus for this cycle we are good.

This solution is better than putting a switch on db_type in the module because if someone writes an Oracle driver then the Oracle support module can inject its replacements without patching the modules.

david strauss’s picture

These DB replacements will be especially fun once db_rewrite_sql() processes the queries. Note that db_rewrite_sql() always runs before db_query() and the the fancy translation logic. One possible solution for the wild unpredictability this introduces is to run the replacement logic in db_rewrite_sql() before the substitutions happen, but then you have to account for SQL that hasn't passed through db_rewrite_sql().

"This solution is better than putting a switch on db_type in the module because if someone writes an Oracle driver then the Oracle support module can inject its replacements without patching the modules."

Then, when a module changes a single character in the query you so carefully mapped, the system breaks without any warning. At least maintaining a patch to a module warns you when there's a conflict applying your patch to an updated version.

chx’s picture

We can separate the replacement from prefixing and run it from db_rewrite_sql and the rewritten queries surely won't be in the mapping so the second run does not hurt anyone. If you develop, then you can adjust your mapping, too. There are other things that are a bit more cumbersome for developers than ordinary users.

chx’s picture

Status: Needs review » Closed (won't fix)

[davidstrauss] chx: Also, one more solution. I think you'll like this one. User searches make two passes. The first looks with "=" and the second uses the current, slow LOWER() method. The first query will *always* find the user in MySQL, but the fallback would be elegant and DB-independent.

ivansb@drupal.org’s picture

Priority: Critical » Minor

Such kind of line won't surely improve the situation:

if (substr($db_type, 0, 5) != 'mysql' && ($replacements = module_invoke($module, 'db_query_replacements', $db_type)) {

what about something like:

if (substr($db_type, 0, 11) != 'myveryowndb' && ($replacements = module_invoke($module, 'db_query_replacements', $db_type)) {

can't you see it won't lower the bar for anyone other than mysql developers? This is an excuse to write not portable code (or with huge performance impact on other db) for all mysql developers. After all everyone else can still use the backdoor... isn't it?[*]
Then we see resistence for absolutely "eco-sustainable" patches (just the cost of an "empty" function call in mysql case) to support Oracle, MS SQL and DB2 (LOB). But no shame trying to push a mysql centric patch.

This kind of solutions aren't going to improve the design and prepare a path to real DB agnosticism.

Furthermore it is not a matter of mysql vs. pgsql. It is a problem of DB abstraction that is not A vs. everything else. And there will always exists, at least this is my hope, more than one DB, not just mysql conquering the world vs. everything else and we are not a DB vendor trying to push one DB in particular. If you ever had the chance to integrate Drupal with some other backend that's not just a web site you'd know there are people that can't use MySQL or wouldn't use it even with a long stick.
I'm perfectly aware that MySQL is the DB for the masses, for "web developement" as "everybody" know it. But well not everybody is "everybody" and I wouldn't underestimate the chances of seeing Drupal in the so called "enterprise" world were there are very few chances our "competitor" will reach us.
Anyway your code reflect your view much more than everything else you wrote on the subject.

This is a technical problem:

  1. It is common interest to support more than one DB, and this is an architectural choiche that is deeper than an if != mysql
  2. There is no interest because supporting more than one DB doesn't pay off

Pretending to chose 1) but doing everything in the direction of 2) is an architectural suicide and a waste of the already few resources trying to follow 1) [*].
The schema api does look to be in a much better direction.

If you ever read anything about SPI design you'd know that thinking in term of 2 "plug-ins" is evil. How can it be better to think in terms of support of 1.5 DB?
This kind of aproach will come to hunt you at night once you'll really have to port drupal to more than one DB.
There is no quick solution to DB abstraction.

People that really took the effort to port Drupal to other DB should be the reference for further developement. Some of those people actually ported Drupal to more than 1.5 DB and they really know the issues you'll encounter if you plan for 1.5 DB.

david strauss’s picture

ivanSB: Dude, "won't fix" means this issue is already dead.