Closed (won't fix)
Project:
Drupal core
Version:
6.x-dev
Component:
database system
Priority:
Critical
Category:
Feature request
Assigned:
Unassigned
Reporter:
Created:
14 Aug 2007 at 18:51 UTC
Updated:
28 Jul 2010 at 11:06 UTC
Jump to comment: Most recent file
Comments
Comment #1
hswong3i commented@killes: it is also a critical issue for using SUBSTRING: for both Oracle and DB2, it is called as SUBSTR. BTW, comment.module is the only place that we use SUBSTRING, among all core module. maybe we should consider re-factor its programming logic, and implement this sorting by other method?
Comment #2
hswong3i commentedaccording to indeed studying about ADOdb, i found that this is something can be overcome by using database specific abstract function. e.g. we can provide a function call as
db_substr(), which will return a string for substring function, according to database specific syntax (MySQL: substr() or substring(); Oracle: substr(); MSSQL: substring()). please take these as reference (http://phplens.com/lens/adodb/docs-adodb.htm#substr and http://phplens.com/lens/adodb/tips_portable_sql.htm):and so i will postpone this issue into D7. BTW, i will also include this research result in http://drupal.org/node/172541 as D7 database abstract layer technical preview, so we will able to judge its needs by using code, testing and benchmarking, on the other hand correct this idea as soon as we can. i don't hope to let this become another pity issue for D7 ;p
Comment #3
chx commentedCore works fine, there is no issue here.
Comment #4
hswong3i commented@chx: once and once again: are you core committer? please don't close or set others issue as won't fix as you like, it is not your duty :(
Comment #5
chx commentedMight not be my duty but I do close inadequate patches wherever i see them.
Comment #6
hswong3i commented@chx: please don't be too childish: the need of this patch is listed in my previous research remark (http://edin.no-ip.com/html/?q=node/310, http://phplens.com/lens/adodb/docs-adodb.htm#substr), and proved as function within my latest D7 database abstraction layer research progress result (http://drupal.org/node/172541#comment-305267). please feel free to comment about the correctness of my research result; BTW, striking such little footnote won't help about you overall propose (http://www.drupal4hu.com/node/64)
Comment #7
pasqualleWould not be it easier to write 2 stored procedures, substring and length for oracle, db2, mssql (or any other) databases?
Do you really plan to rewrite all modules like this?
Comment #8
hswong3i commented@Pasqualle: i guess i am not proposing something new other than existing ADOdb research result. as they have researched for more than 7 years, with totally 19 different databases supporting. their research result should be very solid.
on the other hand, before i notice about the handling of ADOdb, i try to implement a pre-defined random function for oracle driver once before (integrated with progress of http://drupal.org/node/39260): it is totally functioning:
BTW, isn't this the BEST handling? seems not to be, when you compare it with ADOdb implementation for oracle:
and let's have a look about pgsql implementation, too:
according to ADOdb's suggestion (http://phplens.com/lens/adodb/tips_portable_sql.htm): the numbers of additional abstraction function are limited. according to my studying on their different backend handling, i found that most of them are very reasonable and meaningful, and easy to implement :)
finally, we are not always using those special and specific SQL level functions: there is only around 5~7 lines existing within our latest drupal-6.x.dev CVS HEAD. as we are talking about a huge project as drupal core, you may simply guess about how many lines should be changed within contribution modules :)
Comment #9
pasqualleOk, I searched another php application for reference. I know phpbb support many databases.
so phpbb code snippet
length (only one occurence):
and substring is not used anywhere
I am now convinced, that this issue should be fixed if we want to support more databases.
Comment #10
hswong3i commented@Pasqualle: thanks for your help about exploration on phpBB, and your positive supporting :)
According to ADOdb's document (http://phplens.com/lens/adodb/tips_portable_sql.htm), I try to summarize those database specific special SQL functions which required for abstraction:
DB_CONCAT_OPERATOR(defined constant)DB_LENGTH(defined constant)DB_UPPER(defined constant)DB_RANDOM(defined constant)DB_SUBSTR(defined constant)db_concat()(function wrapper)db_if_null()(function wrapper)Some of them are really unusual, e.g. DB_UPPER and db_if_null(). BTW, as we are now fully implement those abstract functions which are required to be abstract, based on cross database compatibility concern (but not only based on today's needs, which means: trial and error), our developers may fully utilize most powerful database level functions as possibile. This also means that: increase our database abstraction flexibility and encourage our developers creativity.
Progress will integrate with my D7 database abstraction layer technical preview (http://drupal.org/node/172541), which will not duplicate in this issue once again.
P.S. need not to mention, time related handling is also a huge topic for database abstraction: ADOdb provide totally 6 different function wrappers for abstracting those differences (http://phplens.com/lens/adodb/tips_portable_sql.htm). BTW, we shouldn't require for these abstraction, as Drupal always use PHP-level date and time handling, which is very powerful and standardized, and database independence :)
Comment #11
hswong3i commentedJust as example, the following is code snippet from D6 OCI8 implementation, which demo how abstraction is needed for portable code:
Comment #12
hswong3i commentedComment #13
chx commentedComment #14
chx commentedAs Siren is not a project on Drupal.org but an unofficial fork, I am removing it from issue titles to avoid confusion.
Comment #15
hswong3i commentedThis is just a very simple abstraction: replace most
LENGTHandSUBSTRING, intoDB_STRLENandDB_SUBSTR. On the other hand, I also collect some other important abstraction, based on above research founding, and progress of my personal research project Siren.Patch via latest CVS HEAD, and test with both MySQL and PostgreSQL.
Comment #16
Crell commentedFunction abstraction is something that would need to happen as part of the new query builders. This approach is therefore no longer applicable.
Comment #17
hswong3i commentedThanks for TNG DB layer with OOP style, we can now clone ADOdb handling in more ideal implementation. As mentioned in http://drupal.org/node/167335#comment-636082, these abstraction is very useful for other DB driver development.
This patch handling the following abstraction: quote identifier (e.g.
"), concatenate operator (e.g.+), uppercase function, random function, string length function, substring operator, string concatenate function and if NULL benching. Each DB driver will override their required change, where a friendly wrapper function is provided as like as escapeTable => db_escape_table(). 3 set of examples are available (comment.module, locale.module and statistics.admin.inc), too.Patch via CVS HEAD, tested with MySQL and PostgreSQL. MySQL work fine with either case, where PostgreSQL will buggy for: 1. add comment (but thread view work fine); 2. access top visitors page.
P.S. This patch is mainly target for the missing abstraction which is now missing from TNG DB layer, and try to minimize change to examples if possible. TNG DB syntax change is not its main concern, where we may handle afterward.
Comment #18
hswong3i commented@Crell: Do we have any schedule for fixing this issue within our new query builders? Because we may use those abstraction all around ANY type of queries and ANYWHERE within SQL, I would like to have your suggestion about how to work it out...
Comment #19
Crell commentedThe approach I think would make mo sense is to have alternate methods on the SELECT builder that parallel addField(). Vis, addSubstring($table, $field, $alias, $start, $length), or something like that. That may make more sense as an extender... but actually the current extender architecture is not DB-specific, either. Hrm....
Comment #20
hswong3i commentedThe idea of abstract with SELECT builder is good, but I would like to give some love for RAW SQL syntax, too. Most likely we don't need to use SELECT builder for every cases because it is coming with performance degrade (for sure...), some simple abstraction may still needed for direct call.
Just for brainstorming: maybe we can handle this issue in 3 different levels?
Any idea?
Comment #21
chx commentedThere is the mapConditionOperator facility already so that you can map, say 'CONCAT' to anything you want. Now, if the addField method would support conditions then you would be golden and it would be quite easy I guess. We might want to rethink the name but it'd be MUCH simpler than this here.
Edit: no raw SQL love, thanks. String manipulation as database abstraction have failed in the past, it's ugly etc.
Comment #22
hswong3i commentedImplement within db_select() can simplify some abstraction, but not always truth. E.g. How to implement the
RAND(),UPPER, etc? Even theifNll()is not in the same format as above. Don't forget that we are still missing date related abstraction within DBTNG, when compare with that of ADOdb. There are too many variation and so we may need to implement it as isolated for more flexibility.On the other hand, I am agree with catch's previous comment for other issue that "around 95% of SELECT queries will still use db_query()". It is not simple for revamp everything with query builder (and not really needed for), RAW SQL will always work around with us. We hate it, but we can't ignore the needs of it :S
Comment #23
hswong3i commentedPatch enhance with:
IFasdb_if_null(). Pass MySQL simpletest.Besides this patch, we only need to handle legacy PostgreSQL unsigned integer with another issue, and so legacy PostgreSQL abstraction will be cleaned up :D
Comment #24
chx commentedYou again put your foot on the ground and disregard what others say. This lead you being disliked in the first place.
AND s.version = '%s' AND " . db_strlen() . "(s.source) < 75",is not something that can get in core, period.Comment #25
hswong3i commented@chx: agree that patch from #23 is not perfect; BTW, at least they are now working, and able to help the code cleanup of legacy PostgreSQL implementation, therefore simplify cross database maintenance effort.
Maybe we can first let this issue go into core for RAW SQL direct call, and further abstract them within query builders for more elegant look-and-feel with another issue?
Comment #26
chx commentedIt's so ugly and it's hard and unnatural to use. I can't say this is going to work at all. DBTNG works very hard to avoid SQL string manipulations.
Comment #27
hswong3i commentedMay I have some hints about the suggested API look-and-feel? I am still studying our new query builders and so don't have clear idea about how it should look like :-S
On the other hand, I just discover that we give a revamp of DBTNG syntax for includes/session.inc (line 191) as:
So
COUNTis used as SQL string manipulation... If patch from #23 get committed, we will have code as:Maybe we can also revamp db_strtoupper(), db_strlen() and db_substr() as function call, so with similar coding style? E.g.
Any idea?
Comment #28
chx commentedI do not know what are you talking of. Head contains
COUNT(sid)notCOUNT([sid])to begin with. The [] is your idea and it was never committed and I do not recommend trying to sneak it in here. I am trying to lend you a peaceful hand but if you downright lie about what's in core, I might lose my patience. See, COUNT is not string manipulation.COUNT(sid)is just a piece of a string constant, nothign is done with it. Nothing is appended and later on it's not parsed.db_strlen() . 'sid'are two strings concatenated. Now,db_count('sid')looks a lot better. Avoid the.in your API and you shall conquer.Comment #29
hswong3i commentedchx: Sorry for the
COUNT([sid])syntax. It is a mistyping, as I copy it directly from my own CVS (including patch from #371: resolve ANSI SQL-92/99/2003 reserved words conflict in query statements). Please forget it, and I have remove that[]from the above example :SI give a try for the revamp. Most abstraction is now handle with function call but not string manipulation. Detail:
One exception. comment.module coming with following SQL:
In order to rewrite with db_strlen() and db_substr() we should use the following syntax:
It is difficult for remove the
. ' - 1'string manipulation, because we are now having string replacement for SQL syntax, but not real integer operation. In case of DBTNG syntax, it should looks like:Comment #30
hswong3i commentedOoops... Patch reroll via CVS HEAD. Update sequence number for update in system.install.
Comment #31
chx commentedI will catch Crell and discuss because this is something that is worthy of discussion. The outstanind issues:
addFieldmethod and I think if we provide an alias then it'll work, there is nothing that enforces the field to be a real field, so computed ones seems a go.Comment #32
Crell commentedWhy do people insist on using addField()? addField is specifically for actual literal fields. For arbitrary values in the SELECT clause, use addExpression(). That's what it's there for.
I don't like using a public property for these values. It's much more stable to just always use a method, especially when the function wrappers should always be optional, and you should be able to do everything you need just with direct methods on the connection object. E.g.:
Note that this allows a driver to completely change how the length function works (maybe there's some braindead database out there that puts it after the parens), makes it work equally well direct and through the wrapper function, and supports targets of different databases. (There's no reason why different targets on the same connection key have to be the same database type. In fact, if we can ever get SQLite working they may well not be.)
The docblocks need work, as it looks from their current text like what's returned is, for example, either a value or NULL, instead of an SQL fragment that will perform that logic in the database.
"Discontinuous some legacy PostgreSQL custom function abstraction." -> I think you mean "Discontinue some legacy..."
I'm undecided on how we want to handle this for dynamic queries.
Comment #33
hswong3i commentedPatch revamp based on above suggestion, including:
Patch tested with MySQL simpletest and looks good. For the docblocks I would like to have some help from other... My English is too poor...
P.S. I keep class internal method names as sync with ADOdb version (http://phplens.com/lens/adodb/tips_portable_sql.htm#native) in order to simplify debug and code trace.
Comment #34
chx commentedThere is one more thing to consider here. pgsql and sqlite both supports user defined functions (and now mysql too) -- can we say "won't fix" and fix it in the database engline?
Comment #35
Crell commentedThe caveat there is that we then need to implement such functions, as well as standardize on whose SQL flavor we're going to use. In most cases that can be ANSI if available, but you know as well as I do that most of our developers only know from MySQL. So do we give everything a MySQL compatibility layer? What about databases that don't support user defined functions?
(I've not actually tried writing one in any language.)
What about foreign databases that are not the main Drupal DB?
Comment #36
hswong3i commentedIt is not easy to judge if we should implement in the database engine, because we need to abstract a simple native MySQL SQL function, with numbers of PostgreSQL custom SQL functions.
For example, MySQL's
IF()support the following syntax (http://dev.mysql.com/doc/refman/5.0/en/if-statement.html):It is too complicated if we hope to implement such logic in perfect, with PostgreSQL custom SQL functions. That's why we just implement 2 interfaces of
IF()within Drupal's PostgreSQL driver.BTW, a simple abstraction in PHP as string manipulation is much flexible and complete.
Comment #37
chx commenteda simple abstraction in PHP -- mgiht be. string manipulation -- no.
Comment #38
hswong3i commentedSome demonstration for combine use with DBTNG syntax (code snippet from comment.module):
SUBSTRINGandLENGTH(and so other additional abstractions) are able to use withinSELECT ... FROMorWHERE ..., so as flexible asANDandOR. Therefore the implementation of db_substr() and db_strlen() should be similar as db_and() and db_or(): able called within condition() or addExpression() (i.e. we shouldn't implement as$query->strlen()syntax).Patch reroll, with some SQL revamp with DBTNG syntax. For those I don't know how to handle, I give a remark with
@todo: Revamp with DBTNG syntax.Test with MySQL + simpletest, all related tests pass.Comment #39
Anonymous (not verified) commentedThe last submitted patch failed testing.
Comment #40
hswong3i commentedReroll based on comment.module update.
Crell and chx: should we have some more discussion about this issue? I am really looking for this feature build into Drupal core ;-)
Comment #42
hswong3i commentedReroll via CVS HEAD.
Comment #44
chx commentedOnce again, how will you get the core / contrib developers to use something this ugly? What's the benefit to them if they wont use Oracle and they wont? I told you to use this question and there is no answer to this. I still think that either the database engine should solve this like postgresql does in HEAD or you should write operator-like support for the SELECT builder (which is mostly what Crell said in #19). What are you asking from us? Why are you asking us when all is told over and over and over and you never ever listen?
Comment #45
catchdb_query is for running static SQL strings. Especially now that we have the dynamic query builder.
db_placeholders() is on its way out too.
By encouraging users to concatenate strings and function calls, we open up inexperienced developers to start sticking implode($array_of_unescaped_strings) and other nastiness inside their SQL statements, this has far worse consequences than lacking some cross compatibility which can be dealt with in drivers anyway. Like chx and Crell have already said, this might not be out of place in db_select(), but in its current state it would cause many many more problems than it solves - and once more these are issues that can be dealt with in drivers.
Comment #46
hswong3i commentedThis is not a good approach because we always implement a subset of original SQL functions only. E.g.
substringin case of MySQL can be written as:SUBSTRING(str,pos), SUBSTRING(str FROM pos), SUBSTRING(str,pos,len), SUBSTRING(str FROM pos FOR len). Abstract syntax variation with manual SQL function will result as a trim of functionality, or a complicated and duplicated handling.BTW, the different of syntax (substr vs. substring) is the only variation that we must duel with. Therefore abstract with a simple function call should be the most direct approach. It is not about "benefit" to all/any developers, but how to solve the problem with the most simplest and elegant format. I am not encourage/discourage the use of these SQL functions or MySQL/Oracle, but trying to provide options for developers when they do required for.
In case of MySQL, the following 3 queries are both in correct syntax:
If we implement with SELECT builder, we will need to handle similar abstraction for both addExpression(), condition() and orderBy(). For sure that is duplicated. SUBSTR(), LENGTH(), RAND(), ifnull(), etc are generic operator/operations which much similar as AND and OR, but not similar as COUNT() (i.e. addExpression()) or LIKE (i.e. condition()).
This is not simple, too. E.g. comment module is now using SUBSTR() and LENGTH() together, so implement with addSubstring() will not able to cascade with LENGTH(). We need something more flexible.
Comment #47
josh waihi commentedSubscribing to this. Casting has also become an issue as different databases support different casting options. Abstraction in the long term is surely going to be better right? leaving the database driver to structure the SQL rather than the developer which will have less of these compatibility problems. #419858: Database Type Casting Per Database Connection is my suggestion for implementing casting in abstraction.419858
Comment #48
hass commented+
Comment #49
Crell commentedComment #50
andypost.
Comment #51
damien tournoud commentedClosing this old issue. Abstraction in the new database layer is delt with in very different terms, and the string construction proposed here is not very elegant.