Observe that for mysql, db_encode_blob and db_escape_string does not really differ. This a big performance blow. Also we maintain some code duplicated that could live happily in database.mysql-common.inc .

Comments

chx’s picture

Why critical? Because for two releases now we suffer from http://drupal.org/node/10407#comment-56287 this and it's totally, absolutely unnecessary.

chx’s picture

StatusFileSize
new14.11 KB

Let's not forget temporary tables.

RobRoy’s picture

Applied patch #2 to fresh checkout, installed Drupal, created a Page node, and saw the entry in watchdog log. Looks gravy to me!

RobRoy’s picture

Status: Needs review » Reviewed & tested by the community

RTBC

chx’s picture

More notes: pgsql have not changed, that can't broken. #1 broke RobRoy's install so we know that temp tables are used.

gábor hojtsy’s picture

Status: Reviewed & tested by the community » Needs work

Wow, how many issues are involved in this one patch?

- I see some common stuff from mysqli and mysql .inc files are moved to their common files (not checked char-by-char whether they are modified)
- I see _db_query_callback() *moved* from database.inc to the pgsql.inc (which I honestly don't understand)
- the constant was moved there too

Also chx mentions the encoding functions which does not seem to be touched in this patch. I don't yet understand what is happening here, so I don't think this is ready.

chx’s picture

Status: Needs work » Reviewed & tested by the community

Heh. There is one issue in here, just I solved it elegantly so it seems a lot. So, I took db_query* and changed them to use sprintf for mysql. I am lazy so I did it only once. If you check char-by-char they are all modified. If you take a Drupal 4.6 (wow, old!) then you will see striking similarities though... I kicked all the preg_replace stuff over to pgsql because that one needs it.

dries’s picture

It's probably worth documenting this a little better in the code. It would also be comforting to see some test reports.

chx’s picture

StatusFileSize
new14.08 KB

I generated 50 users, 100 nodes and 500 comments. Ran ab -n100. Before patch:

Requests per second:    5.13 [#/sec] (mean)

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   185  194   9.1    193     232
Waiting:      177  186   9.0    185     224
Total:        185  194   9.1    193     232

After patch:

Requests per second:    5.25 [#/sec] (mean)
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   184  189   8.7    187     235
Waiting:      177  182   8.4    179     224
Total:        184  189   8.7    187     235

That's a 2% improvement. Might not be much but still something on a more busy page it'll be even more.

chx’s picture

Forgot to mention: patch is just a reroll against HEAD.

hswong3i’s picture

basically support this patch, as i also hope to move the _db_query_callback() into driver locally: it will don't even become such necessary when using PDO, since most of the placeholder will replace as "?", unless BLOB and CLOB handling. let's have look in following PDO's example:

<?php
$db = new PDO('odbc:SAMPLE', 'db2inst1', 'ibmdb2');
$stmt = $db->prepare("insert into images (id, contenttype, imagedata) values (?, ?, ?)");
$id = get_new_id(); // some function to allocate a new ID

// assume that we are running as part of a file upload form
// You can find more information in the PHP documentation

$fp = fopen($_FILES['file']['tmp_name'], 'rb');

$stmt->bindParam(1, $id);
$stmt->bindParam(2, $_FILES['file']['type']);
$stmt->bindParam(3, $fp, PDO::PARAM_LOB);

$db->beginTransaction();
$stmt->execute();
$db->commit();
?> 
<?php
$db = new PDO('oci:', 'scott', 'tiger');
$stmt = $db->prepare("insert into images (id, contenttype, imagedata) " .
  "VALUES (?, ?, EMPTY_BLOB()) RETURNING imagedata INTO ?");
$id = get_new_id(); // some function to allocate a new ID

// assume that we are running as part of a file upload form
// You can find more information in the PHP documentation

$fp = fopen($_FILES['file']['tmp_name'], 'rb');

$stmt->bindParam(1, $id);
$stmt->bindParam(2, $_FILES['file']['type']);
$stmt->bindParam(3, $fp, PDO::PARAM_LOB);

$stmt->beginTransaction();
$stmt->execute();
$stmt->commit();
?> 

BTW, i don't really think this a critical issue:

  1. code seems to be over simplify, and not look particularly simple or elegant to me
  2. i don't really support this as a solid result about performance improvement: according to my previous experience with benchmarking (http://edin.no-ip.com/html/?q=node/281, http://edin.no-ip.com/html/?q=node/282, http://edin.no-ip.com/html/?q=node/284 and http://edin.no-ip.com/html/?q=node/299), 2% is just simply something that within standard error range. it is usually cases dependent :(

if this patch is trying to simply the development for our next generation database abstract layer (as, with PDO and some more other databases supporting), it seems ok to me; if we are just trying to duel with such 2% improvement, i will suggest for a more indeed review about this patch in the future, may be D7 :)

hswong3i’s picture

Status: Reviewed & tested by the community » Needs work
chx’s picture

The code is straight from Drupal 4.6 . Core is quality enough IMO. I agree it's hard to benchmark but without a doubt preg_replace_callback in such an often ran call is not the fastest.

hswong3i’s picture

Based on the more indeed studying about how PDO works, I found that this patch SHOULDN'T be committed: we SHOULD even EXTENT the level of abstraction, but not REDUCE it. This can be discussed based on 3 main concerns: our PAST, PRESENT, and FUTURE:

Our PAST

Please don't seems our judge past effort as FOOL! It is for sure that we abstract this handling based on cross database compatibility concern. The idea should be something similar as: "This abstraction will become useful for our future database development, so let's abstract it!"; on the other hand, we accept this abstraction as it don't affect the overall performance. It show its needs and functionality, within both 4.7 and 5.0 for no question. We shouldn't rollback to something that we are already fully studied and tested as meaningful by reality.

Our PRESENT

According to the research result of Oracle/DB2/MSSQL development, we can found that our abstraction is something that not yet complete: we just keep our eyes on MySQL, and so loss our focus on the needs of other databases (or even judge that as a "bug" within other database system...) LOBs is something that need to be abstract by using this function callback and additional APIs (where we are still missing...). We should even consider about extending our abstraction level into the field of CLOB. It seems to be not a good idea if we are trying down grade our competence.

Our FUTURE

Our abstraction seems a bit fancy and bulky for PDO development: most placeholder will replaced as ? within our next generation database driver, and so '%s' will become something not suitable in this case.

From the point of view of a developer, it is NOT NECESSARY to enclose %s as '%s', within query level; we are able to abstract this within database driver level. This change can simplify the handling for our developers (and so, lesser buggy code...).

Moreover, the replacement for %s will be much simple when compare with '%s', in case of PDO driver. On the other hand, it will be much better if we can provide a function callback hook for both %d and %f: and so we can simply replace it as ? for PDO implementation, but able to keep the backward compatibility with existing mysql/mysqli/pgsql implementation.

I will try to implement a counter propose based on above ideas:

  1. replace all '%s' syntax into %s, among all core modules
  2. update all %s related handling within database driver
  3. add additional abstraction for both %d and %f

PDO + MySQL is something that not far from today. If we are able to prepare a suitable environment for our next generation driver, we may even able to have a "drop-in" PDO driver within our D6 life cycle :)

chx’s picture

Status: Needs work » Reviewed & tested by the community

Drupal 6 is closed for features for months, there is a beta in two days. Drupal 7 is not my concern here. This patch does not take away any abstraction capabilities -- all it does it makes MySQL code simpler and faster. Code executed on PgSQL does not change at all.

hswong3i’s picture

Status: Reviewed & tested by the community » Needs work

2% performance improvement with no solid benchmark result supporting is not my concern. as a database abstract layer developer, i am just trying not to isolate our MySQL implementation from the normal and standardized architecture, without trading significant performance, simplify its implementation, and always keep our door open for other database driver development. on the other hand, if the change can also benefit for our future, it should be more welcome. i feel that PDO + MySQL is not far from today, it is even able to be achieved within D6 life cycle as a "drop-in" option :)

chx’s picture

Status: Needs work » Reviewed & tested by the community

Sigh. I do not have yet NowPublic.com on six so I can't give you benchmarks enough. There is a benefit there is no doubt. And there is no further database driver development as far as Drupal 6 concerned as there is a feature freeze so that's simply not an argument. Drupal 7 can patch more.

So to sum up: there is less code to maintain, the mysql query functions are simpler and known quality (aka. straight from D4.6 core) and that's it.

If you want to judge this patch I recommend reading http://drupal.org/patch/review esp

No patch can save the world, not even a Drupal core patch. If it works and does something useful on its own, then it is good to go. [...] Additional features can be added later.

Please do not judge the patch with pie-in-sky arguments.

hswong3i’s picture

Status: Reviewed & tested by the community » Needs review
StatusFileSize
new2.69 KB

Take a look at both the big picture and the details.

Don't stop reviewing at the first sign of trouble.

the change from 4.6 -> 4.7 was mainly due to cross database compatibility concern; it is keep on functioning perfectly within 4.7 and 5.0; but you are now trying to re-roll it back into our old day, on the other hand without any significant performance boost up. you are now wasting our existing research progress, and blocking our path for the future. sorry that its standing point is not strong enough. we won't add something that is meaningless into core, and so that's also about removing it :)

on the other hand, existing implementation is also keep on running perfectly within the development cycle of D6. it seems not a suitable timing for committing this patch in such hurry, as it didn't cause any problem (on the other hand, change without any significant benefit...). i will suggest to postpone this discussion into D7, as that day will comes soon. any idea?

P.S. sorry that it is a bit difficult for me to agree about it as RTBC: i just can agree it as "patch (code need review)" (at most...). sorry that i will not change its status any more, as i have no comment other than above... hope some other else will agree it as RTBC, but that shouldn't be you, or me :)

Crell’s picture

hswong: PDO is not a consideration in Drupal 6. While it's theoretically possible that a MySQL-PDO backend could be written for our current API (I wrote most of one last spring), my testing showed that there was not really any benefit to doing so at this point. If chx patch makes the current MySQL driver in Drupal 6 faster and doesn't break anything, then it is Good(tm). When we move to PDO in Drupal 7 nearly the entire database backend will be changing. Forward-compatibility inside the engine with a system that is not even fully written yet is not something we should worry about, especially 2 months after code freeze.

I don't have time at the moment to test chx's patch myself, but if it's RTBC aside from concerns for PDO-compatibility, then it is RTBC as PDO-compatibility in D6 is a non-issue.

chx’s picture

Status: Needs review » Reviewed & tested by the community

The patch was tested earlier in the issue before all this what-could-be followups happened. Also I am not taking away any possibilites your precious %b stays, do whatever you need to do with them.

hswong3i’s picture

i just just asking for answer about:

  1. where is the performance boost? please show by benchmark
  2. why roll back to our old implementation? if it is meaningless, why we need it during 4.7 and 5.0?
  3. why not consider about the future, when we are able to consider?
  4. why we need this patch in such hurry? just 4 days before beta release? why not propose it within last 2 month, if it is really a critical issue?
  5. why not consider about postpone it into D7?

please don't just telling me "it is able to do so without error": we can always do something without error, but short sight :(

chx’s picture

where is the performance boost? please show by benchmark

I did a benchmark and linked to the original issue which also showed a performance loose.

why roll back to our old implementation? if it is meaningless, why we need it during 4.7 and 5.0?

Because noone payed attention.

why not consider about the future, when we are able to consider?

Because nothing is set in stone. We can change anything later.

why we need this patch in such hurry? just 4 days before beta release? why not propose it within last 2 month, if it is really a critical issue?

Because there is a beta coming and not likely to get in after beta.

why not consider about postpone it into D7?

Why would I? This change is confined within db_query so programmers will see no change. Drupal 7 is going to be very different.

hswong3i’s picture

Q: where is the performance boost? please show by benchmark
A: I did a benchmark and linked to the original issue which also showed a performance loose.

May be I am miss understanding about the example... this patch (http://drupal.org/node/10407#comment-56279) seems talking about "abstracting _db_query_callback() but not remove it"; the benchmarking (http://drupal.org/node/10407#comment-56287) seems talking about "patched with abstracting _db_query_callback() result in performance boost up", but not the case about opposing it... am I correct?

Q: why roll back to our old implementation? if it is meaningless, why we need it during 4.7 and 5.0?
A: Because noone payed attention.

I guess the main point of moving this abstraction internally due to its meaningless for MySQL. You seems answering "because no one payed attention" about its meaningless, but i am asking about "why you are judging this abstraction as meaningless? if it is meaningless, why we need it during 4.7 and 5.0?". Seems off topic...

Q: why not consider about the future, when we are able to consider?
A: Because nothing is set in stone. We can change anything later.

I am now working for next generation of our database driver (PDO, with code implementation). Nothing is set in stone (surly, as I am working for future), but why do we need to push the case into something meaningless and duplicated (according to current and existing research result), and further more re-roll it back into a general and standardized framework within our next generation? This is something that I am talking about "short sight" :(

Q: why we need this patch in such hurry? just 4 days before beta release? why not propose it within last 2 month, if it is really a critical issue?
A: Because there is a beta coming and not likely to get in after beta.

You are talking about "because this patch is needed, so we will need to commit it on time", but I am asking for "how can you judge it as critical and required to commit on time?"

Q: why not consider about postpone it into D7?
A: Why would I? This change is confined within db_query so programmers will see no change. Drupal 7 is going to be very different.

How can you foresee about this "very different"? Are you now working on our next generation of database abstraction layer? If so, would you mind to share your progress with me, and so let's speed up its development?

chx’s picture

I am not working on future datbase layers but Crell for example does. I totally, absolutely do not understand your comments. I do not change abstraction in any way or form I just changed how mysql queries are performed. What do you oppose so vehemently...?

Also, the old issue I linked to was where %b and the preg_replace_callback got introduced and there the benchmark also shows a performance loss.

hswong3i’s picture

My reason is very simple: since I am focusing on both MySQL/PgSQL/Oracle/DB2/MSSQL/etc development; on the other hand, I am now working with them. This abstraction is a very important hook for other database implementation, which even need to be extended but not reduce. By extending its abstraction level, we can both take case our existing driver implementation (mysql/mysqli/pgsql) for backward compatible; on the other hand able to enrich, prepare and speed up our next generation development (PDO + MySQL/PgSQL/Oracle/DB2/MSSQL, oci8, ibm_db2, mssql, etc...)

Based on a single point of view, this isolation and tweaking for MySQL can boost up performance (!?), with simplified (!?) implementation. BTW, this isolation also result as a door closing for other database development: all other database drivers (including PDO + MySQL/PgSQL) require this abstraction hook, unless existing mysql/mysqli... we will need to public it once again in next generation. So why isolate it now, and result as a duplicated handling?

On the other hand, since ALL other databases development require for this important hook, isolating MySQL will result as more effort during maintenance... I don't know who are focusing on next generation development as me, but hope not to increase our difficulties for standardize and unify the abstraction into a simple implementation, if the change will only result in a very limited performance boot up...

P.S. I am now working on a patch for D6, based on my previous brain storming: remove '%s' syntax, abstract decimal and float. Moreover, implement _db_query_callback() as bi-directional: not only pre-process input query for _db_query(), on the other hand filter $args and return its filtered version. This bi-directional handling can both escape %%|%d|%f|%s|%b into ? (case dependent), on the other hand prepare $args for PDO's variable binding. It is almost complete and I am now testing its correctness (with benchmarking). Would you mind to wait for a little moment, says within 1 day, and let's review my works as D7 technical preview? Hope you may give me a chance, to share my idea with you by code implementation :)

P.P.S. for PDO implementation, our existing '%s' syntax will no longer suitable. Please reference to belows example (http://php.net/pdo):

<?php
// Invalid use of placeholder
$stmt = $dbh->prepare("SELECT * FROM REGISTRY where name LIKE '%?%'");
$stmt->execute(array($_GET['name']));

// placeholder must be used in the place of the whole value
$stmt = $dbh->prepare("SELECT * FROM REGISTRY where name LIKE ?");
$stmt->execute(array("%$_GET[name]%"));
?> 
chx’s picture

require this abstraction hook, I pray, what do they require? What's taken away with this patch? If your answer is longer than one line, then it's not why you want. Most prboably it's just a function name.

chx’s picture

Status: Reviewed & tested by the community » Closed (fixed)