according to PDO implementation (http://drupal.org/node/134580), and latest Oracle driver implementation (http://drupal.org/node/39260), it is suggested to remove db_num_rows(). since db_num_rows() is a database-dependent API, e.g. oci8's oci_num_rows() only return AFFECTED rows number (http://php.net/oci_num_rows), and so duplicate SELECT query is required for Oracle driver implementation, which greatly reduce its performance. most of the queries will able be to rewrite as SELECT COUNT(...).
this patch will split into numbers of small patches, in order to reduce overall impact. patches within this issue will always be a completed and integrated version, based on CVS HEAD, and other patches result.
| Comment | File | Size | Author |
|---|---|---|---|
| #43 | rmnumrow-4.patch | 3.2 KB | pwolanin |
| #41 | rmnumrow-3.patch | 3.2 KB | pwolanin |
| #38 | drupal-6.x-dev-rmnumrow-0.9.3.diff | 3.1 KB | hswong3i |
| #37 | drupal-6.x-dev-rmnumrow-0.9.2.diff | 8.43 KB | hswong3i |
| #35 | drupal-6.x-dev-rmnumrow-0.9.1.diff | 35.97 KB | hswong3i |
Comments
Comment #1
kbahey commentedI think this is a step backwards. Because of Oracle and PDO's limitations, we are now forced to have go lowest common denominator.
This in turn mandates two queries per pager query (one the real query, and the other the COUNT query).
There are 35 places in core that use db_num_rows() already. Some of that can go, but some of them are really needed.
On a large system, this means performance goes down the tube.
So, my recommendation is that the lesser endowed engines work around it, and leave engines that support it with db_num_rows() ability.
Comment #2
Crell commented"Lesser endowed engines" includes the PDO-MySQL driver, if you're using the default forward-only cursor. Most places in core where num_rows is used it should only be one query anyway because that's all it's being used for. Doing a full normal select and getting back all of the data just to get a row count is wasteful.
Core needs to be able to handle all supported databases directly, so that means "working around it" in core for everything. PDO offers a simple solution, though.
Now it's just an array and count() works. That's no slower than what the mysql driver already does internally, in fact. (That's why mysql_unbuffered_query() exists.) In most cases, though, SELECT COUNT(*) is going to be faster if that's all you need.
Comment #3
hswong3i commented@kbahey: a scientific benchmarking is always the most solid solution, when we are talking about performance impact. according to my benchmarking result, remove db_num_rows() will not reduce overall performance, but even boost up for around 5%. surly that i am not able to perform a "complete" benchmarking for every affected modules, but the result may still consider as solid, based on benchmarking suggestion from Dries and killes :)
besides performance consideration, i guess most of the child-patches of "remove db_num_rows()" should able to commit with no question. some of them may need reviews for better implementation, but that's not such difficult :)
Comment #4
hswong3i commentedsorry, the real benchmarking result: http://edin.no-ip.com/html/?q=node/282
Comment #5
gerhard killesreiter commentedI've reviewed the patch and pointed out a few errors. The only real problem in core is the user online block and even there you can work around the problem.
Comment #6
hswong3i commentedcomplete patch with minor bug fixes, and code clear up. progress will split into according small patches :)
P.S. user_block() will be reviewed after this patch, for a better implementation :)
Comment #7
hswong3i commentedcomplete patch based on latest CVS HEAD, update book.module and user.module (user_block() implementation). progress will split into corresponding small patches :)
Comment #8
dries commentedI think it's OK to remove this function as it improves database compatibility. There is a good work-around for it, that shouldn't be too expensive in the vast majority of the cases.
I haven't reviewed the latest patch yet.
Comment #9
moshe weitzman commentedshouldn't that be COUNT(*) in order to take advantage of mysql optimization? there is no drawback that i can see. it is (slightly) semantically better as well.
Comment #10
hswong3i commentedaccording to moshe's suggestion, and his document reference (http://www.mysqlperformanceblog.com/2007/04/10/count-vs-countcol/), this patch is updated. many thanks :)
P.S. seems the full version of this patch will be able to accept. i will not update its child patches at this point. BTW, if we still hope to keep db_num_rows() for backward supporting (where i think it is a bit meaningless, as it is database-dependence), those child patches will be required, as they can improve overall performance. so i will let them keep open, until we are all agreed to remove db_num_rows() supporting from DB API :)
Comment #11
dries commentedI'm OK with committing this patch, as it will really help development of other database backends. At the same time, I've no problem with postponing this patch to Drupal 7 either. As we're going to do a lot of database work in D7, we might be better of to postpone this patch. Some feedback would be appreciated.
Comment #12
hswong3i commentedit may be a bit pity if we postpone this patch (and also the same case as LOBs patch)... as a lot of improvement is already done for our D6 database API, oracle driver (and some other else, e.g. MSSQL and DB2) is now able to develop with no difficulty.
db_num_rows() (and also LOBs handling) is the last difficulty that i can discover about cross database compatibility. we shouldn't waste our previous effort, overcome these finial difficulty, cleanup the path for other database driver development, and improve the database support ability for D6. i am looking for a pretty feature of Drupal in the field of enterprise, we should catch these chance on time :)
Comment #13
moshe weitzman commentedi think it is OK to commit this patch remove the num_rows DB function and either put db_num_rows() in legacy.module or paste it into the upgrade docs for lazy upgraders. the fix is straightforward enough that we won't upset module developers too much.
Comment #14
Crell commentedI'm generally in support of lots of little patches in the same direction over one big patch, as it makes it much easier to maintain patches during development. For that reason, I'd favor committing this for D6.
If someone really needs to do old-style num_rows() behavior, the PDO answer would be ->fetchAll(). Do we want to consider implementing a db_fetch_all() now to replace db_num_rows(), or is that a silly addition if it's going to only last one revision?
Comment #15
hswong3i commentedi think db_fetch_all() maybe not too useful: we always loop though result set by using foreach , and preform corresponding action. it is usually pointless for a complete result set fetching :)
on the other hand, i will vote for SELECT COUNT(...) and program logic re-factor, rather than implementing db_fetch_all(), as db_num_rows() replacement. SELECT COUNT(...) and program logic re-factor may improve overall performance (according to benchmark result, it really did that), which should also be considered :)
Comment #16
dries commentedIf that's the general consensus, I'd be happy to commit this patch once it is marked "Ready to be committed". This should happen sooner than late.
I looked at the code and it looks good except for:
Now, $authenticated_count willl be smaller or equal to $max_users. On drupal.org, $max_users is 15 but $authenticated_count should return 300 or so.
Comment #17
hswong3i commentedsorry for such careless bug, and here is the updated version.
P.S. i found that DB2 will also face problem when implementing db_num_rows():
http://www-128.ibm.com/developerworks/ibm/library/i-osource9/index.html
Comment #18
hswong3i commentedminor update based on latest CVS.
i am now using it for both oracle and DB2 driver development, and found it is necessary. hope someone can take a final review for this patch :)
Comment #19
gábor hojtsyWhile reviewing the patch, I found a minor and a major problem:
- the session.inc changes introduce an E_NOTICE error as far as I see, $items is not initilized
- all changes related to node_title_list() calls should be revisited... this seem like passing on the data to theme('node_list') which invokes theme('item_list') which returns HTML output even if there were no rows in the result set... this is not only superfluous HTML output, but cases bugs in cases where it is used as if it returns FALSE if there were no rows in the set, like
if ($daytop && ($result = statistics_title_list('daycount', $daytop)) && ($node_title_list = node_title_list($result, t("Today's:")))) {Comment #20
hswong3i commented@goba: thanks for your advice, and i have update the patch. now the checking will preform within node_title_list(): if nothing can fetch from input DB result object, this function will return FALSE, without calling theme_node_list(). it is a program logic re-factor, as like as other cases. thank you :)
Comment #21
gábor hojtsyThis version looks OK now. But this is a significant amount of change, so it would be nice to see independent tests done.
Comment #22
pwolanin commentedminor suggestions on the code. Typo in these code comments:
"do" should be "does", as in "db_affected_rows() does not give"
Also, probably not much of a difference, but db_table_exists() and db_column_exists should be made consistently like this (use db_result rather than db_fetch_object):
now the usage seems to vary between PostgreSQL and MySQL driver - unless there is a good reason to do so?
Finally, I think the logic change in function menu_tree_page_data() is faulty - $item will always be FALSE at that point in the loop. You need to create a flag instead.
Comment #23
pwolanin commentedhere's a possible implementation for a correct change to the menu.inc part: http://drupal.pastebin.com/d2a1733d8
Note to see the bug introduced by the patch for menu.inc, you may need to be clearing the cache_menu table by hand regularly.
here's the test: set admin, admin/build, and admin/build/menu to be expanded
On visiting any page, the navigation menu should be expanded down fully to admin/build/menu. With the patch applied that fails to happen.
Comment #24
hswong3i commentedjust post the complete and updated version here. and let commit its split version one by one :)
Comment #25
dries commentedLet's focus on this patch rather than on the multiple smaller ones.
Comment #26
hswong3i commentedooo sorry about that... seems i have misunderstand Goha's meaning... very sorry :(
Comment #27
hswong3i commentedminor update for standardize coding format. reuse small patches progress.
Comment #28
hswong3i commentedComment #29
dries commentedI've reviewed this patch once more and spent some time testing it. Everything looks OK. It gives us better cross-database compatibility and better performance.
(It's not clear whether pwoladin's suggestion made it in, the snippet already expired, but I assume it has.)
Comment #30
chx commentedHold. The patch is not yet committed and it should not be because at least taxonomy_render_nodes is broken.
It's guaranteed that $node is FALSE after loop. You wanted if ($output)
Comment #31
chx commentedFound a few more of this in the patch. Edited version of the irc chat follows. You need a variable like $is_more for menu.inc find a good name for it make it FALSE before the while ($item = and make it TRUE inside the loop (I will call this control variable later aggregator_parse_feed , if ($item) , wrong, if ($items) . comment_render , tricky, might need use a control variable here as well. node_title_list , return $node ? change to return $items ?
node_revisions( , if ($node) ===> if ($output)
system_update_179 , use control variable
return count($authmaps) ? $authmaps : 0; <= no need for count, php will nicely cast empty array to false , nonempty to true, try it.
Comment #32
pwolanin commentedThis patch still breaks the menu system and should not be committed - see my pastebinned code above for the fix.
Comment #33
hswong3i commented@chx and @pwolanin: thanks for your advice. now i add some counter for extra checking, and should able to due with the problem :)
Comment #34
pwolanin commentedwell, at least that won't break menu, but there is no reason to increment a counter - all we care is whether we have 0 or > 0 results. Just use the code with the boolean falg I suggested which will be faster.
Comment #35
hswong3i commented@pwolanin: update based on your suggestion. thank you very much :)
P.S. i keep some handling as before, since we are really considering about row numbers.
P.P.S. i am still prefer about using integer counter, for standardize coding format between db_num_rows() => program logic re-factor. anyway, both v0.9 and v0.9.1 are working equally, please feel free to commit any one of this, based on different concern ;p
Comment #36
dries commentedCommitted the last version of this patch. I'm marking this 'active' until this has been documented in the upgrade instructions.
Comment #37
hswong3i commented@Dries: sorry that the version you just commit is a buggy version. according to chx and pwolanin help, we update this thread, and here is the bug fix patch for latest CVS.
Comment #38
hswong3i commentedencore version... change integer counter into boolean flag, based on chx and pwolanin suggestion :(
Comment #39
hswong3i commentedthe patch in #38 is only a tweaking version of current implementation, it should be able to function without any question, but improve some performance :)
Comment #40
pwolanin commentedpatch was not made from the drupal root directory
Comment #41
pwolanin commentedhere's a new patch re-rolled from the root dir (the other applied using patch -p1). Otherwise identical. The logic looks right and at least the taxonomy module tests ok.
Comment #42
gábor hojtsyIf $num_rows does not contain a number, why call it $num_rows? (In all of the files patched)? I think $has_rows or something even simpler would be better.
Comment #43
pwolanin commentedchanged variable names.
Comment #44
gábor hojtsyThis looks much better, thanks. Committed!
Comment #45
(not verified) commented