problem: drupal_is_denied() run a query that return anonymous result set, which will cause problem in MSSQL db_query_range() implementation.
detail: in case of MSSQL, db_query_range() is implemented as nested query. therefore a anonymous result set will cause an exceptional case and terminated with error. this patch assign a label to the result set, and so solve the problem.
P.S. here is the current implementation of db_query_range(), within MSSQL driver:
<?phpfunction db_query_range($query) {
$args = func_get_args();
$count = array_pop($args);
$from = array_pop($args);
array_shift($args);
$query = db_prefix_tables($query);
$query = _db_query_rewrite_temp_tablename($query);
$query = preg_replace_callback(ANSI_RESERVED_REGEXP, '_db_query_callback_ansi_reserved', $query);
$query = preg_replace_callback(MSSQL_RESERVED_REGEXP, '_db_query_callback_mssql_reserved', $query);
$query = preg_replace_callback(MSSQL_MAX30_REGEXP, '_db_query_callback_mssql_max30', $query);
if (isset($args[0]) and is_array($args[0])) { // 'All arguments in one array' syntax
$args = $args[0];
}
_db_query_callback($args, TRUE);
$query = preg_replace_callback(DB_QUERY_REGEXP, '_db_query_callback', $query);
$query = preg_replace('/SELECT/Dsi', 'SELECT TOP(100) PERCENT', $query);
$query = 'SELECT * FROM (SELECT sub2.*, ROW_NUMBER() OVER(ORDER BY sub2.line2) AS line3 FROM (SELECT 1 AS line2, sub1.* FROM (' . $query . ') AS sub1) as sub2) AS sub3 WHERE line3 BETWEEN ' . ($from + 1) . ' AND ' . ($from + $count);
return _db_query($query);
}
?>
Comments
Comment #1
hswong3i commentedthanks for Zlender report (http://drupal.org/node/165160#comment-289474), we may also need to update this query as:
BTW, this may be not function within DB2... i will spend some time for cross database compatibility testing later :)
Comment #2
mooffie commented>
> [...] which will cause problem in MSSQL db_query_range() implementation [...]
I don't think you should bother yourself with db_query_range() right now. Pretend for the meantime that this query is a simple db_query(). The problem is that the query doesn't do what it's supposed to do.
> SELECT [...] FROM {access} WHERE type = '%s' AND
> (LOWER(mask) LIKE LOWER('%s')
> OR LOWER('%s') LIKE LOWER(mask)) [...]
I think the original version, before the patch, was the correct one. It sould only be
LOWER('%s') LIKE LOWER(mask).An example:
The {access} table contains rules for, e.g., IP access. These rules are stored as SQL wildcards in the 'mask' column of the table. When Drupal wants to check if IP '194.14.56.98' has access to our website it does:
[...] WHERE LOWER('194.14.56.98') LIKE LOWER(mask)
It's logical. You say that DB2 dosn't support "text LIKE column", but switching the two won't solve the problem: this changes the logic to one which isn't the right one.
Maybe there's no options but to put a conditional that executes a different, suitable query when DB2 is used.
(BTW, I think there's an error in the documentation for drupal_is_denied(): I don't think the $mask parameter ever contains SQL wildcards.)
Comment #3
hswong3i commentedafter a indeed studying about limitation of DB2, i found that is almost impossible to implement a similar feature without changing program logic (http://www.webservertalk.com/archive220-2005-11-1272872.html):
so here are some proposes
@db2_execute(), therefore error message will not show out, and simply return a NULL result set. this means the allow/deny checking will be by-passed within DB2 implementation. if this is our choice, all we need is document this risky within DB2's README, and roll back the patchfrom my point of view, we shouldn't save compare rules in SQL format, which may cause SQL injection. on the other hand, if this is a database dependent implementation, we should consider about re-implement it. so i will suggest re-factor it with case 3 (which shouldn't be too difficult). BTW, if performance is our main concern, just feel free to disable such feature within DB2 :)
Comment #4
hswong3i commentedjust brain storming: is it possible to use preg_match + cache system for allow/deny filtering?
e.g. rules are written in perl regex format if starting with "^", or else written in plan text for complete word mapping. this is the similar case as mailman and some other system's allow/deny filtering method.
on the other hand, the regex patten will save within cache system, as like as variable. before calling drupal_is_deny(), we will initialize those regex for compare (on the other hand, build the regex and save into cache if not yet exists).
so the overall procedures will be limited into: (1 query: access cache init) + (2 preg_match: for allow and deny). i guess it shouldn't trade off too much performance, but this should be proved by benchmarking :)
Comment #5
mooffie commented>
> so here are some proposes
> [...]
> re-factor programming logic, by query level regex.
BTW, how hard it would be to put code in DB2's db_query() that replaces any /LIKE (\w+)/ with the correct syntax? I mean, something similar to this imaginary line:
$query = preg_replace('/([\w.]+) LIKE (\w+)/', "regexp(\\1, \\2, replace(\\2, '%', '.*'))", $query);
(Since \w+doens't match quotes, in would only match LIKE'ing agains columns.)
Could you please submit a patch that restores the original line in drupal_is_denied() ?
Comment #6
hswong3i commented@mooffie: i think the patch for rollback won't be too difficult. we just need to shift the position of 'mask' and '%s', and it will function as before.
here i try to introduce a patch that change matching rule into regex format, as like as mailman's sender/recipient filters. so rules will written as:
. \ + * ? [ ^ ] $ ( ) { } = ! < > | :) will be escaped by preg_quote. e.g. user rule with "dummy" will only map with user name "dummy"on the other hand, it clone the variables caching handling: the processed regex expression will saved into cache table, and initialize once before calling drupal_is_denied() within drupal_bootstrap(), if it is not existing.
total processing will limited into: 1 simple select query for cache_get() + [numbers of query and procedure for cache rebuild (optional)] + [1 preg_match() for deny rules (optional)] + [1 preg_match() for allow rules (optional)]. it shouldn't be much performance different with existing implementation.
for backward compatibility, we can use a script to translate existing rules into new format: "%" can replace as ".*", where "_" can replace as ".". i will work out that translate script within *.install, after this patch's review :)
Comment #7
hswong3i commentedComment #8
hswong3i commented@killes: thanks for your comment. here is an updated version with more document and comment.
Comment #9
hswong3i commentedminor update, due to document miss typing :(
Comment #10
moshe weitzman commentedthis does not look particularly simple or elegant to me.
Comment #11
hswong3i commented@moshe: i hope this presentation shouldn't be too difficult for system administrators and end users, if they have expreian
more over, this patch comes with numbers of benefit:
i will try to clone mailman's "non-regexp matches are always done first" handling in coming on version (quote from mailman sender filter setting page):
Comment #12
hswong3i commentedversion 0.3 includes:
here is a simple example. for database with following rules defined:
access rule will preprocessed and cached in $access as:
and so rules will used for preg_match() directly without any changes :)
Comment #13
dries commentedBecause this query is used on _every_ page view, we spent a lot of time optimizing it. In Drupal 5, we managed to improve performance of cached pages by more than 5% by making a tiny improvement to this query. In other words, if the proposed changes make this query slower, it can result in a significant performance loss. We'll want to benchmark this change.
I agree with Moshe that this patch isn't particularly elegant, but let's first see what benchmarks tell us.
Comment #14
hswong3i commentedi just complete a simple benchmarking testing with CVS HEAD, regex patch (no cache), and regex with cache, with the following drupal setup:
all other details are following with "HowTo: Benchmark Drupal code": http://drupal.org/node/79237. here is the benchmarking result with 0/50/100/150/200/250/1000 rules. each cycle test for 500 page hits:
according to above result, we can found that regex patch for drupal_is_denied() (with cache handling) won't drop a lot of performance: at least it preform as fast as existing implementation. on the other hand, as discussed above, the regex patch comes with numbers of benefits: a more powerful matching rules, and it is database independent. hope this benchmarking result can helps about our decision :)
Comment #15
hswong3i commentedminor update: code style clean up and document update. it should now seems better :)
Comment #16
moshe weitzman commenteddoes i makes sense to store the access rules with variable_set() and get rid of the access table? that way you get the cache set/get for free, and simplify some code.
Comment #17
hswong3i commented@moshe: i think variable_get/set may be not suitable in this cases:
Comment #18
hswong3i commentedminor update: also fix bug with INSERT statement, which manually insert aid into "serial" field (cause error), on the other hand uselessly fetch last insert id after INSERT.
Comment #19
hswong3i commentedcorrect version
Comment #20
hswong3i commentedwait! i will try to standardize access rules handling, as like as variables. therefore we may also implement Apache-like access/deny rules handling for block display, or some other cases :)
Comment #21
hswong3i commentedfeature add version, which standardize access checking into Apache mod_authz_host style (http://httpd.apache.org/docs/2.2/mod/mod_authz_host.html):
TODO: now we can filter IP address by: 1. full IP address format, or 2. standard regex format. i would like to implement the following IP filtering formats within coming on version (implement translation within access_init()):
Comment #22
hswong3i commentedi clone most handling from Apache mod_authz_host, unless partial domain-name notation (as it will conflict with plain text checking). now the access checking support the following formats:
^.*drupal\.org)NOTE: i keep the rules caching system and simplify the access rule checking, so it should keep the same performance as previous benchmarking result.
Comment #23
hswong3i commentedsimplify full IP address handling.
Comment #24
hswong3i commentedminor update:
Comment #25
hswong3i commentedminor update: bulky code cleanup, minor bug fix, document update.
Comment #26
hswong3i commentedi try to minimize the change to existing API in this version, remove access_set() and access_del(). it is now database independent, with following additional features:
BTW, as the function is now trying to check for access allow, it is better to change its name from drupal_is_denied() into drupal_is_allowed(). anyway, if we are very serious about "not changing API after code freeze", i can create a newer version that keep it as drupal_is_denied() (just some simple "NOT" handling).
function is not complicated but powerful, and fully tested. it should able to be RTBC.
Comment #27
gábor hojtsyHum, please do not RTBC your own patches. Get others test it and mark RTBC if they think so. Peer review works this way. Also, this patch seems to be escalated from the original goals to lots of new features and changes, for which it seems to be quite late in the cycle to me. Not only the function name change is an API break, but the matching format change is such too.
Comment #28
hswong3i commented@Gabor: sorry about that, as i just hope to catch more focus, and explore its possibility for D6 as i can. according to the counter patch for D6 beat in http://drupal.org/node/165160, i will keep this issue for D7, and so we can review this once again in the near feature :)
Comment #29
chx commentedComment #30
birdmanx35 commentedNo surprise, but this no longer works against HEAD.
Comment #31
catchhttp://drupal.org/node/228594