Here follows a block of long, detailed information. Please skip this for the short version, if you don't have time.

Imagine the following situation: you have types (taxonomy) of events (content type). You need several views which all show certain groups of events (several types) and can also be filtered for the types in these groups.

How you probably intend to do this? You add a filter for the event type which you set to be the group (say 'Type A' and 'Type B'). You need to expose this, you check "Limit list to selected items", cause you don't want Type C, D, E, etc. to show.
What happens? Type C, D, E, ... are nicely removed from the exposed filter, BUT events of Type C, D, E, ... show in the list nonetheless. Obvious solution: add another non-exposed filter for the event type, this time simply selecting Type A & B. What happens? At first sight it works as expected, BUT oh no, it doesn't.
As soon as you actually USE the exposed filter something utterly wrong happens: events don't show up at all.

In short: you have two filters of the same type. One filters for term1 and term2, the other filters for term2.
What you expect: nodes of type term2.
What you get: nothing at all.

What your query ends up is something like:

SELECT node.nid AS nid FROM {node} node
INNER JOIN {field_data_field_event_type} field_data_field_event_type ON node.nid = field_data_field_event_type.entity_id AND (field_data_field_event_type.entity_type = 'node' AND field_data_field_event_type.deleted = '0')
LEFT JOIN {field_data_field_event_type} field_data_field_event_type2 ON node.nid = field_data_field_event_type2.entity_id AND field_data_field_event_type2.field_event_type_tid != 'tid2'
WHERE (( (node.status = '1') AND (node.type IN  ('event')) AND (field_data_field_event_type.field_event_type_tid IN  ('tid1', 'tid2')) )AND( (field_data_field_event_type2.field_event_type_tid = 'tid2') ))

See that second join?

...field_data_field_event_type2.field_event_type_tid != 'tid2'

What the...???

Now, I took the time to investigate this matter further and found the culprit in includes/handlers.inc at line 911:

<?php
         
if (!empty($this->handler->view->many_to_one_tables[$field])) {
            foreach (
$this->handler->view->many_to_one_tables[$field] as $value) {
             
$join->extra = array(
                array(
                 
'field' => $this->handler->real_field,
                 
'operator' => '!=',
                 
'value' => $value,
                 
'numeric' => !empty($this->handler->definition['numeric']),
                ),
              );
            }
          }
?>

also at line 872:

<?php
     
if (!empty($view->many_to_one_tables[$field])) {
        foreach (
$view->many_to_one_tables[$field] as $value) {
         
$join->extra = array(
            array(
             
'field' => $this->handler->real_field,
             
'operator' => '!=',
             
'value' => $value,
             
'numeric' => !empty($this->definition['numeric']),
            ),
          );
        }
      }
?>

This just can't be right. Do we really browse through all the previous values used by filters and add the last of them negated to the join condition?

I have no clue why this piece of code is there, but it sure doesn't seem OK to me.
What about previous filters' operators? Why do this in the first place?

I hazard a guess that this roots from some kind of attempt for query optimization, but it sure doesn't sound right that a logical condition like (A or B) AND (A) results in a nullset and not rows matching A, nor do I see any kind of use case where such trickery could be useful.

Comments

Priority:Critical» Normal
Status:Needs work» Closed (works as designed)

Yes, that code is supposed to be there. It's the only reasonable way to be able to find things that are tagged with 'apple' AND 'orange' if that's what you want to do.

Also, this issue clearly does not meet the definition of critical.

Also, I am having trouble understanding how you're adding a filter for event type, and then getting event types that are not in the filter; the problem you're trying to resolve doesn't seem like one you should be having at all.

Your best bet is probably to make the filter required and then turn on allow multiple selections; that way all of your limit will be selected by default and thus get the result you want.

http://drupal.org/node/510068 also seems relevant.

Priority:Normal» Critical
Status:Closed (works as designed)» Needs work

Tried using what you proposed, but it doesn't work, because the limit feature is bugged as well, see:
http://karsa.org/views_test/articles
(Just deselect both type groups and press search, which leads to all types appearing).
But this is not the point here at all, read on.

On the other hand, here's my approach, which doesn't work either because of the bits of query the pieces of code above add to it: http://karsa.org/views_test/articles2
It just goes all blank. If you deselect Type A Article ABC pops up, while if you deselect only Type B & C Article B of Type B does show up, because only field_data_field_tags2.field_tags_tid != '3' is added to the query (why?). If you deselect both groups, it works as expected (because of the additional hidden filter).

I don't understand why is it that a group of filters like "Tags in (A, B, C) AND Tags in (A, B, C)" resulting in a nullset is considered "working as designed". It makes no sense! Shouldn't we follow first-order logic?

No matter what, these two pieces of code need reviewing, they simply can't be right, just look at them.

<?php
// Do we have previous values from previous filters?
if (!empty($view->many_to_one_tables[$field])) {
   
// Go through all of these values
   
foreach ($view->many_to_one_tables[$field] as $value) {
       
// OVERRIDE the extra condition of our join with the value found.
       
$join->extra = array(
            array(
               
'field' => $this->handler->real_field,
               
'operator' => '!=',
               
'value' => $value,
               
'numeric' => !empty($this->definition['numeric']),
            ),
        );
    }
}
// Here our join is extended by a condition that reads sth. like "our field != last value".
?>

How does that make any sense?

As to your explanation:

Yes, that code is supposed to be there. It's the only reasonable way to be able to find things that are tagged with 'apple' AND 'orange' if that's what you want to do.

Seriously?... By your logic, if someone wanted to do that, he/she should add a filter criteria for field_tags with the "Is all of" operator.
I see your point though, removing these bits of code would result in violating first-order logic too, but they sure don't work as they should. For one thing, they don't give a toss about what operator was used in filtering for previous values or if those values were grouped or not, et cetera, et cetera, et cetera...

They do solve logic conditions like (type = A) AND (type = B), but fail to solve ones like:

and so on. Namely all cases where the values we currently filter for intersect with values we previously filtered for. This really is a severe bug.

Play with this for a while using intersecting sets of types to see what I'm talking about: http://karsa.org/views_test/articles3

I've taken another good look at this, but I don't quite see how adding these conditions is necessary for the type of query you've brought up as an example (i.e. nodes tagged with both 'apple' AND 'orange').
I've commented out these lines of code and such a query seems to work just fine, see:

views intact: http://karsa.org/views_test/articles3?keys=&field_tags_tid_op=or&field_t...
vs.
codes above removed: http://karsa.org/views_test2/articles3?keys=&field_tags_tid_op=or&field_...

I've attempted to make a comparison of joins, corresponding subqueries, and whether they cause duplicates, how scalable they are, and also, if first order logic applies if you use them in complex nested queries, if I'm not mistaken, adding these != conditions is simply not necessary, but do contradict me if I'm wrong.

I've left of the trivial conditions (%basetable.%id=%field.entity_id, entity_type=%type, etc.) from the subqueries and joins to make it more legible, but here it goes, I hope it's still intelligible enough:

operator condition query type currently implemented causes duplicates first order logic applies scalability
one of type IN (A,B) WHERE EXISTS(SELECT FROM tags WHERE tid IN (A,B)) no no yes medium
INNER JOIN tags WHERE tid IN (A,B) default yes yes high
LEFT JOIN tags ON tid = A
LEFT JOIN tags ON tid = B
WHERE (tags_a is NOT NULL OR tags_b IS NOT NULL)
with reduce duplicates no yes medium
all of type = (A,B) WHERE EXISTS(SELECT FROM tags WHERE tid=A)
AND EXISTS(SELECT FROM tags WHERE tid=B)
no no yes medium
INNER JOIN tags ON tid = A
INNER JOIN tags ON tid = B
WHERE (tags_a is NOT NULL AND tags_b IS NOT NULL)
default no yes high
none of type NOT IN (A,B) WHERE NOT EXISTS(SELECT tid FROM tags WHERE nid=entity_id AND tid IN (A,B)) no no yes medium
LEFT JOIN tags ON tid IN (A,B)
WHERE tid = NULL
default no yes high
empty type IS NULL LEFT JOIN tags
WHERE tid IS NULL
default no yes high
WHERE NOT EXISTS(SELECT tid FROM tags) no no yes medium
not empty type IS NOT NULL LEFT JOIN tags WHERE tid IS NOT NULL default yes yes high
WHERE EXISTS(SELECT tid FROM tags) no no yes medium

(Incidentally, the current handler for "not empty" apparently causes duplicates even with "reduce duplicates" checked).

Priority:Critical» Major

Issue summary:View changes

slight corrections.