taxonomy_node_get_terms doesn't sort by vocabulary [#65517]

Comment	File	Size	Author
#33	65517-D6.patch	1.96 KB	andypost
#31	65517-D6.patch	1.06 KB	jhodgdon
#9	65517.patch	603 bytes	jhodgdon
	taxonomy_node_get_terms_1.patch	826 bytes	praseodym

Comment #1

David Lesieur commented 20 August 2006 at 00:07

I think the description actually means "Find all terms associated to the given node, ordered by vocabulary weight and term weight". Could you give an example where sorting also by vid greatly simplifies a module?

Log in or register to post comments

Comment #2

killes@www.drop.org commented 30 August 2006 at 08:36

Version:

4.7.1

» x.y.z

features go into cvs.

Log in or register to post comments

Comment #3

webchick

she/they

English

Vancouver 🇨🇦

commented 24 June 2007 at 04:23

Version:	x.y.z	» 6.x-dev
Status:	Needs review	» Needs work

No longer applies.

Log in or register to post comments

Comment #4

catch

he/him

English

commented 24 October 2007 at 14:14

Version:

6.x-dev

» 7.x-dev

Log in or register to post comments

Comment #5

catch

he/him

English

commented 28 December 2008 at 02:00

Status:

Needs work

» Closed (won't fix)

'vocabulary and term weight' makes plenty of sense especially with the query being self explanatory, IMO this is won't fix.

Log in or register to post comments

Comment #6

jhodgdon

she/her

English

commented 28 April 2009 at 19:38

Status:

Closed (won't fix)

» Needs review

I think this decision of "won't fix" should be revisited. It's also a bug in D6, and it really makes the function fairly worthless if you have multiple vocabularies for a single node type. At a minimum the documentation should be revised to reflect what the function does (shouldn't have to read the code, that's the reason we have documentation).

The reason is that if you have multiple vocabularies on a given node type, and you haven't dragged the vocabularies to assign them weights (i.e. just left them alphabetical), the terms do not come out grouped by vocabulary. In fact, if you've assigned weights to the terms within one vocabularly, perhaps, and left the others alphabetical, you'll see your terms in some kind of mixture of alphabetical and by weight (you don't generally see the weights at all on the screen in D6/D7, because the interface is drag and drop rather than assigning numbers, so who knows what weights are being assigned behind the scenes?).

My preference would be to change the function to make the ORDER BY clause in the query be v.weight, v.name, t.weight, t.name, so that the terms would be grouped by vocabulary no matter what. This is NOT what the above patch does (that uses v.vid, which I don't think makes any sense at all).

If you are not going to make that change, then please consider changing the documentation of the function, because to me "ordered by vocabulary" implies that the terms would be grouped together by vocabulary, in the same order I would see the vocabularies on the Taxonomy list page. Change it to "ordered by vocabulary weight" and put a note in saying that if you haven't changed the vocab weights to something other than alphabetical, there is no ordering by vocabulary at all.

Log in or register to post comments

Comment #7

cburschka

they

commented 28 April 2009 at 20:29

I also have so far thought that all "weight" fields were merely an overriding value to affect the natural sort order (by ID or alphabet). The inference would be that if you use any weight field for sorting, you would also need to use the natural order as a second sorting field. This was just my interpretation, of course.

Log in or register to post comments

Comment #8

catch

he/him

English

commented 1 June 2009 at 20:27

Status:

Needs review

» Needs work

Not sure why the testbot never made it here, but that patch won't apply any more now the taxonomy has been converted to dbtng.

Log in or register to post comments

Comment #9

jhodgdon

she/her

English

commented 9 June 2009 at 22:43

Category:	feature	» bug
Status:	Needs work	» Needs review

Status	File	Size
new	65517.patch	603 bytes

Here is a patch for D7 that adds the vocabulary name as a backup sort order, after vocabulary weight.

Log in or register to post comments

Comment #10

catch

he/him

English

commented 11 June 2009 at 11:07

jhodgdon - could you post an EXPLAIN on the new query, or even better benchmarks?

Log in or register to post comments

Comment #11

jhodgdon

she/her

English

commented 11 June 2009 at 15:38

Ummm. Not sure how to acheive that, any pointers?

The two current users of this function in Drupal 6 core (taxonomy_nodeapi and taxonomy_form_alter), and the one in Drupal 7 core (taxonomy_form_alter), won't benefit from this patch, by the way, because they don't care about the order of the returned terms.

Log in or register to post comments

Comment #12

catch

he/him

English

commented 11 June 2009 at 15:45

If you've got a D6 site with a version of the patch, easiest thing to do is get devel query log, copy the query from there, put into mysql command line or phpmyadmin, and then prefix with 'explain'.

For benchmarking, probably give anonymous users access to create/edit a content type with a taxonomy select on it (ideally a large vocabulary - at least couple of hundred terms and at least 1-2000 in the database altogether) - then run ab -c1 -n500 on that page before and after the patch. I have a feeling the query is currently so unoptimised that adding the extra sort won't make it noticeably worse, but if it does we should know.

Log in or register to post comments

Comment #13

jhodgdon

she/her

English

commented 11 June 2009 at 16:31

OK, here's Explain output anyway...

Here's a sample query from D6, before patching:

SELECT t.* FROM d1_term_node r INNER JOIN d1_term_data t ON r.tid = t.tid INNER JOIN d1_vocabulary v ON t.vid = v.vid WHERE r.vid = 1 ORDER BY v.weight, t.weight, t.name

Explain (may need to widen your browser screen to see this well):

+----+-------------+-------+--------+--------------------------------+---------+---------+---------------+------+---------------------------------+
| id | select_type | table | type   | possible_keys                  | key     | key_len | ref           | rows | Extra                           |
+----+-------------+-------+--------+--------------------------------+---------+---------+---------------+------+---------------------------------+
|  1 | SIMPLE      | r     | ref    | PRIMARY,vid                    | vid     | 4       | const         |    1 | Using temporary; Using filesort | 
|  1 | SIMPLE      | t     | eq_ref | PRIMARY,taxonomy_tree,vid_name | PRIMARY | 4       | drupal6.r.tid |    1 |                                 | 
|  1 | SIMPLE      | v     | eq_ref | PRIMARY                        | PRIMARY | 4       | drupal6.t.vid |    1 |                                 | 
+----+-------------+-------+--------+--------------------------------+---------+---------+---------------+------+---------------------------------+

After patching (cannot apply the patch exactly as DB format is different, but it's the same effect), the query is:

SELECT t.* FROM d1_term_node r INNER JOIN d1_term_data t ON r.tid = t.tid INNER JOIN d1_vocabulary v ON t.vid = v.vid WHERE r.vid = 1 ORDER BY v.weight, v.name, t.weight, t.name

And the EXPLAIN output is:

+----+-------------+-------+--------+--------------------------------+---------+---------+---------------+------+---------------------------------+
| id | select_type | table | type   | possible_keys                  | key     | key_len | ref           | rows | Extra                           |
+----+-------------+-------+--------+--------------------------------+---------+---------+---------------+------+---------------------------------+
|  1 | SIMPLE      | r     | ref    | PRIMARY,vid                    | vid     | 4       | const         |    1 | Using temporary; Using filesort | 
|  1 | SIMPLE      | t     | eq_ref | PRIMARY,taxonomy_tree,vid_name | PRIMARY | 4       | drupal6.r.tid |    1 |                                 | 
|  1 | SIMPLE      | v     | eq_ref | PRIMARY                        | PRIMARY | 4       | drupal6.t.vid |    1 |                                 | 
+----+-------------+-------+--------+--------------------------------+---------+---------+---------------+------+---------------------------------+

For reference (from looking at http://api.drupal.org/api/function/taxonomy_schema/6):

term_data has indices:

    'primary key' => array('tid'),
    'indexes' => array(
      'taxonomy_tree' => array('vid', 'weight', 'name'),
      'vid_name' => array('vid', 'name'),
    ),

and vocabulary has indices:

   'primary key' => array('vid'),
    'indexes' => array(
      'list' => array('weight', 'name'),
    ),

D7 has the same indices for the equivalent tables (table names changed to taxonomy_term_data and taxonomy_vocabulary).

So it doesn't look like the "list" index is being used, even in the new query. Maybe we should index separately on "weight" and "name"? Shall I add that to the patch? Thoughts?

I'll see what I can do about generating lots of DB size in order to benchmark...

Log in or register to post comments

Comment #14

jhodgdon

she/her

English

commented 11 June 2009 at 16:30

Interestingly, the term_node table has indices for vid and nid, but not tid (this is true in D6 as well as D7). It might help to add an index on tid as well? Pretty much any query on term_node would tend to be joined to term_data on tid, wouldn't it?

I'm not a database index efficiency expert though... I'm not sure which parts of queries indices actually help optimize, is it just WHERE and ORDER BY, or do indices help optimize JOIN ON as well?

Log in or register to post comments

Comment #15

catch

he/him

English

commented 11 June 2009 at 16:59

I think it's pretty unlikely the existing query uses indexes - order by across tables usually results in the worst possible query - that temporary table in both EXPLAINs sounds about right. What I'm not sure about is whether adding the order by on vocabulary name makes the existing bad query worse, or has no impact either way.

The term_node tid index seems covered by the primary key already I think?

Log in or register to post comments

Comment #16

jhodgdon

she/her

English

commented 11 June 2009 at 18:12

Not sure. term_node's primary key is the combination (tid,vid).

Benchmarks coming shortly...

Log in or register to post comments

Comment #17

jhodgdon

she/her

English

commented 11 June 2009 at 19:03

Here are benchmark results, using ab -c1 -n500 on a node edit page, as catch suggested.

I generated 1000 nodes of this type with the devel module (there were a few others already there, but not many), and this content type uses two vocabularies (also generated with devel; one has 90 entries and the other has 10).

Benchmark results for 500 page queries:

- old query (no v.name ordering):

Concurrency Level:      1
Time taken for tests:   281.484 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Total transferred:      8815000 bytes
HTML transferred:       8542000 bytes
Requests per second:    1.78 [#/sec] (mean)
Time per request:       562.967 [ms] (mean)
Time per request:       562.967 [ms] (mean, across all concurrent requests)
Transfer rate:          30.58 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   408  563 164.1    480     926
Waiting:      378  525 156.8    443     873
Total:        408  563 164.1    480     926

Percentage of the requests served within a certain time (ms)
  50%    480
  66%    635
  75%    714
  80%    738
  90%    841
  95%    851
  98%    857
  99%    861
 100%    926 (longest request)

- new query (with v.name ordering):

Concurrency Level:      1
Time taken for tests:   285.488 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Total transferred:      8815000 bytes
HTML transferred:       8542000 bytes
Requests per second:    1.75 [#/sec] (mean)
Time per request:       570.977 [ms] (mean)
Time per request:       570.977 [ms] (mean, across all concurrent requests)
Transfer rate:          30.15 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   408  571 163.8    498     864
Waiting:      379  531 156.2    467     809
Total:        408  571 163.8    498     864

Percentage of the requests served within a certain time (ms)
  50%    498
  66%    620
  75%    717
  80%    753
  90%    844
  95%    852
  98%    857
  99%    860
 100%    864 (longest request)

Log in or register to post comments

Comment #18

jhodgdon

she/her

English

commented 11 June 2009 at 19:06

Just for kicks, I manually added a couple of indices to the database (name, weight, and tid in term_data, and weight in vocabulary). I re-ran the benchmark on the unmodified query (with no vocabulary.vid sorting):

Concurrency Level:      1
Time taken for tests:   255.347 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Total transferred:      8815000 bytes
HTML transferred:       8542000 bytes
Requests per second:    1.96 [#/sec] (mean)
Time per request:       510.693 [ms] (mean)
Time per request:       510.693 [ms] (mean, across all concurrent requests)
Transfer rate:          33.71 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   407  511 141.0    419     858
Waiting:      371  475 134.4    389     804
Total:        407  511 141.1    419     858

Percentage of the requests served within a certain time (ms)
  50%    419
  66%    494
  75%    589
  80%    649
  90%    741
  95%    841
  98%    853
  99%    856
 100%    858 (longest request)

Log in or register to post comments

Comment #19

jhodgdon

she/her

English

commented 11 June 2009 at 19:12

What do you think? It looks to me like adding the extra sorting term is a pretty small effect (mean time went up by 8ms, and median went up by 18ms, and both are way less than 1 standard deviation). Adding indices to the tables has a bigger effect (about 50-60ms time reduction).

Log in or register to post comments

Comment #21

24 June 2009 at 01:25

Status:

Needs review

» Needs work

The last submitted patch failed testing.

Log in or register to post comments

Comment #22

jhodgdon

she/her

English

commented 25 June 2009 at 19:02

Status:

Needs work

» Needs review

Setting back to Needs Review in hopes that the patch will get retested. The test bot was apparently malfunctioning (I had about 7 patches fail in that time period, all were doc, none broke the HEAD install I am pretty sure).

Log in or register to post comments

Comment #23

28 June 2009 at 03:55

Status:

Needs review

» Needs work

The last submitted patch failed testing.

Log in or register to post comments

Comment #24

andypost

he/him

Russian

commented 28 June 2009 at 04:23

Status:

Needs work

» Needs review

Bot was failed

Log in or register to post comments

Comment #25

jhodgdon

she/her

English

commented 3 August 2009 at 18:21

This patch still applies cleanly, and the benchmark tests have been done. Any hope of getting a review?

Log in or register to post comments

Comment #26

jhodgdon

she/her

English

commented 20 August 2009 at 14:26

Issue tags:

+API change

adding tag

Log in or register to post comments

Comment #27

catch

he/him

English

commented 20 August 2009 at 16:16

I'm very much hoping we can remove this entirely with #412518: Convert taxonomy_node_* related code to use field API + upgrade path.

Log in or register to post comments

Comment #28

8 October 2009 at 21:05

Status:

Needs review

» Needs work

The last submitted patch failed testing.

Log in or register to post comments

Comment #29

jhodgdon

she/her

English

commented 8 October 2009 at 22:14

Version:

7.x-dev

» 6.x-dev

Apparently, taxonomy terms are now attached to nodes using the field API. So this function no longer exists in Drupal 7. And it looks like each taxonomy field in D7 has a vocabulary ID on it, so presumably you would query each field separately to find the terms. Anyway, the whole issue just doesn't apply to D7.

I would still like the function to be fixed in Drupal 6, because I think it's a bug that the results are not grouped by vocabulary, since the doc says they should be. Or else it's a doc bug.

Anyway... setting to D6...

Log in or register to post comments

Comment #30

jhodgdon

she/her

English

commented 8 October 2009 at 22:15

And by the way, the above benchmarks were done in Drupal 6.

Log in or register to post comments

Comment #31

jhodgdon

she/her

English

commented 9 October 2009 at 13:59

Status	File	Size
new	65517-D6.patch	1.06 KB

Here is a patch for Drupal 6.

Log in or register to post comments

Comment #32

jhodgdon

she/her

English

commented 9 October 2009 at 14:01

Status:

Needs work

» Needs review

Log in or register to post comments

Comment #33

andypost

he/him

Russian

commented 9 October 2009 at 16:59

Status	File	Size
new	65517-D6.patch	1.96 KB

This patch works fins and +1 for rtbc this but this is a part of issue - only a view

When you at the edit node form - vocabularies are sorted wrong

function taxonomy_form_alter()
....
    $c = db_query(db_rewrite_sql("SELECT v.* FROM {vocabulary} v INNER JOIN {vocabulary_node_types} n ON v.vid = n.vid WHERE n.type = '%s' ORDER BY v.weight, v.name", 'v', 'vid'), $node->type);

Order is same as patch provides but after tracing I found that order of vocabs is changed in final drupal_render()

So I think to make order of vacabs with same weight we need add "SubWeight" in taxonomy_form_alter() to edit widgets

Log in or register to post comments

Comment #34

jhodgdon

she/her

English

commented 9 October 2009 at 18:00

#33 sounds like a separate issue to me - that vocabularies on the node edit page are not presented by order of their weights assigned on the taxonomy admin screens -- or at least, I think that is what you are saying?

The issue filed here is that this specific function, taxonomy_node_get_terms(), does not do what it says it does.

Log in or register to post comments

Comment #35

andypost

he/him

Russian

commented 9 October 2009 at 20:10

@jhodgdon yes, I file another issue about

So we need more reviews on your patch #31 to rtbc, I've tested it already

Log in or register to post comments

Comment #36

thedavidmeister commented 18 June 2013 at 14:21

Status:

Needs review

» Closed (won't fix)

Aside from the fact that the patch no longer applies:

error: taxonomy/taxonomy.module: No such file or directory

This would be an API change for D6 for a problem that doesn't exist in D7+ so I don't think it will ever be fixed.

Log in or register to post comments

taxonomy_node_get_terms doesn't sort by vocabulary

Comments