Taxonomy module doesn't use it's own APIs

catch - December 7, 2008 - 01:46
Project:Drupal
Version:7.x-dev
Component:taxonomy.module
Category:bug report
Priority:normal
Assigned:Unassigned
Status:closed
Issue tags:Performance
Description

There's a lot of functions in taxonomy module which return an array of term objects - but which are loaded direct from the database instead of via taxonomy_term_load(). Here's a run on taxonomy_get_term_by_name() to start clearing that up now that the multiple load patch is in.

I had to make a couple of changes to taxonomy_term_load_multiple(), because get_term_by_name is case insensitive. So $conditions['name'] does a strcasecmp() in PHP, and a LIKE() in SQL (this is case insensitive in MySQL and ought to be mapped to ILIKE in postgresql by dbtng - need to check that). Changed from LOWER() to LIKE() since LOWER()'s not indexed or query cacheable at all, afaik LIKE is a bit better on that front. I can't think of a situation where you wouldn't want case insensitive matching, so seems OK to treat it in this way.

Comes with tests.

AttachmentSizeStatusTest resultOperations
get_term_by_name.patch5.73 KBIdleFailed: Failed to install HEAD.View details | Re-test

#1

catch - December 7, 2008 - 02:05
Status:active» needs review

#2

catch - December 15, 2008 - 23:15

Re-rolled to use drupal_strtolower() instead of strcasecmp() so things are multi-byte safe.

AttachmentSizeStatusTest resultOperations
get_term_by_name.patch5.46 KBIdleFailed: Failed to apply patch.View details | Re-test

#4

catch - December 17, 2008 - 02:59

Added taxonomy_term_select() too. These are the two places where we can convert fairly easily, some of the rest of taxonomy.module will need refactoring before it can be done there.

AttachmentSizeStatusTest resultOperations
multiple_load_conversion.patch6.84 KBIdleFailed: Failed to apply patch.View details | Re-test

#5

catch - January 13, 2009 - 22:47

Re-rolled for taxonomy table name changes.

AttachmentSizeStatusTest resultOperations
multiple_load_conversion_0.patch6.86 KBIdleFailed: Failed to apply patch.View details | Re-test

#6

System Message - January 13, 2009 - 23:02
Status:needs review» needs work

The last submitted patch failed testing.

#7

catch - January 14, 2009 - 21:58
Status:needs work» needs review

retesting.

#8

System Message - January 15, 2009 - 00:55
Status:needs review» needs work

The last submitted patch failed testing.

#9

catch - January 17, 2009 - 00:08
Status:needs work» needs review

better re-roll this time.

AttachmentSizeStatusTest resultOperations
multiple_load_conversion.patch6.86 KBIdleFailed: Failed to apply patch.View details | Re-test

#10

System Message - January 23, 2009 - 01:55
Status:needs review» needs work

The last submitted patch failed testing.

#11

catch - January 23, 2009 - 14:01
Title:Convert taxonony_get_term_by_name to multiple load» Replace direct queries with taxonomy_term_load_multiple()
Status:needs work» needs review

Re-titling, since this is a general cleanup of stuff we can simplify now multiple load is in, and re-rolled now the test has moved.

AttachmentSizeStatusTest resultOperations
multiple_load_conversion.patch6.57 KBIdleFailed: Failed to apply patch.View details | Re-test

#12

moshe weitzman - January 24, 2009 - 14:22

why "// When name is passed as a condition use LIKE.". seems like new behavior that slows down all name queries a little. just wondering.

#13

catch - January 24, 2009 - 14:44

The name queries currently use LOWER('name') in taxonomy_get_term_by_name() which isn't indexable at all - chx and Crell have promised that they'll fix the mappings in the postgres driver so we can have case insensitive LIKE behaviour and avoid the LOWER() - this may have already been done but I'm not sure either way. So the current version of the patch is partially dependent on that.

It's true that taxonomy_term_load_multiple() allows for a case sensitive name query, but we've not had that in previous versions of Drupal, so it wouldn't be an API change as such - or at least only within D7 itself. I think it's more important to have proper loading (and a shared static to prevent duplicate queries) than it is to support a case sensitive search.

#14

System Message - January 28, 2009 - 22:30
Status:needs review» needs work

The last submitted patch failed testing.

#15

catch - March 14, 2009 - 11:58
Title:Replace direct queries with taxonomy_term_load_multiple()» Load full term objects where possible
Status:needs work» needs review

re-rolled.

AttachmentSizeStatusTest resultOperations
multiple_load_conversion.patch7.04 KBIdleFailed: Failed to apply patch.View details | Re-test

#16

System Message - April 2, 2009 - 18:06
Status:needs review» needs work

The last submitted patch failed testing.

#18

catch - April 20, 2009 - 11:46
Status:needs work» needs review

That patch has been reverted, so back to needs review.

#19

System Message - April 20, 2009 - 12:05
Status:needs review» needs work

The last submitted patch failed testing.

#20

catch - April 20, 2009 - 14:47
Status:needs work» needs review

re-rolled after dbtng conversion.

AttachmentSizeStatusTest resultOperations
multiple_load_conversion.patch6.97 KBIdleFailed: Failed to apply patch.View details | Re-test

#21

catch - May 12, 2009 - 12:57
Title:Load full term objects where possible» Taxonomy module doesn't use it's own APIs
Category:task» bug report

Trying a different title to see if I can get more eyes on the patch ;)

#22

System Message - May 21, 2009 - 21:20
Status:needs review» needs work

The last submitted patch failed testing.

#23

catch - May 27, 2009 - 09:07
Status:needs work» needs review

re-rolled.

AttachmentSizeStatusTest resultOperations
taxonomy.patch7.16 KBIdleUnable to apply patch taxonomy_49.patchView details | Re-test

#24

andypost - May 30, 2009 - 10:07

Looks great by why taxonomy_get_term_data() was removed?

So only taxonomy_term_load_multiple() will be used to get term object?

#25

catch - May 30, 2009 - 10:20

Yeah when taxonomy_term_load() went in (with load hook, before multiple), there were several places in taxonomy module where potentially dozens of terms were getting loaded only to get their titles and similar. _multiple() now covers those situations more efficiently, and _get_term_data() doesn't fire hooks - so you never get a proper term object if you use that. Additionally, if you have _load() and get_term_data() on the same page it causes duplicate database queries. The function is basically the same as taxonomy_get_term() in D6 but it took a couple of patches to get to a point where we could kill it.

#26

andypost - May 30, 2009 - 10:33

So suppose RTBC
I'm checking taxonomy_get_term_data() only used once and just eats memory with static

#27

Berdir - May 30, 2009 - 12:12

Would it make sense to rename taxonomy_get_term_by_name?
We have http://api.drupal.org/api/function/user_load_by_name/7, so taxonomy_term_load_by_name() would be similiar...

Patch looks good, the broken notNull assertion is a good catch :)

#28

stella - May 30, 2009 - 12:30

Patch looks good. Though I wonder if you need to explain the following comment further:

+    // Name matching is case insensitive, note that with some collations
+    // LOWER() and drupal_strtolower() may return different results.

I'd be happy to set to RTBC assuming all tests pass.

#29

andypost - May 31, 2009 - 06:20
Status:needs review» reviewed & tested by the community

#30

Dries - May 31, 2009 - 06:58

This looks good to me, will commit later today. Some of these tests could probably become UnitTests instead of WebTests but we can leave that to a follow-up patch as it probably requires some additional refactoring of the tests.

#31

Dries - June 3, 2009 - 19:35
Status:reviewed & tested by the community» fixed

A few days late, but committed this to CVS HEAD. Thanks!

#32

System Message - June 17, 2009 - 19:40
Status:fixed» closed

Automatically closed -- issue fixed for 2 weeks with no activity.

 
 

Drupal is a registered trademark of Dries Buytaert.