I'm going to try and explain this as best I can, so bear with me here.
I have a lot of Taxonomy vocabularies and terms and so forth (one node type has 10 vocabularies with 5-20 terms each), and I exposed each vocabulary in the hopes of creating a taxonomy search-like behavior with Views. Everything was fine until my site went offline. After a bit of investigation, it seemed that the queries were taking too long and were locking up my MySQL database, causing it to refuse connections.
I've looked far and wide on Drupal.org for a taxonomy browser (I've tried Taxonomy Browser, Faceted Search, Views Exposed, and more), and all of them seem to crash or bog down my database.
Am I just using too many terms? Would caching improve the database stability (or not, because each possible combination has to be cached?) If Views can't help me here, can someone recommend another module that does reliable taxonomy browsing/search?
I'd greatly appreciate any help on this.
Comments
Comment #1
dawehner
The more filters you use, the slower the query gets; that's unavoidable.
I don't think a cache will really help you, because you have too many possible combinations.
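To put a rough number on "too many possible combinations", here is a minimal sketch. It assumes each of the 10 vocabularies is exposed as a single-select filter with one extra "Any" choice, and uses the 5-20 terms per vocabulary from the original post as the low and high ends; the function name is made up for illustration.

```python
from math import prod

def filter_combinations(term_counts):
    """Count distinct filter states, treating 'Any' as one extra choice
    per vocabulary (an assumption about how the filters are exposed)."""
    return prod(n + 1 for n in term_counts)

# 10 vocabularies, all at the low end (5 terms) or the high end (20 terms).
low = filter_combinations([5] * 10)
high = filter_combinations([20] * 10)

print(f"5 terms each:  {low:,} possible filter states")
print(f"20 terms each: {high:,} possible filter states")
```

Even at the low end that is tens of millions of distinct result sets, so a per-combination cache would rarely be hit twice.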
Additionally, taxonomy is not the best choice by default if you need complex filtering: its data structure is not simple, so it needs some extra joins to achieve what you need.
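The "extra joins" point can be sketched roughly like this. The schema is a heavily simplified stand-in for Drupal's `node` and `term_node` tables, and `build_filter_query` is a hypothetical helper, not what Views actually generates; it only shows that requiring several terms at once means one more join per term.

```python
import sqlite3

def build_filter_query(selected_tids):
    """Build a query that requires a node to carry every selected term.

    One extra JOIN on term_node is added per selected term, which is
    roughly the shape AND-combined term filters take.
    """
    joins = []
    params = []
    for i, tid in enumerate(selected_tids):
        joins.append(
            f"INNER JOIN term_node tn{i} ON tn{i}.nid = n.nid AND tn{i}.tid = ?"
        )
        params.append(tid)
    sql = "SELECT n.nid, n.title FROM node n " + " ".join(joins) + " ORDER BY n.nid"
    return sql, params

# Simplified, assumed schema: real Drupal tables have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE term_node (nid INTEGER, tid INTEGER);
    INSERT INTO node VALUES (1, 'First'), (2, 'Second'), (3, 'Third');
    INSERT INTO term_node VALUES (1, 10), (1, 20), (1, 30),
                                 (2, 10), (2, 20),
                                 (3, 10);
""")

sql, params = build_filter_query([10, 20, 30])
rows = conn.execute(sql, params).fetchall()
print(sql.count("JOIN"), "joins ->", rows)
```

With ten exposed vocabularies, a fully filled-in form would mean ten such joins against the same table in a single query.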
Comment #2
SpikeX commented
Well, I originally picked Drupal because of the categorization options available in the Taxonomy module... and now I'm finding out that I've overused the taxonomy system... great.
Anyone else have any ideas? Or am I pretty much stuck here?
Comment #3
dawehner
That's just my opinion.
If you want, you could ask Earl, too. But I think he will have some taxonomy horror stories for you ;)
Comment #4
SpikeX commented
Uh... who's "Earl"?
Comment #5
dawehner
Good question :) I think you should find that out for yourself.
Could you also post the exact slow query?
Comment #6
merlinofchaos commented
Sorry, I'm Earl, the author of the module.
Taxonomy is a wonderful module, but it is, sadly, ridiculously easy to overuse. And the way its data is stored is NOT conducive to high-performing, complex queries. Your experience with other modules shows that the problem is fundamental to taxonomy.
Fixing this is probably going to mean re-architecting how you're using your data. I am unable to give any advice without understanding what you're doing, but my general advice would be this:
If you are using taxonomy as a way to simply add select fields to a node, you should be using CCK for that.
If you are using taxonomy for 'free tagging' that's a good use of taxonomy.
If you are using taxonomy for a hierarchical categorization system, that's a good use of taxonomy.
You should try not to use more than one hierarchical categorization system if you can avoid it.
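The first point above (simple select fields belong in CCK rather than taxonomy) can be illustrated with a rough sketch. The tables are simplified assumptions: `term_node` stands in for taxonomy's one-row-per-term storage, and `content_type_item` with its `field_*_value` columns is a hypothetical CCK-style table with one row per node. The term ids and field names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT);
    -- Taxonomy style: one row per (node, term) pair.
    CREATE TABLE term_node (nid INTEGER, tid INTEGER);
    -- CCK style (assumed layout): one row per node, one column per field.
    CREATE TABLE content_type_item (
        nid INTEGER PRIMARY KEY,
        field_kind_value TEXT,
        field_status_value TEXT
    );
    INSERT INTO node VALUES (1, 'First'), (2, 'Second');
    -- tid 10 = kind A, tid 20 = status open, tid 21 = status closed
    INSERT INTO term_node VALUES (1, 10), (1, 20), (2, 10), (2, 21);
    INSERT INTO content_type_item VALUES (1, 'A', 'open'), (2, 'A', 'closed');
""")

# Two taxonomy filters: two joins against term_node.
taxonomy_sql = """
    SELECT n.nid FROM node n
    INNER JOIN term_node k ON k.nid = n.nid AND k.tid = 10
    INNER JOIN term_node s ON s.nid = n.nid AND s.tid = 20
"""
# Two select-field filters: one join, then plain column comparisons.
cck_sql = """
    SELECT n.nid FROM node n
    INNER JOIN content_type_item c ON c.nid = n.nid
    WHERE c.field_kind_value = 'A' AND c.field_status_value = 'open'
"""

taxonomy_rows = conn.execute(taxonomy_sql).fetchall()
cck_rows = conn.execute(cck_sql).fetchall()
print(taxonomy_rows, cck_rows)
```

Both queries find the same node, but the column-based version keeps a single join no matter how many select fields are filtered, while the taxonomy version grows by one join per filter.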
Comment #7
SpikeX commented
On a particular content type I have maybe 12 vocabularies, each with 5-12 terms in it... and with Views (or another system), could this present a problem when filtering?
Would it be wiser to try and lump all of my tags into one vocabulary? In your opinion do you think that would cut down on the "weight" of the query, and speed things up, allowing me to use things like Views Exposed Filters (or another similar module)?
Thanks for your reply.
Comment #8
SpikeX commented
Uh... bump, I guess? I'm kind of at a loss for what to do here. My users are asking me for a way to search through "tags", as they're called (just taxonomy/vocabulary items), and I really don't know what I'm doing wrong. Are more vocabularies with fewer terms better? Or is one vocabulary with a lot of terms better?
This may be a stupid question, but... there isn't an alternative to Taxonomy that would allow me to search/filter by taxonomy tags, is there? Obviously Taxonomy comes with Drupal, but if it's not the solution I'm looking for, is there a better one?
I'm just a little frustrated that after trying three different taxonomy search/filter modules and having them fail, that I can't figure out what I'm doing wrong. Any advice would be appreciated.
Comment #9
merlinofchaos commented
You might be able to get better performance by putting a number of your tags in the same vocabulary with a hierarchy, so that what you're currently using as a vocabulary might be a term. However, utilizing depth in your filtering could also make things slow.
Another option might be to use the content_taxonomy module, which will let you use your vocabularies as CCK fields. For a lot of purposes, CCK select fields are vastly superior to taxonomy if you're just adding single-select values to a node.
It's hard to provide better advice without some understanding of what you're really using these vocabularies for.
Comment #10
SpikeX commented
It's largely categorization... if a node is a certain type (A vs B vs C), except there are like 5 or 6 vocabularies like that for some content types. And then I have a large vocabulary dedicated to "tagging" posts, as well. So kind of a split between categorization and tagging (tagging being largely for search purposes, which again, I'm having a seemingly hard time with).
So you're saying that for the categorization items (not the tag items), content_taxonomy may be better? My site is largely Views-powered (hooray!), so I'm assuming it's fully compatible with it?
Comment #11
danieldd commented
SpikeX, I have also made a lot of use of tags with Views.
I think Earl's suggestion is: wherever you have a vocabulary that is just a simple series of options (i.e. A vs B vs C), do these as CCK fields rather than taxonomy terms. As you've probably discovered, you can set allowed values for CCK fields, so you can set exactly what terms you allow users to enter here. You can also use Content Taxonomy for this purpose, although this may add a bit of complexity if you are unused to the module. If you move most of your vocabularies to this setup, I imagine it will greatly improve performance.
I also have a related taxonomy performance question myself, if someone can help. I have a large site (>50k nodes) that uses 2 hierarchical taxonomy vocabularies. As you can imagine, performance is not great, and I get round this to some extent through extensive use of caching. However, I'm wondering if it would improve performance if I deleted tags from some nodes. Currently a very large number of nodes are categorised as "other" in one of my vocabularies, because I have set this taxonomy as "required". I figure I could delete these tags without losing any valuable data, but I only want to do this if there is some benefit to performance. Would welcome some advice! Thanks
Comment #12
merlinofchaos commented
I believe that yes, you would improve performance, because the number of records that have to be searched in the term_node table would decrease dramatically.
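As a rough illustration of the row-count effect, here is a sketch against a simplified stand-in for the term-to-node table (called `term_node` here). The term id `OTHER_TID`, the 1000 nodes, and the 4-in-5 share of "other" tags are all invented numbers. On a real site you would presumably remove the term through Drupal rather than by deleting rows directly.

```python
import sqlite3

OTHER_TID = 99  # hypothetical term id for the catch-all "other" term

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE term_node (nid INTEGER, tid INTEGER)")

# 1000 nodes; assume 4 out of every 5 carry only the "other" term.
conn.executemany(
    "INSERT INTO term_node VALUES (?, ?)",
    [(nid, OTHER_TID if nid % 5 else 1) for nid in range(1, 1001)],
)

def row_count(connection):
    """Return how many (node, term) rows a term filter has to search."""
    return connection.execute("SELECT COUNT(*) FROM term_node").fetchone()[0]

before = row_count(conn)
conn.execute("DELETE FROM term_node WHERE tid = ?", (OTHER_TID,))
after = row_count(conn)

print(f"term_node rows: {before} -> {after}")
```

Every join against this table now has far fewer rows to scan, which is the gain described above; how much it helps in practice depends on indexes and the actual queries.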
Comment #13
Letharion commented
Two useful answers, no reply from questioner for many months. Closing as fixed. Please re-open if this is still an issue.
Comment #14
inforeto commented
I alleviated a problem of this kind by using Content Taxonomy.
With that, you pass the load on to the CCK filters in Views.
Further gains are possible if you set the option that stops the nodes from being saved to the term table.
This leaves the term table free for the work of building the term selections, which can be slow by itself.
Also use Hierarchical Select for AJAX filtering, which may save repeated calls.