CRE with Vote api modules

stephencarr - January 30, 2008 - 02:20
Project: Content Recommendation Engine
Version: 5.x-1.x-dev
Component: Code
Category: support request
Priority: normal
Assigned: Unassigned
Status: active
Description

When using CRE with a vote API module such as updown on a site with more than 1000 nodes, an excessive number of SQL queries are made. I saw over 3000 queries run through the CRE hooks per vote. The CRE matrix table in my DB was over 75MB, three times the size of the rest of my DB tables put together. Needless to say, as interesting as CRE is, I had to remove it, which is a shame because it would be quite useful if it worked properly.

Perhaps it makes sense on smaller sites, but if you are looking for an efficient recommendation/sorting module, this is not the one. Unfortunately, there really are no other options at this time.

#1

Scott Reynolds - January 30, 2008 - 02:32
Category: bug report » support request
Status: active » closed

Any voting-based recommendation algorithm will do this. This is why model-based recommendation algorithms are becoming more popular than memory-based ones (which is what this is).

Please don't flood the issue queue with nothing...
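
As a rough illustration of that cost, here is a minimal Python sketch (not CRE's actual code) of a memory-based scheme that keeps a running sum and count for every pair of items, similar in shape to the sum and count columns of the cre_similarity_matrix table queried later in this thread. The names record_vote, matrix and ratings are made up for the example, and it assumes each new vote updates one entry per direction for every other item the same user has voted on.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a pairwise similarity table:
# (item_a, item_b) -> [sum of rating differences, number of co-ratings]
matrix = defaultdict(lambda: [0, 0])
# user -> {item: rating}
ratings = defaultdict(dict)


def record_vote(user, item, value):
    """Store a vote, update every pair it touches, return the number of pair writes.

    Simplification: a changed vote on an already-rated item is not handled.
    """
    writes = 0
    for other, other_value in ratings[user].items():
        if other == item:
            continue
        # One entry per direction, e.g. (A, B) and (B, A).
        for a, b, diff in ((item, other, value - other_value),
                           (other, item, other_value - value)):
            cell = matrix[(a, b)]
            cell[0] += diff
            cell[1] += 1
            writes += 1
    ratings[user][item] = value
    return writes


# One user votes on 100 different nodes, one after another.
writes_per_vote = [record_vote("u1", nid, 1) for nid in range(1, 101)]
print(writes_per_vote[0], writes_per_vote[-1], len(matrix))
```

In this sketch the user's 100th vote touches 198 pair entries. If each entry is written with its own SQL statement, the number of queries per vote grows with the number of items that user has already voted on, and the matrix grows with the number of co-rated pairs.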

#2

patchak - February 14, 2008 - 08:49
Version: 5.x-1.0 » 5.x-1.x-dev
Priority: critical » normal
Status: closed » active

Hey there Scott. I have been using CRE on my site for a couple of months, and I have more than 20,000 nodes at this point. Voting is really slow, and I think it is related to the number of queries mentioned here.

My question is this: I know you have a lot of nodes in your test DBs for this module, so I was wondering if you could give me some pointers on how you managed to keep your site fast and keep votes being recorded quickly, even with an enormous number of nodes and votes.

Thanks,
Patchak

#3

Scott Reynolds - February 14, 2008 - 13:38

For an immediate fix, you could change the votingapi settings to calculate votes on cron.

I can probably change it so that a single query adds all the records. It will be a sizeable query.

And I would like to add an option to allow the similarity between objects to be calculated during cron run instead of as soon as a vote is posted.
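
To make the two ideas above concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is only an illustration under assumptions, not the module's code: the tables votes and similarity, their columns, and the functions on_vote and on_cron are all hypothetical. Posting a vote costs one INSERT, and the pairwise sums and counts are rebuilt by one large INSERT ... SELECT when cron runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema, not the real votingapi/CRE tables.
conn.executescript("""
CREATE TABLE votes (uid INTEGER, content_id INTEGER, value INTEGER);
CREATE TABLE similarity (id1 INTEGER, id2 INTEGER,
                         diff_sum INTEGER, vote_count INTEGER);
""")


def on_vote(uid, content_id, value):
    """Record the vote only; no similarity work happens here."""
    conn.execute("INSERT INTO votes VALUES (?, ?, ?)", (uid, content_id, value))


def on_cron():
    """Rebuild the whole pairwise table with a single sizeable query."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM similarity")
        conn.execute("""
            INSERT INTO similarity (id1, id2, diff_sum, vote_count)
            SELECT a.content_id, b.content_id,
                   SUM(a.value - b.value), COUNT(*)
            FROM votes a
            JOIN votes b
              ON a.uid = b.uid AND a.content_id <> b.content_id
            GROUP BY a.content_id, b.content_id
        """)
    return conn.execute("SELECT COUNT(*) FROM similarity").fetchone()[0]


on_vote(1, 10, 1)
on_vote(1, 11, -1)
on_vote(2, 10, 1)
on_vote(2, 11, 1)
pairs = on_cron()
row = conn.execute(
    "SELECT diff_sum, vote_count FROM similarity WHERE id1 = 10 AND id2 = 11"
).fetchone()
print(pairs, row)
```

A full rebuild is the simplest choice for a sketch. An incremental version that only processes votes posted since the last cron run would do less work per run but needs extra bookkeeping.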

#4

patchak - February 22, 2008 - 18:55

Hey there Scott, those would be really cool additions. Do you have some time to work on this? I might be able to sponsor some of this work.

Thanks,
patchak

#5

patchak - February 27, 2008 - 18:36

Hey there Scott.

I installed the devel module because I thought that my site was pretty slow.

Check out this query:

27575.51 1 cre_top SELECT d.content_id1 as 'content_id',ABS(d.sum/d.count) as 'score',n.title FROM cre_similarity_matrix d, votingapi_vote r, node n WHERE d.content_type1 = 'node' AND r.uid = 1 AND r.value = 1 AND d.content_id2 = r.content_id AND n.nid = d.content_id1 AND n.uid <> 1 AND r.tag = 'vote' AND n.created >= 1202408995 GROUP BY d.content_id1 ORDER BY score ASC

The first number is the execution time. This is for the home page of the site, not even the recommendations page. Why is this query run on the home page?

Would it be possible to use or create a caching mechanism that would simply cache the recommendations until the next calculation, at cron time for example?

Like I said, I really need a faster solution and am willing to pay for it to happen. Would you be interested in working on this, and do you have the time?
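
The kind of cache meant here could look roughly like the following Python sketch. It is a hypothetical illustration, not an existing Drupal or CRE API: RecommendationCache, slow_recommendations and on_cron are invented names, and the expensive query is replaced by a stand-in function.

```python
class RecommendationCache:
    """Keep each user's recommendations until the next cron run."""

    def __init__(self, compute):
        self._compute = compute  # expensive function: uid -> list of items
        self._cache = {}

    def get(self, uid):
        # Only run the expensive computation on a cache miss.
        if uid not in self._cache:
            self._cache[uid] = self._compute(uid)
        return self._cache[uid]

    def on_cron(self):
        # Drop everything so results are recomputed after the next calculation.
        self._cache.clear()


calls = []


def slow_recommendations(uid):
    """Stand-in for the slow recommendation query; records each call."""
    calls.append(uid)
    return [f"node-{uid}-{i}" for i in range(3)]


cache = RecommendationCache(slow_recommendations)
first = cache.get(1)    # computed
second = cache.get(1)   # served from the cache
cache.on_cron()
third = cache.get(1)    # computed again after cron
print(len(calls))
```

With this approach every page view between two cron runs reuses the stored list, so the slow query runs at most once per user per cron interval.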

Thanks,
Patchak

#6

patchak - February 27, 2008 - 23:01

As a follow-up: I just realised that those queries are made on the front page and other pages because I enabled the 'recommended content' block on those pages. I will remove the block for now, until we can find a way to cache those recommendations...

Patchak

#7

patchak - March 4, 2008 - 17:39

Hey there Scott,

I just posted a bounty to enhance the performance of the CRE module here:

http://drupal.org/node/229726

Just to let you know!
Patchak

#8

patchak - March 4, 2008 - 17:39

Hey there Scott,

I just posted a bounty to enhance the performance of the CRE module here:

http://drupal.org/node/229726

Just to let you know, and to see if you are interested in taking on the task!

Patchak

#9

funana - May 11, 2008 - 03:46

+1.

Very nice idea but open buffet syndrome...

@patchak: As you said that it's a problem with the block, did you already try to use blockcache module to cache the block?

#10

hickory - July 8, 2008 - 14:16

"And I would like to add an option to allow the similarity between objects to be calculated during cron run instead of as soon as a vote is posted."

This would be very useful - there's rarely a need to get this information straight away, in my case.

 
 
