Refine the computePrediction() logic for the CorrelationRecommender algorithm
danithaca - June 5, 2009 - 16:11
| Project: | Recommender API |
| Version: | 6.x-2.x-dev |
| Component: | Code |
| Category: | bug report |
| Priority: | critical |
| Assigned: | danithaca |
| Status: | active |
Description
The code works but looks ugly. Need to clean it up a little bit.
Also, it doesn't support the (Resnick 94) variant, which computes each user's mean over the co-voted items only. I think my algorithm should work fine, but it's better to provide that variation as well.
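
To make the difference between the two mean computations concrete, here is a minimal sketch in Python (not the module's PHP code). The names `correlation`, `predict` and `covoted_only` are hypothetical, and it assumes a standard mean-centered, correlation-weighted prediction; the module's actual computePrediction() may differ.

```python
import math

def _mean(values):
    values = list(values)
    return sum(values) / len(values)

def correlation(a, b, covoted_only=True):
    """Pearson-style weight between two users' rating dicts (item -> rating).

    covoted_only=True takes each mean over the co-voted items only;
    False takes each mean over all items the user rated.
    Returns (weight, mean_a, mean_b), or None if no weight can be computed.
    """
    common = [i for i in a if i in b]
    if len(common) < 2:
        return None
    if covoted_only:
        mean_a = _mean(a[i] for i in common)
        mean_b = _mean(b[i] for i in common)
    else:
        mean_a, mean_b = _mean(a.values()), _mean(b.values())
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = math.sqrt(sum((a[i] - mean_a) ** 2 for i in common)
                    * sum((b[i] - mean_b) ** 2 for i in common))
    if den == 0:
        return None
    return num / den, mean_a, mean_b

def predict(ratings, target, item, covoted_only=True):
    """Target's overall mean plus the weighted average of neighbors' deviations.

    Assumption: the base term is the target's mean over all of their ratings.
    """
    num = den = 0.0
    for user, votes in ratings.items():
        if user == target or item not in votes:
            continue
        result = correlation(ratings[target], votes, covoted_only)
        if result is None:
            continue
        weight, _, neighbor_mean = result
        num += weight * (votes[item] - neighbor_mean)
        den += abs(weight)
    if den == 0:
        return None
    return _mean(ratings[target].values()) + num / den

ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 3, "d": 5},
    "carol": {"a": 1, "b": 5, "d": 2},
}
covoted = predict(ratings, "alice", "d", covoted_only=True)
overall = predict(ratings, "alice", "d", covoted_only=False)
```

On this toy data the two settings give different predictions for the same user and item. Note that neither is clamped to the rating scale.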

#1
Also, implement Amazon's adjusted tweak (Linden, 2003)
#2
Fixed the (Resnick, 94) problem as a side effect of #483102: Internalize NAN support to the Vector/Matrix class.
However, computePrediction() still needs more work.
#3
Not fixed. Should work on it again.
#4
Have to fix this and make it more efficient (maybe limit it to the k nearest neighbors); otherwise it will take too much time.
On a site with 50 users and 500 nodes, computeSimilarity() took <10 seconds, while computePrediction() took 5-6 minutes.
#5
The computePrediction() logic probably requires a new architecture. The key to improving performance is to skip unnecessary computations, for example by using kNN (k-nearest-neighbor).
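
A rough sketch of the kNN idea, written in Python rather than the module's PHP. The function name `knn_predict`, the `similarity` layout and the default `k` are all hypothetical. It assumes similarities were already computed (as computeSimilarity() would do) and that a prediction uses only the k neighbors with the largest absolute similarity who rated the item.

```python
import heapq

def knn_predict(target, item, ratings, similarity, k=2):
    """Predict target's rating for item from at most k nearest neighbors.

    ratings:    user -> {item: rating}
    similarity: user -> {other_user: weight}, precomputed elsewhere
    """
    def mean(user):
        votes = ratings[user].values()
        return sum(votes) / len(votes)

    # Only users who actually rated the item can contribute.
    candidates = [(weight, user)
                  for user, weight in similarity[target].items()
                  if item in ratings[user]]
    # Keep the k strongest neighbors instead of summing over everyone.
    neighbors = heapq.nlargest(k, candidates, key=lambda pair: abs(pair[0]))

    den = sum(abs(weight) for weight, _ in neighbors)
    if den == 0:
        return None
    num = sum(weight * (ratings[user][item] - mean(user))
              for weight, user in neighbors)
    return mean(target) + num / den

ratings = {
    "alice": {"a": 4, "b": 2},
    "bob": {"a": 5, "d": 3},
    "carol": {"b": 1, "d": 5},
    "dave": {"a": 4, "d": 2},
}
similarity = {"alice": {"bob": 0.9, "carol": -0.5, "dave": 0.1}}
top2 = knn_predict("alice", "d", ratings, similarity, k=2)
top1 = knn_predict("alice", "d", ratings, similarity, k=1)
```

With k fixed, the cost of one prediction depends on k rather than on the total number of users. Picking the neighbors still scans the candidates once.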
#6
Implemented the kNN algorithm. It could work faster, but the bug still exists and the code is error-prone. Needs refactoring.
#7
we re-used the 'lowerbound' setting for computeSimilarity(). maybe we should use a new config param for it.
#8
I suspect you're going to need to come up with some sort of API to allow offloaded batch computation. That way people can hook up multicore processors on the LAN to do the matrix computations.
Are you using singular value decomposition?
#9
Thanks for the comments.
I'm not sure whether PHP supports multiprocessing. For high-performance computing, I'm thinking of outsourcing to Apache Mahout or other Java/C++ implementations. This module will remain logically simple and provide an interface for 3rd-party app integration.
SVD is the next algorithm I'm going to develop.