Note: Please view the latest documentation in the README.txt or README.html files in the Recommender API module.

What is Recommender API?

Recommender API provides easy-to-understand, easy-to-use, fully-documented APIs for other Drupal content recommendation modules (e.g., Fivestar Recommender, Ubercart Product Recommender, etc.). It also provides a unified approach to configure & execute the recommender algorithms, and to display results to end users.

What's new in 7.x-3.x?

The 3.x release is a complete rewrite of the API code and is not compatible with earlier releases. The major changes and features are:

  • All complex computation is now done under Apache Mahout (http://mahout.apache.org/), which is much faster than the original PHP implementation. See http://drupal.org/node/1180000 for more details.
  • All end-user display is now done through customizable Views rather than hard-coded blocks. See http://drupal.org/node/673786 for more details.
  • You can isolate the resource-intensive recommender computation from the Drupal production site.
  • It now supports more algorithms (e.g. SVD) provided by Apache Mahout.

Installation, Configuration & Execution

How does Recommender API work?

Conceptually, you need two computers to run Recommender API: the Drupal server that hosts your Drupal site, and the recommender server that computes recommendations. You can run the recommender on the same machine as Drupal if you like, but note that the recommender computation can easily consume all available resources. On the Drupal server, you simply issue a command to run recommenders. On the recommender server, you run the actual recommender program (written in Java), which retrieves the command, computes the results, and saves them back to the Drupal server for display.

Installation

Step 1. Download Apache Mahout version 0.5 or later from http://mahout.apache.org/, and extract it to any directory on the recommender server.

Step 2. Install the Async Command module (http://drupal.org/project/async_command) to your Drupal server under sites/all/modules/async_command. Copy the 'lib' sub-directory and async-command.jar to any folder on the recommender server.

Step 3. Install the Recommender API module (http://drupal.org/project/recommender) to your Drupal server under sites/all/modules/recommender. Copy the 'recommender.jar' file to any folder on the recommender server.

Step 4. Install any helper modules to your Drupal server, such as Browsing History Recommender, Fivestar Recommender, Ubercart Product Recommender, etc.

Configuration & Execution

On your Drupal server, go to admin->configure->recommender to run the various recommenders. Note that this doesn't do any computation; it merely puts the commands in the {async_command} queue to be executed on your recommender server. Remember to grant the "administer recommender" permission too.

On your recommender server, you need to do the following:

Step 1. Create the 'config.properties' file by following the example in the async_command module. This configuration file tells the recommender program ("recommender.jar") how to access the Drupal database directly. The database account doesn't need full privileges. The minimum privileges are:

  • SELECT/UPDATE on {async_command}, {recommender_app}
  • SELECT on {node}, {users}, or other tables (see documentation of helper modules)
  • SELECT/INSERT/UPDATE/DELETE on {recommender_similarity}, {recommender_prediction}, {recommender_preference_staging}

Step 2. Create the 'run.sh' file by following the example in the recommender module. This script sets the required Java CLASSPATH and executes 'recommender.jar'. If you are on a Windows system, please write your own 'run.bat' file.

Step 3. This step is optional. You might want to run 'run.sh' as a cron job on your recommender server. This is different from the cron settings on the Drupal server, which only issue the command. The cron settings could look like this (run every 30 minutes):

 
# in crontab -e, add the following line.
*/30 * * * * flock -n /tmp/recommender.lock recommender/run.sh >> /tmp/recommender.log 2>&1

Similarity vs. prediction

Recommender API offers two types of recommendations, similarity-based and prediction-based, although a given algorithm might implement both or only one (e.g. the SlopeOne algorithm only has prediction-based recommendations).

One type of recommendation is based on the similarity among nodes (or users, or other types of entities). For example, if you are viewing a node, it will recommend other similar nodes. The recommended nodes are the same for this particular node regardless of which user is viewing it. The similarity scores are computed from facts such as these: if two nodes are usually viewed together, or two products are usually purchased together, then the two nodes/products are similar. The helper modules define what information is actually used to compute the similarity scores. The similarity scores range from -1 (completely dissimilar) to +1 (completely similar), and are directional: A being similar to B doesn't mean that B is similar to A.
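The "viewed together" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the computation Mahout performs: the function name cooccurrence_similarity, the sample users and the node paths are all hypothetical, and the score used here (the share of a node's viewers who also viewed the other node) ranges from 0 to 1 rather than from -1 to +1. It does show why a score can be directional.

```python
from collections import defaultdict
from itertools import permutations


def cooccurrence_similarity(histories):
    """Return {(a, b): score}, where score is the share of a's viewers
    who also viewed b.

    `histories` maps each user to the set of nodes that user viewed.
    The score is directional: sim[(a, b)] need not equal sim[(b, a)].
    """
    views = defaultdict(int)     # node -> number of users who viewed it
    together = defaultdict(int)  # (a, b) -> number of users who viewed both
    for nodes in histories.values():
        for node in nodes:
            views[node] += 1
        for a, b in permutations(nodes, 2):
            together[(a, b)] += 1
    return {(a, b): count / views[a] for (a, b), count in together.items()}


# Hypothetical browsing data: three users and the nodes they viewed.
histories = {
    "alice": {"node/1", "node/2"},
    "bob": {"node/1", "node/2", "node/3"},
    "carol": {"node/1"},
}
sim = cooccurrence_similarity(histories)
```

In this sample, everyone who viewed node/2 also viewed node/1, but only two of the three viewers of node/1 viewed node/2, so the score from node/2 to node/1 is higher than the score in the other direction.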

The other type of recommendation is based on "prediction scores", which predict how much a user would like a node. The recommendations are personalized: different users see different recommendations. Each user, however, sees the same recommendations regardless of which page she is viewing. The prediction scores are computed from the user's personal history. For example, if a user purchased products A and B, she might be interested in purchasing C, which is similar to A and B. Exactly what "personal history" to use is defined by the helper modules. (A side note: If you treat users as nodes and nodes as users, you can then predict how much a node would "like" a user. This is useful when you want to promote a node to the most interested users.)
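The "purchased A and B, might like C" example can be sketched as follows. This is a minimal illustration, assuming similarity scores are already available as a mapping from (source, target) pairs to numbers; the function name predict_scores and the sample scores are hypothetical, and the real prediction formula depends on the algorithm chosen in Mahout.

```python
def predict_scores(purchased, similarity):
    """Score each item the user has not purchased yet.

    `similarity` maps (source, target) pairs to similarity scores.
    A candidate's score is the sum of its similarities to purchased items.
    """
    scores = {}
    for (source, target), score in similarity.items():
        if source in purchased and target not in purchased:
            scores[target] = scores.get(target, 0.0) + score
    # Highest predicted interest first.
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical similarity scores among four products.
similarity = {
    ("A", "B"): 0.75,
    ("A", "C"): 0.5,
    ("B", "C"): 0.25,
    ("A", "D"): 0.125,
}
recommendations = predict_scores({"A", "B"}, similarity)
```

Here C ranks first because it is similar to both purchased products, while D is only weakly similar to A. The result depends only on the user's history, not on the page being viewed.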

You need to understand the distinction between similarity and prediction in order to work with Views.

Views support

Recommender API supports Views 3, which is the preferred way to display recommendations. Most helper modules create default Views, which you can simply customize.

However, if you do want to create your own recommender views, here's how:

Step 1: Choose the Views base table, either Recommender Similarity or Recommender Prediction, depending on which type of recommendations you want to show.

Step 2: In "filter criteria", you need to select which recommender application to provide the recommendations. Usually you just need the "Application ID" filter (if you are a helper module developer, please use "Application Name" filter). Use other filters if you want.

Step 3: In "relationships", add a new "Entity ID (Target)" relationship. On the next page, select the entity type of the recommended items. For example, if your recommendations are nodes, use "Content". Also check "Require this relationship".

Step 4: In "contextual filters" (a.k.a. "Arguments"), add "Entity ID (Source)". This is the entity for which the recommendations are made. If your recommendations are made for the current user, this is the user ID of the current user. Usually you want to provide a default value of either the current node or the current user.

Step 5: Add "fields", "sort criteria", or make other Views settings as you see fit. When you sort by similarity scores or prediction scores, choose "descending".

Recommender Algorithms Explained

User-user vs. item-item

The two most popular recommender algorithms are user-user and item-item. The user-user algorithm first computes similarities among users based on the users' history records (such as purchasing history, node browsing history, etc.), and then predicts how much a user likes an item based on how much similar users like that item. The item-item algorithm first computes similarities among items based on some information (e.g., the items are always purchased together, the items are always given the same ratings, etc.), and then predicts how much a user likes an item based on how much the user likes similar items.
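The user-user variant can be sketched as below. This is a simplified illustration rather than Mahout's implementation: it assumes explicit numeric ratings, uses cosine similarity between users, and predicts with a similarity-weighted average. The names cosine and predict_user_user and the sample ratings are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two rating dicts (item -> rating)."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)


def predict_user_user(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    weighted, total = 0.0, 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        weight = cosine(ratings[user], other_ratings)
        weighted += weight * other_ratings[item]
        total += weight
    return weighted / total if total else None


# Hypothetical ratings: u1 rates like u2, and unlike u3.
ratings = {
    "u1": {"A": 5, "B": 3},
    "u2": {"A": 5, "B": 3, "C": 4},
    "u3": {"A": 1, "B": 5, "C": 2},
}
prediction = predict_user_user(ratings, "u1", "C")
```

The prediction for u1 lands closer to u2's rating of C than to u3's, because u1's history resembles u2's more. An item-item sketch would be the mirror image: swap the roles of users and items in the ratings mapping, compute similarities among items, and average over the items the user has already rated.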

Academic research shows that the item-item algorithm usually works better than user-user. Amazon.com uses the item-item algorithm in its recommender system.

SlopeOne

The advantage of SlopeOne is performance. However, it doesn't compute similarity scores, and I don't know of many real systems that use this algorithm. (Note: This algorithm will be added later.)
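For readers unfamiliar with the algorithm, here is a small sketch of the weighted Slope One scheme as it is commonly described. It is not the Mahout code, and the function name slope_one_predict and the sample ratings are hypothetical. The idea is to predict a rating from the average rating differences between item pairs, which is why no similarity scores are needed.

```python
from collections import defaultdict


def slope_one_predict(ratings, user, target):
    """Weighted Slope One prediction of `user`'s rating for `target`."""
    diff_sum = defaultdict(float)  # item -> sum of (rating of target - rating of item)
    count = defaultdict(int)       # item -> number of users who rated both
    for other_ratings in ratings.values():
        if target not in other_ratings:
            continue
        for item, rating in other_ratings.items():
            if item != target:
                diff_sum[item] += other_ratings[target] - rating
                count[item] += 1
    numerator, denominator = 0.0, 0
    for item, rating in ratings[user].items():
        if item != target and count[item]:
            avg_diff = diff_sum[item] / count[item]
            # Weight each estimate by how many users support it.
            numerator += (rating + avg_diff) * count[item]
            denominator += count[item]
    return numerator / denominator if denominator else None


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
prediction = slope_one_predict(ratings, "lucy", "A")
```

In this sample, A is rated 0.5 higher than B on average (two users) and 3 higher than C (one user), so Lucy's predicted rating for A is (2.5 * 2 + 8 * 1) / 3, about 4.33.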

SVD

This algorithm worked really well in the Netflix Prize (http://www.netflixprize.com/). It is especially useful when you have sparse datasets. (Note: This algorithm will be added later.)

For Developers

(To be added.)

For now, see the comments in the recommender.module file.

FAQ

Why not use REST to access Apache Mahout?

Apache Mahout provides REST access. However, this module chooses not to use it for the following reasons:

  • Each recommender application (Fivestar Recommender, Browsing History Recommender, etc.) would require an independent Mahout REST instance, which involves lots of administration overhead.
  • Even though we can use the REST interface to query recommendations, Mahout still requires direct database access through its JDBCDataModel.
  • The recommender algorithms usually require access to entire database tables all at once. Direct database access is much more efficient for this than REST.

What happened to the mouse/cheese metaphor used in the 6.x-2.x release?

The mouse/cheese metaphor was used for two reasons. First, it's more lively than the user/item terminology. Second, from a programming perspective, users and items are usually interchangeable, so a "mouse" can act as a user at one time and as an item at another, and the same goes for "cheese". But Mahout adopts the user/item terminology, and it handles user/item interchangeability through its class hierarchy. To avoid confusion in 3.x, I'm not using the mouse/cheese metaphor anymore.

Is there a cloud service alternative?

We will launch a cloud service shortly.

Where can I find more documentation and support?

The HTML version of this documentation will be posted to http://drupal.org/documentation. You can use rst2html to generate HTML too.

For bug reports, new feature requests, and all other requests, please submit issues at http://drupal.org/project/issues/recommender.

If you need customization or consulting services, please contact the author at danithaca@gmail.com.

Comments

ñull wrote:

I tried to set up my own recommender server, but ran into a problem that nobody seems able to solve. A cloud service would be very welcome (too). It is now almost 3 years since this promise was written. Any news?