Currently, Recommender API requires direct database access. This is a design choice made for two reasons: 1) Mahout requires direct database access, and 2) direct access performs better. However, when deploying Recommender API on a remote server, people often can't give it direct database access because of technical limitations or security concerns. In those cases, the solution is to transfer the required Drupal data to another database that Recommender API can access directly.
Some Drupal modules already handle data transfer:
1. http://drupal.org/project/services: Drupal publishes data for others to pull.
2. http://drupal.org/project/feeds: Drupal pulls data from another data source.
What we need here is for Drupal to push data to another database. This seems to be missing in Drupal. Some possible solutions:
The first approach is to have Drupal output data and have the remote RecAPI pull it:
- Drupal uses the services module to publish data, and RecAPI pulls it.
- Drupal uses the Backup and Migrate module to dump data, then transfers it to RecAPI.
- Drupal uses Views or RSS to export data, and RecAPI pulls it.
The second approach is to use database replication/synchronization tools to transfer data between Drupal and the database RecAPI will use. Such tools include:
- http://symmetricds.codehaus.org
- http://dbreplicator.org
- http://opensource.replicator.daffodilsw.com/
- https://www.forge.funambol.org/download/
The third approach is to have Drupal push data directly to RecAPI:
- Could look at the framework in http://drupal.org/project/apachesolr on how data is pushed to another server.
- Write a new Drupal module which uses XMLRPC or REST to call RecAPI web service and push data to it.
- Could use web analytics (GA, Piwik, etc.) to transfer, say, browsing/purchasing data directly from the client to RecAPI.
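To make the REST variant of the third approach more concrete, here is a rough sketch of pushing preference records in batches as JSON. It is written in Python for brevity; an actual Drupal module would be PHP. The endpoint URL, the `preferences` payload key, and the record fields (`uid`, `nid`, `score`) are all hypothetical, since RecAPI does not define such a web service here. The sketch only builds the requests and leaves sending to the caller.

```python
import json
import urllib.request


def batch(records, size):
    """Yield successive chunks of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def build_push_requests(url, records, size=500):
    """Build one JSON POST request per batch (nothing is sent here)."""
    requests = []
    for chunk in batch(records, size):
        # "preferences" is a made-up payload key for illustration only.
        body = json.dumps({"preferences": chunk}).encode("utf-8")
        requests.append(urllib.request.Request(
            url,
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        ))
    return requests
```

Each returned request could then be passed to `urllib.request.urlopen`. Batching keeps every HTTP call small, which matters if the Drupal site runs the push from cron with a limited execution time.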
I'm still investigating which solution is best here.
Comments
Comment #1
danithaca commented
I'll first look at the third approach, and then the second approach.
If you have suggestions, please comment.
Comment #2
danithaca commented
Some more thoughts:
So the most promising approach seems to be the first one.
Comment #3
danithaca commented
Looks like http://dbreplicator.org is a fork/successor of http://opensource.replicator.daffodilsw.com/. Both use java.rmi to transfer data, so they are basically not applicable here. https://www.forge.funambol.org/download/ is very complicated. For the second approach, http://symmetricds.codehaus.org looks like the only possible solution.
Another promising approach is simply to generate CSV files and send both the input and output files automatically via HTTP. This could be the most effective way, and also a simple one.
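As a minimal sketch of the CSV-over-HTTP idea, the following Python snippet serializes preference rows to CSV and wraps them in an HTTP POST request. The column names (`user_id`, `item_id`, `score`) and the upload URL are assumptions made for illustration, not an existing RecAPI format or endpoint.

```python
import csv
import io
import urllib.request


def preferences_to_csv(rows):
    """Serialize (user_id, item_id, score) tuples to CSV text with a header."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["user_id", "item_id", "score"])
    writer.writerows(rows)
    return buf.getvalue()


def build_upload_request(url, csv_text):
    """Build (but do not send) an HTTP POST carrying the CSV as its body."""
    return urllib.request.Request(
        url,
        data=csv_text.encode("utf-8"),
        headers={"Content-Type": "text/csv; charset=utf-8"},
        method="POST",
    )


if __name__ == "__main__":
    text = preferences_to_csv([(1, 10, 5.0), (2, 10, 3.0)])
    # Hypothetical endpoint; urllib.request.urlopen(req) would send the file.
    req = build_upload_request("https://recapi.example.com/upload", text)
    print(req.get_method(), req.full_url)
```

The same pair of functions would work in the other direction for the output file: the remote side writes its recommendations as CSV and the Drupal side downloads and imports them.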
Comment #4
mikeytown2 commented
Something to think about:
http://drupal.org/project/httprl
Comment #5
danithaca commented
This is done. See #1238572: create cloud service to help people use the Recommender modules. for details on how to use the cloud service.