Right now Memetracker depends on the Python library Pycluster. Pycluster is a wrapper for a C library Cluster. Cluster is what actually does the hard work of finding memes.
Many people run Drupal on shared hosts or other environments where they can't install the above dependencies. This dependency could be removed by writing a C program which uses the Cluster library directly. This would simplify installation of Memetracker significantly and so enlarge the group of potential users. I didn't go this route initially as I don't know C but I do know Python.
It'd be great if someone ported the cluster.py script (found in machinelearningapi) to a straight C program. It'd need to be compiled for both Linux and Windows. If you know C, this shouldn't be a hard job. Contact me (Kyle Mathews) if you are capable of and interested in converting cluster.py to C.
Comments
Comment #1
SeanBannister commented
I'd be really interested to see the dependency on Python removed; however, I don't know C. I haven't looked at Pycluster. Would this be hard to rewrite in PHP? Just looking at the options.
Comment #2
dannz commented
"Many people run Drupal on shared hosts or other environments where they can't install the above dependencies."
I'd be really interested in seeing Drupal make more use of Python libraries, so rather than removing the dependency, my suggestion would be to look at how to make this usable by people on shared hosting which doesn't offer Python. One solution might be to use a virtual Python environment (http://pypi.python.org/pypi/virtualenv#what-it-does). The suggestion then is a Drupal module (or utility) which people on shared hosts could use, and which would perform a simple setup of virtualenv.
OK, so installing a virtual Python just for memetracker might seem a bit overboard (it uses several MB), but the same environment could be used by other modules, bringing all the Python goodies to Drupal.
At a guess I wouldn't think it too hard to write a Drupal module which bundles a build of virtualenv. This could then be extracted and used to run a py script to make a proper install, and then remove the temp copy of virtualenv in the (not so secure) public modules directory. It may perhaps be easier than a rewrite of memetracker in C or PHP.
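As a rough sketch of the "simple setup" step described above: this assumes the standard-library venv module (available since Python 3.3) instead of a bundled build of the virtualenv package linked above, so it is a swapped-in technique and not the exact approach proposed. The function name create_env and the directory name memetracker_env are hypothetical, chosen only for illustration.

```python
import os
import tempfile
import venv


def create_env(target_dir):
    """Create an isolated Python environment in target_dir.

    Returns the expected path of the environment's interpreter.
    """
    # Assumption: with_pip=False keeps this sketch offline and quick.
    # A real installer would probably want pip available so that it
    # could install Pycluster into the environment afterwards.
    venv.create(target_dir, with_pip=False)

    # The interpreter lives in Scripts\ on Windows and bin/ elsewhere.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    return os.path.join(target_dir, bin_dir, exe)


if __name__ == "__main__":
    # Hypothetical location; a module would pick a directory outside
    # the public modules directory.
    env_dir = os.path.join(tempfile.mkdtemp(), "memetracker_env")
    print(create_env(env_dir))
```

A module could then run cluster.py with the returned interpreter path, so the host's system-wide Python packages are left untouched.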
Comment #3
kyle_mathews commented
Interesting idea. I hadn't heard of the virtual Python environment. Are you familiar with it? I think I'd be more comfortable, for memetracker alone, just reworking it to get rid of the Python dependency -- but if the VPE is genuinely workable, it would be cool to popularize that for wider Drupal community usage.