This project is not covered by Drupal’s security advisory policy.

Drupal Computing is a framework that facilitates distributed computing between Drupal and external programs written in non-PHP languages such as Java and Python. It is designed for cases where you need to use Drupal with non-PHP, computationally intensive code libraries (such as Apache Mahout, NumPy/SciPy, R, etc.) for offline big data analytics. In addition to this Drupal module, you also need the companion Java and Python client libraries at https://github.com/danithaca/drupal-computing.

The target audience is developers who write Java/Python programs for Drupal. Known examples of modules built with Drupal Computing:

For information about installation, configuration and development, see README.txt that comes with the module.

Warning: The 7.x-2.x release is not compatible with the 7.x-1.x release, and there is no direct upgrade path. Please uninstall the old release and then install the new one.

Key Features

  • Reusable Java and Python code to interact with Drupal (https://github.com/danithaca/drupal-computing)
  • Two ways to interact with Drupal: Drush (unlimited access) or Services (restricted but more secure).
  • A work queue system within Drupal to facilitate data exchange and asynchronous execution with agent programs.
  • Entity API support to provide flexible data structures.
  • Views integration.
  • Rules integration. For example, send an email when a certain agent program finishes running.

How Does It Work?

In the simplest case, you only need to write an agent program (i.e., the external program that interacts with Drupal) using the Java/Python client library. However, to fully take advantage of the framework, you would usually write both a Drupal module and an agent program.

Your Drupal module prepares data on the Drupal site (e.g., takes input from admins) and saves the data as a JSON string in a "computing record" entity. Your agent program then claims the "computing record" entity from the work queue, takes the JSON string as input, processes the data, and saves the result data back to the "computing record" entity. Finally, your Drupal module reads the result data from the "computing record" entity and uses it in the Drupal site.
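The round trip described above can be sketched in Python. This is only an illustration under stated assumptions: it does not use the real client library, and every name in it (ComputingRecord, InMemoryQueue, run_agent, the "demo_stats" application, the status values) is hypothetical. An in-memory list stands in for the work queue that actually lives inside Drupal, so the sketch shows only the shape of the claim, process, and save-back cycle.

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ComputingRecord:
    """Hypothetical stand-in for a "computing record" entity."""
    record_id: int
    application: str
    input_json: str                    # JSON string written by the Drupal module
    output_json: Optional[str] = None  # JSON string written back by the agent
    status: str = "ready"              # assumed lifecycle: ready -> running -> done


class InMemoryQueue:
    """Hypothetical in-memory substitute for the work queue inside Drupal."""

    def __init__(self) -> None:
        self._records: List[ComputingRecord] = []

    def create(self, application: str, data: dict) -> ComputingRecord:
        # Plays the role of the Drupal module: store the input as a JSON string.
        record = ComputingRecord(len(self._records) + 1, application, json.dumps(data))
        self._records.append(record)
        return record

    def claim(self, application: str) -> Optional[ComputingRecord]:
        # Hand out the oldest unclaimed record for this application, if any.
        for record in self._records:
            if record.application == application and record.status == "ready":
                record.status = "running"
                return record
        return None

    def finish(self, record: ComputingRecord, result: dict) -> None:
        # Save the result data back to the record as a JSON string.
        record.output_json = json.dumps(result)
        record.status = "done"


def run_agent(queue: InMemoryQueue, application: str,
              compute: Callable[[dict], dict]) -> int:
    """Agent loop: claim records until none are left; return how many were processed."""
    processed = 0
    while True:
        record = queue.claim(application)
        if record is None:
            break
        result = compute(json.loads(record.input_json))
        queue.finish(record, result)
        processed += 1
    return processed


def mean_of_values(data: dict) -> dict:
    """Example computation: the arithmetic mean of data["values"]."""
    values = data["values"]
    return {"mean": sum(values) / len(values)}


# The "Drupal module" side prepares the input.
queue = InMemoryQueue()
record = queue.create("demo_stats", {"values": [1, 2, 3, 4]})

# The "agent program" side claims, processes, and saves results.
processed = run_agent(queue, "demo_stats", mean_of_values)

# The "Drupal module" side reads the results back.
result = json.loads(record.output_json)
```

In a real setup the create, claim, and finish steps would go through Drush or Services rather than a local list, and the computation would typically call a library such as NumPy or Mahout.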

You can see code examples at https://github.com/danithaca/drupal-computing. See Recommender API for a full example.

FAQ

Q: How does this module compare to "beanstalkd" and other work queue systems?

Drupal Computing is a framework that helps you write distributed computing programs in Java and Python, not just a work queue system such as "beanstalkd". Its native work queue system provides tighter integration with Drupal (e.g., Rules/Views integration) and less dependency on other systems (e.g., beanstalkd). In the future, the Drupal Computing module might provide a feature that allows "beanstalkd" to replace the native work queue system.

Q: How does this module compare to the "Background Process" module?

Both modules are designed to handle time-consuming processes. However, the Background Process module is still for PHP and focuses on "non-blocking" real-time applications (similar to node.js). Drupal Computing is designed to handle non-PHP, offline programs written in Java and Python.

Q: What about D6 and D8 support?

This module will support Drupal 8 soon, but not Drupal 6.

Where to find more information?

  • This project page on drupal.org for an overview of the framework.
  • The README.txt file in this module for technical details.
  • The README.md file on GitHub for technical details about the Java/Python client.
  • The source code.
  • The drupal.org issue queue.
  • The issue queue on GitHub for issues specific to the Java/Python client library.
