DataSync

The DataSync module was written to import data reliably at large scale. It lets you schedule and run multiple types of import jobs on multiple servers in a reliable, centralized way. It is not yet highly scalable: Drupal 5 does not work well with database transactions, so to prevent race conditions you should run each consumer on only one machine at a time. This should be fixed in the Drupal 6 version. The module is nonetheless very functional and has already run tens of thousands of jobs on our production servers.
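To illustrate the kind of race condition meant here: if two consumers each read a job as pending and then mark it as running in a separate step, both can end up running it. The following is a minimal sketch of one common way to avoid that, a guarded update that only succeeds for the first claimant. It is written in Python with SQLite for brevity and is not DataSync's code or API; the `jobs` table, its columns, and the function names are all hypothetical.

```python
import sqlite3


def make_queue():
    """Create an in-memory job table holding one pending job (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT NOT NULL, owner TEXT)"
    )
    conn.execute("INSERT INTO jobs (id, status) VALUES (1, 'pending')")
    conn.commit()
    return conn


def claim_job(conn, job_id, consumer):
    """Try to claim a job; return True only if this consumer won the claim.

    The status check and the status change happen in a single UPDATE, so a
    second consumer cannot slip in between "read pending" and "mark running".
    """
    cur = conn.execute(
        "UPDATE jobs SET status = 'running', owner = ? "
        "WHERE id = ? AND status = 'pending'",
        (consumer, job_id),
    )
    conn.commit()
    # Exactly one row changes for the winner, zero rows for everyone else.
    return cur.rowcount == 1
```

In this sketch a consumer would only start work on a job when `claim_job` returns `True`; a consumer that gets `False` moves on to another job.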

Some of what DataSync does:

  • Automatically runs scheduled jobs at any regular interval (with any number of jobs at a time)
  • Provides an API for you to define those jobs
  • Handles job errors and timeouts gracefully
  • Provides an API to make sure data synchronization causes no duplicates or invalid updates
  • Transaction support in the Drupal 6 version (planned)
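As a rough illustration of what "no duplicates or invalid updates" can mean in practice, here is a minimal sketch of an idempotent import step. It is written in Python and does not reflect DataSync's actual API: the function name, the `source_id` key, and the `version` field are hypothetical, and treating an "invalid update" as one that is not newer than the stored record is an assumption made for the example.

```python
def sync_record(store, source_id, version, data):
    """Import one record into `store` (a dict keyed by the record's source ID).

    Returns "inserted", "updated", or "skipped". Running the same import
    twice leaves the store unchanged the second time.
    """
    current = store.get(source_id)
    if current is None:
        # First time this source ID is seen: create exactly one record.
        store[source_id] = {"version": version, "data": data}
        return "inserted"
    if version <= current["version"]:
        # Same or older version: applying it would be a stale update.
        return "skipped"
    # Newer version of a known record: update in place, never duplicate.
    store[source_id] = {"version": version, "data": data}
    return "updated"
```

Keying on a stable identifier from the source system is what prevents duplicates here, and the version comparison is what rejects out-of-date updates when jobs are retried or arrive out of order.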

I am looking for people who are interested in testing this module on their own setups. Please contact me if you need help.

Originally contributed by SonyBMG.

Releases

Official releases

Version       Date          Size       Status
5.x-1.0       2008-Aug-26   18.89 KB   Recommended for 5.x

This is currently the recommended release for 5.x.

Development snapshots

Version       Date          Size       Status
5.x-1.x-dev   2008-Aug-27   18.9 KB    Development snapshot

Development snapshots are automatically regenerated and their contents can frequently change, so they are not recommended for production use.

Drupal is a registered trademark of Dries Buytaert.