Storage API is a low-level framework for managed file storage and serving.

It has the following features:

  • Pluggable architecture - it can be extended to work with any storage service.
  • Redundancy - it can be configured to store your files in multiple services and to switch instantly which one is serving them. This means a problem at one service will not bring your site down.
  • Access control API - can be used for e-commerce.
  • Deduplication - when files that are identical are stored in the same container, only one instance will be created. This saves bandwidth and storage.
  • File and image field integration - available by enabling the "core bridge" sub-module.
  • Audit module - compares a manifest of files with what is recorded in the database to ensure that the record is accurate.
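
The deduplication feature above can be illustrated with a minimal sketch. It is not the module's code: the class name, the in-memory dictionaries, and the choice of SHA-256 as the content fingerprint are all assumptions made for illustration.

```python
import hashlib


class DedupContainer:
    """Hypothetical in-memory container that stores identical content once."""

    def __init__(self):
        self.instances = {}  # content hash -> bytes (one stored instance)
        self.files = {}      # file id -> content hash

    def add_file(self, file_id, data):
        # Assumption: content is fingerprinted with SHA-256.
        digest = hashlib.sha256(data).hexdigest()
        # Only store (upload) the bytes if no identical instance exists yet.
        created = digest not in self.instances
        if created:
            self.instances[digest] = data
        self.files[file_id] = digest
        return created

    def read(self, file_id):
        return self.instances[self.files[file_id]]


container = DedupContainer()
first = container.add_file("a.png", b"same bytes")
second = container.add_file("copy-of-a.png", b"same bytes")
```

Here the second call records a new file but creates no new instance, which is where the bandwidth and storage saving comes from.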

There are some screencasts demonstrating various features of the module.

Services

  • Filesystem - files are stored in a local directory and served via any HTTP or FTP service running on the server.
  • Database - really just a proof of concept. All files are stored in the database and served via Drupal.
  • FTP - files are uploaded to a directory via FTP. A URL prefix can be defined for serving.
  • Amazon S3 / CloudFront - files are uploaded to an S3 'bucket'. Serving is handled by S3 or CloudFront. Also supports media streaming and time-limited cryptographically-signed URLs.
  • Rackspace Cloud Files - served using Limelight Networks' CDN.

Containers

A container is an instance of a service. It is a place to store files. Configuration is service-specific, e.g. a filesystem container needs to know the directory where the files should go; an S3 container needs account credentials along with the bucket name.

It is extremely important that no more than one site considers itself to be the owner of a container. For example, if a live site with an S3 container were duplicated onto a local development environment and a node were then deleted from the local site, files owned by this node would be deleted from the S3 container that the live site is using. The easiest way to mitigate this problem is to edit the S3 container on the local site and mark it as "External". This means that the local site will not have write access to the container.
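
As a rough sketch of why the "External" flag helps, the model below has two sites sharing one bucket. The `Container` class, its `external` attribute, and the dictionary standing in for the bucket are hypothetical and do not reflect the module's actual implementation.

```python
class Container:
    """Hypothetical handle that one site holds on a shared bucket."""

    def __init__(self, bucket, external=False):
        self.bucket = bucket      # dict shared by every site using the bucket
        self.external = external  # True: this site does not own the container

    def delete(self, key):
        if self.external:
            # An external container is read-only from this site's point of view.
            raise PermissionError("container is external; refusing to delete " + key)
        self.bucket.pop(key, None)


bucket = {"photo.jpg": b"..."}
live = Container(bucket)                       # the owning site
local_copy = Container(bucket, external=True)  # the duplicated development site

try:
    local_copy.delete("photo.jpg")
    refused = False
except PermissionError:
    refused = True
```

Deleting a node on the development copy then fails safely, while the live site keeps full control of the files.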

Classes

A class is a prioritised list of containers. Storage API runs during cron to propagate files to every container in their class. Once a file is fully propagated, any instances of the file in containers not in the class will be deleted, e.g. if a container has been removed from the class.

Files are served from the highest priority serving container that has the file.
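
The propagation and serving rules can be sketched as follows. This is a simplified model under stated assumptions, not the module's code: the `Container` dataclass, the `propagate` and `serving_container` functions, and the container names are hypothetical, and copies are assumed never to fail.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare containers by identity, not by field values
class Container:
    name: str
    serving: bool = True
    files: set = field(default_factory=set)


def propagate(file_id, class_containers, all_containers):
    """One simplified cron pass for a single file (copies never fail here)."""
    for container in class_containers:
        container.files.add(file_id)
    # Fully propagated: remove instances held outside the class.
    for container in all_containers:
        if container not in class_containers:
            container.files.discard(file_id)


def serving_container(file_id, class_containers):
    """Return the highest-priority serving container that has the file."""
    for container in class_containers:  # ordered highest priority first
        if container.serving and file_id in container.files:
            return container
    return None


s3 = Container("s3")
local = Container("local")
removed = Container("removed", files={"report.pdf"})  # no longer in the class

storage_class = [s3, local]
propagate("report.pdf", storage_class, [s3, local, removed])
chosen = serving_container("report.pdf", storage_class)
```

If the `s3` container stops serving, the same lookup falls through to `local`, which is the redundancy behaviour described in the feature list.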

Each class has a configured "initial container". This is where files are first put when they are added to the class. From here they are propagated to other containers. If the initial container is not listed as a container in the class, then file instances in the initial container will be deleted once the file has been fully propagated to all containers in the class.
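
A minimal sketch of the initial container's lifecycle, assuming a class whose initial container is not one of its members. The function names and the container names `local` and `s3_reduced` are made up for illustration, and sets stand in for real storage.

```python
def add_file(file_id, initial, containers):
    """Put a new file in the class's initial container."""
    containers[initial].add(file_id)


def propagate(file_id, initial, class_members, containers):
    """Copy the file to every class member, then tidy the initial container."""
    for name in class_members:
        containers[name].add(file_id)
    # Fully propagated: drop the initial instance if it is not a class member.
    if initial not in class_members:
        containers[initial].discard(file_id)


containers = {"local": set(), "s3_reduced": set()}

add_file("thumb.jpg", "local", containers)
before = {name: sorted(files) for name, files in containers.items()}

propagate("thumb.jpg", "local", ["s3_reduced"], containers)
after = {name: sorted(files) for name, files in containers.items()}
```

Before propagation the file exists only in `local`; afterwards it exists only in `s3_reduced`, because `local` acted purely as a staging area.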

Examples of classes might be:

  • Session - just has a local filesystem container to store files temporarily.
  • User - contains Rackspace and S3 containers for maximum protection from data loss.
  • Thumbnails - has the local filesystem container as its initial container, but only has a reduced redundancy S3 container in the class. This means that files can be generated and served quickly, but they are only stored permanently on S3. Cheaper reduced redundancy containers can be used because the files can be regenerated.
