System Requirements
This page gives an overview of the software and hardware requirements for a successful installation of the Drupal Toolkit. It also offers software and hardware recommendations and a note on the important topic of disk usage.
Software
The Drupal Toolkit requires Drupal and all its dependencies. Currently, the Drupal Toolkit is developed only for Drupal 6. Porting it to Drupal 7 has been discussed, but there is no timeline in place for that development.
| Software | Required | Recommended |
|---|---|---|
| Web Server | HTTP | Apache HTTP |
| Drupal | 6 | 6.25 |
| PHP | 5.2 | 5.2 |
| MySQL | 5.1 | 5.5+ |
| Apache Solr | 1.2 | 3.3+ |
| Java | 1.6 | 1.6+ |
Note: You should have good knowledge of the operating system on which this software will be installed. You should also have administrator (root) permissions for the installation; otherwise, you are likely to run into permission problems.
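The minimum versions in the table above can be compared against what is installed with a small shell helper. This is only a sketch: the function name `version_at_least` is hypothetical, it assumes GNU `sort` with the `-V` (version sort) option, and it leaves out how the installed version string is obtained, since that differs for each tool.

```shell
#!/bin/sh
# version_at_least INSTALLED REQUIRED
# Succeeds when INSTALLED is greater than or equal to REQUIRED,
# comparing dotted version strings such as 5.5.30 and 5.1.
# Assumes GNU sort, which provides the -V (version sort) option.
version_at_least() {
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
    [ "$lowest" = "$2" ]
}

# Usage (the installed version is typed in by hand here):
# version_at_least 5.5.30 5.1 && echo "MySQL meets the required version"
```

The helper only compares numbers. It does not tell you whether a much newer release is still compatible with Drupal 6.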
PHP
PHP is required for Drupal. For the Drupal Toolkit, however, PHP must also be compiled with support for cURL, DOMDocument, and SimpleXML. This functionality is necessary for making external HTTP requests to OAI and NCIP servers and for handling their XML responses.
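One way to check for these extensions is to inspect the module list printed by `php -m`. The sketch below assumes the extensions appear in that list as `curl`, `dom` (which provides DOMDocument), and `SimpleXML`. The function name `missing_php_extensions` is made up for this example.

```shell
#!/bin/sh
# missing_php_extensions
# Reads a PHP module list (one module name per line) on standard input
# and prints each required extension that is not in the list.
# Matching is case-insensitive and must match the whole line.
missing_php_extensions() {
    modules="$(cat)"
    for ext in curl dom simplexml; do
        if ! printf '%s\n' "$modules" | grep -qix "$ext"; then
            printf '%s\n' "$ext"
        fi
    done
}

# Usage on a machine with the PHP command-line binary installed:
# php -m | missing_php_extensions
```

No output means all three extensions were found. Note that the command-line PHP binary and the PHP used by the web server can be configured differently, so treat the result as a first check only.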
Apache HTTP
As of now, all Drupal Toolkit development has used Apache HTTP as a web server. It is definitely possible to run the Drupal Toolkit on other servers, such as Lighttpd, Nginx, or even Microsoft IIS, but our developers have not tested them.
Apache Solr
In addition to Drupal, the toolkit requires Apache Solr for indexing, searching, faceting, and other features.
MySQL
Although Drupal can work with other database systems, such as PostgreSQL, MySQL is strongly recommended for the Drupal Toolkit. Certain functionality in the Drupal Toolkit works only with MySQL, such as the CSV import process for the OAI Harvester, which uses a MySQL-specific command to increase performance.
Java
Java is required for Apache Solr. Therefore, Java must be installed to use the indexing, searching, and browsing functionality of the Drupal Toolkit. We recommend installing the official Java provided by Sun or Oracle.
Linux
The use of Linux is highly recommended, and all examples in this guide assume a Linux operating system. Linux is not strictly necessary, but the software environment used to develop, install, configure, and test the Drupal Toolkit has been Linux-based.
Other Operating Systems
To our knowledge, the Drupal Toolkit does work on Mac OS and should work on Windows systems. However, instructions for building such an environment and support for maintaining it are beyond the scope of this guide. If you choose another operating system, use your own judgement and experience to make the adjustments needed for a working environment. Also note that installation and configuration on other operating systems may be difficult or impossible to accomplish.
Other Software
Finally, our developers recommend installing Wget in order to use the included Bash scripts, Git for version control, and Drush because it saves time... and mouse clicks!
Hardware
Drupal requires a lot of processing power. It can easily consume a lot of system resources, particularly during metadata harvesting, record indexing, the display of views (such as those in the administration panel or custom views), and node generation. Although we cannot give exact hardware requirements for your particular site, the following baseline should work for even the smallest sites.
| Component | Minimal Recommendation |
|---|---|
| Server | 2 GHz Dual-core |
| Architecture and Operating System | 64-bit |
| Memory (RAM) | 4 GB |
| Hard Drive | 120 GB |
Disk Space
If you plan to harvest large sets of metadata records, you may need a large amount of available disk space. The space required is determined mostly by the size and number of records harvested. Plan to have at least five times the disk space needed to store, in plain text, the metadata records you plan to harvest.
During the harvesting process, up to three times the disk space required to store the harvested records may be needed. This is determined primarily by the harvester's settings. Two settings each add another copy of the data: (1) caching XML responses from a repository, and (2) delaying the necessary SQL INSERT statements by writing CSV files and running the LOAD DATA INFILE statement after harvesting for metadata storage and node generation. Enabling either one effectively doubles the disk space required, and enabling both requires three times as much. In addition, during the indexing process, optimizing the Solr index temporarily doubles the size of the entire index, which adds to the disk space requirement.
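The effect of the two harvester settings can be written as simple arithmetic. The sketch below is a rough upper bound only: the function name and its arguments are hypothetical, sizes are whole gigabytes, and Solr's own space needs are not included.

```shell
#!/bin/sh
# harvest_peak_gb RECORDS_GB USE_XML_CACHE USE_CSV_LOAD
# Upper-bound estimate, in GB, of disk space needed while harvesting.
# RECORDS_GB is the space needed to store the harvested records.
# Each enabled setting (1 = on, 0 = off) adds one more copy of that size.
harvest_peak_gb() {
    records_gb="$1"
    use_xml_cache="$2"
    use_csv_load="$3"
    echo $(( records_gb * (1 + use_xml_cache + use_csv_load) ))
}

harvest_peak_gb 12 1 1   # both settings enabled
harvest_peak_gb 12 0 0   # both settings disabled
```

With 12 GB of records, this prints 36 with both settings enabled and 12 with both disabled. Real usage can be lower, because cached responses and CSV files are not always as large as the stored records.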
Disk Usage Example
Consider the following example. Our demo site harvests millions of XC records. The total size of the XC records in plain text is around 12 GB, so we estimate that we need at least 60 GB of space for successful harvesting. Here's the breakdown:
- 12 GB for the OAI response cache
- 10 GB for the MySQL CSV load files used to delay SQL inserts
- 10 GB for the MySQL database
- 14 GB for the Solr index
- 14 GB for the Solr index optimization
It is important to note that these are peak usage requirements. Once the harvesting, indexing, and node generation processes are complete, you can reduce the disk space used by simply deleting the OAI and SQL caches. The disk space used by Solr's optimization process is released automatically when the process is complete.
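The arithmetic behind this example can be checked with a few lines of shell. The figures come from the breakdown above. The variable names are made up for this sketch, and the "after cleanup" figure is our reading of the paragraph above: only the MySQL database and the Solr index remain.

```shell
#!/bin/sh
# Sizes in GB, copied from the breakdown above.
oai_cache=12
csv_files=10
mysql_db=10
solr_index=14
solr_optimize=14

# Peak usage: everything is on disk at the same time.
peak=$((oai_cache + csv_files + mysql_db + solr_index + solr_optimize))

# Rule of thumb from this page: five times the plain-text size of the records.
plain_text=12
rule_of_thumb=$((plain_text * 5))

# After deleting the OAI and CSV caches and once Solr's optimization finishes.
after_cleanup=$((mysql_db + solr_index))

echo "Peak usage:    ${peak} GB"
echo "Rule of thumb: ${rule_of_thumb} GB"
echo "After cleanup: ${after_cleanup} GB"
```

The breakdown and the rule of thumb both come to 60 GB, and the steady-state figure is 24 GB.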
Using Multiple Drives or Partitions
If you have multiple drives or partitions available, for example, one larger and one smaller, as is the case with many servers, we suggest that you install Solr on the larger disk. This will keep it separate from the Drupal instance and web server. You may also want to do this for your MySQL data directory and Drupal files directory.
For more information on how to move your MySQL data directory, take a look at this article, suggested by Brandon Gant.
You may also want to review this example of how to create a soft link or symbolic link to move your Drupal files directory to a larger disk. It assumes you have a large hard drive mounted at "/mnt/bigdisk" and a Drupal instance running from "/var/www/drupal".
    cd /var/www/drupal
    mv sites/default/files /mnt/bigdisk/drupal_files
    ln -s /mnt/bigdisk/drupal_files sites/default/files
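The commands above do not check anything before moving the directory. A slightly more defensive variant is sketched below. The function name `move_and_link` is hypothetical, and both paths should be given as absolute paths so that the link resolves correctly.

```shell
#!/bin/sh
# move_and_link SOURCE_DIR TARGET_DIR
# Moves SOURCE_DIR to TARGET_DIR and leaves a symbolic link at the old path.
# Refuses to run if the source is not a real directory (for example, it is
# already a link) or if the target path already exists.
move_and_link() {
    src="$1"
    dst="$2"
    if [ -L "$src" ] || [ ! -d "$src" ]; then
        echo "Source is not a regular directory: $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "Target already exists: $dst" >&2
        return 1
    fi
    mv "$src" "$dst" && ln -s "$dst" "$src"
}

# Usage with the paths assumed in the example above:
# move_and_link /var/www/drupal/sites/default/files /mnt/bigdisk/drupal_files
```

After the move, confirm that the web server user can still read and write the files directory through the link.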