Last updated May 14, 2013. Created by mikeryan on May 14, 2013.
Log in to edit this page.

Data migration has different performance characteristics from normal website operations. The construction and delivery of a single web page usually involves a myriad of small SELECT queries, and even submitting a node form involves INSERTs and UPDATEs of a small number of rows. When running a migration import operation, however, the traffic against the destination Drupal server consists of many INSERT and UPDATE operations in rapid succession. If you get timings on your migration operations (using xhprof, or the --instrument=timer flag) you'll usually find the bulk of your time is spent in node_save(), user_save(), etc. - and the bulk of that time is spent writing to the database. And, on rollback, we make use of "multiple" operations where we can - node_delete_multiple() for example. These can result is large DELETE queries.

So, in a large-scale migration project, you'll want to tune your development and staging environments appropriately (and perhaps even, temporarily, your production environment) to improve write performance.

Indexes

Indexes can add a lot of overhead to database writes, particularly when there are multiple multi-column indexes on a table (I'm looking at you, comment!). Dropping the indexes before migration and recreating them afterwards can be a big help. You can do this by wrapping your drush migrate commands in a script that issues the necessary SQL commands before and after the meat of the migration. Or, you can do in the context of the migration by creating classes derived from MigrationBase to do the work.

MySQL Configuration

Here are some sample configuration parameters for MySQL which will help improve things. Note some are InnoDB-specific and some MyISAM-specific. Also note that using MyISAM during migration may significantly increase performance.

innodb_flush_log_at_trx_commit = 0
innodb_doublewrite = 0
innodb_support_xa = 0

By default, each database transaction is flushed at commit. Disabling the flush allows MySQL to queue up commits for up to a second, and makes a big difference in performance when writing profusely. It does carry some small risk of losing data if the server crashes, but in a development or staging environment which can always be rebuilt in case of disaster the performance boost is well worth the risk. Similarly, disabling doublewrite and support_xa improve performance by reducing flushing I/O.

key_buffer_size = 128M
max_allowed_packet = 20M
query_cache_size = 128M
table_open_cache = 64
read_buffer_size = 2M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 16M
join_buffer_size = 4M
tmp_table_size = 92M
max_heap_table_size = 92M
sort_buffer_size = 4M
innodb_additional_mem_pool_size = 8M

Migration tends to involve more complex queries, and larger volumes of data per query, than normal website operations, so it's helpful to bump up all sorts of buffer/cache sizes.

innodb_file_per_table

This isn't about migration performance per se, but in migration development and staging environments you're often rebuilding the environment, and thus deleting tables - by storing tables in separate files you're sure of fully reclaiming the disk space used.

Hardware

If you have enough flexibility regarding your hardware, RAID 10 will significantly outperform non-RAID storage. And, of course, SSD will be much faster than a classic disc drive.

Looking for support? Visit the Drupal.org forums, or join #drupal-support in IRC.