By default, each time a migration is run, any previously unimported source items are imported (along with any previously-imported items marked for update). If the source data contains a timestamp that is set to the creation time of each new item, and changed to the update time every time the item is updated, then you can have those updated items automatically reimported by setting the field as your highwater field.
There are two key things to do in your Migration constructor to take advantage of highwater marks:
- Define $this->highwaterField, to indicate what field returned by your source query will reflect updates to your data.
- Order your query by the field specified in $this->highwaterField.
So, if your query looks like
$query = db_select('migrate_example_wine', 'w')
  ->fields('w', array('wineid', 'name', 'body', 'excerpt', 'accountid',
    'posted', 'last_changed', 'variety', 'region', 'rating'));
and the last_changed field is a (UNIX integer) timestamp that is set to the created date/time when the source object is created, and updated whenever it is changed, in your constructor include:
$this->highwaterField = array(
  'name' => 'last_changed',
  'alias' => 'w',
  'type' => 'int',
);
Because the last_changed field is a UNIX timestamp, 'type' => 'int' is required here. If you have date/time fields that are lexicographically sortable (e.g., '2011-05-19 17:53:12'), you can omit the 'type' entry.
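The difference between integer and lexicographically sortable values can be illustrated with plain PHP. This is only a sketch of the general pitfall of comparing numbers as strings, not a description of Migrate's internals; the variable names are hypothetical.

```php
<?php
// Two UNIX timestamps one second apart, held as strings.
$older = '999999999';   // 2001-09-09 01:46:39 UTC
$newer = '1000000000';  // 2001-09-09 01:46:40 UTC

// Compared as strings, the older value sorts AFTER the newer one,
// because '9' is greater than '1' in the first character position.
$stringOrderIsWrong = strcmp($older, $newer) > 0;   // TRUE

// Compared as integers, the order is correct.
$intOrderIsRight = (int) $older < (int) $newer;     // TRUE

// Fixed-width date/time strings sort correctly as plain strings,
// which is why such fields do not need a 'type' entry.
$dateA = '2011-05-19 17:53:12';
$dateB = '2011-05-20 08:00:00';
$dateOrderIsRight = strcmp($dateA, $dateB) < 0;     // TRUE
```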
It is also important to sort your query by the highwater field:
$query->orderBy('last_changed');
What precisely happens when your migration is set up to use highwater marks?
- The first time you import, Migrate saves the highwater field value as the "highwater mark" (this is why you must order the query, so the last row processed has the highest highwater value).
- Over time, new content is added (with last_changed values greater than the highwater mark saved by Migrate), and old content is updated (changing the last_changed value to be greater than the highwater mark).
- The next time you run import, Migrate will automatically alter your source query to pull all content where the highwater field is greater than its saved highwater mark. That means it will both import any content added since the last time it ran, and also re-import any content changed since that time.
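The sequence above can be sketched in plain PHP, using an in-memory array in place of the source table. This is a simplified simulation for illustration, not Migrate's actual code: the functions highwater_fetch() and highwater_import() and the sample rows are hypothetical, and Migrate itself does the filtering by altering the source query.

```php
<?php
// Hypothetical stand-in for the source table.
$sourceRows = array(
  array('wineid' => 1, 'name' => 'Wine A', 'last_changed' => 100),
  array('wineid' => 2, 'name' => 'Wine B', 'last_changed' => 200),
);

// Return the rows changed since $highwater, ordered by last_changed so
// that the last row processed carries the highest value.
function highwater_fetch(array $rows, $highwater) {
  $pending = array_filter($rows, function ($row) use ($highwater) {
    return $row['last_changed'] > $highwater;
  });
  usort($pending, function ($a, $b) {
    return $a['last_changed'] - $b['last_changed'];
  });
  return $pending;
}

// "Import" the pending rows and advance the saved highwater mark.
// Returns the IDs processed, in processing order.
function highwater_import(array $rows, &$highwater) {
  $imported = array();
  foreach (highwater_fetch($rows, $highwater) as $row) {
    $imported[] = $row['wineid'];
    $highwater = $row['last_changed'];
  }
  return $imported;
}

$highwater = 0;

// First run: everything is new. The mark ends at 200.
$first = highwater_import($sourceRows, $highwater);   // array(1, 2)

// Source changes: wine 1 is updated, wine 3 is added.
$sourceRows[0]['last_changed'] = 300;
$sourceRows[] = array('wineid' => 3, 'name' => 'Wine C', 'last_changed' => 250);

// Second run: only the new and the updated rows are picked up.
$second = highwater_import($sourceRows, $highwater);  // array(3, 1)

// Third run: nothing has changed since the mark (300).
$third = highwater_import($sourceRows, $highwater);   // array()
```

If the rows were not processed in last_changed order, the mark saved after a run could be lower than the value of some already-processed row, and that row would be picked up again unnecessarily on the next run.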
Thus, if you want to schedule (e.g., via cron) regular updates of your destination site taking into account both inserts and updates of your source data, highwater marks are very useful.