Before doing some transfers (a db pull from live back to -dev), we made a 'Backup' of the working state of our local copy through the Aegir UI.
It turned out that we later needed to restore to that point (just as well we had a backup), so we attempted to use 'Restore'.

It failed with a number of errors of the form:

The version of the system module found on this platform (7077) has a lower Schema version than the one the site has installed (7078)

It then looked like it tried to roll back, and failed at that too.

I was hoping that 'Restore' would at least get us back to the previous state, whatever that may be. However, it seems to be doing some extra validation.
I *suspect* (this is not my project, I'm just troubleshooting) that the state that was backed up had had a core update done on that platform (so far so good) but may not have had the necessary update.php run yet.
Technically, the snapshot that was taken was probably 'unstable', and needed that schema upgrade to be applied. But still...

This validation thing that provision-restore is doing is nice and all, but it means I now have no recovery path at all. Site is disabled.
I was hoping for a --force option I could pass to provision-restore, but can't see it in the help info.

.. We manually unpacked the tgz and dumped the SQL onto the broken site. 'Verify' made it live again.
Looking at it ... Nope, it doesn't seem that an update.php was pending.

Re-reading the error message, the (restored?) CODE was older than the (restored?) DB version. Hm.
So that's the reverse of my guess.
It's now harder to imagine how that could have been the state that got backed up in the first place.
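
For anyone else reading that error, here is how I understand the comparison, as a rough Python sketch. This is not Provision's actual code: the function name and the two dictionaries are made up for illustration, and only the module name and the two version numbers come from the error message above.

```python
def find_schema_conflicts(platform_versions, site_versions):
    """Return modules whose platform code is older than the site's installed schema.

    Both arguments are hypothetical {module_name: schema_version} mappings.
    """
    conflicts = {}
    for module, site_schema in site_versions.items():
        platform_schema = platform_versions.get(module)
        # Only flag modules that exist on the platform but lag behind the database.
        if platform_schema is not None and platform_schema < site_schema:
            conflicts[module] = (platform_schema, site_schema)
    return conflicts


# Values taken from the error message: code at 7077, database at 7078.
platform = {"system": 7077}
site = {"system": 7078}

for module, (code, db) in find_schema_conflicts(platform, site).items():
    print(f"{module}: platform code is at {code}, site database is at {db}")
```

Read this way, the failing case is "database newer than code", which is the opposite of a pending update.php (where the code would be newer than the database).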

Not sure what's up with that. But maybe a --force option to override that validation would have been handy.

Comments

dman:

Category: bug » support

I'm trying to absorb the description in "The best recipes for disaster and how to avoid them" and then communicate that to the team member who got into this state.
I think the key point to take away is that:
*Backup* does not back up the platform, and *Restore* does not restore the platform.
So restoring a site onto an upgraded platform (we do inline git-managed upgrades) sorta doesn't fit.

I guess our training needs to keep hammering in the distinction between 'platform' and 'site'. It's hard when all our projects are one-offs, and always bound 1:1 between site and project-platform.

ergonlogic:

Status: Active » Fixed

> had a core update done on that platform

We recommend against changing platforms, once they're in place. Doing so won't run any necessary updates on the sites, nor does it provide a safe rollback. To update core and modules in sites/all, we recommend creating a new platform with the up-to-date code, and then migrating sites to the new platform.

> *Backup* does not back up the platform, and *Restore* does not restore the platform.

Right, these are site tasks, and a backup will only capture a database dump, and anything in the site's directory. There's been some work on #1138882: Support drush archive-dump, which might give you an archive (backup) with the whole platform, but it remains a work-in-progress. We'd probably need this venerable issue resolved, for that to be useful too: #322788: generic import mechanism for external database dumps and site backups
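
To see for yourself what a particular site backup contains, you can list the archive. A minimal Python sketch, assuming the backup is a gzipped tarball (as the tgz mentioned in the original post suggests); the function name is hypothetical, and it simply looks for any `.sql` member instead of assuming a particular dump filename.

```python
import sys
import tarfile


def summarize_backup(archive_path):
    """List SQL dumps and top-level entries found in a gzipped tarball."""
    with tarfile.open(archive_path, "r:gz") as archive:
        # Normalise member names such as "./settings.php" to "settings.php".
        names = [n[2:] if n.startswith("./") else n for n in archive.getnames()]
    names = [n for n in names if n and n != "."]
    sql_dumps = sorted(n for n in names if n.endswith(".sql"))
    top_level = sorted({n.split("/", 1)[0] for n in names})
    return {"sql_dumps": sql_dumps, "top_level": top_level}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path to your own backup archive as the first argument.
    summary = summarize_backup(sys.argv[1])
    print("SQL dumps:", summary["sql_dumps"])
    print("Top-level entries:", summary["top_level"])
```

If the listing shows only a database dump plus the site's own directory contents, and no Drupal core, that confirms the platform code is not part of the backup.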

I'm marking this as 'fixed', as #1 seems to indicate that training is the solution here. Feel free to re-open, or open a new issue, if I missed anything. Also, feel free to ask questions in #aegir on freenode, if you're on IRC.

dman:

Yeah, thanks. It's certainly fixed-as-designed. I mainly noted it here as a description of an issue to clarify my thinking while trying to get it straight.

I'm aware of the idea of making a new version of a platform for every code change - and am trying to get our devs into at least doing tagged releases. But for sites that are a work-in-progress pre-release, that causes a lot of overhead for our content managers and builders if the 'dev' URL were to keep hopping ahead to a new version every week. I'm thinking of seeing if I can use url-aliases to mask that issue, but it still means different file paths for the devs each time...
I know that Aegir is geared towards managed deployments of totally finished sites, but our usage currently is for rapid setup of active dev and staging sites more than anything.

I've been watching and hoping for archive-dump support :-) partly because I think it will bridge the unnecessary gap between Aegir and Acquia hosting that exists at the moment. But also because it could provide full manual portability between systems.

Thanks for the reply!

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.