While it runs drush @server_master provision-verify at the beginning of the upgrade, it should run it also at the end, in the function drush_provision_post_hostmaster_migrate(). However, this backend verify doesn't really work, as it doesn't update drush alias for @server_master and @server_localhost and doesn't update web server (in this case Nginx) config files.

The only working method to get both drush alias and config files updated on hostmaster upgrade is to call it via front-end with force flag, so it will look like below:

function drush_provision_post_hostmaster_migrate($site, $platform) {
  // we pass the context names we generated to the task so we can enforce that the names
  // stay the same.
  provision_backend_invoke(drush_get_option('site_name'), 'hostmaster-resume', array(), array(
    'old_platform_name' => drush_get_option('old_platform'),
    'new_platform_name' => drush_get_option('new_platform'),
  ));
  // We need to re-verify @server_master via frontend to re-generate
  // its drush alias and to update Nginx configuration files.
  provision_backend_invoke('@hostmaster', 'hosting-task', array(@server_master, 'verify'), array('force' => TRUE));
}

Comments

omega8cc’s picture

Of course this will initiate all platforms re-verify, which is not really good side effect, so it should be fixed in the backend probably.

omega8cc’s picture

Project: Hostmaster (Aegir) » Provision

Moving this to the correct queue.

omega8cc’s picture

Status: Active » Needs review
omega8cc’s picture

Priority: Normal » Critical

I think this is critical, as it causes broken upgrades (Nginx restart fails) between versions with new variables or maps introduced/changed, as it is in #1552228: Nginx configuration for limit_conn should be compatible with version >= 1.1.8.

Steven Jones’s picture

Status: Needs review » Postponed

6.x-2.x is super unstable during the work to get it upgraded to Drush 5, so I'm going to mark this as postponed until we've got the rest of Aegir working, and then we can look at the upgrade path.

omega8cc’s picture

OK, but note that the same issue applies to 6.x-1.x so we could move this to 6.x-1.x instead.

Steven Jones’s picture

Version: 6.x-2.x-dev » 6.x-1.x-dev
Status: Postponed » Needs review
Steven Jones’s picture

I'm not sure why we need to go via the frontend to re-verify servers, the backend should have all the information it needs to re-verify successfully.

Anonymous’s picture

Status: Needs review » Needs work

However, this backend verify doesn't really work, as it doesn't update drush alias for @server_master and @server_localhost and doesn't update web server (in this case Nginx) config files.

I'm also not clear on what's wrong with the alias at this point? What attributes are reflecting the old system? There shouldn't be anything specific to the old aegir system in the server contexts I think?

If that is actually an issue, then the issue seems to be that looking at migrate.hostmaster.inc, we only save the *server* contexts if those aliases didn't already exist (e.g pre 0.4-alpha9), and/or that resume.hostmaster.inc only updates the platform-wide contexts.

But since I don't think we need to update the server contexts, we only need to invoke a verify of the server objects, and shouldn't they automatically update your configs as a result? Or are you putting something new into the server contexts that affects the updating of those configs?

omega8cc’s picture

This code detects Nginx version and compiled in modules to decide what config options and include files to use: http://drupalcode.org/project/provision.git/blob/refs/heads/6.x-1.x:/htt...

The problem here is that in fact we depend on this code running and updating the server drush alias and then Nginx config (options in the main server config and config includes used in vhosts) in two situations: when something changes on the system in the meantime - like Nginx version and/or its compiled in modules (which is an edge case and not the real problem here), or when there are some options added/changed (like maps) in the main Nginx config, affecting variables used/available in the config includes.

Basically, the config includes are not updated on hostmaster upgrade - no matter if there are any changes in the Nginx config itself or Nginx version and compiled in modules.

The point is that it works just fine when you run server verify via frontend and doesn't work when you run server verify in the backend - which is something I simply don't get. It should just work the same and update both server alias and config files, but it doesn't do that - even if we do run the server verify in the backend on upgrade. Also, even if you will add backend-only verify in more places than it is now, it will not help, it just doesn't touch server alias and web server config on upgrade as it does when running the verify via frontend.

Anonymous’s picture

The only thing that is different in the frontend is that the frontend runs the provision-save first, then provision-verify.

Maybe we just need to run provision-save from backend with all appropriate arguments, and then provision-verify from backend too? Is there any chance that only provision-verify was run in your tests, meaning that the context wasn't updated in order for the following provision-verify to change anything?

omega8cc’s picture

Probably.

Here is how it looks like now: https://gist.github.com/2693272

And with my patch: https://gist.github.com/2693467

I have included only all Running: lines for clarity.

anarcat’s picture

Regenerating the server configs seem to be appropriate in hostmaster-migrate.

I think we should run provision-verify instead of hosting-task, however. I do not understand why we should run provision-save though - what's to be updated in the alias anyways?

omega8cc’s picture

I have tried to explain why we need the alias updated in #10 above.

Here is an example of variables stored in the server drush alias we rely on to properly configure/update Nginx configuration on hostmaster upgrade:

  'nginx_has_gzip' => 1,
  'nginx_web_server' => 1,
  'nginx_has_upload_progress' => 1,
  'nginx_is_modern' => 1,
  'provision_db_cloaking' => false,
anarcat’s picture

Is it possible that you are a victim of the bug i found here? #1612044: upgrade from 1.7 to 2.x fails because of drush 5 command file cache

omega8cc’s picture

I don't think it could be related, because we don't use any Drush 5 related code yet, we use Provision 2.x fork in the Drush 4.6-dev compatibility mode, so we can still support Drupal 5,6,7 and 8 core.

ergonlogic’s picture

Status: Needs work » Postponed (maintainer needs more info)

Is this still a problem? If so, can you provide step-by-step instructions on how to replicate this bug?

omega8cc’s picture

Version: 6.x-1.x-dev » 6.x-2.x-dev
Status: Postponed (maintainer needs more info) » Needs work

Yes, the problem is still there. The steps to reproduce this are simple: just try standard upgrade on system with Nginx as a web server and check if the web server include config files have been updated as they should after backend based server verify - they don't. Hence the same trick with server verify via front-end is still required, as shown here.

anarcat’s picture

Status: Needs work » Postponed (maintainer needs more info)

In light of recent improvements to the config file management, I'd like to ask for another test on this, I believe this is fixed now. :) See #2000038: cannot generate multiple configuration files in nginx for the fix...

ergonlogic’s picture

Issue summary: View changes

If a front-end verify is required on the hostmaster site, it seems like drush_provision_post_hostmaster_migrate() isn't the right place to do it. It'd be more appropriate to enqueue a verify task in hook_update_n() for the relevant module(s) that introduced the changes.

Alternatively, if this has to happen before the front-end updates, then calling provision-save with the appropriate parameters and then provision-verify should have the same effect.

ergonlogic’s picture

Priority: Critical » Major

Also, bumping down to 'major' since there hadn't been an update to this issue in 2 years.

helmo’s picture

Status: Postponed (maintainer needs more info) » Closed (outdated)

The 6.x-2.x branch will go EOL along with Drupal this week. So I'm closing
this issue. If it remains a confirmed issue in 7.x-3.x, feel free to re-open,
or better yet, create a new issue referencing this one.