https://gist.github.com/2258564 (server config from before the upgrade; the post-upgrade config is available if needed)

I just attempted an upgrade to 2.0.3 and got the errors and warnings below. The end result was that all sites showed "under maintenance" pages, and CGP was not displaying any graphs. I restored a server backup after that, but saved copies of the same files as in the link above from the botched upgrade in case they are useful. I have never had upgrade trouble before, apart from having to add "upload_progress uploads 1m;" to the config file to get nginx to start properly. Do any of these look familiar? Thanks!

nginx on server.mysite.com could not be restarted.      [warning]
Changes might not be available until this has been done.
(error: Reloading nginx configuration: nginx: [warn] the
"limit_zone" directive is deprecated, use the
"limit_conn_zone" directive instead in
/etc/nginx/conf.d/aegir.conf:76
nginx: [warn] conflicting server name "_" on
173.xxx.xxx.xxx:443, ignored
Drush was not able to start (bootstrap) the Drupal database. [error]
Hint: This error often occurs when Drush is trying to
bootstrap a site that has not been installed or does not have
a configured database.

Drush was attempting to connect to : 
  Drupal version    : 6.24
  Site URI          : aegir.mysite.com
  Database driver   : mysqli
  Database hostname : server.mysite.com
  Database username : aegirmysite_0
  Database name     : aegirmysite_0
  Default theme     : garland
  Administration theme: garland
  PHP configuration : /usr/local/lib/php.ini
  Drush version     : 4.6-dev
  Drush configuration:
/var/aegir/host_master/011/sites/aegir.mysite.com/drushrc.php
/var/aegir/host_master/011/drushrc.php
  Drush alias files :
/var/aegir/.drush/hostmaster.alias.drushrc.php
/var/aegir/.drush/server_master.alias.drushrc.php
/var/aegir/.drush/platform_011.alias.drushrc.php
/var/aegir/.drush/platform_010.alias.drushrc.php
  Drupal root       : /var/aegir/host_master/011
  Site path         : sites/aegir.mysite.com
  Modules path      : sites/aegir.mysite.com/modules
  Themes path       : sites/aegir.mysite.com/themes
  %paths            : Array
nginx on server.mysite.com could not be restarted.      [warning]
Changes might not be available until this has been done.
(error: Reloading nginx configuration: nginx: [warn] the
"limit_zone" directive is deprecated, use the
"limit_conn_zone" directive instead in
/etc/nginx/conf.d/aegir.conf:76
nginx: [warn] conflicting server name "_" on
173.230.158.212:443, ignored
nginx: the configuration file /etc/nginx/nginx.conf syntax is
ok
nginx: [emerg] zero size shared memory zone "uploads"
nginx: configuration file /etc/nginx/nginx.conf test failed)
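
For anyone triaging similar output, here is a minimal Python sketch (not part of BOA or Aegir; both function names are made up for illustration). It pulls the zone name out of the [emerg] line and checks whether a given config text defines that zone with an uncommented "upload_progress" line, which is the workaround mentioned above. It only does text matching and assumes the directive sits on a single line.

```python
import re


def zones_reported_empty(nginx_output):
    """Return zone names from '[emerg] zero size shared memory zone "X"' lines."""
    return re.findall(
        r'\[emerg\] zero size shared memory zone "([^"]+)"', nginx_output
    )


def defines_upload_zone(config_text, zone):
    """True if config_text has an uncommented 'upload_progress <zone> <size>;' line."""
    pattern = re.compile(
        r"^\s*upload_progress\s+" + re.escape(zone) + r"\s+\S+\s*;",
        re.MULTILINE,
    )
    return bool(pattern.search(config_text))
```

Each zone name returned by the first function can be passed to the second, together with the contents of the nginx config file, to see whether the definition is missing.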

Comments

leevh’s picture

Title: Problems upgrading » not able to start (bootstrap) the Drupal database on Aegir upgrade
omega8cc’s picture

Is there any chance you have the backend upgrade log archived?

It is stored in /var/backups/barracuda-upgrade-DATE-TIME.log file.
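
If several upgrade logs have piled up, a small Python sketch like the one below can pick the most recent one. The function name is hypothetical, the glob pattern is simply derived from the file name above, and "most recent" is judged by modification time, which is an assumption on my part rather than anything Barracuda guarantees.

```python
from pathlib import Path


def latest_upgrade_log(backup_dir="/var/backups"):
    """Return the newest barracuda-upgrade-*.log in backup_dir, or None."""
    logs = sorted(
        Path(backup_dir).glob("barracuda-upgrade-*.log"),
        key=lambda p: p.stat().st_mtime,  # oldest first, newest last
    )
    return logs[-1] if logs else None
```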

We have already upgraded many servers on Debian Squeeze from 2.0.2 to 2.0.3, including some with a really long upgrade history, even from pre-BOA installs, and we didn't experience anything like that on any server, so we would need more debug logs to check what could have gone wrong here.

It would also help if you could attach or post the console log, as it should also include debug info related to the hostmaster upgrade.

leevh’s picture

Hi Omega8cc, thanks for having a look

Upgrade log plus partial console log (I missed the beginning of the Aegir upgrade):
https://gist.github.com/2732898

If a full console history is needed, I could redo the upgrade and capture it. To get these logs I had to reboot into the botched profile in Linode, and after adding the "upload_progress uploads 1m;" hack again (I *still* need this, apparently) I was able to start nginx and all my sites worked! I can't tell whether my master Aegir was actually upgraded, though. Is there an easy way to tell?

Thanks again!

omega8cc’s picture

So it looks similar to the other issue: #1593912: Barracuda 2.0.3 upgrade disconnects database

It is really weird that there is still that nginx: [emerg] zero size shared memory zone "uploads" error; it looks like this hostmaster was not really upgraded.

So you are using this "broken" server and it just works after manually fixing this nginx: [emerg] zero size shared memory zone "uploads" error?

If yes, please check where the hostmaster site really is now - in /var/aegir/host_master/011/sites/aegir.mysite.com or maybe in /var/aegir/host_master/012/sites/aegir.mysite.com?
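
A quick way to run that check is sketched below in Python. The function name is made up for illustration, and it only tests whether a sites/<site> directory exists under each platform directory; it does not verify that the site inside is a working install.

```python
from pathlib import Path


def platforms_with_site(root, site):
    """Return names of directories under root that contain sites/<site>."""
    found = []
    for platform in sorted(Path(root).iterdir()):
        if (platform / "sites" / site).is_dir():
            found.append(platform.name)
    return found
```

For example, platforms_with_site("/var/aegir/host_master", "aegir.mysite.com") would list the numbered platform directories that actually hold the hostmaster site.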

leevh’s picture

I am not using the "broken" server, but rather a restore. I do still have access to the broken one if I reboot into it, though.

I rebooted into the broken server and checked what you asked. The aegir site was still in 011, though 012 had been created and was empty.

Is there any other info I could provide? I could also repeat the upgrade, see how it goes, and provide any details from that.

Thanks for your help, as usual!

leevh’s picture

Attempted the upgrade again with the same result, but I logged the entire SSH session, in case it provides any clues :)

https://gist.github.com/2258564#file_varaegirinstall.log

leevh’s picture

Searching for the bootstrap issue turned up http://drupal.org/node/1277696. His solution worked for me too! I simply ran the up-stable again and this time the hostmaster upgrade worked! I'm *guessing* it has something to do with the database updates from the initial upgrade needing a reboot or something before they could be used.

CGP graphs are still not working, though; I am following http://drupal.org/node/1598676 :)

omega8cc’s picture

Component: Miscellaneous » Code
Category: support » bug

Thanks for the details. We need to find a pattern and then figure out how to fix this. Similar issues have already affected a few people, and while this is hard to reproduce, it is something we need to fix, one way or another.

jamiet’s picture

I suffered similar symptoms to this and to issue #1593912: Barracuda 2.0.3 upgrade disconnects database. Nginx would restart, but I received a similar warning:

nginx: [warn] the
"limit_zone" directive is deprecated, use the
"limit_conn_zone" directive instead in
/etc/nginx/conf.d/aegir.conf:76

I had to restore the server from a backup as it was hosting live sites, but I uploaded the upgrade log to the other issue before restoring the server.

Hopefully this will help figure out the issue?

Reading some of the other comments, I was thinking it may be worth running the upgrade again and answering N to the hostmaster upgrade, then rebooting the server, checking that all the sites connect to the DB OK, and running the upgrade a second time with Y for the hostmaster upgrade. Do you think that may work? I'm not sure I want to go through the hassle of a restore again ;).

TIA,

JamieT

omega8cc’s picture

Status: Active » Closed (cannot reproduce)

We now force the uninstall of Nginx packages on every upgrade and always use Nginx built from source, so issues like this shouldn't happen again: a package will no longer be able to overwrite a working Nginx configuration and cause further problems, such as broken hostmaster upgrades.