Downloaded BOA.sh.txt as usual ..
then bash BOA.sh.txt
then .. boa in-stable public

================== always the octopus end of the upgrade fails with ..
Octopus [Mon Apr 8 17:39:05 BST 2013] ==> UPGRADE B: Hostmaster STATUS: upgrade completed
Octopus [Mon Apr 8 17:39:05 BST 2013] ==> UPGRADE B: Simple check if Aegir upgrade is successful
Octopus [Mon Apr 8 17:39:07 BST 2013] ==> UPGRADE B: FATAL ERROR: Required file /data/disk/o1/aegir/distro/008/sites/octopus.somedomain.org/settings.php does not exist
Octopus [Mon Apr 8 17:39:07 BST 2013] ==> UPGRADE B: FATAL ERROR: Aborting AegirSetupB installer NOW!
Octopus [Mon Apr 8 17:39:07 BST 2013] ==> UPGRADE A: FATAL ERROR: AegirSetupB installer failed
Octopus [Mon Apr 8 17:39:07 BST 2013] ==> UPGRADE A: FATAL ERROR: Aborting AegirSetupA installer NOW!
Octopus [Mon Apr 8 17:39:07 BST 2013] ==> FATAL ERROR: AegirSetupA installer failed
Octopus [Mon Apr 8 17:39:07 BST 2013] ==> FATAL ERROR: Aborting Octopus installer NOW!
waiting 4 sec
Done for /data/disk/o1
==================

I followed some other threads, and deleted folders listed after the last successful upgrade .. which in my system seems to be /data/disk/o1/aegir/distro/007 ..

but whether running 'octopus up-stable o1' .. or restarting the entire thing via 'boa in-stable public' .. I always end up exactly with the same failure ..

I followed the thread of another issue, and enabled debugs before my last run .. but I am not sure which logfile needs to be uploaded to this issue ..

Files:
Comment  File                      Size      Author
#5       boa-2.0.8.issues-1.txt    48.41 KB  cnergis
#3       boa-2.0.8.issues.txt      64.85 KB  cnergis

Comments

Category: bug » support
Status: Active » Postponed (maintainer needs more info)

You should *not* run upgrade with boa installer. Please read the docs/UPGRADE.txt linked on the project page and provide more details and full debug output.

Also, try to run before the upgrade:

syncpass fix o1

If this doesn't help, check if you have duplicate/old grants similar to the existing ones in the octopus.awatori.org/settings.php file and delete them first.

But really, enable debugging as required and attach a full console output for further assistance.
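
A minimal sketch of one way to capture the full console output for attaching here, assuming a standard shell with tee available. run_logged and the log path are hypothetical names, not part of BOA; the upgrade command in the usage comment is the one quoted earlier in this issue.

```shell
# Run a command and keep a copy of everything it prints.
# run_logged is a hypothetical helper name, not part of BOA.
run_logged() {
  local logfile="$1"
  shift
  # 2>&1 merges stderr into stdout so error lines land in the log too.
  # Note: the exit status seen by the caller is tee's, not the command's.
  "$@" 2>&1 | tee "$logfile"
}

# Usage (log path is only an example):
#   run_logged /root/octopus-o1-upgrade.log octopus up-stable o1
```

The log file can then be attached as-is, after removing any domain names or passwords you do not want to publish.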

Also, make sure you have checked for duplicates in both the db and user tables in the mysql database. Remove any duplicates that use an IP address or similar as the host and are no longer used, then run syncpass fix o1 and then the upgrade.
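
A rough sketch of how the duplicate check could be done, assuming root access to the local server and the classic grant tables (mysql.user and mysql.db with User, Host and Db columns). The helper names are hypothetical; the queries only read data and delete nothing.

```shell
# Print SQL that lists user names defined more than once in mysql.user,
# together with the hosts they are defined for.
dup_users_sql() {
  cat <<'SQL'
SELECT User, COUNT(*) AS entries, GROUP_CONCAT(Host) AS hosts
FROM mysql.user
GROUP BY User
HAVING COUNT(*) > 1;
SQL
}

# Same idea for per-database grants in mysql.db.
dup_db_grants_sql() {
  cat <<'SQL'
SELECT User, Db, COUNT(*) AS entries, GROUP_CONCAT(Host) AS hosts
FROM mysql.db
GROUP BY User, Db
HAVING COUNT(*) > 1;
SQL
}

# Usage (read-only; review the output before removing anything):
#   dup_users_sql | mysql -u root
#   dup_db_grants_sql | mysql -u root
```

A user listed for several hosts is not automatically wrong; the point is only to see which entries exist before deciding which ones are stale.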

Category: support » bug
Status: Postponed (maintainer needs more info) » Active
Status  File  Size
new           64.85 KB

Thanks for the speedy response, omega8cc .. Much appreciated.

I checked the database, and there were two similarly named DBs, with one of them having no data and just one non-information_schema table .. So, I deleted that table, and ironically it was also the database in the settings file for the last upgrade I ran for boa-2.0.7 .. So, it is possible that this problem was created by the upgrade to 2.0.7 and not this one. Incidentally, mysql struggled to delete this table and kept crashing, complaining that mysql.proc needed to be repaired (even after successful repairs of the mysql.proc table) .. In the end, I had to shut down mysql and move the innodb files for this DB out of the way, and then restart mysql.

I also found the owner of the DB I deleted in the user table, but unlike the owner of the seemingly more valid DB (with all tables and data), it did not have a matching user account for the aegir instance .. So, I also deleted this extra user account.

I then synced password as you suggested .. and again, ran 'bash BOA.sh.txt' .. followed by 'boa in-stable public' .. but I ended up with the same error result. My debug session is attached here ..

In hindsight, perhaps I should also have deleted the directory created by the latest wrong upgrade and the one before that (boa 2.0.7 upgrade) whose settings are pointing to the wrong DB .. But I will await your knowledgeable suggestions.

It is worth noting that I ran the very same upgrade on another completely different server, and it completed without incident ..

Please, replace 'deleting table' in my last comment with 'deleting database' .. apologies ..

Status  File  Size
new           48.41 KB

I went back and deleted 008 corresponding to my last attempt to upgrade to boa-2.0.8 .. I also moved aside 007 which had the wrong DB information reported in my last post.

Then I synced passwords again .. and re-ran BOA.sh.txt and 'boa in-stable public' .. still the same end result, but with other error messages along the way .. see my debug session attached here

My current directory tree after my last 'boa in-stable public' looks like so:

agu:/data/disk/o1/aegir/distro# ls -ld *
drwxr-xr-x 8 o1 users 4.0K Jun 22 2012 001/
drwxr-xr-x 8 o1 users 4.0K Jul 16 2012 002/
drwxr-xr-x 8 o1 users 4.0K Jul 16 2012 003/
drwxr-xr-x 8 o1 users 4.0K Apr 9 13:33 004/
drwxr-xr-x 8 o1 users 4.0K Apr 9 13:33 005/
agu:/data/disk/o1/aegir/distro#

So, it seems the update process deleted the 006 as invalid, and then tried to use the 005 as the directory for the last successful upgrade .. or something along those lines ..
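
To see which of the numbered platform directories actually hold the hostmaster site's settings.php (the file the installer reports as missing), something like the following might help. It is only a sketch: check_site_settings is a hypothetical helper, and the paths in the usage comment are the ones quoted in this issue.

```shell
# For each platform directory under the given root, report whether
# sites/<site>/settings.php is present.
check_site_settings() {
  local distro_root="$1" site="$2" dir
  for dir in "$distro_root"/*/; do
    [ -d "$dir" ] || continue
    if [ -f "${dir}sites/${site}/settings.php" ]; then
      echo "present ${dir%/}"
    else
      echo "missing ${dir%/}"
    fi
  done
}

# Usage:
#   check_site_settings /data/disk/o1/aegir/distro octopus.somedomain.org
```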

I am beginning to wonder if my previous recent upgrades have been failing silently ..

I see you're using Percona - I have the same problem since the 2.0.6 upgrade, when I added _DB_ENGINE=InnoDB and _USE_STRONG_PASSWORDS=YES.

I try to upgrade o1 or o2 and it cannot find the settings.php file. Furthermore, I set in the octopus upgrade _USE_CURRENT=YES, and it does NOT reinstall on the current platform, but attempts to create a new distro anyways! Do you also have that problem?

I have since upgraded to MariaDB, but it did not help - I cannot even delete a database, neither from the command line nor from chive - I get lost connection to the mysql database and socket connection errors, and attempting to drop a database causes mysql to restart - every time! I am on 2.0.8 stable now and wonder if forcing a rebuild of mysql would help? Is there a setting like _FORCE_MYSQL_REINSTALL=YES or something like that which I could add to the .barracuda.cnf file before running barracuda up-stable again, that might take care of this problem (without destroying the existing databases, just reinstalling the MySQL system, if this is possible)? (Guess I could always reinstall MySQL and import .sql backup files ...)

1 day later update:
I have been able to get MySQL from the command line and from chive working again by:
running barracuda up-stable again, this time setting _DB_BINARY_LOG=YES for the first time, and waiting 24 hours.
Both MySQL from the command line and chive still have small problems:

chive shows empty databases on the schema landing page that still cannot be deleted; trying gives an error which causes MySQL to restart immediately.

MySQL from the command line lists databases when I do a select, but when I try to drop them I get an error stating that the database does not exist.

EdNet, it does seem as if you have investigated your issues more extensively than I have .. But it also seems like your DB is in some strange state. Mind you, I have not tested any other delete operations against mysql, so it is not impossible that my problems are as serious as yours.

But, if I see any further inconsistencies with mysql, then I will dump the entire databases, and rebuild but I am generally disinclined from doing this because the primary issue is likely more a script oriented and permissions issue, rather than a DB problem in particular (well, that is what I think) ...

In the worst scenario, it could be a case of finding out how to take backups of sites (without full access to the boa/octopus control panel), scrapping everything and starting again. I hate carrying inconsistencies along in the hope that they will go away. Sometimes, it is just more efficient to bite the bullet and rebuild.

If you have also opened an issue, then I will suggest that we await omega8cc's suggestions. I am a sysadmin (day job .. LOL), so I blame myself for not yet finding enough time to fully dig into the scripts that make up boa/barracuda/octopus and fully understanding exactly how they work .. A greater insight into how the scripts fit together and what they do will make resolving issues like these much easier than just mere guesswork ..

Category: bug » support
Status: Active » Postponed (maintainer needs more info)

@cnergis You shouldn't delete any tables or databases, just the duplicate records in the two tables of the "mysql" database that I listed.

Look at this:

Access denied for user 'octopussomedomainor'@'localhost' (using password: YES) [1.28 sec, 15.32 MB]                              [bootstrap]
Drush was not able to start (bootstrap) the Drupal database.                                                                  [error]
Hint: This error often occurs when Drush is trying to bootstrap a site that has not been installed or does not have a
configured database.

This means that either db credentials for octopussomedomainor are incorrect or there are two conflicting grants with different passwords.

You have to either remove the duplicates or make them all match the credentials from this hostmaster site's settings.php file, with a command like:

mysql -u root -e "UPDATE mysql.user SET Password=PASSWORD('FOOBAR') WHERE User='octopussomedomainor';"

Note that the syncpass tool doesn't sync/fix these credentials (at least not yet). It only syncs the credentials for the Aegir system/backend user.
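
For anyone following along, here is a hedged sketch of the same fix in two steps: first list every entry for the user, then set one password on all of them. It assumes the older grant-table layout where mysql.user has a Password column and the PASSWORD() function exists, as in the command above; newer MySQL versions differ. The helper names are hypothetical, the quoting is naive (avoid quotes or backslashes in the values), and FLUSH PRIVILEGES is included because direct edits to the grant tables are only picked up after a reload.

```shell
# Print a query showing every host entry for one user.
list_grants_sql() {
  printf "SELECT User, Host, Password FROM mysql.user WHERE User='%s';\n" "$1"
}

# Print statements that set the same password on all entries for the
# user ($1 = user name, $2 = password) and reload the grant tables.
sync_password_sql() {
  printf "UPDATE mysql.user SET Password=PASSWORD('%s') WHERE User='%s';\n" "$2" "$1"
  # Direct edits to mysql.user only take effect after a reload.
  printf "FLUSH PRIVILEGES;\n"
}

# Usage (FOOBAR stands for the password from the site's settings.php):
#   list_grants_sql octopussomedomainor | mysql -u root
#   sync_password_sql octopussomedomainor 'FOOBAR' | mysql -u root
```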

@EdNet This issue has nothing to do with Percona/MariaDB differences or secure passwords or InnoDB/MyISAM etc.

If your db server crashes and restart is forced on db delete attempt, you have some serious issues to fix in the broken databases.

I would suggest running tail -f /var/log/syslog | grep mysql while restarting mysql in a second terminal window, for hints on what is broken and how to repair it.
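
If the live tail scrolls by too fast, the same log can be filtered after the fact. This is only a sketch: mysql_log_hints is a hypothetical helper, and the keyword list is a guess at what is worth looking for, not an exhaustive set of MySQL messages.

```shell
# Print lines from a log file that mention mysql and look like problems.
mysql_log_hints() {
  grep -i 'mysql' "$1" | grep -i -E 'error|corrupt|crash|repair'
}

# Usage:
#   mysql_log_hints /var/log/syslog
```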

Forcing db server re-install will not help, because you have broken databases, not the server, so don't try to be too creative in debugging there.

That said, debugging and fixing crashed/broken databases is beyond the scope of the project and its support.

@cnergis Again, don't use boa command for upgrades. Please read and follow docs/UPGRADE.txt

omega8cc, thanks for the feedback .. and info ..

Sorry, been traveling .. Will now revisit this issue .. and come back with some feedback

Taken a good look now, and learnt a great deal more about how BOA components plug together ..

I think the initial problem was created during the install of 2.0.7 (it seems using the boa script created the situation, but I have used it in the past without issues) .. It seems boa moved aside the old databases and tried to recreate them, but failed half-way through .. (just a stab in the dark as to the original cause) .. I can see that the DB names changed for octopussomedomain and aegirsomedomain, and an attempt was made to recreate them .. unsuccessfully ..

And it seems like a name glob is used to match DBs for a domain, and the renamed and new (or expected) DB names differ by just 2 characters at the end .. So, this seemed to be creating all sorts of side effects
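
To illustrate why a prefix glob would be ambiguous here, a tiny sketch (this is NOT BOA's actual code, just a demonstration of the suspected effect; match_prefix and the name octopussomedomain_0 are made up for the example):

```shell
# Print every name from the argument list that starts with the prefix.
match_prefix() {
  local prefix="$1" name
  shift
  for name in "$@"; do
    case "$name" in
      "$prefix"*) echo "$name" ;;
    esac
  done
}

# Usage:
#   match_prefix octopussomedomain octopussomedomainor octopussomedomain_0 aegirsomedomain
```

With names that differ only in the last couple of characters, the prefix matches both, so anything that expects a single hit would pick up the renamed DB as well.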

I traced the DB changes via the database backups .. So, I guess my recovery would involve restoring the DBs as they were (including deleting the DBs with similar names), along with permissions and trying again .. I am doing this now ..

Will come back with some feedback .. Now, I can understand why omega8cc warns repeatedly against using boa for upgrades ..

Status: Postponed (maintainer needs more info) » Closed (works as designed)

We have introduced an extra procedure to sync all passwords before and after the upgrade, both for the Aegir system user and the hostmaster site, so it should work fine, as long as you follow the docs and no serious damage has already been done, such as a dropped database.

The commits for reference:

http://drupalcode.org/project/octopus.git/commit/d4538e9
http://drupalcode.org/project/octopus.git/commit/e2f5459

Status: Closed (works as designed) » Postponed (maintainer needs more info)

I still have this problem ..

But I think now the problem is with the bits of boa that generate a new directory under distro/00[0-9] .. presumably by calling drush commands ..

The primary error message now is:

Initializing drush commandfile: provision_cdn [1.43 sec, 13.02 MB]                                                  [bootstrap]
Initializing drush commandfile: provision_civicrm [1.43 sec, 13.02 MB]                                              [bootstrap]
Including /data/disk/o1/.drush/provision_civicrm/verify.provision.inc [1.43 sec, 13.02 MB]                          [bootstrap]
Including /data/disk/o1/.drush/provision/dns/verify.provision.inc [1.43 sec, 13.02 MB]                              [bootstrap]
Including /data/disk/o1/.drush/provision/platform/backupmigrate/verify.provision.inc [1.43 sec, 13.03 MB]           [bootstrap]
Including /data/disk/o1/.drush/provision_boost/verify.provision.inc [1.43 sec, 13.03 MB]                            [bootstrap]
Including /data/disk/o1/.drush/provision_cdn/verify.provision.inc [1.43 sec, 13.03 MB]                              [bootstrap]
Including /data/disk/o1/.drush/provision/platform/verify.provision.inc [1.43 sec, 13.03 MB]                         [bootstrap]
The directory /data/disk/o1/aegir/distro/007 does not contain a valid Drupal installation [1.43 sec, 13.03 MB]      [error]
Drush command terminated abnormally due to an unrecoverable error.                                                  [error]
Error: Call to undefined function conf_path() in /data/disk/o1/tools/drush/includes/environment.inc, line 823 [1.43
sec, 13.03 MB]
Output from failed command :                                                                                        [error]
Fatal error: Call to undefined function conf_path() in /data/disk/o1/tools/drush/includes/environment.inc on line

There are pointers there to keep trying to figure this out .. while awaiting input from omega8cc ..
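
One pointer I can at least check myself: whether each platform directory still looks like a Drupal tree at all. The sketch below is a rough approximation, assuming a Drupal 6/7 layout where core ships index.php and includes/bootstrap.inc; Drush's own validity check may look at different files. looks_like_drupal_root is a hypothetical helper name.

```shell
# Succeed if the directory contains two files that Drupal 6/7 core ships.
looks_like_drupal_root() {
  local dir="$1"
  [ -f "$dir/index.php" ] && [ -f "$dir/includes/bootstrap.inc" ]
}

# Usage:
#   for d in /data/disk/o1/aegir/distro/*/; do
#     if looks_like_drupal_root "$d"; then echo "ok $d"; else echo "incomplete $d"; fi
#   done
```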

Status: Postponed (maintainer needs more info) » Closed (cannot reproduce)

I'm afraid we can't provide further assistance, as something looks really broken there, sorry.

Make sure to use HEAD and not STABLE if you are trying to use the extended auto-recover capabilities built into BOA.

I can only guess that you have made things worse by manually deleting directories, instead of allowing BOA to try to recover from this mess.

Now it is hard to determine/say remotely what has been screwed up and how to fix this.

omega8cc, I thank you for your efforts in trying to help me get on top of this situation. In the end, I got too busy to try and troubleshoot the situation some more.

Besides, I had a relaxed attitude about the situation due to the fact that system was like a development system, and I was able to quickly migrate away the few important things on it.

I am starting afresh on a clean slate, and I have learnt so much more about the setup of the system.

Going forward, assuming I can find some spare cycles, in what ways can I contribute to this project ?

thanks and best regards

Issue summary:View changes

Don't know what I was thinking, but my domain name was showing in the text I had pasted here .. just fixed that.