Downloaded BOA.sh.txt as usual ..
then bash BOA.sh.txt
then .. boa in-stable public
The Octopus end of the upgrade always fails with:
==================
Octopus [Mon Apr 8 17:39:05 BST 2013] ==> UPGRADE B: Hostmaster STATUS: upgrade completed
Octopus [Mon Apr 8 17:39:05 BST 2013] ==> UPGRADE B: Simple check if Aegir upgrade is successful
Octopus [Mon Apr 8 17:39:07 BST 2013] ==> UPGRADE B: FATAL ERROR: Required file /data/disk/o1/aegir/distro/008/sites/octopus.somedomain.org/settings.php does not exist
Octopus [Mon Apr 8 17:39:07 BST 2013] ==> UPGRADE B: FATAL ERROR: Aborting AegirSetupB installer NOW!
Octopus [Mon Apr 8 17:39:07 BST 2013] ==> UPGRADE A: FATAL ERROR: AegirSetupB installer failed
Octopus [Mon Apr 8 17:39:07 BST 2013] ==> UPGRADE A: FATAL ERROR: Aborting AegirSetupA installer NOW!
Octopus [Mon Apr 8 17:39:07 BST 2013] ==> FATAL ERROR: AegirSetupA installer failed
Octopus [Mon Apr 8 17:39:07 BST 2013] ==> FATAL ERROR: Aborting Octopus installer NOW!
waiting 4 sec
Done for /data/disk/o1
==================
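A quick way to see which numbered platform directories actually contain the hostmaster settings.php that the installer is looking for is sketched below. It is only a minimal illustration: the function name check_settings is made up here, and the directory layout (distro/NNN/sites/SITE/settings.php) is taken from the error message above, not from the BOA scripts themselves.

```shell
# Report, for every three-digit directory under the distro root,
# whether sites/<site>/settings.php exists.
# check_settings is a hypothetical helper, not part of BOA.
check_settings() {
  distro_root="$1"
  site="$2"
  for dir in "$distro_root"/[0-9][0-9][0-9]; do
    # Skip the literal pattern when nothing matches the glob
    [ -d "$dir" ] || continue
    if [ -f "$dir/sites/$site/settings.php" ]; then
      echo "OK ${dir##*/}"
    else
      echo "MISSING ${dir##*/}"
    fi
  done
}

# Example call, using the paths from the log above:
# check_settings /data/disk/o1/aegir/distro octopus.somedomain.org
```

A directory reported as MISSING is one the "Simple check" step would presumably reject in the same way as 008 above.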
I followed some other threads, and deleted folders listed after the last successful upgrade .. which in my system seems to be /data/disk/o1/aegir/distro/007 ..
but whether running 'octopus up-stable o1' .. or restarting the entire thing via 'boa in-stable public' .. I always end up exactly with the same failure ..
I followed the thread of another issue, and enabled debugging before my last run .. but I am not sure which logfile needs to be uploaded to this issue ..
Comment | File | Size | Author |
---|---|---|---|
#5 | boa-2.0.8.issues-1.txt | 48.41 KB | cnergis |
#3 | boa-2.0.8.issues.txt | 64.85 KB | cnergis |
Comments
Comment #1
omega8cc commented:
You should *not* run the upgrade with the boa installer. Please read the docs/UPGRADE.txt linked on the project page and provide more details and full debug output.
Also, try to run before the upgrade:
syncpass fix o1
If this doesn't help, check if you have duplicate/old grants similar to the existing ones in the octopus.awatori.org/settings.php file and delete them first.
But really, enable debugging as required and attach a full console output for further assistance.
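The order suggested here (sync passwords first, then upgrade the Octopus instance with its own command rather than with boa) could be wrapped roughly as follows. This is a sketch, not the procedure from docs/UPGRADE.txt: the function name upgrade_octopus is hypothetical, only the two commands already quoted in this thread are used, and by default the function just prints what it would run.

```shell
# Hypothetical wrapper: sync passwords, then upgrade one Octopus instance.
# The second argument is a "runner"; the default "echo" makes this a dry run.
upgrade_octopus() {
  instance="$1"
  runner="${2:-echo}"
  # Stop before the upgrade if the password sync step fails
  $runner syncpass fix "$instance" &&
    $runner octopus up-stable "$instance"
}

# Dry run (prints the two commands):
# upgrade_octopus o1
# Real run (executes them through env):
# upgrade_octopus o1 env
```

Keeping the dry run as the default makes it harder to start an upgrade by accident while debugging is still being set up.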
Comment #2
omega8cc commented:
Also, make sure you have checked for duplicates in both the db and user tables in the mysql database, remove any duplicates using IP or similar but not used, then run syncpass fix o1 and then upgrade.
Comment #3
cnergis commented:
Thanks for the speedy response, omega8cc .. Much appreciated.
I checked the database, and there were two similarly named DBs, with one of them having no data and just one non-information_schema table .. So, I deleted that table, and it was also ironically the database in the settings file for the last upgrade I ran for boa-2.0.7 .. So, it is possible that this problem was created by the upgrade to 2.0.7 and not this one. Incidentally, mysql struggled in deleting this table and kept crashing, complaining that mysql.proc needed to be repaired (even after successful repairs of the mysql.proc table) .. In the end, I had to shut down mysql and move the innodb files for this DB out of the way, and then restart mysql.
I also found the owner of that DB I deleted in the user table, but unlike the owner of the seemingly more valid DB (with all tables and data), it did not have a user account corresponding to the aegir instance .. So, I also deleted this extra user account.
I then synced password as you suggested .. and again, ran 'bash BOA.sh.txt' .. followed by 'boa in-stable public' .. but I ended up with the same error result. My debug session is attached here ..
In hindsight, perhaps I should also have deleted the directory created by the latest wrong upgrade and the one before that (boa 2.0.7 upgrade) whose settings are pointing to the wrong DB .. But I will await your knowledgeable suggestions.
It is worth noting that I ran the very same upgrade on another completely different server, and it completed without incident ..
Comment #4
cnergis commented:
Please, replace 'deleting table' in my last comment with 'deleting database' .. apologies ..
Comment #5
cnergis commented:
I went back and deleted 008 corresponding to my last attempt to upgrade to boa-2.0.8 .. I also moved aside 007 which had the wrong DB information reported in my last post.
Then I synced passwords again .. and re-ran BOA.sh.txt and 'boa in-stable public' .. still the same end result, but with other error messages along the way .. see my debug session attached here
My current directory tree after my last 'boa in-stable public' looks like so:
agu:/data/disk/o1/aegir/distro# ls -ld *
drwxr-xr-x 8 o1 users 4.0K Jun 22 2012 001/
drwxr-xr-x 8 o1 users 4.0K Jul 16 2012 002/
drwxr-xr-x 8 o1 users 4.0K Jul 16 2012 003/
drwxr-xr-x 8 o1 users 4.0K Apr 9 13:33 004/
drwxr-xr-x 8 o1 users 4.0K Apr 9 13:33 005/
agu:/data/disk/o1/aegir/distro#
So, it seems the update process deleted the 006 as invalid, and then tried to use the 005 as the directory for the last successful upgrade .. or something along those lines ..
I am beginning to wonder if my previous recent upgrades have been failing silently ..
Comment #6
Anonymous (not verified) commented:
I see you're using Percona - I have had the same problem since the 2.0.6 upgrade, when I added _DB_ENGINE=InnoDB and _USE_STRONG_PASSWORDS=YES.
I try to upgrade o1 or o2 and it cannot find the settings.php file. Furthermore, I set _USE_CURRENT=YES in the octopus upgrade, and it does NOT reinstall on the current platform, but attempts to create a new distro anyway! Do you also have that problem?
I have since upgraded to MariaDB, but it did not help - I cannot even delete a database - neither from the command line nor from chive - I get lost-connection and socket connection errors, and attempting to drop a database causes mysql to restart - every time! I am on 2.0.8 stable now and wonder if forcing a rebuild of mysql would help? Is there a setting like _FORCE_MYSQL_REINSTALL=YES or something similar that I could add to the .barracuda.cnf file before running barracuda up-stable again, that might take care of this problem (without destroying the existing databases, just reinstalling the MySQL system, if this is possible)? (Guess I could always reinstall MySQL and import .sql backup files ...)
1 day later update:
I have been able to get MySQL from the command line and from chive working again by:
running barracuda up-stable again, this time setting _DB_BINARY_LOG=YES for the first time, and waiting 24 hours.
Both MySQL from the command line and chive still have small problems:
chive shows empty databases on the schema landing page that still cannot be deleted; trying gives an error which causes MySQL to restart immediately.
MySQL from the command line lists databases when I do a select, but when I try to drop them I get an error stating that the database does not exist.
Comment #7
cnergis commented:
EdNet, it does seem as if you have investigated your issues more extensively than I have .. But it also seems like your DB is in some strange state. Mind you, I have not tested any other delete operations against mysql, so it is not impossible that my challenges are as serious as yours.
But, if I see any further inconsistencies with mysql, then I will dump the entire databases, and rebuild but I am generally disinclined from doing this because the primary issue is likely more a script oriented and permissions issue, rather than a DB problem in particular (well, that is what I think) ...
In the worst case, it could be a matter of finding out how to take backups of the sites (without full access to the boa/octopus control panel), scrapping everything and starting again. I hate carrying inconsistencies along in the hope that they will go away. Sometimes, it is just more efficient to bite the bullet and rebuild.
If you have also opened an issue, then I will suggest that we await omega8cc's suggestions. I am a sysadmin (day job .. LOL), so I blame myself for not yet finding enough time to fully dig into the scripts that make up boa/barracuda/octopus and fully understanding exactly how they work .. A greater insight into how the scripts fit together and what they do will make resolving issues like these much easier than just mere guesswork ..
Comment #8
omega8cc commented:
@cnergis You shouldn't delete any tables or databases, only the duplicate records in the two tables of the "mysql" database that I have listed.
Look at this:
This means that either the db credentials for octopussomedomainor are incorrect or there are two conflicting grants with different passwords.
You have to either remove the duplicates or make them all match the credentials from this hostmaster site's settings.php file, with a command like:
mysql -u root -e "UPDATE mysql.user SET Password=PASSWORD('FOOBAR') WHERE User='octopussomedomainor';"
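Before changing any password, it may help to list every grant that exists for that user, so that duplicates stand out. The sketch below only prints two SELECT statements; the helper name grant_check_sql is made up for illustration, and the Password column is assumed to exist in mysql.user, as it does in the UPDATE above.

```shell
# Hypothetical helper: print SQL that lists all rows for one database user
# in the two grant tables mentioned in this thread (mysql.user and mysql.db).
grant_check_sql() {
  printf "SELECT User, Host, Password FROM mysql.user WHERE User='%s';\n" "$1"
  printf "SELECT User, Host, Db FROM mysql.db WHERE User='%s';\n" "$1"
}

# Example (not executed here):
# mysql -u root -e "$(grant_check_sql octopussomedomainor)"
```

More than one row per table for the same user, with different Host or Password values, would be the kind of conflicting grant described above.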
Note that the syncpass tool doesn't sync/fix these credentials (at least not yet). It only syncs the credentials for the Aegir system/backend user.
Comment #9
omega8cc commented:
@EdNet This issue has nothing to do with Percona/MariaDB differences or secure passwords or InnoDB/MyISAM etc.
If your db server crashes and restart is forced on db delete attempt, you have some serious issues to fix in the broken databases.
I would suggest running
tail -f /var/log/syslog | grep mysql
while restarting mysql in a second terminal window, for hints on what is broken and how to repair it.
Forcing a db server re-install will not help, because you have broken databases, not a broken server, so don't try to be too creative in debugging there.
That said, debugging and fixing crashed/broken databases is beyond the scope of the project and its support.
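The two-terminal approach above can be sketched as a pair of small helpers. The function names are hypothetical, and the default log path is simply the one from the command above. The --line-buffered flag is a GNU grep option that makes matches show up immediately while the pipe stays open, and -i is added here so that lines mentioning "MySQL" in any case are kept.

```shell
# Keep only log lines that mention mysql (case-insensitive).
filter_mysql_lines() {
  grep --line-buffered -i 'mysql'
}

# Follow a log file (default: /var/log/syslog) and show mysql-related lines.
watch_mysql_log() {
  tail -f "${1:-/var/log/syslog}" | filter_mysql_lines
}

# Terminal 1: watch_mysql_log
# Terminal 2: restart the database server; the exact restart command
#             depends on the system and is not specified in this thread.
```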
Comment #10
omega8cc commented:
@cnergis Again, don't use the boa command for upgrades. Please read and follow docs/UPGRADE.txt
Comment #11
cnergis commented:
omega8cc, thanks for the feedback .. and info ..
Sorry, been traveling .. Will now revisit this issue .. and come back with some feedback
Comment #12
cnergis commented:
Taken a good look now, and learnt a great deal more about how BOA components plug together ..
I think the initial problem was created during the install of 2.0.7 (it seems using the boa script created the situation, but I have used it in the past without issues) .. It seems boa moved the old databases aside, tried to recreate them, and failed half-way .. (just a stab in the dark as to the original cause) .. I can see that the DB names changed for octopussomedomain and aegirsomedomain, and an attempt was made to recreate them .. unsuccessfully ..
And it seems like a name glob is used to match DBs for a domain, and the renamed and new (or expected) DB names differ by just 2 characters at the end .. So, this seemed to be creating all sorts of side effects
I traced the DB changes via the database backups .. So, I guess my recovery would involve restoring the DBs as they were (including deleting the DBs with similar names), along with permissions and trying again .. I am doing this now ..
Will come back with some feedback .. Now, I can understand why omega8cc warns repeatedly against using boa for upgrades ..
Comment #13
omega8cc commented:
We have introduced an extra procedure to sync all passwords before and after the upgrade, both for the Aegir system user and the hostmaster site, so it should work fine, as long as you follow the docs and no serious damage has already been done, like a dropped database etc.
The commits for reference:
http://drupalcode.org/project/octopus.git/commit/d4538e9
http://drupalcode.org/project/octopus.git/commit/e2f5459
Comment #14
cnergis commented:
I still have this problem ..
But I think now the problem is with the bits of boa that generate a new directory under distro/00[0-9] .. presumably by calling drush commands ..
The primary error message now is:
There are pointers there to keep trying to figure this out .. while awaiting input from omega8cc ..
Comment #15
omega8cc commented:
I'm afraid we can't provide further assistance, as something looks really broken there, sorry.
Make sure to use HEAD and not STABLE if you are trying to use the extended auto-recover capabilities built into BOA.
I can only guess that you have made things worse by manually deleting directories, instead of allowing BOA to try to recover from this mess.
Now it is hard to determine/say remotely what has been screwed up and how to fix this.
Comment #16
cnergis commented:
omega8cc, I thank you for your efforts in trying to help me get on top of this situation. In the end, I got too busy to try and troubleshoot the situation some more.
Besides, I had a relaxed attitude about the situation due to the fact that system was like a development system, and I was able to quickly migrate away the few important things on it.
I am starting afresh on a clean slate, and I have learnt so much more about the setup of the system.
Going forward, assuming I can find some spare cycles, in what ways can I contribute to this project ?
thanks and best regards
Comment #16.0
cnergis commented:
Don't know what I was thinking, but my domain name was showing in the text I had pasted here .. just fixed that.