I tried to install Panopoly, which failed (#1837312: Troubles Implementing Default Permissions for Panelizer).
The annoying thing is that the log in the hostmaster frontend was not helping.

There are a lot of green lines, and some white ones, but none red...

After the standard site setup stuff it gets interesting with:

Backend invoke: php /usr/share/php/drush/drush.php --backend=2 --yes @panopoly2.dev.example.com provision-install-backend --client_email='webmaster@example.com' 2>&1

But then the inconspicuous line:
Undefined index: login_link install.provision.inc:60
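For illustration: the notice suggests the code around install.provision.inc:60 reads a 'login_link' key from the backend result without checking that it exists. A guard along these lines (variable names are my assumptions, not the actual provision code) would turn the silently-ignored notice into an explicit red line:

// Hypothetical sketch, not the actual install.provision.inc code:
// treat a missing 'login_link' key as a failed backend install.
if (isset($result['context']['login_link'])) {
  drush_log(dt('Login link: !link', array('!link' => $result['context']['login_link'])), 'success');
}
else {
  drush_log(dt('No login link returned; the backend install probably failed.'), 'error');
}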

And provision begins to clean up the site.

Is this related to #1166530: Aegir use of $integrate = FALSE in drush backend invoke?

If we don't want to integrate the full backend output, could we at least print a summary line to state a failure?
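For example, something like this after the backend invoke (a sketch only: provision_backend_invoke() and $target are illustrative, and 'error_status' is the key I believe Drush's backend sets when the sub-process fails, worth verifying against your Drush version's backend.inc):

// Sketch: print a one-line failure summary in the frontend log even
// when the full backend output is not integrated.
$result = provision_backend_invoke($target, 'provision-install-backend');
if (is_array($result) && !empty($result['error_status'])) {
  drush_log(dt('provision-install-backend failed with status @status.',
    array('@status' => $result['error_status'])), 'error');
}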

Comments

We should get the full log in the frontend, that's for sure.

This is still bugging me every now and then.

A use case:

The http://drupal.org/project/hosting_site_backup_manager module.
When you run an 'Export backup' task a second time, it fails.

The reason is clear: the link this task is trying to create already exists. (I should add a test for that ;))
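A pre-flight check along these lines (with hypothetical variable names) would catch the duplicate link before shelling out:

// Hypothetical sketch for hosting_site_backup_manager: bail out
// cleanly if the hard link we are about to create already exists.
$link = $export_dir . '/' . basename($backup_file);
if (file_exists($link)) {
  return drush_set_error('BACKUP_EXPORT_EXISTS',
    dt('The export @link already exists.', array('@link' => $link)));
}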

In the aegir frontend I only see green and white lines, yet the task itself is 'Failed'.

When I run the same task on the commandline:

$ drush --debug @www.example.com provision-export_backup /var/aegir/backups/www.example.com-20130503.144907.tar.gz
Bootstrap to phase 0. [0.01 sec, 3.04 MB]                            [bootstrap]
Drush bootstrap phase : _drush_bootstrap_drush() [0.02 sec, 3.21 MB] [bootstrap]
Loading drushrc "/var/aegir/.drush/drushrc.php" into "home.drush" scope. [0.02 sec, 3.22 MB]                      [bootstrap]
Loaded alias @www.example.com from file /var/aegir/.drush/www.example.com.alias.drushrc.php [0.03 sec, 3.23 MB]       [notice]
Loading drushrc "/var/aegir/platforms/myplatform/sites/www.example.com/drushrc.php" [bootstrap]
into "site" scope. [0.03 sec, 3.23 MB]
Loading drushrc "/var/aegir/platforms/myplatform/sites/www.example.com/drushrc.php" [bootstrap]
into "site" scope. [0.05 sec, 4.34 MB]
Cache HIT cid: 5.8-commandfiles-0-79226bd154e24f6bec355c2411ac7322 [0.06 sec, 4.35 MB]                                                 [debug]
Bootstrap to phase 0. [0.15 sec, 9.65 MB]                                                                                          [bootstrap]
Bootstrap to phase 0. [0.21 sec, 9.65 MB]                                                                                          [bootstrap]
Found command: provision-export_backup (commandfile=hosting_site_backup_manager) [0.22 sec, 9.65 MB]                               [bootstrap]
Loaded alias @myserveralias from file                                                                               [notice]
/var/aegir/.drush/myserveralias.alias.drushrc.php [0.74 sec, 9.68 MB]
Loaded alias @server_master from file /var/aegir/.drush/server_master.alias.drushrc.php [1.17 sec, 9.7 MB]                [notice]
Loading mysql driver for the db service [1.3 sec, 9.82 MB]                                                                            [notice]
Loading apache driver for the http service [1.31 sec, 9.91 MB]                                                                        [notice]
Loading mysql driver for the db service [1.31 sec, 9.92 MB]                                                                           [notice]
Loading apache driver for the http service [1.31 sec, 9.92 MB]                                                                        [notice]
Loaded alias @myplatformalias from file                                                                     [notice]
/var/aegir/.drush/myplatformalias.alias.drushrc.php [1.57 sec, 9.93 MB]
Start export backup command [1.72 sec, 9.99 MB]                                                                                    [ok]
Calling chdir(/var/aegir/backup-exports/myclient) [1.72 sec, 9.99 MB]                                                   [debug]
Executing: ln /var/aegir/backups/www.example.com-20130503.144907.tar.gz .
  ln: creating hard link `./www.example.com-20130503.144907.tar.gz': File exists
Calling chdir(/var/aegir/backups) [1.72 sec, 9.99 MB]                                                                 [debug]
Drush was unable to export the backup.                                                                                             [error]
The specific errors are below:
ln: creating hard link `./www.example.com-20130503.144907.tar.gz': File exists [1.72 sec, 9.99 MB]
Command dispatch complete [1.72 sec, 9.94 MB]                                                                                         [notice]
Peak memory usage was 11.2 MB [1.73 sec, 9.94 MB]

So the drush_shell_cd_and_exec() call returns an error and the else branch of the if logs it, but that output never makes it to the frontend....
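A fix could relay the captured shell output through drush_log() at error level, so it becomes a red line in the frontend. A sketch (variable names are placeholders; drush_shell_cd_and_exec() and drush_shell_exec_output() are existing Drush APIs):

// Sketch: log each captured output line as an error so it survives
// the trip back to the frontend task log.
if (!drush_shell_cd_and_exec($export_dir, 'ln %s .', $backup_file)) {
  foreach (drush_shell_exec_output() as $line) {
    drush_log($line, 'error');
  }
  return drush_set_error('BACKUP_EXPORT_FAILED',
    dt('Drush was unable to export the backup.'));
}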

Priority: Normal » Critical

I wonder if this is related to #1982502: Reduce complexity of backend invoke; consider removing concurrency. It seems that between drush 5.4 and 5.5, DRUSH_BACKEND packets started being redirected to stderr (see comment #4). In the Debian package, I fixed this problem by merging stderr into stdout (see #2002076: the debian package shows a lot of drush backend garbage).

Also, I believe this is a critical bug that should be fixed before 2.0 is released.

Status: Active » Needs review

I have a patch that sends the drush packets to the right place in #1982502: Reduce complexity of backend invoke; consider removing concurrency.

I also assumed that this was caused by #1982502: Reduce complexity of backend invoke; consider removing concurrency

I now have Debian Wheezy, PHP 5.4, Drush 6.x-dev (with drush-correct-minor-defects-1982502-40.patch), and Aegir 6.x-2.x.

A task has a failed status, but no red log lines.

From the command line:
drush @hostmaster hosting-task <task-nid> --force

This gives no output there, but viewing the task node afterwards does include the error lines.

A retry initiated from the frontend results in a log without the error lines.
---
Disclaimer: I'm messing with devshop and valkyrie

Running such a task via the regular cron-driven hosting-dispatch also works OK.

So I guess it's a hosting-queued daemon issue

Cron does: drush.php '@hostmaster' hosting-dispatch

Queued does: php /usr/bin/drush @hostmaster hosting-queued

Do these invoke in a different way?

In hosting-queued, we invoke like so:

drush_invoke_process('@self', 'hosting-task', array($task->nid));

whereas in hosting-dispatch, we invoke a queue command (hosting-tasks), which in turn does:
drush_invoke_process('@self', "hosting-task", array($task->nid), array('invoke' => TRUE), array('fork' => TRUE));

First I tried just changing hosting-queued to invoke like hosting-dispatch in hosting_tasks_queue():

drush_invoke_process('@self', 'hosting-task', array($task->nid), array('invoke' => TRUE), array('fork' => TRUE));

This got the error output into the front-end task log, but produced an error in queued:

This task is not queued, use --force                                 [error]

So it looks like queued is trying to run the task twice. In fact, this is what happens when we pass '--force':

drush_invoke_process('@self', 'hosting-task', array($task->nid), array('invoke' => TRUE, 'force' => TRUE), array('fork' => TRUE));

If we remove the 'fork' backend option, we lose the output in the front-end task, but the task doesn't try to run twice:

drush_invoke_process('@self', 'hosting-task', array($task->nid), array('invoke' => TRUE));

If we instead set the 'fork' backend option to FALSE, we don't get the output in the front-end, but we get a bunch of output from hosting-queued:

drush_invoke_process('@self', 'hosting-task', array($task->nid), array('invoke' => TRUE), array('fork' => FALSE));

I'm not really sure what's going on here, nor how to fix it. The Drush documentation of these invoke commands is pretty opaque.
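For anyone else digging: the Drush 5 signature, plus my reading of the backend options used above (the option semantics here are my interpretation of backend.inc, so treat them as assumptions):

// drush_invoke_process($site_alias_record, $command_name,
//   $commandline_args = array(), $commandline_options = array(),
//   $backend_options = TRUE);
//
// $backend_options keys seen in this thread:
//   'fork'        => TRUE backgrounds the sub-process via proc_open()
//                    and '&', so the caller cannot wait on it;
//   'interactive' => TRUE attaches the sub-process to the calling tty;
//   'integrate'   => FALSE keeps the child's log entries out of the
//                    parent's log (see #1166530).
drush_invoke_process('@self', 'hosting-task', array($task->nid),
  array('invoke' => TRUE), array('integrate' => TRUE));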

I just tried "changing hosting-queued to invoke like hosting-dispatch in hosting_tasks_queue():"

diff --git a/queued/hosting_queued.drush.inc b/queued/hosting_queued.drush.inc
index d4474c1..fe7abea 100644
--- a/queued/hosting_queued.drush.inc
+++ b/queued/hosting_queued.drush.inc
@@ -103,7 +103,7 @@ function drush_hosting_queued() {
       sleep(1);
       // Execute the task in the backend
-      drush_invoke_process('@self', 'hosting-task', array($task->nid));
+      drush_invoke_process('@self', 'hosting-task', array($task->nid), array('invoke' => TRUE), array('fork' => TRUE));
       drush_log(dt('Finished executing task.'));
       // Delay for a configurable amount of time.

After starting the queue daemon manually with --verbose, I also saw the "This task is already running, use --force" error three times, and three invokes for the task... odd, but the task succeeds.

Setting the hosting_queued_post_task_delay variable to 30 seconds prevented these from occurring.
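(For reference, assuming the variable name above is exact, it can be set like any Drupal variable:)

// Persist the post-task delay, in seconds; the drush equivalent would
// be: drush @hostmaster vset hosting_queued_post_task_delay 30
variable_set('hosting_queued_post_task_delay', 30);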

So... concurrency?

The queue daemon loops: it finds a task, forks, and restarts the loop to look for a new one. The "This task is already running, use --force" error seems to be what's saving us from having the task executed multiple times... a good thing.

Setting the hosting_queued_post_task_delay variable to 2 seconds reduced it to just one 'already running' error.

So this side effect of forking seems to be just a race condition that's already being caught. drush_hosting_task_validate() updates the task status; maybe that can be done earlier? More importantly, we do get the full log.
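Roughly what 'earlier' could look like (a sketch only: the function body, the task_status field, and the HOSTING_TASK_PROCESSING constant are my assumptions about hosting's internals):

// Hypothetical sketch: claim the task at the very top of validation so
// a fast-looping daemon sees it as running as soon as possible.
function drush_hosting_task_validate($id) {
  $task = node_load($id);
  if ($task->task_status == HOSTING_TASK_PROCESSING && !drush_get_option('force', FALSE)) {
    return drush_set_error('HOSTING_TASK_RUNNING',
      dt('This task is already running, use --force'));
  }
  // Narrow the race window: mark the task as processing before any
  // other validation work happens.
  $task->task_status = HOSTING_TASK_PROCESSING;
  node_save($task);
}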

Status: Needs review » Fixed

Okay, so I think we've identified the proper solution here. This is fixed in 850e53cb. We've also identified a new bug, which I've documented in #2025173: Possible race condition with _hosting_get_new_tasks().

Feel free to revert the commit and re-open this issue, if anyone feels we've taken the wrong approach.

Status: Fixed » Needs review

So I changed that to --quiet instead of fork, because that's what we were looking for. This still needs some testing, but forking is definitely not the right way to do this. To quote my commit:

commit 8b85ad495fcb720fbd062c8098daf9a02a965b0b
Author: Antoine Beaupré <anarcat@koumbit.org>
Date:   Fri Jun 21 15:30:08 2013 -0400
    don't fork when we fire up tasks with the queue daemon
    the whole idea of the queue daemon is to process one task at a time
    and wait for them to be processed. if we fork, we lose track of them
    and they will pile up like crazy, and we really don't want that.
    instead, use --quiet to start the drush sub-process, which is the
    side-effect of fork=TRUE that we were looking for. masochists can look
    in _drush_backend_invoke() in drush 5's backend.inc (line 984) for
    this marvel of engineering.
diff --git a/queued/hosting_queued.drush.inc b/queued/hosting_queued.drush.inc
index fe7abea..855dd00 100644
--- a/queued/hosting_queued.drush.inc
+++ b/queued/hosting_queued.drush.inc
@@ -103,7 +103,7 @@ function drush_hosting_queued() {
       sleep(1);
       // Execute the task in the backend
-      drush_invoke_process('@self', 'hosting-task', array($task->nid), array('invoke' => TRUE), array('fork' => TRUE));
+      drush_invoke_process('@self', 'hosting-task', array($task->nid), array('invoke' => TRUE, 'quiet' => TRUE));
       drush_log(dt('Finished executing task.'));
       // Delay for a configurable amount of time.

Status: Needs review » Needs work

So this doesn't work; there's got to be some other magic going on here.

Unfortunately, that doesn't work. Here's the output after that change:

php /usr/share/drush/drush.php  --backend=2 --invoke --quiet --verbose --debug --uri=aegir2.aegir2.local --root=/var/aegir/hostmaster-6.x-2.x  hosting-task 212 --context_type=site --platform='@platform_hostmaster' --server='@server_master' --db_server='@server_localhost' --site_path=/var/aegir/hostmaster-6.x-2.x/sites/aegir2.aegir2.local --site_enabled --language=en --client_name=admin --redirection= --cron_key= --profile=hostmaster 2>&1

and here's what was happening with 'fork':

Calling proc_open(php /usr/share/drush/drush.php  --invoke --verbose --debug --uri=aegir2.aegir2.local --root=/var/aegir/hostmaster-6.x-2.x  hosting-task 212   --context_type=site --platform='@platform_hostmaster' --server='@server_master' --db_server='@server_localhost' --site_path=/var/aegir/hostmaster-6.x-2.x/sites/aegir2.aegir2.local --site_enabled --language=en --client_name=admin --redirection= --cron_key= --profile=hostmaster 2>&1 --quiet &);

The proc_open() is apparently what's making this work. So here's what happens when we replace 'fork' with 'interactive':

Calling proc_open(php /usr/share/drush/drush.php  --invoke --verbose --debug --uri=aegir2.aegir2.local --root=/var/aegir/hostmaster-6.x-2.x  hosting-task 212   --context_type=site --platform='@platform_hostmaster' --server='@server_master' --db_server='@server_localhost' --site_path=/var/aegir/hostmaster-6.x-2.x/sites/aegir2.aegir2.local --site_enabled --language=en --client_name=admin --redirection= --cron_key= --profile=hostmaster > `tty` 2>&1);
sh: cannot create /dev/pts/1: Permission denied

This appears to fail because of the redirect to 'tty' in interactive. Commenting out the relevant bit from _drush_backend_generate_command() in includes/backend.inc in Drush makes this work properly:

  else {
    // TODO: `tty` is not usable on Windows.  Is this necessary at all, and if so, is there a better way to do it?
//    if (!drush_is_windows() && !empty($backend_options['interactive'])) {
//      $command .= ' > `tty`';
//    }
  }
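An alternative to commenting the block out entirely might be to only use the tty when it is actually writable; a sketch under that assumption:

// Sketch: skip the tty redirect when we cannot write to it, so
// hosting-queued running under sudo doesn't hit 'Permission denied'.
if (!drush_is_windows() && !empty($backend_options['interactive'])) {
  $tty = trim((string) shell_exec('tty 2>/dev/null'));
  if ($tty && $tty != 'not a tty' && is_writable($tty)) {
    $command .= ' > ' . $tty;
  }
}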

I have opened #2025527: > `tty` is evil and killed my cat for this; let's see if we can factor this into Drush 5, even as a custom package or something...

Note that this affects only hosting-queued, and only when it is run through sudo as the aegir user. If you fix your terminal so that aegir can actually write to `tty`, you should be able to use 'interactive' without problems.

Status: Needs work » Postponed

This is fixed in a33d62da4e by switching to 'interactive' mode. Unfortunately, it requires the patch in #4 of #2025527: > `tty` is evil and killed my cat. Marking as 'postponed' pending resolution of that issue, so we don't lose track of this.

Can we instead introduce something into Drush that does something like 'interactive' but only does the thing we want, which I think is returning all the output at the end rather than sending packets as the command runs?

Is there some benefit to getting all the output at the end? The AJAX stuff you did for Aegir 3.x, which allows our task log to be updated live, is really cool. Wouldn't switching away from packets break this?

Status: Postponed » Needs review

#2025527: > `tty` is evil and killed my cat has been committed to Drush 8.x-6.x, and I've requested it be backported to Drush 5. Perhaps we should revisit #2002804: [meta] depend on Drush 8.x-6.x?

Status: Needs review » Postponed

I would prefer to lobby for inclusion in Drush 5.10.

Status: Postponed » Active

So, what's the status here? Are we waiting for Drush 5.10?

I believe so. Moshe wants a 5.10 too, so it'll happen sooner or later. The issue to track is https://github.com/drush-ops/drush/issues/71

Status: Active » Fixed

The Drush 5.10 release is out... so whichever Drush major version we ship with, this issue should be fixed now, right?

Let's assume so; we'll see soon enough...

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.
