
Clients are stalling - confirm that auto-reset/self heal code works

Project: Project Issue File Review
Version: 6.x-2.x-dev
Component: Miscellaneous
Category: bug report
Priority: critical
Assigned: boombatower
Status: closed (fixed)

Issue Summary

Twice today I had to suspend testing temporarily because of weirdness in HEAD (see #337820: Rename menu path 'logout' to 'user/logout' for consistency). Both times I noticed this behaviour:

Sad Panda.

0 patches being tested, yet two slaves showing "busy." Chad confirmed that his slave (#7) is stalled on test # 7311.

Once things get rolling again, it only says "2" patches are being tested, even though all 4 slaves show as busy. The queue is also going down at a glacial pace, which implies that slaves really are working half-time.

So:
a) We need to un-freeze the frozen slaves
b) We should probably put in a ping-back or something that will cancel the test run if the slave doesn't respond within X seconds, to help prevent this in the future.
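
To make (b) concrete, here is a minimal sketch of what such a ping-back timeout could look like. It is not the project's actual code: the `Watchdog` class, its method names, and the 60-second timeout are all hypothetical, chosen only to illustrate the idea of cancelling and re-queueing a test when a slave goes quiet.

```python
from dataclasses import dataclass, field


@dataclass
class Watchdog:
    """Hypothetical tracker: which test each slave runs and when it last pinged."""
    timeout: float                                # the "X seconds" from (b)
    running: dict = field(default_factory=dict)   # slave_id -> (test_id, last_ping)
    queue: list = field(default_factory=list)     # test ids waiting to be re-run

    def assign(self, slave_id, test_id, now):
        # Handing out a test counts as the first ping.
        self.running[slave_id] = (test_id, now)

    def ping(self, slave_id, now):
        # Refresh the timestamp; pings from idle or unknown slaves are ignored.
        if slave_id in self.running:
            test_id, _ = self.running[slave_id]
            self.running[slave_id] = (test_id, now)

    def finish(self, slave_id):
        # The slave reported a result, so it is no longer busy.
        self.running.pop(slave_id, None)

    def reap(self, now):
        # Cancel every run whose slave has been silent longer than the timeout
        # and put its test back on the queue. Returns the stalled slave ids.
        stalled = [s for s, (_, last) in self.running.items()
                   if now - last > self.timeout]
        for s in stalled:
            test_id, _ = self.running.pop(s)
            self.queue.append(test_id)
        return stalled


# Example: slave 7 never pings back while running test 7311.
wd = Watchdog(timeout=60.0)
wd.assign(7, 7311, now=0.0)
wd.assign(3, 7312, now=0.0)
wd.ping(3, now=50.0)
stalled = wd.reap(now=90.0)   # slave 7 is reaped, slave 3 keeps running
```

Because the "busy" state and the running test live in the same record here, reaping a slave also clears its busy flag, which would avoid the mismatch between the busy count and the number of patches being tested.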

Comments

#1

Incidentally, now the number of testing patches is at 4 again. Not sure what, if anything, changed to cause this.

#2

Yeah, I noticed this during my leave, as I checked t.d.o at various times... I actually reset the slaves once, which fixed the issue.

I'll look into this and the other issues as soon as possible. It has been a bit interesting lately trying to find time to work.

#3

Status: active » postponed (maintainer needs more info)

FWIW, I haven't seen this the last 2-3 times that I've re-enabled testing. So this might now be fixed, especially if you did something on your end the last time it happened.

Marking postponed (maintainer needs more info) for now; but if it doesn't recur in a couple of weeks I recommend setting to fixed/closed.

#4

Title: Slaves are stalling » Slaves are stalling - confirm that auto-reset/self heal code works
Assigned to: Anonymous » boombatower

I'll see if I can get some testing done to ensure this code works as expected.

#5

Title: Slaves are stalling - confirm that auto-reset/self heal code works » Clients are stalling - confirm that auto-reset/self heal code works
Version: 6.x-1.x-dev » 6.x-2.x-dev

#6

Status: postponed (maintainer needs more info) » fixed

Let's reopen this if it still occurs, or file a new issue.

#7

Status: fixed » closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.