Just working on the command line, I find that stagingvm is behaving abysmally. You can be in the middle of typing a command and the shell will stall.
A quick check with top shows a typical VM story: high I/O wait time while the machine itself is doing almost nothing. This typically happens when the physical node hosting the VM has more disk access than it can handle.
I think we probably need to find a new home for stagingvm.
Attached is a top snapshot showing the situation.
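
For anyone who wants a number rather than a screenshot, the I/O wait share that top reports can be approximated from two samples of the aggregate `cpu` line in `/proc/stat`. This is only a sketch under the assumption of a Linux guest; the function names are made up for illustration and are not part of any existing tooling on stagingvm.

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat on Linux.
# The trailing guest counters are skipped because they are already
# included in the user and nice counters.
CPU_FIELDS = ("user", "nice", "system", "idle",
              "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Return the first eight counters of a /proc/stat 'cpu' line as a dict."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    counters = [int(v) for v in parts[1:1 + len(CPU_FIELDS)]]
    # Older kernels report fewer fields; treat the missing ones as zero.
    counters += [0] * (len(CPU_FIELDS) - len(counters))
    return dict(zip(CPU_FIELDS, counters))


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    deltas = {name: a[name] - b[name] for name in CPU_FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return 0.0
    return 100.0 * deltas["iowait"] / total


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate cpu line, which is the first line of the file."""
    with open(path) as stat_file:
        return stat_file.readline()


def sample_iowait(interval=1.0):
    """Take two samples `interval` seconds apart and return the iowait share."""
    before = read_cpu_line()
    time.sleep(interval)
    return iowait_percent(before, read_cpu_line())
```

Calling `sample_iowait(5)` on the VM gives a five-second average. A consistently high value with little user or system time would match the picture in the snapshot.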

| Comment | File | Size | Author |
|---|---|---|---|
| | Screenshot-rfay@stagingvm:-usr-local-sbin.png | 65.39 KB | rfay |
Comments
Comment #1
rfay
Calling this fixed for now after the disk back-end change yesterday. Will reopen if it looks problematic.
Comment #3
rfay
All of a sudden we have really poor performance again. Just starting a vi session takes seconds.
I've disabled svn updates and similar jobs because they were never completing and were stomping on each other.
Comment #4
rfay
The every-5-minute cron job that checks out and processes themes and such normally takes less than a minute to run, but it is now often taking more than 5 minutes. I've changed it to run every 10 minutes instead of every 5, but it looks like something has gone haywire. You can feel the slowness even on the command line.
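
Whatever the interval, runs can still pile up if one of them takes longer than the gap between them. A common way to guard against that is a non-blocking lock file, so a run that finds the previous one still working simply exits. The following is a minimal sketch of that idea, not the actual cron script; the lock path and the `run_exclusive` helper are hypothetical.

```python
import fcntl
import os


def try_lock(path):
    """Try to take an exclusive lock on `path` without waiting.

    Returns an open file descriptor on success, or None if another
    run already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # Lock is held elsewhere (or could not be taken); give up quietly.
        os.close(fd)
        return None
    return fd


def release(fd):
    """Drop the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


def run_exclusive(lock_path, job):
    """Run `job()` only if no other run holds the lock.

    Returns True if the job ran, False if it was skipped.
    """
    fd = try_lock(lock_path)
    if fd is None:
        return False
    try:
        job()
    finally:
        release(fd)
    return True
```

A cron wrapper could then call something like `run_exclusive("/tmp/theme-update.lock", update_themes)`, where both the path and `update_themes` are placeholders. The kernel drops the lock automatically if the process dies, so a crashed run does not block later ones.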
Comment #5
calebgilbert commented
It sounds like the symptoms rfay is describing come from stagingvm. When I checked stagingvm, the load was a bit high for a moment: rsync was at 18% CPU and then php spiked to 100% for a second (I'm not sure what these are from; anyone?), but then it went back down.
The worst thing I saw was that loading a staging site takes forever, which seems to come from stagingdb: it looks like it is tapped out on memory and swapping quite a bit: http://screencast.com/t/MDlhYTNiYWY
So there is definitely a problem on stagingdb, and possibly a problem on stagingvm that I wasn't able to replicate...
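
To put a figure on "swapping quite a bit", the swap usage can be read from `/proc/meminfo`. The sketch below assumes a Linux host and uses made-up helper names; it only parses the text, so it can be tried against a saved copy of the file as well.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict of integer values.

    Most entries are reported in kB; the unit suffix is ignored here.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def swap_used_percent(values):
    """Percentage of swap in use, or 0.0 if no swap is configured."""
    total = values.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    used = total - values.get("SwapFree", 0)
    return 100.0 * used / total


def read_swap_used_percent(path="/proc/meminfo"):
    """Convenience wrapper that reads the live file."""
    with open(path) as meminfo:
        return swap_used_percent(parse_meminfo(meminfo.read()))
```

Running `read_swap_used_percent()` on stagingdb a few times during a slow page load would show whether swap use stays high. A high value only says swap is occupied; the swap-in and swap-out rates from a tool such as vmstat are the better indicator of active swapping.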
Comment #6
rfay
It did settle down. The rsync is from the every-5-minute update (which is just rsyncing locally). That's the amazing thing: that a local filesystem rsync could get so slow.
Once I spaced out the cron jobs and they didn't then overlap each other, we at least didn't get hundreds of rsync jobs.
I'd say it is a disk i/o issue.
Comment #7
obrienmd commented
I assume D.O has enough infrastructure for stuff like this, but my workplace might be able to donate a non-oversubscribed VM on one of our vmserver (kvm) clusters for this.
Comment #8
gerhard killesreiter commented
The OSL has had difficulties with disk space on the machines hosting the VMs for a while. They are working on resolving this.
Comment #9
rfay
Good enough.