I wanted to bring people up to date on events that happened last night regarding drupal.org. I am paraphrasing most of this from an email sent to me by our lead systems engineer here at the OSL (he and his team are working on the problem).

First things first: there was no loss of data during this outage. The Drupal infrastructure is backed up nightly.

At 3am PST we experienced an outage on drupal2.osuosl.org. The machine stopped responding to pings, and the monitoring system noticed and tried to page us. For some reason, the pages did not make it out. We'll be investigating this further after patching up drupal2. We finally got to the machine at 8am PST to find that it was up and operational. Upon reboot, the machine appeared to have problems with udev (it didn't see all of its partitions). Digging further, we found the machine had some problems with a recent upgrade that was applied to all of the Drupal infrastructure. More on that in a moment.

Corey cut over DNS for drupal.org to drupal1.osuosl.org. Unfortunately, the time-to-live on the DNS entry "drupal.org" was set at 86400 seconds (one day). This has been lowered so we can recover from these types of problems more quickly in the future. Some people will continue to experience problems connecting to drupal.org until 11am PST 1/21/2006 due to DNS caching by upstream providers. If you can clear your DNS cache, this will resolve the problem immediately.
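For anyone wondering where that 11am figure comes from: a resolver that cached the old record just before the change can keep serving it for a full TTL. Here is a minimal sketch of that arithmetic in Python. The cutover time of 11am on 1/20/2006 is an assumption inferred from the expiry time above, the 300-second value is purely hypothetical (the new TTL is not stated here), and the function name is made up for illustration.

```python
from datetime import datetime, timedelta


def worst_case_cache_expiry(cutover: datetime, ttl_seconds: int) -> datetime:
    """Latest moment a resolver may still serve the old record.

    A resolver that cached the old answer right before the cutover
    may keep it for one full TTL.
    """
    return cutover + timedelta(seconds=ttl_seconds)


OLD_TTL = 86400  # seconds (one day), as stated in the post
HYPOTHETICAL_NEW_TTL = 300  # assumption: the actual lowered value is not given

# Assumption: cutover at 11am PST on 1/20/2006, inferred from the post.
cutover = datetime(2006, 1, 20, 11, 0)

print(worst_case_cache_expiry(cutover, OLD_TTL))
print(worst_case_cache_expiry(cutover, HYPOTHETICAL_NEW_TTL))
```

With a one-day TTL the worst case lands a full day after the change; with a TTL of a few minutes, a future cutover would be visible to nearly everyone almost immediately.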

Moving forward, Corey and his team are going to be rebuilding drupal2.osuosl.org. Upon completion, they will rebuild drupal1.osuosl.org (after changing DNS, etc.) to make sure that the recent upgrades that affected drupal2 don't impact drupal1. This is a "just in case" situation.

We're working diligently to get the problem resolved. As of this writing, drupal2 is almost completely rebuilt and should be on-line soon.

Comments

dmuth’s picture

While I had stale DNS data, I tried connecting to the IP of drupal1 (http://140.211.166.61/), as well as tunnelling a connection through a coloed box of mine, and appeared to hit a bad default settings.php, as I got this message:

Unable to connect to database server

This either means that the username and password information in your settings.php file is incorrect or we can't contact the MySQL database server. This could mean your hosting provider's database server is down.

The MySQL error was: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2).

Currently, the username is username and the database server is localhost.

Is there any chance that the default settings.php file could be updated so that we can connect to the server by IP if need be in the future?

Thanks,

-- Doug

--
Douglas Muth, Philadelphia, PA
http://www.claws-and-paws.com/

BryanSD’s picture

You never know how much you're going to miss something until it's not there. I'm glad to see drupal.org back up and running! At least the absence gave me some time to work on my own projects/sites, meet up with a friend I haven't seen in months for a beer, say hello to the wife and kid. OK... so maybe I have a Drupal addiction that needs to be taken care of. At least I can get my fix once again!

-Bryan

New Drupal Site Coming Soon:
CMSReport

New SMF Site:
WebCMS Forum

freyquency’s picture

Yeah, my finger is sore from clicking 'refresh'...

binduwavell’s picture

You might try the ReloadEvery extension for Firefox:

http://reloadevery.mozdev.org/

-- Bindu Wavell
VP. Engineering
Zia Consulting

freyquency’s picture

:D Thanks for the recommendation!

BryanSD’s picture

I wonder how many website owners watching bandwidth/performance are going to love that tool? Now it would be cool if the tool also gave an audio alarm when the contents of the page changed.

Dries’s picture

Thanks Scott, Corey, Matt and the rest of the OSL team. Your support and experience have been invaluable! You guys are FOSS heroes.

bradtem’s picture

In my logs I am seeing a most unusual DDoS. It's not referral spam (of which I also get lots), but I am seeing growing numbers of web hits for random nodes, and the referers are other Drupal sites' comment pages, none of which have any link to my site. Here's what's in the logs (referer, then target on http://ideas.4brad.com):

http://www.computerworld.com/blogs/comment/reply/1405   /node/218
http://www.computerworld.com/blogs/comment/reply/1477   /node/228
http://brentrasmussen.com/log/comment/reply/407 /node/273
http://www.teradome.com/comment/reply/518       /node/234
http://walkah.net/comment/reply/67      /node/266
http://www.subterrain.net/drupal/comment/reply/87       /node/325
http://nashvillesnews.net/comment/reply/37579   /node/300
http://www.computerworld.com/blogs/comment/reply/1595   /node/294
http://www.bluenc.com/comment/reply/286 /node/254
http://turkwatch.com/comment/reply/9    /node/283
http://www.shalomctr.org/comment/reply/896      /node/224
http://lifeoflevi.com/drupal/comment/reply/356  /node/227
http://brentrasmussen.com/log/comment/reply/442 /node/317
http://texo.ca/comment/reply/101        /node/262
http://www.computerworld.com/blogs/comment/reply/1609   /node/284
http://www.reallivepreacher.com/comment/reply/583       /node/257
http://www.codepoets.co.uk/comment/reply/17     /node/281
http://dig.csail.mit.edu/breadcrumbs/comment/reply/6    /node/323
http://www.computerworld.com/blogs/comment/reply/1436   /node/253

Very odd. Coming from distributed addresses. Not enough to slow me down but is it presaging something?
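
For anyone seeing similar entries, one quick way to get a feel for the pattern is to tally the referers by host. Below is a minimal sketch in Python, assuming each log entry is a referer URL followed by a target path, separated by whitespace, as in the listing above. The function name is made up for illustration, and the sample is just the lines pasted in this comment.

```python
from collections import Counter
from urllib.parse import urlsplit

# Entries copied from the listing above: "<referer URL> <target path>".
LOG_LINES = """
http://www.computerworld.com/blogs/comment/reply/1405   /node/218
http://www.computerworld.com/blogs/comment/reply/1477   /node/228
http://brentrasmussen.com/log/comment/reply/407 /node/273
http://www.teradome.com/comment/reply/518       /node/234
http://walkah.net/comment/reply/67      /node/266
http://www.subterrain.net/drupal/comment/reply/87       /node/325
http://nashvillesnews.net/comment/reply/37579   /node/300
http://www.computerworld.com/blogs/comment/reply/1595   /node/294
http://www.bluenc.com/comment/reply/286 /node/254
http://turkwatch.com/comment/reply/9    /node/283
http://www.shalomctr.org/comment/reply/896      /node/224
http://lifeoflevi.com/drupal/comment/reply/356  /node/227
http://brentrasmussen.com/log/comment/reply/442 /node/317
http://texo.ca/comment/reply/101        /node/262
http://www.computerworld.com/blogs/comment/reply/1609   /node/284
http://www.reallivepreacher.com/comment/reply/583       /node/257
http://www.codepoets.co.uk/comment/reply/17     /node/281
http://dig.csail.mit.edu/breadcrumbs/comment/reply/6    /node/323
http://www.computerworld.com/blogs/comment/reply/1436   /node/253
""".strip().splitlines()


def count_referer_hosts(lines):
    """Count how many entries claim each referring host."""
    hosts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip anything that is not "referer target"
        referer, _target = parts
        hosts[urlsplit(referer).hostname] += 1
    return hosts


for host, hits in count_referer_hosts(LOG_LINES).most_common():
    print(f"{hits:3d}  {host}")
```

A tally like this only summarizes what the referer header claims; since that header is supplied by the client, it says nothing certain about where the requests really come from.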