Author |
Message |
Michael Chu
Joined: 10 May 2005 Posts: 1654 Location: Austin, TX (USA)
|
Posted: Sat Mar 01, 2008 3:31 pm Post subject: The great server crash of 2008 |
|
|
For those of you wondering what happened to Cooking For Engineers this past week - well, we experienced a hard drive failure on our server. There was a bit of a communication problem between me and my hosting company, which delayed the delivery of the downed hard disk to a drive recovery center, so I didn't get the data until this morning. The problems were exacerbated by the fact that I was traveling with Tina in New York City and did not have access to my offline backups, server deployment notes, or anything else that could have helped. Anyway, the site's back up now and (I think) fully operational.
For those interested, here's the timeline as best as I can figure it out.
Saturday, Feb. 23 - Sometime close to midnight, the primary hard drive crashed, causing physical damage to parts of the disk. I was online working on Fanpop and unaware of this at the time.
Sunday, Feb. 24 - After five hours of sleep, I went to check on the updates I made to Fanpop and discovered that Fanpop could not verify the existence of Cooking For Engineers (it's on my profile). Logging into the CFE server, I discovered I could not do anything due to a series of I/O errors. A sinking feeling in my stomach told me it was a drive crash; at the same time, the alarm clock was telling me I was going to be late for my tour of the NBC Studios at 30 Rockefeller Plaza. I sent a quick e-mail to Surpass Hosting to ask them to find out what the issue was. Around noon Eastern time, I found myself on the phone about 800 ft above the ground overlooking Manhattan with Surpass on the line explaining that my drive had failed and they were trying to see if they could recover my data. My secondary drive system was fully intact, but since I never finished setting up the automated backup scripts, it was empty. (Every week, I would tell myself that I'd get to it this weekend, but it never happened.) The rest of the communication with Surpass occurred via e-mail on my telephone. I was informed that data recovery was unsuccessful and they were building a fresh drive for my system. I asked if they could ship the failed drive so it could go to a data recovery expert. (I didn't want to lose a couple of months' worth of comments - and, in addition, I didn't have any offline backups of the content on Orthogonal Thought.) They confirmed they could do that and I provided my address before going to bed.
Monday, Feb. 25 - I got up and checked my e-mail to see if they had shipped the drive. They had not. I had an e-mail asking me for my address - which I had already provided. I provided it again and contacted DriveSavers in Novato, CA to discuss my options. When I checked e-mail again after returning to the hotel room that evening (after attending a taping of Late Night with Conan O'Brien), I found another e-mail that said they were unable to ship the drive because I'd have to BUY it first for $90. Yes, yes, of course I'll pay the ransom! Just ship it! I wrote back telling them to do whatever was needed, just get the drive out the door.
Tuesday, Feb. 26 - Didn't receive a response from Surpass in the morning, but DriveSavers did call because they were concerned about the delay. I sent a few more e-mails to Surpass to make sure they shipped directly to DriveSavers via overnight. I finally received a message back saying they were waiting for a manager to come and authorize the shipment without my having paid for the drive. Finally, while dragging my bags to the subway in the rain on the way to the airport to return to California, I received a call from Surpass apologizing for the delays and the clearly faulty protocol (which they say they've fixed). The drive was shipped overnight to DriveSavers and a tracking number was provided.
Wednesday, Feb. 27 - DriveSavers confirmed that they received the drive and started working on it.
Thursday, Feb. 28 - DriveSavers successfully salvaged the data after taking the drive apart in a clean room and reading the electrical response in the regions around the data corruption and recreating the correct data. Amazing. They shipped it overnight to me. I spent Thursday night rebuilding the mail system and DNS for the servers.
Friday, Feb. 29 - I backed up the data provided by DriveSavers and confirmed that it was completely operational. Then I started the slow process of uploading all the photographs to the server and finally built the application and revived the database around 1:00am on Saturday, Mar. 1.
It has not been a fun week. |
|
|
|
Dilbert
Joined: 19 Oct 2007 Posts: 1304 Location: central PA
|
Posted: Sat Mar 01, 2008 6:12 pm Post subject: |
|
|
so visiting NYC is not recommended, eh?
mean week - but a good job done well! |
|
|
|
Auspicious
Joined: 29 Dec 2005 Posts: 66 Location: on the boat, Annapolis, MD
|
Posted: Tue Mar 04, 2008 10:46 pm Post subject: |
|
|
... and the back-up scripts are now at the top of the task pile? <grin> I feel your pain. Thanks for all the work you do.
regards, dave |
|
|
|
Michael Chu
Joined: 10 May 2005 Posts: 1654 Location: Austin, TX (USA)
|
Posted: Wed Mar 05, 2008 12:44 am Post subject: |
|
|
Auspicious wrote: | ... and the back-up scripts are now at the top of the task pile? <grin> |
I did them on Sunday night. |
|
|
|
gillespie
Joined: 05 Apr 2017 Posts: 3
|
Posted: Tue Apr 18, 2017 8:52 am Post subject: |
|
|
ok thanks |
|
|
|
|
|
|