The month from hell

Just call this the week from hell. Actually, it was the month from hell. Server crashes plus configuration problems on a new server this week brought just about everything to a halt.

The trash bin in the back of our data center is littered with the remnants of servers that didn’t work as promised and hard drives corrupted by viruses and worms because an antivirus program didn’t work as it should have, either.

Part of the problem started with a planned move from our data center in Loudoun County to a new one at Virginia Tech’s Corporate Research Park in Blacksburg. The idea was to have the server farm closer at hand. This meant buying new equipment and, in true Murphy’s Law fashion, some of that equipment didn’t perform the way I expect new servers to.

Since the first of April, I’ve had three of Sun’s new CoolThreads Sun Fire servers bite the dust, two Dells running Linux suffer kernel failures and one Windows 2003 server reset itself and destroy everything from the last 10 days.

Capitol Hill Blue, which runs on multiple servers, went offline three times in the last three weeks, crippled by a bug in a new content management system. The same bug corrupted the backups without our knowledge.

We thought we were on the homestretch Friday moving the last of our servers, the one containing blogs for Fred First, Colleen Redman and others. But Fred’s blog crashed before the move and wouldn’t reboot on the new server. We finally traced the bug late Friday and finished the move at 12:50 a.m. today.

So far (fingers crossed) everything is running fine. I’ve suffered more hardware and software problems in the last three weeks than in the last 11-and-a-half years of running and hosting web sites.

Lessons learned:

  • Sun servers ain’t what they used to be. I guess I shouldn’t be surprised. David St. Lawrence, an escapee from the corporate drudgery of Sun, said it ain’t the company it used to be either.
  • Abacus, a server co-location company located in San Diego and Germany, is a ripoff. I had hoped to locate a mirror site there but their tech people failed to respond in a timely manner when we needed assistance and it took them three days to fix a minor problem. That’s a shame. Abacus used to be a good company. Now they are just sham artists in it for the quick buck.
  • Backups don’t work when the file corruption that brings down a server is also on the backup files.
  • Linux is good for running Tivo and small web sites but it doesn’t have the power for large, full-scale operations that demand multi-threading, heavy processing needs and high traffic.
  • I need some sleep.

5 Responses to The month from hell

  1. Dusty

    April 30, 2006 at 12:42 am

    ps..hope it’s all in the past and all that could go wrong,has.. :)

  2. fred1st

    April 30, 2006 at 2:23 pm

    I agree with Dusty. Surely, you’ve paid your dues for this decade! It sure is nice when things are going smoothly, especially after rough seas. I don’t realize how much I miss the sounding board of the blog until it’s not there for me. Thanks for your blood, sweat and tears toward the good of your server residents.

  3. Doug Thompson

    May 1, 2006 at 8:53 pm

    Sean:

    Don’t know how you measure volume but 400,000 page requests a day ain’t that much traffic. Capitol Hill Blue gets more requests than that an hour and averages 12.5 million page requests a day. Those are page requests, not hits. I know how to read a web log and so do my advertisers. Even that is peanuts when compared with really high-volume news sites like Washingtonpost.com or MSNBC, which get 12 million page requests an hour.

    In my opinion, real volume on the Internet is measured in millions, not thousands or even hundreds of thousands. I’ve read the reports that Google runs on a large-scale cluster of Linux servers but we’re talking about a specialized application with many, many servers. That’s hardly an out-of-the-box use of Red Hat.

    Doug

  4. Dusty

    April 30, 2006 at 12:41 am

    Does this mean your posts on the Internet that got lost are forever lost? I had linked to them on my blog and really enjoyed them, doing my own thinking on the subject.

  5. Sean Pecor

    May 1, 2006 at 12:40 pm

    I’ll have to take issue with your claim that Linux isn’t capable of running high volume web sites. With over 400,000 page requests daily to my sites, I suspect the sum of traffic on my web sites is at times much, much higher than yours. Yet all of my sites, and all of my databases, and all of my supporting services (ftp, pop, smtp) are running on a single Linux server. And my Linux server is never fully utilized, serving up page requests almost instantly despite every page containing an average of twenty database queries. And my hardware isn’t terribly impressive. It’s a dual Xeon 2.4 GHz processor 2U server, with two 18 GB SCSI drives and 2 GB of RAM.

    I suspect that your experience may be tainted by poorly written content management software, or poorly written Perl scripts that are resource hogs. Even a $100,000 server will be brought to its knees if the software is clunky :)

    Outside of the realm of my own anecdotal experience, many of the most popular web sites are powered by Linux servers. Arguably the most popular web site in the world is Google, which runs massive clusters of Linux servers.

    Sean
