It wasn't me. You can't prove anything.


2010-07-21

To upgrade or not to upgrade

Last night, I stayed late at work to upgrade a server. We only went from Red Hat 5.2 to Red Hat 5.5. The switch was needed to fix a kernel issue with the file locking mechanism. Sounds easy, right? Well, it was.

The issue was hitting us randomly, and the only way to clear it was to reboot the system. This was not acceptable because it happened often enough to get on people's nerves. It kicked people off their Linux boxes. A high percentage of our development is on Linux. Downtime sucks.

We researched the issue and decided the best fix was to upgrade the operating system. I hate updating just the kernel if I can help it because that might throw the system and the kernel out of sync. The user space and kernel space may end up with different expectations of each other. This just sounds like a pain, so I try to avoid it.

The week before the upgrade, I ran through the same upgrade, between the same versions, on a different box. I'm glad I did because the upgrade went well and told me what I should be looking for. When the time came to pull the trigger, I was able to say "We are here and the next step will be a, b, c." That instills confidence in the thing we are working on.

We had a quick meeting before the event to discuss wholesale disaster and recovery. We met in the server room after hours and kicked everyone off the network. We ran through the upgrade and rebooted. Oh, and we had some pizza. The system came up and was working well. Only time will tell if this upgrade fixed the original issue.

On the way out the door, I thought about how much effort we had just put into the upgrade. I have to say we made it look easy. So much could have gone wrong and cost us not only our time, but many other people's time as well. We were as prepared as we could be. Things went well this time.
