Dumb.
Over the last year we’ve significantly upgraded the power distribution within our mini-data center. However, as is often the case after an upgrade, we hadn’t yet worked out all of the kinks.
Yesterday, I upgraded the Brownrice phone systems by bringing a new Asterisk server online. Apparently I plugged this machine into a circuit that was already nearly overloaded. All of our machines perform routine maintenance between 4am and 5am each morning, when most people are sleeping. During that window each machine’s load, and with it its power consumption, increases.
This morning, because of the added load from the Asterisk server, one of the data center circuits blew. Our batteries kept things running for a bit, but eventually a few servers went down and, most importantly, one of our primary switches lost power. (Had power to the building been cut, our generators would have kicked in, but the building’s power supply was not affected.)
As we are a small hosting provider, we don’t have a technician in the building overnight. (If we did, this person would have nothing to do 99.99% of the time.) Dave came in, sorted it out, and stabilized us.
We’ll be spending the day analyzing our power distribution to prevent this from happening again (yes, it’s happened once before) and to make sure that, if it does happen again, the outage is kept to a minimum (think: switches and routers on their own circuits).
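For the curious, the kind of check this analysis boils down to can be sketched in a few lines of Python. This is only a rough illustration, not our actual tooling: the circuit names, breaker ratings, voltages, and per-device wattages are all made up, and the 80% factor is an assumed rule of thumb for sustained load. The idea is to add up each circuit's peak draw (the 4am to 5am maintenance window, not the idle draw) and compare it to what the breaker can carry.

```python
# Hypothetical figures for illustration only; not our real inventory.
# Assumed rule of thumb: keep sustained draw under 80% of the breaker rating.
CONTINUOUS_LOAD_FACTOR = 0.8


def circuit_headroom(breaker_amps, volts, peak_watts_by_device):
    """Return (usable_watts, peak_watts, headroom_watts) for one circuit."""
    usable = breaker_amps * volts * CONTINUOUS_LOAD_FACTOR
    peak = sum(peak_watts_by_device.values())
    return usable, peak, usable - peak


def overloaded_circuits(circuits):
    """Names of circuits whose summed peak draw exceeds the usable budget."""
    return [
        name
        for name, c in circuits.items()
        if circuit_headroom(c["breaker_amps"], c["volts"], c["devices"])[2] < 0
    ]


# Peak (maintenance-window) wattage per device, all invented numbers.
circuits = {
    "circuit-a": {
        "breaker_amps": 20,
        "volts": 120,
        "devices": {"web-1": 450, "web-2": 450, "db-1": 600, "asterisk": 500},
    },
    "circuit-b": {
        "breaker_amps": 20,
        "volts": 120,
        "devices": {"switch-1": 150, "router-1": 100},
    },
}

for name in overloaded_circuits(circuits):
    print(f"{name} is over budget at peak load")
```

With these invented numbers, the first circuit goes over budget only once the extra server is added, while the circuit carrying just the switch and router has plenty of room. That is the argument for putting network gear on its own circuit.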
I’m very sorry for the outage and thank you for your business.
~ Oban