Why Amazon’s S3 outage affected us and what we did about it

Yesterday (2/28/2017) was a busy day for us. From about 1pm Eastern time to 1:45pm, many of our customers had slow access to our software, and throughout the afternoon, a tiny handful of folks couldn’t access certain attached files. I’m really sorry if you were affected by this. We know that lots of folks rely on us to run their businesses, and we take that responsibility really seriously.

I thought you might be interested in what happened.

The cause was a major outage at Amazon Web Services. If you’re surprised that an online store has anything to do with software for countertop shops… well, it turns out that Amazon is one of the premier providers of cloud-based storage and computing. Their outage yesterday took down major parts of the internet, but luckily the effect on us was pretty minimal.

The vast majority of Moraware JobTracker and CounterGo doesn’t rely on Amazon’s S3 service, which is the one that went down. But if you’ve had customer support calls with us in the last couple of years, you’ve seen how we can share your screen almost instantly. The tool we use to do that relies on S3, and we hadn’t considered that if it broke, it could affect our own software too.

So, a little after 1pm, we got a call or two saying “Things are kinda slow…” which is super unusual. We started looking, and our development team immediately figured out the problem. It took about another 20 minutes to package up a fix and deploy it to our customers.

During the time it took to push out the fix, we got a flood of calls from customers reporting the same thing. For about 45 minutes our whole support team was busy answering the phone with the same message. Once the fix went out, things got quiet again.

Too quiet…

Although our software was working well again, it turns out that our primary email channel was not. And to make things worse, all of our voicemails and messages from our receptionist are routed through that same tool. Amazon wasn’t giving an ETA for their own fix, so we decided to sit tight while that all got sorted out.

Their fix took longer than we expected, but shortly after 5pm Eastern time, we got a flurry of delayed emails and voicemails. If you called or emailed us during the afternoon, you should have gotten a response last night at the latest. I’m really sorry about the slow responses, but everything seems back to normal now.

Thanks again for your patience, and we really appreciate you using our software.
