AWS outage provides vital lessons
The internet and the cloud are not immune to failure, as a string of high-profile outages continues to prove. One of the most recent, at Amazon Web Services (AWS) and involving its S3 storage platform, was caused by that old favourite, human error.
The outage, lasting around three hours, affected some high-profile online services, including Netflix, Slack and Quora. Events like this inevitably make the news, but while 100 per cent uptime is impossible to achieve, there are some things businesses can do to reduce risk.
The big attraction of the cloud is that it offers lower costs and easier scalability than keeping systems in-house. However, it also has the effect of concentrating risk, especially if all of your services are with one provider.
If you’re not careful, you end up with a single point of failure, and a severe outage can have a big impact not just in financial terms but in damage to the reputation of your business too.
Cloud providers need to address this in a number of ways. They need built-in redundancy so that a secondary system can take over in the event of failure. They also need to be able to scale smoothly based on load so that performance doesn’t suffer. They should use routing techniques that distribute load evenly, and have back-up and recovery measures in place so that the system can be rolled back in the event of something like a ransomware attack.
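To make the idea of redundancy concrete, here is a minimal sketch of failover written in Python. It is an illustration only, not how AWS or any other provider implements it: the names fetch_with_failover, primary_store and secondary_store are hypothetical, and the two stores are stand-ins that simulate a failed primary and a healthy replica.

```python
from typing import Callable, Sequence


class AllBackendsFailed(Exception):
    """Raised when every configured backend has failed."""


def fetch_with_failover(key: str, backends: Sequence[Callable[[str], bytes]]) -> bytes:
    """Try each backend in order and return the first successful result."""
    errors = []
    for backend in backends:
        try:
            return backend(key)
        except Exception as exc:  # real code would catch narrower error types
            errors.append(exc)
    # Every backend failed (or none were configured).
    raise AllBackendsFailed(f"{len(errors)} backend(s) failed for {key!r}")


# Hypothetical stand-in for a primary store that is currently down.
def primary_store(key: str) -> bytes:
    raise ConnectionError("primary region unavailable")


# Hypothetical stand-in for a healthy replica in another region or provider.
def secondary_store(key: str) -> bytes:
    return b"contents of " + key.encode()
```

Calling fetch_with_failover("report.csv", [primary_store, secondary_store]) returns the replica's copy even though the primary raises an error. The order of the list sets the priority, so the secondary is only used when the primary fails.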
Of course, if your business relies on the cloud, it doesn’t mean you can wash your hands of all responsibility. Endpoint security management still needs to be controlled in-house, as often it’s these systems that are most vulnerable to attack. Companies like https://www.promisec.com/ can help to address these needs.
You also need to understand what you’re putting in the cloud and how critical it is. Which systems need high availability, for example? In hybrid environments you need to be sure that you’re not introducing external risks that could cause your system to fail because of problems elsewhere.
It can be difficult to plan for cloud failures, but if part of your business relies on a cloud service, an outage can have a severe impact on your operations, so you need a contingency plan.