By now you’ve heard about the failure of Amazon’s EC2 (Elastic Compute Cloud) service, or perhaps felt it. If you use Foursquare, Reddit or Quora (among other services and websites), you no doubt felt the impact.
It hit on April 21 at 1:48 a.m. PDT. Quora even had a fun ‘down’ message: “We’d point fingers, but we wouldn’t be where we are today without EC2.”
Lew Moorman, chief strategy officer of Rackspace, said it best: “It was the computing equivalent of an airplane crash. It is a major episode with widespread damage.” But airline travel, he noted, “is still safer than traveling in a car,” which is analogous to cloud computing being safer than data centers run by individual companies.
The fact remains that the cloud model is rapidly gaining popularity as a way for companies to outsource computing chores and avoid the costs and headaches of running their own data centers. They simply tap in, over the Web, to computer processing and storage without owning the machines or operating the software.
Many consumers don’t realize that a host of sites depend on cloud services for the majority of their uptime, including Hotmail and Netflix, to name just two. Netflix was not affected by the recent outage because it has taken full advantage of Amazon Web Services’ redundant cloud architecture (which is NOT inexpensive).
Industry analysts said the troubles would prompt many companies to reconsider relying on remote computers beyond their control. And while those discussions may take place over the next several weeks, in the long term cloud computing will continue to thrive and evolve into what most industry experts already know it to be: a necessary and valued component of doing any kind of business or having any sort of web presence on the Internet. The truth is, every day many more companies around the globe experience ‘outages’ that take their services, and sometimes their websites, down for hours. Added together, these account for far more lost time, money and engineering resources than Amazon’s interruption last week.
This round, the companies hit hardest by the Amazon interruption were start-ups that are focused on moving fast in pursuit of growth and that are less likely to pay for extensive backup and recovery services or for secondary redundancy in another data center (or for Amazon’s redundant cloud architecture).
One thing most people are not aware of is that Amazon’s SLA (service level agreement) is one of the weakest among competing public cloud compute services, even though its actual uptime is very good. Most providers offer 99.99% or better, with many offering 100%, evaluated monthly, with service credits capped at 100% of that month’s bill. Amazon offers 99.95%, evaluated yearly, with credits capped at 10% of the bill, and the SLA applies only when at least two availability zones within a region are unavailable. Companies MUST therefore take this into consideration when choosing a vendor, weighing it against what they do on the Internet. Taking a secondary, back-up approach can close some of those holes, but it can get mighty expensive. Amazon’s EC2 pricing reflects this type of SLA, and ‘human’ support is not included; adding it can mean a 10% to 20% uplift in price. The service is geared primarily toward the very technically knowledgeable. Amazon is a cloud IaaS-focused (infrastructure-as-a-service) vendor with a very pure vision of highly automated, inexpensive, commodity infrastructure, bought without any commitment to a contract. Amazon is a thought leader; it is extraordinarily innovative, exceptionally agile and very responsive to the market.
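To put the SLA percentages quoted above in perspective, here is a minimal Python sketch that converts an availability figure into the downtime it permits and works out the maximum service credit. The function names, the 30-day month, the 365-day year and the $1,000 bill are illustrative assumptions, not terms taken from any provider’s actual agreement.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(availability_pct, period_days):
    """Minutes of downtime a provider can incur in the period
    without breaching the stated availability percentage."""
    return (100 - availability_pct) / 100 * period_days * MINUTES_PER_DAY


def max_service_credit(bill, cap_pct):
    """Largest credit a customer can receive, given the cap
    expressed as a percentage of the bill."""
    return bill * cap_pct / 100


if __name__ == "__main__":
    # Assumption: a "month" is 30 days and a "year" is 365 days.
    typical = allowed_downtime_minutes(99.99, period_days=30)
    amazon = allowed_downtime_minutes(99.95, period_days=365)
    print(f"99.99% evaluated monthly: {typical:.1f} minutes allowed")
    print(f"99.95% evaluated yearly:  {amazon:.1f} minutes allowed")

    # Hypothetical $1,000 bill, used only to compare the two caps.
    bill = 1000.0
    print(f"Credit capped at 100%: ${max_service_credit(bill, 100):.2f}")
    print(f"Credit capped at 10%:  ${max_service_credit(bill, 10):.2f}")
```

Under these assumptions, a 99.99% monthly SLA tolerates only about four minutes of downtime in a month, while a 99.95% yearly SLA tolerates a bit over four hours in a year, all of which could land in a single incident before any credit is owed.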
That being said, the recent Verizon acquisition of Terremark should put most Tier 1 vendors, including Amazon, on their toes. Terremark offers colocation, managed hosting (including utility hosting on its Infinistructure platform), developer-centric public cloud IaaS (vCloud Express) and enterprise-class cloud IaaS (Enterprise Cloud). It is a close VMware partner (VMware is one of its investors), and is generally first to market with VMware-based solutions. It is a certified vCloud Datacenter provider. Some of Terremark’s perceived weak spots can and should now be addressed by the merger of the two service offerings, in particular through the added personnel to better deliver on customer service and satisfaction (‘stretched thin’ has been the complaint). Now that it has a substantially bigger war chest from its parent Verizon, along with Verizon’s exceptional worldwide network (remember UUNET), it can take on and adapt more bleeding-edge technologies, something it has done in the past but has not been able to do recently.
Combinations like this will likely increase in this space over time as other vendors realize that two can be better than one. The devil is always in the details, and the trick here is for company cultures to be merged efficiently, with a clear and concise plan laid out for both sets of employees. The last thing you need is for internal employees to wonder which of them will be replying to the same RFP (request for proposal) moving forward. Strong, well-thought-out planning by upper management should avoid these pitfalls for the most part; however, it can be pretty tricky to implement.
Long story short: I’d still bet heavily on the long-term success of this business. It’s a smart, cost-efficient and labor-efficient business model needed by most start-ups, mid-size and Enterprise clients. The days of sending your IT guys into a cage to update the company’s software with numerous discs and software patches, hoping that it doesn’t disrupt the company’s servers, should be long gone.