IT shops expected their cloud usage to flag due to recent chip bugs, but most environments survived the patches unscathed.
In the aftermath of one of the largest compute vulnerability disclosures in years, it turns out cloud computing usage won't suffer greatly after all.
Public clouds were potentially among the most imperiled architectures from the Spectre and Meltdown chip vulnerabilities. But at least judging by the initial patches, the impact to these platforms' security and performance appears to be less dire than predicted.
Many industry observers expressed concern that these chip-level vulnerabilities would make the multi-tenant cloud model a conspicuous target for hackers to gain access to data in other users' accounts on the same shared host. But major cloud vendors' quick responses -- in some cases, months ago -- have largely addressed those issues.
Customers must still update systems that live on top of the cloud. But with the underlying patches, cloud environments are well-positioned to address the initial concerns about data theft. And cloud customers have far less to do than a company that owns its own data center and needs to update its hardware, microcode, hypervisor and perhaps management instances.
"The sky is not falling; just relax," said Chris Gardner, an analyst with Forrester Research. "They're probably the most critical CPU bugs we've seen in quite some time, but the mitigations help, and the chip manufacturers are already working on long-term solutions."
In some ways, vendors' rapid installation of fixes for the Meltdown and Spectre vulnerabilities also illustrates their centralization and automation chops.
CloudHealth could not have worked with hardware vendors and open source projects such as Linux at the pace the cloud providers did to get patches in place, said Joe Kinsella, CTO and founder of CloudHealth, a cloud managed service provider in Boston. "The end result is a testament to the centralization of ability to actually go and respond," he said.
Security experts said there are no known exploits in the wild for the Meltdown and the two-pronged Spectre vulnerabilities. The execution of a hack through these vulnerabilities, especially Spectre, is beyond the scope of the average hacker, who is far more likely to find a path of less resistance, according to experts.
In fact, the real impact from the Meltdown and Spectre vulnerabilities, so far, has been the patching process itself. Microsoft, in particular, riled some of its Azure customers with forced, unscheduled reboots after reports about Meltdown and Spectre surfaced before the embargo on the disclosure was to be lifted. Google, for its part, said it avoided reboots by live-migrating all its customers.
And while Amazon Web Services (AWS), Microsoft, Google and others could quietly get ahead of the problem to varying degrees, smaller cloud companies were often left scrambling.
AMD and Intel have worked on firmware updates to further mitigate the problem, but early versions of these have caused issues of their own. Updated patches are supposedly imminent, but it's unclear if they will require another round of cloud provider reboots.
The initial patches to Meltdown and Spectre are stopgap measures -- it may take years to redesign chips in a way that doesn't rely on speculative execution, an optimization technique at the root of these vulnerabilities. It's also possible that any fundamental redesign of these chips could ultimately benefit cloud vendors, which swap out hardware more frequently than traditional enterprises and, thus, could jump on the new processors faster.
These flaws could cause potential customers to rein in their cloud computing usage or do additional due diligence before they transition out of their own data centers. This is particularly true in the financial sector and other heavily regulated industries that have just begun to warm to the public cloud.
"If you [are] starting a new project, there's this question mark that wasn't there before," said Marty Puranik, CEO of Atlantic.Net, a cloud hosting provider in Orlando, Fla. "I can't imagine a chief risk officer or chief security officer saying this is inconsequential to what we're going to do in the future."
Performance hits not as bad as first predicted
The other potential fallout from Spectre and Meltdown is the patches' effect on performance. Initial predictions warned of slowdowns of up to 30%, and frustrated customers took to the internet to highlight major performance hits. Cloud vendors have pushed back on those estimates, however, and multiple managed service providers that oversee thousands of servers on behalf of their clients said the vast majority of workloads were unaffected.
While it remains to be seen if performance issues will start to emerge over time, IT pros seem to corroborate the providers' claims. More than a dozen sources -- many of whom requested anonymity because of the situation's sensitive and fluid nature -- told SearchCloudComputing they saw almost no impact from the patches.
The reality is the number of affected systems is fairly small, and the performance impact is highly variable, Kinsella said. "If it was really 30%, I think we'd be having a different conversation, because that's like rolling back a couple years of Moore's Law," he said.
Zendesk, based in San Francisco, suspected something was up with its cloud environment following an uptick in reboot notices from AWS toward the end of 2017, said Steve Loyd, vice president of technology operations at Zendesk. Those reboots weren't exactly welcome, but were better than the alternative, and the company hasn't seen a big impact from testing patches so far, he said.
Google said it has seen no reports of notable impacts for its cloud customers, while Microsoft and AWS initially said they expected a minority of customers to see performance degradation. It's unclear how Microsoft has mitigated these issues for those customers, though it has recommended customers switch to a faster networking service that just became generally available. AWS said in a statement that, since installing its patches, it has worked with affected customers to optimize workloads and, "in almost every case, prevent significant changes to their cost."
The biggest potential exception to these negligible impacts on cloud computing usage would be anything that uses the OS kernel extensively, such as distributed databases or caching systems. Of course, the same type of workload on premises would presumably face the same problem, but even a small impact adds up at scale.
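As a rough illustration of how a small per-system slowdown adds up at scale, the sketch below multiplies a fleet size by a slowdown fraction and rounds up to whole systems. The function name and the sample figures are hypothetical, and the calculation assumes every system loses the same share of capacity, which real fleets are unlikely to do.

```python
import math


def extra_systems_needed(system_count: int, slowdown_fraction: float) -> int:
    """Estimate the whole systems needed to replace capacity lost to a slowdown.

    Assumption (not from any vendor): each system loses the same fraction of
    its capacity, so lost capacity is simply system_count * slowdown_fraction.
    """
    lost_capacity = system_count * slowdown_fraction
    # Round first so floating-point noise does not push the result up by one.
    return math.ceil(round(lost_capacity, 9))


# Hypothetical fleet: 100 systems, each 1% slower after patching.
print(extra_systems_needed(100, 0.01))
```

A simplification like this ignores workloads that cannot be spread across systems, but it shows why an impact that is hard to measure on one machine can still translate into extra capacity across a fleet.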
"If any single system doesn't appear to have more than 1% impact, it's almost immeasurable," said Eric Wright, chief evangelist at Turbonomic, a Boston-based hybrid cloud management provider. "But if you have that across 100 systems, you have to add one new virtual system to your load. So, no matter how you slice it, there's some kind of impact."
Cloud providers also could take more of a hit with customers simply because of their pricing schemes. A company that owns its own data center could just throw some underused servers at the problem. But cloud vendors charge based on CPU cycles, and slower workloads there could have a more pronounced impact, said Pete Lindstrom, an analyst at IDC.
"It's impressionistic stuff, but that's how security works," he said. "Really, the question will be what does the monthly bill look like, and is the impact actually there?"
The biggest beneficiaries could be abstracted services, such as serverless or platform-as-a-service products. In those scenarios, the provider is responsible for all the patches, and analysts believe these services will appear unaltered to the customer.
ACI Information Group, a news and social media aggregator, patched its AWS Elastic Compute Cloud instances, base Amazon Machine Images and Docker images. So far, the company hasn't noticed any huge issues, but employees did take note that its serverless workloads required no work on their part to address the problem and the performance was unaffected, said Chris Moyer, vice president of technology at ACI and a TechTarget contributor.
"We have about 40% of our workload on serverless now, so that's a big win for us, too -- and another reason to complete our migration entirely," he said.