Part 5: How to Eliminate Deadwood in AWS, Azure, and Google Cloud Platform Without Risking Disaster

Sep 14, 2017 | Cloud Optimization

This post is the last in a five-part series on how to reduce costs by choosing the right cloud instances from public cloud vendors such as AWS, Microsoft Azure, IBM SoftLayer, and Google Cloud Platform.

In this final segment of our series about right-sizing public cloud instances for optimal performance and cost efficiency, we focus on a common source of cloud overspending: deadwood.

Idle or “zombie” instances plague many public cloud environments, quietly draining your opex budget. This deadwood can accumulate without you even realizing it, and is often the result of hasty deployments or a lack of accountability in the cloud world. Someone in the organization lights up an instance for a short-term use and then forgets to shut it down. Workloads change over time and no one goes back to eliminate the now-idle instances. The fact is, most organizations don’t have an effective process for managing cloud instances and spotting the idle ones. Given the complexity of invoicing from these providers and the lack of visibility into workload patterns, it is often hard to know what is truly idle. Over time, the deadwood piles up.

Eliminating idle instances may seem like a no-brainer, but there is a potential risk. What if the instance is idle 90% of the time but then lights up for a short amount of time to handle a weekly, monthly, or quarterly workload, such as batch processing? Eliminating that instance could spell disaster — especially if it’s part of a mission-critical application.

To avoid that risk, you need to look at the workload pattern across a full business cycle, using sufficient history to erase any doubt. Consider the example below. Looking at a 24-hour period, you could conclude that you’ve identified a deadwood instance. Pull the plug, right?

Not so fast. Analyzing several weeks of workload activity paints a very different picture (see figure below). This workload is idle most of the time, but its activity peaks once a week, and this could be an important business process. So you may want to do some right-sizing, or consider scheduling the instance on and off, but it is clearly not deadwood.
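The difference between the two views can be sketched in a few lines of Python. This is a minimal illustration, not the analytics described in this series: the function name `classify_instance`, the 5% CPU idle threshold, the four-week minimum history, and the 10% "busy" cutoff are all assumptions chosen for the example, and the sample data is synthetic.

```python
# Hypothetical sketch: classify an instance from hourly CPU utilization samples.
# All thresholds below are illustrative assumptions, not vendor recommendations.

IDLE_CPU_PCT = 5.0          # assumed: at or below this, an hour counts as idle
MIN_HISTORY_HOURS = 24 * 28  # assumed: four weeks stands in for a business cycle
PERIODIC_BUSY_FRACTION = 0.10  # assumed: busy under 10% of the time = periodic


def classify_instance(hourly_cpu_pct, idle_threshold=IDLE_CPU_PCT,
                      min_hours=MIN_HISTORY_HOURS):
    """Return a coarse label for an instance based on its CPU history."""
    if len(hourly_cpu_pct) < min_hours:
        # Not enough data to rule out a weekly or monthly job.
        return "insufficient-history"
    busy_hours = sum(1 for pct in hourly_cpu_pct if pct > idle_threshold)
    if busy_hours == 0:
        return "deadwood-candidate"
    if busy_hours / len(hourly_cpu_pct) < PERIODIC_BUSY_FRACTION:
        # Mostly idle with occasional peaks: schedule or right-size, don't delete.
        return "periodic"
    return "active"


# Synthetic workload: idle at 2% CPU, with a 4-hour peak at 80% once a week.
one_week = [2.0] * 164 + [80.0] * 4
four_weeks = one_week * 4

# A single quiet day looks like deadwood if we accept only 24 hours of history.
print(classify_instance(four_weeks[:24], min_hours=24))
# The full history reveals the weekly peak.
print(classify_instance(four_weeks))
```

With the default four-week minimum, the 24-hour sample would be labeled `insufficient-history` rather than deadwood, which is the safer default when a business cycle may be longer than the data on hand.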

Identifying the true deadwood requires sophisticated analytics that examine workload patterns across a full business cycle and look at all utilization factors, including CPU, I/O, and memory. With this insight, you can make confident decisions about which instances can be terminated safely to save money, and which ones are better candidates for downsizing and/or modernizing for optimal efficiency. In the example below, we’ve identified five instances of deadwood that can be terminated, saving more than $850 a month, or more than $10,000 a year.
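As a rough sketch of the "all utilization factors" idea, the Python below flags an instance only when CPU, I/O, and memory all stay under a threshold for the entire history, then totals the monthly and annual savings. The `InstanceHistory` type, the threshold values, and the sample instances and costs are hypothetical and exist only to show the arithmetic; they are not the figures from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class InstanceHistory:
    """Hypothetical container for one instance's utilization history."""
    name: str
    monthly_cost_usd: float
    cpu_pct: list = field(default_factory=list)
    io_ops_per_sec: list = field(default_factory=list)
    mem_pct: list = field(default_factory=list)


# Assumed idle ceilings per metric; real values depend on the workload.
THRESHOLDS = {"cpu_pct": 5.0, "io_ops_per_sec": 10.0, "mem_pct": 10.0}


def is_deadwood(inst, thresholds=THRESHOLDS):
    """True only if every metric has data and never exceeds its ceiling."""
    for metric, limit in thresholds.items():
        series = getattr(inst, metric)
        if not series or max(series) > limit:
            return False  # missing data or any peak rules out termination
    return True


def projected_savings(instances):
    """Return (monthly, annual) savings from terminating deadwood instances."""
    monthly = sum(i.monthly_cost_usd for i in instances if is_deadwood(i))
    return monthly, monthly * 12


# Illustrative data only: one idle instance and one with a weekly CPU peak.
fleet = [
    InstanceHistory("idle-test-box", 170.0, [1.0] * 672, [2.0] * 672, [6.0] * 672),
    InstanceHistory("weekly-batch", 95.0, [2.0] * 668 + [80.0] * 4,
                    [3.0] * 672, [8.0] * 672),
]
monthly, annual = projected_savings(fleet)
print(f"Monthly savings: ${monthly:,.2f}; annual: ${annual:,.2f}")
```

The same multiplication applies to the figures quoted above: $850 a month over 12 months is $10,200 a year.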

With the right analytics and true visibility into workload patterns, you can finally make confident decisions about which workloads should stay and which should go. This insight helps you establish a process for regularly reviewing your cloud instances to ensure you are making the best use of every opex dollar.