Kubernetes gives teams flexibility and power, but without guardrails and visibility into costs, that freedom can lead to inefficiency and rising bills.
Whether you're running a single workload or managing a whole platform on Linode Kubernetes Engine (LKE), this blog post will show you how to optimize your costs at every layer of your Kubernetes infrastructure. We'll look at a combination of LKE best practices, open source tooling, and GitOps automation.
Understanding LKE costs
LKE has a straightforward cost structure that helps developers manage Kubernetes clusters without unexpected pricing surprises. The LKE control plane is fully managed by Akamai and is available at no additional cost for the standard version. The high-availability version currently costs US$60 per cluster per month. You only pay for the compute instances (LKE nodes), load balancers (NodeBalancers), storage volumes, and egress traffic.
Understanding the price of these components allows you to estimate cluster expenses more accurately and make strategic choices. For example, workloads that rely heavily on persistent volumes or require multiple NodeBalancers for redundant ingress traffic may experience higher storage and networking costs than lightweight apps.
Akamai Cloud’s LKE pricing page and cloud pricing guide provide detailed breakdowns that can help you calculate your cost based on resource allocation. By linking resource use to cost, you can fine-tune your Kubernetes environment to balance performance, availability, and budget considerations.
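As a back-of-the-envelope illustration, here is a minimal Python sketch that sums the main cost drivers. The per-unit prices below are placeholders, not current Akamai prices; substitute the values from the pricing page.

```python
# Back-of-the-envelope LKE cost estimate. All prices are PLACEHOLDERS;
# replace them with current values from the Akamai Cloud pricing page.
NODE_PRICE_MONTHLY = 36.0            # hypothetical price per worker node
NODEBALANCER_PRICE_MONTHLY = 10.0    # hypothetical price per NodeBalancer
STORAGE_PRICE_PER_GB_MONTHLY = 0.10  # hypothetical block storage price
HA_CONTROL_PLANE_MONTHLY = 60.0      # HA control plane (standard is free)

def estimate_monthly_cost(nodes: int, nodebalancers: int,
                          storage_gb: int, ha: bool = False) -> float:
    cost = nodes * NODE_PRICE_MONTHLY
    cost += nodebalancers * NODEBALANCER_PRICE_MONTHLY
    cost += storage_gb * STORAGE_PRICE_PER_GB_MONTHLY
    cost += HA_CONTROL_PLANE_MONTHLY if ha else 0.0
    return cost

# Example: 5 nodes, 2 NodeBalancers, 500 GiB of volumes, HA control plane.
print(f"${estimate_monthly_cost(5, 2, 500, ha=True):.2f}/month")
```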
We have also made it easier to choose the most cost-effective compute instances for your Kubernetes cluster by introducing the Kubernetes instance calculator, which lets you compare different compute instances on LKE with one another and even against other cloud providers.
Cost-saving strategies using open source tools
Now, let’s talk about some cost-saving strategies using open source tools when you are running your Kubernetes cluster on LKE.
You can only optimize what you track and measure. Kubernetes doesn't come with built-in cost management tools. Fortunately, the open source ecosystem has filled this gap with multiple options, including:
Cost-tracking tools
Resource right-sizing tools
Environmental and infrastructure tools
Monitoring and observability tools
Cost-tracking tools
Kubecost is a comprehensive cost management platform for Kubernetes. It delivers continuous insight into both resource use and cloud expenditure by mapping spend to Kubernetes constructs like namespaces, deployments, and teams. The platform streamlines financial oversight by generating use breakdowns that highlight where resources are going, along with providing actionable advice to help reduce unnecessary outlays or wasted infrastructure.
Users can establish cost-based notifications for early warnings about overspending, and Kubecost's governance tools make it easier to spot emerging reliability or performance issues before they escalate. Through robust monitoring, budgeting controls, and transparent spending analytics, Kubecost helps engineering and operations teams achieve efficient, predictable cloud costs while supporting strategic decision-making for scaling and optimization.
For more information, see Controlling LKE Costs Using Kubecost.
Kubecost is not free, but it is based on the open source project OpenCost, which provides similar core functionality.
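If you deploy OpenCost, you can also pull allocation data programmatically. The sketch below assumes the OpenCost service has been port-forwarded to localhost:9003 (its default API port) and uses the /allocation/compute endpoint documented by the project; adjust the path and parameters to match the version you install.

```python
# Query OpenCost's allocation API for per-namespace cost over the last 7 days.
# Assumes: kubectl port-forward -n opencost service/opencost 9003:9003
import requests

resp = requests.get(
    "http://localhost:9003/allocation/compute",
    params={"window": "7d", "aggregate": "namespace"},
    timeout=30,
)
resp.raise_for_status()

for window in resp.json().get("data", []):
    for namespace, alloc in window.items():
        # totalCost covers CPU, RAM, storage, and network for the namespace
        print(f"{namespace}: ${alloc.get('totalCost', 0):.2f}")
```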
Resource right-sizing tools
Goldilocks
Goldilocks helps you get your resource requests just right by running the Kubernetes VerticalPodAutoscaler and its recommendation engine.
For more information, see Goldilocks: An Open Source Tool for Recommending Resource Requests.
Robusta Kubernetes Resource Recommender
The Robusta Kubernetes Resource Recommender (KRR) provides resource recommendations based on historical use metrics. It can be integrated into CI/CD pipelines or automated for real-time adjustment of resource requirements.
Environmental and infrastructure tools
kube-green
kube-green automatically shuts down unnecessary resources during off-hours, such as nights and weekends, saving money while helping the environment.
Check out the kube-green CO2 calculator.
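As a rough sketch of how kube-green is configured: once the operator is installed, you create a SleepInfo resource in each namespace you want it to manage. The field names below follow the kube-green documentation at the time of writing; verify them against the version you install.

```python
# Create a kube-green SleepInfo that sleeps a dev namespace on weekday
# evenings and wakes it the next morning. Requires the kube-green operator.
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

sleep_info = {
    "apiVersion": "kube-green.com/v1alpha1",
    "kind": "SleepInfo",
    "metadata": {"name": "working-hours", "namespace": "dev"},
    "spec": {
        "weekdays": "1-5",        # Monday through Friday
        "sleepAt": "20:00",       # scale workloads down at 8 PM
        "wakeUpAt": "08:00",      # restore them at 8 AM
        "timeZone": "America/New_York",
    },
}

api.create_namespaced_custom_object(
    group="kube-green.com", version="v1alpha1",
    namespace="dev", plural="sleepinfos", body=sleep_info,
)
```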
Infracost
If you use Terraform to provision your infrastructure on Akamai Cloud and LKE, then you can use Infracost to generate Terraform cost estimates before deploying your infrastructure. Infracost can also validate compliance with FinOps best practices and cloud vendor well-architected frameworks.
Additionally, it can enforce your organization's required tag keys and values. This approach not only reduces unnecessary spending but also integrates cost considerations directly into the engineering workflow, rather than treating them solely as post-deployment issues.
Monitoring and observability tools
TOBS
To work alongside LKE, you can install a robust open source observability solution like The Observability Stack (TOBS), which is typically much cheaper than the managed observability offerings of other cloud providers.
For more information, see Deploying TOBS (The Observability Stack) on LKE.
Core Kubernetes cost optimization strategies
Now that we have our toolkit ready, let's explore the fundamental strategies for optimizing LKE costs.
Use the LKE control plane efficiently
Optimize node pools
Enable the cluster autoscaler
Right-size and manage resources
Use the LKE control plane efficiently
LKE offers a managed control plane: the Kubernetes API server and its supporting components are operated for you. The standard version is free, while the redundant, high-availability version currently costs US$60 per cluster per month.
From a cost perspective, you should avoid creating numerous small clusters that are infrequently used. However, if you do need many clusters that aren't mission critical, for developers or for testing purposes, use the free, standard control plane. You then pay only for worker nodes, load balancers, and storage; the control plane itself is free.
Optimize node pools
Compute instances are by far the most expensive aspect of most large Kubernetes-based systems. The price differences between different instance types are significant. Using multiple node pools with varying instance types allows you to choose the most cost-effective instance types for each workload.
For more information, see Managing nodes and node pools.
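To compare the price/performance of candidate instance types for a node pool, you can query the public Linode API. A minimal sketch (the endpoint needs no API token):

```python
# List Linode instance types with their specs and monthly prices.
import requests

resp = requests.get("https://api.linode.com/v4/linode/types", timeout=30)
resp.raise_for_status()

for t in resp.json()["data"]:
    monthly = t["price"]["monthly"]
    print(f"{t['id']}: {t['vcpus']} vCPU, "
          f"{t['memory'] // 1024} GB RAM, ${monthly}/month")
```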
Enable the cluster autoscaler
The cluster autoscaler scales your worker nodes up and down based on actual use: it adds nodes when pods can't be scheduled and removes nodes that sit underused. This way, you only pay for what you use.
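You can enable the autoscaler per node pool in the Cloud Manager or through the Linode API's node pool update endpoint. A sketch (the cluster and pool IDs are placeholders, and the token needs LKE write access):

```python
# Enable the LKE cluster autoscaler on a node pool via the Linode API.
import os
import requests

CLUSTER_ID = 12345   # placeholder
POOL_ID = 67890      # placeholder

resp = requests.put(
    f"https://api.linode.com/v4/lke/clusters/{CLUSTER_ID}/pools/{POOL_ID}",
    headers={"Authorization": f"Bearer {os.environ['LINODE_TOKEN']}"},
    json={"autoscaler": {"enabled": True, "min": 2, "max": 10}},
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["autoscaler"])
```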
Right-size and manage resources
Accurately estimate your workload
When you deploy a workload on Kubernetes, you need to declare how much CPU, memory, and (optionally) ephemeral storage it needs. If you underestimate, your workload will run out of resources and either crash when it exhausts its memory or slow down when its CPU is throttled. If you overestimate, you pay for resources you don't use.
The Kubernetes HorizontalPodAutoscaler automatically scales your workloads up and down based on CPU and memory metrics. The cluster autoscaler scales the number of nodes to ensure all pods can be scheduled. However, you still need to ensure that your pods use their resources without unnecessary waste.
Perfect utilization is impossible since workloads don't use constant amounts of CPU and memory over time. However, by using tools like Goldilocks and KRR, you can determine the optimal range of resource use and adjust your pods' resource requests accordingly. You can use the Kubernetes VerticalPodAutoscaler or build your own solution to dynamically adjust resource requests.
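For example, you can run the VerticalPodAutoscaler in recommendation-only mode ("Off"), so it computes suggested requests without evicting pods. A sketch, assuming the VPA components are installed in your cluster (they aren't by default) and a Deployment named my-app:

```python
# Create a VPA in "Off" mode: it records recommendations without
# applying them. Requires the VerticalPodAutoscaler to be installed.
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

vpa = {
    "apiVersion": "autoscaling.k8s.io/v1",
    "kind": "VerticalPodAutoscaler",
    "metadata": {"name": "my-app-vpa", "namespace": "default"},
    "spec": {
        "targetRef": {"apiVersion": "apps/v1",
                      "kind": "Deployment", "name": "my-app"},
        "updatePolicy": {"updateMode": "Off"},
    },
}

api.create_namespaced_custom_object(
    group="autoscaling.k8s.io", version="v1",
    namespace="default", plural="verticalpodautoscalers", body=vpa,
)
# Recommendations later appear under the object's status.recommendation.
```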
Use limits that are greater than resource requests
Another technique is to use limits that are greater than resource requests. This allows you to pack more pods onto each node, reducing your total node count and costs.
Here's how this works: The requested resources of a pod are allocated and are always available to that pod and that pod only. However, the pods scheduled to a node may use more resources than they requested (up to their limits). This means that if there is extra capacity on a node, then its pods may take advantage of it.
By setting conservative requests but higher limits, you can schedule more pods per node (based on requests) while still allowing them to burst when needed. The total limits of all the pods on a node may exceed the node's resources, and some of the pods might, at any given time, use more resources than they requested.
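To make the packing math concrete, here is a minimal sketch using the Kubernetes Python client; the numbers are illustrative, not recommendations:

```python
# Conservative requests with a higher burst ceiling. Pods are scheduled
# by their requests but may consume up to their limits when the node
# has spare capacity. All numbers here are illustrative.
from kubernetes import client

container = client.V1Container(
    name="web",
    image="nginx:1.27",
    resources=client.V1ResourceRequirements(
        requests={"cpu": "250m", "memory": "256Mi"},  # used for scheduling
        limits={"cpu": "1", "memory": "512Mi"},       # burst ceiling
    ),
)

# A node with 4 allocatable CPUs fits ~16 such pods by requests
# (16 * 250m = 4000m), while their summed CPU limits (16 cores)
# deliberately overcommit the node.
```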
Advanced cost management
With the core strategies in place, you can tackle more specialized areas that can drive significant costs in larger deployments.
Optimize bin packing
Manage egress costs
Optimize storage costs
Use block storage smartly
Clean up unused volumes
Reconsider current storage lifecycle policies
Optimize bin packing
Bin packing is the practice of making sure your nodes are not sitting around mostly empty. It goes hand-in-hand with node pool management as different workloads may have different performance, reliability, and availability requirements.
For more information, see The Art and Science of Kubernetes Bin Packing.
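As a starting point, this sketch compares each node's allocatable CPU with the CPU actually requested by the pods scheduled on it, which makes poorly packed nodes easy to spot:

```python
# Report CPU requests vs. allocatable CPU per node to spot poor bin packing.
from collections import defaultdict
from kubernetes import client, config

def cpu_millicores(q: str) -> int:
    """Parse a Kubernetes CPU quantity ("2", "500m") into millicores."""
    return int(q[:-1]) if q.endswith("m") else int(float(q) * 1000)

config.load_kube_config()
v1 = client.CoreV1Api()

requested = defaultdict(int)
for pod in v1.list_pod_for_all_namespaces().items:
    if pod.spec.node_name and pod.status.phase == "Running":
        for c in pod.spec.containers:
            reqs = (c.resources.requests or {}) if c.resources else {}
            requested[pod.spec.node_name] += cpu_millicores(reqs.get("cpu", "0"))

for node in v1.list_node().items:
    alloc = cpu_millicores(node.status.allocatable["cpu"])
    used = requested[node.metadata.name]
    print(f"{node.metadata.name}: {used}/{alloc}m requested "
          f"({100 * used / alloc:.0f}% packed)")
```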
Manage egress costs
Egress means data leaving your LKE cluster and moving toward the internet. It can quickly become a significant cost driver if not proactively managed.
Akamai Cloud bundles a generous amount of outbound (egress) bandwidth with each compute instance's pricing, and charges significantly reduced rates for egress beyond that allowance. But you should still closely monitor large data exports, backups, or applications that deliver significant traffic to external users.
You can reduce egress costs by keeping interservice and storage traffic within the same region or by using Akamai Object Storage, which is often much cheaper than cross-cloud or public internet transfers. Consider doing more work inside your cluster and providing concise reports and results as opposed to exporting mountains of raw data.
Optimize storage costs
Different storage solutions have drastically different performance and cost profiles. Modern architectures leverage this hierarchy strategically:
Hot data in fast, expensive tiers
Warm data in balanced solutions
Cold data in high-capacity, cost-optimized storage
The art of system design lies in understanding these trade-offs and mapping data access patterns to the appropriate storage tier.
Use block storage smartly
Block storage volumes are persistent volumes that can be attached to your node. There are several best practices you can employ:
Choose WaitForFirstConsumer volume binding mode (which delays volume creation until a pod is scheduled). This ensures the creation of volumes in the same availability zone as the pods that will use them, reducing cross-zone data transfer costs and improving performance.
For production workloads, set the reclaim policy to "Retain," which preserves volumes when claims are deleted. This prevents accidental data loss when a PersistentVolumeClaim is deleted, giving you explicit control over when expensive storage resources are actually released.
Consider using volume snapshots for backup strategies rather than maintaining multiple live replicas, as snapshots are typically more cost effective for disaster recovery scenarios.
For more information, see Akamai's guide to Akamai Cloud Block Storage.
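A sketch combining the WaitForFirstConsumer and Retain recommendations above in a custom StorageClass. The provisioner name assumes the Linode block storage CSI driver; verify it matches the driver running in your cluster.

```python
# Custom StorageClass: delay volume creation until a pod is scheduled,
# and keep volumes when their claims are deleted.
from kubernetes import client, config

config.load_kube_config()

sc = client.V1StorageClass(
    metadata=client.V1ObjectMeta(name="block-storage-retain"),
    provisioner="linodebs.csi.linode.com",  # Linode block storage CSI driver
    volume_binding_mode="WaitForFirstConsumer",
    reclaim_policy="Retain",
)
client.StorageV1Api().create_storage_class(body=sc)
```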
Clean up unused volumes
Create a periodic job to detect and remove unbound persistent volumes that are no longer attached to any pods or claims, as these unused volumes will continue to incur storage costs indefinitely.
Implement monitoring and alerting for volumes that have been unused for extended periods, allowing you to identify candidates for deletion or archival.
Pay special attention to development and testing environments that tend to accumulate forgotten storage resources, incurring a lot of unnecessary cost over time.
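A minimal sketch of such a job: it lists PersistentVolumes whose claims are gone and prints them for review. Flip the dry_run flag only after you've confirmed the output.

```python
# Find (and optionally delete) PersistentVolumes no longer bound to a claim.
# "Released" = claim deleted but volume retained; "Available" = not bound.
from kubernetes import client, config

def cleanup_unbound_volumes(dry_run: bool = True) -> None:
    config.load_kube_config()
    v1 = client.CoreV1Api()
    for pv in v1.list_persistent_volume().items:
        if pv.status.phase in ("Released", "Available"):
            size = pv.spec.capacity.get("storage", "?")
            print(f"unbound: {pv.metadata.name} ({size}, {pv.status.phase})")
            if not dry_run:
                v1.delete_persistent_volume(name=pv.metadata.name)

cleanup_unbound_volumes(dry_run=True)
```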
Reconsider current storage lifecycle policies
Consider compressing and archiving important historical data (such as logs) in object storage. For data that might not be relevant after a while, you can use lifecycle policies to ensure automatic deletion. That may be important for compliance and regulatory purposes, too.
Establish clear data retention policies that automatically transition data through storage tiers based on age and access patterns, moving from high-performance storage like Redis or managed databases to cheaper block storage, and eventually to object storage.
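Akamai Object Storage is S3-compatible, so you can set lifecycle policies with any S3 client. A sketch using boto3; the endpoint region, bucket name, and credentials are placeholders.

```python
# Expire objects under logs/ after 90 days on Akamai (Linode) Object Storage.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://us-east-1.linodeobjects.com",  # your region
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)

s3.put_bucket_lifecycle_configuration(
    Bucket="my-archive-bucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "expire-old-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Expiration": {"Days": 90},  # delete automatically after 90 days
        }]
    },
)
```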
Monitoring and alerting
Using the TOBS monitoring tools, teams can visualize underused nodes and quickly identify resource bottlenecks. They'll gain clear visibility into how their applications are performing.
Supplementing the dashboards with cost-tracking solutions, such as OpenCost or custom Prometheus rules, helps correlate operational telemetry with direct financial impact. This integration allows estimated costs to be mapped to specific clusters, namespaces, or workloads, and offers targeted insights for cost optimization.
Strategic alerting policies ensure that critical issues never go unnoticed. For example, teams might configure alerts for monthly node or storage costs that exceed budget thresholds, or trigger notifications when nodes remain underused for prolonged periods. This combination of real-time metric collection and cost-aware alerting empowers teams to make timely adjustments to rein in unchecked operational spending.
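For example, you can query the Prometheus instance that TOBS installs to find chronically underused nodes. The sketch below assumes Prometheus has been port-forwarded to localhost:9090 and that node-exporter metrics are being scraped:

```python
# Flag nodes whose average CPU use over the past week is below 20%.
# Assumes: kubectl port-forward svc/<prometheus-service> 9090:9090
import requests

QUERY = '1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[7d]))'

resp = requests.get("http://localhost:9090/api/v1/query",
                    params={"query": QUERY}, timeout=30)
resp.raise_for_status()

for result in resp.json()["data"]["result"]:
    utilization = float(result["value"][1])
    if utilization < 0.20:
        print(f"underused node {result['metric']['instance']}: "
              f"{utilization:.0%} average CPU")
```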
Automation and scale
Cost optimization becomes more and more important at scale. When you manage tens or hundreds of clusters, each with hundreds of nodes, you must automate every aspect of operations, including cost management.
Modern AI capabilities can now provide significant assistance in managing these complex environments.
GitOps integration
Modern organizations use GitOps to manage both applications and infrastructure. All changes to infrastructure or applications go through source control, complete with a review process and mandatory checks. Incorporate many of the strategies we’ve discussed in your CI/CD pipeline, as these will help you detect and prevent cost-related issues before deployment.
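For example, a small pre-deployment check in your pipeline can fail the build when a manifest omits resource requests. A sketch, assuming your manifests live under a manifests/ directory (requires PyYAML):

```python
# CI check: fail if any Deployment container lacks CPU/memory requests.
import pathlib
import sys
import yaml

errors = []
for path in pathlib.Path("manifests").rglob("*.y*ml"):
    for doc in yaml.safe_load_all(path.read_text()):
        if not doc or doc.get("kind") != "Deployment":
            continue
        containers = doc["spec"]["template"]["spec"].get("containers") or []
        for c in containers:
            reqs = (c.get("resources") or {}).get("requests") or {}
            if "cpu" not in reqs or "memory" not in reqs:
                errors.append(f"{path}: container '{c['name']}' "
                              "is missing CPU/memory requests")

if errors:
    print("\n".join(errors))
    sys.exit(1)  # block the deployment
```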
Argo CD is a great way to ensure your workloads are deployed using carefully curated templates, which can include cost attribution and resource quotas.
Policy engines
Policy engines like Kyverno work as admission controllers that run whenever resources are applied to your cluster. They can reject or modify noncompliant resources, such as deployments without proper resource requests and limits. This is very useful because it catches rogue resources that sidestep CI/CD. Kyverno also rechecks all policies periodically, giving you early detection of problems that would otherwise surface later.
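As a sketch, the following mirrors Kyverno's well-known sample policy requiring resource requests, created here through the Kubernetes Python client (you could equally apply the same object as YAML):

```python
# Kyverno ClusterPolicy: reject pods whose containers omit CPU/memory
# requests. Modeled on Kyverno's sample "require requests" policy.
from kubernetes import client, config

config.load_kube_config()

policy = {
    "apiVersion": "kyverno.io/v1",
    "kind": "ClusterPolicy",
    "metadata": {"name": "require-requests"},
    "spec": {
        "validationFailureAction": "Enforce",  # reject, don't just audit
        "rules": [{
            "name": "validate-resource-requests",
            "match": {"any": [{"resources": {"kinds": ["Pod"]}}]},
            "validate": {
                "message": "CPU and memory requests are required.",
                "pattern": {"spec": {"containers": [{
                    "resources": {"requests": {
                        "cpu": "?*", "memory": "?*"}}}]}},
            },
        }],
    },
}

client.CustomObjectsApi().create_cluster_custom_object(
    group="kyverno.io", version="v1",
    plural="clusterpolicies", body=policy,
)
```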
Cost optimization is an ongoing practice
LKE cost optimization requires understanding its transparent pricing structure while recognizing that compute, storage, and egress are the primary cost drivers. Start by implementing essential tools for visibility and right-sizing, then apply core optimization strategies like cluster autoscaling and efficient node pool management. Scale these practices with GitOps automation and policy engines for enterprise environments.
The key is to treat cost optimization as an ongoing practice, not a one-time effort. With the right tools, strategies, and automation in place, you can benefit from 10% to 20% overall cloud savings on LKE.