Scale Transcoding and AI Workloads with GPU Kubernetes Clusters

Hanna Jeddy

Mar 07, 2025

Written by Hanna Jeddy

Hanna Jeddy is a Senior Product Marketing Manager at Akamai.

The intersection of container orchestration and GPU computing represents a powerful frontier for organizations seeking to optimize their performance. Running managed Kubernetes clusters on GPUs isn't just a technical choice—it's a strategic decision that can transform how enterprises handle their most demanding workloads.

The demand for GPU-accelerated workloads is driven by the explosion in AI and ML initiatives, increased demand for real-time data processing, and the rising need for high-performance media processing and streaming.

Media and streaming applications are constantly adapting to fulfill demand. Sometimes a surge in traffic or demand is predictable, like livestreaming for a major sporting event, but not always. Edge-native applications leverage Kubernetes to ensure that an application’s underlying infrastructure can scale to meet peak demand while maintaining expected performance, and without paying for infrastructure resources that would otherwise go unused.

Performant transcoding is an essential component of a scalable media application, especially for live streaming. Now, we’re making that easier than ever for our customers with GPU node pools in managed Kubernetes clusters.

Announcing GPU Support for Linode Kubernetes Engine: Adding NVIDIA RTX 4000 Ada Generation GPUs to K8s Clusters

We’re excited to announce that Linode Kubernetes Engine now supports NVIDIA RTX 4000 Ada Generation GPUs. Our RTX 4000 Ada Generation GPU plans are optimized for media use cases with each card containing dedicated 2x encode, 2x decode, and 1x AV1 encode engines, but are right-sized for a range of workloads and applications. The RTX 4000 Ada Generation plans start at $0.52 per hour for 1 GPU, 4 CPUs, and 16GB of RAM.
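To put that hourly rate in perspective, here is a quick back-of-the-envelope monthly cost sketch. The function and the 730-hour average month are illustrative assumptions of mine, not an Akamai billing formula:

```python
# Rough monthly cost estimate for a GPU node pool (illustrative only).
def monthly_pool_cost(hourly_rate: float, node_count: int,
                      hours_per_month: int = 730) -> float:
    """Estimate monthly cost: hourly rate x nodes x hours (~730 hrs/month)."""
    return round(hourly_rate * node_count * hours_per_month, 2)

# A 3-node pool of the $0.52/hr RTX 4000 Ada plan:
print(monthly_pool_cost(0.52, 3))  # ~1138.8 per month
```

Actual billing depends on usage and any applicable caps, so treat this as a planning estimate only.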

Getting started is simple: while setting up your Kubernetes cluster, select your preferred GPU plan and the quantity of the node pool to add to your cluster.

Note: This requires selecting a region where GPUs are offered. RTX 4000 Ada Generation GPUs are available in the following regions:

  • Chicago, USA (us-ord)
  • Seattle, USA (us-sea)
  • Frankfurt Expansion (de-fra-2)
  • Paris, FR (fr-par)
  • Osaka, JP (jp-osa)
  • Singapore Expansion (sg-sin-2)

Fastest Path to Kubernetes Value

For developers who want to reduce the complexity of building and managing workloads on Kubernetes, our recently launched Akamai App Platform can also run on GPUs. Pairing the accelerated Kubernetes deployment that App Platform delivers with the powerful compute of GPUs enables high-performance applications like media and AI at better cost, performance, and scale.

To try it yourself, create an account and browse our Kubernetes documentation to get started, or reach out to our cloud computing consultants for assistance. 

Note: App Platform is currently in Beta, so it will need to be activated through our Beta program page before it becomes visible for deployment in your Kubernetes cluster.

