Akamai Cloud GPUs Powered by NVIDIA
Distribute AI inference and GPU-accelerated workloads on demand with NVIDIA-powered instances running on Akamai’s globally distributed cloud.
Create an account to deploy now, or join the waitlist for NVIDIA RTX PRO 6000 Blackwell Server Edition.
Why choose Akamai GPUs
- Enterprise-grade acceleration: RTX PRO VMs optimized for AI inference and graphics
- Low-latency delivery: run closer to users and data for faster responses
- Predictable costs: on-demand pricing with low egress and no long-term contracts
- Developer-friendly: deploy via UI, API, CLI, or Terraform; 24/7/365 support
NVIDIA GPU options on Akamai
- NVIDIA RTX PRO 6000 Blackwell Server Edition (96 GB GDDR7)
  - Designed for advanced AI inference, agentic and physical AI, 3D, rendering, and scientific compute
  - Available via waitlist: join the waitlist or review the data sheet
- NVIDIA RTX 4000 Ada Generation
  - Balanced price/performance for ML inference, analytics, and media workloads using CUDA, Tensor, and RT cores
  - View the data sheet
- NVIDIA Quadro RTX 6000
  - A previous-generation workhorse with hardware encode/decode for H.264 and HEVC; great for transcoding and visualization
  - View the data sheet
Not sure which to pick? Talk with our team and match your workload to the right GPU.
Pricing and egress
- On-demand GPU pricing with no contracts; RTX 4000 Ada plans start at $0.52/hour
- Low egress: save up to 90% with pricing as low as $0.005 per GB in most regions
- See all current plans and rates on the GPU pricing page
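As a rough worked example of the rates above, the sketch below estimates a monthly bill from the $0.52/hour starting price and the $0.005 per GB egress rate. The function name, the 730-hour month, and the simplification that every transferred GB is billed (ignoring any included transfer allowance or regional rate differences) are illustrative assumptions, not Akamai's billing logic.

```python
HOURLY_RATE_USD = 0.52           # starting RTX 4000 Ada rate quoted above
EGRESS_RATE_USD_PER_GB = 0.005   # lowest egress rate quoted above


def estimate_monthly_cost(gpu_hours: float, egress_gb: float) -> float:
    """Return an estimated cost in USD, rounded to cents.

    Assumes a flat hourly rate and a flat per-GB egress rate.
    """
    if gpu_hours < 0 or egress_gb < 0:
        raise ValueError("hours and egress must be non-negative")
    compute = gpu_hours * HOURLY_RATE_USD
    egress = egress_gb * EGRESS_RATE_USD_PER_GB
    return round(compute + egress, 2)


if __name__ == "__main__":
    # One instance running a 730-hour month and sending 2,000 GB out.
    print(estimate_monthly_cost(730, 2000))  # prints 389.6
```

In this example compute accounts for $379.60 and egress for $10.00, so egress is a small share of the total at these rates. Check the GPU pricing page for the actual rate of the plan and region you choose.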
Performance and efficiency highlights
- Up to 86% lower inference cost demonstrated with Stable Diffusion on Akamai Cloud. Download the white paper
- Benchmarks show NVIDIA RTX PRO 6000 Blackwell on Akamai Cloud delivers up to 1.63x higher inference throughput than H100. Read the benchmarking analysis
- Media pipelines benefit from dedicated encode/decode engines (including AV1) on supported GPUs
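To illustrate how a media pipeline might use the dedicated encode/decode engines, here is a minimal sketch that builds an ffmpeg command for GPU-accelerated AV1 transcoding. It assumes an ffmpeg build with NVENC support and a GPU whose encoder supports AV1; the function name, file names, and bitrate are hypothetical.

```python
import shlex


def build_av1_transcode_cmd(src: str, dst: str, bitrate: str = "4M") -> list[str]:
    """Build an ffmpeg argument list that decodes and encodes on the GPU."""
    return [
        "ffmpeg",
        "-hwaccel", "cuda",     # decode with the GPU's hardware decoder
        "-i", src,
        "-c:v", "av1_nvenc",    # encode with the GPU's hardware AV1 encoder
        "-b:v", bitrate,
        "-c:a", "copy",         # pass the audio stream through unchanged
        dst,
    ]


if __name__ == "__main__":
    cmd = build_av1_transcode_cmd("input.mp4", "output.mkv")
    print(shlex.join(cmd))
    # To run the transcode: subprocess.run(cmd, check=True)
```

On a GPU without an AV1 encoder, swapping `av1_nvenc` for `hevc_nvenc` or `h264_nvenc` keeps the same structure.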
Built for your stack
- Kubernetes-ready: add GPU nodes to Linode Kubernetes Engine (LKE), Akamai's managed Kubernetes service
- Easy migration: resize Shared or Dedicated CPU instances into GPU instances
- CI/CD friendly: APIs, CLI, Terraform provider, custom images
- Data protection and durability: Backups with automated snapshots
- Manage via UI, API, or CLI with full documentation and 24/7/365 support
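As a sketch of what scheduling onto a GPU node could look like, the snippet below generates a Kubernetes Pod manifest that requests one GPU and runs `nvidia-smi` as a smoke test. It assumes the cluster exposes GPUs through the standard `nvidia.com/gpu` resource (for example via the NVIDIA device plugin or GPU Operator); the function name, pod name, and container image tag are placeholders to adapt.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Return a Pod manifest whose container requests `gpus` NVIDIA GPUs."""
    if gpus < 1:
        raise ValueError("request at least one GPU")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "command": ["nvidia-smi"],
                    # GPUs are requested under `limits`; the scheduler then
                    # places the pod only on a node with a free GPU.
                    "resources": {"limits": {"nvidia.com/gpu": str(gpus)}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Image tag is a placeholder; use any CUDA-enabled image you trust.
    manifest = gpu_pod_manifest("gpu-smoke-test", "nvidia/cuda:12.4.1-base-ubuntu22.04")
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the output can be piped straight in, for example `python gpu_pod.py | kubectl apply -f -`, and the pod's logs should then list the GPU if the node is set up correctly.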
What you can run
- Real-time AI inference and agentic workloads
- RAG pipelines and personalization services at the edge
- Media transcoding, rendering, and interactive streaming
- Scientific computing, analytics, and visualization
Deploy in minutes
- Create your account and choose a GPU plan
- Provision in your preferred region and attach storage/networking
- Install NVIDIA drivers and CUDA toolkit using the GPU setup guide
- Launch your model or container (vLLM, TensorRT, PyTorch, or your stack) and scale on demand
- Optional: add GPU nodes to LKE and deploy with KServe/Kubeflow
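The provisioning step above can also be scripted. The sketch below prepares, but does not send, a create-instance request against the Linode API v4 endpoint `POST /linode/instances`. The helper name and the `LINODE_TOKEN` environment variable are hypothetical, and the label, region, plan type, image, and password values are placeholders: look up the real region and GPU plan IDs in the API or on the GPU pricing page before using it.

```python
import json
import os
import urllib.request

API_URL = "https://api.linode.com/v4/linode/instances"


def build_create_request(token: str, label: str, region: str,
                         plan_type: str, image: str,
                         root_pass: str) -> urllib.request.Request:
    """Prepare (without sending) a request that creates one instance."""
    body = {
        "label": label,
        "region": region,
        "type": plan_type,   # GPU plan ID; placeholder in the demo below
        "image": image,
        "root_pass": root_pass,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_create_request(
        token=os.environ.get("LINODE_TOKEN", ""),
        label="gpu-inference-1",
        region="REGION_ID",            # placeholder
        plan_type="GPU_PLAN_TYPE_ID",  # placeholder
        image="linode/ubuntu24.04",
        root_pass="change-me",         # placeholder; never hard-code secrets
    )
    print(req.get_method(), req.full_url)
    # To actually create the instance (this incurs charges):
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp)["id"])
```

Keeping request construction separate from sending makes the payload easy to inspect or test before anything billable is created. The same parameters map onto the CLI and the Terraform provider.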
Go further with Akamai Inference Cloud
If you’re building edge-native AI or agentic applications, pair GPUs with Akamai Inference Cloud for global traffic management, AI-aware security, vector databases, and low-latency inference at the edge.
Ready to start? Create an account. Prefer guidance? Book an AI consultation with our team.