AI Inference Hardware Decisions: When to Choose CPUs vs. GPUs

Joseph Glover

Oct 07, 2025

Joseph Glover is the Akamai Director, Office of the Field CTO — Cloud.

AI infrastructure is not about chasing the latest hardware trend — it’s about choosing the right tool for the job.

Akamai’s globally distributed edge network uniquely enables scalable, cost-efficient AI inference for real-time applications. By strategically using CPUs, Akamai further reduces costs and energy consumption — without compromising performance for many inference workloads.

Where to draw the line: CPU vs. GPU

Table 1 will help you make the right infrastructure choice based on your model architecture, latency requirements, and deployment environment.

| Use central processing units (CPUs) when | Use graphics processing units (GPUs) when |
| --- | --- |
| Running real-time inference for lightweight or sparsified models | Training deep learning models or processing high-res media |
| Handling control logic or general purpose tasks | Needing massive parallelism (e.g., matrix operations) |
| Prioritizing cost-effective, widely available edge compute | Optimizing for raw performance in centralized data centers |
| Requiring portability across environments | Running large-scale model training |

Table 1: When to use CPUs and when to use GPUs

Meet Moviemind: A lightweight AI demo for CPU inference

Moviemind, a simple recommendation engine built on a pretrained language model, runs entirely on CPU-based infrastructure and deploys easily using Terraform on Akamai Cloud Compute. 

9 steps to deploying AI inference on Akamai Cloud

The following section walks you through deploying AI applications quickly on Akamai Cloud using infrastructure as code (IaC). With Terraform, you can spin up scalable, portable environments at the edge with minimal manual effort.

Before you begin, review each step so you know what to expect and can complete your setup smoothly.

  1. Prepare the environment 

  2. Clone or fork the project repository

  3. Secure your secrets

  4. Configure for your needs (optional) 

  5. Initialize and apply configuration 

  6. Set up a custom domain (optional) 

  7. Access the application

  8. Estimate costs

  9. Clean up

Prepare the environment

If you've already completed any of these prerequisites, feel free to skip ahead. Just make sure everything is in place before provisioning infrastructure.
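The exact prerequisites depend on the project README, but a typical workstation setup looks something like this sketch (Homebrew on macOS is assumed; the Linode Terraform provider conventionally reads a LINODE_TOKEN environment variable):

```bash
# Install Terraform from HashiCorp's tap (use your platform's package manager otherwise)
brew tap hashicorp/tap
brew install hashicorp/tap/terraform

# Create an SSH key pair if you don't already have one
ssh-keygen -t ed25519 -C "moviemind-deploy"

# Export your Akamai Cloud (Linode) API token so the provider can authenticate
export LINODE_TOKEN="your-api-token-here"   # placeholder: generate a token in Cloud Manager

# Confirm the installation
terraform version
```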

Clone or fork the project repository

If you just want to deploy the project as is, you can clone it directly:

1. Navigate to the folder where you want to clone it, e.g., cd ~/Projects

2. Run git clone https://github.com/jgdynamite10/moviemind-public.git

3. Run cd moviemind-public

Note: If you plan to make changes, fork the repository first:

  1. Go to the GitHub repository you are working from.

  2. Click Fork in the upper right corner, between the Watch and Star buttons.

  3. Create your own fork of jgdynamite10/moviemind-public.

Secure your secrets

Protect sensitive data by following development security best practices.

Note: Never store passwords, keys, or tokens in GitHub. Add .env and secrets.tfvars files to your .gitignore.
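A minimal sketch of that pattern (the linode_token variable name is an assumption; match whatever the project's variables.tf declares):

```bash
# Keep secret files out of version control
printf '.env\nsecrets.tfvars\n' >> .gitignore

# Put the API token in secrets.tfvars rather than in any tracked file
cat > secrets.tfvars <<'EOF'
linode_token = "your-api-token-here"
EOF
```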

Configure for your needs (optional)

Edit the variables in variables.tf so your infrastructure aligns with your application requirements and environment.

  • label: Name your instance for easier tracking

  • region: Choose a location close to users or data

  • instance_type: Match compute to workload (see Table 2)

 

| Instance type | When to use |
| --- | --- |
| g6-standard-4 | Small models, low traffic |
| g6-standard-8 | Medium models or moderate traffic |
| g6-dedicated-8 | Larger models, high concurrency, or when you need consistent performance |

Table 2: Instance types and when to use them

Note: Wait until after infrastructure is provisioned to set the domain variable so that the needed information is available.
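If you'd rather not edit variables.tf directly, one common pattern is to put overrides in a terraform.tfvars file, which Terraform loads automatically (the variable names below mirror the bullets above and are assumptions; check variables.tf for the actual names):

```bash
cat > terraform.tfvars <<'EOF'
label         = "moviemind-demo"
region        = "us-east"          # pick a region close to your users
instance_type = "g6-standard-4"    # see Table 2
EOF
```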

Initialize and apply configuration

To see what Terraform intends to create, modify, or destroy without applying the configuration, run terraform plan.

It’s a good way to verify that your variables and configurations are correct.
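Because this project reads secrets from a tfvars file, pass the same file to plan (assuming the secrets.tfvars created in the earlier step):

```bash
terraform plan -var-file="secrets.tfvars"
```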

Once your variables are set, initialize your Terraform workspace and apply the configuration to provision your infrastructure:

  1. terraform init
  2. terraform apply -var-file="secrets.tfvars"

Terraform prompts you to confirm before creating resources. This process takes approximately 5 to 10 minutes. After completion, it will output your instance's public IP address and other useful information.
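If you need those outputs again later, terraform output re-displays them (instance_ip is a hypothetical output name; check the project's outputs.tf for the real ones):

```bash
terraform output              # list all outputs from the applied configuration
terraform output instance_ip  # hypothetical output name
```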

Set up a custom domain (optional)

To use a branded domain for your service, follow Akamai's guide to configuring a custom domain and secure it with HTTPS.

Note: If you're deploying to a compute instance, create an A record pointing to your instance’s public IP. For faster DNS propagation, consider lowering the TTL to 300 seconds (5 minutes).

Access the application

After deployment, Terraform outputs your instance's public IP.

  1. Wait approximately 1 minute for services to fully initialize

  2. Open a browser and navigate to:

https://203.0.113.42:8080

Replace 203.0.113.42 with your actual instance IP and 8080 with the port your application uses.
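Before opening a browser, you can confirm the service responds with a quick curl check (same placeholder IP and port as above):

```bash
# -k skips certificate validation, which helps while a self-signed cert is in place
curl -k https://203.0.113.42:8080/
```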

If you run into issues when trying to access the deployed application, refer to the Troubleshooting tips in the next section.

Estimate costs

Use the Akamai Cloud Computing Calculator to configure and price your infrastructure, from single instances to complex deployments, and to compare Akamai’s pricing with AWS, GCP, and Azure to see potential savings.

Clean up

Once your infrastructure is no longer needed, tear it down with terraform destroy, passing the same variable file you used for apply (the full command appears after the list below).

Also remove:

  • DNS records (if using a custom domain)

  • Local secrets or temporary files
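A typical teardown, assuming the same variable file used during apply:

```bash
terraform destroy -var-file="secrets.tfvars"
```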

Troubleshooting tips

Provisioning issues

  • Run terraform validate to check for syntax errors or missing variables.

  • Ensure that your API token is valid and that your account has sufficient quota.

Server creation is stuck or server is created but offline

Sometimes provisioning stalls for more than 3 minutes with no obvious progress, or the server is created but remains offline. In either case, delete the server and run terraform apply -var-file="secrets.tfvars" again.

Terraform can't establish SSH connection with a server

Ensure you have an SSH agent running and your SSH key is added. (Learn more about SSH agents.)
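For example, with standard OpenSSH tooling (the key path is a placeholder):

```bash
eval "$(ssh-agent -s)"       # start an agent if one isn't already running
ssh-add ~/.ssh/id_ed25519    # add your key; adjust the path to your key file
ssh-add -l                   # confirm the key is loaded
```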

The process is stuck at any stage

If the deployment process gets stuck for more than 3 minutes with no obvious progress, press Ctrl+C to break, and run terraform apply -var-file="secrets.tfvars" to restart the process. In most cases, this should help.

Application not loading

  • Confirm that the correct IP address and port are being used.

  • Use dig or nslookup to verify that your domain resolves correctly.

  • If your SSL certificate failed to provision (a common cause of loading issues), re-run the terraform apply command.

  • Check your firewall rules and open ports.

  • Confirm the SSH key and instance status in the Akamai Linode dashboard.

  • Test the API endpoint with curl or Postman, as shown below.
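For instance, the DNS and endpoint checks can be run from the command line like this (domain, IP, and port are placeholders):

```bash
dig +short app.example.com   # should print your instance's public IP
# Print just the HTTP status code from the endpoint
curl -sk https://203.0.113.42:8080/ -o /dev/null -w '%{http_code}\n'
```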

If these steps don't resolve the issue, check your Akamai service logs or contact Akamai support for further assistance.

Congratulations

You’ve now deployed an AI inference service using CPUs on Akamai’s edge platform. This setup supports a range of real-time applications and can be extended with custom domains, HTTPS, and scalable infrastructure.

Match hardware to use case to avoid wasting time and money

When evaluating AI inference hardware, it’s important to think beyond raw computational power and consider how CPUs and GPUs align with your machine learning tasks and datasets. A CPU with a higher number of cores can handle sequential processing, control functions, and data processing efficiently, while GPUs deliver parallel processing capabilities for deep neural networks, large language models, and other high-performance computing workloads.

Toolkits like CUDA and the tensor libraries built on it take advantage of GPU accelerators to speed up model training and reduce bottlenecks, especially for algorithms that rely heavily on matrix multiplication and high throughput. At the same time, CPUs remain a cost-efficient option for many inference tasks, offering energy efficiency and portability across computing systems.

Whether your AI projects involve chatbots, generative AI, or large datasets for data science, understanding the key differences between CPUs and GPUs, as well as options from Intel, AMD, and NVIDIA, will help you match hardware to use case and avoid wasted training time or infrastructure costs.


With more than 4,400 Akamai edge locations, you’re well-positioned to deliver high performance, robust security, and global scalability — especially for edge computing and AI inference — regardless of where your users are.
