Horizontal vs. Vertical Scaling: Which Is Best for APIs?

Horizontal scaling adds more machines to a system, distributing the load. Vertical scaling upgrades a single machine’s CPU, RAM, and storage. The choice depends on system architecture and application needs.

Demand for an application programming interface (API) is rarely static; it rises and falls over time. In certain cases, such as holiday season spikes in traffic, an API might need to handle a significantly higher volume of requests than it does at other times. Meeting such an increase means scaling the API's capacity. System admins have two choices when it comes to scaling an API: horizontal scaling, which refers to adding more API instances to a cluster, or vertical scaling, which means adding to the computing capacity of the machine that supports the API. This article explores the differences between the two and which one is best for API scalability.

Horizontal scaling

Horizontal scaling, also known as “scaling out,” is the process of deploying additional virtual machines (VMs) so there is more API capacity to handle an increased load. (Shrinking capacity is known as “scaling in.”) As more capacity is needed, system admins can add more VMs to the cluster. Specialized resource management software, typically a load balancer, is required, however, to distribute incoming API calls across the VM instances in the cluster and keep the load balanced.
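
As a sketch of that routing requirement, a minimal round-robin balancer might look like the following. The class name, instance addresses, and methods are illustrative assumptions, not the API of any real load balancer:

```python
from itertools import cycle

class RoundRobinBalancer:
    """Distribute API calls evenly across the VM instances in a cluster."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._cycle = cycle(self._instances)

    def add_instance(self, address):
        # Scaling out: a new VM joins the rotation.
        self._instances.append(address)
        self._cycle = cycle(self._instances)

    def remove_instance(self, address):
        # Scaling in: a VM leaves the rotation.
        self._instances.remove(address)
        self._cycle = cycle(self._instances)

    def next_instance(self):
        # Route the next API call to the next VM in round-robin order.
        return next(self._cycle)

pool = RoundRobinBalancer(["10.0.0.1", "10.0.0.2"])
pool.add_instance("10.0.0.3")  # scale out under load
targets = [pool.next_instance() for _ in range(6)]
```

With three instances in rotation, six consecutive calls are spread evenly, two per VM; real load balancers add health checks and weighting on top of this basic idea.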

Vertical scaling

Vertical scaling, also known as “scaling up” (or “scaling down” when resources are removed), is the process of adding resources to a single node rather than adding nodes, which is what horizontal scaling does. Those resources can include central processing unit (CPU) capacity, memory, or storage. In the case of APIs, vertical scaling usually involves adding computing capacity to the VM that hosts the API.

For example, if an API is hosted on a VM that’s been allocated one CPU core and 512 megabytes of random access memory (RAM), then scaling up that API could mean doubling the core count and RAM. The API would then have two dedicated CPU cores and 1,024 megabytes of RAM. With this new configuration, the API should be able to handle roughly double the load, though constraints on network bandwidth, storage speed, and other factors may reduce the impact of vertical scaling. Vertical scaling also poses a resource management challenge, but specialized software can typically handle it.
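
The back-of-the-envelope math above can be sketched as a small function. The function name, parameters, and the idea of a bottleneck ceiling are illustrative assumptions, not a real capacity-planning model:

```python
def scaled_capacity(base_rps, cpu_factor, bottleneck_rps=None):
    """Estimate request throughput after vertical scaling.

    base_rps: requests/sec the current VM handles.
    cpu_factor: multiplier applied to CPU and RAM (e.g., 2.0 for doubling).
    bottleneck_rps: optional ceiling imposed by network bandwidth,
        storage speed, or other non-CPU constraints.
    """
    estimate = base_rps * cpu_factor
    if bottleneck_rps is not None:
        # Doubling CPU and RAM cannot push throughput past other limits.
        estimate = min(estimate, bottleneck_rps)
    return estimate

doubled = scaled_capacity(500, 2.0)                      # roughly 2x
capped = scaled_capacity(500, 2.0, bottleneck_rps=800)   # network-limited
```

Doubling cores and RAM on a 500-requests/sec VM suggests about 1,000 requests/sec, but if the network can only carry 800, the extra compute goes unused.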

Why API statelessness matters for scaling

APIs are typically designed as stateless, meaning they do not store request data or retain information between sessions. This stateless nature is crucial when considering scaling methods. Stateless APIs do not require data replication across instances, making horizontal scaling more efficient and easier to implement. System admins can add or remove VMs as needed without affecting API operations, as the API client does not rely on specific server instances to function.

There is nothing wrong with being stateful; indeed, retaining data between sessions may be essential to the desired functioning of the app. However, it is significantly more complicated to execute horizontal scaling for a stateful app, because doing so requires copying or replicating stored data from the original instance of the app to the new instances.

A stateless app or API, in contrast, is one that does not store request data. It does not hold onto session data in memory. Each time a session starts, it’s as if the app is meeting the client for the first time. After the session is over, it’s “goodbye,” with no memory of the session.

Horizontal scaling is possible for a stateless app because it doesn’t matter which VM responds to a given API call. The client can be served by any number of VMs hosting the API, and system admins can add or remove as many VMs as they want without affecting the operation of the API.
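
A minimal sketch of the stateless pattern: any per-session data rides along inside the request itself (here, as a base64-encoded JSON token), so any VM can serve any call. The handler shape and token format are illustrative assumptions, not a specific framework’s API:

```python
import base64
import json

def handle_request(request):
    """A stateless API handler: each request is self-contained.

    Session data arrives encoded in the request, so the handler keeps
    nothing in server memory between calls, and any VM in the cluster
    can serve any request.
    """
    token = request.get("session_token")
    session = json.loads(base64.b64decode(token)) if token else {}
    count = session.get("count", 0) + 1
    new_token = base64.b64encode(json.dumps({"count": count}).encode()).decode()
    return {"status": 200, "count": count, "session_token": new_token}

first = handle_request({})                                          # served by VM A
second = handle_request({"session_token": first["session_token"]})  # served by VM B
```

Because the “memory” travels with the request, the second call can hit a completely different instance and still continue where the first left off; in practice the same effect is achieved with signed tokens or an external session store.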

Horizontal scaling: the right choice for APIs

Given that APIs are stateless, horizontal scaling emerges as the right way to scale them. Adding more VMs to increase capacity works well with stateless APIs. Admins can create VM clusters that scale out as API demand grows.

Furthermore, while it is possible to scale APIs vertically, horizontal scaling is preferable because vertical scaling’s resource allocation issues make it comparatively harder to do. Horizontal autoscaling, in contrast, works easily with APIs. As systems management tools detect a spike in API traffic, they can automatically add VMs to host more instances of the API in a cluster. Autoscaling vertically is far more difficult, since resizing a VM often requires restarting it.
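
The autoscaling decision described above can be sketched as a simple threshold rule. All thresholds, limits, and names here are illustrative assumptions, not taken from any particular autoscaling tool:

```python
def desired_instances(current, avg_rps_per_instance, *,
                      scale_out_at=800, scale_in_at=200,
                      min_instances=2, max_instances=20):
    """Threshold-based horizontal autoscaling decision.

    Given the average request rate per instance, return how many VM
    instances the cluster should run next.
    """
    if avg_rps_per_instance > scale_out_at and current < max_instances:
        return current + 1   # traffic spike: add a VM to the cluster
    if avg_rps_per_instance < scale_in_at and current > min_instances:
        return current - 1   # demand dropped: remove a VM
    return current           # load is within the comfortable band

# A spike adds capacity; a lull removes it, but never below the floor.
after_spike = desired_instances(3, 1000)
after_lull = desired_instances(3, 100)
at_floor = desired_instances(2, 100)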

Demand for APIs will inevitably change over time. It will be necessary to scale API capacity up or down. Horizontal and vertical scaling are both available options for APIs. However, the stateless nature of APIs, coupled with the relative ease of horizontal autoscaling, favors horizontal scaling as the right approach to scaling up APIs.

Distributed systems and microservices architecture

In a distributed system, where APIs are part of a microservices architecture, horizontal scaling is particularly advantageous. Each microservice can be scaled independently based on its specific load and requirements. This modular approach allows for more efficient use of resources and better fault tolerance, as failures in one microservice do not impact others.

Moreover, in a microservices architecture, horizontal scaling aligns with the principles of distributed systems, where different services are deployed across multiple nodes. This ensures that the system can handle varying loads and maintain optimal performance even during peak times.

Minimizing downtime with scaling strategies

Downtime is a critical concern when scaling APIs, as any interruption in service can lead to lost revenue and customer dissatisfaction. Horizontal scaling minimizes downtime by allowing system admins to add or remove servers without impacting the API’s availability. In contrast, vertical scaling may require rebooting the server or taking it offline temporarily, which can disrupt services.

API providers can leverage horizontal scaling to ensure continuous operation, particularly during planned maintenance or unexpected traffic spikes. By distributing the load across multiple servers, horizontal scaling ensures that the API remains responsive and available, even if individual servers need to be taken offline for upgrades or repairs.
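
The drain-and-retire pattern behind that continuous operation can be sketched as follows; the class, server names, and methods are hypothetical, standing in for what a real load balancer does when an instance is marked for maintenance:

```python
class ServerPool:
    """Take a VM out of rotation without interrupting the API."""

    def __init__(self, servers):
        self.active = set(servers)   # receiving new requests
        self.draining = set()        # finishing in-flight requests only

    def route(self):
        # New API calls go only to active servers.
        return sorted(self.active)[0]

    def drain(self, server):
        # Stop sending new traffic to the server; in-flight work completes.
        self.active.discard(server)
        self.draining.add(server)

    def retire(self, server):
        # Once drained, the server can be taken offline for upgrades.
        self.draining.discard(server)

pool = ServerPool(["api-vm-1", "api-vm-2"])
pool.drain("api-vm-1")      # prepare api-vm-1 for maintenance
survivor = pool.route()     # traffic continues on api-vm-2
pool.retire("api-vm-1")     # safe to reboot or upgrade api-vm-1
```

Because the API is stateless, clients never notice that api-vm-1 left the pool; a replacement VM can later be added back the same way. This is exactly why vertical scaling, which resizes the one machine everyone depends on, has a harder time avoiding downtime.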

Frequently Asked Questions

How does horizontal scaling affect API performance?

Horizontal scaling has a positive impact on an API’s performance. This approach involves adding more servers to your infrastructure and distributing the load among them. As a result, the system can handle more concurrent requests, leading to improved response times and enhanced overall performance.

In addition to performance benefits, horizontal scaling contributes to the redundancy and reliability of the system. If one server fails, the load balancer diverts traffic to remaining servers that can seamlessly take over, ensuring continuous operation. This aligns with API security best practices and enhances the system’s resilience against potential failures.

How does vertical scaling affect API performance?

Vertical scaling involves upgrading the resources of a single server to enhance its performance. This can improve the processing speed for API requests, as a more powerful server can handle a higher load. However, it’s important to note that vertical scaling may have diminishing returns. There are limits to how much you can upgrade a single machine. Beyond a certain point, the performance gains may not be proportional to the investment.

When considering horizontal vs. vertical scaling, weighing the potential benefits against the risks is crucial. Keep in mind things like REST API security implications and ensure that the system remains resilient to failures.

When should you use horizontal scaling vs. vertical scaling?

It’s important to understand the difference between horizontal and vertical scaling to determine when to use each one.

Use vertical scaling when: 

  • Your web application is in its early stages or has a light load, which a single powerful server can effectively handle. This is especially true when the demand for API calls is low.
  • The application or API doesn’t support distributed computing well. In cases where the architecture or design doesn’t align with horizontal scaling principles, vertical scaling can be a suitable alternative. 

Use horizontal scaling when: 

  • Your application is experiencing growth, and demand is likely to spike. Horizontal scaling allows the infrastructure to adapt to changing loads by adding more servers as needed. 
  • You need to ensure high availability and redundancy. Horizontal scaling provides redundancy, making the system more resilient to failures and ensuring continuous operation even if one or more servers go down.

Is horizontal or vertical scaling more cost-effective?

The cost-effectiveness of horizontal scaling vs. vertical scaling depends on factors like the application’s specific requirements, deployment scale, and overall demand. Vertical scaling is typically more cost-effective at small scale because you only buy one or two new components when you scale. However, it becomes less cost-effective as you approach the limits of the server’s capacity.

Why customers choose Akamai

Akamai is the cybersecurity and cloud computing company that powers and protects business online. Our market-leading security solutions, superior threat intelligence, and global operations team provide defense in depth to safeguard enterprise data and applications everywhere. Akamai’s full-stack cloud computing solutions deliver performance and affordability on the world’s most distributed platform. Global enterprises trust Akamai to provide the industry-leading reliability, scale, and expertise they need to grow their business with confidence.
