Horizontal vs. Vertical Scaling: Which Is Best for APIs?

Horizontal scaling adds more machines to a system, distributing the load. Vertical scaling upgrades a single machine’s CPU, RAM, and storage. The choice depends on system architecture and application needs.

Demand for an application programming interface (API) is rarely static; it rises and falls over time. In certain cases, such as holiday season spikes in traffic, an API might need to handle a significantly higher volume of requests than it does at other times. Meeting such an increase means scaling the API's capacity. System admins have two choices when it comes to scaling an API: horizontal scaling, which refers to adding more API instances to a cluster, or vertical scaling, which means adding to the computing capacity of the machine that supports the API. This article explores the differences between the two and which one is best for API scalability.

Horizontal scaling

Horizontal scaling, also known as “scaling out,” is the process of deploying additional virtual machines (VMs) so there is more API capacity to handle an increased load. (Shrinking capacity is known as “scaling in.”) As more capacity is needed, system admins can add more VMs to the cluster. Specialized resource management software, typically a load balancer, is required, however, to distribute incoming API calls across the VM instances in the cluster and keep the load balanced.
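
As a sketch of that routing requirement, a minimal round-robin balancer might look like the following. The class name, instance addresses, and methods are illustrative assumptions, not the API of any real load balancer:

```python
from itertools import cycle

class RoundRobinBalancer:
    """Distribute API calls evenly across the VM instances in a cluster."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._cycle = cycle(self._instances)

    def add_instance(self, address):
        # Scaling out: a new VM joins the rotation.
        self._instances.append(address)
        self._cycle = cycle(self._instances)

    def remove_instance(self, address):
        # Scaling in: a VM leaves the rotation.
        self._instances.remove(address)
        self._cycle = cycle(self._instances)

    def next_instance(self):
        # Route the next API call to the next VM in round-robin order.
        return next(self._cycle)

pool = RoundRobinBalancer(["10.0.0.1", "10.0.0.2"])
pool.add_instance("10.0.0.3")  # scale out under load
targets = [pool.next_instance() for _ in range(6)]
```

With three instances in rotation, six consecutive calls are spread evenly, two per VM; real load balancers add health checks and weighting on top of this basic idea.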

Vertical scaling

Vertical scaling, also known as “scaling up” (or “scaling down” when resources are removed), is the process of adding resources to a single node rather than adding nodes, which is what horizontal scaling does. Those resources can include central processing unit (CPU) capacity, memory, or storage. In the case of APIs, vertical scaling usually involves adding computing capacity to the VM that hosts the API.

For example, if an API is hosted on a VM that’s been allocated one CPU core and 512 megabytes of random access memory (RAM), then scaling up that API could mean doubling the core count and RAM. The API would then have two dedicated CPU cores and 1,024 megabytes of RAM. With this new configuration, the API should be able to handle roughly double the load, though constraints on network bandwidth, storage speed, and other factors may reduce the impact of vertical scaling. Vertical scaling also poses a resource management challenge, but specialized software can typically handle it.
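
The back-of-the-envelope math above can be sketched as a small function. The function name, parameters, and the idea of a bottleneck ceiling are illustrative assumptions, not a real capacity-planning model:

```python
def scaled_capacity(base_rps, cpu_factor, bottleneck_rps=None):
    """Estimate request throughput after vertical scaling.

    base_rps: requests/sec the current VM handles.
    cpu_factor: multiplier applied to CPU and RAM (e.g., 2.0 for doubling).
    bottleneck_rps: optional ceiling imposed by network bandwidth,
        storage speed, or other non-CPU constraints.
    """
    estimate = base_rps * cpu_factor
    if bottleneck_rps is not None:
        # Doubling CPU and RAM cannot push throughput past other limits.
        estimate = min(estimate, bottleneck_rps)
    return estimate

doubled = scaled_capacity(500, 2.0)                      # roughly 2x
capped = scaled_capacity(500, 2.0, bottleneck_rps=800)   # network-limited
```

Doubling cores and RAM on a 500-requests/sec VM suggests about 1,000 requests/sec, but if the network can only carry 800, the extra compute goes unused.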

Why API statelessness matters for scaling

APIs are typically designed as stateless, meaning they do not store request data or retain information between sessions. This stateless nature is crucial when considering scaling methods. Stateless APIs do not require data replication across instances, making horizontal scaling more efficient and easier to implement. System admins can add or remove VMs as needed without affecting API operations, as the API client does not rely on specific server instances to function.

There is nothing wrong with being stateful; indeed, retaining data between sessions may be essential to the desired functioning of the app. However, it is significantly more complicated to execute horizontal scaling for a stateful app, because doing so requires copying or replicating stored data from the original instance of the app to the new instances.

A stateless app or API, in contrast, is one that does not store request data. It does not hold onto session data in memory. Each time a session starts, it’s as if the app is meeting the client for the first time. After the session is over, it’s “goodbye,” with no memory of the session.

Horizontal scaling is possible for a stateless app because it doesn’t matter which VM responds to a given API call. The client can be served by any number of VMs hosting the API, and system admins can add or remove as many VMs as they want without affecting the operation of the API.
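
A minimal sketch of the stateless pattern: any per-session data rides along inside the request itself (here, as a base64-encoded JSON token), so any VM can serve any call. The handler shape and token format are illustrative assumptions, not a specific framework’s API:

```python
import base64
import json

def handle_request(request):
    """A stateless API handler: each request is self-contained.

    Session data arrives encoded in the request, so the handler keeps
    nothing in server memory between calls, and any VM in the cluster
    can serve any request.
    """
    token = request.get("session_token")
    session = json.loads(base64.b64decode(token)) if token else {}
    count = session.get("count", 0) + 1
    new_token = base64.b64encode(json.dumps({"count": count}).encode()).decode()
    return {"status": 200, "count": count, "session_token": new_token}

first = handle_request({})                                          # served by VM A
second = handle_request({"session_token": first["session_token"]})  # served by VM B
```

Because the “memory” travels with the request, the second call can hit a completely different instance and still continue where the first left off; in practice the same effect is achieved with signed tokens or an external session store.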

Horizontal scaling: the right choice for APIs

Given that APIs are stateless, horizontal scaling emerges as the right way to scale them. Adding more VMs to increase capacity works well with stateless APIs. Admins can create VM clusters that scale out as API demand grows.

Furthermore, while it is possible to scale APIs vertically, horizontal scaling is preferable because vertical scaling’s resource allocation issues make it comparatively harder to do. Horizontal autoscaling, in contrast, works easily with APIs. As systems management tools detect a spike in API traffic, they can automatically add VMs to host more instances of the API in a cluster. Autoscaling vertically is far more difficult, since resizing a VM often requires restarting it.
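
The autoscaling decision described above can be sketched as a simple threshold rule. All thresholds, limits, and names here are illustrative assumptions, not taken from any particular autoscaling tool:

```python
def desired_instances(current, avg_rps_per_instance, *,
                      scale_out_at=800, scale_in_at=200,
                      min_instances=2, max_instances=20):
    """Threshold-based horizontal autoscaling decision.

    Given the average request rate per instance, return how many VM
    instances the cluster should run next.
    """
    if avg_rps_per_instance > scale_out_at and current < max_instances:
        return current + 1   # traffic spike: add a VM to the cluster
    if avg_rps_per_instance < scale_in_at and current > min_instances:
        return current - 1   # demand dropped: remove a VM
    return current           # load is within the comfortable band

# A spike adds capacity; a lull removes it, but never below the floor.
after_spike = desired_instances(3, 1000)
after_lull = desired_instances(3, 100)
at_floor = desired_instances(2, 100)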

Demand for APIs will inevitably change over time. It will be necessary to scale API capacity up or down. Horizontal and vertical scaling are both available options for APIs. However, the stateless nature of APIs, coupled with the relative ease of horizontal autoscaling, favors horizontal scaling as the right approach to scaling up APIs.

Distributed systems and microservices architecture

In a distributed system, where APIs are part of a microservices architecture, horizontal scaling is particularly advantageous. Each microservice can be scaled independently based on its specific load and requirements. This modular approach allows for more efficient use of resources and better fault tolerance, as failures in one microservice do not impact others.

Moreover, in a microservices architecture, horizontal scaling aligns with the principles of distributed systems, where different services are deployed across multiple nodes. This ensures that the system can handle varying loads and maintain optimal performance even during peak times.

Minimizing downtime with scaling strategies

Downtime is a critical concern when scaling APIs, as any interruption in service can lead to lost revenue and customer dissatisfaction. Horizontal scaling minimizes downtime by allowing system admins to add or remove servers without impacting the API’s availability. In contrast, vertical scaling may require rebooting the server or taking it offline temporarily, which can disrupt services.

API providers can leverage horizontal scaling to ensure continuous operation, particularly during planned maintenance or unexpected traffic spikes. By distributing the load across multiple servers, horizontal scaling ensures that the API remains responsive and available, even if individual servers need to be taken offline for upgrades or repairs.
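
The drain-and-retire pattern behind that continuous operation can be sketched as follows; the class, server names, and methods are hypothetical, standing in for what a real load balancer does when an instance is marked for maintenance:

```python
class ServerPool:
    """Take a VM out of rotation without interrupting the API."""

    def __init__(self, servers):
        self.active = set(servers)   # receiving new requests
        self.draining = set()        # finishing in-flight requests only

    def route(self):
        # New API calls go only to active servers.
        return sorted(self.active)[0]

    def drain(self, server):
        # Stop sending new traffic to the server; in-flight work completes.
        self.active.discard(server)
        self.draining.add(server)

    def retire(self, server):
        # Once drained, the server can be taken offline for upgrades.
        self.draining.discard(server)

pool = ServerPool(["api-vm-1", "api-vm-2"])
pool.drain("api-vm-1")      # prepare api-vm-1 for maintenance
survivor = pool.route()     # traffic continues on api-vm-2
pool.retire("api-vm-1")     # safe to reboot or upgrade api-vm-1
```

Because the API is stateless, clients never notice that api-vm-1 left the pool; a replacement VM can later be added back the same way. This is exactly why vertical scaling, which resizes the one machine everyone depends on, has a harder time avoiding downtime.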

Frequently Asked Questions

How does horizontal scaling affect API performance?

Horizontal scaling has a positive impact on an API’s performance. This approach involves adding more servers to your infrastructure and distributing the load among them. As a result, the system can handle more concurrent requests, leading to improved response times and enhanced overall performance.

In addition to performance benefits, horizontal scaling contributes to the redundancy and reliability of the system. If one server fails, the load balancer diverts traffic to remaining servers that can seamlessly take over, ensuring continuous operation. This aligns with API security best practices and enhances the system’s resilience against potential failures.

How does vertical scaling affect API performance?

Vertical scaling involves upgrading the resources of a single server to enhance its performance. This can improve the processing speed for API requests, as a more powerful server can handle a higher load. However, it’s important to note that vertical scaling may have diminishing returns. There are limits to how much you can upgrade a single machine. Beyond a certain point, the performance gains may not be proportional to the investment.

When considering horizontal vs. vertical scaling, weighing the potential benefits against the risks is crucial. Keep in mind things like REST API security implications and ensure that the system remains resilient to failures.

When should you use horizontal scaling vs. vertical scaling?

It’s important to understand the difference between horizontal and vertical scaling to determine when to use each one.

Use vertical scaling when: 

  • Your web application is in its early stages or has a light load, which a single powerful server can effectively handle. This is especially true when the demand for API calls is low.
  • The application or API doesn’t support distributed computing well. In cases where the architecture or design doesn’t align with horizontal scaling principles, vertical scaling can be a suitable alternative. 

Use horizontal scaling when: 

  • Your application is experiencing growth, and demand is likely to spike. Horizontal scaling allows the infrastructure to adapt to changing loads by adding more servers as needed. 
  • You need to ensure high availability and redundancy. Horizontal scaling provides redundancy, making the system more resilient to failures and ensuring continuous operation even if one or more servers go down.

Is horizontal or vertical scaling more cost-effective?

The cost-effectiveness of horizontal scaling vs. vertical scaling depends on factors like the application’s specific requirements, deployment scale, and overall demand. Vertical scaling is typically more cost-effective at small scale because you only buy one or two new components when you scale. However, it becomes less cost-effective as you approach the limits of the server’s capacity.

Why customers choose Akamai

Akamai is the cybersecurity and cloud computing company that powers and protects business online. Our market-leading security solutions, superior threat intelligence, and global operations team provide defense in depth to safeguard enterprise data and applications everywhere. Akamai’s full-stack cloud computing solutions deliver performance and affordability on the world’s most distributed platform. Global enterprises trust Akamai to provide the industry-leading reliability, scale, and expertise they need to grow their business with confidence.
