From limitations to scalability: How we transformed a high-traffic VOD platform

AI summary

The article discusses the challenges faced by a platform experiencing performance bottlenecks due to increased traffic, which overwhelmed its single-server setup. Key issues included limited scalability, as the platform relied on vertical scaling, and an unoptimized architecture that hindered support for a growing user base. These challenges resulted in higher operational costs, frequent downtimes, and inefficient resource utilization.

The proposed solution involved a multi-phase optimization approach that allowed for seamless deployment without downtime. This strategy successfully increased the platform’s capacity from 270Mb/s with 100% CPU usage to 1.2Gb/s with only 40% CPU usage, significantly improving performance and resilience against potential outages.

The conclusion emphasizes the importance of proper infrastructure design and scaling strategies to avoid limitations that can impede business growth.

Client Overview

Client

A growing Video on Demand (VOD) platform using Kernel Video Sharing (KVS).

Industry

Digital Entertainment, Funtech

The client operates a VOD platform based on KVS (our partners). Initially, the project was deployed on two servers with minimal architectural distribution: one server for the KVS script application and one for the database.

Although the servers were powerful mid-range machines, the unoptimized project quickly exhausted the available hardware resources. CPU utilization was disproportionate to the traffic generated. It became clear that even the most powerful server would not be able to meet the project’s requirements, and the client was not prepared for this problem.

Key business operations

  • On-demand video hosting with a growing global user base.
  • Heavy demand for scalable computing resources.

A business that ignores infrastructure early will face scaling issues later on.

Roman Kovalchuk
Marketing and Business Development Director
Challenges

Performance Bottlenecks
As traffic on the platform increased to 270Mb/s, the existing infrastructure struggled to keep up, leading to serious performance bottlenecks. The single-server setup was overwhelmed, with CPU limitations causing noticeable slowdowns and significant degradation of the web resource during prime time.

Illustration below: CDN traffic fell by half when visits to the web resource dropped close to zero. This shows that even a server working at half capacity may bring zero profit.

Limited Scalability
Scalability was another major issue. The platform relied on vertical scaling, meaning that the only way to handle more traffic was to upgrade the existing server hardware. However, this approach had both cost and performance limits. The infrastructure was not designed for horizontal scaling, which would allow traffic to be distributed efficiently across multiple servers.

Unoptimized Architecture
Finally, the platform’s architecture was unoptimized, and its inefficiencies compounded over time, making it increasingly difficult for the system to support the growing user base.

Client’s infrastructure architecture before (vertical)

Business Impact

  • Increasing operational costs due to the need for high-performance servers combined with low efficiency.
  • Frequent downtimes, leading to user dissatisfaction and potential revenue loss.
  • Inefficient resource utilization, requiring an overhaul of the infrastructure.

Solution

We noticed the problem first and proactively approached the client with a ready solution design. We proposed a multi-phase optimization approach instead of simply upgrading hardware.

Phase 1

Performance Analysis & Caching Implementation

  1. Conducted a full system audit, identifying database bottlenecks.
  2. Introduced multi-level caching to reduce database load.
  3. Refactored the project’s structure, separating heavy requests to dedicated handlers.
  4. Optimized file storage for improved efficiency.
  5. Performed OS fine-tuning suited to the server’s current load.

These steps increased outgoing traffic to 800Mb/s on the same old flat architecture. As a result, CPU usage was reduced by 40% at the same traffic level, improving the usability of the web service for the end user.
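The multi-level caching from step 2 can be illustrated with a minimal sketch. The source does not describe the actual cache layers or technologies, so everything here is an assumption: `TTLCache`, `TwoLevelCache`, and `load_from_db` are hypothetical names, and the "shared" level is simulated in memory where a real deployment would more likely use a separate cache service.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """A tiny in-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.clock = clock
        self._data: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._data[key]  # drop the stale entry
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._data[key] = (self.clock() + self.ttl, value)


class TwoLevelCache:
    """Check a short-lived local cache, then a longer-lived shared cache,
    and only then fall back to the database loader (hypothetical)."""

    def __init__(self, local: TTLCache, shared: TTLCache,
                 load_from_db: Callable[[str], Any]) -> None:
        self.local = local
        self.shared = shared
        self.load_from_db = load_from_db
        self.db_hits = 0  # counts how often the database was actually queried

    def get(self, key: str) -> Any:
        value = self.local.get(key)
        if value is not None:
            return value
        value = self.shared.get(key)
        if value is None:
            value = self.load_from_db(key)
            self.db_hits += 1
            self.shared.set(key, value)
        self.local.set(key, value)  # refill the faster level
        return value
```

In this sketch a `None` result is treated as a miss, so empty database results are not cached; a production cache would usually store a sentinel for those.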

Bandwidth and CPU usage graphics

Phase 2

Horizontal Scaling Implementation

  1. Shifted from vertical scaling (adding a more powerful server) towards horizontal scaling (adding more servers).
  2. Implemented load balancing strategies for efficient traffic distribution.
  3. Configured a distributed storage system to synchronize data across multiple servers.

Instead of one overloaded server struggling with all tasks, multiple servers now share the load.
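To make the idea of sharing the load concrete, here is a minimal sketch of one common balancing strategy, least connections. The source does not say which strategy was used on this project, so this is an illustrative assumption; the class and the node names are hypothetical.

```python
from typing import Dict, Iterable


class LeastConnectionsBalancer:
    """Send each new request to the node with the fewest active connections."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self.active: Dict[str, int] = {node: 0 for node in nodes}
        if not self.active:
            raise ValueError("at least one node is required")

    def acquire(self) -> str:
        # On a tie, min() returns the first such node in insertion order.
        node = min(self.active, key=self.active.get)
        self.active[node] += 1
        return node

    def release(self, node: str) -> None:
        # Call when a request finishes so the node's count goes back down.
        if self.active[node] > 0:
            self.active[node] -= 1

    def add_node(self, node: str) -> None:
        # Horizontal scaling: a new node starts empty and picks up load at once.
        self.active.setdefault(node, 0)
```

A newly added node has zero active connections, so it receives the next requests until its load matches the others.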

These changes gave us a clear +400Mb/s of traffic for each node added to the project, at light CPU usage.
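The per-node gain lends itself to a rough capacity estimate. The sketch below assumes the gain stays linear at about 400Mb/s per node, which the source reports only as an observed figure; `nodes_to_add` is a hypothetical helper and the inputs in the usage note are illustrative.

```python
import math


def nodes_to_add(target_mbps: float, current_mbps: float,
                 gain_per_node_mbps: float = 400.0) -> int:
    """Estimate how many nodes to add to reach a target throughput.

    Assumes each added node contributes a fixed gain (linear scaling),
    which is a simplification of real-world behavior.
    """
    if gain_per_node_mbps <= 0:
        raise ValueError("gain_per_node_mbps must be positive")
    shortfall = target_mbps - current_mbps
    if shortfall <= 0:
        return 0  # current capacity already covers the target
    return math.ceil(shortfall / gain_per_node_mbps)
```

For example, `nodes_to_add(2000, 800)` estimates three additional nodes to go from 800Mb/s to 2Gb/s under this linear assumption.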

Phase 3

Proactive Monitoring & Support

  • Established real-time system monitoring to track CPU, memory, disk, and network usage.
  • Implemented predictive alerts, allowing preemptive issue resolution before failures occurred.
  • Maintained 24/7 proactive client support, offering continuous optimization suggestions.
  • In case of an incident, the new architecture allows us to quickly identify the problem area and act rapidly.
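
The predictive alerts in the list above can be sketched as a simple trend extrapolation. The source does not describe the monitoring stack or the prediction method, so this is an assumption: a least-squares line fitted to recent, evenly spaced samples of one metric, with hypothetical function names.

```python
from typing import Optional, Sequence


def minutes_until_threshold(samples: Sequence[float], threshold: float,
                            interval_minutes: float = 1.0) -> Optional[float]:
    """Extrapolate a least-squares trend line to the threshold.

    Returns 0.0 if the latest sample is already at or above the threshold,
    and None if there is too little data or the trend is flat or falling.
    """
    n = len(samples)
    if n == 0:
        return None
    if samples[-1] >= threshold:
        return 0.0
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    var = sum((i - mean_x) ** 2 for i in range(n))
    slope = cov / var
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Steps from the last sample until the fitted line reaches the threshold.
    steps = (threshold - intercept) / slope - (n - 1)
    return max(steps, 0.0) * interval_minutes


def should_alert(samples: Sequence[float], threshold: float,
                 horizon_minutes: float, interval_minutes: float = 1.0) -> bool:
    """Alert when the metric is projected to cross the threshold within the horizon."""
    eta = minutes_until_threshold(samples, threshold, interval_minutes)
    return eta is not None and eta <= horizon_minutes
```

With CPU samples rising steadily by 5 points per minute, the alert fires while usage is still well below the limit, leaving time to react before a failure.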

Client’s infrastructure architecture after (horizontal)

Seamless Deployment Without Downtime

One of the biggest challenges was deploying these changes without disrupting the platform. We executed the migration in stages, gradually shifting traffic to the new infrastructure while keeping the old system running. This ensured that users experienced zero downtime during the transition.
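
A gradual traffic shift of this kind can be sketched as percentage-based routing. The actual mechanism used in the migration is not described in the source, so the hashing scheme, the function names, and the "old"/"new" labels below are assumptions for illustration.

```python
import hashlib


def bucket_for(user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100) so routing does not
    flip between requests from the same user."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def route(user_id: str, new_share_percent: int) -> str:
    """Return "new" for users whose bucket falls within the current share
    of traffic assigned to the new infrastructure, otherwise "old"."""
    if not 0 <= new_share_percent <= 100:
        raise ValueError("new_share_percent must be between 0 and 100")
    return "new" if bucket_for(user_id) < new_share_percent else "old"
```

Raising `new_share_percent` step by step moves more users to the new infrastructure, and a user who has already been moved stays there as the share grows. Setting the share back to 0 returns all traffic to the old system, which is why the old system is kept running during the transition.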

Results & Benefits

  • Server efficiency increased, handling more traffic with the same hardware.
  • Scalability improved, allowing the business to grow without infrastructure bottlenecks.
  • Proactive monitoring prevented downtime, ensuring a seamless user experience.
  • Operational costs decreased as optimized resource usage reduced the need for constant hardware upgrades.

The project grew from a critical 270Mb/s at 100% CPU usage to a regular 1.2Gb/s at a comfortable 40% CPU usage, a more than fourfold increase in traffic.

Furthermore, during the most recent DDoS attack on the project, we confirmed that the service outage was caused by network capacity overflow, not by a lack of server compute capacity.

Bandwidth and CPU usage graphics

Conclusion

This case demonstrates how businesses can hit limitations due to improper architecture and how the right approach to scaling can prevent major problems. Our experience helps companies avoid these mistakes and build a resilient infrastructure.
