How One Retailer Served 2 Million Product Pages During a Major Outage

When a top U.S. retailer’s origin system went down, most shoppers never noticed.

Their product pages kept loading. Revenue kept flowing. And the engineering team could breathe a little easier.

Why? Harper was quietly running in the background, acting as a resilience layer built to handle exactly this kind of chaos.

In the video below, I sit down with Daniel Abbott, Technical Account Manager at Harper, to break down exactly how the system responded in real time, how it was architected to handle failure gracefully, and what lessons your team can take away to prepare for the next outage you can’t afford to have.

What Happened: The Quick Version

A major U.S. retailer experienced a critical outage at their origin layer.
Harper instantly took over product page delivery with 40 million pre-rendered pages in cache.
Over the span of one hour, Harper served 2 million requests at a P95 latency of approximately 2 milliseconds.
The result: 80% of traffic was preserved during the downtime, and most customers never knew there was an issue.
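
To put those numbers in perspective, here is a quick back-of-the-envelope calculation of the average request rate over the hour. It uses only the figures quoted above and assumes traffic was spread evenly, which real traffic rarely is, so treat it as a rough average rather than a measured peak.

```python
# Figures quoted in the incident summary above.
requests_served = 2_000_000
window_seconds = 60 * 60  # one hour

# Average rate, assuming (unrealistically) a perfectly even spread of traffic.
average_rps = requests_served / window_seconds
print(f"average load: {average_rps:.0f} requests/second")
```

That works out to roughly 556 requests per second on average. The peak rate during the outage is not stated and would have been higher.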

Why Harper Was Built for This

Full-Page HTML, Pre-Warmed and Ready: The retailer preloads critical product pages into Harper via periodic cache warm-ups. Each page is stored as fully rendered HTML, so when the origin is unavailable, Harper serves the exact same experience—no degraded templates, no missing content.
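
To make the idea concrete, here is a minimal sketch of the "serve the pre-rendered page when the origin is down" pattern. It is not Harper's API or the retailer's actual setup: `PageCache`, `OriginUnavailable`, `serve_page`, and the example path are hypothetical names, and the cache is a plain in-memory dictionary standing in for a real store.

```python
from typing import Callable, Dict, Optional, Tuple


class OriginUnavailable(Exception):
    """Raised by the (hypothetical) origin fetcher when the origin cannot respond."""


class PageCache:
    """In-memory stand-in for a store of fully rendered HTML pages."""

    def __init__(self) -> None:
        self._pages: Dict[str, str] = {}

    def warm(self, pages: Dict[str, str]) -> None:
        # Warm-up step: store (or refresh) fully rendered HTML keyed by URL path.
        self._pages.update(pages)

    def get(self, path: str) -> Optional[str]:
        return self._pages.get(path)


def serve_page(
    path: str, cache: PageCache, fetch_origin: Callable[[str], str]
) -> Tuple[int, str]:
    """Return (status, html): use the origin if it answers, else the cached page."""
    try:
        html = fetch_origin(path)
        cache.warm({path: html})  # keep the cached copy fresh while the origin is up
        return 200, html
    except OriginUnavailable:
        cached = cache.get(path)
        if cached is not None:
            return 200, cached  # same full page the shopper would normally see
        return 503, "<h1>Service temporarily unavailable</h1>"


def failing_origin(path: str) -> str:
    # Simulates the outage: every origin request fails.
    raise OriginUnavailable(path)


cache = PageCache()
cache.warm({"/products/123": "<html><body>Product 123</body></html>"})

status, html = serve_page("/products/123", cache, failing_origin)
missing_status, _ = serve_page("/products/999", cache, failing_origin)
```

In this sketch, a warmed page is still served with a 200 during the simulated outage, while a page that was never warmed falls through to a 503. That difference is why broad, regular warm-ups matter.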

Low Latency by Design: Harper’s fused stack approach means there are no extra network hops between compute, data, and cache. The result is consistently fast page delivery, even under heavy load. During the outage, Harper’s P95 processing latency stayed at roughly 2 milliseconds, meaning 95% of requests were processed in about that time or less.
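
If P95 is an unfamiliar metric, the sketch below shows one common way to compute it, the nearest-rank method. The latency samples are made up for illustration and are not data from the incident. The `percentile` helper is a hypothetical name, and monitoring tools may use slightly different interpolation rules.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample that at least pct% of samples do not exceed."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Illustrative latencies in milliseconds (invented for this example).
latencies_ms = [
    1.1, 1.3, 1.2, 1.8, 1.4, 1.6, 1.5, 1.9, 1.7, 2.0,
    1.2, 1.3, 1.1, 1.4, 1.6, 1.5, 1.8, 1.7, 1.9, 45.0,
]

p95 = percentile(latencies_ms, 95)
slowest = max(latencies_ms)
print(f"P95 = {p95} ms, slowest = {slowest} ms")
```

With these 20 samples, P95 is 2.0 ms even though one request took 45 ms. A P95 figure describes the experience of the bulk of requests and says nothing about the slowest 5%.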

Distributed, Redundant, and Built for Failover: For this deployment, Harper was running across six geographically distributed locations (2 nodes at each location). Each node holds its own copy of the dataset, enabling local reads and eliminating the need to route across regions. Smart load balancing ensures only the healthiest, fastest nodes handle traffic.
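
The post does not describe how Harper's load balancing decides which node is "healthiest" or "fastest", so the following is only a simplified sketch of the general idea: filter out unhealthy nodes, then prefer the one with the lowest recent latency. The `Node` fields, the `pick_node` function, and the latency values are all assumptions made for illustration. Only the topology (six locations, two nodes each) comes from the deployment described above.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    name: str
    location: str
    healthy: bool      # result of the latest health check (hypothetical field)
    latency_ms: float  # latest measured round-trip time (hypothetical field)


def pick_node(nodes: List[Node]) -> Optional[Node]:
    """Pick the healthy node with the lowest latency, or None if all are down."""
    candidates = [n for n in nodes if n.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda n: n.latency_ms)


# Six locations with two nodes each; latencies are invented for this example.
nodes = [
    Node(name=f"loc{loc}-node{n}", location=f"loc{loc}", healthy=True,
         latency_ms=10.0 * loc + n)
    for loc in range(1, 7)
    for n in range(1, 3)
]

first_choice = pick_node(nodes)

# Simulate the closest node failing its health check.
nodes[0] = replace(nodes[0], healthy=False)
after_failure = pick_node(nodes)

print(first_choice.name, "->", after_failure.name)
```

Because each node holds its own copy of the dataset, the second node at the same location can take over without a cross-region read. In this sketch that shows up as the selection moving from `loc1-node1` to `loc1-node2`.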

Real Business Impact: This wasn’t just an engineering win. By keeping product pages live, the retailer preserved revenue, avoided customer frustration, and gained time to resolve the underlying issue. That’s the true value of resilience at the infrastructure layer.

Lessons for Engineering Teams

This incident is a clear reminder that the cloud doesn't guarantee resilience. Without proper architecture, origin failures still result in downtime, lost sales, and reputational damage.

Harper can mitigate this risk by eliminating the origin dependency at the moment it matters most. Its fully distributed design, combined with tight coupling of compute, data, and cache and with full-page caching, allows teams to deliver consistent user experiences even when their backend systems are under stress.

Looking to Build Something Similar?

If your team supports a high-traffic retail website—or any application where uptime equals revenue—it's worth asking: how would your system handle a full origin failure?

If the answer isn’t crystal clear, Harper can help.

Book a demo—we’ll walk you through exactly how to implement this level of protection in your environment.
