When a top U.S. retailer’s origin system went down, 2 million shoppers never noticed.
Their product pages kept loading. Revenue kept flowing. And the engineering team could breathe a little easier.
Why? Harper was quietly running in the background, acting as a resilience layer built to handle exactly this kind of chaos.
In the video below, I sit down with Daniel Abbott, Technical Account Manager at Harper, to break down how the system responded in real time, how it was architected to handle failure gracefully, and what your team can take away to prepare for the next outage you can’t afford to have.
What Happened: The Quick Version
- A major U.S. retailer experienced a critical outage at their origin layer.
- Harper instantly took over product page delivery with 40 million pre-rendered pages in cache.
- Over the span of one hour, Harper served 2 million requests at a P95 latency of approximately 2 milliseconds.
- The result: 80% of traffic was preserved during the downtime, and most customers never knew there was an issue.
Why Harper Was Built for This
Full-Page HTML, Pre-Warmed and Ready: The retailer preloads critical product pages into Harper via periodic cache warm-ups. Each page is stored as fully rendered HTML, so when the origin is unavailable, Harper serves the exact same experience, with no degraded templates and no missing content.
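To make the idea concrete, here is a minimal sketch of a pre-warmed page cache with origin fallback. It is not Harper's API: `PageCache`, `OriginUnavailable`, and the `fetch_origin` callable are hypothetical names used only for illustration, and trying the origin before the cache is an assumption about request flow rather than a description of this deployment.

```python
from typing import Callable, Dict, Iterable, Optional


class OriginUnavailable(Exception):
    """Raised by the (hypothetical) origin fetcher when the origin cannot respond."""


class PageCache:
    """Stores fully rendered HTML per path and serves it when the origin fails."""

    def __init__(self, fetch_origin: Callable[[str], str]) -> None:
        self._fetch_origin = fetch_origin
        self._pages: Dict[str, str] = {}

    def warm(self, paths: Iterable[str]) -> None:
        # Periodic warm-up: render each critical page once and keep the HTML.
        for path in paths:
            self._pages[path] = self._fetch_origin(path)

    def get(self, path: str) -> Optional[str]:
        try:
            html = self._fetch_origin(path)
        except OriginUnavailable:
            # Origin is down: serve the stored copy (None if never warmed).
            return self._pages.get(path)
        # Origin is up: refresh the stored copy and return the fresh page.
        self._pages[path] = html
        return html
```

The point of storing whole pages is that the fallback path returns the same HTML a shopper would have received from the origin, so nothing about the page has to be rebuilt during the outage.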
Low Latency by Design: Harper’s fused stack approach means there are no extra network hops between compute, data, and cache. The result is consistently fast page delivery, even under heavy load. During the outage, Harper’s processing latency stayed at or below roughly 2 milliseconds for 95% of requests.
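For readers less familiar with percentile metrics, the sketch below shows one common way to compute a P95 from latency samples, the nearest-rank method. The function name and the sample values are illustrative assumptions, not measurements from the incident, and Harper's own metrics pipeline may calculate percentiles differently.

```python
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]
```

A P95 of 2 ms means 95% of the measured requests completed in 2 ms or less; the slowest 5% can be much slower without moving that number, which is why P95 is usually read alongside other percentiles.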
Distributed, Redundant, and Built for Failover: For this deployment, Harper ran across six geographically distributed locations with two nodes at each, 12 nodes in total. Each node holds its own copy of the dataset, so reads are served locally and never need to cross regions. Smart load balancing ensures only the healthiest, fastest nodes handle traffic.
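As a rough illustration of health-aware routing, the sketch below picks a node for a request by skipping unhealthy nodes, preferring one in the client's region, and breaking ties on measured latency. The `Node` fields, the region names, and the preference order are assumptions made for this example; the load-balancing logic actually used in this deployment is not described here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Node:
    name: str
    region: str
    healthy: bool
    latency_ms: float  # recent measured response time (hypothetical metric)


def pick_node(nodes: Iterable[Node], client_region: str) -> Optional[Node]:
    """Return the best healthy node, or None if every node is unhealthy."""
    candidates = [n for n in nodes if n.healthy]
    if not candidates:
        return None
    # Sort key: local-region nodes first (False sorts before True),
    # then the lowest latency within each group.
    return min(candidates, key=lambda n: (n.region != client_region, n.latency_ms))
```

Because every node holds a full copy of the data, any healthy node can answer any request, so losing a node or a whole location only changes which node is selected.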
Real Business Impact: This wasn’t just an engineering win. By keeping product pages live, the retailer preserved revenue, avoided customer frustration, and gained time to resolve the underlying issue. That’s the true value of resilience at the infrastructure layer.
Lessons for Engineering Teams
This incident is a clear reminder that the cloud doesn't guarantee resilience. Without proper architecture, origin failures still result in downtime, lost sales, and reputational damage.
Harper can mitigate this risk by removing the origin dependency at the moment it matters most. Its fully distributed design, combined with data kept alongside compute and smart page caching, allows teams to deliver consistent user experiences even when their backend systems are under stress.
Looking to Build Something Similar?
If your team supports a high-traffic retail website—or any application where uptime equals revenue—it's worth asking: how would your system handle a full origin failure?
If the answer isn’t crystal clear, Harper can help.
Book a demo—we’ll walk you through exactly how to implement this level of protection in your environment.