What DDoS Attacks Taught Me About Resilience Design

Surviving a 1 Tbps attack isn't about having bigger pipes. It's about graceful degradation, traffic shaping, and knowing what to sacrifice. Lessons from the trenches.

In October 2016, the attack on DNS provider Dyn knocked Twitter, Netflix, Reddit, and dozens of other major sites offline for many users. A reported 1.2 Tbps of traffic. A massive IoT botnet. Widespread outages.

The narrative was simple: attackers got bigger guns. The response was predictable: buy bigger pipes.

Both are wrong.

I've spent 20 years defending against DDoS attacks at Link11. We've seen terabit-scale assaults, state-sponsored campaigns, and everything in between. And here's what most people miss:

Resilience isn't about capacity. It's about design.

The Capacity Trap

The default playbook is seductive: more bandwidth, more scrubbing capacity, bigger CDN. Throw resources at the problem until it drowns.

This works—until it doesn't.

Because attackers don't play fair. They don't incrementally increase load. They spike, they randomize, they probe for weak points. And the economics are brutal:

You can't win an arms race when the opponent has better unit economics.

The Real Defense: Graceful Degradation

Here's the mental model shift that changed everything for us:

Don't design to absorb every attack. Design to survive it.

Survival doesn't mean zero impact. It means:

This requires intentional sacrifice.

What to Sacrifice (In Order)

1. Non-essential endpoints. Your /about page? Your blog? Let them go offline. Attackers love wasting your resources on low-value targets.

2. Anonymous traffic. Rate-limit aggressively for unauthenticated users. Authenticated customers get priority. In our experience, this alone filters out roughly 80% of bot traffic.

3. Resource-heavy features. That real-time dashboard with live WebSocket updates? Downgrade to polling. Complex search? Show cached results. High-res images? Serve thumbnails.

4. Regions and networks the attack comes from. If 90% of malicious traffic comes from three ASNs in Eastern Europe, and you don't serve customers there, block them. Temporarily. Surgically. This is not geofencing for fun; it's triage.

Every one of these decisions buys you time, reduces load, and preserves capacity for what matters.
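One way to make that ordering concrete is a small policy function that maps a degradation level to a decision per request. This is a minimal sketch under stated assumptions, not a description of any production system: the level names, the path prefixes, the `over_anonymous_quota` flag (assumed to be set by a rate limiter upstream), and the ASN numbers are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical degradation levels, one per sacrifice in the list above."""
    NORMAL = 0
    SHED_NONESSENTIAL = 1   # step 1: drop low-value endpoints
    LIMIT_ANONYMOUS = 2     # step 2: reject unauthenticated traffic over quota
    DEGRADE_HEAVY = 3       # step 3: serve cheaper variants of heavy features
    BLOCK_NETWORKS = 4      # step 4: drop traffic from flagged ASNs


@dataclass(frozen=True)
class Request:
    path: str
    authenticated: bool
    asn: int
    over_anonymous_quota: bool = False  # assumed to come from a rate limiter


# Illustrative values only; real lists would come from configuration.
NONESSENTIAL_PREFIXES = ("/about", "/blog")
HEAVY_PREFIXES = ("/dashboard/live", "/search")
BLOCKED_ASNS = frozenset({64496, 64497})  # ASNs reserved for documentation


def decide(req: Request, level: Level) -> str:
    """Return 'serve', 'degrade', or 'reject' for a request at a given level."""
    if level >= Level.BLOCK_NETWORKS and req.asn in BLOCKED_ASNS:
        return "reject"
    if level >= Level.SHED_NONESSENTIAL and req.path.startswith(NONESSENTIAL_PREFIXES):
        return "reject"
    if (level >= Level.LIMIT_ANONYMOUS
            and not req.authenticated
            and req.over_anonymous_quota):
        return "reject"
    if level >= Level.DEGRADE_HEAVY and req.path.startswith(HEAVY_PREFIXES):
        return "degrade"
    return "serve"
```

At `Level.NORMAL` everything is served. Each higher level adds one more sacrifice on top of the previous ones, so an operator (or an automated controller) only has to move a single dial during an incident.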

Traffic Shaping: The Underrated Weapon

Most defenses treat traffic like a binary: allow or block.

Traffic shaping treats it like a control surface.

Instead of:

You ask:

Then you apply priority queues:

This isn't DDoS mitigation. It's load-aware service degradation. And it works even when you can't distinguish attack traffic from legitimate spikes (Black Friday, viral post, product launch).
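To illustrate the priority-queue idea, here is a minimal sketch of a bounded queue that serves the most important work first and, once full, sheds the least important entry instead of blindly rejecting newcomers. The class name, the numeric priority scheme (lower number means more important), and the capacity are assumptions made for the example. It never needs to know whether a request is an attack or a legitimate spike.

```python
import heapq
import itertools


class PriorityAdmissionQueue:
    """Bounded queue that serves high-priority work first and sheds the lowest."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # Entries are (priority, sequence, item); lower priority number wins.
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: older entries rank higher

    def offer(self, priority: int, item):
        """Enqueue item. Return the item that was shed, or None if none was."""
        entry = (priority, next(self._seq), item)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return None
        worst = max(self._heap)  # O(n) scan; acceptable for a sketch
        if entry >= worst:
            return item  # the newcomer is the least important: shed it
        self._heap.remove(worst)
        heapq.heapify(self._heap)
        heapq.heappush(self._heap, entry)
        return worst[2]

    def poll(self):
        """Return the most important queued item, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

With a capacity of 2, queueing an anonymous request (priority 2) and a customer request (priority 1), then offering a checkout request (priority 0), sheds the anonymous one. A further anonymous request offered to the full queue is shed immediately.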

The Blast Radius Principle

Here's a failure mode I see constantly:

A single overwhelmed microservice (say, user authentication) takes down the entire platform. The DDoS didn't target your database—but your database died anyway because every service tried to reconnect simultaneously.

Blast radius is about containment:

When one component fails, the system doesn't collapse—it limps. Limping is underrated. Limping means you're still moving.
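Two common containment patterns for the reconnect-storm failure described above are jittered backoff and a circuit breaker. The sketch below is a simplified illustration, not a specific library's API: the function and class names, thresholds, and timings are all hypothetical defaults.

```python
import random
import time


def backoff_delay(attempt: int, base: float = 0.1, cap: float = 30.0) -> float:
    """Full-jitter exponential backoff.

    Returns a random delay between 0 and min(cap, base * 2**attempt) seconds,
    so clients that failed together do not all reconnect at the same instant.
    """
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


class CircuitBreaker:
    """Fails fast after repeated errors so callers stop hammering a sick dependency."""

    def __init__(self, failure_threshold: int = 5, reset_after: float = 10.0,
                 clock=time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock          # injectable so the breaker can be tested
        self._failures = 0
        self._opened_at = None       # None means the circuit is closed

    def allow(self) -> bool:
        """Return True if a call may be attempted right now."""
        if self._opened_at is None:
            return True
        # Once the cool-down has elapsed, calls are let through again;
        # the next recorded failure re-opens the circuit.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        """Close the circuit and forget past failures."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure and open the circuit once the threshold is reached."""
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

A caller would check `allow()` before each request to the dependency, report the outcome with `record_success()` or `record_failure()`, and sleep for `backoff_delay(attempt)` between retries. The service behind the breaker may still be down, but its callers stay up.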

The Indicator Problem

Most teams don't realize they're under attack until it's too late.

Why? Because they monitor the wrong things:

Better indicators:

You can't defend against what you can't see. And you can't see what you don't measure.

The Human Element

Here's the uncomfortable truth:

The best defenses I've seen weren't purely technical. They were organizational.

Because during a 1 Tbps attack at 3 a.m., you need:

Resilience is a team sport. If your architecture is bulletproof but your incident response is chaos, you'll still go down.

What This Means for You

You probably aren't defending against terabit-scale DDoS attacks. But the principles apply universally:

For SaaS founders: Build tiered service levels into your architecture from day one. Know what you'd sacrifice under load.

For infrastructure engineers: Stop optimizing for the happy path. Design for the worst day. What breaks first? What's your recovery plan?

For security teams: DDoS isn't just a network problem. It's an availability problem, a cost problem, and a business continuity problem. Own the whole stack.

The Bottom Line

Surviving a massive attack isn't about having the biggest pipes or the fanciest ML-powered mitigation.

It's about graceful degradation, traffic shaping, and knowing what to sacrifice.

Resilience is a design choice, not a budget line item.

And in a world where attacks are getting cheaper and easier to launch, that choice matters more than ever.

