Ep #49: The "Invisible" Architecture Patterns That Power Big Tech
Unpacking Strangler Fig, Bulkhead, and Sidecar — the underrated patterns behind Netflix, Uber, and Shopify’s resilience.
Breaking the complex System Design Components
By Amit Raghuvanshi | The Architect’s Notebook
🗓️ Oct 14, 2025 · Free Post
The most valuable architecture choices are the ones users never notice. When Netflix streams 4K video without buffering, when Uber matches you with a driver in seconds, or when Shopify processes millions of transactions during Black Friday, these seamless experiences are powered by architectural patterns that remain invisible to end users but are absolutely critical to the platforms we rely on daily.
Everyone talks about microservices, but the real competitive advantages lie in the lesser-discussed patterns that solve specific, complex problems. Today, we'll unpack three powerful architectural patterns through the lens of how industry giants actually implement them in production.
Why These Patterns Matter
Big Tech doesn’t just build systems; they build systems that scale to millions of users, handle unpredictable traffic, and evolve without downtime. The patterns discussed here are “invisible” because they focus on resilience, scalability, and maintainability - qualities users feel but never see. They’re the backbone of systems that “just work.” Let’s dive into each pattern, explore real-world applications, and see why they’re critical for premium platforms.
Pattern 1: The Strangler Fig - Netflix's Great Migration
What It Is
The Strangler Fig pattern, inspired by the plant that gradually envelops a tree, is about incrementally replacing legacy systems with modern ones. Instead of a risky “big bang” rewrite, new functionality is built alongside the old system, gradually taking over until the legacy system is retired.
The Challenge
How do you replace a monolithic system that serves 200+ million users without downtime?
Netflix faced this kind of problem when moving from its original monolithic application, built in the DVD-by-mail era, to a cloud-native streaming platform. Engineers created new microservices for specific functions (e.g., user authentication or recommendation engines). These services ran in parallel with the old system, with a proxy layer directing each request to either the old or the new component. Over time, the legacy system was “strangled” as more functionality shifted to the cloud.
Why It’s Invisible
Users never noticed the transition. Netflix’s streaming service remained uninterrupted, delivering shows while engineers rewrote critical systems.
How It Works
Flow Explanation:
User sends a request (auth, recommendation, or billing).
Proxy/Router intercepts and routes it accordingly:
To New Service A for authentication.
To Legacy System for recommendations.
To New Service B for billing.
Proxy/Router returns the chosen backend's response to the user, who never sees which system served it.
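The routing flow above can be sketched in a few lines. This is a minimal illustration, not Netflix's actual proxy: the `StranglerRouter` class, the handler functions, and the path prefixes are all hypothetical names chosen for the example. The idea it shows is that migrated paths go to new services, and everything else falls through to the legacy system.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]


def legacy_monolith(request: dict) -> dict:
    """Hypothetical stand-in for the legacy system; serves anything not yet migrated."""
    return {"served_by": "legacy", "path": request["path"]}


def new_auth_service(request: dict) -> dict:
    """Hypothetical new microservice that has taken over authentication."""
    return {"served_by": "new-auth", "path": request["path"]}


def new_billing_service(request: dict) -> dict:
    """Hypothetical new microservice that has taken over billing."""
    return {"served_by": "new-billing", "path": request["path"]}


class StranglerRouter:
    """Routes by path prefix; unmigrated paths fall through to the legacy handler."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, prefix: str, handler: Handler) -> None:
        """Start sending requests under `prefix` to the new handler."""
        self._migrated[prefix] = handler

    def rollback(self, prefix: str) -> None:
        """Send requests under `prefix` back to the legacy system."""
        self._migrated.pop(prefix, None)

    def handle(self, request: dict) -> dict:
        path = request["path"]
        # The longest matching prefix wins, so /billing/invoices can move
        # independently of the rest of /billing.
        for prefix in sorted(self._migrated, key=len, reverse=True):
            if path.startswith(prefix):
                return self._migrated[prefix](request)
        return self._legacy(request)


router = StranglerRouter(legacy_monolith)
router.migrate("/auth", new_auth_service)
router.migrate("/billing", new_billing_service)
```

Because the routing table is the only thing that changes, rolling a feature back is a single `router.rollback("/auth")` call rather than a redeploy of either system.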
Netflix's Implementation:
Phase 1: Route new user registrations to microservices while existing users remain on the monolith
Phase 2: Gradually migrate user segments (geographic regions, subscription tiers)
Phase 3: Shadow mode - run both systems in parallel, comparing outputs
Phase 4: Full cutover once confidence in the new system reaches 99.9%
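Two of these phases lend themselves to a short sketch: segment-based migration (Phase 2) and shadow mode (Phase 3). The code below is an assumption-laden illustration rather than Netflix's implementation. `in_rollout` and `shadow_call` are hypothetical helpers, and hashing the user ID into 100 buckets is just one common way to pick a stable segment.

```python
import hashlib
from typing import Any, Callable, List, Tuple


def in_rollout(user_id: str, percent: int) -> bool:
    """Deterministically place a user in one of 100 buckets.

    The same user always lands in the same bucket, so raising `percent`
    only ever adds users to the migrated segment.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def shadow_call(
    request: Any,
    legacy: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    mismatches: List[Tuple[Any, Any, Any]],
) -> Any:
    """Serve the legacy result; run the candidate on the side and record differences."""
    expected = legacy(request)
    try:
        actual = candidate(request)
        if actual != expected:
            mismatches.append((request, expected, actual))
    except Exception as exc:
        # A crash in the new system must never reach the user during shadow mode.
        mismatches.append((request, expected, exc))
    return expected
```

In a real system the candidate call would usually run asynchronously so it cannot add latency. The mismatch log is what turns "confidence" into something measurable before cutover.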
The genius lies in the gradual transition. Users never experienced service disruption while Netflix systematically replaced its legacy system piece by piece. Today, Netflix reportedly operates over 700 microservices, an architecture that grew out of this methodical Strangler Fig approach.
Key Insight: The pattern isn't just about technical migration - it's about risk management. Netflix could validate each piece independently and roll back instantly if issues arose.
Pattern 2: The Bulkhead — Uber's Fault Isolation Strategy
What It Is
Inspired by ship compartments that prevent sinking if one section floods, the Bulkhead pattern isolates components of a system to limit the impact of failures. Each component (or service) operates independently, so a failure in one doesn’t cascade to others.
The Problem
In a system handling millions of ride requests, how do you prevent one failing component from bringing down the entire platform?
Uber uses the Bulkhead pattern to isolate critical services like payment processing and ride matching. If the payment service experiences an outage, the ride-matching service remains unaffected, ensuring users can still book rides. Uber allocates separate resource pools (e.g., servers or database connections) for each service to avoid contention.
Why It’s Invisible
Users don’t notice when one part of Uber’s system fails because the app keeps functioning for unaffected features.
Uber's Service Isolation
Each service has isolated resources, preventing cascading failures.
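One simple way to give each service its own compartment is a per-dependency concurrency limit. The sketch below assumes a hypothetical `Bulkhead` class built on Python's `threading.BoundedSemaphore`. It is not Uber's implementation, and the slot counts are arbitrary.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")


class BulkheadFull(Exception):
    """Raised when a compartment has no free slots."""


class Bulkhead:
    """Caps how many calls to one dependency may run at the same time."""

    def __init__(self, name: str, max_concurrent: int) -> None:
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, fn: Callable[[], T]) -> T:
        # Non-blocking acquire: reject immediately instead of queueing
        # behind a slow dependency and tying up the caller's thread.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFull(f"{self.name} bulkhead is full")
        try:
            return fn()
        finally:
            self._slots.release()


# Hypothetical compartments: a stalled payments backend can exhaust only
# its own two slots, never the ones reserved for ride matching.
payments = Bulkhead("payments", max_concurrent=2)
matching = Bulkhead("matching", max_concurrent=2)
```

Rejecting when full is a deliberate choice here. A caller that gets `BulkheadFull` can degrade gracefully (for example, by retrying checkout later), while ride matching keeps its own capacity untouched.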
Real-World Impact
During a 2019 incident, Uber's payment processing experienced high latency due to a database issue. Because of bulkhead isolation:
Payment failures affected only the payment service
Users could still request rides and see driver locations
The matching algorithm continued working normally
Only checkout was temporarily degraded
Implementation Details
Resource Isolation: Separate CPU, memory, and database connection pools
Thread Pool Separation: Different thread pools for different request types
Circuit Breakers: Stop calling a failing dependency (failing fast or falling back) once error rates exceed thresholds, then retry after a cool-down
Rate Limiting: Per-service limits to prevent resource exhaustion
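Of the mechanisms listed above, the circuit breaker is the easiest to misread, so here is a minimal sketch. It is simplified on purpose: the hypothetical `CircuitBreaker` class trips on a count of consecutive failures rather than a true error rate, it is not thread-safe, and the thresholds are illustrative.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpen(Exception):
    """Raised instead of calling a dependency that was recently unhealthy."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock  # injectable so the cool-down can be tested
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                raise CircuitOpen("failing fast; dependency recently unhealthy")
            # Cool-down elapsed: half-open, so let one trial call through.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._threshold:
                # Open the circuit, or re-open it after a failed trial call.
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, callers get `CircuitOpen` immediately instead of waiting on timeouts, which is what keeps a slow database from consuming the threads that other requests need.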
This pattern is why Uber maintains 99.99% uptime for core ride-matching functionality even when peripheral services experience issues.
Architecture is less about the answers you give and more about the questions you ask.
This is the fundamental shift most developers miss.
Junior thinking: “What technology should I use?”
Architectural thinking: “What problem am I actually solving, and what are the constraints?”
Junior thinking: “How do I make this scalable?”
Architectural thinking: “What does ‘scalable’ mean in this context, and what am I willing to trade?”
Junior thinking: “What’s the best practice?”
Architectural thinking: “Best for whom, under what conditions, and at what cost?”
The questions architects ask:
What happens when this fails?
How will this system evolve over the next 2-5 years?
What assumptions am I making that might be wrong?
Who is impacted by this decision, and how?
What are we optimizing for: speed, cost, flexibility, simplicity?
These aren’t questions you ask once during design. They’re continuous inquiry as systems evolve, requirements change, and new information emerges.
The architect’s value isn’t knowing everything. It’s recognizing what matters most, what can go wrong, and how to build systems that adapt.
Questions reveal gaps in understanding. Answers often hide them.
The best architects remain students, asking “why?” every single day—even about decisions they made themselves.
What question are you not asking about your current architecture?
I have recently published a book that dives into real scenarios, mental models, and reflective exercises that help you cultivate the Architect’s Mindset. It teaches how to manage complexity not just technically but mentally, by mastering focus, empathy, and clarity under pressure.
Ultimately, The Architect’s Mindset isn’t just a technical guide; it’s a guide to thinking like an architect: balancing logic with intuition, structure with creativity, and technology with human understanding.
📘 Download Options:
🔹 Get a Free Sample
🔹 Buy the Full Book
Pattern 3: The Sidecar — Shopify's Observability Revolution
What It Is
The Sidecar pattern involves deploying a helper process alongside a main application to handle cross-cutting concerns like logging, monitoring, or security. The sidecar shares the main app's deployment context (e.g., a separate container in the same Kubernetes pod, sharing its network and volumes) and lifecycle, but is loosely coupled to the app's code.
The Challenge
How do you add cross-cutting concerns (logging, monitoring, security) to hundreds of services without modifying each service's code?
Shopify uses the Sidecar pattern in its microservices architecture to manage logging and metrics. For example, a Sidecar container might handle API request logging for a payment service, sending data to a centralized monitoring system without burdening the main application.
Why It’s Invisible
Users experience consistent performance because the Sidecar offloads tasks like logging, leaving the main service free to focus on core functionality.
Shopify's Sidecar Architecture:
Shopify's Use Cases:
Observability Sidecar: Collects metrics, logs, and traces from all services
Security Sidecar: Handles TLS termination, certificate rotation, and authentication
Cache Sidecar: Provides a local Redis/Memcached interface so the main application is not coupled to the cache topology
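To make the observability case concrete, here is a minimal sketch of a log-shipping sidecar. It assumes the main application appends lines to a log file on a volume shared with the sidecar. `ship_new_lines`, `run_sidecar`, and the `sink` callback (a stand-in for a client that sends records to a central collector) are hypothetical names, and this is not Shopify's tooling.

```python
import time
from typing import Callable


def ship_new_lines(
    log_path: str,
    offset: int,
    sink: Callable[[dict], None],
    service: str,
) -> int:
    """Forward complete lines appended since `offset`; return the new offset."""
    with open(log_path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # EOF or a partially written line; retry on the next poll
            offset = f.tell()
            text = line.decode("utf-8", errors="replace").strip()
            if text:
                # Enrich the record here so the app never has to know
                # about the monitoring system's format.
                sink({"service": service, "message": text})
    return offset


def run_sidecar(
    log_path: str,
    sink: Callable[[dict], None],
    service: str,
    poll_seconds: float = 1.0,
) -> None:
    """Poll loop the sidecar process would run next to the main application."""
    offset = 0
    while True:
        offset = ship_new_lines(log_path, offset, sink, service)
        time.sleep(poll_seconds)
```

The main service only writes plain log lines. Swapping the monitoring backend means changing the `sink` passed to the sidecar, with no change to the application's code.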
Black Friday Success Story:
During 2023's Black Friday peak (handling 3.5M requests per minute), Shopify's sidecar pattern enabled:
Real-time performance monitoring across 1000+ services
Automatic traffic shaping when services approached limits
Zero-downtime certificate rotations
Consistent security policies without developer intervention
The pattern's power lies in separation of concerns - developers focus purely on business logic while operational concerns are handled transparently.
Why Trade-Offs Matter in System Design
Designing any modern system means making tough choices. You can’t have it all—performance, scalability, reliability, and cost are often at odds with each other. Skilled architects know it’s not about chasing perfection, but about understanding which trade-offs make sense for your specific context and goals.
Trade-offs force you to get clear about your priorities. They make you ask what’s truly important for your project. Is consistency more valuable than availability? Does keeping things simple outweigh the need for flexibility? Anytime you boost one quality, chances are you’ll have to give up some ground elsewhere - grasping that reality is key to building systems that work well for your needs.
In practice, successful systems are rarely built by picking the single “best” technology or pattern. Instead, they succeed through thoughtful compromises that match your business objectives, stage of growth, and all the practical limits you’re facing.
If you want to explore these ideas in more depth, my books on system design break down the core trade-offs—from real-world examples to straightforward frameworks that can guide your architectural choices.
👉 Get the books here:
System Design Trade-Offs: Volume 1 (Free)
System Design Trade-Offs Volume 2 (Free Sample)
Why These Patterns Are “Premium”
These patterns aren’t just technical choices; they’re strategic investments. They enable Big Tech to:
Scale effortlessly: Handle millions of users without crashing.
Evolve safely: Modernize systems without disrupting service.
Stay resilient: Isolate failures to keep apps running.
Optimize performance: Deliver fast, reliable experiences.
For premium platforms, these patterns are non-negotiable. They’re the difference between a system that buckles under pressure and one that feels like magic to users.
Closing Thoughts
The beauty of these patterns is that they’re invisible to the end user - nobody notices when Netflix migrates a service off its legacy system without disruption, when Uber shields its core services from cascading failures, or when Shopify offloads logging and security into sidecars. Things just work.
In this first part, we unpacked three underrated patterns (Strangler Fig, Bulkhead, and Sidecar) that quietly power some of the world’s most resilient systems.
But these are just the beginning.
👉 In the next parts, we’ll explore more powerful patterns like CQRS, Event Sourcing, and beyond - the hidden strategies that help Big Tech scale while keeping experiences seamless.
Stay tuned for the next post in The Architect’s Notebook.
– Amit Raghuvanshi
Author, The Architect’s Notebook







