Byte-Sized Design
Slack Rebuilt Notifications for Millions of Users
Slack rebuilt its notification system from scratch. Here's the architecture decision that made it possible without breaking millions of users.
Mar 30
•
Byte-Sized Design
11
2
How Uber Killed Hours-Old Data (And Why Your Batch Jobs Are a Liability)
What they found when they finally did the math on stale data.
Mar 24
•
Byte-Sized Design
20
3
GitHub’s Elasticsearch Problem Was Seven Years in the Making. Here’s How They Finally Fixed It
Why the right fix wasn't available until now, and what they did in the meantime.
Mar 16
•
Byte-Sized Design
11
How a 12-Word Issue Title Owned 4,000 Developer Machines
TL;DR One GitHub issue title.
Mar 7
•
Byte-Sized Design
13
2
February 2026
Meta Used LLMs to Build Tests That Are Supposed to Fail
The tests that were built to fail
Feb 24
•
Byte-Sized Design
7
The Architect Is Not Being Replaced. The Architect Is Being Redefined.
And if you don't notice the difference, you'll end up on the wrong side of it.
Feb 17
•
Byte-Sized Design
22
4
Inside Uber’s 350PB Data Lake: The Distcp Rewrite That 5x’d Performance
How Uber scaled data replication from 250TB to 1PB per day by optimizing Apache Distcp, cutting latency 90%, 5x’ing capacity, and migrating 306PB to…
Feb 11
•
Byte-Sized Design
9
3
Knowing When to Stop Engineering: Airbnb’s Hardest Lesson
Tens of millions of lines of code. 700 services. 450 data pipelines. 4.5 years of migration. And the thing that could have cut the timeline in half was…
Feb 1
•
Byte-Sized Design
7
January 2026
What OpenAI Understood About Postgres That Most Teams Ignore
How One Postgres Instance Powers 800 Million ChatGPT Users
Jan 24
•
Byte-Sized Design
17
1
2
How Datadog taught an AI to investigate high-severity incidents
How Datadog built an AI SRE agent that investigates high-severity production incidents by forming hypotheses, following causal signals, and reasoning…
Jan 20
•
Byte-Sized Design
6
1
Processing Trillions: How Lyft's Feature Store Grew by 12%, 33% Faster, With Zero Custom DSLs
Lyft's Feature Store handles 1T+ operations daily, cut P95 latency 33%, and grew callers 25% YoY. How they built ML infrastructure engineers actually…
Jan 12
•
Byte-Sized Design
12
1
December 2025
The Trillion-Event Platform: How Spotify Built a Data System That Doesn't Break
TL;DR Spotify processes 1.4 trillion data points daily.
Dec 27, 2025
•
Byte-Sized Design
22
3