Posts

2021

Gateway Avalanche Crisis: How Synchronous Redis Calls Nearly Brought Down Our System

·1662 words·8 mins
A deep dive into a production incident where our Spring Cloud Gateway experienced cascading failures due to blocking Redis operations. Learn how synchronous API calls in a reactive environment can cause thread starvation, leading to health check failures and system-wide avalanches, and how we fixed it using async patterns.
A Peculiar Bug Hunt: When Exceptions Lose Their Voice

·1195 words·6 mins
A deep dive into a puzzling production issue where exception logs disappeared, leading us through Arthas debugging, Log4j2 internals, and the discovery that an exception’s getMessage() method was itself throwing an exception because of a Guava-Guice version incompatibility.
A Hidden Production Issue Discovered Through SQL Optimization

·1101 words·6 mins
When our operations team brought us a complex SQL query that was taking forever to execute, we thought it was just a performance issue. Little did we know, this investigation would uncover a deeply hidden character encoding mismatch that had been silently causing full table scans in our production database.
Troubleshooting an SSL Performance Bottleneck Using JFR

·395 words·2 mins
An in-depth analysis of a microservice performance issue with CPU spikes and database connection anomalies. Through JFR profiling, we discovered that the root cause was Java's SecureRandom blocking on /dev/random, and we provide a solution that switches to /dev/urandom.
JDK Tough Way - 1. A Comprehensive Guide to Thread Local Allocation Buffers

·9380 words·45 mins
A deep dive into JVM’s Thread Local Allocation Buffer (TLAB) mechanism, covering design principles, implementation details, performance optimization, and source code analysis. Learn how TLAB improves memory allocation efficiency in multi-threaded environments and master TLAB tuning techniques.

2020