the system grows for a while (increasing λ), or we reduce the number of servers (decreasing μ) to realize our efficiency gains. That causes ρ to pop back up, and latency returns to where it was. This often leads people to be disappointed by the long-term effects of efficiency work, and sometimes to under-invest in it.
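This dynamic can be sketched with the M/M/1 mean-latency formula, W = 1/(μ − λ). This is an idealization of the system in the post, not a model of any real service, and the numbers below are made up for illustration:

```python
# Mean latency in an M/M/1 queue: W = 1 / (mu - lam), stable only when
# rho = lam / mu < 1.
def mean_latency(lam, mu):
    assert lam < mu, "queue is unstable when rho >= 1"
    return 1.0 / (mu - lam)

lam, mu = 8.0, 10.0                    # rho = 0.8
before = mean_latency(lam, mu)         # mean latency 0.5

# A 25% efficiency win raises the service rate to 12.5, and latency drops...
after_win = mean_latency(lam, 12.5)

# ...until capacity is trimmed to bring rho back to 0.8, and latency is
# right back where it started.
after_trim = mean_latency(lam, lam / 0.8)
```

Holding ρ constant while improving μ is exactly the "realize our efficiency gains" step: the latency win is spent on capacity, not kept.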
The system we consider above is a gross simplification, both in its complexity and in the kinds of systems it represents. Streaming systems will behave differently. Systems with backpressure will behave differently. Systems whose clients busy-loop will behave differently. These dynamics are common, though, and worth looking out for.
The bottom line is that high-percentile latency is a bad way to measure efficiency, but a good (leading) indicator of pending overload. If you must use latency to measure efficiency, use mean (avg) latency. Yes, average latency4.
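To illustrate the distinction, here's a minimal sketch, again assuming an M/M/1 queue (where response time is exponentially distributed with rate μ − λ, so the mean is 1/(μ − λ) and the 99th percentile is ln(100)/(μ − λ)):

```python
import math

def mm1_mean_latency(rho, service_time=1.0):
    mu = 1.0 / service_time      # service rate
    lam = rho * mu               # arrival rate at utilization rho
    return 1.0 / (mu - lam)

def mm1_p99_latency(rho, service_time=1.0):
    # p99 of an exponential with rate (mu - lam) is ln(100) / (mu - lam).
    return math.log(100) * mm1_mean_latency(rho, service_time)

# p99 climbs steeply as utilization approaches 1: a leading overload signal.
p99_calm = mm1_p99_latency(0.5)
p99_hot = mm1_p99_latency(0.99)

# At fixed utilization, mean latency tracks per-request work linearly, which
# is what makes it usable as an efficiency metric.
full_work = mm1_mean_latency(0.8, service_time=1.0)
halved_work = mm1_mean_latency(0.8, service_time=0.5)
```

In this toy model, halving the per-request work exactly halves mean latency at constant ρ, while the p99 at ρ = 0.99 is about fifty times the p99 at ρ = 0.5.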
Latency Sneaks Up On You - Marc's Blog