These are the three main lists of golden signals today:

- From the Google SRE book: Latency, Traffic, Errors, Saturation
- USE Method (from Brendan Gregg): Utilization, Saturation, Errors
- RED Method (from Tom Wilkie): Rate, Errors, and Duration

You can see the overlap. USE is about resources with an internal view, while RED is about requests and real work, with an external view.
Focus on these signals:

- Request Rate: requests per second.
- Error Rate: errors per second.
- Latency: response time, including queue/wait time, in milliseconds.
- Saturation: how overloaded something is, measured directly by things like queue depth (or sometimes concurrency). It becomes non-zero when the system gets saturated.
- Utilization: how busy the resource or system is. Usually expressed as 0–100% and most useful for predictions (saturation is usually more useful for alerts).

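
As a rough illustration of how the five signals above relate to raw measurements, here is a minimal sketch that derives them for one observation window. The `Request` record, the `golden_signals` function, and its parameters (`window_s`, `queue_depth`, `busy_s`) are hypothetical names chosen for this example, not part of any monitoring tool. It also assumes latency is summarized as a nearest-rank 95th percentile, which is only one of several reasonable choices.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    duration_ms: float  # response time, including queue/wait time
    ok: bool            # False if the request ended in an error


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list; pct in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def golden_signals(requests, window_s, queue_depth, busy_s):
    """Summarize one observation window (all inputs are assumed to be
    collected elsewhere; this only does the arithmetic).

    requests    -- Request records completed during the window
    window_s    -- window length in seconds
    queue_depth -- work items waiting at sample time (saturation)
    busy_s      -- seconds the resource spent doing work in the window
    """
    errors = sum(1 for r in requests if not r.ok)
    durations = [r.duration_ms for r in requests]
    return {
        "request_rate_per_s": len(requests) / window_s,
        "error_rate_per_s": errors / window_s,
        "latency_p95_ms": percentile(durations, 95) if durations else None,
        "saturation_queue_depth": queue_depth,      # non-zero once work queues up
        "utilization_pct": 100 * busy_s / window_s,  # 0-100%
    }


if __name__ == "__main__":
    sample = [Request(10.0, True)] * 8 + [Request(200.0, False)] * 2
    print(golden_signals(sample, window_s=10, queue_depth=0, busy_s=2.5))
```

Rate, errors, and latency come from the request stream (the RED, external view), while saturation and utilization come from the resource itself (the USE, internal view), which is why the sketch takes them as separate inputs.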
Monitoring SRE's Golden Signals
from www.infoq.com