50: Service Level Agreements/Indicators/Objectives - Percentiles

In software reliability and availability, there are usually vague descriptions. When we say we offer good performance, what does it mean? This is why we use percentiles. Let’s think about response times, often used in SLOs, along with percentiles. In SLAs, we define X ms response time in the 95th percentile. To calculate this, put all response times and sort them from fastest to slowest. Pick the 95th one if you have 100 response times. (950th in 1000, and so on) The response time at that exact point defines your performance in the 95th percentile. For example, you can guarantee to serve in 200ms in the 95th percentile. The same applies to 99th and 99.9th percentiles. Pick accordingly from the sorted list. We also talk about the mean in response times. The mean response time is actually the 50th percentile. To find it, pick the middle one in the sorted list, and it shows the 50th percentile. For example, 200ms in the 95th percentile translates into 95% of all requests served faster than 200 milliseconds.


This note is mentioned in:

39. 56d4.

If you're unfamiliar with Zettelkasten: These notes are atomic. The aim is to have one idea in a note. The connections between notes are as important as the notes themselves.

Reply via email

or comment below.