SLIs, SLOs, SLAs, DevOps and SRE

DevOps

Dev and Ops teams used to work in silos, with barriers between developers and operations. Developers focused on coding and testing features, whereas operations focused on maintenance and stability. DevOps is a set of practices and a culture meant to break down this barrier.

SRE

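SRE stands for Site Reliability Engineering (or Site Reliability Engineer, for the role). If DevOps is the concept, SRE is the prescription for putting DevOps concepts into practice. SREs define what availability means for a service, the level of availability it should meet, and the plan to follow in case of failure.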

DevOps can be broken down into 5 key areas, each paired below with the way SREs put it into practice.

  1. Reduce organizational silos; increase collaboration and throughput.
    SREs use the same tools as the Dev team and collaborate with them while triaging.
  2. Accept failure as normal, since humans make mistakes.
    SREs set the service-level targets (SLOs) and enforce them.
  3. Implement gradual changes rather than everything at once, which makes rollbacks easier.
    SREs use canary groups, internal-employee (beta) releases, and blue-green deployments.
  4. Leverage tooling and automation.
    SREs eliminate toil (manual, repetitive operational work that can be automated).
  5. Measure everything.
    SREs measure toil and the reliability of systems.
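
As a rough illustration of item 3 (gradual changes), here is a minimal sketch of how a canary group might be chosen. It assumes users are identified by a string ID and that hashing the ID into 100 buckets is an acceptable way to pick the group; the function names and the percentages are hypothetical, not something a specific tool prescribes.

```python
import hashlib


def rollout_bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def serves_new_version(user_id: str, canary_percent: int) -> bool:
    """Return True if this user falls inside the canary group."""
    return rollout_bucket(user_id) < canary_percent


if __name__ == "__main__":
    # Hypothetical user IDs, used only to show the group growing step by step.
    users = [f"user-{i}" for i in range(1000)]
    for percent in (1, 10, 50, 100):
        count = sum(serves_new_version(u, percent) for u in users)
        print(f"{percent}% rollout: {count} of {len(users)} users on the new version")
```

Because the bucket depends only on the user ID, a user who is in the 10% group stays in the group when the rollout widens to 50%, and rolling back is just a matter of lowering the percentage again.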

SLI, SLO and SLA

SLIs drive SLOs, which in turn form the basis of SLAs.

Service Level Indicators (SLIs)

These are metrics measured over time, such as request latency, batch throughput, and failures per request.
E.g. 1) 3 9’s (99Xth percentile) latency of homepage requests over the past 5 minutes < 400 ms
        2) 2 failures per day when latency is less than 2 seconds.
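
To make the first example concrete, here is a minimal sketch of computing a percentile-latency SLI over a sliding window. It assumes each request is recorded as a hypothetical `(timestamp_seconds, latency_ms)` tuple and uses the nearest-rank definition of a percentile; real monitoring systems often estimate percentiles from histograms instead.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples in the window")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def latency_sli(samples, now, window_seconds, pct):
    """Percentile latency (ms) of the requests seen in the last window_seconds."""
    recent = [latency for (ts, latency) in samples if now - window_seconds <= ts <= now]
    return percentile(recent, pct)


if __name__ == "__main__":
    # Hypothetical samples: one request every 3 seconds for 10 minutes.
    samples = [(t, 120 + (t % 7) * 40) for t in range(0, 600, 3)]
    value = latency_sli(samples, now=600, window_seconds=300, pct=99)
    print(f"99th percentile latency over the past 5 minutes: {value} ms")
    print("within the 400 ms threshold" if value < 400 else "above the 400 ms threshold")
```

The measured value is the SLI; comparing it against a threshold such as 400 ms is the step that leads into an SLO.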

Service Level Objectives (SLOs)

An SLO is a binding target for a collection of SLIs. SLOs can set both upper and lower bounds.
100% availability is not a realistic target, because it demands far more resources and budget. So the SLO defines what percentage of availability is actually required.
E.g. the 92nd percentile landing page SLI will succeed 99.XX % of the time over the trailing year.
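
The gap between the SLO and 100% is often called the error budget. The sketch below shows the arithmetic under stated assumptions: the 99.9% target and the 30-day window are illustrative values chosen here (the example above leaves the exact figure open), and the function names are hypothetical.

```python
def error_budget_minutes(slo_percent: float, window_days: int) -> float:
    """Minutes of unavailability an SLO tolerates over the window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (100.0 - slo_percent) / 100.0


def availability_percent(good_events: int, total_events: int) -> float:
    """Measured availability as the share of successful events."""
    if total_events == 0:
        raise ValueError("no events recorded")
    return 100.0 * good_events / total_events


def budget_remaining(slo_percent: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent (negative means the SLO is breached)."""
    allowed_bad = total_events * (100.0 - slo_percent) / 100.0
    if allowed_bad == 0:
        raise ValueError("a 100% SLO leaves no error budget")
    actual_bad = total_events - good_events
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # Assumed target and window, for illustration only.
    print(f"Budget: {error_budget_minutes(99.9, 30):.1f} minutes per 30 days")
    print(f"Availability: {availability_percent(999_400, 1_000_000):.2f}%")
    print(f"Budget left: {budget_remaining(99.9, 999_400, 1_000_000):.0%}")
```

A 99.9% target over 30 days allows roughly 43 minutes of unavailability, which is why every extra nine gets so much more expensive to deliver.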

Service Level Agreements (SLAs)

An SLA is a business agreement between a customer and a service provider, typically based on SLOs. It is usually communicated to customers as a promise. The consequences of breaking that promise are spelled out as well, typically as compensation such as free service.
In layman's terms: pizza gets delivered in 30 minutes. If not, the pizza is free 🙂
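
The "if not, it's free" part of an SLA can be sketched as a simple lookup from measured availability to compensation. The tiers and credit percentages below are entirely hypothetical and are not taken from any real provider's agreement.

```python
# Hypothetical credit tiers: (minimum availability %, credit as % of the monthly bill).
# Ordered from the best availability to the worst.
CREDIT_TIERS = [(99.9, 0), (99.0, 10), (95.0, 25)]

# Hypothetical credit when availability falls below every tier: the service is free.
FLOOR_CREDIT = 100


def service_credit_percent(measured_availability: float) -> int:
    """Return the credit owed to the customer for the measured availability."""
    for min_availability, credit in CREDIT_TIERS:
        if measured_availability >= min_availability:
            return credit
    return FLOOR_CREDIT


if __name__ == "__main__":
    for availability in (99.95, 99.5, 97.0, 90.0):
        credit = service_credit_percent(availability)
        print(f"{availability}% availability: {credit}% credit")
```

In practice the SLA threshold is usually set looser than the internal SLO, so the team gets a warning before the customer is owed anything.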

Junaid Ahmed

Junaid Ahmed is an enthusiastic Cybersecurity Manager and Azure Architect with a strong focus on cloud security, identity management, and passwordless adoption. He is passionate about helping organizations simplify their security approach, strengthen trust in the cloud, and embrace innovative technologies that drive both resilience and growth.
