Site Reliability Engineering – Crash Course

  • Bring Reliability to Your Site
  • Practical Hands-on Workshops
  • Toil Is the Enemy
  • Live Demos!
  • Chaos Engineering

A condensed Site Reliability Engineering course that helps busy engineers pick up the foundational elements of SRE as practiced in the real world.

Practical, real-world scenarios show how organizations can adapt and scale quickly as the business demands.

Overview

Organizations always face the challenges of scale. Even when failures are accepted as the norm, the challenge remains to scale and ensure reliability while staying agile.

This one-day course offers a condensed version of Site Reliability Engineering concepts and terminology, helping students understand and communicate the fundamentals of SRE.

It is designed to fit within the schedule of an on-duty engineer in a busy, growing environment.

The course covers the concepts and fundamentals of the Site Reliability Engineering Foundation in the context of real-world challenges.

Prerequisites

Knowledge

Students of this class are expected to have:

  • Basic knowledge of site operations
  • Basic computer operation skills, such as managing files

Technology

Depending on the delivery method of this course, students should have:

  • A workstation with an internet browser (such as Chrome, Edge, or Safari)
  • A reliable internet connection that is not blocked by firewalls (ideally a workstation that is not behind a corporate firewall)

Objectives

Students who complete this course should have the skills and knowledge to:

  • Describe the principles and practices of SRE
  • Understand Service Level Objectives and Error Budgets
  • Understand toil and how to reduce it
  • Describe and research automation tools and techniques
  • Apply automation across the CI/CD pipeline
  • Understand and measure SRE practices and meet Service Level Objectives
  • Adopt SRE, or lead its adoption, within the organization
  • Understand the relationship between SRE and other frameworks

Audience

This course equips students with the skills and knowledge to operationalize CI/CD pipelines with confidence in their daily work, and to capitalize on the organization's investment in running a reliable business.

  • Scrum Masters: Understand and publish meaningful observability metrics to the team’s performance dashboards
  • Product Owners: Understand the business value of incident response time, and keep business continuity and reliability metrics up to standard
  • Application Architects: Understand the impact and importance of SLOs and SLIs, and how to architect observable applications
  • Software Developers: Understand the significance and relevance of metrics and SLIs
  • Security Architects: Ensure SRE practices are secure and comply with security guidelines
  • Site Reliability Engineers (SREs): Learn the skills and knowledge to develop realistic SLOs and SLIs, and to discover and reduce toil within the CI/CD pipeline
  • Systems Architects: Build infrastructure environments that follow SRE and security practices
  • Help Desk staff: Understand SLOs and the ticket response mechanism, and reduce toil

Timeline

The Site Reliability Engineering (SRE) Foundation certification course is a one-day course with comprehensive coverage of Site Reliability Engineering practices.

The following are guidelines for the instructor to pace the day with the students; they are subject to change based on the students' preferences.

Breaks during the day follow the 106 rule, with one break every 45-60 minutes.

*The 106 rule refers to the capacity of human memory for new factual elements: 106 facts before the memory can be reused.

Course Curriculum

Overview of Site Reliability Engineering (SRE)

  • Origins of SRE
  • SRE Principles
  • SRE Practices

Measuring SRE practices

  • SLO – Service Level Objectives
  • EB – Error Budgets
  • EBP – Error Budget Policies
  • SLI – Service Level Indicators
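
As a taste of the arithmetic behind these terms, here is a minimal sketch of how an error budget follows from an availability SLO. It is an illustration only, not course material: the function names, the 30-day window, and the sample numbers are assumptions chosen for the example.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of unavailability an availability SLO permits over the window.

    The error budget is simply the complement of the SLO: a 99.9% target
    leaves 0.1% of the window as budget. The 30-day default is an assumption.
    """
    if not 0 < slo_percent < 100:
        raise ValueError("slo_percent must be strictly between 0 and 100")
    total_minutes = window_days * 24 * 60
    return total_minutes * (100 - slo_percent) / 100


def budget_remaining_fraction(
    slo_percent: float, bad_minutes: float, window_days: int = 30
) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_percent, window_days)
    return (budget - bad_minutes) / budget


if __name__ == "__main__":
    # A 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(99.9), 1))
    # After a hypothetical 10.8-minute outage, about 75% of the budget is left.
    print(round(budget_remaining_fraction(99.9, 10.8), 2))
```

An error budget policy would then state what the team does as this fraction approaches zero, for example pausing feature releases in favor of reliability work.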

SRE Challenges

  • Organization Growth
  • Toil
  • Monitoring and Observability

SRE Automation

  • Automation Focus areas
  • Automation Tools
  • Automation as a solution to toil

Continuous Learning - Reliability

  • Reliability Measures
  • Anti-fragility and Chaos Engineering
  • Disaster Recovery and Business Continuity

Organizational Impact of SRE

  • Patterns of Adopting SRE
  • Blameless postmortems
  • Developing SLO, SLI, and Observability

SRE Calendar

Scroll through the months and choose the schedule that suits you, then send us the standard request form to register.

SRE Crash Course Registration

Submit this form to request registration or to inquire about a course. We offer professional advice to upskill IT professionals in the areas of architecture and software development.

For the love of learning, we welcome inquiries and design courses for you!

Courses run on demand and are custom designed. Please send us a note and one of our team members will reach out to you.