Overview
Organizations constantly face the challenges of scale. Even when failures are accepted as the norm, the challenge remains to scale and ensure reliability while staying agile.
This one-day course offers a condensed introduction to Site Reliability Engineering (SRE) concepts and terminology, helping students understand and communicate the fundamentals of SRE.
The course is designed to fit the busy schedule of on-duty engineers in fast-growing environments.
It covers the concepts and fundamentals of the Site Reliability Engineering Foundation in the context of real-world challenges.
Prerequisites
Knowledge
Students in this class are expected to have:
- Basic knowledge of site operations
- Basic computer operation skills, such as managing files
Technology
Depending on the delivery method of this course, students should have:
- A workstation with a modern Internet browser, such as Chrome, Edge, or Safari
- A stable Internet connection without blocking firewalls (ideally a workstation that is not behind a corporate firewall)
Objectives
Students who complete this course should build the skills and knowledge that allow them to
Audience
This course is designed to equip students with the skills and knowledge to carry out their daily tasks well, operate CI/CD pipelines with confidence, and capitalize on the organization's investment in running a reliable business.
- Scrum Masters: Understand and publish meaningful observability metrics to the team's performance dashboards
- Product Owners: Understand the business value of incident response time, ensure business continuity, and keep reliability metrics up to par
- Application Architects: Understand the impact and importance of SLOs and SLIs, and how to architect observable applications
- Software Developers: Understand the significance and relevance of metrics and SLIs
- Security Architects: Ensure that SRE practices are secure and comply with security guidelines
- Site Reliability Engineers (SREs): Learn the skills and knowledge to develop realistic SLOs and SLIs, and discover and reduce toil within the CI/CD pipeline
- Systems Architects: Build infrastructure environments in line with SRE and security practices
- Help Desk Staff: Understand SLOs and the ticket response mechanism, and reduce toil
Timeline
The Site Reliability Engineering (SRE) Foundation certification course is a one-day course that provides comprehensive coverage of Site Reliability Engineering practices.
The following are guidelines for the instructor to pace the day with the students, subject to change based on student preference.
Breaks during the day follow the 106 rule*, with one break every 45-60 minutes.
* The 106 rule refers to the capacity of human memory to learn new factual elements, given as 106 facts before the memory can be reused.







Course Curriculum
Overview of Site Reliability Engineering (SRE)
- Origins of SRE
- SRE Principles
- SRE Practices
Measuring SRE practices
- SLO – Service Level Objectives
- EB – Error Budgets
- EBP – Error Budget Policies
- SLI – Service Level Indicators
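As a rough illustration of how the measures listed above relate to each other, the following Python sketch computes an availability SLI, the share of the error budget that remains against an SLO, and a simple error budget policy decision. The function names, the sample numbers, and the "freeze releases when the budget is spent" policy are assumptions made for this example, not part of the course material.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """SLI: fraction of events (e.g. requests) that were served successfully."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def error_budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget left for the measurement window.

    1.0 means the budget is untouched, 0.0 means it is fully spent,
    and a negative value means the SLO has been missed.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    allowed_failures = (1.0 - slo_target) * total_events  # the error budget
    actual_failures = total_events - good_events
    return 1.0 - actual_failures / allowed_failures


def budget_policy_action(remaining: float) -> str:
    """Hypothetical error budget policy: stop feature releases once the budget is spent."""
    return "freeze releases" if remaining <= 0.0 else "continue releases"


if __name__ == "__main__":
    # Assumed sample window: 1,000,000 requests, 500 failed, SLO of 99.9%.
    slo = 0.999
    good, total = 999_500, 1_000_000
    remaining = error_budget_remaining(slo, good, total)
    print(f"SLI: {availability_sli(good, total):.4%}")
    print(f"Error budget remaining: {remaining:.1%}")
    print(f"Policy action: {budget_policy_action(remaining)}")
```

In this sample window the SLO allows about 1,000 failed requests, 500 have occurred, and so roughly half of the error budget is still available.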

SRE Challenges
- Organization Growth
- Toil
- Monitoring and Observability
SRE Automation
- Automation Focus areas
- Automation Tools
- Automation as a solution to toil
Continuous Learning - Reliability
- Reliability Measures
- Anti-fragility and Chaos Engineering
- Disaster Recovery and Business Continuity
Organizational Impact of SRE
- Patterns for Adopting SRE
- Blameless postmortems
- Developing SLO, SLI, and Observability
SRE Calendar
Scroll through the months, choose the schedule that suits you, and send us a standard request form to register.