Site Reliability Engineering

“Site reliability engineering”, or SRE for short, originated at Google around 2003. According to its originator, Ben Treynor, SRE is “what happens when a software engineer is tasked with what used to be called operations.”

An SRE is expected to spend up to 50% of their time on “ops”-related work (working on issues, being on call, and so on), while the rest goes to developing new features and automating tasks. You therefore need to be good at system administration, coding, and automation.

DevOps appeared around 2008, about five years later. Now that DevOps is popular, SRE can be seen as “a specific implementation of DevOps with some idiosyncratic extensions”.

Read more: Site Reliability Engineering on Wikipedia

Read a book or two about SRE for free:

On the other hand, not everyone agrees that SRE is the way forward. Read about why error budgets might not always be a good idea when building resilient systems and teams:
