Starting an SRE Team? Stay Away From Uptime

December 9, 2021

A good SRE will tell you your service is never down. A great SRE will tell you that’s not what you should be measuring. In fact, they’ll tell you their job is customer service.

Site Reliability Engineering (SRE) has grown immensely popular, with many of the world’s largest tech companies, such as Netflix, LinkedIn and Airbnb, employing SRE teams to keep their systems reliable and scalable.

Along the way, SRE has become one of the most sought-after engineering roles in tech.

The role is traditionally understood as ensuring that services are reliable and unbroken, but reliability and uptime aren’t perfect metrics. Perhaps what organizations should be asking themselves is what their customers think of their service. 

Wandering down to your engineering department and asking your SRE team about customer satisfaction is a good place to start. 

Their answer just might surprise you. 

History of SRE

In practice, Site Reliability Engineering has been around for a while. In the past, its functions were covered by roles with names like production ops, disaster recovery, testing or monitoring. The rise of cloud computing created a need for more engineers in production. The complexity only grew as more organizations moved from monolithic architectures to distributed microservices.

Modern Site Reliability Engineering originated at Google in 2003 with the work of Benjamin Treynor, who is seen as the “father” of what we now simply call SRE. Treynor, who coined the term, was a software engineer placed in charge of running a production team. With the goal of making Google’s website as reliable and serviceable as possible, he asked that his team spend half their time on operations tasks so they could better understand software in production. This team would become the first-ever SRE team.

“Ben Treynor said, I’m paraphrasing, ‘[SRE] is essentially like throwing a software engineer at an operations problem’, right? Because you come from that developer mindset, that design and, you know, you think about all of these things. So think about it as a developer but apply it to an operational type of problem.” – Brian Murphy on the Dev Interrupted podcast at 4:26

Why not uptime?

So why shouldn’t you be too concerned about your uptime metrics? In reality, SRE can mean different things to different teams, but at its core it’s about making sure your service is reliable. After all, it’s right there in the name.

Because of this, many people assume that uptime is the most valuable metric for SRE teams. That is flawed logic.

For instance, an app can be “up”, but if it’s incredibly slow or its users don’t find it practically useful, it might as well be down. Simply keeping the lights on isn’t good enough, and uptime alone doesn’t account for things like degraded performance or pages that fail to load.
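To make the gap between “up” and “useful” concrete, here is a minimal Python sketch. It is an illustration only: the Request record, the two function names and the 1000 ms threshold are assumptions made for this example, not something taken from the podcast or from any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One user request (hypothetical record shape for this example)."""
    latency_ms: float
    ok: bool  # True if the response was not a server error


def uptime_ratio(probe_results: list[bool]) -> float:
    """Fraction of health-check probes that got a response at all."""
    return sum(probe_results) / len(probe_results)


def good_request_ratio(requests: list[Request], slow_ms: float) -> float:
    """Fraction of requests that succeeded AND finished within slow_ms."""
    good = sum(1 for r in requests if r.ok and r.latency_ms <= slow_ms)
    return good / len(requests)


# Every probe succeeded, so by the uptime measure the service is fully "up".
probes = [True] * 10

# Real traffic over the same period: several requests were very slow or failed.
requests = [
    Request(120, True), Request(95, True), Request(4200, True),
    Request(110, True), Request(3900, True), Request(80, False),
    Request(130, True), Request(5100, True), Request(105, True),
    Request(2500, False),
]

print(uptime_ratio(probes))                          # share of probes answered
print(good_request_ratio(requests, slow_ms=1000))    # share of requests users would call "good"
```

With this made-up data every probe succeeds, yet only half of the requests are both successful and fast enough. That is the kind of difference an uptime number hides.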

It may sound counterintuitive, but SRE teams are in the customer service business. Customer happiness is the most important metric to pay attention to. If your service is running well and your customers are happy, then your SRE team is doing a good job. If your service is up and your customers aren’t happy, then your SRE team needs to reevaluate.

A more holistic approach is to view your service in terms of health. 

The Four Golden Signals

As defined by Google, these are the four golden signals of SRE. If these can be managed effectively, then you probably have a healthy system. 

  • Latency: The time it takes to service a request.
  • Traffic: A measure of the demand being placed on your system. E.g. how many messages are you getting, and can you handle them?
  • Errors: The rate of requests that fail. E.g. an HTTP server that is returning a lot of 500s is bad.
  • Saturation: A way of thinking about the capacity of your system. E.g. is your service being overwhelmed?
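As a rough sketch of how the four signals above might be computed from raw request data, consider the Python below. The Sample record, the golden_signals function and its parameters are hypothetical names chosen for this example. Treating saturation as in-flight requests divided by a known concurrency limit is one simplifying assumption among many possible ones, as is using the 95th percentile for latency.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One completed request (hypothetical record shape)."""
    latency_ms: float
    status: int  # HTTP status code


def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def golden_signals(samples, window_s, in_flight, capacity):
    """Summarise one measurement window as the four golden signals."""
    latencies = [s.latency_ms for s in samples]
    server_errors = sum(1 for s in samples if s.status >= 500)
    return {
        "latency_p95_ms": percentile(latencies, 95),  # latency
        "traffic_rps": len(samples) / window_s,       # traffic
        "error_rate": server_errors / len(samples),   # errors
        "saturation": in_flight / capacity,           # saturation (assumed model)
    }


# Twenty made-up requests seen in a 10-second window; two returned HTTP 500.
samples = [
    Sample(latency_ms=100 * i, status=500 if i % 10 == 0 else 200)
    for i in range(1, 21)
]
signals = golden_signals(samples, window_s=10, in_flight=40, capacity=50)
print(signals)
```

In a real system these numbers would normally come from a metrics pipeline rather than an in-memory list, but the arithmetic behind each signal stays this simple.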

Establishing system health

“The best way to get started is just measuring stuff, you know, just getting the baseline of what’s healthy, what’s not healthy, what looks like health, and then you can start working from there.” – Brian Murphy on the Dev Interrupted podcast at 10:49

It can be difficult to know whether or not your organization should consider forming an SRE team, or what your next steps are if you’ve already made the decision. 

Again, think of your decision in terms of a holistic approach, not just your uptime. If you have high uptime, that’s fantastic, but what you should be establishing is a benchmark. 

Using the four golden signals to guide you, establish what you think a healthy system should look like and set your benchmark. Keep measuring over time and you will begin to see the areas that are good or require more work. 
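One simple way to turn “keep measuring” into a benchmark is to summarise past readings of a signal and compare new readings against that summary. The Python sketch below assumes a daily 95th-percentile latency reading and a three-standard-deviation tolerance. Both are illustrative choices, and build_baseline and is_unhealthy are hypothetical helper names.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarise past readings of one signal as a mean and standard deviation."""
    return {"mean": mean(history), "stdev": stdev(history)}


def is_unhealthy(value, baseline, tolerance=3.0):
    """Flag a reading more than `tolerance` standard deviations above the mean."""
    return value > baseline["mean"] + tolerance * baseline["stdev"]


# Made-up p95 latency readings (ms) from five ordinary days.
history = [200, 210, 190, 205, 195]
baseline = build_baseline(history)

print(is_unhealthy(215, baseline))  # a reading close to normal
print(is_unhealthy(260, baseline))  # a reading well above the baseline
```

The same pattern applies to traffic, error rate and saturation. What counts as a sensible tolerance depends on how noisy each signal is for your service.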

These measures will help inform your future decisions, such as whether your organization is ready to roll out new features or expand your service.

Critically, the health you establish provides insight into customer happiness. If things look good, you probably have happy customers.

Internal customers

When done right, SREs aren’t just making customers happy, they’re making the lives of developers easier too. Nothing is worse than having to stop work because there’s a problem in production. Good SRE teams can shield dev teams by focusing on major hotspots.

If the fires are being managed before they are out of control, it allows developers to keep pushing out features. It even gives them the freedom to keep breaking things, if necessary!

When things do break, or require a slowdown, a dialogue can occur. A good SRE understands that the developer who wrote a piece of code understands it better than anyone. The model for good internal customer service is an SRE who brings in a developer, gives them ownership of the code they created, and offers to help them fix it.

Happy customers are the best customers

Whether you already have an SRE team or are thinking about forming one, remember to think beyond the engineering – think about the customer. 

Ask yourself if your customers are happy and if you would describe your service as healthy. Remember to think about your own teams as well. Your developers will thank you for it.

_____________________

This article is based on an episode of Dev Interrupted. The only podcast made exclusively for dev team leaders, it features expert guests from around the world to explore strategy and day-to-day topics ranging from dev team metrics to accelerating delivery. 


