This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
“For years, our retrospectives didn’t feel like they were working; we worked to make it more blameless. No more people standing up in front of room, explaining how things didn’t go right; We’ve taken shame and humiliation out of the process, so they can tell stories and make sure it doesn’t happen again.” — @kimberly_h_johnson How have you managed the balance between improving psychological safety whilst also holding people responsible for their work? It sometimes feels like role accountabilities and the desire to make sure we don't fail our members gives people a justification to continue with their non-psychologically safe behaviours.
For me, I think we have to care and we have to demonstrate that too. We can't make others care, hence we can't make them be responsible. They have to do that for themselves. Everything we do should be a pull system anyway. Just my 2 cents.
Leading by example - I share my failures and learning too in as public of a way as I can. I also think words matter. I say "learn fast" vs. "fail fast" and have shifted the organization away from using post mortem to using learning reviews. It takes time to build that trust. I also like to talk about "honoring reality" and try to encourage senior leaders to honor reality vs. judging it.
By flipping the meaning of accountability to mean “able to give a detailed account of their perspective in/around an event, without fear of reprisal” rather than “falling on a sword”, it places the focus on how rich and detailed the account can be. Ironically, this can fulfill the desire you mention, which is the commitment to stakeholders. Public “lashings” don’t demonstrate commitment to genuinely learn from an event, it only implies that the org/leadership believe that all mistakes come from individual motivation, and that shame is an effective deterrent to making future mistakes.
@ckissler "have shifted the organization away from using post mortem to using learning reviews". I like that. Learning Reviews. I will be stealing that! 🙂
@kimberley.wilson2 We have a fantastic talk from the Suncor team later this morning, who will talk about their safety culture, important to them because of the hazardous nature of so much of their work — from @jroa @lideluca and John Hill. (I forgot to ask them the obvious question about whether that emphasis on physical safety encourages psychological safety — it seems implicit, but let’s find out for sure from them!)
☕ Good morning and ready for day 2️⃣ !!!
Good morning, everyone! I’m excited about the talks this morning!!
Love the spotlight on goals and community engagement throughout the year 😊
Read this last night and thought of this community @genek : If in your working hours you make the work your end, you will presently find yourself all unawares inside the only circle in your profession that really matters. You will be one of the sound craftsmen, and other sound craftsmen will know it. This group of craftsmen will by no means coincide with the Inner Ring or the Important People or the People in the Know. It will not shape that professional policy or work up that professional influence which fights for the profession as a whole against the public: nor will it lead to those periodic scandals and crises which the Inner Ring produces. But it will do those things which that profession exists to do and will in the long run be responsible for all the respect which that profession in fact enjoys and which the speeches and advertisements cannot maintain. https://www.lewissociety.org/innerring/
One of the questions I always like is: do you understand what the next level of leadership is working on, and what they see? Sometimes things that seem odd at a lower level make sense from that next perspective. Emphasizes the need for the full team in retros.
So delighted you’re here, Joey! Looking forward to your talk shortly!
“wrong side of the rope” fits perfectly with that Inner Ring essay ^ 🙂
Can't wait for Dr Westrum later. Has anyone else listened to Gene's podcast with him at least 4 times? 🤣
I was thinking this morning that I need to reply to the tweet asking about favorite Idealcast episode with Westrum as my choice.
Yes i listened to the Idealcasts @ronwestrum did with @genek many times, pausing frequently to try and think over what I’d heard
And what a huge difference to put effort into removing that rope, instead of trying to keep a barrier of exclusivity!
(I listened to the 2-hour episode with @jtf for the 4th time two days ago. 🙂)
I forgot to thank @sophie.weston129 on Twitter, which is what prompted it.
I too find myself getting new insight from talks on 3rd or 4th hearing 👍
How did you connect with people (like other speakers and leadership team) at the other conferences @genek ? Superrrr curious
Worst case, if we don’t get to talk this week, email me and let’s talk! (Maybe during one of the networking sessions?)
Yes please! I’m sure others would benefit as well! I’ll follow up with you
@nickeggleston - I’ve also found that people tend to find Gene… be it a city park, grocery store, remote beach, crowded bar. 😆
So a quick trip to the store can end up taking 5 hours…
Wanna meetup in Gather during networking session this morning? I can share some of the stories, but MVK is right — it’s about 50/50 me finding them, and them finding me. After Phoenix Project came out, it became much easier, if you can imagine. Hope you’re having a great week here!
I think that I will be using the phrase "Wrong side of the rope" from now on.
Go here to see: https://doesvirtual.itrevolution.com/video-library-upgrade
Excited to hear about future live events.. while I am grateful for the virtual option, I am very much looking forward to getting back together in person hopefully next year!
So many of the new amazing great case studies came from this community!
Excited about the new version release, too bad we couldn't get a different color cover
☀️ Starting us off this morning is the team from Suncor – @jroa @lideluca and @johnhill presenting, Continuous Delivery at Suncor's Digital Bay; Mining Maintenance Meets DevOps ☀️
I am happy that I managed to adjust my day rhythm enough to make yesterday possible. Tomorrow is going to be hard because of a dinner with the team.
Thank you for presenting, @jroa @lideluca and @johnhill, to tell us about the amazing Suncor culture and story!
Suncor Journey to Zero: https://www.suncor.com/en-CA/sustainability/health-and-safety/journey-to-zero
This really speaks to me. Customers I serve at my org work in Nuclear Power Plants!
I forgot to ask you a question, @jroa @lideluca @johnhill Does the cultural emphasis on physical safety make it easier to achieve psychological safety? And if so, how? THANK YOU!
How and how often do you (suncor) measure psychological safety?
Great question, @genek I would suggest yes but it took time... Yes, in that the teams embrace the safety aspects as a common and important goal.
It took time b/c as we went through our safety journey, there was pressure within layers of management that were still measured on throughput. That created a tension that made it difficult at times.
Huge logistical problems not visible until people stop and ask questions
I agree. Over time - yes. At first it can make people feel uncomfortable as it has been a shift in culture. But over time, the principles of looking out for one another create psychological safety.
Many have observed that the famous Toyota Andon cord requires a tremendous amount of psychological safety — when installed at GM plants, people were famously yelled at and punished b/c it jeopardized production targets.
This really speaks to me. The people I serve at my org work in Nuclear Power Plants
@jroa, I think that your response about moving safety to a priority position is true for any culture change. In my experience, moving Agile principles to a priority position meets with the exact same resistance. Thank you
While we don’t measure it per se… I think metrics like “near miss” reporting are a sign of psychological safety. People feel comfortable stepping forward and saying, “Something almost happened and I want to learn from it.”
Having spent most of my career in ecommerce, where people’s lives aren’t typically on the line, this is a great story to share. It’s reflective of other types of culture transformation we are all a part of.
(For people interested in this topic, you may like the wonderful section in Dr. Amy Edmondson’s book The Fearless Organization of a South African mining company, and a similar journey to zero on the job accidents.)
Awesome about safety culture! How do you maintain safety while moving forward purposefully without 'analysis paralysis'?
@nickeggleston Nice question! We have a few different measures... organizationally, we look to measures such as "Great Places to Work" surveys and similar trust indices. At the Team level, our Agile COE will use a Team Pulse check (similar to Spotify's) that measures psych safety, team direction/vision, and how the team inter-operates. We look to the scrum master/team lead and agile coach to work through the check (anonymous) and come up with actions as a result.
(It occurs to me that much of Dr. Sidney Dekker’s work also involves dangerous mining and resource extraction companies, showing how dangerous work in those industries is.)
I think one challenge is to make sure that safety doesn't impact development for non-safety issues. Like the DoD mandating operational testing after users were involved in design and fielding. Need to make sure safety acceptance is in the team, and not a separate silo.
I wish we could do a touch-a-truck event with a mining truck.
@lwdavis There is always a risk of becoming too embroiled in all the "what ifs". However, generally speaking, the mix within a team will create a balance of safety and urgency to keep moving forward. We also continue to have progress/performance reports that can be used to highlight areas that might need a nudge.
"increased wrench time" and "uptime of trucks" - a very different POV
…when critical mission goals can’t be done with existing COTS software…
@tiny.mpetersii Good call out. Another lens is that when you apply safety across the board, it creates a muscle memory that becomes a reflex (you just do it). Then, the teams help temper the "right" amount of safety given the context of the situation. As always (so important), context matters!
I’m hearing digital andon cord and standardized work. I was expecting Alcoa and I’m hearing Toyota. 🙂
@kboth_does Checklists are great and serve a very useful purpose. When they are blindly followed, people miss the intent behind the checklist. We've suffered from that checklist mentality and it is something to watch for. I've personally seen IT teams look to get the checkmark instead of understanding and accomplishing what the checkmark actually represents. 😞
Good point - I've heard this expressed as the difference between a check off list (going through the motions) vs. a list of things to check (applying thinking and care)
How do you address the training with such disparate view points of technology?
"How do we implement a build into a culture that is predominantly buy?" Going through this right now
We recently went through this as well. Be careful of the pendulum swinging too far the other way. Now we're having conversations about, "Why are we building this now commodity software instead of buy it and invest our people in more differentiating software?"
Completely agree with you there! I have worked in places that were too far down the buy route. We constantly weigh things as we discuss how to push more build mindset to make sure they're things that differentiate us!
and of course, if we build, we'd ideally like to open source and give back to the community
Now we're trying to talk more about where our systems fall in Geoffrey Moore's Core/Context model: http://strategictoolkits.com/strategic-concepts/core-and-context/
The nature of the current way of working reminds me of how air tanker refueling scheduling was done in USAF, before Kessel Run Project Jigsaw. Lots of whiteboards for scheduling and tracking. Such an exciting project, @lideluca!
@leahb @alex @mvk842 - I notice that this talk is not in the Video Library (yet). Can we expect this to appear later today? Are you aware of any talks that will not be part of the Video Library (due to NDA agreements)? Thinking of @jason.cox's awesome Disney talks from a few years back.
Loved the @jason.cox talks that could not be published! So happy I was there to hear about the "Grumpy" server and how "Darth Vader" is a Pathological Leader!!! lol!
@mring we’ll be publishing this talk after the AM plenary block is over. I have my finger ready on the publish button.
And yes, to my knowledge, all talks will be published.
@mring Plenary talks are published at the end of the morning/afternoon. 🙂
@jroa you mentioned affecting culture change at Suncor, do you have tips on how to effect culture change?
@jen Training is tricky. It costs $ and takes capacity out of the system but it also drives productivity, engagement, etc. What we're trying (jury is out) is having digital fluency/literacy target courses at the org level (e.g. all of IT and functions). This includes areas such as Cloud, DevOps, agile, etc. Specific needs are meant to be identified at the team / project level. Then, a decision about how to best close the gap is made. It could be mentoring, buying (course or talent) or alternate.
Wonder if a talk about small teams vs large teams would be good to have in future DOES - like how to empower the small team, also there is a need for large team, so how to make the large more effective, so really a compare and contrast and then followed by when to try each...
Is it really a small vs large or a single team vs a collection of teams. Seems like if the team is too large, you start losing efficiency
I think it speaks to where there is spare capacity. Larger teams seem to have more gray area, so you can start by adding smaller items more easily, but with small teams you may just have to look at starting a new team that can overlay on an existing one.
@johnhill Ha! Flatland! https://www.amazon.com/Flatland-Romance-Dimensions-Edwin-Abbott-ebook/dp/B00KVTS1T2
And now, we welcome @jpetoff and @cleng from Google, here to present, How Google SRE and Developers Work Together
Please welcome two phenomenal experts in SRE in theory and practice, Dr. @jpetoff and Dr. @cleng!
@jroa how did your teams get introduced to DevOps and obtain buy in from the top?
@gitty.rosenfeld It's cliche but culture change is tough. I think you need to have support from leadership (we do, a la John Hill and others) and a motivated front line (leaders need to convey the why). From there, I sincerely believe it occurs at the team level. Coaches, mentors and leaders need to reinforce the culture we are moving to and share stories of successes. When old culture behaviors come forward, it's best to safely (no pun intended) provide that feedback in a non-threatening fashion. It's important for leaders to realize the change takes time. An org may have spent decades entrenching current behaviors. You can't flip that overnight. It takes persistence and reinforcement.
(Dr. @cleng's Ph.D. thesis was on distributed systems. 🙂 How fitting that he’s now doing SRE, like how some of the best firefighters can think like arsonists. 😆)
“Google: at 2B+ LoC: might be the most complex integrated systems humanity ever created”
Not super relevant to this talk, but you can read my PhD thesis at http://tuprints.ulb.tu-darmstadt.de/3078/
@jroa it appears Suncor's DevOps journey started with one business segment versus the whole technology org transforming simultaneously - is this accurate or was it big bang?
Love the quote - very empowering for the software engineers
“BubbleStorm” — this sounds like one of those wonderful projects that torture people trying to preserve CAP theorem objectives, @cleng!!
“We now have 3000 SREs” — all reporting to VP 24/7 Engineering, Ben Treynor-Sloss.
@nickeggleston We're still early in our maturity for DevOps. Our DevOps COE (small team initially), defined the capabilities that make up DevOps at Suncor and then prioritized which ones we felt we should emphasize. From there, we began the change and comms work to educate teams and leaders about the why . For the leaders, the messaging was a reinforcement of quality (repeatability via automation), resilience (being able to restore or deploy in a rapid fashion) and speed (doing 1000's of tests in minutes via tech). This would allow for shorter feedback loops with the customers to help the time to value equation. HTH. If not, hit me up and we can talk more.
the chemistry background for @jpetoff makes sense to me. a discipline that created the “things I won’t work with” series seems good background for SRE
We will be publishing a podcast with @jpetoff later on this month in case you're interested
BTW @genek did the talk start early? I dialed in at X:12 and we were already a few slides in...
I’m so sorry! Not sure — @annp would know. But all good, we’re all so happy that you’re here! (@annp, can you send the link to the complete video? 🙏 )
The functional nature of SRE at Google is so interesting to me — and I’m so excited that @jpetoff and @cleng are sharing some of the “engagement models” between the product and SRE orgs, including the economics.
@jtf STEM subjects like chemistry are a great foundation for SRE: it's like applying the scientific method in a pressure cooker 🙂
I completely agree! physics-chemistry double major myself. went from an interest in computational physics/chemistry into software.
Very nice! I did synthetic chemistry myself so software is a bit of a departure, but I love it!
@kristin.valters TBH, it's neither. We have targeted work going on across all the business units/areas. The messaging and vision has been communicated to everyone (org wide) with an emphasis on digital (some groups don't do anything tech so the message gets lost with them). Then, based on pull (or leader push, on occasion), we focus on a # of small teams in that area. Allows for tailoring of technologies used and business practices employed. Early in the agile transformation work I was leading, I quickly discovered you get the most "bang for the dollar" by going to the teams and areas that want you vs. selling to the groups that don't. The laggards will come along eventually. HTH
“Both sides must agree to start the relationship; either side can end it.” (I love the way @jpetoff describes this.)
“SRE is a scarce resource by design” - is that to ensure that Devs retain some level of ownership?
Yes, and also to ensure that we aren't taken for granted and are only working on the highest value-added things.
Indeed. For SRE supported services, Devs may also share a portion of the oncall load and many teams that don't have SRE support handle their own oncall.
Is the team that owns the service the first one that gets alerts for it, or are first level alerts sent to a different team?
Yes, the team that owns the service typically gets paged. There is no general triage queue.
How does the decision to hand a project back to a product team go? Do the SREs collectively decide that or the product team or someone in management?
“Work must be challenging to SRE teams. It must improve the reliability of systems thru engineering.” — @jpetoff
Excellent description of how SRE works with ops!
What is the decision process around funding SREs for those who hold that budget line? @jpetoff
I posted my response in the main chat earlier, but reposting here to make it easier to find: The budget for SRE headcount comes from the Dev org. They "pay" SRE via HC, but once transferred, SRE is in complete control of the HC (until the engagement is ended by either side).
If I’m not accountable for my quality decisions, quality is a fantasy.
@genek told me I was going to love this one! This is so amazing @jpetoff and @cleng!
Is there anything that prevents the product teams from thinking, "I'll just leave that for SRE to fix, as I know that they are there to catch things for me, so I don't need to think about it as much as I would have done if SRE didn't exist"?
That sounds like a 'throw it over the fence' mindset to me which I'd consider an antipattern. @cleng anything to add?
Yes, definitely an antipattern. What behaviour do you see at Google? Is there anything intentional which prevents that antipattern?
I think there is an educational and communication component to it. SRE leadership and engineers on the ground communicating about the key SRE principles and best practices and how teams should work together for maximum impact. This is something that @cleng is actively working on. My team is also working to share reliability best practices more broadly. Reliability needs to be everyone's concern, not just SRE.
There are a number of remedies to this challenge: • Write down who's responsible for what and who has authority over what. If SRE can block bad launches, Dev will try to work something out (However, the job of SRE is NOT to block, it's to advise!) • Work together. Have Devs work a little bit on infrastructure and ops and have SREs work a little bit on product - not too much, because it waters down the role specialization, but enough to maintain a mutual understanding. • Establish ultimate accountability for the product's reliability with the product team. SRE's job is to help them achieve that, but at the end of the day, they remain accountable. • If the situation spirals out of control, declare a "production freeze" (only stability fixes get deployed) and/or "code yellow" (reliability work trumps all other project work until the exit criteria are met). You need support from senior Dev leadership for either. If you can't get support from anyone in the reporting chain on the Dev side, you should find a better Dev org to work with.
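The "production freeze" remedy above can be sketched as a simple deploy gate. This is a minimal illustration, not how Google implements it: `ChangeKind`, `ServiceState`, and `may_deploy` are hypothetical names, and the only rule taken from the thread is that during a freeze only stability fixes get deployed.

```python
from dataclasses import dataclass
from enum import Enum


class ChangeKind(Enum):
    """Hypothetical classification of a change waiting to be deployed."""
    FEATURE = "feature"
    STABILITY_FIX = "stability_fix"


@dataclass
class ServiceState:
    """Hypothetical per-service flag; set when a production freeze is declared."""
    production_freeze: bool = False


def may_deploy(change: ChangeKind, state: ServiceState) -> bool:
    """During a production freeze, only stability fixes may be deployed."""
    if state.production_freeze:
        return change is ChangeKind.STABILITY_FIX
    return True


frozen = ServiceState(production_freeze=True)
print(may_deploy(ChangeKind.FEATURE, frozen))        # feature work is held back
print(may_deploy(ChangeKind.STABILITY_FIX, frozen))  # stability fixes still ship
```

A gate like this only encodes the rule. As the message says, declaring the freeze in the first place still needs support from senior Dev leadership.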
Thanks @cleng. "Establish ultimate accountability for the product's reliability with the product team. SRE's job is to help them achieve that, but at the end of the day, they remain accountable.", this resonates for me as to how to avoid the human tendency to think that it's someone else's problem.
For the last 3 companies where I helped bring DevOps to life, I had to spearhead getting Ops involved in the DevOps. This can be challenging, as most of these teams have been covered up in firefighting and are staffed to minimum staffing levels...
What’s so remarkable of this talk is that everything is grounded in economics — deliberate surfacing that funding SREs is at expense of product devs; that SREs can’t be “bought” to do non-novel work, etc.
This sounds very much like when you bring a manufacturing engineer into a physical production process in order to address quality or throughput issues. My Dad did this at Pratt and Whitney to go in and help other teams or contractors figure out how to build a part correctly when they were having trouble. And they were an expensive resource to bring in.
or for that matter - not Sys Admins renamed as "SRE" 🙂
Are you seeing the need to address the measurement or incentives for individuals in org? If so, how are you messaging that?
@nickeggleston The budget for SRE headcount comes from the Dev org. They "pay" SRE via HC, but once transferred, SRE is in complete control of the HC (until the engagement is ended by either side).
I see SRE as the breakout group that can be the champion in both Dev and Ops... they bridge the gap in many ways...
@jpetoff - question, I guess the SRE comes into picture only for web scale systems at Google, correct? I see some refer about SRE for every IT Service/Systems. Just wanted to stand corrected.
SRE support is typically limited to the most mission critical services. It's part of the cost benefit equation.
Love that "Mission Critical Services" - most IT systems owners think that all their services are "Critical" 🙂 in a typical Enterprise (the talk about availability and reliability) but not about the cost to achieve that
@jpetoff what are the main responsibilities of the SRE Education Director, and why is this role necessary in your organization?
Thanks for asking! I lead the learning and development function for SRE. My team is responsible for onboarding, getting folks ready to go oncall and ongoing education opportunities. We also bring reliability-focused education to all of engineering. Reliability needs to be a priority for everyone, not just SRE. This may sound fluffy, but we also foster a strong oral tradition and passdown of the SRE organizational culture through storytelling in our classes.
“Highly customized infrastructure makes it difficult for SREs, especially in situations when SREs handle multiple services” ==> drives/encourages standardization.
“You can’t build a wall and then complain about a ‘throw it over the wall’ mentality”
standardization again. relevant in truck maintenance and software environments. under appreciated I think.
For the knowable, or meta level patterns in context, I would suggest 🙂 A headwind for the unknowable (treating the unknowable as if it's one size fits all)
What are some of the ways that are used at google to “teach how to fish”, especially when something is on fire?
• Sharing ops work to some extent • Escalating during incidents, debugging together
• reviewing postmortems together • co-design sessions • ops/architecture training sessions for Dev
We have the SRE team lead the BPM (Blameless Post Mortems). They are in a great place to look at issues deeper and can then help those involved get to the root cause more easily and see the action go as far as possible within Development or in better Operations. I know I am not at Google but thought I would chime in... hope that is okay.
Thanks for chiming in @mr.denver.martin always interested in how others do it.
@mr.denver.martin That's actually very similar to how many Google teams organize postmortem reviews.
I can also see SRE building the tools and process for Fishing, not just teaching others to Fish.. but they are not Fisher People... 🙂
@jpetoff I wonder how you maintain guard rails though and not create a chaos of tools and strategies if SRE is an optional engagement?
It's complicated. The preferred strategy is to make the tools and strategies you want the teams to use the most attractive ones (incentives). They can be most accessible, easy-to-use, best supported, most feature complete. Don't try to force a solution on your engineers that isn't working for them. They'll find ways around it. The second approach is on the relationship level. Just because a Dev team doesn't have (full) SRE support, doesn't mean that they're not exposed to SRE. There can be consulting, SRE Love, training programs, tech talks, etc. When SRE has a reputation to be helpful, the devs will ask you for advice and follow your best practices. SRE is production evangelism. The third approach is "the stick": Policies about what you can/cannot use, automated compliance measurement/enforcement, nice-or-naughty dashboards for senior leadership, etc. I would generally advise against these, but there are corner cases when they are necessary. Typically, when you can convince 80-90% of the org with the other strategies, it can be sufficient to show that number to non-compliant team (and/or their bosses). Then again, not everything always needs to be uniform. If you don't have to deal with it, let them do what they feel is right. If you always follow the one-size-fits-all approach, you stifle innovation. Listen to why they chose a different path. Maybe they have good reasons.
@jpetoff @cleng Amazing talk! Does Google have any sort of 24 x 7 monitoring team/Operations or is that all delegated to the dev teams? If it is the responsibility of the development teams, how do you handle legacy applications which may not be under active development?
What’s so bad about discussing SLOs after software is written? What could go wrong? 😆 😆 😆 (“Overengineering something at the expense of valuable features”)
for the scale, pace of questions, comments for this subject - we need to engage SRE - cc @jpetoff (How are you scrolling all these comments)
Ha! working to get through the Qs as quickly as possible. Also listening to the talk at the same time which is distracting me. I'm terrible at multi-tasking 🙂
“Important to not see SRE as a human abstraction layer over production. That’s an invitation for complexity to flourish” - @jpetoff
The explicit funding model of the engagement process described reminds me of the economic principle of optionality that Gene has talked about in the most recent few episodes of the Idealcast. You pay for an SRE to come over if you think that is going to increase your overall value more than spending that money on something else (like another Dev, designer, PM, whatever).
“For products in new business units, you could get away with lots of Baseline engagements”
hey @cleng, can you tell us more about how you measure team maturity?
Service maturity: • SLO quality (user-oriented) + compliance • Ops workload (tickets + incidents) • Data integrity processes • Capacity planning / efficiency • postmortem processes and hygiene • Release automation Team maturity: • OKR planning processes • Staffing/attrition • ops workload • SRE/Dev relationship
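One way to read "SLO compliance" from the list above is the share of good events measured against a target. A minimal sketch, assuming a request-based SLO; the function name, the 99.9% target, and the treatment of zero traffic are assumptions for illustration, not something stated in the thread.

```python
def slo_compliant(good_events: int, total_events: int, target: float) -> bool:
    """Return True when the observed success ratio meets the SLO target.

    Assumption: a window with no traffic counts as compliant.
    """
    if total_events < 0 or good_events < 0 or good_events > total_events:
        raise ValueError("event counts are inconsistent")
    if total_events == 0:
        return True
    return good_events / total_events >= target


# Hypothetical window: 10,000 requests measured against a 99.9% target.
print(slo_compliant(9_995, 10_000, 0.999))  # 99.95% observed
print(slo_compliant(9_980, 10_000, 0.999))  # 99.80% observed
```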
thanks @cleng! really insightful, love how staffing attrition is a factor in team maturity, often overlooked but it's a key leading indicator to predict the quality that a team might deliver
Unfortunately attrition can be a trailing indicator, because it takes time to build up enough frustration that the engineers actually leave. Also, be careful to measure this only for organizations big enough that you get a meaningful signal. When you look at a 5 people team and 2 of them leave for reasons completely unrelated to the team health, you get a huge spike. A noisy signal is not useful.
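The arithmetic behind the "huge spike" is easy to see in a few lines. A minimal sketch; the function names and the minimum headcount of 50 are hypothetical, chosen only to illustrate the point about small groups.

```python
from typing import Optional

# Hypothetical cutoff: below this size, one or two departures dominate the rate.
MIN_HEADCOUNT_FOR_SIGNAL = 50


def attrition_rate(leavers: int, headcount: int) -> float:
    """Fraction of a group that left during the period."""
    if headcount <= 0:
        raise ValueError("headcount must be positive")
    return leavers / headcount


def attrition_signal(leavers: int, headcount: int) -> Optional[float]:
    """Return the attrition rate, or None when the group is too small to read."""
    if headcount < MIN_HEADCOUNT_FOR_SIGNAL:
        return None
    return attrition_rate(leavers, headcount)


# The same two departures look very different depending on group size.
print(f"{attrition_rate(2, 5):.1%}")    # team of 5
print(f"{attrition_rate(2, 500):.1%}")  # org of 500
print(attrition_signal(2, 5))           # suppressed as too noisy
```

Two people leaving a 5-person team is a 40% rate, while the same two departures in a 500-person org is 0.4%, which is why the small-team number says little about team health.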
I love the benefit of being able to call experienced SREs when things go wildly wrong in a major incident!
“SRE Love” — when devs write proposals / requests for SRE help, to aid in knowledge transfer, mentoring, skill upleveling. “Helps build relationships between devs and SREs.”
The SRE Office hours and Continuous Learning in mentoring the devs is awesome!
Fascinating how this feels like a "service you can buy based on your goals and funding" all grounded in economics.
I think a big difference is the focus on automating away toil so that the team can scale sublinearly to the size of the service rather than throwing more bodies at the problem.
Think of it as SREs provide operational (production) expertise. They’ll help fix an operational issue, but the goal is to understand the issue such that they can engineer away the problem from happening again.
i.e. writing better fire fighting equipment, so that you don't need to spend time fire fighting?
Hi @jchanto17 I see SRE as the bridge between Ops and Dev. These are people who are pulled out of, or are not part of, Dev or Ops, but who have the skills and knowledge to get to root cause and can then 1. fix temporarily with workarounds 2. figure out how to respond and restore faster 3. fix long term to keep it from happening again.
Might an SRE also look for risks before they become issues? Is that inherent to eliminating toil?
Yes @nuwayser, SREs are empowered to make tomorrow better than today by focusing on the things that will have the most impact on improving reliability of the services they support.
SREs make more sense in places where dev teams actually own dev & ops - more common when your apps are cloud-native.
in traditional datacenter based apps, SREs may play the role of a bridge between dev & ops
i'm involved in some research projects on whether SREs fit into the hybrid world of cloud-native + legacy datacenter apps. It's been interesting and challenging
Ok thanks all for the comments, now it makes sense to me. And what I think is that I can take this type of firefighting work out of the Dev team, because in my case the Dev team used to tackle this type of problem and it slowed down value delivery
> Yes @nuwayser, SREs are empowered to make tomorrow better than today by focusing on the things that will have the most impact on improving reliability of the services they support.
Thank you @jpetoff :)
@jchanto17 to your question - as someone who has worked in traditional ops before, an SRE is someone with my mindset + developer skillset, and who is also empowered / expected / measured on their ability to
• participate in incident response to close issues
• proactively mitigate risks
• leverage error budget to find problems and make the service more resilient
By developer skillset I mean more than python/bash/perl scripting: it's someone who has the skills and license to get into the app code and improve it proactively.
@jonathansmart1 to your earlier point re: measurement alignment between SREs and Product owners - curious to know if you think the two roles are responsible for the same outcomes or different ones.
Hi @nuwayser, not sure, keen to understand the case study point of view (SREs and Product Teams, not only POs). I guess it will vary org by org as to how that is done, with more aligned or less aligned incentives. Keen to understand that, and how risk is mitigated.
Are there any examples of SRE Love projects that we could use as an example?
Typical examples:
• Helping a Dev team to set up their monitoring/alerting
• Analyzing SLOs / refining them
• Architecture review
• Picking the right tools / infrastructure for a new "greenfield" service
• A migration to new infrastructure (e.g. database)
• Improving ops processes
How does incentivisation (performance appraisals, pay, promotion, rewards) work for SREs? How are SRE incentives and Product Team incentives kept in line, to avoid the antipatterns which could easily occur (e.g. SRE incentivisation not in line with Product Team org incentives, or lighting fires in order to put them out, as an extreme example)? Thanks!
Core aspects of SRE performance are impact (measurable improvements for users/Dev/SRE/budget) and simplicity (standardization, deprecations, cleaner architectures, fewer dependencies). Heroics can be rewarded temporarily but are generally not going to get you a promo. Specifically, the focus for performance evaluation is on designs and landing engineering projects, not "keeping the lights on".
Service Level Objectives are nominally the tool that allows SREs and Devs (and other business stakeholders) to speak the same language and align on incentives
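As a rough illustration of why an SLO works as a shared language: the target translates directly into an error budget that both sides can count against. This is a minimal sketch assuming a request-based availability SLO; the function names and the example numbers are hypothetical and not taken from the talk.

```python
def error_budget(slo_target: float, total_requests: int) -> float:
    """Number of requests allowed to fail in the window under the SLO."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return (1.0 - slo_target) * total_requests


def budget_remaining(slo_target: float, total_requests: int,
                     failed_requests: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget(slo_target, total_requests)
    return (budget - failed_requests) / budget


# Hypothetical window: a 99.9% target over 1,000,000 requests allows
# roughly 1,000 failures; 250 failures so far leaves about 75% of the budget.
allowed_failures = error_budget(0.999, 1_000_000)
remaining = budget_remaining(0.999, 1_000_000, 250)
```

A remaining budget near zero is a signal both Dev and SRE can read the same way, for example to slow feature releases in favour of reliability work.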
@genek - we need a Slack plugin to move all these "Gold nuggets" to an ever-living doc (kind of shortform or Notion) - Goodness me - there is so much to read thru
Are dev-teams on call overnight? Is there any pattern of follow-the-sun rotation handing off to another team for overnight?
"You build it, you run it." 😉
I'm pretty sure that's how it is, but I was hoping for confirmation. There's folks in our org who don't agree with that.
Yes, Dev teams are oncall, some of them 24x7 when they don't have multiple sites across the globe. That's not what we would do for business-critical services though. Being paged at 3am when the rest of your team is soundly sleeping is not a recipe for success. However, SRE escalation support via baseline can be helpful in these scenarios.
Thank you for the response! What would the size of those teams be? How many people around the globe? We have some teams of 5-7 that are responsible for critical services and currently hand off to operations overnight because they don't want to be woken at 3am, but the operations team is becoming overloaded.
For an SRE team, we recommend at least 6 engineers in each of the two sites. That's a significant investment, so each SRE team is typically working on a large set of services (or a few very big/critical ones). Dev being oncall during business hours and SRE being oncall during off-hours is also a model that is working well for Google SRE teams that are using it. It increases the coordination overhead a bit (more cross-team handoffs), but makes dealing with faulty releases easier (releases should be done when the devs are oncall - they're more familiar with the changes). It also exposes Dev to production, but with a safety net - there's always an SRE familiar with the service you can escalate to.
The clearly defined, distinct engagement models are really interesting to me. Having that so well defined gives teams the ability to decide what they can afford and not just what they would ideally want. It seems really useful to keep the smaller and less important projects from swamping the SRE org.
Hmmm... seems like an SRE role aligns a lot with my interests. Are there any good resources to start learning and grow this muscle? I assume the SRE O'Reilly book is a good place to start?
@nepobunceno There are lots of resources to check out at sre.google including our various books. There are some good large system design exercises and shorter form articles too.
There are great books by Google (free) and there are also a number of courses available on LinkedIn and YouTube. The one thing I will strongly recommend is the micro learning videos on this subject by Google's Seth Vargo and our own @lizf
@jpetoff Where can I find the system design exercises? I'm scrolling through the site and recognize books and articles I've grown from and helped my org change processes to reflect.
I’m fascinated by the path one is required to go through to get to Full SRE Support, and how products going thru hypergrowth will likely need it!!
I have to confess here (a general statement): all this good work by various organizations - SRE, DevOps, Product Management, Funding, Transformation, Team composition, Organizational design - gets overcomplicated in its own way in many Enterprises, and a common statement comes as the answer all the time - WE ARE NOT GOOGLE - cc - @genek, @jpetoff. Reminds me of @jonathansmart1's book - antipatterns
Google is an enterprise with its own challenges. With 140k employees and >20 years of history there are many complications on the ground. Individual SRE PAs find their own solutions on how to adapt to their space. Please don't take Google SRE as a blueprint to be applied verbatim to your org. It's a case study. We keep evolving the way we do things. Because we have to adapt all the time.
^^^^ THIS - Love this @cleng (the same applies to all the models right - Spotify, Etsy, Google, Amazon)
I like that the SREs can vote themselves off the island
How are SREs on a project evaluated for performance/promotions? How frequently and by whom? @jpetoff @cleng
SRE's performance is evaluated by SREs (up to a certain level of seniority), but SRE managers and promo committees expect positive peer feedback from Dev peers as critical support for high ratings.
that autonomy is a really useful source of information
“When SREs have all left/abandoned a [problematic service], well, the developers are left wearing the pager anyway.” 🙂 I love these “tough love” statements from @cleng.
“you never fully understand a system until you see it burst into flames” — @cleng 😆
I’ve always admired the way that Google has talked about how they’ve defined SRE as a career path, as a set of skills, so much pioneered by @jpetoff.
Thank you! PSA that all the books are available online for free at sre.google/books
Google got their own TLD? I didn’t know that!
When can I get “Nick.”? 🤣
Thank you so much @jpetoff and @cleng for giving us a glimpse of how SRE works inside of Google!! This is something I’ve wanted to better understand for nearly a decade! 🙏
Thanks so much for inviting us, @genek. Will continue to work our way through the backlog of Qs!
It's been an honor and a pleasure. Fantastic community at DOES! Great questions, many things to learn from others too!
Challenge to Enterprise Technology Leaders - "WE ARE NOT GOOGLE" - WE WILL NEVER BE; 🙂
This is amazing and very timely with some of the work we've begun doing! Thank you @jpetoff and @cleng! Reminder to self to read up on your SRE books!
Wow - so much to take in. I’ll be watching this one again
@eleonravinez @jeff.gallimore There is a lot to unpack there. I think the best way to sum it up is through this conference talk that I gave at FailoverConf last year: "Swim Don't Sink: Why Training Matters to an SRE Practice" https://www.youtube.com/watch?v=8iaNMMwozCc
When I get the "we're not Google" lecture I try to find ways to say "but we can aspire to be"
@vmshook Many of the foundational principles of SRE can be applied no matter what your size (e.g., SLOs and error budgets and a 'vanquish toil' mindset)
I love that the whole conference is focused on these topics at the same time. It creates a great shared community context.
thank you @jpetoff, @cleng - Brilliant session (I can manage without a coffee now for a while)
Thanks for sharing!! Very useful, I’ll be watching this one again
When working with an SRE, how do you decide whether to go with an incremental approach over redesigning a whole system to meet reliability goals?
When you have been working closely with the system and know all the hidden timebombs: An incremental approach is generally preferred, because a clean-slate rewrite is costly and high risk. However, sometimes you have explored all incremental options and they either don't get you to an acceptable level or are even costlier/riskier than a rewrite. Then it's time to sharpen your design pencils. When you're new to the system: A thorough architecture review and a regular ops review to assess the long- and short-term viability of the system should get you close to the above.
@jpetoff and @cleng - Thank you! Can you post a snapshot of the SRE in a nutshell slide here?
As @genek says - DOES is like a battery recharging station. I have managed to recharge my (Inspiration) Battery over the last 6 years - JUST by DOES
@cleng @jpetoff Was there ever a valid counter argument for putting SRE inside the product teams? (If this is covered in books/papers I haven't read, please share! This is the single hottest topic where I'm at right now.)
@christina.biangslev This is not something that has been considered at Google AFAIK. Keeping SRE and Dev reporting lines separate ensures that reliability is a first class feature. If everyone reported up to Dev, it's possible that the org's leaders might be tempted to trade off reliability for feature velocity or to take on more tech debt than is wise. I know other companies have approached SRE differently though and have found success with a more embedded or consulting approach.
I agree. What I see often is compromising quality because of the pull from business to deliver new features. I think this would also be something that would make it a second class citizen.
See https://youtu.be/n4Wf14e2jxQ?t=497 for a good argument. Summary by Ben Sloss: "So for that reason SRE has to be its own team. It's my basic thesis. If you don't have a team who views their mission in life as making sure that the product works, you will ignore availability and reliability until you're in real trouble."
I completely acknowledge this risk, that Feature Fetish might take over the SRE agenda. I'm struggling to fit the ideal of the truly independent product team with the experience of our SRE pioneers. I'm sure there's a place for a variety of setups; I'm wondering what the deciding factors would be to consider the truly independent model.
There's a whole website for the many organizational options you have: https://web.devopstopologies.com/
Speaking of talks happening at the same time… if we go back later to listen to one we missed, which channel is best for the follow-up discussion? (since we don’t have one channel per talk or (set of) speaker(s))
Maybe do DM to the presenter and maybe they would still be watching the Slack Workspace... just a thought.. but then you miss on others that may have knowledge besides the speaker... hmm.
It’s tough to know. I love staying on one channel throughout the day for the lively engagement and community, but it would be super cool if the thread discussion for a given talk were moved to a dedicated channel for continuing discussion… @genek
👏🏻 Let's get ready to welcome the team from Capital One – @girija.rao, @denee.ferguson, @jennifer.miles, presenting Productizing the Network: Square Peg, Round Hole? 👏🏻
I’m so excited that @girija.rao @denee.ferguson and @jennifer.miles will be talking about how they brought DevOps principles to core networking that enables (all?) major bank operations!
This reminds me of the talk BMW did about applying DevOps principles to their core IT support org
“My favorite: Wireless LAN”. from @denee.ferguson (I’m getting stressed out hearing about all these mission critical services. “It’s always the firewall or network.“)
lol... @genek I like to say "bring it on"... most of the time, it isn't!
so DevNetOps is a thing, right? cc - @denee.ferguson
It is!!!!! Incorporating the development/automation has been key to getting out of the firedrill mode, and bringing sanity to our pace of delivery
Ah.. @girija.rao is talking about creating Build/Run teams for Network!
“Previously, we had Engineering and Run reporting to two different executives” — @girija.rao cc @scott.prugh
I heard this org change referred to as "reverse Conway's law". If you don't like your architecture, change your organization to reflect what you want your architecture to be.
align it to business outcomes and the architecture gets fixed.
(More difficult to rearchitect your core switches, which are global in nature, than most software. 🙂)
but sometimes it doesn't. sometimes figuring out the right architecture is also really hard
Network Engineering + Network Operations separate teams => Eng + Ops team but this is still IT focused, correct? These product teams don't include non-IT operations providing value to customer?
The org itself is IT focused... But the org also has non-IT staff members that are integral to our success.
@cncook001 Indeed!!! Over 160 people changed managers as part of this transformation....
"Shoulder Tapping" - these words are giving me a mild anxiety attack.
Yet another reason why open floor plans can kill productivity. At my previous studio job, somebody created paper indicators that you cut out to put on your monitor:
• Green: I can be interrupted
• Yellow: I can be interrupted but we can't talk about your cat
• Red: DND
@cncook001 - This reminds me of the Martin Fowler quote "You change the organization or change your organization" 🙂
@char aka "drive bys"
The scale of this talk blows me away: 14,000 devices, 185k carrier assets (!!)
I have a manager who loves drive-bys. I developed PTSD from every time I heard his office door open.
Most of the "Productive" Engineers prefer to work from home (as they do not need to see these managers and other distractions)
You would think remote working would provide some speed bumps for those drive bys but not always the case. Where there is a will there is a way!
mine is back in force w/no telework options and a move back to an open office. less time to think
some jobs have that option. my last one did. my current does not.
The biggest problem I am hearing in infra product team transitions (aka platform teams) is that 1) engineers don't easily skill into product manager roles and 2) they have an antibody reaction if you parachute in someone with product skills who is not an engineer.
The biggest lesson is the product managers NEED to understand the technology...
This leads me to question - do the PdMs need to be technical or Business 🙂 or both?
At my previous job we had motion in both directions (engineering->product crafts and vice versa)
My attempts at product teams with Infra were underwhelming at first, but I found several months later they came back to me with an appreciation for what they learned. It just took longer to click
I think what we found is that engineering->product manager was better than the other way around
@char re the purple squirrel quest, one of my challenges is to "train" the infrastructure "POs" at the moment. I'm just getting started.
common theme: I was a developer and kept building features that nobody used or wanted. I became a PM to prevent that from happening
In some orgs (Enterprises), Product Management is abused. The PdMs act as Service Delivery Managers (approving people's work)
@chris.gallivan421 I'd love to hear more about your successes with infra POs. What did they come back to you for?
They said they liked mobbing, rather than working in individual silos
Many product teams are matrices with consensus leadership shared by two "pyramid" reps
they also said, they felt like a real team for the first and only time in the dojo
ah.. I have experience with those and have even talked about the quick learnings, but at this time, they are barely "walking," not to mention leadership buy in
@chris.gallivan421 those comments have been ones I've witnessed as well. it's a beautiful thing, isn't it?
engineers don’t easily skill into product manager roles < I’m wondering what specific problems you’ve seen with people who wanted to make the transition.
at Stellantis we favored the 2 in a box, but it was mostly because we needed a place to put the managers 🙂
@jtf These are field reports I am hearing. Just went through this in detail yesterday with a large Australian investment brokerage.
on app side this led to typing pools rather than product teams
so much so, we have renamed TPO from Technical Product owner to Typing Pool Owner
@char: I completely agree it happens. I think it can be a problem to move from engineer --> product manager for many different reasons. I’ve also seen problems for people moving from business analyst --> product manager or scrum master --> product manager. I’m not clear there’s anything uniquely challenging about engineer --> product manager. When I’ve seen problems in transition the biggest issue is the lack of mentoring in the transition and/or a weak (or unhealthy) product management culture in the organization. Might that apply in these cases?
One of the issues I have seen with engineering moving to product is that they sometimes can have a hard time not solutioning.
A lot of the people I have seen in these roles are long term employees who used to code. Over the years they have built up a lot of domain knowledge, more so than on the business side
Is there any effect on how the org is structured? Ie. is someone moving from engineering to product because that’s where growth is, vs. because they specifically want to be in product? Curious if organizational dynamics are a major driver of seeing that transition succeed.
That failure mode for engineers makes sense to me @char. One thought that came to mind for me is that we try to have our engineers oriented to client needs in any case; the product owner acts as a customer proxy and tries to talk in client language to the teams (and sometimes the engineers join client meetings). I suspect that might make an easier transition compared to an environment where the product owner acts as a translation layer and client language doesn’t make it into the engineering team.
@scott.jaffa so far we have not had any engineers shift over to product management...
I plan to use customer empathy to begin to help these folks, since those we serve work daily in an environment that demands extreme safety. I just got here 3 weeks ago, so thinking to start there.
@jtf Having the engineers start paying attention to the user experience that their products provided has been key... having the former engineering team members be on call also helped them get religion (quickly!) over the importance of details
@jtf These are all good hypotheses, reflective of growth mindset. I would also, as a counterbalance, inquire as to whether there is fixed mindset operating in a way that's not easily disrupted, even with the best training available.
When IBM Marketplace started 5+ years ago, we adopted the "3 in a box" model: Development Manager, Product Owner and Technical Lead. We called it "intentional tension in the system". Those 3 had to agree on what to work on next. It worked really well.
@char my group may benefit from the shift from fixed to growth, simply by nature of our business, where we serve teams whose lives can be in jeopardy if we don't shift. thanks for the reminder of fixed v growth. that's a great way to bring in customer empathy with them
@jtf I do believe that one of the major constraints in the whole transition to a product-centric operating model is the workforce's lack of product managers. I mean, there is not even a well accepted educational/training pipeline. (Not to dismiss the efforts and valuable offerings of the boutique sector. But we need more scale IMO.)
How can the engineers become product experts? I was in distribution centers as often as possible.
I’ve been sending all the engineers to the customer product demos, user trainings, etc. Too new to know if it is having success, but that team did just get an amazing kudos from one of the users in a training about the application quality…
@bryan.finster486 We haven't tried cracking that nut yet... the people leader for the engineering teams is the product owner... so we focus heavily on enhancing the user centricity of the product owner. The engineers get there kinda by osmosis.... We do have tight relationships with other internal teams that help us understand specific use cases among our customer base (e.g., call center agents, traders, executives, etc.)
@char: reminds me of the early days in agile, where lack of people with experience in agile was a bottleneck to widespread adoption. There was much more demand in 2005 than there were people to satisfy it. It seems something similar is happening now with product management. Everyone can see this is a better way to operate, but precisely because it is new there’s a dearth of experienced practitioners. (Mind the Product is great, but only gets you so far.)
@denee.ferguson Are engineers reporting to the product managers in terms of career pathing etc?
@char Engineers report to product owner; product owners report to me. Product managers report into our Agile org, led by @jennifer.miles. While product management could be a potential path for the engineers, I'm not seeing much interest from the engineers in making that move. Most want to remain deeply technical, while a few have aspirations to become product owners/people leaders.
Interesting take. Question - then the PdM and PO spend more time on people management activities than the "Product" management? cc - @denee.ferguson
While they may not be plentiful, I have met some amazing developers who are passionate about customers, products and clean code
@lbmkrishna product managers are not necessarily people leaders. Product owners are. We have found that limiting the number of direct reports a product owner has is key. For one of my teams, there are 8 engineers (soon to be more as we onboard new hires). One of the engineers who wanted to become a people leader will take over product owner responsibilities for one of the products this team owns, and will become the manager of 2-3 engineers on the team. This will reduce the direct report load on the current product owner for that team. Generally, 6-7 is the max number of direct reports a product owner should have in our org.
Thank you @denee.ferguson - Thank you for providing the context, much appreciated.
PS: something I learned earlier this year: load balancers and all those Nginx systems shield developers from having to implement all sorts of new endpoints and protocols, such as HTTP/2, QUIC, SSL, and all sorts of other things I’ve only heard of. I gained a whole new appreciation for all those “commodity” devices! (Without them, we’d have to change every app to do things like handle devices roaming between WiFi and cell connections!!)
Load balancers definitely make technical implementations simpler!!! Now if I could just get MACs to roam better!!!
Ha!! When I learned what load balancers did for devs, my jaw dropped in awe. I had no idea!!!
@cncook001 not possible in our world
It depends on the culture. Needs to be a constant reminder for transparency, vulnerability and blamelessness
I disagree, I think it is a good metric… but only if you understand that good metrics allow you to ask questions. They can’t give you answers.
Is there a metric that can't be used for bad reasons? good to be aware of the risks and build the right conversations around them.
A metric by itself can definitely be abused. That is why it is important to identify multiple metrics and track them in tandem to be used as guard rails.
Is there a metric that can’t be used for bad reasons? < I have a friend who says “if it can’t be used for evil it isn’t a superpower.”
What were the new Infra Product teams oriented to? Was it more outcome focused or technology focused?
@girija.rao - So many questions. Would that mean you had a branch focus team that would include WAN design as well? Would FW access requests be part of datacenter connectivity?
@brianspo e.g. improving the wifi experience, understanding usage pain points and wish lists through empathy interviews and surveys, and determining technology enhancements to drive those outcomes
@nkampwerth we have a core security team and perimeter security team and each one focuses on the related use cases
did you see incidents are seasonal?
Can you say more about using velocity and story points - do you standardize how teams assign story points?
the real benefit we found is measuring at the team level but we are able to roll up to see trends
What is the purpose of your story pointing then? I've always stuck to the mantra that story points mean absolutely nothing outside of a product team. Since the numbers are all made up and unique to the team, I use them to ensure consistency and predictability of the teams and use other, more real, metrics to see how teams are performing over time
@john.rowe It helps the team understand how much work they can take on each sprint... Each team knows how many story points they have been able to successfully deliver on average during a sprint.... We do include stories for things like training, vacation, PI Planning meetings etc so that our process is consistent from sprint to sprint. As I understand it (@jennifer.miles is the agile guru), story points aren't a metric that should be compared across teams.
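The "average delivered per sprint" idea above can be sketched in a few lines of Python. The function name, the three-sprint window, and the sample history are assumptions for illustration; the number only means something inside the one team that produced it.

```python
from statistics import mean
from typing import Sequence


def sprint_capacity(delivered_points: Sequence[int], window: int = 3) -> float:
    """Average story points delivered over the most recent `window` sprints."""
    if not delivered_points:
        raise ValueError("need at least one completed sprint")
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = delivered_points[-window:]
    return mean(recent)


# Hypothetical history for ONE team, oldest sprint first. The points include
# stories for training, vacation and planning, as described in the message above.
history = [21, 18, 24, 20, 22]
capacity = sprint_capacity(history)  # averages the last three sprints
```

Using a short rolling window rather than the all-time average is a design choice: it lets the planning number follow staffing changes, such as new hires joining.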
Q4 tends to be a lot lower... change freeze after black friday
so your days without incident metric in Q4 is not used? or do you track trends year on year?
@marc.price the recording was made in mid-September, so we didn't have full Q3 stats at that point, much less Q4. We do track trends year on year
@denee.ferguson I completely empathize with this challenge of having infra / network teams adopt Scrum and apply it to building out hardware and network configurations. Big aha moment for me when working with teams in my org. It was contextual by team, but we eventually de-emphasized the importance of Scrum ceremonies and transitioned more teams toward Scrumban or Kanban. That provided the benefits of lean/agile without as much of the "overhead" that Scrum introduced.
Yeah scrum was never intended to be used in this way. We see parts of Scrum as a “starting point”, less an “end state”
Yeah, Kanban and flow at the speed of the work for infra / hardware. Physical constraints are better handled via kanban and blocker states and you move on when you can.
Regarding metrics for our teams, we paid less attention to velocity and more on lean metrics like Cumulative Flow Diagram (CFD), WIP and throughput. Flow Metrics (from @mik) would work here too. We also spent time in value stream mapping to identify quality issues (low % C/A) and delays (in scheduling or hand-offs). These were where we got the most "bang for buck" with our infra teams.
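Two of the lean metrics mentioned above reduce to simple arithmetic. This sketch assumes Little's Law for the WIP/throughput relationship and the usual value stream mapping convention that an end-to-end ("rolled") %C/A is the product of the per-step values; the function names and sample numbers are hypothetical.

```python
from math import prod
from typing import Sequence


def average_cycle_time(avg_wip: float, throughput_per_week: float) -> float:
    """Little's Law: average cycle time = average WIP / average throughput."""
    if throughput_per_week <= 0:
        raise ValueError("throughput must be positive")
    return avg_wip / throughput_per_week


def rolled_percent_complete_accurate(step_ca: Sequence[float]) -> float:
    """Multiply each step's %C/A (as a fraction 0..1) for the end-to-end value."""
    if any(not 0.0 <= ca <= 1.0 for ca in step_ca):
        raise ValueError("each %C/A must be a fraction between 0 and 1")
    return prod(step_ca)


# Hypothetical infra team: 12 items in progress, 4 finished per week.
weeks_per_item = average_cycle_time(avg_wip=12, throughput_per_week=4)

# Hypothetical three-step value stream with 90%, 80% and 50% C/A per step.
end_to_end_ca = rolled_percent_complete_accurate([0.9, 0.8, 0.5])
```

The second number shows why a single low-%C/A hand-off is worth chasing: one weak step drags the whole stream down, however good the others are.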
Scrum always raises a red flag for me with anything operational as it's not suited to interrupt-driven work.
@char That was part of the square peg, round hole challenge for us... honestly I have found that using scrum ended up improving delivery (quantity, quality... still working on predictability). But we had to teach the engineers how to chunk up their work into bite-size pieces.... It used to be that they might have a single story that would hang out on their kanban board for months (something like "deploy X")... and there was no visibility into how that effort was going, where the team was getting stuck... We have found 2 week sprints to be best fit for my team... The 2 week interval allows us to incorporate unplanned work without disrupting the current sprint, but keep whoever is escalating the need to do X happy.
@denee.ferguson It sounds like you have some flexibility then with the Scrum masters accepting emergent tasks that aren't specified at the start of the sprint - some purists (not me) might raise an eyebrow 🙄
@char yes. As much as I would like to say we will not take on stories mid-sprint, that's not our business reality. My team in particular isn't fully staffed (we're hiring!); hoping that as reqs are filled this will be less of a problem because we will be able to accomplish more each sprint with more engineers.
I am wondering if making more frequent changes in smaller batches reduces the number of severe incidents?
I guess it can cause more incidents if you try to change too frequently. faster doesn't always mean better
Smaller batches that reduce the blast area are key. In our loadbalancer space we reduced failure group size first by adding more load balancers enabled via automation.
So then the failure of any change affected a much smaller area.
This also greatly helps scheduling dependencies since dependencies increase risk exponentially.
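A back-of-the-envelope Python sketch of the two points above. Both models are simplifying assumptions, not the speakers' method: traffic is taken to be spread evenly across failure groups, and each scheduling dependency is taken to be ready independently with the same probability. All names and numbers are hypothetical.

```python
def blast_fraction(failure_groups: int) -> float:
    """Share of traffic hit when one group fails, assuming an even spread."""
    if failure_groups < 1:
        raise ValueError("need at least one failure group")
    return 1.0 / failure_groups


def all_dependencies_ready(p_each: float, n_dependencies: int) -> float:
    """Chance every dependency is ready on time, assuming independence."""
    if not 0.0 <= p_each <= 1.0:
        raise ValueError("p_each must be a probability")
    if n_dependencies < 0:
        raise ValueError("n_dependencies cannot be negative")
    return p_each ** n_dependencies


# Going from 2 load balancer groups to 10 shrinks the impact of one bad change.
before = blast_fraction(2)
after = blast_fraction(10)

# Five dependencies that are each 90% likely to be ready on time.
on_schedule = all_dependencies_ready(0.9, 5)
```

Under that independence assumption, the chance of a clean schedule decays geometrically with each added dependency, which is one way to read "dependencies increase risk exponentially".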
Depends on how you define an incident. This is where the A/B and canary deployments can help greatly. Have lately been converted to feature flagging as a way to rapidly fix deployed incidents, as long as you minimize embedding
@tiny.mpetersii Agreed. Feature flags are a great technique to 1) reduce blast area, 2) reduce dependencies and risk, 3) inject operational thinking upstream to dev (shift left on ops concerns)
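A minimal sketch of using a feature flag as a kill switch, as discussed above. The in-memory `FLAGS` dict, the flag name, and all function names are hypothetical; a real setup would typically read flags from a flag service or config store.

```python
# Hypothetical in-memory flag store, for illustration only.
FLAGS = {"new_checkout": True}

def is_enabled(flag: str, default: bool = False) -> bool:
    """Return the flag's current value, falling back to a safe default."""
    return FLAGS.get(flag, default)

def kill(flag: str) -> None:
    """Turn a flag off at runtime, without a redeploy."""
    FLAGS[flag] = False

def checkout(cart_total: int) -> str:
    # The new code path ships dark behind the flag; the old path stays available.
    if is_enabled("new_checkout"):
        return f"new:{cart_total}"
    return f"old:{cart_total}"
```

If the new path misbehaves in production, calling `kill("new_checkout")` routes traffic back to the old path while the fix is worked on.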
@brian.m.smith There's a balance... The smaller the number of devices in a window, the more windows you need. Since our windows are normally at night, wear and tear on the team is a consideration... So... I personally strive to do enough changes (pilots) that we have the kinks worked out... then ramp up the devices per window. For some of our technologies (e.g., SDWAN, Wireless), it is generally a big bang deployment... There are some ways of limiting the blast radius, but you can only do that so much. In those situations thorough lab testing and bug scrubs are key... Unfortunately, vendors tend to have a lot of bugs.
@brian.m.smith frequency of execution is actually positive... we get to be a well-oiled machine.... but once we get to that point, we're finding that we need to reduce the total number of windows to say... upgrade all devices in the network... so we can get it done in a shorter time period.
(I am always in awe of how in this domain, even the smallest changes can have catastrophic global impact: I.e., global outage. firewalls, core switches, etc.)
It’s amazing how one “little” routing change or firewall statement can have unintended and hard-to-backtrace consequences.
yup....... not the first time I've seen BGP route change have significant impact. fortunately not caused by my teams 😉
Dynamic routing is a double edged sword. Once a dynamic routing change propagates, you can't just reverse it as quickly as a "no" statement.
@brian.m.smith The #devices in scope for a change isn't generally the driver of likelihood of incidents.. Attention to detail in testing, design, adherence to best practice.
Great presentation. QQ please. How did you measure the business outcomes and NPS?
A lively thread on the impact of projects like this on infrastructure teams…
@raghu.tumuluri614 we haven't measured NPS for our products.... we did try to go there for wireless, but honestly found too many cases where users' complaints really involved things that were not wireless......
did you try to get the pulse surveys at the end of every PI? just to get the business and technology to provide feedback and maybe help to drive NPS in a desired way?
There is a thread growing (a breakout QA thread full of gold)
how did you roll out your agile training? led by internal champions?
@jennifer.miles What is your approach for avoiding "ivory towers" when teams want to opt out of product model with less than complete data points as to why it won't work?
@andrew.machen continued exposure to the benefits of the model, basically a wear them down approach. We have one team who is still resistant but have adopted some of the processes. It is an iterative process with them.
Our experience is similar. Certainly great metrics and more energy from engineers in making things better as the months and years roll by. Great talk from you and your team!
> I’m incredibly pleased at the transformation we achieved with our product-oriented agile-driven restructuring - it enabled us to establish a unified mission and sense of identity, full visibility and prioritization of work, improved execution and delivery, and clear accountability internally and with our stakeholders. This structure also allowed us to easily incorporate several new functions over the past two years. It’s an ongoing journey as we continue to iterate upon this foundation to best meet the evolving needs of our dynamic organization and the services we provide.
> Girija Rao
> Vice President
>
> The unification of efforts and ownership across the architecture, engineering, and operational aspects of product teams, in concert with the ability to effectively manage priorities has enabled us to transform our technical capabilities while maintaining stable business operations in a more focused and optimized manner.
> Vince Gutosky
> Senior Director & Chief Network Architect
Thank you all for listening!!! We are hiring!!! look on the hiring tab for an opening on my team...
great session, thanks for sharing!
@char happy to answer questions via linkedin
🌟 Up next, we welcome @angeldiazrodriguez and @sheilalodhia from Discover, sharing their presentation, How Discover Financial Services Puts Engineering “Craftsmanship” at the Center of Our Digital Transformation. Joining us for questions will be @kevinjosephallen 🌟
I recall Capital One was or is a pioneer in SAFe implementation. Is CO still managing this framework @jennifer.miles?
@humayoun.khan we still use SAFe, although in the Network space we have adopted a hybrid approach at this point, using product-based teams in a SAFe-like environment.
(Yes, me, too! I had shared my amazement of how Discover powers Apple Pay with several folks lately — several had said something like, “oh, yeah, I saw their name when I accepted the TOS when I set up my credit cards, and now I know why!“)
Had a chat with Discover the other day, their fraud prevention folks didn't know about some of the changes to improve the system, like Apple Pay. So the multiple small charges in a day were getting flagged as fraud, and turned down, when marketing was advocating making those small charges for extra points
@tiny.mpetersii Runway is a way for us to brand the transformation. There are different altitudes that we’re iterating through, but the core point is to get off the ground quickly and keep climbing consistently.
We worked with the authors of Creating Your Dojo: Upskill Your Organization for Digital Evolution - Joel Tosi and Dion Stewart who started Dojos at Target. They recommended 6 weeks.
Over time we saw that it took about 6 weeks for learning to stick
I’d like to learn more about the concept of a Dojo, is that book the place to start?
Week 1 - A man, a plan, a canal; Panama. During week 1 you make and begin executing a plan.
Week 2 - No plan survives contact with the enemy. If you are tracking to a two week sprint, you realize you didn't plan well enough, and your plan isn't executable.
Week 3 - A plan for a real plan. Take your lumps from failing during the first two weeks.
Week 4 - Get the work done. This is the week where the real work happens, because the team finally knows and understands which direction it's going.
Week 5 - If you find yourself going through hell, keep going.
Week 6 - Celebrate.
Same approach at Principal Financial Group (prior company). Six week dojo challenges. Influenced by talking with folks at Target and the Dojo Consortium.
@chris.gallivan421 I've boiled it down to:
week 1 - let's do this (excitement)
week 2 - oh this is bad (overcommitment)
week 3 - oh this is really bad (we have no idea what is going on)
week 4 - TADA!
week 5 - I knew we were fine!
week 6 - wrong rock
Also this: "It takes anywhere from 18 to 254 day... to form a new habit and an average of 66 days for the new behavior to become automatic." https://www.healthline.com/health/how-long-does-it-take-to-form-a-habit#takeaway
5 days a week x 6 weeks is only 30 work days of habit forming. So even six weeks has a risk of teams leaving a dojo and new habits not sticking.
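The arithmetic behind that concern, as a quick sketch. It assumes only work days count as practice days, and it uses the 66-day average from the article quoted above; the variable names are made up for illustration.

```python
WORK_DAYS_PER_WEEK = 5
DOJO_WEEKS = 6
AVERAGE_HABIT_DAYS = 66  # average quoted from the habit-formation article above

# Practice days available in a six-week dojo, counting work days only (assumption).
dojo_work_days = WORK_DAYS_PER_WEEK * DOJO_WEEKS

# How far short of the quoted average a six-week dojo falls.
shortfall_days = AVERAGE_HABIT_DAYS - dojo_work_days

# Weeks of work days needed to reach the quoted average.
weeks_to_average = AVERAGE_HABIT_DAYS / WORK_DAYS_PER_WEEK

print(dojo_work_days, shortfall_days, weeks_to_average)
```

On these assumptions a six-week dojo covers less than half of the quoted average.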
Why 6 weeks? Why are 6 week boundaries attractive? Well, because math. People are really bad at predicting the future, and therefore we have to develop means and mechanisms to understand how to break up time, which is a human construct as well. People have been dividing their time into smaller parts since the beginning of, well, time.
@mring - which is why quarters are useful - form - 6 weeks storm, 6 weeks storm/norm, 1 week retro, repeat
I agree with @mring - ideally it would be more than that amount of time - that was our minimum
we also saw cases of unlearning, where 20 hours spent in dojos, 20 hours spent at desk unlearning
I love this topic - I’ll be around anytime to discuss this more
Discover’s dojos are typically 4 to 6 weeks, with a hands-on and interactive experience that covers real-life scenarios which you wouldn't get from classroom-based training. The timeline comes from the selection of learning modules that has been crafted to meet the product team's needs. Dojos give product teams a thorough understanding of core concepts while leveraging their existing product backlogs for pairing and upskilling, creating an experiential learning environment that accelerates adoption in the team's backlog immediately. It's the combination of workshops in the morning, and pairing in the afternoon. Old habits take time to break as they practice new patterns to form new habits.
we tried 4 weeks, but we found teams felt more pressure to deliver something. 6 weeks gave them some more room to breathe and learn
software development is like repairing a tractor. sometimes you need to take a few bolts off to understand the problem
I particularly enjoyed @bryan.finster486’s talk on weaponizing DORA metrics yesterday — was absolutely fascinating!
Stories from three teams across some of the most important business units at Discover: • Priya Gupta, Sr Manager, Customer & Account Data • Joe Mathew, Sr Manager, Line Increase Request • Lakshmi Rupanagunta, Manager, Authorized User
@sheilalodhia we also put heavy focus on CI metrics at the WM Dojo. Code merge frequency and dev cycle time. Any thoughts?
Hi - we do have a bunch of team metrics we use in addition to the ones we highlighted today. Code merge, refactor, re-use etc. included
The metrics I presented were the minimums for all teams. The teams are free to measure with other metrics that help them solve their individual problems.
@marc.price Dominica DeGrandis (Tasktop) did some internal training for us; we also had product owner training and to some extent an internal agile coach.... I personally think we should have done more, ensuring more role-specific discussions about how this would work on a day-in, day-out basis. Having an agile coach working directly with each team would have cut the learning curve substantially.
we have a central team of enablement specialists that help with training and helping teams adopt new ways of working, sounds like we are heading in the right direction, thank you for sharing
We started using this to horizontally scale the knowledge of the goals. https://www.engineeringthedigitaltransformation.com/
Thank you for getting these testimonials from your teams, @angeldiaz — these were great stories from the actual teams solving their actual problems, as opposed to executives talking “Powerpoint to Powerpoint”. 🙂
I love the Discover team presentation with all the micro interviews embedded in their talk - Love it cc - @angeldiaz @sheilalodhia
I love the notion of “what objective evidence is there that a certain person is actually good at X, Y, or Z.” 🙂
If I may submit a patch, Gene:
“what objective evidence is there that a certain person is perceived by others as being good at X, Y, or Z.”
Those are two different but highly related questions
😆 Nice. Just wanted to have a test case that detects when someone is not actually good at something. (Or anything.) 🙂 😆
Exactly… the Dunning-Kruger effect is very real, and there are folks who are amazing at expressing confidence while actually creating problems due to lack of competence in ways much harder to detect quickly
Leaders and Product folks who get the chance to see and understand the value of dojos are the dream.
:thinking_face: I wonder if some here started to leverage Nicole Forsgren's (GitHub / Microsoft), Margaret-Anne Storey, Chandra Maddila, Thomas Zimmermann, Brian Houck, and Jenna Butler "SPACE" framework on measuring Developer Productivity. https://queue.acm.org/detail.cfm?id=3454124. Including during and after Dojos.
Converting innovation to invention - good thought - but need to make sure it isn't measured in a way that treats innovation not resulting in invention as bad. 80% of new ideas tend to fail
Thank you so much, Dr. @angeldiaz and @sheilalodhia!
Would love to hear more about the academy!
You’ll see us here next year talking lots more about it. And we’ll be sure to engage with this community as we talk more about the Discover Tech Academy in the interim.
:unicorn_face: Coming up next, our very own @genek and @steve773 with The Four Characteristics of Structure Needed to Get Great Dynamics
YES!! Dr. @steve773 ! Get ready for a fun presentation, all!!
This presentation represents how @steve773 and I are trying to prove that four characteristics of structure predict high- vs. low performance.
I'm afraid that @steve773 is going to help us drink from a firehose again.
Ooh, sounds like this will also be a good talk
Second consecutive DOES without bow tie for Dr @steve773. The beginning of a new era 😂
@angeldiaz @sheilalodhia thank you for sharing the Discover story! I've observed that learning culture is equal parts enablement with curated resources like internal academies but equally encouraging new behaviors for leaders & employees. Do you agree? What kinds of experiences has Discover had with the latter?
@kxbres agreed. It's the means and the ways. We equally coach on the adoption of the curated resources, alongside our core Discover behaviors that enable community, curiosity and innovation (maybe there’s an upcoming DOES talk on those ;)). Neither one can stand alone.
Great question! When changing how we work we tend to focus on the people within the teams and the steps they need to take… and never get to the leaders. This time, as we rolled out Runway, we started with leaders in making the changes to the product model to seed the ‘why’ and the problem-we-are-solving mindset. Additionally, DTA has created learning journeys for leaders to help them understand the concepts the teams are learning as they get upskilled on DevOps.
> What are the structures and dynamics necessary to unleash the distributed and collective human creativity and potential to compete and win, in an age being tumultuously disrupted by scientific and technological innovation, market transformations, and political and societal realignments.
>
> How and why over the last 150 years are some organizations able to generate and deliver better ideas, quicker, faster, and more reliably.
>
> How do they create this magical dynamic that high-performing organizations use to unleash and empower everyone’s innate human creativity and intelligence to advance business and societal needs.
This is the classic Dev vs. Ops dynamic. But it’s also Merchant vs. Ops dynamic described yesterday by @lucas.rettig. And for that matter, Team of Teams.
These two models of communication in the organization make me think of an email from Elon Musk to Tesla...
configure vs run for the role of the leader. that’s a powerful lens.
The suggestion is that the leader can look at the structure of the system, simulate it in their head, and predict how it will behave. Like how @jtf looked at MIT Beer Game, and immediately said, “That’ll never work! One way communications, slow feedback!”
@genek you reference the MIT beer game in your podcast and elsewhere frequently. Do you have a good source you like to reference for those who are new to the concept or want to see it in action?
email me at <mailto:email@example.comfirstname.lastname@example.org> — I’ll send you some show notes in the Idealcast (or just look there).
Who feels like that dude in the middle of the system? ideas, requests, inputs coming in from everywhere and outputs expected everywhere else. Holy 😱 Batman.
This is what was described in an Idealcast episode on modularity — the Eppinger Design Structure Matrix, where every node is connected to every other node. A completely full adjacency matrix.
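To make the "completely full adjacency matrix" concrete, here is a small sketch that builds a fully connected matrix and counts its dependencies. The function names are hypothetical and this is not taken from the talk or the podcast; it only illustrates how quickly the number of links grows with the number of nodes.

```python
def full_dsm(n: int) -> list[list[int]]:
    """Build an n x n matrix where every node depends on every other node."""
    return [[0 if i == j else 1 for j in range(n)] for i in range(n)]

def dependency_count(dsm: list[list[int]]) -> int:
    """Count the marked (directed) dependencies in the matrix."""
    return sum(sum(row) for row in dsm)

for n in (4, 10, 50):
    # A fully connected matrix has n * (n - 1) directed dependencies.
    print(n, dependency_count(full_dsm(n)))
```

The count grows with the square of the number of nodes, which is one way to read the overload on the person in the middle.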
@steve773 @genek simplification, standardization, stabilization, synchronization — is it more useful to start with one of these? or treat them as all inter-related and use them all at once?
The dude in the middle is not feeling Focus, Flow and Joy, I guess.
Thanks @jeff.gallimore In that order of simplify, standardize, stabilize, and synchronize.
I relate a lot to that middle guy and this is really helpful.
@jeff.gallimore by creating a simpler architecture, there are fewer standards to even create.
Cognitive load is quite a limiting factor for cognitive beings
Last year we were talking a lot about cape fatigue that superheroes get. I think the social structures and setting them into a more working condition is not simple enough. So, the people transforming burn out.
@jeff.gallimore without standards, it’s harder to recognize that you even have an aberration that demands stabilization.
i’m hearing echoes of shewhart and statistical control…
and normalization of deviance is much easier when there aren’t explicit standards
“how does the new person do their job, without any guidance on what is best known way to do that work?” (Reminds me of Kelsey Hightower story of being put on pager duty for an app that he had no idea what it was, how it ran, no access to repo, etc. So familiar to so many Ops people.)
We are having this conversation in my team now.... how do we provide guidance, encourage autonomy and enable creativity?
I prefer the term "Guidelines" vs. "Standard". It implies more flexibility to adapt to your team/situation vs. "One size fits all"
For me @steve773 talk is like - "You need to be this tall to ride" - He challenges the conventional thinking - Our Enterprise leaders to take a course with Steve 🙂
@jtf to argue against standards is to argue against choreography or musical scores. i.e., life will always be like going to a 3rd grade musical all the time everywhere.
Sometimes standardization tends to get frowned upon in the agile community. "We're agile, we don't do standards" 🙂
Counter-argument. "Here are all of the Agile Best Practices." #maximumPossibleCompliance
https://youtu.be/ZXXaCCbpNYw?t=1269 "Best practice"
I get the emotion piece, think the visual was making me think about org structure
Thinking about a distinction between a procedure and a standard in this context.
stabilization explains the electron andon cord from the mining truck repair center earlier.
So, the system needs to be simple, standardised, stable, synchronized... So creating that system sounds like not in the obvious domain.
without standardization people can still do Simple domain badly. 🙂
I think Snowden has changed the names - Simple is now Clear and Disorder is now Confused
The approach to the work needs to match the domain of the work. Which itself is not static.
Reminds me of Jon Smart's BVSSH: "Software is an agile created box on a lean conveyor belt." P32 🙂
I think the claim would be that the application of these 4 characteristics would produce a complex, adaptive system: a learning organization that can operate in a CAS.
When attuned to a dynamic environment rather than frozen in concrete
Want a complicated, adaptive system. A complex adaptive system means you don't always know what strings will be pulled when adaptation happens . Goal with observability is to move from complex to complicated so fixes can happen. I always like the ball of yarn analogy, complicated means you can pull a string and trace it but difficult, complex means the strings don't have a forecastable result.😀
Yes, I understand the desire, but VUCA environments tend not to cooperate with such desires. Ref the situation in Team of Teams for example.
Yep, for sure, but everyone argues for complex and adaptability and I'm not sure that is always achievable. Used to work with intelligence in the Air Force so one of the keys is getting those VUCA areas into the funnel of understandable possibility.
@ferrix “simple” in terms of the degree of higher order interactions. Simple is 1st and 2nd, complex gets you into 3rd 4th 5th … nth order systems.
I think the model of operation needs to be as simple as it can be not to add to the complexity of the surrounding CAS. That takes cognitive energy to simplify the model.
In our previous talks, we divide work domains into:
• Slow: cognitive, creative, planning, organizing, assessing
• Fast: operations, muscle memory, flow, practiced routines
In tech, teams love to work in Fast mode, in Flow, and in the absence of the right structures, teams are forced to work in Slow mode (meetings, escalations, etc.)
Rewatching during the break and I'm detecting some Kahneman influence. 😄 More emphasis right now on Thinking Fast and Slow than the Noise research. Now, I'm trying to get my head around what it means to be locally Fast. I was in a meeting this morning where given the same data, I can use Fast thinking but it would be Slow thinking for someone else on the same team. I think this means I need to make that so clear within my "partition" that someone higher in the organization receiving the input can reduce their noise and confusion about decision making.
That's part of the aspect of expertise (in a Gary Klein sense) - moving explicit conscious thought patterns to a level of instinctual response.
Ditto to @genek point above to @ferrix picture. We’re building complex adaptive systems. We increase the adaptability by protecting the intellectual horsepower to do useful problem solving by the architectural simplification and the dynamic stabilization.
We assert: leaders should be primarily in Slow mode; teams should be in Fast mode.
That certainly fits with the paradigm of IC <--> manager being a state change, not a promotion per se
I’m thinking of the old Starfish and the Spider - what about design sprints at the team level? Why can’t teams be in slow mode to discover directly from their customers?
Does this go against any of the science of motivation by Dan Pink, that is autonomy, mastery, and purpose? It seems like fast mode could remove some sense of autonomy?
i believe deming would agree with you. from “out of the crisis”: > “The workers are handicapped by the system, and the system belongs to the management.”
there is working on the system (slow thinking) and working in the system (fast thinking).
Don’t we want product teams to engage their cognition, creativity, planning, organizing, and assessing their work?
I like the debate going on in this thread and the nuance about Improvements in the Daily Work (slow) vs. the Daily Work (fast). Also about the difference between Product Discovery and Product Delivery. This is where I'd love to see more discussion and debate between the DevOps and Product communities about some of these concepts. Product Discovery folks are trying to figure out how to do Discovery faster and validate good ideas from bad as early as possible. But discovery work itself feels like it might fall in the "Slow" domain.
Exactly. 100% I’d love to hear @genek or @steve773 join in on that, and perhaps it is my startup days and product bias here.
@mring What I’d offer on this is fast feedback to inform slow (deliberative) thinking. Worst is slow feedback that triggers rushed thinking (because of little lead time) that is fast (habits and bias because no time to actually be deliberative). @nicole.forsythe
@mring @nicole.forsythe If this is something you’d have time to pick up, would be delighted to schedule a time. This has become a BIG deal in our discussions with @genek
@steve773 would absolutely be willing to chat more on this topic. I'll DM you my contact info and we can go from there.
@mring @annp @nicole.forsythe @genek Maybe we can find a time to speak once Gene and Ann have had a chance to enjoy their Miller Time over the weekend. Steve
I'd love to be a fly on the wall in that discussion when that gets scheduled!
When thinking the meaning of words agile, lean and fast, it does need structure. Without a bone structure a cheetah will be a blob of dotted soft tissue... not very fast 😄
I think Dr. Cal Newport addresses this in his book “A World Without Email”; he says leaders are not creating enough structure and process flows, forcing teams to live too much in Slack, email. (Haven’t finished it yet, but two people gushed endlessly about this book after I shared this concept with them.)
I coach my coaches to be (as Steve Martin said) “so good they can’t ignore you”
https://www.calnewport.com/books/a-world-without-email/ (He also wrote Deep Work.)
@genek That reminds me of the three-legged chair that a good leader stands on: one leg is personality that inspires, second leg is excel (or facts and numbers) and the third leg is structures that support reaching the goals with lower states of energy. Courtesy of a friend and former colleague.
@genek love the netflix chaos example… do we have any speakers from Netflix?
So sad, so true. @jwillis was just talking about this in another channel
Bookmark: several folks here are from FAANGs or FAANG-like. cc @annp
I specifically mentioned Netflix because “chaos monkey” and because Dave Hahn at least superficially saying they don’t do DevOps (https://youtu.be/UTKIT6STSVM)
Example 2: Bad Tightly Coupled Apps (LESS GOOD) — I so much appreciate in this community I can (mostly) get away with shorthands like Havens, Yakomin, things mentioned in Idealcast. 🙂
High cohesion, low coupling needed: interesting to see this example of tight coupling
@christina_yakomin — you’re now shorthand of a bunch of concepts. 🙂
There is no difference between a local problem and a system problem. The local will spread to become system and even if you catch the problem, you have to shut down the whole system anyway to fix it.
I've loved the stories by @steve773 but I think this is my personal favourite.
the technical system and the organizational system as an overlay…where I have latitude and where I have dependencies….
Stop the pipeline and swarm the problem = the famous Andon cord (I guess?)
@genek @steve773 Nice clean graphics useful for illustration. Typo for next time?
this is what I love about DOES - we thank people for catching or sharing mistakes
“tightly coupled, loosely controlled” < nice summary that can be used lots of places
I think that still beats tightly coupled and tightly controlled 😄
Computer Science 101: High cohesion and low coupling. Bounded contexts 🙂
@jonathansmart1 How about control in that? Maybe a lot? 😛
Low coupling requires federated, delegated controls. Control Objective is common and not implementation specific. How the Control is implemented can vary thousands of ways. And signal detection (which doesn't impede flow) to spot deviations from the Minimal Viable Compliance
in Phoenix Project: “we’re 1 hour into a 9 hour database conversion, that we thought would take 15m” 😆
the absence of agility even in the presence of functionality is a fatal flaw
^^ @steve773 When cost of change is unacceptably high
the absence of agility even in the presence of functionality.. @steve773 please repeat that?
I think the Andon is rare. I have not seen zero bug policy in a lot of places or even a threshold for stopping and fixing.
Yeah, pull the cord to invite top level micro management.
Not even pharma has a zero bug policy :thinking_face: (and those bugs can result in serious harm)
Like to focus on the human element here. The relationship between people.
@gerijotoole We can create the super duper greatest system that has fantastic functionality…today. Then, in the next moment, the environment changes. If the system is agile it can adapt and adjust and maintain its relevance. Without, it quickly loses its relevance because it’s less and less well tuned towards doing something useful.
I get the sense there are a bunch of anti-SAFe people here, so I might cause a firestorm, but my intent for asking this question is to gain understanding. Does the existence of Agile Train provide synchronization, by creating dependable sync points?
I think it would depend on how resilient, learning, and “agile” your train is
I would say that yes it does. Where that is needed. i.e. there is a high degree of coupling. Ideally, people break the dependencies, have low coupling and don't need to synchronise teams, products and releases
e.g. it might be a good pattern for building a satellite. It is quite likely not a good pattern for a large organisation with thousands of unique contexts
#allframeworks not #noframeworks. If it works in context, use it. And treat it as a departure point, not a destination
Yeah, I think that the word “release” is the key word there…in my experience the process and structure of releasing IS the problem, and like Smart says, “Ideally, we break the dependencies,” if the train is the endpoint, it’s not enough.
i.e. use any framework you want to, if it helps and if it suits context. Not a case of "all frameworks are bad"
oh, okay. Where I really need to grow is to understand the conditions when SAFe is good for an organization and when it might not be good. As a person who wants to help AF orgs grow in their lean-agile transformations, I want to be able to better advise orgs in their next step, not just sell them training.
I want to read more and more of the "Heart", "DNA", "Source" info, so that I can help people solve their transformation problems.
@nicole.forsythe, your book is on my Kindle and Audible. Thanks for the research based help. I am going to read through it a third time. @jonathansmart1, would your book help in this journey?
@brian.m.smith yes. Generally, it's a collection of patterns and antipatterns. In chapter 3, specifically, there is a comparison of frameworks
Brian I’m not Nicole Forsgren, tho I do admire her work. I’m unpublished, so take it all for what you paid for it! (Forsythe and Forsgren are VERY close.)
“centralized control system could not keep up with the pace of reality” — consequence is that entire system eventually shuts down.
Has anyone experimented with the Haier org model? https://corporate-rebels.com/rendanheyi-forum/
Centralized systems can not keep up with the pace of the real world
The good part of Soviet planned economy in 5-year terms would be that we could have been fine with 5 years of Twitter and then none.
"why nations fail" is a great book about those two slides, @genek https://www.amazon.com/Why-Nations-Fail-Origins-Prosperity/dp/0307719227
I love the idea of synchronizing on goals/outcomes rather than time (which is one of the first things I think about when I think of sync).
everyone was trapped in their functional silo, unable to work with their peers in other functional silos — structure wouldn’t enable it.
@vmshook in terms of a society not tuned to getting feedback there’s the example of the Soviet Navy’s experience with reactor propulsion vs that of the USN. Attaching chapter 5 of my book which has case about Naval Reactors program.
adding it to my reading list...it's always insane after DOES
Aah the Facebook structure! 🙂 Reminds me of what just happened with Facebook!
This is one of those presentations that I will need to listen to a few more times
You bet. Let’s talk 1x1 about what resonated and why. Would love to hear what you’ve got in mind.
Seems like this aligns with Dekker's thoughts on Safety Culture where you empower people to do the things you have hired them to do, without fear.
Have to say I’m loving the virtual format and live Q&A with the speakers.
Thanks so much - great interaction to the common goal of a great presentation
Hey @jeff.gallimore - Nobody cuts @steve773 off! (Or maybe that was just my feed). 😆
My feed too! A very densely edited piece. Hard to pack so much into such a short time.
imagine all the great stuff they left on the cutting room floor!
Long version with Gene break-ins like Idealcast for a #watch-party win!
I really love getting to see questions, answers and notes from others
Transitions between talks are so fast!! My poor cognitive gets overload :rolling_on_the_floor_laughing:
I could listen to @genek and @steve773 all day. So many aha moments
Thanks Glenn. Would be delighted to touch base, you, me, and Gene, to see what ideas resonated, why, etc.
A new spin on all day devops. I’d love to join that!
for sure. I face so many challenges with teams that don’t understand DevOps (and DevSecOps for that matter). Being able to give them real examples within a model / framework that shows what’s good and what’s bad is key.
You can get the new paper @suzette.johnson5 is talking about in the Fall 2021 DevOps Enterprise Journal (sponsored by LaunchDarkly). It's free to read here: https://myresources.itrevolution.com/id006657131/The-DevOps-Enterprise-Journal-Fall-2021?_ga=2.157584826.1882764989.1633366383-1037911749.1592589043
@mr.denver.martin That’s right. In absence of safety, people can’t call out issues, so no stabilization, so problems persist and escape.
@genek - Do we extract these Slack conversations into some kind of system to produce reference materials for our delegates? This is becoming a high-priority feature request for our conference now.
@christian.lefevre You bet. Looking forward to learning what resonated and why and where. Let’s find time with Gene to talk with more bandwidth. Cheers!
I am looking at making Safety Culture one of the Keys of Dissertation on Implementation of DevOps in organizations, trying to narrow down my topic still. Open to input @steve773
Let’s find time to talk. Would be delighted to learn what you’ve got in mind. Steve
Sure, I know you are very busy, I can work around your schedule... I am in the first year of a 3-year program, so I have an idea but still have a lot of work to do.
I am attending Crummer Graduate School, Rollins College. EDBA program.. https://crummer.rollins.edu/
Can u meet in Gather so interested folks can listen in?
I am open to meeting in Gather - I think we just have to work out a time to meet...
@steve773 - your comment - its about connecting people not work centers REALLY resonated.
@steve773 have u studied the incentives that seem to work against safety where decisions are made to leave defects in place because the cost of settling the lawsuits is less than cost to fix with respect to profits over the considered time period?
That seems, in considering things from an SRE perspective, to be similar to the "appropriate level of reliability" - 100% is never the right level
@drew.boyer I do recommend a read of Taichi Ohno’s ’88 book on The Toyota Production System. One would think it’s about material through machines. It’s really a philosophical treatise on creating a positive environment for people.
Thought this note would be of interest…vis a vis designing human oriented systems… https://conta.cc/3w6hbnu
@ojacques2 are you collecting bookmarks and notes for this conference like the last one? (https://docs.google.com/document/d/17nHB3aDOsUI2f3W5EShyoFQiK7NMwIJ8QFkUCANVPbk/edit)
Reminder: The action has moved to the breakouts! Join the following channels to interact with speakers live while their talks air: #ask-the-speaker-track-1 #ask-the-speaker-track-2 #ask-the-speaker-track-3 #ask-the-speaker-track-4
@sanjeev.sharma It is refreshing to hear someone else saying these things. Silos work to protect themselves way too well.
@sanjeev.sharma and @john.comas I have worked with rolling out DevOps at both large and small companies, the hard part is finding the space to set things in motion... the way to find the space is different in both...
@john.comas I have told my peers to use what we have before including some shiny new external tool. I so much agree with your statement on the presentation.
Yes, we often see it as having an operation while still running a marathon.
I'm curious, what guardrails, if any, are there to prevent a team from driving up infrastructure costs? Are those costs reflected back somehow or is it kept in a centralized bucket of money?
Working on chargeback and show back models. Based on a person's role that makes the request
@sanjeev.sharma I completely understand Test Data Management for Integration tests, Performance Tests, Smoke Tests, etc. But, can you explain the need for Unit Test Data Management? In my understanding, Unit Tests should not be updating data in the first place. Input data for Unit Tests facilitate testing the path through the code and the interaction between components.
Your understanding of unit tests matches mine. The unit tests by nature should be self encapsulated to the atomic units of function we are testing
@john.rowe we have used tagging in AWS and a tool called Cloud Custodian https://cloudcustodian.io/ that will turn down systems based on tagging so they will not get left around...
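A minimal sketch of the tag-driven cleanup idea in the message above, written in plain Python rather than Cloud Custodian's own policy format. The `Instance` record, the `owner` and `expires` tag names, and `instances_to_stop` are hypothetical, chosen only for illustration; a real setup would read tags from the cloud provider.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Instance:
    """Hypothetical stand-in for a cloud instance and its tags."""
    instance_id: str
    tags: dict = field(default_factory=dict)


def instances_to_stop(instances, today):
    """Return IDs of instances with no owner tag or a past expiry tag."""
    flagged = []
    for inst in instances:
        expires = inst.tags.get("expires")  # assumed ISO date string
        if "owner" not in inst.tags:
            flagged.append(inst.instance_id)  # nobody claims it
        elif expires is not None and date.fromisoformat(expires) < today:
            flagged.append(inst.instance_id)  # past its agreed lifetime
    return flagged


# Illustrative data only, not taken from the discussion.
fleet = [
    Instance("i-001", {"owner": "team-a", "expires": "2021-09-30"}),
    Instance("i-002", {"owner": "team-b", "expires": "2021-12-31"}),
    Instance("i-003", {}),
    Instance("i-004", {"owner": "team-c"}),
]
stale = instances_to_stop(fleet, today=date(2021, 10, 5))
print(stale)
```

The point of the design is that the tags carry the policy, so cleanup needs no per-team list of what is safe to turn off.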
Thank you @sanjeev.sharma @john.comas! Loved the presentation! btw about #3 - I found it a combination of mandates + "build it and they will come" .. very relatable!
Per your question on mandates, I've always been of the mind where you need to build something people want and find useful so you don't mandate it. Now, once you've proven out the value and shown success, you can talk about mandating. I never like to see mandates in absence of proven success though
Leading companies use an interesting blend of "choose to use" and peer pressure. "You're going to reinvent that? Really?"
Sanjeev, Have you invested in automating the Platform installation/setup using DevOps practices and have you introduced self service in this space yet.
Still early there @anil.atri. We are piloting a portal as we speak. And yes, we will deliver the platform as a private PaaS
hey @jeremy.castle.lvh0 and @ryan.chambers.rse9 did you use the shopping/cashier analogy internally when you were pitching this work?
We kind of stumbled upon that later. I secretly funded the work within my suite first. This is how we are selling it to the org now.
I love that "team 3" who were DevOps tool pros but didn't bother to give the Ops team a heads-up on a major release.
@rajat.sud Can you give us a high-level idea of what a typical "low score" leadership investigation looks like, especially in cases where course correction flies in the face of priorities?
We struggle with this - but that 'buy in' is important. Also, as the CoE, one of our most important mandates is to educate our Business Community on the value-add (be it CI/CD, test automation, Securitization) that DevSecOps adds to their product
Asked another way - Is product ownership generally in favor of prioritizing these score corrections?
Hi, all. Looking forward to presenting about eBay's Velocity Initiative in a few moments ...
Reminder: The plenary sessions are starting again in 5 minutes. Start making your way back to your browser. https://devopsenterprise.slack.com/files/UATE4LJ94/F01D34MC2KS/image.png
⚡ Let's get ready to welcome @rshoup and @markwei from eBay, here to present Driving a Tech-led Reimagination of eBay Through DevOps ⚡
Please welcome the team from eBay! VP Engineering and Chief Architect @rshoup and VP of Core Product Engineering @markwei! They’re loading up the film into the projector — will be starting any second now! 🙂
(If you don’t see this talk playing, please refresh your browser! 🙂)
I’ll kick this off with a very important question. What’s the best strategy for sniping an auction on eBay at the very last second? Technologically-speaking, of course.
Set what you are willing to pay up front, and the system will 'snipe' for you up to your limit
Everything we have done and learned, we learned from this community, and it's such an honor to present at my very favorite conference!
eBay: $10.3B, 159MM active buyers; 19MM sellers, 13K employees!
“together, we can help teams unblock each other, permit teams to take more risks than they otherwise would… and also set mandates, when that’s required.” 😆 — @rshoup
Value Stream Mapping was a tremendous tool in showing us the bottlenecks in the overall system. No substitute for seeing the "whole board"
How long did your VSM take? How big was it? Any particular challenges?
Quick and targeted with 3 individual teams as a cross-section. We already started hearing the same things.
I have seen many cases where crazy percentages are spent in wait >50%
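A small sketch of how the share of time spent waiting in a value stream could be computed. The step names and hour figures below are made up for illustration and are not eBay's data; `wait_percentage` is a hypothetical helper.

```python
def wait_percentage(steps):
    """steps: list of (name, work_hours, wait_hours) tuples.

    Returns the percentage of total lead time spent waiting.
    """
    work = sum(work_hours for _, work_hours, _ in steps)
    wait = sum(wait_hours for _, _, wait_hours in steps)
    total = work + wait
    if total == 0:
        return 0.0
    return 100.0 * wait / total


# Illustrative value stream: hands-on time vs. time sitting in a queue.
value_stream = [
    ("code", 8, 4),
    ("review", 1, 24),
    ("build and test", 2, 6),
    ("deploy", 1, 48),
]
pct = wait_percentage(value_stream)
print(f"{pct:.1f}% of lead time is wait")
```

With numbers like these, most of the lead time is queueing rather than work, which is the pattern the message describes.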
So many issues across every phase of the product lifecycle. Can't do it all at once. Where to start?
I loved hearing hints about the stack at eBay. I meant to ask, what are the most used languages at eBay these days? (And I loved hearing about the improvements in build and startup times!) @rshoup
The improvements in startup times must have been so appreciated and exhilarating!
The dev teams are particularly excited about the build time improvements, so we rapidly went from pilot to full rollout everywhere.
“dysfunctional experimentation” < it wasn’t for lack of tools. I’m guessing lack of mindset?
huge dependencies among teams. when teams r stuck in wait states, what do they do… find something else to work on. that creates more wip. more wip == bad.
“we are focusing on the delivery pipeline, because if we improve this, everything else becomes easier.” (Seriously, I love how @rshoup talks about thinking in systems!)
Theory of Constraints, baby! Any improvement not at the bottleneck is an illusion, as someone said (@genek)
“10% of eBay applications are in our pilot; focusing on short-term wins and long-term capabilities” @markwei
We kept asking ourselves how we could take a massive problem and make it a targeted, actionable problem
The 6 words above (TB, SS, LF) are words I use several times a week 🙂
@rshoup That was definitely from Dr. Eliyahu Goldratt, but that quote was surprisingly hard to find: it was in the Beyond The Goal audiobook.
I like hearing about the architecture community taking a lead role in driving Developer experience and velocity. Great broader conversation on the role of architecture as a catalyst for DevOps and developer best practices.
As Chief Architect, that's my bailiwick. And also, I had to recognize that unblocking software delivery was the only way we were going to get to architecture improvements. How can we change the architecture if we can't deploy code reliably?
I think that is great alignment....I need to take that angle at Nationwide with my architecture friends
Wow. @markwei are you saying that in payments, it’s actually even more important to deploy quickly? (e.g., respond to fraud, etc?)
I’ve never heard that before — that’s so cool. Usually I hear, “it’s PAYMENTS. We have to be super careful!”
ha ha… more frequent deployments with smaller batch sizes == “super careful”
Ha! That "Its payments!!" is a daily conversation I have :)
Similar to @bryan.finster486’s talk yesterday, asking "why can't we deploy today", we kept asking the same: if we were to ask you to deploy every day, why wouldn't that work. Now we have a list of impediments to remove ...
Used that on a team today. Found out they didn’t know about feature flags.
Exactly! Why can’t we ship today? https://podcasts.apple.com/ke/podcast/why-cant-we-ship-today/id1327456890?i=1000440026763
unblocking dependencies < I love to hear those words
Gamification really helps, if the culture is good.
My pal @justin124 joined you guys. Awesome guy. My last role at Walmart was leading the DevOps Dojo. It sounds like y’all are trying something similar. I’ve a bunch of learnings from that if you’re interested.
There’s something I did at Walmart to help spread the knowledge of the goals in a scalable way. Happy to talk about that strategy as well.
The impediments the product engineering teams identified gave us the roadmap for the platform and infra teams
“remove impediments to flow” : :thumbsup:
BTW, listening to NPR Morning Marketplace, I now always hear ads from eBay Technology, recruiting engineers. Wild.
i love the approach of getting explicit about what exactly is in the way.
Training and education are key to this. Most of the engineers have never experienced a CD situation.
This really got the attention of the execs -- we can double the productivity of the existing teams with no new resources
eBay is an interesting place. I've never been somewhere with people who've been at the company for 10-20 years. And there's a LOT of them.
Yeah, but there are still a lot more than I encountered at Walmart, but that might be west coast bias.
Yeah, I was a “kid” I was only there for 19 years.
“the same teams are now delivering TWO TIMES MORE features and defect fixes”. Huge. What an amazing finding and conclusion! 🙏 🎉 Thx @rshoup
Several are here at the conference. We assign them to a team, and they figure out with the team how to help.
I've embedded with our Payments/Identity group. I mostly operate in service to those teams to help them however they need. I also serve as a buzzing mosquito to remind them to push the envelope and "we'd like to get to a place where..." etc
Haha I only do diagrams - kidding. Definitely involved in coding. It very quickly helps me see where the issues are, and helps me establish a trust relationship with the team
In the past 4 months, I've worked on: • hey, here's a more-efficient way to write integration tests • Our config is confusing and leads to errors. Here's a framework to test it. • We seem to have a gap around feature flagging. Let me go investigate how to fix that. • Why are we running sprints like this? Let's talk about that! etc
But I have also been involved with management and QE suggesting new approaches and pushing the envelope.
This is really great, I'll go sign our architects up for a coding bootcamp 🙂
Here are some of the things I've helped with: • Teaching TDD and coding practices • How to improve/automate the checks we do as part of a deploy, such as localization and accessibility testing • Improving how we do code reviews • Introducing trunk-based development
There's someone who is a representative to the velocity program for Payment/Identity. He pointed me to a few teams who could use my help, so I'm there to help them.
I'm assigned to one team, but I also drop in to meetings with other teams when they have questions
That sounds like a role we're trying to create, we just call it a Principal Software Engineer
By "team" I actually mean I work with an overall group - I'm working with the View Item team
Yea, that's basically the same thing different name
All of us architects/principals also meet multiple times a week and share what we're doing and discuss approaches
Hearing the focus on PDCA, it makes me think of the whole effort in terms of Toyota Kata. Was that a point of reference?
@rshoup @markwei Have you ever run into “we can’t change that much because we don’t want to confuse the customer with all of these changes” type of thing? Is this a problem?
different problem from velocity but we do run into that some. but we have been working to decouple deployments from that. we can deploy quickly behind feature flags and rollout as desired.
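A rough sketch of what "deploy quickly behind feature flags and roll out as desired" can look like. This is not eBay's implementation; `is_enabled`, the flag name, and the hashing scheme are assumptions used only to show how a deployed code path can stay dark until a rollout percentage is raised.

```python
import hashlib


def is_enabled(flag, user_id, rollout_percent):
    """Deterministically place a user in a 0-99 bucket for this flag.

    The new code path ships with every deploy, but only users whose
    bucket falls below rollout_percent actually see it.
    """
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout_percent


# Hypothetical flag: deployed everywhere, released to roughly 10% of users.
users = [f"user-{n}" for n in range(1000)]
exposed = [u for u in users if is_enabled("new-checkout", u, 10)]
print(len(exposed), "of", len(users), "users see the new path")
```

Because the bucket depends only on the flag and user, a given user keeps the same experience between requests, and raising the percentage only ever adds users.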
I tend to think that the answer to this is “we want to be able to deploy fixes and minor changes quickly, but let’s not deploy a major refactoring of the UI every day”.
Curious how folks capture data to track change failure rate? Is it manual entry when there's an incident or bug? If automated, how?
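One possible shape for an automated answer, offered only as a sketch: assume each deployment record carries a flag that is set when an incident or rollback is linked back to it. The `Deployment` record and its fields are hypothetical; how the link is made (incident tooling, rollback events) is left open.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical deployment record."""
    deploy_id: str
    caused_incident: bool  # assumed to be set by linking incidents or rollbacks


def change_failure_rate(deployments):
    """Fraction of deployments that led to an incident or rollback."""
    if not deployments:
        return 0.0
    failures = sum(1 for d in deployments if d.caused_incident)
    return failures / len(deployments)


# Illustrative data only.
history = [
    Deployment("d1", False),
    Deployment("d2", True),
    Deployment("d3", False),
    Deployment("d4", False),
]
cfr = change_failure_rate(history)
print(f"change failure rate: {cfr:.0%}")
```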
“I was shocked that my build time was 1 hour. We worked at it, and now it’s 4 minutes.” 🎉 “1 hour 45m to validate PR; now down to 10 minutes.” So great, @markwei!!
We use open source tooling, and hope to open source some of our internal tools over time.
@rshoup - did some of the improvement in speed and efficiency include moving folks from older build and deploy processes and tools to the open source tools you were referencing above?
It wasn't about older tools vs newer tools, but adopting the tools we already had, and improving them. Hope that makes sense.
“aspiration: one-click deploys for all our applications” @markwei
I run my tests when my dependency changes. Contract testing is the solve ultimately.
the Marines and I talk about the meaning of 'done' all the time
AH! Yes. The definition of done could actually be a great talk for next year at DOES!
“weekly mobile releases have vastly improved our quality, because there’s not a rush to jam features in.” I love this phenomenon!
And this was much easier than anyone expected, including and especially my mobile release team 😉
smaller batch sizes with more frequent delivery… lean FTW!
“Excitement and Fun”. “People not in the pilot want in!” What a cool testament, @markwei!
Good thing there’s two of you! You have a lot of questions (which is great)!
this partnership has made all the difference in the world. teams see our alignment and follow.
i would love more about how others deal with Hot-Fix and impact on next sprint...
“CD” is always my answer. 😄 We design for emergencies and use it to deliver features.
Yeah, we have an issue where the team that should be doing grooming for the next sprint is the one that gets pulled into the hotfix, so the grooming does not happen as well as it should... So we end up with more issues at deploy, then have more hotfixes, so it is a never-ending spiral.
Happy to talk a little about that at the bar later, if you’re free.
yes, should have time after the closing remarks, But need to drop at 6:45 pm EDT...
Culture is a second order effect < focus on being better at performance, on learning, and culture follows. That’s been my experience.
which is good! if you had to fix culture first that would be really really hard. 🙂
per Jez in Accelerate: change culture by changing behavior, not the other way around
Culture comes from the top, as I've experienced very directly in every place I've ever worked 😉
A colleague NPS score might be interesting? "I would recommend our ways of working to a colleague or friend". Interesting to see the trend over time. It's usually a Dunning-Kruger curve (up, down, up).
you beat me to eNPS comment
Also, it's possible to improve on the DORA metrics by working people harder (unhappier, unsustainable, burnout)
Good idea. We have done a developer survey and plan to do more. NPS is one of the questions.
Idea: measure quality (better), value, sooner (flow), safer (compliance, GRC), happier 🙂 #BVSSH
No pressure. There are only two initiatives that are reviewed every month by the exec team -- our massive multiyear long move to mediated payments, and this work
“Our CEO Jamie Iannone is a huge supporter of our work, constantly mentions our work in the Town Hall… and tells us to go faster.” 😆
We have always encouraged Cross Training between QE and SE. Some people think they are too good to cross train
Google had a few programs like this where you did a 'tour of duty' in a different role. Was generally well received.
Interesting. Any blog I can reference? It's an uphill battle most of the time.
focusing too much on the metrics, gaming the system, …
How to get them gaming it in a way that helps the goals? 😄
As @nasello.scott says, “hacking the biggest undocumented API”
Positive metrics showing that we aren’t taking chances.
speaking of goodhart's law https://twitter.com/scottnasello/status/1400918583023079424?s=20
Say more about what you mean? We are laser-focused on delivery at the moment, and next year we will introduce Flow Framework for a broader perspective.
“Hey, Randy, I know YOU think you’re doing the right thing, but…“. 😆
How did eBay get to 2020 without launching a program like this? Or did they try & abort previous attempts?
My theory - we have leadership in place now who "get it" - Mark, Randy and Jamie
It's not like we have a ton of folks working on this. There are maybe ~6 of us working solely on this? Echoing DVC, I think having platform + product engineering 'joined at the hip', is hugely helpful in aligning incentives.
It takes vision, commitment and courage to implement an effort like this
Smaller resource investment, but people who had done it. Work smarter, not harder 🙂
“What’s our North Star?” < echo of Toyota Kata again
Please help us figure out how to scale. Really interested in experiences around going big and wide.
On your slide two slides ago you said something like "Improving the eBay outcome". Could you restate that here in the chat?
Not sure I know exactly what you are referring to, but you can download the slides at https://videos.itrevolution.com/watch/624410229/
Boom, CEO vid
My heartiest congratulations to @rshoup @markwei for pulling this off! 🎉
Very stimulating talk Randy and Mark, thank you. I’ve added it to my list for rewatching and sharing later. Fun Coincidence: The used Apple Watch I ordered via ebay just arrived
💡 Bringing us to a close today is none other than @ronwestrum, here to share more about Information Flow Cultures 💡
If you start features with a benefit hypothesis, "If we do this feature, we expect this outcome", then determine how you can measure that benefit in production. One example is A/B testing. There are experts here who could explain the practice better.
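A minimal sketch of checking a benefit hypothesis with an A/B split. The visitor and conversion counts are invented for illustration, `conversion_rate` and `relative_lift` are hypothetical helpers, and a real experiment would also need a significance test, which this omits.

```python
def conversion_rate(conversions, visitors):
    """Share of visitors who completed the measured outcome."""
    return conversions / visitors if visitors else 0.0


def relative_lift(control_rate, variant_rate):
    """Relative change of the variant over the control."""
    if control_rate == 0:
        raise ValueError("control rate must be non-zero")
    return (variant_rate - control_rate) / control_rate


# Hypothesis: "If we ship this feature, conversion goes up."
control = conversion_rate(200, 10_000)   # existing experience
variant = conversion_rate(230, 10_000)   # experience with the feature
lift = relative_lift(control, variant)
print(f"relative lift: {lift:.1%}")
```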
We have a pretty large experimentation effort here. Not evenly applied, but the capability is there.
For the velocity work, we do have a hypothesis about each of the platform improvements, and each of them have success metrics (e.g., build time, startup time, etc.). Justin is right about customer experimentation; it's uneven.
Hey @rshoup @markwei - you forgot to say if you are hiring
Thanks eBay team....took away some great action....already chatted with our VP of Architecture
Watch out Randy! Steve used to manage 200+ dev teams at Nationwide, and then he was asked to take over all of I&O at Nationwide. 😆 He’s speaking tomorrow — amazing amazing story.
Up next: Please welcome Dr. @ronwestrum, whose work has influenced the work of so many of us!!
Idealcast Episodes in the unlikely event someone hasn’t heard them yet: • https://itrevolution.com/the-idealcast-episode-17/ • https://itrevolution.com/the-idealcast-episode-18/
Hey Ron! I'm halfway through Sidewinder. Awesome and Inspiring.
I think you generated the phonemes when naming the model @ckissler
Incidentally, during the @allspaw recording session, he and I were arguing about which @ronwestrum paper was the best. Requisite Imagination vs. Information Flows: A Personal Journey.
I reference Westrum every week in class. I am so happy to hear you present this directly.
I actually ask about flow of information when I interview at new companies... and then follow up with are they looking to change that flow....
so often communication is done from the perspective of what works best for the communicator.
We heard from Captain @emily356 in the amazing workshop this afternoon: leaders can reward these behaviors: “when I make mistakes, I tell everyone, because that’s who I am”, reinforcing these desired norms.
Not mentioned here, but so profound, when combined with Technical Maestro: Rabinow’s Law: “if the boss is a dope, you will have, or soon will have, dopes all the way down.”
@genek I'm still waiting for someone to define the Inverse Rabinow's Law.
@nuwayser "When we hired the new CxO"... 😉
but if the boss isn’t a dope but people think he is… what then?
Gene...I shared the Dope story you gave me with my team that day. Lots of fun out of that. But more important the traits of a leader who isn't a dope which they really loved.
The two in combination have tremendous explanatory power: can explain great and horrible orgs, and why
and you just put a pair of words to an attribute that I’ve been trying to describe. That pretty much covered the cost of the conference in one line. Thanks!
i can see how important being a “technical maestro” must be to the ability to have “requisite imagination”
But isn't that what leaders are supposed to be doing? Looking down the road, identifying impediments, and clearing as much as they can?
Westrum’s Law: “The higher the stakes, the rougher the play” ouch.
The mind expanding manifestation of requisite imagination for me: @rdaitzman yesterday walking business partners thru what could go wrong, and inviting discussion of response/etc, and elevating risks.
Lesson, don’t let your company acquire other companies 😂
Yes, team, not family
Sounds like when the families combined, it was more like the Gambino family
(thanks to @ronwestrum, I’ve watched 9 hours of Boeing documentaries. admittedly some parts at 1.8x speed)
So when a company goes through a M&A or major leadership change, there's a major cultural shift
The story of Boeing’s culture change after the merger is terrifying.
They are lucky that Starliner didn’t make it on the same level as the MAX.
That seems to happen far too often unfortunately. 😞
"family" culture can often mean toxic -- unlike what I think the "good family" culture can be which is a performing/generative team
True. I wonder how many people have a negative context around the word family.
Curious about this too. In his book No Rules Rules, Netflix CEO Reed Hastings talks about creating a culture less like a family and more like a professional sports team. I'm very curious how folks from Netflix would respond to this, as well as what kind of culture Netflix would be classified as under the Westrum model.
I’ve definitely heard of some shady companies using the “family” angle to avoid fairly treating employees.
I think lots of different analogies can work, because the magic isn’t in the words, it is in the shared understanding, and does it generate the “will to believe”.
I think organizations should recognize their culture explicitly and then the relationship between employee and organization can start from a place of trust. If we see these topologies as analogies and not intentional cultivation then there isn't collective ownership over the culture
Completely agree @meghan.glass. It isn’t enough to put the words in a powerpoint, it is a shared mindset you need to build.
unfortunate loss.. when culture not explicitly stated
Yeah, agree that genuine connection is critical. I've seen it come at the expense of team performance/business needs, which is when it's no longer working/generative..
how can they get this generative culture back, when they're now renowned for their planes falling out the sky? 😞
Culture is something most don't quite understand since it's thought of as hard to measure objectively
Curious about what kind of questions should be asked to find faults in a company culture.
i’ve used @nicolefv’s work in “accelerate” with much success. she is a surveying aficionado.
this paper has the survey instrument included: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2681909
I love that, collective efficacy
employers that actively avoid surveys or bury signals are likely pathological...
I love that the State of DevOps Research validates the Westrum Organizational Typology — @ronwestrum
Promises kept can be such a hard part of being a middle manager if the leaders above you keep shifting priorities
is it wise to expose the leaders' inability to stick to a plan to those lower down? 🙂
“all the smart people at Boeing were assigned to work on Supersonic Transport project.” 😆 (leaving few for 747 project.)
love that the leadership secret here is "make people feel better"
After watching 3 documentaries on the 747, here’s the one I’d recommend that was produced by National Geographic, coincident with recent launch of 747-8. https://www.youtube.com/watch?v=845w8O4T9v8
The original 747 project is nuts: 75K engineering drawings. Can’t even comprehend how one designs something like that.
I so distinctly remember seeing the “upstairs” in the 747 for the 1st time
this reminds me of Kent Beck 3X. After proving your idea/MVP, everything is a threat to your survival.
@ronwestrum - back to tech maestros ... Does the tech maestro create the culture? or Does the culture reveal the tech maestro?
Reviewing my notes: each 747 engine costs $20MM — 4 of them. $80MM. So they attach at very last minute, so payment made at last possible minute — minimize time before assembly and sale to airline!
That factory picture reminded me of this scene in Idiocracy... 😝
Reminder: Please submit your feedback for the talks you attended. It’s so valuable for us and the speakers. And after all, feedback is a gift and sharing is caring! Enter your feedback for those talks here: https://events.itrevolution.com/virtual-agenda/ https://devopsenterprise.slack.com/files/UATE4LJ94/F02GHSEB604/feedback-does21us.png
Sorry, @ronwestrum!!! 😆 Just wanted to prove to you that I did my homework! 😆
This talk is an emotional rollercoaster!!
“employees felt that management were preventing them from doing the right thing”
.. when you’ve just joined a new org and see so many similarities… :woman-facepalming::skin-tone-2: where do I start to help?
@allspaw is going to pop up and say adaptive capacity!
I love the "circle of faith". Takes Product Management (being customer-centric) to a whole other level
“saved their lives” sounds so much like the Toyota philosophy of not building cars, but enabling people to transport their families to soccer games etc.
This framing of a 'circle of trust' is really wonderful. I'm going to rewatch this many times, I predict.
Thank you so much, Dr. @ronwestrum — we so much appreciate your work!!!
how do you give a virtual standing ovation??
Thank you, @ronwestrum!
stand up clap clap clap clap
Did you hint last DOES that there might be a new book to be on the lookout for?
@ronwestrum will you be hanging out with us in Gather or Slack during the rest of the conference?
@ronwestrum hopefully i'll get to join a Gathering with you sometime (when you're not teaching 😂)