This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-06-24
Channels
- # ask-the-speaker-track-1 (423)
- # ask-the-speaker-track-2 (320)
- # ask-the-speaker-track-3 (405)
- # ask-the-speaker-track-4 (68)
- # bof-arch-engineering-ops (6)
- # bof-covid-19-lessons (6)
- # bof-cust-biz-tech-divide (10)
- # bof-leadership-culture-learning (8)
- # bof-next-gen-ops (17)
- # bof-overcoming-old-wow (1)
- # bof-project-to-product (5)
- # bof-sec-audit-compliance-grc (13)
- # bof-transformation-journeys (10)
- # bof-working-with-data (24)
- # discussion-main (1276)
- # games (69)
- # games-self-tracker (3)
- # grc (1)
- # happy-hour (189)
- # help (166)
- # hiring (12)
- # lean-coffee (20)
- # networking (5)
- # project-to-product (4)
- # snack-club (42)
- # sponsors (85)
- # summit-info (274)
- # summit-stories (3)
- # xpo-datadog (2)
- # xpo-digitalai-accelerates-software-delivery (14)
- # xpo-github-for-enterprises (14)
- # xpo-gitlab-the-one-devops-platform (14)
- # xpo-itrevolution (6)
- # xpo-launchdarkly (1)
- # xpo-pagerduty-always-on (1)
- # xpo-planview-tasktop (7)
- # xpo-slack-does-devops (8)
- # xpo-snyk (2)
this is going to be interesting, I just worked out an entire new setup for our terraform stuff
it may not go into enough detail for you - 30 minutes is a terrifyingly short amount of time to cover such a huge subject
Yeah it's a massive subject 🙂 Thanks for the offer! Will probably find questions 😛
would be great to understand what you did too, we will definitely be able to learn from you too
Super happy to talk that through yeah, probably less concise than your version, preparing a talk tends to structure thoughts better 😄
ha - thanks; it didn’t get nearly as much prep as it should have done :rolling_on_the_floor_laughing:
@richard431 - What is the number of “product teams” developing the (I think you said 20) microservices?
You might discuss this later, but how have you handled rollbacks using terraform? We have a lot of custom wrappers with mixed results.
I don’t quite get to it. The most important thing is to test at the “service” layer (you’ll see that in a sec) that rollback is possible
Splitting states seems to be fairly important in our setup, then you can have the most volatile layer isolated in its change
yeah - splitting state is absolutely vital. We have one statefile per “service” per “environment”
and usually you can just revert the evil commit and run terraform apply over it again - tends to cover most cases
yeah - however if you’ve tested as part of your IaC SDLC that that approach DOES work, then you can be doubly sure that you’re covered
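A minimal sketch of the "one statefile per service per environment" layout described above, assuming an S3 backend (bucket, table, and service names are illustrative):

```hcl
# environments/staging/payments-service/backend.tf
# Each "service" in each "environment" gets its own state file,
# so a bad change (and its revert + re-apply) stays isolated.
terraform {
  backend "s3" {
    bucket         = "acme-terraform-state"
    key            = "staging/payments-service/terraform.tfstate"
    region         = "eu-west-1"
    encrypt        = true
    dynamodb_table = "terraform-locks" # optional state locking
  }
}
```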
@richard431 Did you ever consider using Feature Flags to manage rollouts and state splitting?
We provide an integration into Terraform so that you can manage all of your feature flagging centrally, controlling everything (canaries, betas, rollbacks, anything wrapped in a flag) from a single UI.
Terraform is also really great for Flag cleanup, as you can keep all the flag definitions as application code.
@richard431 - Slightly off topic from Terraform as a whole, but do you guys live on EKS or self hosted Kubernetes?
terraform modules in separate git repositories is so vital for keeping sane
AND HAVING SOMEONE THAT MAINTAINS THEM.... don't just create it and then ignore it forever 😂
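One common way to consume a module that lives in its own repository is a git source pinned to a release tag, so consumers upgrade deliberately (repo and tag are hypothetical):

```hcl
# Pinning to a tagged release keeps module changes from surprising consumers.
module "s3_bucket" {
  source      = "git::https://github.com/acme/terraform-aws-s3-bucket.git?ref=v1.4.2"
  bucket_name = "acme-artifacts"
}
```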
it means i’m quite surprised when the next slide arrives :rolling_on_the_floor_laughing:
Hah interesting we went away from that module component :thinking_face: It was a large overhead and the team I'm currently in is not that big and only has a few people writing infrastructure
Usually with smaller teams I've found that starting from the component level is a bit easier to get up and running with
Great talk @richard431! We're in the Azure world using Terraform and learning as we go so this is great for us
The components creating alerts for stuff are so smart - I'm so going to steal that 🤯
Be interested to know how you pull the other modules into another terraform. We’re currently using separate modules like you are, 1 to 1 with an AWS resource, but currently setting up say an S3 bucket, then using outputs/remote state to obtain the ARN for example
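For reference, the outputs/remote-state pattern mentioned here typically looks something like this (bucket, key, and output names are illustrative):

```hcl
# Read another service's state to obtain, e.g., a bucket ARN it exported.
data "terraform_remote_state" "s3_bucket" {
  backend = "s3"
  config = {
    bucket = "acme-terraform-state"
    key    = "staging/s3-bucket/terraform.tfstate"
    region = "eu-west-1"
  }
}

# Reference the exported value, assuming the producing service declared
# an `output "bucket_arn"`:
# data.terraform_remote_state.s3_bucket.outputs.bucket_arn
```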
We’ve moved to Terragrunt to combat some of the environment challenges, which has helped a load for us.
@richard431 - Services are re-used across environments? You might have “smaller” versions of a service for staging versus production?
I'm also a keen user of some other Hashicorp products (including packer and vagrant). I love thinking about the 'chicken and egg' challenge with infrastructure-as-code for things like SCM/repository systems that contains the code and binary build artefacts to build the SCM/repository systems 😉 The answer: the egg came first (dinosaurs laid eggs well before chickens existed!) Also it probably doesn't matter anyway, if it can all be self-referencing, bootstrapping, however you want to explain it (I guess similar to compiling a compiler on a kernel compiled by the same compiler...) and idempotent
I asked my 8 month old little boy what should come next and he did an amazing homer impression
npm update is probably the scariest command I usually issue though 😄 it just downloads the entire internet
I think that is node’s fault rather than the concept of dependency management’s fault 😄
Just seeing terragrunt, haven't looked into that in ages - what does it now bring to the table after we have state locking and workspace in place?
the main thing is that you don’t have to repeat your backend configuration in every “service” in the environment
you just have a terragrunt.hcl at the top of the environment which declares that
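A sketch of that single environment-level terragrunt.hcl, assuming an S3 backend (bucket and table names are illustrative); each service then inherits it instead of repeating the backend configuration:

```hcl
# environments/staging/terragrunt.hcl
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite"
  }
  config = {
    bucket         = "acme-terraform-state"
    region         = "eu-west-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
    # One state file per service, derived from its folder path.
    key            = "${path_relative_to_include()}/terraform.tfstate"
  }
}
```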
I do like it for that, though - and the ability to run Terraform across multiple folders that are dependent on each other.
that’s a good shout @j.white.1 - we are running a plan across the entire environment every hour which uses exactly this feature
That might actually be a really good feature yeah - my folders usually are split by how often they are applied so this is usually not hitting me as hard I suppose
Have not yet had the opportunity to play with it but definitely on my to-do list https://terratest.gruntwork.io/
it’s a bit weird, ’cause usually you’d write your tests in whatever language you’re writing in
I've done a bit of that in very lightweight teams - where the "test" in the end is just a tf apply on a temporary environment followed by a destroy. Doesn't really test on a functional level but at least shows you the code does not explode
Yeah, don't think IaC is quite ready to join the 'test in production movement' just yet... 🙂
Have you used other tools like AWSpec or ServerSpec and if so, how do you feel it compares to Terratest?
I really like them both, and have even forked it with a view of porting it to go so that I can have it in the Go world. We have more go skills at Babylon, so terratest made more sense
ahh yea, we started using terratest just recently. It works surprisingly well with kubernetes
Do you run anything like https://www.runatlantis.io/ for the terraform/terragrunt plan? It’s something i’m looking into myself currently
'extending the concept of a micro service at the infrastructure layer' fundamentally so powerful for pushing higher quality into non-functional (especially performance and operational acceptance) test scenarios
@richard431 Good shout on running terraform plan regularly to see if anyone has manually changed the resource - we'll have that!
Mhm there it is again, the symlink of global variables :thinking_face: That always breaks for me because Linux and Windows had problems with these 😞 (might be all gone now, I haven't looked at that for ages)
This is great @richard431! I see you have broken down the testing and you have Unit Testing -> Does the resource deploy? Are your team's unit tests following the rule of: These tests run even if the machine you are testing on is completely isolated/air gapped?
@richard431 How do you establish a connection between app code repos and the repos that contain the “Service” from an infra perspective?
I'd definitely like to try and learn more on this, especially how this compares to 'managing state-as-code' with things like Terraform Enterprise/Cloud services watching/testing SCM changes for consistency/compliance. Thanks so much @richard431, brilliant talk
Good thing to know is that my approach is not that different so I feel validated now 😂
awesome talk @richard431, validates some of the stuff we have done and gives me a few new ideas as well :thumbsup:
@richard431 Great talk! I can't wait to show it to my coworkers that are struggling with the same issues at Stack Overflow.
This is great for us, we are just at the start of our terraform journey and trying to figure out structures and working practices
Good talk! Really helpful since Terraform is what we use. Gave me some ways to improve and focus on better breakdown.
I am very early on this journey - but know that it’s a journey that needs to happen.
@richard431, nice presentation, thank you! Do you have any rules on how to set boundaries for the scope of your environment creation and testing? For instance, in a scenario where I have an app that consumes an API backend and has an API Management in the middle. In your experience, should we handle each piece individually, or in this case could we handle them as a "package"?
I think you have to look at how they are released. Are changes made to just the application, or just the backend? If so they should probably have a separate release train and be managed separately. If, on the other hand, you find that you are most often making changes simultaneously to both, then they really are just one thing and you should join them together
as a rule of thumb, in my experience, if you aren’t sure which to do, then go for more smaller release packages - so keep them separate.
Sounds also reasonable for starting and iteratively improve it adding the components. Thanks for sharing!
@richard431 I really enjoyed your presentation. I am proposing a terraform / chef implementation for the company I work for and I like your approach. What were your challenges with VNet separation between applications?
at the MS level we’re exclusively using K8s so we segregate using Istio rather than separate vnets - would be interesting to talk through your problem though!
Oh bummer, ok I can go high level. I inherited a structure with large vnets and groups of subnets with separation (loosely) by service, so all apps with infra are in one subnet and services of type A in another
Obviously other control structures in place with these, but in listening to your approach it sounds like you're focused on the container structure. I'm not there yet :)
Actually writing it out like this i think I see how I can manage this in the app grouping similar to your approach
ok we have a “special” service called bootstrap
which creates all the common stuff in the environment - that would create and output the vnets
now i like to use “data only modules” for services to look up what they depend on which already exists, but you could just as easily do a data lookup from the component itself
then you’re saying in your service that you want the infrastructure to be deployed to “subnet type A” and it will look up which subnet that is at deployment time
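Since the questioner is on Azure, that deploy-time lookup might look like this with the azurerm provider (all names are hypothetical; the idea is that the "bootstrap" service created these resources under well-known names):

```hcl
# Look up a subnet the "bootstrap" service created, by well-known name.
data "azurerm_subnet" "type_a" {
  name                 = "subnet-type-a"
  virtual_network_name = "vnet-main"
  resource_group_name  = "rg-network"
}

# The service's infrastructure then lands in that subnet at deploy time,
# e.g. subnet_id = data.azurerm_subnet.type_a.id
```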
Cool, is that within the Terraform ecosystem? Or is it a custom add on? I'm just getting my team started with Terraform, shifting out of direct ARM template management in our repo.
Ok i follow, that sounds way more manageable and we can still verify that no changes have been injected into the deployments
great 🙂 - happy to talk through in more detail if you get stuck - ping me on linkedin or something 🙂
@ann.marie.99 @cncook001 Are you available for Q&A right now?
hey everyone! Just managed to tune in today. What presentation is now on in this track?
There was a scheduling mishap. The audit presentation will be rescheduled.
I'm not sure yet. As soon as I find out, I'll let you know 🙂
@areti.panou Glad to hear! @lewir7 and I are looking forward to everyone seeing it 🙂
@lucasc5 @lewir7 I'd love to connect and compare notes on dashboards.
@bryan.finster I believe there was a scheduling mishap. Our presentation will be rescheduled for a later time. The presentation currently playing is from @ann.marie.99 and @cncook001. Thanks!
@ann.marie.99 and @cncook001, I'd love to connect and compare notes on dashboards. 😂
We are working on building low level engineering metrics to help teams with the right behaviors. The challenge is that getting these metrics can be very time consuming.
"Platform" in this context is something like "Commerce Platform Squads". These teams are focused on things like checkout flows.
Sorry, I was asking @jose_mingorance what tooling they were using for CD platform. @cncook001 I'd love to schedule a zoom meeting sometime, if you and @ann.marie.99 have spare minutes.
@jose_mingorance we use Hygieia to aggregate data. It will integrate with those easily.
we are experimenting with Splunk to aggregate and trend. We did play around with Hygieia but did not go far.
Setup can be a bit of a struggle sometimes, mostly a documentation issue.
@cncook001 we are building dashboards using Hygieia data to gamify metrics and to use the metrics to direct teams to CD playbooks to help improve them.
@jose_mingorance I'd ping the Hygieia core team. They are working to improve that.
Nice. We are trying to do the same. Metrics not to control but to guide and coach. Feedback loop into our Dojo coaching too.
The open source tool I referred to earlier in my talk was Hygieia. No travis support and too hard to add it. We built our own.
We only use it for collection. Visualization we are working to make more actionable.
I like the PR duration scoring. It's on our roadmap for the coming quarter.
We score on integration frequency, build duration, build stability, deploy frequency, and Sonar violations.
I also want to score on code inventory: code on any branch that is not in prod.
I was here for the other presentation but I am glad that I get the chance to watch this one as well. Really useful, thanks!
<!here> Sorry - technical mishap! @ann.marie.99 and @cncook001 will re-air at 1:55. Nationwide will air tomorrow - stay tuned! Sorry everyone!
this is a very useful presentation (even if it wasn't scheduled for now/ended early)! Do you have a breakdown of the various metrics you used? I made a note of some but think I missed some of the points you were making @ann.marie.99 @cncook001
https://github.com/devopsenterprise/2020-London-Virtual I believe “Day 2” will show up here at the end of today.
just fyi - the session that was airing will be played again at the correct time
Well, whatever happened, @ann.marie.99 @cncook001, that was a brilliant set of insights. Love to talk more. I was riveted all the way.
<!here> Ann Marie Fred and Craig Cook's talk will re-air 1:55 during the correct time!
TL;DR yes and no! First as Craig suggested too, my views are my own and not necessarily shared by those of my employer 😉 IBM is so big that if you think of any software/tools, somebody somewhere in IBM is very likely using them in anger but not necessarily globally/enterprise-wide. I'm an IBM employee and I've used Jenkins a lot, it's also kind of 'built in' as a standardised part of some other pipeline offerings on our cloud tooling but our current enterprise-wide solution is Travis CI (fully integrated with an enterprise instance of Github as the strategic SCM of choice at the moment). Personally, I prefer being completely tech/tools agnostic and looking at whatever seems best suited for the context/problem/product/system-under-test at the time...
Another point to add for anyone interested, there's also a recent (I think) offering named CIO Cirrus which is basically an OpenShift based cloud platform solution for hosting IBM internal tools. I expect that strategically many internal teams may want to move their pipelines to this for the benefits over whatever they were doing before instead. The enterprise platform standardisation, ongoing support and availability/reliability advantages may outweigh any in-house supported options.
In our own area of IBM, I’d say we’re about 50% Jenkins and 50% Travis! People tend to use Travis CI if possible, because it’s easier to set up and maintain, but they’ll use Jenkins if they need its more sophisticated set of plug-ins, scheduling, or build chaining capabilities.
We even have one group using UrbanCode Deploy, because they have multiple environments and a set of modules that have to be deployed in a synchronized way.
The CI/CD tool is one where I frankly don’t care what another team is using; whatever works for them is fine with me. But there are, of course, skills and knowledge we can share by at least more or less limiting it to two tools.
There are other parts of IBM that are completely Jenkins because of its support for multiple hardware and operating systems. The only other one that is close for operating systems is Gitlab CI. We need a solution that works well for z/OS and for various languages.
Hm I’ve never tried Gitlab CI. Good to know it has broad platform support.
@richard431 Hi - I want to be able to see this talk where you don't stand still. 🙂 I don't know if I'm being daft but I can't see it in the library amongst all the others. Has it been uploaded?
there were a few problems with my original upload so it only got properly submitted yesterday
I’m sure @jessicam is on it, but she also has a billion other things to do for the speakers who were more organised than me :rolling_on_the_floor_laughing: I’m sure it will be up soon!
@saloni.seth Yes! We will get it in the library soon. Thanks
@aimee.bechtle055 your video looks good to me! Anyone having issues?
@aimee.bechtle055 what was the biggest help in your change to remote workforce with COVID 19?
Hi @rradclif 👋 Apologies to dive in here, but we at Slack created these two blogs with our point of view on the remote workforce that may help; 1️⃣ https://slackhq.com/how-slack-shortens-distances 2️⃣ https://slack.com/intl/en-gb/resources/using-slack/slack-remote-work-tips
np, we use slack extensively… slack spread is our problem; channels are great until you have 100s.
Many of our employees were faced with childcare issues and getting the equipment needed, and were fearful they wouldn't be able to meet their commitments and target dates. We softened our dates and deadlines, and were understanding and empathized with them
https://itrevolution.com/book/full-stack-teams-not-engineers/
My favorite part about that paper is where it mentions how it eliminates a lot of women from applying to full stack jobs
Some ideas are reduced hours in the day for Dojo, using conferencing and collaboration tools like Miro and Zoom or Meet, allowing breakouts or pairings of team members in the Dojo to do work on their own time or time zone. At S&P we have people in Asia-Pacific, UK, and multiple USA time zones. We choose a time frame to collaborate and be a team in the AM, from 8 - Noon, so we can all collaborate during our time zone work hours. Then assignments are completed outside of those hours and with a pairing of team members.
I really like how you put the C’s together, this will be a great help in helping explain this culture change.
If you noticed, so many things begin with a "C". We've all been working for "A"s our whole life, let's celebrate the "C"!
Hi @aimee.bechtle055, are you doing this within an existing funding envelope, or is 'extra' development funding available for the transformation?
That's a common theme over the past couple of days. Unsurprising, I suppose - doing more with the same (at best) or less (probably table stakes now)
I hear "We're DevOps" over and over again; these success criteria help me to help them understand if they are just practicing CI/CD, or are really DevOps
at a minimum, it will help explain what you’re dealing with and manage expectations
@jeff.gallimore I've been thinking of how to put together a matrix of the factors that influence and affect the pace of change, and whether we could look at companies and relatively estimate how long it would take. I'm sure there's too much unpredictability, but how can it be a data-supported estimation when companies need expectations set?
At Company #2 they set out to "transform" in 3 years, when I left in 2019 they had a ways to go
From 3 years to 6 years. The CEO updated the same slide every year and changed the date.
@aimee.bechtle055 - in light of the corp legalese on your slides - can we use your C-Suite concept (with appropriate attribution of course)?
@aimee.bechtle055 that would be a powerful tool. imagine using that tool as a self-diagnostic with leaders… where do YOU think you are? and how fast do YOU think you’ll move? …aaaaaand here’s what the data show…
We have teams 15 years into the transformation, they would be called DevOps or as close to it for a team building product code, but they understand it’s continuous improvement, so I consider them one of the best teams.
thanks! - and did you change your twitter handle? It is coming up as doesn't exist
I told @genek101 once, when I was frustrated, that the trick is finding a way to get change done before leadership changed.
I think Gartner made a run at something similar with their “enterprise technology adoption” profiles: https://www.gartner.com/en/documents/3890775/understanding-gartner-s-enterprise-technology-adoption-p
Yes, I'm going to see how many people can watch your preso @aimee.bechtle055. Very nice.
@jeff.gallimore This is going to be shared today. Thanks for this link from Gartner
Thank you @aimee.bechtle055 great talk and good explanations of your thinking
The DOJO Consortium - A Living Scenius Project (US Bank, Verizon, Walmart) is on the track 2 channel
Welcome to the encore presentation! The story I heard was this presentation was so great it needed to be repeated. I may be biased though.
We had a gut feeling that certain things were slowing down many of our squads. What about you, do any of you have an intuition that something is a problem, but no way to visualize or prove it?
"redeliveries from IV&V" - everyone knows there are code quality problems; impacts in terms of apparent/anecdotally low %C/A (but nobody had bothered measuring); and it was a distraction from improving quality
it's meant to show how often work flows "left to right" without being sent back left
Have you ever seen an Agile or DevOps metrics dashboard used in evil? Have well-meaning but ill-informed people swooped in from outside of the team asking why certain scores were out of range and how soon could they get back to green?
if managers start asking for features like productivity per person, beware
start with the end in mind, what do you want to know/reach and which metrics can provide you with relevant data on this
I really like what the Accelerate book has to say about this, namely that lead time beginning from feature inception is very variable, but change lead time (from code commit to being used in production) is less variable and a great thing to measure.
There’s value in both. If your product management process takes 3 months but your development/deployment/delivery cycle takes a couple of weeks, a) speeding up product management will have much more of an impact than you might think, b) the lead times should be measured separately.
When we looked at the lead time of features that were actually delivered (from the time the story was open to the time it was in production), it was almost always less than 1 month. And stories that were more than 1 month old rarely got delivered at all.
I think measuring “the time from when a story is opened, to when it gets into the sprint backlog” AND “the time from when a story is added to the sprint backlog, to when it is delivered and in production” would both be useful metrics for our own teams.
Defects per developer reminds me of that Dilbert comic: I’m going to write me a minivan!
In my first job out of uni - at a video games company - us testers were ranked on defects raised. So not quite the same as showing defects per developer, but it made for a really, really toxic environment.
It didn't feel like a good learning experience at the time! But with being far removed from that experience, I do now see it as a fantastic learning experience.
True. We’re measuring work item throughput, not velocity in Scrum terms.
Squads seem to have fairly consistent story point sizes, but the meaning is different. That's one reason we made it difficult to compare squads.
One of the great things about counting stories is that (unlike story points) it naturally provides an incentive to make all stories small.
Hm, I hadn’t thought of it that way, but you’re right! We also learned over time that stories larger than 5 points would take FOREVER to deliver. As in several weeks to months, with long running feature branches and hideously painful merges. So we stopped allowing stories to be over 5 points.
Do any of you have problems with pull requests getting “lost”, where nobody notices them, or nobody reviews them? Anyone else have a good solution for that problem?
One technique I use is to coach teams to really care about finishing things (where finishing means used by customers in production). Once a team cares about that, they are more motivated to not just review pull requests but also care about all the other important things that happen after code commit before production.
Do they have a way to quickly see all of the open PRs? Before we built our dashboard, a few of the older squads (the ones with more repos to maintain) built their own tools.
Github has a quick way to see a list of all of the repos for a team, but not an easy way to see how many PRs are open against each of them. You have to click through.
I'm adding that to our team dashboard because even my team has that problem. "Aged PRs" My wife's team has "Code Monkey" a Slack bot that nags them about PRs
Yes, what I’ve seen most often is for teams to build tools for this, to fill in any gaps that e.g. GitHub doesn’t provide out of the box. I think what’s even better is when an organisation has created bandwidth explicitly for tools like that to be built, to enable all the teams.
Developer Productivity includes Developer Enablement (CD Platform), Developer Experience (IDE's Frameworks), and Developer Advocacy (Dojo, Training, helpful tools)
We actually did all of those things before in the CD platform area. Now it's less ad hoc. 🙂
@ann.marie.99 In how many environments is a deployment counted? So if you have software version 1.0.1 and two customer environments, would you count the deployment of version 1.0.1 twice if it is deployed to both environments?
That is counted as "1". Some squads use travis to deploy to 3+ regions. That is counted once.
Our group is running chunks of http://IBM.com, so we don’t have customer deployments.
I agree We have 12K stores, each are HA datacenters. We count unique artifacts only. 🙂
How do you define availability? Should the whole site be down to be in red, or only part of it? And what if some corner-case feature is not working?
Ideally and in most cases, we use synthetic monitors to ensure that customers can actually interact with our services. For example, we’ll check web pages to make sure a certain word renders on the screen, or we’ll make sure customers can log into My IBM and see their account info.
Our APM monitors are almost never down, but our synthetic monitors are down more often.
Since squads own their own services they know what "customer impact" means.
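The synthetic-check idea described above (verify a certain word actually renders on the page) reduces to something like the sketch below. The URL-less check and keyword are placeholders; a real monitor (New Relic etc.) would do the HTTP fetch and also check status codes, latency, and login flows.

```javascript
// Core of a synthetic availability check: does the response body contain
// the keyword a customer should see on a healthy page?
function pageLooksHealthy(body, keyword) {
  return typeof body === "string" && body.includes(keyword);
}

// In a real monitor, `body` would come from an HTTP GET of the page.
const body = "<html><body>Welcome to My Account</body></html>";
console.log(pageLooksHealthy(body, "My Account"));                        // true
console.log(pageLooksHealthy("<html>502 Bad Gateway</html>", "My Account")); // false
```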
great presentation, we are right in the middle of defining our metrics, so very valuable input 🙂
We monitor about 300 services on our production availability dashboard, and 60+ in pre-prod/validation.
I recommend a monitoring tool like New Relic where it’s easy for developers to configure their own monitors. Some (like CheckMK) are just not user-friendly enough.
agree, we use AppDynamics, but I honestly haven't gotten the teams to buy into it, so it's a centralized effort to visualize key metrics now
The first thing I did when I joined was get the Availability dashboard created. Make your uptime visible. Review it each week with execs. It drives interest in monitoring in general.
yep, just need to get the availability defined for each area, will use the input I got today for sure
@ann.marie.99 @cncook001 did you look at the 4 key metrics from the state of devops report?
Do you consider WIP at all in your DevOps score? For example, rewarding teams that focus on finishing a smaller number of work items as opposed to starting to work on a large number of work items?
Do you find teams use that display to change behavior? I ask because teams in our organization recognize the need to limit WIP, but commonly overlook the implications and end up in heavy WIP scenarios.
If teams review WIP in their daily stand-up or weekly retro, yes they will change their behavior based on it.
Does the bucket size metric encourage developers to write as few lines as possible, e.g. defining all variables in one line instead of multiple lines?
We do have plenty of 5-line pull requests, though, and that’s FINE. Not a problem.
Squad comments is our answer to squads who don’t intend to change their practices to improve a score for one reason or another. Often they would ask us to change the scoring system so they would get a Green/Good score. Have any of you run into a similar situation?
This may be a bit out of the purview of this presentation, but it's from a conversation I had at Lean coffee yesterday. Were any of these metrics used to try to reflect the DevOps culture internally? @ann.marie.99
Yes… we had good discussions about each metric, why we should or should not use it, and what we would expect to see if teams are doing what we want them to do.
I know the slides in the middle went fast, but we tried to outline what best practices we were hoping to drive with each metric. You can get the slides from https://github.com/devopsenterprise/2020-London-Virtual tonight after they’re posted if you want more details.
It’s difficult to read the second link on one screen. It’s a link to our open-source `npm audit fixer` tool. The link is here: https://github.com/IBM/npm_audit_fixer It works for us, but if you have trouble using it, feel free to send me a message on Twitter, or if you’re truly lovely, contribute a fix!
This talk made me excited to bring this back to my org. We need to do this!
With daily automated checks for versioning, how are your squads dealing with breaking changes?
It caused a lot of pain with our own squad when we turned that on. The script Ann Marie posted has flags to adjust how aggressive it gets. We tuned it to focus on high/critical defects. Those are not too bad to resolve when they happen.
We also expect 100% unit test coverage. That helps with confidence when we upgrade packages frequently.
100% is only slightly painful when you start at 100% with greenfield code and put a check in to fail the build if it drops below that. It’s definitely harder to get there with legacy code, but some teams have done it. The rest will get to, say, 80% coverage and put a check in the build that won’t let their coverage drop below 80%.
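One common way to wire the "don't let coverage drop below the floor" check into the build is a coverage threshold in the test runner's config. A minimal sketch for Jest (the talk doesn't say which runner these squads use, so this is an assumption); the 80% figures mirror the rule above, and greenfield squads would set them to 100:

```javascript
// jest.config.js — fail the test run (and therefore the build) if
// coverage drops below the floor. Figures are the 80% example from above.
module.exports = {
  collectCoverage: true,
  coverageThreshold: {
    global: {
      branches: 80,
      functions: 80,
      lines: 80,
      statements: 80,
    },
  },
};
```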
To expand on the automated patching a bit more: we have two `npm audit` scripts - one version will block the build pipeline for higher-severity vulnerabilities, to get developers to fix those right away. It goes in the “test” stage of the build. The one we posted here that runs on a schedule will try to patch any and all vulnerabilities it finds.
GitHub itself has a feature that will automatically notify you of open source vulnerabilities, but it’s not available on our GitHub Enterprise Server yet. I don’t think it actually fixes them for you, either. There’s a bot called Renovate that will fix things, but it doesn’t catch nearly as many things as npm audit does for Node.js apps.
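The "block the pipeline on higher-severity findings" half of the two-script setup can be sketched as below. It assumes you've captured `npm audit --json` output in the npm 7+ shape, where `metadata.vulnerabilities` holds per-severity counts; the configurable severity floor mirrors the tuning flags mentioned for the fixer script.

```javascript
// Decide whether the build should fail, given parsed `npm audit --json`
// output (npm 7+ format) and a minimum severity floor.
function shouldFailBuild(auditJson, floor = "high") {
  const order = ["info", "low", "moderate", "high", "critical"];
  const counts = auditJson.metadata.vulnerabilities; // e.g. { low: 3, high: 0, ... }
  return order
    .slice(order.indexOf(floor)) // severities at or above the floor
    .some((sev) => (counts[sev] || 0) > 0);
}

// Hypothetical audit result: findings only at low/moderate severity.
const audit = {
  metadata: { vulnerabilities: { low: 3, moderate: 1, high: 0, critical: 0 } },
};
console.log(shouldFailBuild(audit, "high"));     // false — nothing high or above
console.log(shouldFailBuild(audit, "moderate")); // true
```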
Thank you @ann.marie.99 & @cncook001. Super interesting, and very helpful. I've been toying with getting various metrics visible to my team and you've definitely given me some pointers for this 👏 👏
The slides will be posted to https://github.com/devopsenterprise/2020-London-Virtual tonight if you want more details.
Once you get the initial connection set up with your code repo, your monitoring system, and a few other things, adding new metrics isn’t bad. Credit also goes to Tony Huo, our developer who implemented most of the calculations and the front-end.
We did have some issues with Zenhub rate limiting… but anything can be worked around. Tony ended up pulling the data in Jenkins batch jobs and storing it in a cloud DB.
Oh that’s a point too - pulling all of the data live was too slow! You don’t want your users to have to wait 20 seconds for a screen to load, so pre-caching the data in a database is important.
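The pre-caching pattern described above splits into two sides: a scheduled batch job (Jenkins, in the talk) that does the slow API calls and writes results to a store, and a dashboard that only ever reads the cached copy. A stand-in sketch, with a `Map` playing the role of the cloud DB:

```javascript
// In production this would be a cloud DB; a Map stands in here.
const metricsStore = new Map();

// Batch-job side: run on a schedule, do the slow upstream calls, store results.
function refreshMetrics(squad, computeFn) {
  const value = computeFn(); // slow: GitHub/ZenHub/monitoring API calls
  metricsStore.set(squad, { value, refreshedAt: Date.now() });
}

// Dashboard side: read-only and fast, never blocks on upstream APIs.
function readMetrics(squad) {
  return metricsStore.get(squad) || null;
}

// Hypothetical squad and metric names, for illustration.
refreshMetrics("squad-a", () => ({ openPRs: 4, deploysThisWeek: 9 }));
console.log(readMetrics("squad-a").value.openPRs); // 4
```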
@aimee.bechtle055 I just watched your session and very much appreciate what I saw. Do you have more long form material regarding your C-suite and DevOps Success Criteria that you would like to share?
@david627 do you remember the Flintstones episode where Fred becomes an executive at the quarry? He was told all he ever had to do was say 3 things: Whose baby is that? What's my line? I'll buy that.
@richard431 I regret I missed your talk. I don't see it in the Library. Are you able to share? Thanks
There were some problems with the original submission. It will be up soon - I'll ping you as soon as it's up 😃