Recap of DevOps Days Boston 2017 with David Fredricks

This week, I had the opportunity to continue a conversation started weeks ago with David Fredricks, organizer of DevOps Days Boston.

You can watch it on YouTube or listen via SoundCloud. Transcript below.

The Incredible Impact of Open Spaces

Paul: All right, so welcome everyone. My name is Paul Bruce and once again I’m back here with a member of the DevOps community, Dave Fredricks. Now Dave, you organize the DevOps Boston event, is that correct?

Dave: Yeah that’s correct. I’ve participated as a volunteer organizer for the last three years, involved for the last four.

Paul: Excellent…and I got a chance to meet you beforehand, I think at one of the DevOps Days Boston meetups, but then also we got to chat at the event and it was a really good event. I think a number of different things came together really well, particularly, from my point of view, the collaborative open spaces. Can you tell us a little bit about what that’s like and how that got into the conference schedule?

Dave: Yes, certainly. So open spaces is a really interesting kind of platform that’s unique to DevOps Days. Basically how it works is everybody comes up with topics during the event after listening to some of the keynotes. When a discussion is interesting to you, a lot of times you want to add your personal perspective to it, not only offering new ideas or maybe even some suggestions, but asking specific questions.

A way of being able to do that is by getting everyone together at the event over common topics. You basically vote on different topics that are of interest to you and they can actually go anywhere from cultural to personal to technology as a whole.

The idea is, there’s a few rules, it’s basically:

  1. what’s being said is what needs to be said
  2. who’s there are the people that need to be there
  3. when it starts and when it ends is the time it starts and ends

Those are the only kind of guidelines that we go by and the idea is to get people who usually wouldn’t be open to public speaking to be able to have a chance and an opportunity to either share some ideas, ask questions specifically and directly to different individuals and to have an open forum.

The real value that comes out of it is really specific dialogue; the biggest thing is the new introductions and relationships that are created.

The hope is that, through a DevOps Days event, you get contact information for individuals who are in the same space, at the same stage as yourself, so that you have an outside outlet to bounce ideas off of throughout the year as you, as an engineer, face challenges and try to solve problems.

Paul: Yeah, that was one of those things that really clicked for me. Being part of a number of those open spaces, I saw exactly what you said, which was that people were far more likely to comment, to share, and to ask questions than in a larger audience. I think the other element of that is the fact that not only do they get to share but they get instant feedback.

And this is one of those core tenets, I think, of DevOps, in my mind: this concept of continuous learning. But you don’t learn unless you know what’s going on and you don’t know what’s going on unless you [as an organization] radiate information which is typically facilitated by feedback loops. So whether we’re talking about technology feedback loops or real people feedback loops, I think that’s really helpful.

So can I back up for a second and ask you a slightly broader question about DevOps: in your mind how would you define DevOps?

What Is DevOps, Really?

Dave: Great question. You know, this is one that in our community we talk a lot about, especially for folks who are outside of the quote-unquote “DevOps thought process”, knowing that it’s something that’s taking off as a force in the software world.

One of the things we do is to talk about how do we define DevOps. The biggest thing for me is DevOps means different things to different people and it’s all about context and perspective, where you come from and where you’ve been and what challenges you’re trying to solve. So when I meet somebody new who’s in this space and they’re starting to kind of either chant or evangelize to me without first getting a baseline perspective as to where I’m at and what I was doing and what I’m trying to solve, it immediately has me question, “okay, are you trying to push your ideals down on me?”

This is what DevOps means to me: getting folks to work together in an efficient collaborative manner to solve a common goal, period.

It has nothing to do with tools. It has nothing to do with process. It has nothing to do with frameworks. It’s all about getting people together, teaching context, having empathy, understanding what somebody’s doing, why they need to do it, and what they’ve been doing in the past. You share your ways of doing it and then together when you have a sense of “okay, I know why this person has to do things, I know the reason why they’re thinking this way”, you can efficiently solve problems and for me that’s what DevOps is to the core, right there.

Paul: So one thing I heard from that is it starts with people, right? It doesn’t start with tools, it doesn’t start with how you’ve been doing it; it starts with people and really understanding the context and the perspective that they bring to the table. Is that right?

Dave: Yeah, Paul, you nailed it right there. It starts, it continues, and it ends with people. Ultimately I take the concepts and the core principles of DevOps, and you can apply that to any industry, any product, any delivery, any manufacturing, and it really is bringing people together to work more efficiently to solve a common problem.

What Is DevOps Not?

Paul: And so actually people are doing that, you’ll hear the prevalence of these amalgam terms like DevSecOps, DevTestQAOps. And I kind of take issue with that in the sense that I understand how important terminology and clear labels for things are. As a practitioner and engineer, as soon as somebody starts to blow out a term to mean “all the things”, my red flags get raised up instantly.

That doesn’t mean that [DevOps] doesn’t include other people, but can you tell us a little bit about how important the scope is of DevOps to you? And just kind of following that up with some context, I was able to speak to Ken Mugrage from the DevOps Days Seattle, and he was very clear about how if we blow it out into all the things, “DevOps” loses its value.

And so I put this to you: why is a pantheistic term, if DevOps grows to that, why is that a problem?

Dave: No, that’s a great thought. I want to take this back a little bit to identify why all these actions are being added on, and how and why [DevOps] is being branded in this way. This is a discussion that I have, especially with growing teams.

One of the biggest things I talk about with organizations is, first and foremost, technically there are no DevOps engineers. So why label it that way?

There’s No Such Thing as a “DevOps Engineer”

When I started working with a lot more enterprises, I helped organizations transform their development to be much more modern so that they can have quicker release cycles and feedback. It’s one of the things that used to frustrate me, was like “hey, we need five DevOps engineers!”. That doesn’t mean anything to me, you got to explain on a day to day basis, what is this person doing, and ultimately, why are you labeling these folks as DevOps engineers?

And I had some interesting feedback which came from the product marketing side. They were like, “Dave, we’re in the enterprise. We’re used to big long deploys of software in order to get it to our customers, and a lot of times we don’t know if our customers are even getting any value out of what we’re producing. When we’re releasing every year and waiting for six months to get the actual feedback from our customers, it doesn’t make any sense.”

So you see this large swath of folks trying to get into this space to build software quicker to have faster feedback to be able to add more value to end users.

These individuals don’t really understand this whole open source community, they don’t understand how the strength of the community is really the value.

“So we don’t know how to really market. We don’t know how to communicate to the group in a way for us to be able to blanket it all together. So we just scoped it into this thing and we call it #DevOps and everything gets that kind of label to it.”

From my experience what I’m starting to see is a lot more of these organizations who are specific to security, to testing, in a way of being able to catch and grasp that member of the audience, it’s “let’s throw it in, Dev and Sec Ops, Dev Quality Ops.” What starts to happen in my mind, and what I’m worried about, is that people start to lose the real purpose.

Paul: So basically the exact same thing that happened to Agile. Everybody forgot to have agility as one of the core tenets that people check in on, on a regular basis such that they internalize that, and that is where their activities and their tools flow from, right?

Dave: Yes exactly. If you start to get too focused on the terminologies and the labeling of things and forget the context as to why you’re practicing it, ultimately the further down stream you get and the more generations that start to get folded into the process, they’ll start to lose the actual scope, “hey we’re trying to get people to work together in a more collaborative manner to be efficient and to be able to deliver quickly.”

How to Be a Good DevOps (Citizen, Vendor, Employer)

Paul: Yeah, one thing that I did recently was put out an article (and thank you, you had shared it with a number of people and I think that’s half the reason why I got some attention). It was essentially how to be a good DevOps vendor. It took the approach of looking at it from the customer’s perspective. The implementation of that was over a simplified customer journey and then chronologically through that journey, I went through and basically made statements from an outsider’s perspective onto different groups whether it be product, marketing, sales.

Back to your perspective, I get that it has to fundamentally start with people because people are what build teams and teams are what build software and software is what affects us. But the team affects us and individuals affect us, and so it does make sense to keep that as a core of value, to consider personal responsibility and also the responsibility of the team to have these cultural aspects present.

But unfortunately I think what happens is that we do need tools and you know, conferences are notorious for needing some kind of funding and becoming self-funding is really hard, and so out comes sponsor packages and I mean it’s an ecosystem. All software is eventually, in most people’s minds, going to make money and so this is where I was coming from, understanding that there is no such thing as a DevOps vendor or a DevOps tool or a DevOps job/position. Yet the fact is that when you’re closely aligned with the thinking of another person and “DevOps” is the term they’re using, it’s easy for these vendors to kind of bring that in and pull that into their messaging.

So I guess my point of view on that is that we are gonna have to deal with that but it’s kind of a constant battle against the pantheism of trying to “all the things” a term [DevOps] but in the meantime we also do have to represent those tenets to more than just the developers and operations. If you really want to sell to developers and operations or teams that are looking, or they have internalized DevOps, they’re going to be looking at the world from this interesting perspective. And they’ll be looking across the tool chain to figure out who sounds like they’re blowing smoke up [you know where].

If a tool vendor or a service provider does not understand the core of DevOps, then their messaging, their selling process, their product ideation…it’s all not going to jibe with the real market.

After a recent Boston DevOps meetup we dove into this for what, like an hour and a half, and just really talked about how do we actually do this. My concern is that when we start to move this into the enterprise (and by the way, the good principles of DevOps should be moveable to the enterprise, right? If they work, they work, and it’s a matter of fitting to context) that I think, while the core of it is culture, we can’t just live in this sort of kumbaya world.

We really have to figure out how to scale DevOps principles up and out into the enterprise setting so that, by the way, these good principles have a positive impact on things like automated insurance, things like machine learning in terms of healthcare, defense and government settings.

So I’m working on that on the side but in the meantime, what do you think about scaling to the enterprise? What does that even mean for DevOps?

How DevOps Is Re-writing Management Decisions

Dave: Yeah, that’s a great point. It’s an interesting challenge. There’s a lot of organizations who are facing it. Right now, I’m dealing with situations where we’re starting to see a lot of enterprises buy instead of trying to build it themselves. One thing they have is capital and resources. So the idea is, “if we don’t know or we can’t make it, it’s the buy versus build, like why go out and try to do what people already are being really successful in doing in something that we don’t understand too well? Let’s just go ahead and absorb some of these startups…”

Paul: Do you mean actually purchasing startups in order to just fill that technical gap in an organization? So I don’t want to name names, but I’m thinking of a very large enterprise that just recently bought up one of the most well-known API monitoring services out there, and people are freaking out like “oh gosh, what’s going to happen, are they going to de-culture this awesome group of guys and gals?”

Dave: I’m dealing with the same thing within an organization, a large security company buying a smaller more nimble security product with a lot of open source options. They’re putting it out there, trying to create groundswell to get this tool for free into the hands of engineers, let them play with it so they can understand how it works and create some kind of a swell within the engineering teams and then we’ll come up to the top and start talking to the executives about, “Hey, what challenges are you facing in this broad space?”, where you’re trying to protect not only your customers’ information but also information about your company.

As they start to have that type of dialogue, all of a sudden the executives within the organization start to look down, talking to their engineering group and saying “hey, what do you know, what have you played with, what do you think is interesting, how do you think we should be solving this problem?” You’ve already created that initial lift of inertia in engineering, then they say hey let’s go with this product…we already know how it works, we’ve been tooling around with it. Win-win, right?

So this is a completely different way of thinking of how enterprises used to be selling products into their customers. It was always a top-down approach…let’s talk to the executives who have the purchase power, float it down and then they’ll disseminate that information in the way that we roll it out into the engineering team. That’s how you could do it in the old-school way. Now in today’s new world, a lot of tools are available for you to play with for free and when enterprise organizations start to try to come into this space, they’re really kind of blindsided by this whole new content creation process.

Selling Into DevOps Takes Understanding DevOps

What I’m starting to see is they’re at least now recognizing: we do not know how to sell to this community or this group. We know we really want to get into the space, we want to do it the right way, what do we do, right? And you know, to your point with your article, I’ve shared your article with all of the enterprises that have been talking to me about this problem because I can’t teach them about the thought process of open source.

I mean, we can look back in the 60s, the MIT days, where the two groups kind of split off. A lot of us in the DevOps space already have the mentality of like “hey, you know we want to be able to share a lot of this stuff but we do want value for hard work we do”. But for the most part there’s different ways of doing it versus everything is being paid for with the enterprise mentality.

What I’m starting to experience is there’s a lot of organizations out there that are realizing it’s exponential value once they start to get into this community and…

the brand loyalty within the DevOps community is tremendous

…but the challenge that is in front of us right now is really the learnings piece and I’m thinking it’s a leadership issue (this is my own personal view). It’s enterprise leadership that needs to get out of the way and allow for new blood to come in to be able to understand the kind of movement. I’ve been doing a little, as much as I can to try to influence old leadership. It’s a challenge and a lot of it has to do with success syndrome. You’ve been doing it in a certain way for decades. It’s a great case study that we’re gonna be able to kind of sit back and watch in the next five years.

Calling All Researchers: Inclusion Means You Too!

Paul: Yeah and you know, there’s so much going on, no one person can do it alone. So without plugging any commercial products of any kind (that’s not my motion) I have started something called the iterativeresearch.org, which is essentially a bunch of contributors to research. As they go along, it could be lightweight contributions, simply just pocketing articles and getting into a feed of people who pay attention, it’s writers too, but the point is it’s not on a brand that’s connected to a pay-for services. And you know I would love, for this conversation to really start flowing in that direction because I think it takes many perspectives, right?

The core of this is it’s an inclusive conversation, not an exclusive one.

So understanding that you are a busy man and we’re at the top of our time, are there one or two things that you want to give a shout-out to or any particular resources that people can go to, events, communities, open-source forums, anything like that?

Get Out to a DevOps Tribe Near You!

Dave: Yeah, you know, thank you so much for the opportunity first and foremost, we’re gonna have to do it again! One of the things I really would highly recommend to folks who are interested in getting more involved: start to look at some local meetups that you have going on. There are some great folks within every community in whatever city, whatever small town, who are interested in sharing ideas, thoughts, and challenges. All you have to do is get out there and look. Go find your tribe! The biggest thing is don’t sit back and wait and sit on your hands and expect interest to come to you.

The whole constant learner, the Kaizen mentality: be better tomorrow than you are today, be better today than you were yesterday. It lives and dies in DevOps and the way to do it is to start to talk to folks who you’re not used to talking to.

Don’t be afraid; get out there, introduce yourself, and have a good time. Life is learning.

Paul: Cool. So that’s David Fredricks, everyone, and thank you David for spending the time with me. Do you prefer going by David, Dave?

Dave: Dave, David, either way.

Paul: Dave/David, I’ve really enjoyed it. It was great being able to spend some time. We’ll circle back. Thank you so much! Cheers!

 


Recap of DevOps Days Boston 2017 with Corey Quinn

This weekend, I had the chance to have a ‘distributed beer’ with Corey Quinn of Last Week in AWS to chat about the DevOps Days Boston 2017 conference last week. We provide a few takeaways each in about 5 minutes.

You can watch it on YouTube and listen on SoundCloud.


Stop Using the ‘Staging’ Server – DevOps Days Boston

Chloe Condon presented on how containers and IaC (infrastructure as code) can help us skip over the ‘staging server’ part of traditional deployment strategies. This article is a loose transcript of talking points from her talk at DevOps Days Boston 2017.

What’s Wrong with a Staging Environment?

Feedback from a traditional staging environment is too slow. The only thing the reviewer knows is whether unit tests passed; the rest of the tests are run after that. “Staging” is usually reserved for integration, functional, UI, and performance testing (i.e. complete feedback). Too little, too late.

We’re all too familiar with the question “who broke staging?”. The fragility and centrality of this staging model creates bottlenecks. Also, the very first time something is brought into the pipeline usually happens in staging and that’s when ‘broken’ occurs.

There’s lots of “friction” between environments. Dev/test/staging are often not equivalent and are configured differently, causing deployment between environments to be a hassle. Flows across these environments are time-consuming (environment variables and files go missing).
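
One way to surface the missing-variable kind of friction early is to diff the configuration keys that each environment defines. The sketch below is only an illustration: `missing_keys` is a hypothetical helper, and the environment names and variables are made up rather than taken from the talk.

```python
def missing_keys(reference: dict, target: dict) -> list:
    """Return the keys defined in `reference` but absent from `target`, sorted."""
    return sorted(set(reference) - set(target))


# Hypothetical environment variable sets for two environments.
dev_env = {"DATABASE_URL": "postgres://dev", "CACHE_URL": "redis://dev", "DEBUG": "1"}
staging_env = {"DATABASE_URL": "postgres://staging"}

print(missing_keys(dev_env, staging_env))  # ['CACHE_URL', 'DEBUG']
```

Running a check like this in the pipeline turns "it worked in dev" surprises into an explicit, reviewable list.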

Code changes are being tested more extensively in staging, which means there’s little room for timely feedback.

Ephemeral Environments

The great thing is that now we have containers. We can run every build, package it in a container, then run tests on it in the same pipeline. Microservices are well-suited for this type of model, and distributed stacks (like a web app, database, and supporting APIs) benefit from it too.

Additionally, most stages of testing can be containerized. Leaving performance and scalability off for a moment, that enables us to run integration, functional, and security testing as part of a complete containerized package.
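
As a rough sketch of that idea (not Chloe’s or Codefresh’s actual pipeline), the steps for one throwaway build could be generated like this. The image name, commit SHA, and test commands are hypothetical; the only Docker CLI features assumed are `docker build -t` and `docker run --rm`.

```python
def ephemeral_pipeline(image: str, commit: str, test_commands: list) -> list:
    """Build the ordered list of commands for one build-and-test run."""
    tag = f"{image}:{commit}"
    steps = [["docker", "build", "-t", tag, "."]]  # package this commit as an image
    for test in test_commands:
        # Each test stage runs in a fresh container that is removed afterwards.
        steps.append(["docker", "run", "--rm", tag] + test)
    return steps


# Hypothetical service and test entry points.
steps = ephemeral_pipeline(
    "example/web-app",
    "3f2a9c1",
    [["pytest", "tests/integration"], ["pytest", "tests/functional"]],
)
for step in steps:
    print(" ".join(step))
```

Each list could be passed to `subprocess.run(step, check=True)` so that a failing stage stops the run; nothing here depends on a shared, long-lived staging server.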

The problem still remains: we have the rule that staging has to be as close to prod as possible. This might serve some of those tests (like performance and security), but is largely suboptimal for unit, integration, and functional tests. Performance tests could also be run earlier to provide us a better heads-up about degradations that creep in over time. In practice, late-stage environments don’t match reality and this causes friction.

So let’s reconsider the premise that all of our non-unit testing has to be run in a shared environment that bottlenecks us. This helps us shift feedback to the left. (Chloe says to insert Beyonce clip here.)

Containers = Consistency & Composition & Completeness

So now the container we’re handing off is much more complete: it includes a more complete set of self-testing capabilities that we can ask our pipeline to run for us.

You can hand off containers to your customers (usually internal but maybe even external) and with composition, you have confidence that the bits they’re running are the same as what you tested and what you want them to have.

Infrastructure as Code

Teams should define what code is part of the process. When people are able to spin things up automatically on their own, this streamlines an important part of their process. Visualizations help a lot, which is why Codefresh and other platforms have visual controls over the package and deploy process.

Infrastructure-as-Code (IaC) includes Dockerfiles, but also deployment scripts. If it’s code, treat it like it’s important because otherwise it’s outside the flow of delivery.

Paul’s take: IaC also includes a whole bunch of other stuff too. For example:

  • Composition scripts (like Docker compose, Kubernetes scripts)
  • Secrets management configuration
  • Network configuration
  • Database configuration (might include data)
  • Tests and test data
  • Feature flag configuration
  • Monitoring configuration & scripts

Implementing IaC requires a few things:

  1. Your team agrees on, and has in-depth knowledge of, how to push healthy code artifacts into the pipeline. No one is an island; others’ contributions need to be readable and easily debuggable.
  2. A resilient process (i.e. pipeline) including dynamic build/package/test semantics enables contributors to focus on the ‘push’ and feedback rather than the semantics.
  3. Information radiators along the process must cater feedback as granularly as possible: individual contributor first, then channel, then team. ChatOps bots give you immediate feedback about breakage as soon as it occurs.
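
The third point can be sketched as a small routing rule that widens the audience only when a failure persists. Everything here is an assumption made for illustration: the function name, the handles, and the thresholds of two and three failed runs are not from the talk or from any ChatOps product.

```python
def feedback_targets(author: str, channel: str, team: str, failed_runs: int) -> list:
    """Pick who hears about a broken build, widening the audience as failures repeat."""
    targets = [f"@{author}"]  # the individual contributor always hears first
    if failed_runs >= 2:
        targets.append(f"#{channel}")  # then the channel where the work is discussed
    if failed_runs >= 3:
        targets.append(f"@{team}")  # finally the whole team
    return targets


# Hypothetical contributor, channel, and team handles.
for runs in (1, 2, 3):
    print(runs, feedback_targets("dana", "payments-dev", "payments-team", runs))
```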

A complete IaC artifact list will require collaboration between multiple contributors, which facilitates communication. Just make sure that empathy and positive reinforcement are part of your management strategy.

Questions from the Audience:

Q: “How do you describe the state of the code in PRs?”

Chloe: “Badges in the repo, some conventions, success flags on Codefresh.”

Q: “How often do people actually use this for pre-stage vs. just going to prod?”

Chloe: “For lots of people, they maintain separate branches for multiple environments. Then you can introduce new versions dynamically.”

Q: “In more complex systems, is there a composition management layer?”

Chloe: “This is the beauty of the compose files. When you treat them like code, this makes management a lot easier.”


Iterative Security – DevOps Days Boston 2017

Tom McLaughlin presented on iterative security, incorporating security into DevOps cycles through early detection and prevention of vulnerabilities. His slides are here. This is a loose transcript of talking points from his talk at DevOps Days Boston 2017.

Breaches in Practice vs. Theory

Tom made the point that breaches often occur in areas that aren’t covered by development or security teams because vulnerabilities escape due to a lack of objective and continuous risk assessment.

Code still has passwords and tokens in it. Lots of assumed knowledge going from dev to prod. Account access, password policies, and patching are usually handled by someone else. This leads to “good luck, it’s up to you” syndrome.
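
A first pass at catching passwords and tokens in code can be as simple as a pattern scan before anything is pushed. The sketch below is illustrative only: the two patterns are assumptions chosen for the example (a quoted `password = ...` assignment and the `AKIA` prefix used by AWS access key IDs), and dedicated scanners cover far more cases.

```python
import re

# Illustrative patterns only; real scanners ship far larger rule sets.
SECRET_PATTERNS = {
    "hard-coded password": re.compile(r"password\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE),
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def find_secrets(source: str) -> list:
    """Return (line number, finding name) pairs for lines that match a pattern."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings


# Made-up file contents; the key is the placeholder from AWS documentation.
sample = '''db_host = "localhost"
password = "hunter2"
aws_key = "AKIAIOSFODNN7EXAMPLE"
'''
print(find_secrets(sample))
# [(2, 'hard-coded password'), (3, 'AWS access key ID')]
```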

There’s also security paralysis. When we don’t think we know how to do something, we just won’t. And we’re rewarded for accomplishing things. So long as disaster doesn’t strike, we get by.

Why Do We Suffer from Security Breaches?

Mostly, we get distracted. 0-day exploits, crypto weaknesses, hash collisions. We get distracted by logos and discussion threads, and end up not patching the system. We get caught up in all of this stuff instead of actually doing what improves security.

Think about all the publicly exposed mongodb and Elasticsearch instances you’ve seen…being proactive isn’t always hard, but is rarely incentivized well.

We don’t do a good job explaining how to get from where you are to where you should be. We also don’t always practice critical thinking. What is your goal? What is your posture about security? Proactive, reactive?

We also don’t always have a wealth of layered instructional content. There’s a lot of information at the extremes (101 and advanced tutorials), but most of us are in the middle.

Solve the Problem Like You’re At Work

So then let’s develop a threat model together, as an example. Let’s start by being realistic. What kind of org and product matters? Align with your company on risk management policies and processes.

Prioritize. Use DREAD (or STRIDE) for rating threats and modeling risk.

Also take care of the easy stuff: USB sticks over man in the ceiling.
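
To make the prioritization step concrete, here is a minimal DREAD-style ranking. It assumes the common convention of rating each of the five categories (Damage, Reproducibility, Exploitability, Affected users, Discoverability) from 1 to 10 and averaging them; the threats and their ratings are invented for illustration and are not from Tom’s slides.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    damage: int
    reproducibility: int
    exploitability: int
    affected_users: int
    discoverability: int

    def dread_score(self) -> float:
        """Average of the five DREAD ratings (each assumed to be on a 1-10 scale)."""
        ratings = [self.damage, self.reproducibility, self.exploitability,
                   self.affected_users, self.discoverability]
        return sum(ratings) / len(ratings)


# Made-up ratings, for illustration only.
threats = [
    Threat("Intruder in the ceiling", 9, 2, 2, 5, 2),
    Threat("Lost USB stick with credentials", 8, 8, 7, 6, 6),
]
ranked = sorted(threats, key=Threat.dread_score, reverse=True)
for threat in ranked:
    print(f"{threat.dread_score():.1f}  {threat.name}")
```

With these invented numbers the USB stick outranks the ceiling intruder, which is the point of rating threats instead of chasing the most dramatic one.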

“Do you still use a service after it’s been breached? I leave that up to you.”

Decompose the system. Map out your architecture and understand the systems. Look at the perimeters, how are credentials proliferated? Understand your data pipeline, where is your really valuable data stored?

Take time to consider things like exposed net ports, unpatched containers, weak secrets…there are tools for this. These tools can be found in a later slide here.
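
Dedicated tools do this far more thoroughly, but a minimal sketch shows what “exposed net ports” means in practice: can a TCP connection be opened? The helper names below are hypothetical, the port list is just the well-known defaults for SSH, PostgreSQL, Elasticsearch, and MongoDB, and a check like this should only be pointed at hosts you own.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def exposed_ports(host: str, ports: list) -> list:
    """Return the subset of `ports` that accept TCP connections on `host`."""
    return [port for port in ports if is_port_open(host, port)]


# Default ports for SSH, PostgreSQL, Elasticsearch, and MongoDB.
print(exposed_ports("127.0.0.1", [22, 5432, 9200, 27017]))
```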

Putting a Response to Security Threats into Action

Two words: impose constraints. To find which constraints work for you, start with a simple discovery process that includes:

  • Time, how long to solve? Timebox solutions, defensible use of existing time.
  • Complexity, how hard is it? Ask deep questions, iterate over which help.
  • Risk, how risky is the problem and solution?

Secrets management is a first start. Tom pretty much pwns this space and I encourage you to seriously check out his extensive work on the topic here.

In terms of tactical actions you can take today, Tom mentioned these few, but of course there are more:

  • At the code, start with at least something like git-crypt.
    Ask yourself, what should be thrown out before it goes anywhere else?
  • In configuration management scripts:
    Developing a master re-key strategy is a great exercise to flesh this out.
  • Storage…a tool like sneaker for S3
    Really makes you ask questions…who/how are buckets managed.

Summary

We need to be better at security, continuous or otherwise. We need to act. There are simple things you can do, but they need to be aligned to your team/organization risk strategy. And make it easy for others to do the right thing, so that it’s far more likely to happen without imposing huge effort cost.

Tom’s a great speaker, engaging and fun to listen to. He is also a huge community contributor and even runs a distributed DevRel (developer relations) Slack group. Tom is currently working on the CloudZero team.


Enterprise Wild West – DevOps Days Boston 2017

Rob Cummings’ keynote at DevOps Days Boston 2017 explored how Simon Wardley’s Pioneers, Settlers, and Town Planners model applies in enterprise engineering and large organizations. The general idea is:

  • Pioneers: explorers of new ideas, create prototypes, prove the need
  • Settlers: stealers of new ideas, move prototypes to MVP, prove feasibility
  • Town Planners: manufacturers, MVPs to industries, prove scalability

Bits or pieces?: On Pioneers, Settlers, Town Planners and Theft

The Problem: Overly Simplistic Approaches

Bi-modal IT splits the org into Mode 1 (systems of record) and Mode 2 (systems of innovation). Mode 1 has less line of sight to customers and is governed by enterprise architecture and governance. Mode 2 often runs into Mode 1 when …. The problem is that often, there’s no flow between Mode 1 and Mode 2. Bi-modal is overly simplistic.

The book “Thinking in Systems” is a great place to start your journey beyond these modes. Transition states and feedback loops exist already in your org, but realizing where they are and how they could be improved takes practice and group engagement.

Paul’s advice: Systems thinking is a much broader topic. If you haven’t actually studied it, it would serve you well to listen to The Fifth Discipline by Peter Senge. As context for my presentation in April on IoT testing, it made me realize that systems thinking was a necessary mental tool moving forward.

Everyone Innovates…Sometimes.

Pioneers live outside standards, fail often, and don’t necessarily make decisions based on metrics. Find the new horizon. That’s how they innovate, they bring ideas from outside in.

Settlers make prototypes real, building trust in the org, kick off ecosystems around the adoption of ideas, but sometimes suffer from adoption problems. They bring ideas further in to the org.

Town Planners focus on ops efficiency, build services and platforms that Pioneers rely on for future innovations. They’re metrics heavy and bring reality to the operation of ideas.

Fostering Friendly Theft

The Wild West is a “theft-based pull model”. There are no mandates. Theft occurs from right to left (pioneers on the left). re-use from left to right.. This is a good thing. Everyone is excellent and everyone both should participate in empathy. Foster feedback loops and maintain pull culture.

The Wild West model exists within a team, not as separate departments. Again, for DevOps we’re not talking about traditional cost centers and departments; we’ve got mixed teams that are aligned on a shared goal with their own perspectives on how to do things best, together.

Paul’s Take: DevOps Requires Buy-in from Everyone

For DevOps to work, a team needs to understand and adapt to their organizational ecosystem. So while the micro-mechanics of the Wild West help us pull new ideas in on a continual basis, there has to be an understanding that extends across the whole org.

Many conversations on day 1 of DevOps Days Boston 2017 expressed the need for “buy-in from the top”, but effective DevOps also requires buy-in from everyone. Teams need to align the virtues of DevOps to how they can positively impact the organization. It does no good for an SMB VP of Engineering to apply DevOps if the purpose of doing so hasn’t been clearly articulated in terms that other dependencies (like the developers, operations, sales, marketing, finance, and support) understand. But when you do so, it’s much easier to carry people with you in planning and execution.

DevOps is Organizational, Operational, and Orthogonal. Applying it in isolation only decreases the value it brings to us.

Scaling to the Enterprise

Rob shared an anonymized anecdote from a large company where the Wild West model was adopted:

A small group of pioneers realized “we need to fix this, can’t meet customer needs”. They knew how to do it and got CIO sponsorship. The team got to MVP status with code. Unfortunately, the Wild West model was not immediately adopted beyond that initial release.

“We were trying to push the model onto the team.” Even though everything done up to that point focused on ease-for-enterprise (weekly demos, code was open sourced, process transparency), adoption took time.

Eventually, another team took the ideas and model, shipped their thing to production, then other teams followed. “Now we have a ‘proliferation problem’…people started customizing tools and artifacts.” Teams often stuck with some favorite tools, and in DevOps culture, tailoring is huge.

But not everyone wants to build their own house. For example, code pipelines…yuk. So Planners came in and built a commodity pipeline platform. This requires talent: people who have skill, can scale, and understand operational efficiency.

Summary

Here are a few anti-patterns to avoid if you want to reduce friction and increase your flow.

  1. Using enterprise architecture to prevent waste and force adoption.
    Don’t use it as a gate to get to production!
  2. Relying on innovation labs or CoE for pioneers.
    Teams outside your org toss things in that often don’t work inside the org. Be super-public so settlers are likely to steal. Change CoE to “Center of Practice”, inclusive, then everyone can be excellent.
  3. Forgetting that your org requires a systems thinking approach.
    Create flows not barriers. Each role is filled with excellent people.

 

More reading:

No Root Cause in Emergent Behavior – DevOps Days Boston 2017

At DevOpsDays Boston 2017, Matthew Boeckman presented on how emergent behavior in complex systems requires us to re-think our root cause analysis paradigms. His slides are here. I also had a great time talking meta with Matthew after hours, but that’s for a later post.

Traditional RCA in Complex Systems

Unfortunately, traditional RCA focuses on what and who. Despite its roots at NASA, RCA in the software world is skewed toward finding only one channel of causality. A fishbone diagram shows this:

This might be okay for simple systems (e.g. 3-tier web/app/data servers). But there’s much more to real systems: networking, hosting, and operating environments. Beyond that, users access them in both benign and malevolent ways.

Waterfall encouraged us to minimize complexity by locking down state (i.e. promote a “don’t change” mentality). Waterfall (think 12mo cycles) encourages us to think that change is the developer’s fault. And there were a lot of constraints in the 80s and 90s, most of which no longer apply.

Root cause is fine for static models, but it breaks down when it comes to “lots of boxes”: cloud-based, dynamic, distributed systems. It’s very hard to trace the source of problems in this new world. Change vectors (a/b testing, reconfigurations, migrations, feature flags) abound; in fact, they’re encouraged.

Our systems are far more complex than they were 20 years ago. They involve the whole stack, the whole team, and the whole organization.

Paul’s Take: Occam’s Anti-Razor

A heuristic we often employ is Occam’s razor: in general, the simplest answer is often the right one. Coupled with confirmation bias, we (humans) often look for a single causal root to the problems we see. Then we build processes that inherit our bias. But what if operational failures occur because of multiple causes, chain reactions that exceed the typical ‘5 whys’ RCA model?

Almost as quickly as the concept of the razor was introduced, Walter Chatton, a contemporary of Ockham, countered the idea with: “If three things are not enough to verify an affirmative proposition about things, a fourth must be added, and so on.” Similarly, many sum up the balance of simplicity and complexity in solving problems with the quote “Make things as simple as possible, but no simpler.”, attributed to Einstein.

The idea is right fit…right fit of simplicity/complexity to the problem at hand. With complex systems, we can’t always assume that the simple answer is the most useful one in future scenarios.

Our Systems Aren’t Trees, They’re Forests

Emergence is about collective behaviors, systems we connect and integrate over time, and not simply the aggregate of behaviors emitted by individual subcomponents and nodes.

We need to develop, test, deploy, monitor, and resolve issues in them like the complex semi-organic systems they are, each part of an ecosystem of services and fallible subsystems. We can no longer afford to ignore better paradigms for dealing with them.

Enter Systems Thinking. Understanding why things emerge takes more than an ops dashboard and intuition. Sometimes analysis on complex problems requires a multi-variate perspective.

Paul’s advice: Systems thinking is a much broader topic. If you haven’t actually studied it, it would serve you well to listen to The Fifth Discipline by Peter Senge. As context for my presentation in April on IoT testing, this made me realize that systems thinking was a necessary mental tool moving forward.

Systems thinking helps us to identify activities, interactions, and ultimately change vectors contributing to emergent behaviors. Understanding which dials and levers are involved in the problem enables later actions to resolve the issue. This feeling of being at home in the problem space is also similar to “cynefin”, a Welsh/Gaelic term that in Scottish (my heritage!) means:

“a place to live and belong. where the nature of what’s around you feels right and welcoming”

Not at all coincidentally, the Cynefin framework as applied to emergent behavior helps us make quick decisions during and about incident management situations.

Staying Ahead of Emergent Behavior

The fact is that most workforces, small or large, are a revolving door. So is your current system state after multiple releases and infrastructure migrations. There be the monsters. Software is dynamic, and so should be your product discovery process, your learning loops, your incident management model, and so on.

The Cynefin framework gives us this quadrant visual to show that various issues need to be addressed differently:

The fact is, each of these quadrants assumes two things:

  1. The issue occurred already, so you need to fix it and learn from it
  2. Information needs to be radiated (sensed) to make “sense” of it

In my after-hours chat with Matthew, we dived into the issue of metrics. Measuring issue tracking goes beyond mean-time-to-resolution (MTTR). Issues that are flagged with *how* they were resolved using Cynefin categories now have an opportunity for improvement.

Paul’s Take: could this be a JIRA custom field? Just thinking out loud.

Tracking the delta on a specific issue (what approach someone thought should be used at first vs. what would have been better after the fact) is a way to measure success and improvement on a spot basis.

Then over time, aggregates can be used to show team and organizational reflexiveness to dynamic, emergent behavior. Though neither of us has customer anecdotes or proof-of-concept clients, I challenge you who are reading this to try it out for a few sprints or whatever intervals you use.
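
If you want to try this, the tagging idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Incident` fields, the `summarize` helper, and the four domain labels are hypothetical names chosen for this example, not part of JIRA or of anything Matthew presented. Each incident records the Cynefin domain the responder assumed at first and the one judged best in hindsight, and the aggregate reports how often the two agree alongside MTTR.

```python
from collections import Counter
from dataclasses import dataclass

# Assumption: four commonly cited Cynefin domains used as plain labels.
DOMAINS = {"simple", "complicated", "complex", "chaotic"}


@dataclass
class Incident:
    key: str                 # hypothetical issue identifier
    initial_domain: str      # approach the responder assumed at first
    retro_domain: str        # approach judged best after the fact
    hours_to_resolve: float  # time from detection to resolution


def summarize(incidents):
    """Aggregate the initial-vs-retrospective delta across incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    for inc in incidents:
        if inc.initial_domain not in DOMAINS or inc.retro_domain not in DOMAINS:
            raise ValueError(f"unknown Cynefin domain on {inc.key}")

    # An incident "matches" when the first instinct agreed with hindsight.
    matches = sum(1 for inc in incidents
                  if inc.initial_domain == inc.retro_domain)

    # Count which (assumed, hindsight) pairs are misjudged most often.
    mismatches = Counter(
        (inc.initial_domain, inc.retro_domain)
        for inc in incidents
        if inc.initial_domain != inc.retro_domain
    )
    return {
        "match_rate": matches / len(incidents),
        "mttr_hours": sum(inc.hours_to_resolve for inc in incidents)
        / len(incidents),
        "mismatches": mismatches,
    }


if __name__ == "__main__":
    sprint = [
        Incident("OPS-1", "complicated", "complicated", 2.0),
        Incident("OPS-2", "simple", "complex", 6.0),
    ]
    print(summarize(sprint))
```

Comparing `match_rate` from one sprint to the next gives the trend; the `mismatches` counter shows where first instincts most often diverge from hindsight, which MTTR alone would hide.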

Summary

We need to embrace emergent behavior and learn how to approach incidents better using systems thinking and frameworks like Cynefin. Unlike traditional RCA, we’ll need to step out of our comfort zones, see what works, and learn from our mistakes.

Matthew is a Denver, Colorado native, and has spoken at other conferences like Gluecon (wicked!). If you have questions, ping him (and me) on Twitter and let’s get a dialog going.

DevOps is Organizational, Operational, and Orthogonal

Some people seem to think that DevOps is a buzzword. It is not. At all.

As part of my research for integrating concepts of risk, quality assurance, and continuous testing into the IEEE 2675 working group (DevOps standard), I am realizing that there is no one single articulation of DevOps that seems to fit all contexts. However, in the spirit of DevOps, I’ll continue to iterate past this issue to explore aspects of the paradigm to provide value to people I meet and conversations we have. Here are three I’ve been pondering:

DevOps Is an Organizational Paradigm

DevOps is about breaking apart established paradigms, structures that worked for the prior generation of problems, management methodologies that are no longer the optimal solution for tomorrow’s problems today. DevOps is challenging institutional values that don’t actually lead to value. DevOps is helping us to take ownership over our success.
We are learning for the first time, every time, and deliberately discovering what we should know as we build the future.
Collaboration, contribution, sharing, learning, improvement, alignment, focus, value. These are words that describe our homegrown methodology, one whose aim is to meet the pace of innovation better than agility alone. Self management, self organization, self improvement. Shared understanding, shared goals, shared vision. Many experiments, many failures, and many wins.
It is fine if a single team wants to “try out DevOps”, but unless the organization is prepared to support and change to value the positive outcomes of that team, initiatives won’t go very far. In this way, it is a relational paradigm that applies between individuals, between teams, and between organizations too.

If you want to go fast, go alone. If you want to go far, go together.

There’s juice to that statement. Fast and far are relative to what needs to get accomplished (see this link for quote etymology).

DevOps Is an Operational Paradigm

Software tools are a huge part of DevOps conversations now. Why? Because automation and efficiency, sure, but also because it’s easier to feel confident and efficient in our own ignorance than to face the fact that most of software is about finding the right people to build the right software for other people.
Tools are only a part of our conversation. And more often, it’s tools (I mean outspoken assholes here) that dictate how [little] we understand about DevOps. Just because he buys all the ski equipment and reads a lot about which slopes are best doesn’t make him an expert. Practice matters, and practice means knowing the software landscape.
But tools and automation are only an enabler to the work, an outcome of good decisions; it is the team together which holds the capability to make better decisions tomorrow. And every day has new challenges which yesterday’s solutions won’t overcome, not to mention known challenges that demand experience and perseverance.
If you are automating the shit out of your pipeline, good for you. Is this truly helping you learn how to provide people value, or do you more often find employees arguing about which tool and approach is better? This is an example of how hyper-focusing on tools is counter to the goal of DevOps, to iteratively improve our ability to provide value to (and with) people.

DevOps Is an Orthogonal Paradigm

In this way, DevOps is a mindset that also encompasses those who may not necessarily think it applies to them. It must include everyone, each with our own skills and perspectives. It is not simply about developers and operations. It is about connecting contributors to consumption just as much as it connects consumers to contributions. It is about the whole supply chain, the whole delivery pipeline, and the whole collective of people impacting each other.
For DevOps to be really successful, its execution must be inclusive across boundaries. That includes more than just engineering teams; it involves recruitment, marketing, sales, customer support, HR, PR, and finance. In the IEEE 2675 working group, we are finding that these other groups are a necessary part of the supply chain that DevOps teams depend on. A few examples of the need for an orthogonal approach to DevOps are:
  • How can you go “faster” if your acquisition process takes many months?
  • How can you go “faster” if a supplier doesn’t provide a way to validate that you integrated their product or service correctly?
  • How can testing be continuous if it isn’t automated and therefore scalable?
  • How can you expect marketing to crush numbers if you don’t integrate them into your sprints (their work often needs weeks/months of lead time)?
  • If your onboarding process doesn’t train new engineering recruits (dev/test/ops/PM/IT) on your lifecycle, how can you expect them to “go fast”?
  • If it takes days/weeks for customer feedback to reach development cycles, how can you expect to be building the “right thing” tomorrow?
Every one of these questions takes some kind of answer that includes collaboration, which you can’t expect unless you foster positive work culture and encourage people to improve professionally and personally.

DevOps Is Just a Word

DevOps is a word we have now for the next set of ideas for how to sustainably move fast in the right direction. Unlike a manifesto, its goal isn’t to constrain, but to evolve. Hopefully we can come up with a better name in the future, which is highly likely because we iteratively learn. But DevOps is what we have now and so far it’s doing us a lot of good.