Thoughts on DevOps vs. Enterprise Culture Clash

Probably not unlike you, every day I work with folks caught in a clash between organizational processes and technology imperatives. “We have to get this new software up and running, but the #DevOps group won’t give me the time of day.”

Large organizations don’t have the luxury of ‘move fast, break stuff’; if they did, their infrastructure, security, financial, and software release processes would be a chaotic mess…far more than usual. But how does one ‘move fast’ without breaking enterprise processes, particularly ones that they don’t understand?

Enterprise, Know Thyself

The answer is simple: encourage engineers to always be curious to know more about their environment, constraints, and organizational culture. The more you know, the more nimble you’ll be when planning and responding to unanticipated situations.

Today I had a call with a health care company working to get Docker installed on a RHEL server provisioned by an infra team. The missing piece: the operator didn't know that the security team, which uses Centrify to manage permissions on that box, requires a ticket before granting 'dzdo su' access, and then only for a very narrow window of time. Additionally, the usual 'person to connect with' was off on holiday break, so we were at the mercy of a semi-automated process for handling these tickets, and because a similar request had already been filed in the past 7 days, all new tickets had to go through a manual verification process. This frustrated our friend.

The frustration manifested in the form of the following statement:

Why can't they just let me have admin access to this non-production machine for something more like 72 hours? Why only 2 measly hours at a time?

– Engineer at an F100 health care organization

My encouragement to them, delivered with empathy, was to "expect delays at first, don't expect everyone to know exactly how processes work until they've gone through them a few times, but don't accept things like this as discouragements from your primary objective."

If everything were easy and no problems existed, kind words might be useless. When things aren't working, knowing how to fix or overcome them goes a long way, just like a kind word at the right time. We crafted an email to the security team together explaining exactly what was needed AND WHY, along with an indication of the authority and best/worst-case timelines we were operating under, and a sincere thank you.

Enterprise “DevOps” Patterns that Feel Like Anti-Patterns

In my current work, I experience a lot of different enterprise dynamics at many organizations around the world. The same themes, of course, come up often. A few dynamics I’ve seen in play when enterprises try to put new technology work in a pretty box (i.e. consolidate “DevOps engineers” into a centralized team) are:

  1. Enterprise DevOps/CloudOps/infra teams adopt the pattern of “planned work”, just like development teams, using sprints and work tracking to provide manageable throughput and consistency of support to other organizational ‘consumers’. This inherits other patterns like prioritization of work items, delivery dates, estimable progress, etc.
  2. Low/no context requests into these teams get rejected because it’s slow/impossible to prioritize and plan based on ambiguous work requirements
  3. The amount of control and responsibility these teams have over the organization's security and infrastructure systems is often considered "high risk", so they're subject to additional scrutiny come audit time

That last point about auditing, particularly its psychological impact on 'move fast' engineers, cannot be overstated. When someone asks you to break protocol 'just this one time', it's you who's on the hook for explaining why you did so, rarely the product owner or director who pressured you into it.

Technical auditors worth anything more than spit focus on processes instead of narrow activities, because combing through individual log entries is not scalable…but verifying that critical risk-mitigating processes are in place, and checking for examples of when the process is AND isn't being followed…that's far more doable in the few precious weeks that auditing firms are contracted to complete their work.
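To make the sampling idea concrete, here's a minimal sketch (hypothetical record shapes; not any real audit tooling) of spot-checking deploy events for process adherence the way an auditor might:

```python
import random

# Hypothetical records: deploy events and the change tickets that should
# authorize them. Shapes are illustrative, not from any real system.
deploys = [
    {"id": "dep-101", "ticket": "CHG-2001"},
    {"id": "dep-102", "ticket": None},        # no ticket attached
    {"id": "dep-103", "ticket": "CHG-2003"},  # ticket never approved
]
approved_tickets = {"CHG-2001", "CHG-2002"}

def sample_adherence(events, approved, k=2, seed=42):
    """Spot-check a sample of events for process adherence, the way an
    auditor samples rather than combing through every log line."""
    random.seed(seed)
    followed, broken = [], []
    for event in random.sample(events, min(k, len(events))):
        (followed if event["ticket"] in approved else broken).append(event["id"])
    return followed, broken

followed, broken = sample_adherence(deploys, approved_tickets)
print("process followed:", followed)
print("process NOT followed:", broken)
```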

The More You Know, The Faster You Can Go (Safely)

An example of how understanding your enterprise organization’s culture improves the speed of your work comes from an email today between two colleagues at F100+:

Can you confirm tentative dates when you are planning to conduct this test? Also will it take time to open firewall, post freeze incident tickets can be fast tracked?

– Performance Engineering at Major Retailer

This is a simple example of proper planning. Notice that the first ask is for concrete dates, an inference that others also need to have their shit together (in this particular case because they're conducting a 100k synthetic user test against some system, not a trivial thing in the slightest). The knowledge that firewall rules have to be requested ahead of time, and that incident response should be notified that reported issues may be due to the simulation rather than real production traffic, comes from having experienced these things before. Understanding takes time.

Another software engineer friend of mine, from the open-source space, and I were discussing the Centrify thing today, and he asked: "why can't they just set up and configure this server with temporary admin rights off to the side, then route appropriate ports and stuff to it once it's working?" Many practitioners in the bowels of enterprises will recognize a few wild assumptions there. This is in no way a slight against my friend, but rather an example of how differently two very different engineering cultures think. More specifically, those who are used to being constrained and those who aren't often have a harder time collaborating with each other, because their reasoning is predicated on very different past experiences. I see this one a lot.

DevOps Is an Approach to Engineering Culture, not a Team

This is my perspective after only 5 years of working out what "DevOps" means. I encourage everyone to find their own through a journey of curiosity, keyboard work, and many conversations.

There is no DevOps 'manifesto', and there never should be. As Andrew Clay Shafer (@littleidea) once said, DevOps is about 'optimizing for people', not process or policy or one type of team only. Instead of manifesto bullet points, there are some clear and common principles that have stood the test of time since 2008:

  • A flow of work, as one-way as possible
  • Observability and Transparency
  • Effective communication and collaboration
  • A high degree of automation
  • Feedback and experimentation for learning and mastery

Some of the principles above come from early work like The Phoenix Project, The Goal, and Continuous Delivery; others come from more formalized research such as ISO and IEEE working groups on DevOps that I’ve been a part of over the past 3 years.

I don’t tend to bring the “DevOps is not a team” bit up when talking with F100s primarily because:

  • it’s not terribly relevant to our immediate work and deliverables
  • enterprises that think in terms of cost centers always make up departments, because "we have to know whose budget to pay them from and who manages them"
  • now that DevOps is in vogue with various IT leaders, it is perceived, much like the manifestation of Agile everywhere, as 'yet another demand from management to do things differently'; after being restructured, engineers often have enough open wounds that I don't need to throw salt on them
  • if this is how people grok DevOps in their organization, there’s little I as an ‘outside’ actor can do to change it…except maybe a little side-conversation over beers here and there, which I try to do as much as appropriately possible with receptive folks

However, as an approach to engineering culture, DevOps expects people to work together, to “row in the same direction”, and to learn at every opportunity. As I stated at the beginning of this post, learning more about the people and processes around you, the constraints and interactions behind the behaviors we see, being curious, and having empathy…these things all still work in an enterprise context.

As the Buddha taught, the Middle Path gives vision, gives knowledge, and leads to calm, to insight, to enlightenment. There is always a 'middle way', and IMO it is often the easiest path between extremes to get to where you want to be.

Put That in Your Pipeline and Smoke Test It!

I rarely bother to open my mouth as a speaker and step into a spotlight anymore. I've been mostly focused on observing, listening, and organizing tech communities in my local Boston area for the past few years. I just find that others' voices are often more worth amplifying than my own.

A friend of mine asked if I would present at the local Ministry of Testing meetup, and since she did me a huge last-minute favor last month, I was more than happy to oblige.

“Testing Is Always Interesting Enough to Blog About”

– James Goin, DevOps Engineer; permissioned quote from the Boston DevOps community, Dec 12th, 2019

The state and craft of quality engineering (not to mention performance engineering) has changed dramatically in the 5 years since I purposely committed to it. After wasting most of my early tech career as a developer not writing testable software, the latter part of my career has been what some might consider penance to that effect.

I now work in the reliability engineering space. More specifically, I’m a Director of Customer Engineering at a company focusing on the F500. As a performance nerd, everything inherits a statistical perspective, not excluding how I view people, process, and technology. In this demographic, “maturity” models are a complex curve across dozens of teams and a history of IT decisions, not something you can pull out of an Agilista’s sardine can or teach like the CMMI once thought it could.

A Presentation as Aperitif to Hive Minding

This presentation is a distillation of those experiences to date as research, mostly inspired by wanting to learn what other practitioners like me think when faced with the challenge of translating the importance of holistic thinking around software quality to business leaders.

Slides: bit.ly/put-that-in-your-pipeline-2019

Like I say at the beginning of this presentation, the goal is to incite collaboration about concepts, sharing the puzzle pieces I am actively working to clarify so that the whole group can get involved with each other in a constructive manner.

Hive Minding on What Can/Must/Shouldn’t Be Tested

The phrase 'Hive Minding' is (to my knowledge and Google results) a turn-of-phrase invention of my own. It's one incremental iteration past my work and research in open spaces, emphasizing the notions of:

  • Collective, aggregated collaboration
  • Striking a balance between personal and real-time thinking
  • Mindful, structured interactions to optimize outcomes

At this meetup, I beta-launched the 1-2-4-All method from Liberating Structures that had worked so well for me in France at a product strategy session last month. It balanced the opposing divergent and convergent modes of thinking, as discussed in The Creative Thinker's Toolkit, so well that I was compelled to continue my active research into improving group facilitation.

Even after a few people had to leave the meetup early, there were still six groups of four. In France there were eight contributors, so I felt that this time I had a manageable but still scaled (3x) experiment in how hive minding works with larger groups.

My personal key learnings

Before I share some of the community feedback (below), I should mention what I, as the organizer, saw during the meetup and in its outcomes afterward:

  • I need to use a bell or chime sound on my phone rather than having to interrupt people once the timers elapse for each of the 1-2-4 sessions; I hate stopping good conversation just because there's a pre-agreed-to meeting structure (see the timer sketch after this list).
  • We were able to expose non-quality-engineer people (like SysOps and managers) to concepts new to them, such as negative testing and service virtualization; hopefully next time they’re hiring a QA manager, they’ll have new things to chat about
  • Many people confirmed some of the hypotheses in the presentation with real-world examples; you can’t test all the things, sometimes you can’t even test the thing because of non-technical limitations such as unavailability of systems, budget, or failure of management to understand the impact on organizational risk
  • I was able to give shout-outs to great work I’ve run across in my journeys, such as Resilient Coders of Boston and technical projects like Mockiato and OpenTelemetry
  • Quite a few people hung out afterward to express appreciation and interest in the sushi menu of ideas in the presentation. They are why I work so hard on my research areas.
  • I have to stop saying "you guys". It slipped out twice and I was internally embarrassed that this is still a latent habit. At least one-third of the attendees were women in technology, and as important as it is to be an accomplice to underrepresented communities (including non-binary individuals), my words need work.
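For that first point, here's a minimal sketch of the timer-plus-chime idea, assuming a terminal environment ("\a" rings the terminal bell; swap in a real audio file if your setup supports it). The round durations are the commonly suggested ones for 1-2-4-All:

```python
import time

# Facilitation timer for 1-2-4-All rounds (Liberating Structures).
# Durations in minutes; adjust to taste.
ROUNDS = [("solo reflection", 1), ("pairs", 2), ("foursomes", 4), ("all together", 5)]

def run_rounds(rounds=ROUNDS):
    for name, minutes in rounds:
        print(f"Starting '{name}' round: {minutes} min")
        time.sleep(minutes * 60)
        # "\a" rings the terminal bell so the facilitator never has to
        # interrupt a good conversation mid-sentence.
        print(f"\aTime! '{name}' round is over.")

if __name__ == "__main__":
    run_rounds()
```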

A Few Pieces of Community Feedback, Anonymized

Consolidated outcomes of “Hive Minding” on the topics “What must be tested?” and “What can’t we test?”
  • What must we test?
    • Regressions, integrations, negative testing
    • Deliver what you promised
    • Requirements & customer use cases
    • Underlying dependency changes
    • Access to our systems
    • Monitoring mechanisms
    • Pipelines
    • Things that lots of devs use (security libraries)
    • Things with lots of dependencies
  • What can’t we test?
    • Processes that never finish (non-deterministic, infinite streams)
    • Brute-force enterprise cracking
    • Production systems
    • Production data (privacy concerns)
    • “All” versions of something, some equipment, types of data
    • Exhaustive testing
    • Randomness
    • High-fidelity combinations where dimensions exponentially multiply cases
    • Full system tests (takes too long for CI/CD)

A few thoughts from folks in Slack (scrubbed for privacy)…

Anonymized community member:

Writing up my personal answers to @paulsbruce’s hivemind questions yesterday evening: What can/should you test?

  • well specified properties of your system, of the form if A then B. Only test those when your gut tells you they are complex enough to warrant a test, or as a preliminary step to fixing a bug, and making sure it won’t get hit again (see my answer to the next question).
  • your monitoring and alerting pipeline. You can never test up front for everything, things will break. The least you can do is test for end failure, and polish your observability to make debugging/fixing easier.

What can’t/shouldn’t you test?

  • my answer here is a bit controversial, and a bit tongue in cheek (I’m the person writing more than 80% of the tests at my current job). You should test the least amount possible. In software, writing tests is very expensive. Tests add code, sometimes very complex code that is hard to read and hard to test in itself. This means it will quickly rot, or worse, it will prevent/keep people from modifying the software architecture or make bold moves because tests will break/become obsolete. For example, assume you tested every single detail of your current DB schema and DB behaviour. If changing the DB schema or moving to a new storage backend is “the right move” from a product standpoint, all your tests become obsolete.
  • tests will often add a lot of complexity to your codebase, only for the purpose of testing. You will have to add mocking at every level. You will have to set up CICD jobs. The cost of this depends on what kind of software you write, the problem is well solved for webby/microservicy/cloudy things, much less so for custom software / desktop software / web frontends / software with complex concurrency. For example, in my current job (highly concurrent embedded firmware, everything is mocked: every state machine, every hardware component, every communication bus is mocked so that individual state machines can be tested against. This means that if you add a new hardware sensor, you end up writing over 200 lines of boilerplate just to satisfy the mocking requirements. This can be alleviated with scaffolding tools, some clever programming language features, but there is no denying the added complexity)

To add to this, I think this is especially a problem for junior developers / developers who don’t have enough experience with large scale codebases. They are either starry-eyed about TDD and “best practices” and “functional programming will save the world”, and so don’t exercise the right judgment on where to test and where not to test. So you end up with huge test suites that basically test that calling database.get_customer('john smith') == customer('john smith') which is pretty useless. much more useful would be logging that result.name != requested_name in the function get_customer

the first is going to be run in a mocked environment either on the dev machine, on the builder, or in a staging environment, and might not catch a race condition between writers and readers that happens under load every blue moon. the logging will, and you can alert on it. furthermore, if the bug is caught as a user bug “i tried to update the customer’s name, but i got the wrong result”, a developer can get the trace, and immediately figure out which function failed
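To make the contrast they're drawing concrete, here's a minimal sketch of my own (hypothetical names and an in-memory stand-in for the storage layer, not the contributor's actual code) showing the mocked-style assertion next to the in-function logging they advocate:

```python
import logging

log = logging.getLogger("customers")

class Customer:
    def __init__(self, name):
        self.name = name

class Database:
    """In-memory stand-in for a real storage backend; purely illustrative."""
    def __init__(self):
        self._rows = {}
    def put(self, customer):
        self._rows[customer.name] = customer
    def get(self, name):
        return self._rows.get(name)

def get_customer(db, requested_name):
    result = db.get(requested_name)
    # The in-production check argued for above: cheap, runs against real
    # traffic, and alertable. It can catch races and data drift that a
    # mocked unit test never exercises.
    if result is not None and result.name != requested_name:
        log.error("get_customer mismatch: asked for %r, got %r",
                  requested_name, result.name)
    return result

# The unit test being critiqued: true by construction in a mocked
# environment, so it says little about behavior under real load.
db = Database()
db.put(Customer("john smith"))
assert get_customer(db, "john smith").name == "john smith"
```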

Then someone else chimed in:

It sounds like you’re pitting your anecdotal experience against the entire history of the industry and all the data showing that bugs are cheaper and faster to fix when found “to the left” i.e. before production. The idea that a developer can get a trace and immediately figure out which function failed is a starry-eyed fantasy when it comes to most software and systems in production in the world today.

The original contributor then continues with:

yeah, this is personal experience, and we don't just yeet stuff into production. as far as data-driven software engineering goes, I find most scientific studies to be of dubious value, meaning we're all back to personal experience. as for trace-driven debugging, it's working quite well at my workplace, I can go much more into details about how these things work (I had a webinar with qt up online but I think they took it down)

as said, it’s a bit tongue in cheek, but if there’s no strong incentive to test something, I would say, don’t. the one thing i do is keep tabs on which bugs we did fix later on, which parts of the sourcecode were affected, who fixed them, and draw conclusions from that

Sailboat Retrospective

Using the concept of a sailboat retrospective, here are the things that propelled us, the things that slowed us, and the things to watch out for:

Things that propel us:

  • Many people said they really liked the collaborative nature of hive minding and would love to do this again because it got people to share learnings and ideas
  • Reading the crowd in real-time, I could see that people were connecting with the ideas and message; there were no “bad actors” or trolls in the crowd
  • Space, food, invites and social media logistics were handled well (not on me)

Things that slowed us:

  • My presentation was 50+ mins, way too long for a meetup IMO.

    To improve this, I need to:
    • Break my content and narratives up into smaller chunks, ones that I can actually stick to a 20min timeframe on. If people want to hear more, I can chain on topics.
    • Recruit a timekeeper from the audience, someone who provides accountability
    • Don't get into minutiae and examples that bulk out my message, unless asked
  • Audio/video recording and last-minute mic difficulties kind of throw speakers off

    To fix this? Maybe bring my own recording and A/V gear next time.
  • Having to verbally interrupt people at the agreed-upon time-breaks in 1-2-4-All seems counter to the collaborative spirit.

    To improve this, possibly use a Pavlovian sound on my phone (ding, chime, etc.)

Things to watch out for:

  • I used the all-too-common gender-binary phrase "you guys" twice. Imagine rooms where saying that is somehow fine, yet saying "hey ladies" to a mixed crowd would be considered pejorative by many cisgender men. Everything can be improved, and this is certainly one thing I plan to be very conscious of.
  • Though it's important to have people write things down themselves, not everyone's handwriting can be read back by others afterward, certainly not without high-fidelity photos of the post-its.

    To improve this, maybe stand with the final group representatives and if needed re-write the key concepts they verbalize to the “all” group on the whiteboard next to their post-it.


Afterthoughts on Hive Minding

It's a powerful thing to understand how your brain works, what motivates you, and what you don't care about. There are so many things that can distract, but at the end of the day, very few things are immediately and measurably worth having done. Shipping myself to Europe until next week, for example, has already had measurable personal and professional impact.

One thing I experienced this week after injecting a little disruption to conformity yesterday was what I now call "hive minding", or otherwise assisting independent contributors in rowing in the same direction. The classical stereotype of "herding cats" implies that actors only care about themselves, but unlike cats, a bee colony shares an intuitive survival imperative to build and improve the structure that ensures its survival. Each bee might not consciously think about "lasting value", but it's built into their nature.

Be Kind, Rewind

I'm always restless, every success followed by a new challenge, and I wouldn't have it any other way, but it does lead to a growing consideration about plateauing. Plateauing is a million times worse than burning out. There are plenty of people and companies that have burned out already but are still doing something "functional" in a dysfunctional industry, and if the decision is to flip that investment, it's an easy one to make: fire them, trade them, or cut funding. But what do you do with a resource that has plateaued?

I think you'll know you've plateaued when you find yourself without restlessness. If necessity is the mother of invention, restlessness is the chambermaid of a clean mind. At least for me, like a hungry tiger in a cave, I must feed my restlessness with purposeful and aligned professional work. The only problematic moment with me: I like to get ahead of the problem of someone telling me what to do by figuring out what we (everyone, me and them) should be doing before someone with less context dictates it.

The sweet spot of this motion is to do it together, not in isolation and not dictatorially, but converging on the "right" goals and on alignment at the same time. The only surprise when you're riding the wave together is what comes next, and when you engineer this into the process, surprises are mostly good.

It took a while to arrive at this position. I had to roll up sleeves, work with many different teams in multiple organizations, listen to those whose shoes I don’t have the time or aptitude to fill, figure out how to synthesize their inputs into cogent and agreeable outcomes, and do so with a level of continuity that distinguishes this approach from traditional forms of management and group facilitation.

Don’t Try This On Your Own

The cost of adaptability is very high. If I didn't have an equally dedicated partner to run the homefront, none of this would work. She brings the same kind of commitment and focus to raising the kids as I do to what pays the bills. There are very few character traits and creature comforts we share, but in our obsession over the things that make the absolute best come out of what we have, she more than completes the situation.

In this lifestyle, I have to determine day by day and week by week which net-new motions/motivations I need to pick up and which I need to put down, either temporarily or permanently. This can feel like thrash to some, but for me, every day is a chance to re-assess based on all the days before now; I can either take that opportunity or not, but it is there whether or not I take it. If my decisions are only made in big batches, similar to code/product releases, I inherit the complexities and inefficiencies of "big measurement"…namely, a loss of granularity in iterative improvement.

Feedback Loops, Everywhere

As I explore the dynamics of continuous feedback loops beyond software and into human systems, a model emerges of feedback frequency and software delivery not as separate mechanisms, but as symbiotic. The more frequently you release, the more chances there are for feedback. The more feedback you can synthesize into value, the more frequently you want to release. One does not 'predict' the other; their rates bound each other.

What I mean is that a slow release cycle predicts slow feedback, and slow feedback predicts low value from releasing frequently; a fast feedback mechanism addicts people to faster release cycles. The relationship is shared, and the more extreme the dynamics feeding one side of it, the more the other side suffers. Maybe at some point, it's a lost cause.

An example from the performance and reliability wheelhouse is low/slow performance observability. When you can’t see what’s causing a severe production incident, the live investigation and post-mortem activity is slow and takes time away from engineering a more reliable solution. Firefighting takes dev, SRE, ops, and product management time…it’s just a fact. Teams that understand the underlying relationship and synthesize that back into their work tend to use SEV1 incidents as teachable moments to improve visibility on underlying systems AND behavioral predictors (critical system queue lengths, what levels of capacity use constitute “before critical”, architectural bottlenecks that inform priorities on reducing “tech debt”, etc.).
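As a sketch of what synthesizing a SEV1 back into the system can look like, here's a toy "before critical" check (hypothetical metric names and thresholds, assumed to come out of a post-mortem) that turns behavioral predictors into early warnings:

```python
# Thresholds learned from a past incident; names and numbers are hypothetical.
WARN_THRESHOLDS = {
    "order_queue_depth": 5_000,    # incident occurred near ~8,000
    "db_cpu_percent": 70,          # saturation was observed near ~90
    "cache_evictions_per_s": 200,
}

def early_warnings(telemetry):
    """Return the signals that have crossed their pre-critical thresholds."""
    return [
        f"{metric}={telemetry[metric]} exceeds warn threshold {limit}"
        for metric, limit in WARN_THRESHOLDS.items()
        if telemetry.get(metric, 0) >= limit
    ]

# Example reading, as it might come from a metrics API:
print(early_warnings({"order_queue_depth": 6_200, "db_cpu_percent": 55}))
```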

The point is that feedback loops take time and iterative learning to properly inject in a way that has a positive, measurable impact on product delivery and team dynamics.

Going from Feedback Loops to Iterations…Together

All effective feedback loops have one thing in common: they measure achievement levels framed by a shared goal. So you really have to work to uncover the shared goals in a team. If they suit you, and/or if you can accept the awesome responsibility to challenge and change them over time, it's a wild ride of learning and transforming. If not, find another team, company, or tribe. Everyone needs a mountain they can traverse, and shouldn't set themselves on a trail that will destroy them. This is why occasionally stepping back, collaborating, and reporting out what works and what doesn't is so important. Re-enter the concept of "team meetings".

Increasingly, most engineers I talk to abhor the notion of more meetings, usually because they've experienced their fair share of meetings that don't respect their time, or where their inputs have not been respectfully synthesized in a way they can see. So what, are meetings a bad thing?

Well, no, not if your meetings are very well run. This is not one person's job, though scrumbags and mid-level managers with confirmation bias abound, especially because meetings have no built-in NPS (net promoter score). A solution I've seen to the anti-pattern of ineffective meetings is to establish common knowledge of what an "effective" meeting looks like, how and why it works, and to expect those behaviors from everyone on the team and in the org.

How to Encourage Effective Collaboration in Meetings

Learn to listen, synthesize, and articulate back in real-time. Let too much time go by, and detail and context evaporate like winter breath. Capture as much of this context as you can while respecting the flow of the conversation. This will help you and others remember and respect the "why", and will allow people to see afterward what was missing (perspectives, thinking, constructs). Examples of capture include meeting minutes, pictures of post-its, non-private notes from everyone, and even recordings.

But in just about every team and organization there's a rampant misconception that ALL meetings must produce outcomes that look like decisions or action items. Those are very beneficial, but I've seen people become anti-productive when treating themselves and others as slaves to these outcomes. Making decisions too early drives convergent attitudes that are often uninformed, under-aligned, and destructive.

Some of the most effective meetings I’ve had share the following patterns:

  • know why you’re meeting, provide context before, and set realistic expectations
  • have the “right” people in the room
    • who benefit from the anticipated outcomes and therefore are invested in them
    • who bring absolutely critical perspective, whose absence would invalidate outcomes or cause significant toil to refactor back in afterward; not too few
    • who contribute to functional outcomes (as opposed to those known to bring dysfunction, disrespect others' time, or argue rather than align); not too many
  • agree on what positive and negative outcomes look like before starting in
  • use communication constructs to keep people on track with producing outcomes
  • have someone to ensure (not necessarily do all the) capture; note and picture taker
  • outcomes are categorized as:
    • clear, aligned decisions (what will happen, what worked, what didn’t, what next)
    • concrete concerns and missing inputs that represent blockers to the above
    • themes and sense of directional changes (i.e. we think we need to change X)
    • all info captured and provided as additional context for others

Trust AND Verify

One thing I keep finding useful is to challenge the “but” in “trust, but verify”. In English, the word “but” carries a negating connotation. It invalidates all that was said before it. “Your input was super important, BUT it’s hard to understand how it’s useful”…basically means “Your input was not important because it was not usable.”

My alternative is to “trust and verify”, but with a twist. If you’re doing it right, trust is easy if you preemptively provided an easy means to verify it. If you provide evidence along with your opinion, reasonable people are likely to trust your judgment. For me, rolling up the sleeves is a very important tool in my toolbelt to produce evidence for or against a particular position. I know there are other methods, both legitimate and nefarious, but I find that practical experience is far more defensible than constructing decisions based on shaky foundations.

All this said, even if you’re delivering self-evident verification with your work, people relationships take time and certainly take more than one or two demonstrative examples of trustability to attain a momentum of their own. Trust takes time, is all.

Takeaways and Action Items from This Week

Democratic decision processes are "thrashy". Laws and sausages: no one wants to know how they're made. In small teams going fast, we don't have the luxury of being ignorant of outcomes and the context behind them. For some people, "democracy" feels better than dictatorial decisions handed down without context; but those who still find a way to complain about the outcomes need to ask themselves, "did I really care enough to engage in a functional and useful way, and did I even bother to educate myself on the context behind the decision I don't like?"

Just like missing a critical perspective in a software team, in a global organization, when one region or office dominates an area of business (U.S. on sales, EU on security, for instance), this will inevitably bias outcomes and decisions affecting everyone. As the individual that I report to puts it, “scalability matters to every idea, not just when we’re ready to deploy that idea”. Make sure you have the right “everyone” in the room, depending on the context of your work and organizational culture.

Someone I met and deeply respect once told me "it's not enough to be an ally, you need to be an accomplice". In context, she was referring to improving the epic dysfunction of modern technology culture by purposefully including underrepresented persons. Even if we make a 10% improvement to women's salaries, hire more African-American engineers, and create a safer place for LGBTQ people, I still agree with the premise that doing these things isn't good enough. Put another way, receiving critical medical treatment for a gushing head wound isn't an "over-compensation", it's a measured response to the situation. The technology gushing head wound, in this case, is an almost complete denial from WGLM (white guys like me) that there is a problem, that doing nothing continuously enables the causes of the problem, that leadership on this doesn't necessarily look or think like us, and that action is needed now.

Bringing it back to the wheelhouse of this article, true improvement culture takes more than saying "sure, let me wave at you as an ally while you go improve the team". It takes being an accomplice (think: a getaway driver); we should ALL be complicit in decisions and improvement. Put some skin in the game, figure out how something truly worth improving, and worth your effort, maps to your current WIP (work in progress) limits, and you may find that you need to put something less worth your time down before you can effectively contribute to improvement work. Surrounding yourself with folks who get this will also increase the chances that you'll all succeed. This is not an over-compensation; it is what everyone needs to do now to thrive, not just survive.

Crossing Cross-functional Chasms

Initialization Phase

My first evening in La Ciotat: I picked up a rental car in town thanks to the good graces of Giulia, the front desk assistant who was coming off her shift. She called, then insisted she drive me to the pick-up office, where the attending lady was on her way out to deliver a car. Thirty not-so-awkward minutes later, after discussing dog grooming and training techniques in great depth, the attendant came back, and shortly I had a car. It was just starting to rain, and it had been years since my last stick shift. Crossed fingers and no stalls got me back to La Rose, but by that time, water was pouring down in sheets. The sprint from car to lobby was in place of the shower I had hoped to take earlier.

The hotel restaurant was leaking everywhere and occasionally losing power. The great thing about room charges is that a full bar downstairs doesn’t require the internet to hand over all sorts of drinks. People outside the lower forty-eight seem to intuit what to do when the credit card terminal is out of service. The lightning was faster and closer than I’ve ever seen from my fishing town, so it was a good time to revert to battery power and write this.

My recent recipe is bourbon (or tequila) and a splash each of lemon juice, creme de cacao, and absinthe, shaken and filtered into a highball with a thick peel of orange. A few months ago it was half high-quality sake and half Prosecco with a flake of rosemary. In a pinch, anything works, and when your bartender has flooding issues to deal with, you can ponder life under a canopy and try to stay dry. The following are my thoughts from underneath all of this.

Planning Phase

The thing about my work: it isn't scalable, because it serves different goals than other kinds of work. As Kent Beck describes in his "3X" model, there are modes of work that optimize for different localized outcomes but all serve the same higher-order goal. What is that goal? As Eliyahu Goldratt states, "the goal of an organization is to make money". Certainly commercial ones, but even non-profits need to do this in order to exist. I exercise aspects of each of Kent's three modes: explore, expand, and extract. I dig holes to find gold, and when I hit it I dig hard, then try to scale that out to optimize efforts to extract that gold.

In his epic distillations, The Innovator's Dilemma and The Innovator's Solution, Clayton Christensen puts a fine point on how a company that is not thinking of its next horizon at all points in the current extract motion has no lasting future. Despite the dilemma of where to divest funds and how to prioritize "next" work, I am looking to do that for whomever I work with. I want to help optimize what's currently being extracted, translate learnings into gaps and undiscovered opportunities, and continuously listen and learn "what's next" (ref: Tim O'Reilly in "WTF?: What's the Future and Why It's Up to Us"). If we're not doing that, either homogeneously as all actors in an organization or as a unique sub-function, then we're dooming our employees and product to obsolescence.

Implementation Phase

I do…a lot…of things in my current organization. Pre-sales guidance, analyst relations, strategic product planning, blog writing, speaking, webinars, on-site customer planning sessions, technical prototyping, automated pipeline and testing examples, collaboration with support on key customers, building examples, positioning and messaging assistance, customer engineering, amongst others. “Cross-functional” is an easy way of putting it. When friends ask about what I do, I just say “I help technology companies make better decisions.”

But when you're cross-functional, you get to see how diverse groups of people are and how differently they structure their goals. For some, it's money; for others it's lead acquisition; and for yet others, it's sprint deliverables and low-defect release cycles. For leadership it's all of these things plus happy employees, happy customers, and happy selves. I want all of these things and more…happy me and happy mine (family), which requires balance. Balancing multiple objectives takes a lot of practice, similar to my experiences with Uechi Karate-do. Balance isn't a static state; it's an ability to re-balance and prioritize based on shifting needs and constraints.

In planning one of four strategy sessions with one of the founders, I found myself thinking "our goals are not the same; he wants to prioritize an existing backlog around reporting, but I want to define the new state of the art for our industry". Once I realized that he had a different goal, we played better together; but I am not distracted. Maintaining the status quo has never been my strong suit, and I'm more useful when focused on what's next, not just what already was.

This is my current approach to balance: understand what drives people (myself included), listen to everyone, provide value that’s aligned to these motives, and circulate what makes sense to the organization. Catching the moment when a founder’s goals and my own differ happens in real-time, but only if I’m exercising balance along these guidelines.

Deployment Phase

This week, the plan is to listen, a lot. Especially because of the language gap, but also thanks to my eclectic manners of verbal communication, as evidenced last time, less seems to be more here. I am working to lock down the details of a new position, focused on bringing the customer perspective to every area of business and a translation of my own invention. The activities performed today have a tendency to be…predictable, easily replicable, and therefore boring to someone like me.

Though billed as “strategy sessions”, my feeling is that current leadership understands the need for all elements of the business to be engaged deeply…”lean forward” as I often call it. The real strategy happens next week, in decision meetings amongst founders and key business owners separate from the rest of the employees. This is an interesting model, right-fit I think for humans who need time to digest and consider various perspectives and potential directions.

Though many ideas and directions will be discussed over the next 7 days, we’ll need to prioritize and I don’t own the company. All I can do is help those who do have a clear understanding of internal and external dynamics, provide requisite evidence for my positions, and improve relationships with my counterparts here in France.

Monitoring Phase

Feedback loops are important whether you automate them or not (but automating them is the smart way to do it). How do you automate human interactions, though? The closest I've gotten is to "pull forward"…in my upcoming role, building in the demand and supply of effective internal and external collaboration. The partner channel is a significant dynamic in my current organization; much of it is channel business, but there is also a substantial technical element, as all good partnerships between tech companies should have. A colleague of mine is fantastic at tactical and technical delivery in this scope, but scaling these efforts out to the whole organization takes project/program management that he's not particularly keen to deliver himself.

A key element of "monitoring" my effect is to A) have traceable inclusion in conversations (via Salesforce currently) and B) through volunteered backchannel context, measure how many times my involvement improves what we're doing in partner, sales, marketing, and development work. This week would be an example of execution; after next week, I would explicitly ask my leadership what value they heard as a result of my presence. Pull forward isn't a zero-effort enterprise, but it is absolutely necessary if you take your individual impact lifecycle seriously.

Reintegration Phase

The bartender tonight quickly called for backup and started taking care of the flooding issues. I suspect this was because he knew that if he had just sat there and let the hotel restaurant get flooded, someone would ream the shit out of him the next day. He got ahead of the problem and solved it. This is what DevOps and SRE are all about: seeing that no one else is solving for the lasting value and putting patterns in place to help others do exactly that.

In its current state, this organization takes its time to synthesize and integrate learnings. Faster than most other teams I see, but not as fast as some; as with everything, pace can be improved. More importantly, alignment and transparency must always be improved, and that is not zero-effort in the slightest either. In a prior position, an SVP once stated that "alignment is about 50% of my time spent". With marginal variance, I wish that applied across all roles and responsibilities in every team and every organization I work with. Imagine the impact you'd have if 100% of your remaining 50% of heads-down work was wildly effective. That is what rowing in the same direction looks like.

Reprise

For this post, there is no "call to action" other than to leave a comment or engage on social channels if you like. I can always find something worthwhile for you and me to do together. Drop a line and let's chat about that.

This is Why #DevOps

Since around 2008, DevOps as a keyword has been growing steadily in mindshare. This post is my version of a landing page for conversations I have with folks who either know little about it, know a lot about it, or simply want to learn why DevOps is not a hashtag buzzword.

TL;DR: if you're not interested in reading a few paragraphs of text, then never mind. If you really must jump to my own articulation of DevOps, click here.

DevOps Is a Question, Not an Answer

The word “DevOps” is a starting point in my conversations with folks along the journey. Whether it’s over obligatory beers with local community members after a monthly meetup, in a high-pressure F100 conference room, in my weekly meetings with the IEEE, or internally in organizations I work with and accept benefits from, “DevOps” is often the start of a really important dialog. It’s a question more than a statement, at least the way I use it day to day, spiked with a small challenge.

It's a question of "what are you doing now, and where do you want to be". It's a question of "how are the things you're doing now incrementally improving, maybe compounding, your efforts". It's a question of how you remove both your backlog of toil and resolve the inbound toil that daily changes to software systems represent. It's a question of "how do you choose to think about your work as you're doing it?" DevOps is an improvement mindset, not just about code and configuration, but about organizational cohesion, communication, collaboration, and engineering culture.

DevOps Is More than “Developers and Operations”

Some staunch opinions limit the scope of DevOps to the activities of developers and operations engineers, and while managing scope creep is important, the agency and effect of DevOps are clearly intertwined with an organization's capability to foster a set of core values and principles. For over a decade, we've seen some teams going so much faster than others that it doesn't matter who's slower; the result out in production is buggy, insecure, and unscalable systems. This very common and dysfunctional clock-speed mismatch also extends to testing, risk and compliance, budgeting (i.e. part of planning), architecture, and proper monitoring/telemetry/measurement practices.

Similarly, an 'agile' team finds it very hard to be 'agile' without the buy-in of leadership, support from finance and legal, alignment with marketing and sales, and deep connection to its customers/stakeholders, along with a whole slew of other examples that underline how agile can't be just a dev-centric worldview.

I hesitate to invent more terms such as 'DevBizOps', 'DevTestOps', and 'DevSecOps' simply to segment off these augmented versions of DevOps, when these areas of concern should be integrated into systems delivery lifecycles to begin with. I don't want to conflate concepts, yet so many elements overlap that it's hard to have one conversation without having three others too. At the end of the day, inventing new terminology seems as arbitrary to me as digging fundamentalist heels in about keeping the scope of DevOps to two traditionally distinct groups, particularly when modern interpretations of DevOps blend skills and responsibilities to the point that we highly value (pay) SREs and "10x developers" precisely because they defy narrowly scoped job descriptions in favor of cross-functional roles.

What are *My* Core Values and Principles of DevOps?

Again, this is a question you should seek to answer in your own way, but after only five years of immersion and purposefully seeking out other perspectives daily, a few key elements seem to hold:

  • Transparency
    • Observability
    • Traceability
    • Open Systems
    • Great Documentation
  • Inclusivity
    • Collaboration
    • Swarming / Blameless Fault Handling
    • Customer/User-focus
    • Prioritization to Value
  • Measurement
    • Building Quality In
    • High-value Feedback Loops
    • Useful Monitoring and Telemetry
    • Clear SLA/SLO/SLIs
  • Improvement
    • Automate AMAP and by default
    • Continuous Learning
    • Coaching and Mentoring
  • Alignment
    • Small Batches, Short-lived Change Cycles
    • Clear Processes (Onboarding, Approvals, Operationalizing, Patching, Disposal)
    • Work Contextualized to Biz Value/Risk
    • All Stakeholders Represented in Decisions

DevOps Is Inclusive by Nature

When dialed in, the values and principles of DevOps foster an environment of high trust and safety, synthesize perspectives without losing focus, and balance personal and team capacity to compound individual contributions rather than burn out talented professionals. DevOps is not, as Audre Lorde (via Kelsey Merkley) puts it, our "master's tool", so long as we don't make it such; if we decide that DevOps must not inherit the meritocracy and exclusivity of corporate management and Agilista methodology, it can be a truly better alternative to what came before.

This is how DevOps can change things, this is why it must. New thinking and new voices provide a way out of tech monoculture. In order to improve a system, you need more than the one option you’ve already found faulty. This is why I personally support local non-profits like Resilient Coders of Boston. DevOps values and principles offer a far better chance for new voices to be at the table, at least the version I’m advocating.

DevOps Improves Organizations In-Place

Imagine if your teams, your colleagues, your leadership really internalized and implemented the core values and principles defined above. Doing so in finance, human resources, marketing, sales, and even [dare I say] board of trustees groups would cause a natural affinity for supporting systems engineering teams. Conversely too, developers about to change a line of code would be more likely to ask “how does this impact the customer and my organization’s goals?”, “what documentation do customers and sales engineers need to effectively amplify the new feature I’m developing?”, and “how could this process be improved for everyone?”.

There is an old stereotype of programmers being bad at "soft skills", resulting in miscommunication, disconnect of work from business value, "not my problem" mentality, and a throw-it-over-the-fence mindset. My perspective on DevOps is that none of these things would have room to materialize in organizations that put processes in place to ensure the above values and principles are the norm.

Everyone can do these things, there’s nothing stopping us, unicorn to F100. Importantly, transitioning from what is currently in place to these values takes time. Since time is money, it takes money too. DevOps advocacy isn’t good enough to make the case for these changes and the spend that comes attached. No one developer or even team can change an organization, it takes Gestalt thinking and demonstrated value to get people ‘bought in’ to change.

Facilitating DevOps Buy-In

The first thing to get straight about DevOps is that it takes conversation and active engagement. This is not and never should be something you go to school or get a certificate for; it is a journey more akin to learning a martial art like Uechi-ryu than a course you pay thousands of dollars for at some tech industry Carnivale.

Gather those in your organization who are interested in having a conversation about the cultural implications of DevOps values and principles, using something lightweight like a lunchtime guild, a book club, or even a Slack channel. Listen to people, to what they think and already know, and don't assume your perspective is more accurate than theirs. Be consistent about when and what you do together to further the dialog, hold a local open spaces meetup (or find someone who does this, like me), invite people from outside the engineering team scope, such as a VP or a member of product management at your organization, and ASK THEM afterwards what they thought.

Once you have people from different levels engaging on similar wavelengths about DevOps, ask them to help each other understand some first tangible and tactical steps to improve the current situation, based on the core values and principles either defined above or further crafted to fit your circumstances. Get a headline name from one or more regional DevOpsDays organizing committees to come visit and offer an outside perspective on what you've got going. And importantly, make time for this improvement. SEV1 incidents aside, there's always some weekly space on everyone's calendar that's an optimal time to get together.

Or you can just ping me [at] paulsbruce [dot] io or on Twitter and we can figure out a good way for you to get started on your own DevOps journey. I’m always happy to help.

On Lack of Transparency in SaaS Providers

As many organizations transition their technical systems to SaaS offerings they don't own or operate, I find it surprising how often the company acquiring a 3rd-party offering deployed on such services is told to "just trust us" about security, performance, and scalability. I'm a performance nerd; that and a DevOps mindset are my most active areas of work and research, so this perspective is scoped to that topic.

In my experience amongst large organizations and DevOps teams, the "hope is not a strategy" principle seems to go missing in the transition from internal team-speak to external service agreement. Inside a 3rd-party vendor, say Salesforce Commerce Cloud, I'm sure they're very skilled at what they do (I'm not guessing here, I know folks who work in technical teams in Burlington MA). But even if you espouse a trust-but-verify culture internally, telling customers concerned about the performance of your offering at scale to "just trust us" seems misaligned.

TL;DR: SaaS Providers, Improve Your Transparency

If you provide a shared-tenancy service based on cloud and I can't acquire service-level performance, security audits, and error logs isolated to my account, that is itself a transparent view into how little your internal processes (if they even exist around these concerns) actually improve the service for me, your customer.

If you do provide these metrics to internal [product] teams, ask "why do we do that in the first place?" Consider that the answers you come up with almost always apply equally to the external consumers who pay for your services: they are also technologists, have revenue on the line, and care about delivering value successfully with minimal issues across a continuous delivery model.

If you don’t do a good job internally of continuously measuring and synthesizing the importance of performance, security, and error/issue data, please for the love of whatever get on that right now. It helps you, the teams you serve, and ultimately customers to have products and services that are accurate, verifiable, and reliable.

How Do You Move from “Trust Us” to Tangible Outcomes?

Like any good engineer, when a problem is big or ambiguous, start breaking that monolith up. If someone says "trust us", be specific about what you're looking to achieve and what you need in order to do that, which puts the onus on them to map what they have to your terms. Sometimes this is easy, other times it's not. Both outcomes yield useful information: what you do know and what you don't. Then you can double-click into how to unpack the unknowns (and unknowables) in the new landscape.

For SaaS performance, at a high level we look for:

  • Uptime and availability reports (general) and the frequency of publication
  • Data on latency, the more granular to service or resource the better
  • Throughput (typically in Mbps etc.) for the domains hosted or serviced
  • Error # and/or rate, and if error detail is also provided in the form of logs
  • Queueing or otherwise service ingress congestion
  • Some gauge or measure of usage vs. [account] limits and capacity
  • Failover and balancing events (such as circuit breaks or load balancing changes)

You may be hard-pressed to get some of these pieces of telemetry in real time from your SaaS provider, but they serve as concrete talking points about what typical performance engineering practices need to verify in systems under load.
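When a provider won't share telemetry, you can still measure what's observable from the outside. Here's a minimal client-side probe (hypothetical URL; a polite probe, not a load test) that collects latency and error-rate stats of the kind listed above:

```python
import time
import urllib.request
from statistics import mean, quantiles

URL = "https://status.example-saas.com/health"  # hypothetical endpoint

def probe(url, samples=20, timeout=10):
    """Sample request latency and error rate from the client side."""
    latencies, errors = [], 0
    for _ in range(samples):
        start = time.perf_counter()
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                resp.read()
        except Exception:
            errors += 1
        latencies.append(time.perf_counter() - start)
        time.sleep(1)  # be polite; this is a probe, not a load test
    p95 = quantiles(latencies, n=20)[-1]  # 95th-percentile cut point
    return {"mean_s": mean(latencies), "p95_s": p95,
            "error_rate": errors / samples}

print(probe(URL))
```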

Real-world Example: Coaching a National Retailer

A message I sent today to a customer, names omitted:

[Dir of Performance Operations],

As I'm on a call with the IEEE on supplier/acquirer semantics in the context of DevOps, it occurs to me that a key element missing in [Retailer's] transition from last year's legacy web solution to what is now deployed via Commerce Cloud is transparency over service underpinnings (or simply our not asking for it); this is a significant risk, both in terms of system readiness and unanticipated costs. My work with the standard brought up two ideas about what [Retailer] should expect from Salesforce:

A) what their process is for verifying the readiness of the services and service-level rendered to [Retailer], and

B) demonstrated evidence of what occurs (service levels and failover mechanisms) under significant pressure to their services

In the past, [Retailer's] performance engineering practice had the agency both to put pressure on your site/services AND, importantly, to measure the impact on your infrastructure. The latter is missing in their service offering, which means that if you run tests and the system results don't meet your satisfaction, the dialog to resolve them with Salesforce lacks minimum-viable technical discussion points on what specifically is going wrong and how to fix it. This will mean sluggish MTTR and potentially synthesizing the expectation of longer feedback cycles into project/test planning.

Because of shared tenancy, you can’t expect them to hand over server logs, service-level measurements, or real-time entry points to their own internal monitoring solutions. Similarly, no engineering-competent service provider can reasonably expect consumers to “just trust” that an aggregate product-plus-configuration-plus-customizations solution will perform at large scale, particularly when mission-critical verification was in place before fork-lifting your digital front door to Salesforce. We [vendor] see this need for independent verification of COTS all the time across many industries, despite a lack of proof of failure in the past.

My recommendation is that, building on what you started by creating a ticket with them on this topic, we progressively seek thorough information on points A and B above from a product-level authority (i.e. the product team). If that comes via a support or account rep, that’s fine, but it should equip you to ask more informed questions about architectural service limits, balancing, and failover.

//Paul

What Do You Think?

I’m always seeking perspectives other than my own. If you have a story to tell, a question, or another augmentation to this post, please do leave a comment. You can also reach out to me on Twitter, LinkedIn, or email [“me” -at– “paulsbruce” –dot- “io”]. My typical SLA for latency is less than 48hrs unless requests are malformed or malicious.

Performance Engineer vs. Tester

A performance engineer’s job is to get things to work really, really well.

Some might say that the difference between being a performance tester and a performance engineer boils down to scope. The scope of a tester is testing: to construct, execute, and verify test results. An engineer seeks to understand, validate, and improve the operational context of a system.

Sure, let’s go with that for now, but really the difference is an appetite for curiosity. Some people treat monoliths as something to fear or control. Others explore them, learn how to move beyond them, and how to bring others along in the journey.

Testing Is Just a Necessary Tactic of an Engineer

Imagine being an advisor to a professional musician, their performance engineer. What would that involve? You wouldn’t just administer tests, you would carefully coach, craft instruction, listen and observe, seek counsel from other musicians and advisors, ultimately to provide the best possible path forward to your client. You would need to know their domain, their processes, their talents and weaknesses, their struggle.

With software teams and complex distributed systems, a lot can go wrong very quickly. Everyone tends to assume their best intentions manifest into their code, that what they build is today’s best. Then time goes by, and everything more than 6 months old is already brownfield. What if the design of a thing is so riddled with false assumptions and unknowns that everything is brownfield before it even begins?

Pretend with me for a moment that if you were to embody the software you write, become your code, and look at your operational lifecycle as if it were your binary career, your future would be a bleak landscape of retirement options. Your code has a half-life.

Everything Is Flawed from the Moment of Inception

Most software is like this…not complete shit, but more like well-intentioned gift baskets full of fruits, candies, pretty things, easter eggs, and bunny droppings. Spoils the whole fucking lot when you find them in there. A session management microservice that only starts to lose sessions once a few hundred people are active. An obese 3MB CSS file accidentally included in the final deployment. A reindexing process that tanks your order fulfillment time to 45 seconds, giving customers just enough time to rethink.

Performance engineers don’t simply polish turds. We help people not to build broken systems to begin with. In planning meetings, we coach people to ask critical performance questions by asking those questions in a way that appeals to their ego and curiosity, at a time when it’s cost effective to do so. We write in BIG BOLD RED SHARPIE in a corner of the sprint board what percentage slow-down to the login process the nightly build has now caused. We develop an easy way to assess the performance of changes and new code, so that task templates in JIRA can include the “performance checkbox” in a meaningful way, with simple steps on a wiki page.
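
To make that checkbox meaningful, a nightly check can be as small as the sketch below: compare login timings from the latest build against a stored baseline and compute the sharpie number. The file names, JSON shape, and 10% threshold are illustrative assumptions, not a prescribed tool:

    import json

    def p95(values):
        """95th percentile via nearest-rank; good enough for a nightly smoke check."""
        ordered = sorted(values)
        return ordered[max(0, int(round(0.95 * len(ordered))) - 1)]

    def login_slowdown(baseline_path="baseline_login_ms.json",
                       nightly_path="nightly_login_ms.json",
                       threshold_pct=10.0):
        """Compare nightly login timings to a stored baseline; flag regressions."""
        with open(baseline_path) as f:
            baseline = json.load(f)   # e.g. [312, 298, 305, ...] in milliseconds
        with open(nightly_path) as f:
            nightly = json.load(f)

        base_p95, night_p95 = p95(baseline), p95(nightly)
        slowdown_pct = (night_p95 - base_p95) / base_p95 * 100
        print(f"login p95: {base_p95:.0f} ms -> {night_p95:.0f} ms "
              f"({slowdown_pct:+.1f}%)")
        if slowdown_pct > threshold_pct:
            print("SHARPIE NUMBER: login regression exceeds threshold; flag the build")

    if __name__ == "__main__":
        login_slowdown()

Anything that produces a single, loud number a team can’t ignore does the job; the tooling matters far less than the habit.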

Engineers Ask Questions Because Curiosity Is Their Skill

We ask how a young SRE’s good intentions of wrapping up statistical R models from a data science product team in Docker containers to speed deployment to production will affect resources, and how they intend to measure the change impact so that the CFO isn’t knocking down their door the next day.

We ask why the architects didn’t impose requirements on their GraphQL queries to deliver only the fields necessary within JSON responses to mobile app clients, so that developers aren’t even allowed to reinvent the ‘SELECT * FROM’ mistake so rampant in legacy relational and OLAP systems.
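
For illustration, here’s what that guardrail looks like from the client side, against a hypothetical GraphQL endpoint and schema: the query names only the fields the mobile screen actually renders, instead of pulling everything the type exposes:

    import json
    import urllib.request

    GRAPHQL_URL = "https://api.example.com/graphql"  # hypothetical endpoint

    # Over-fetching (the GraphQL flavor of 'SELECT * FROM') would pull every
    # field on the type. The narrow query below asks only for what the mobile
    # order-list screen displays.
    NARROW_QUERY = """
    query OrderList {
      orders(first: 20) {
        id
        status
        total
      }
    }
    """

    def fetch_orders():
        """POST the narrow query; endpoint and schema are illustrative."""
        payload = json.dumps({"query": NARROW_QUERY}).encode("utf-8")
        req = urllib.request.Request(
            GRAPHQL_URL,
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.load(resp)["data"]["orders"]

The architectural question is whether anything enforces that discipline (persisted queries, query cost limits, review) or whether clients are free to over-fetch.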

We ask what the appropriate limits should be for auto-scaling and load balancing strategies, and when we’d like to be alerted that our instance limits and contractual bandwidth limits are approaching cutoff levels. We provide cross-domain expertise from Ops, Dev, and Test to continuously integrate the evidence of false assumptions back into the earliest cycle possible. There should be processes in place to expose and capture things which can’t always be known at the time of planning.
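
The alerting side of that can start embarrassingly simple, as in this sketch; the instance cap, bandwidth ceiling, and 80% warning level are illustrative assumptions:

    WARN_AT = 0.80  # warn when usage crosses 80% of a cutoff (illustrative)

    LIMITS = {
        "instances": 50,          # assumed account-level auto-scaling cap
        "bandwidth_mbps": 1000,   # assumed contractual throughput ceiling
    }

    def check_limits(current):
        """Return a warning for each metric approaching its cutoff."""
        warnings = []
        for metric, limit in LIMITS.items():
            usage = current.get(metric, 0) / limit
            if usage >= WARN_AT:
                warnings.append(
                    f"{metric}: {usage:.0%} of limit ({current[metric]}/{limit})")
        return warnings

    if __name__ == "__main__":
        # 43/50 instances = 86% -> alerts; 615/1000 Mbps = 62% -> quiet
        for w in check_limits({"instances": 43, "bandwidth_mbps": 615}):
            print("ALERT:", w)

The point is getting the warning well before the hard cutoff, while there’s still time to act, not after the circuit breaker has already tripped.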

Testers ask questions (or should) before they start testing: entry/exit criteria, requirements gathering, test data, branch coverage expectations, results format, sure. Testing is important, but it is only a tactic.

Engineers Improve Process, Systems, and Teams

In contrast, engineering has the curiosity and the expertise to get ahead of testing, so that when the time comes, the only surprises are the ones that are actually surprising: problems that no one could have anticipated. Engineers then advise on how to solve them based on evidence and team feedback collected throughout planning, implementation, and operation cycles.

An engineer’s greatest hope is to make things work really, really well. That hope extends beyond the software, the hardware, and the environment. It includes the teams, the processes, the business risks, and the end-user expectations.

Value Chain in DevOps

Foreword: Since I highly doubt the following concepts will see the light of day in the final draft of IEEE 2675, I wanted to document that I did in fact push this to the group on June 15th, 2017. During a subsequent review, it got huge push-back from our Curator in Chief, the first of a string of events that led me to publish this primary work on my personal blog.

What is a ‘value chain’?

A value chain is a set of activities that a firm operating in a specific industry performs in order to deliver a valuable product or service for the market. The concept comes through business management and was first described by Michael Porter in his 1985 publication, Competitive Advantage: Creating and Sustaining Superior Performance.[1]

The idea of the value chain is based on the process view of organizations, the idea of seeing a manufacturing (or service) organization as a system, made up of subsystems each with inputs, transformation processes and outputs. Inputs, transformation processes, and outputs involve the acquisition and consumption of resources – money, labor, materials, equipment, buildings, land, administration and management. How value chain activities are carried out determines costs and affects profits.— IfM, Cambridge[2]

[Wikipedia: Value Chain (Porter)]

How does it relate to IEEE 2675?

As related to DevOps, the Value Chain for Software Delivery is an application of a lifecycle perspective that scopes standards adherence to only certain individuals based on their participation in primary or supporting activities related to a particular software product.

DevOps is about continuously delivering value to users/customers/consumers. A value chain perspective disaggregates activities from an organizational funnel, making it easier for teams and consumers of a particular product or service to ask “does this thing meet the standard?” without unintentionally including other unrelated teams or products, as would happen if the question were phrased as “does this organization meet the standard?”

What problem are we solving by using ‘value chain’?

Adoption. In large organizations with many independent groups or product teams, DevOps principles may apply across the value chain of one product but not another, provided these products are completely independent from one another. For an organization or team to claim that a particular software product adheres to this standard, all aspects of that product’s value chain must implement the principles and practices set forth in this standard.

Examples of where the broadness of using “organization” presents a challenge to adoption:

  • Consulting agencies with many independent project teams working on separate contracts/products
    • If only one of those contracts requires adherence to 2675, does that require every team/contract (both in the future and retroactively) to do the same?
    • Using “value chain” would scope 2675 adherence to any and all parties performing activities germane to delivering that specific contract/product
    • We know that if even one team implements DevOps per 2675 and sees success, organizations are likely to grow that out to other teams over time; “value chain” helps adoption of the standard

  • Enterprises in the midst of transformation to DevOps
    • Can they claim adherence for a specific product if the “whole organization” can’t yet?
    • Much like the agencies argument above, when the scope of adherence is based on activities relating to the delivery of a product, enterprises are far more capable of becoming “DevOps ready” because they can grow the practice out *over time*
    • In DevOps, key success factors are determined (driven) by customers, the end users.

  • Organizations’ requirements from non-technical internal agencies
    • Can we expect that the legal or HR departments are also “DevOps”?
      This would of course need to be defined by the activities that support the delivery side of the business (e.g. billing, purchasing), begetting an activities-based perspective on implementation rather than organizational labeling

Why is this of importance to IEEE 2675?

In layman’s terms, adoption of IEEE 2675 at an organizational level can’t happen overnight, especially in large enterprises with many teams. Fortunately, it doesn’t have to, provided we adequately scope ‘shall’ statements with a perspective that A) is reasonable in scope and impact on the org, and B) enables parties to agree on what it means for a software product to have been developed and delivered using this standard.

How and where would we use ‘value chain’?

Places in the text where use of the word ‘organization’ could imply that IEEE 2675 must be implemented across the whole organization before any single team or product could claim adherence to the standard. For instance:

  • Too broad:

    “Organizations shall implement effective continuous delivery procedures aligned with the architecture definition in a manner that meets the business needs of the system.” … “DevOps itself requires effective architecture across the organization to ensure that application build, package and deployment procedures are implemented in a robust and consistent manner.” (6.4.4)

  • Scoped: 

    “Organizations shall implement effective continuous delivery procedures within a particular value chain that are aligned with the architecture definition and in a manner that meets the business needs of the system.” … “DevOps itself requires effective architecture across the whole value chain to ensure that application build, package and deployment procedures are implemented in a robust and consistent manner.”

  • Too broad: 

    “Organizations shall maintain an accurate record of both code deployed and fully automated mechanisms in order to remove obsolete code.”

  • Scoped:

    “Organizations shall maintain an accurate record of both code deployed and fully automated mechanisms in a value chain in order to remove obsolete code.”


AllDayDevOps 2018: Progressive Testing to Meet the Performance Imperative

This post is mostly an appendix for references and readings, but whenever I can, I like to have a self-hosted post to link everything back to for a particular presentation.

My slides for the presy: https://docs.google.com/presentation/d/1OpniWRDgdbXSTqSs8g4ofXwRpE78RPjLmTkJX03o0Gg/edit?usp=sharing

Video stream: http://play.vidyard.com/hjBQebJBQCnnWjWKnSqr6C

A few thoughts from my journal today (spelling and grammar checks off):

Will update as the conversation unfolds.