Folding Open Source into Enterprise DevOps

Open source software (OSS) is a foundational part of the modern software delivery lifecycle. Enterprise teams with DevOps aspirations face unique challenges in the compliance, security, reliability, and sustainability of OSS components. Organizations in transformation need a complete picture of the risk they take on when integrating open source components.

This article explores how to continuously factor community and ecosystem health into an OSS risk analysis strategy.

The Acquisition Process for Open Source Software

Developers need to build on the successes and contributions of others. Having the flexibility to integrate new open source components and new versions of existing dependencies enables teams to go fast, but external code must be checked and validated before becoming part of the trusted stack.

Including someone else’s software is an important moment of engagement. Enterprises typically wrap a formal ‘acquisition’ process around this moment, in which the ‘supplier’ (the entity that provides the software/service) and the ‘acquirer’ (the entity that wants to integrate the software/service) formalize a contract.

Though there are already commercial approaches to introducing software packages safely by companies like Sonatype, Black Duck, and others, my question extends beyond the tools conversation to encompass the longer-term picture of identifying and managing risk in software delivery.

Enterprises care deeply about risk. Without addressing this concern, engineering teams will never actualize the benefits of DevOps.

This is a tangible example of why DevOps must apply not only at the individual team level, but across the broader organization as well. It takes alignment between the team that needs software and the teams providing compliance and legal services to move at a pace that matches the clock speed of software delivery.

Communities Empower Enterprises to Address this Gap

Today in a Global Open Source Governance Group Chat, I asked the question:

“What are some methods for determining how significant a supplier/vendor OSS and community contributions are, relative to acquirer confidence?”

This question stems from my involvement with the IEEE 2675 working group, particularly because I see:

  • Prolific use of OSS in DevOps and in enterprise contexts
  • Reluctance and concern (rightly so) around integration of OSS in enterprise software development and operation in production
  • The convergence of compliance and automation considerations
  • How important transparency and collaboration are to the health of OSS, but also to the supply and acquisition processes in a DevOps lifecycle

The group, expertly facilitated by Lauri Apple, also included key insights from Paul Burt and Jamie Allen. A log of the conversation can be found on Twitter.

As open source projects (like Swagger/OADF, for instance) become increasingly important to enterprise software delivery, tracking the health and ecosystem of any new components being introduced becomes equally important.

My point-of-view is that organizations should prepare a checklist for software teams to construct a complete picture of risk introduced by OSS (not to mention proprietary) components. This checklist must include not only static analysis metrics but support, engagement, funding, and contribution considerations.

Measuring OSS Project + Community Health

The group had many suggestions that I wouldn’t have otherwise thought about, which is another reason more people should get involved in dialogues like this.

There are already providers of aggregate information on open source community health and contribution metrics, such as CHAOSS, a Linux Foundation project, and Bitergia. This data can be integrated easily into dependency management scripts (Groovy, npm, Ant, Maven, etc.) and, at the very least, written into a delivery pipeline as part of pre-build validation (build verification testing is too late).
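As a concrete illustration, here is a minimal sketch (in Python) of what such a pre-build health gate might look like. The metric names and thresholds are hypothetical stand-ins for the kind of data CHAOSS-style tooling or Bitergia can provide; check their actual APIs for real field names.

```python
# Sketch of a pre-build health gate for OSS dependencies.
# Metric names and thresholds below are illustrative assumptions only.

DEFAULT_THRESHOLDS = {
    "active_maintainers": 2,          # bus factor: at least two people merging
    "commits_last_90d": 10,           # project is not dormant
    "median_issue_response_days": 7,  # community actually responds
}

def health_gate_failures(metrics: dict, thresholds: dict = DEFAULT_THRESHOLDS) -> list:
    """Return a list of failed checks; an empty list means the dependency passes."""
    failures = []
    if metrics.get("active_maintainers", 0) < thresholds["active_maintainers"]:
        failures.append("too few active maintainers")
    if metrics.get("commits_last_90d", 0) < thresholds["commits_last_90d"]:
        failures.append("project looks dormant")
    if metrics.get("median_issue_response_days", 999) > thresholds["median_issue_response_days"]:
        failures.append("slow issue response")
    return failures

if __name__ == "__main__":
    # A healthy dependency sails through; a dormant one fails the gate.
    healthy = {"active_maintainers": 5, "commits_last_90d": 120, "median_issue_response_days": 1}
    dormant = {"active_maintainers": 1, "commits_last_90d": 0, "median_issue_response_days": 30}
    assert health_gate_failures(healthy) == []
    assert len(health_gate_failures(dormant)) == 3
```

In a pipeline, a non-empty failure list would fail the pre-build validation step before the dependency ever reaches the trusted stack.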

And there is honest, hard-hitting research on open source software…which you should take the time to read…from Nadia Eghbal, published by the Ford Foundation in a report called ‘Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure’. If you don’t have time to read it, get some text-to-speech software and listen to it when you’re in transit.

The group also identified some key characteristics of OSS community health not necessarily tracked by established services, such as:

  • Same-day response on reported issues, even if it’s simply an acknowledgement
  • PRs under the “magic number” of 400 lines of code…beyond which reviews tend to surface fewer bugs and less useful feedback
  • Outage response, sandbox availability
  • Distribution of component versions across multiple central repositories
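The “magic number” heuristic above is easy to automate. A hedged sketch in Python, assuming PR data shaped roughly like what a Git host’s API returns (GitHub’s pull request endpoint, for example, reports `additions` and `deletions`):

```python
# Flag PRs whose total diff exceeds ~400 changed lines, past which
# review feedback tends to degrade. PR dict shape is an assumption;
# adapt to your Git host's API.

MAGIC_NUMBER = 400

def oversized_prs(prs: list) -> list:
    """Return titles of PRs whose total changed lines exceed the review limit."""
    return [
        pr["title"]
        for pr in prs
        if pr.get("additions", 0) + pr.get("deletions", 0) > MAGIC_NUMBER
    ]

if __name__ == "__main__":
    prs = [
        {"title": "Fix typo", "additions": 1, "deletions": 1},
        {"title": "Rewrite scheduler", "additions": 900, "deletions": 450},
    ]
    assert oversized_prs(prs) == ["Rewrite scheduler"]
```

Run against a project’s recent PR history, the ratio of oversized to reviewable PRs is one cheap signal of whether useful review is actually happening.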

More to Come…From YOU

As I integrate both my own learnings and other voices from the community into the larger Enterprise DevOps conversation, the one thing that will be missing is YOUR THOUGHTS, whether you’re in a large organization or simply on a small team.

Please share your thoughts in the comments section below! Or ping me on Twitter @paulsbruce or LinkedIn.

More reading:

Recap of DevOps Days Boston 2017 with Corey Quinn

This weekend, I had the chance to have a ‘distributed beer’ with Corey Quinn of Last Week in AWS to chat about the DevOps Days Boston 2017 conference last week. We each share a few takeaways in about 5 minutes.

You can watch it on YouTube and listen on SoundCloud.

Beyond DevOps: The ‘Next’ Management Theory

In a conversation today with Ken Mugrage (organizer of DevOps Days Seattle), the scope of the term ‘DevOps’ came up enough to purposely double-click into it.

‘DevOps’ Is (and Should Be) Limited In Scope

Ken’s view is that the primary context for DevOps is in terms of culture, as opposed to processes, practices, or tools. To me, that’s fine, but there’s so much not accounted for that I feel I have to generalize a bit to get to where I’m comfortable parsing the hydra of topics in the space.

Like M-theory, which attempts to draw relationships between how fundamental particles interact, I think DevOps is just a single view of one facet of the technology management gem.

DevOps is an implementation of a more general theory, a ‘next’ mindset over managing the hydra. DevOps addresses how developers and operations can more cohesively function together. Injecting all-the-things is counter to the scope of DevOps.

Zen-in: A New Management Theory for Everyone

Zen-in (ぜんいん[全員]) is a Japanese term that means ‘everyone in the group’. It implies a boundary, but challenges you to think about who is inside that boundary. Is it you? Is it not them? Why not? Who decides? Why?

By ‘management’ theory, I don’t mean another ‘silo of management’. I literally mean the need to manage complexity: personal, technological, and organizational. Abstracting up a bit, the general principles of this theory are:

  • Convergence (groups come together to accomplish a necessarily shared goal)
  • Inclusion (all parties have a voice, acceptance of constraints)
  • Focus (alignment and optimization of goal, strategies, and tactics)
  • Improvement (learning loops, resultant actions, measurement, skills, acceleration, workforce refactoring, effective recruiting)
  • Actualization (self-management, cultural equilibrium, personal fulfillment)

I’ll be writing more on this moving forward as I explore each of these concepts, but for now I think I’ve found a basic framework that covers a lot of territory.

I Need Your Help to Evolve This Conversation

True to Zen-in, if you’re reading this, you’re already in the ‘group’. Your opinions, questions, and perspectives are necessary to iterate over how these concepts fit together.

Share your thoughts in the comments section below! Or ping me on Twitter @paulsbruce or LinkedIn.


How to Be a Good DevOps Vendor

This article is intended for everyone involved in buying or selling tech, not just tooling vendors. The goal is to paint a picture of what an efficient supply and acquisition process in DevOps looks like. Most of this article is phrased from an ‘us’ (acquirer) to ‘you’ (supplier) perspective, but out of admiration for all involved.

Developers, Site-Reliability Engineers, Testers, Managers…please comment and add to this conversation because we all win when we all win.


I’ll frame my suggestions across a simplified four stage customer journey:

  1. Try
  2. Buy
  3. Integrate
  4. Improve

Note: As you read the following statements, it may seem that I bounce around talking to various groups in a seemingly random fashion. This is actually a result of looking at an organization through the customer lens across their journey. As organizations re-align to focus on delivering real value to customers, our paradigms for how we talk about “teams” also change to include everyone with a customer touch point, not just engineering teams.

1. Make It Easy for Me to Try Your Thing Out

(Product / Sales)
Make the trial process as frictionless as possible. This doesn’t mean hands off, but rather a progressive approach that gives each of us the value we need iteratively to get to the next step.

(Sales / Marketing)
If you want to know what we’re doing, do your own research and come prepared to listen to us about our immediate challenge. Know how that maps to your tool, or find someone who does, fast. If you don’t feel like you know enough to do this, roll up your sleeves and engage your colleagues. Lunch-n-learns with product/sales/marketing really help to make you more effective.

I know you want to qualify us as an opportunity for your sales pipeline, but we have a few checkboxes in our head before we’re interested in helping you with your sales goals. Don’t ask me to ‘go steady’ (i.e. regular emails or phone calls) before we’ve had our first date (i.e. I’ve validated that your solution meets basic requirements).

(Product / Marketing)
Your “download” process should really happen from a command line, not from a 6-step website download process (that’s so 90s), and don’t bother us with license keys. Handle the activation process for us. Just let us get into code (or whatever) and fumble around a little first…because we’re likely engineers and we like to take things apart to understand them. So long as your process isn’t kludgy, we’ll get to a point where we have some really relevant questions.

(Marketing / Sales)
And we’ll have plenty of questions. Make it absurdly easy to reach out to you. Don’t be afraid if you can’t answer them, and don’t try to preach value if we’re simply looking for a technical answer. Build relationships internally so you can get a technical question answered quickly. Social and community aren’t just marketing outbound channels, they’re inbound too. We’ll use them if we see them and when we need them.

(Marketing / Community / Relations)
Usage of social channels varies per person and role, so have your ears open on many of them: GitHub, Stack Overflow, Twitter, (not Facebook, please), LinkedIn, your own community site…and make sure your marketing+sales funnel is optimized to accept me in the ‘right’ way (i.e. don’t put me on a marketing list).

Don’t use bots. Just don’t. Be people, like me.

(Sales / BizDev)
As I reach out, ask me about me. If I’m a dev, ask what I’m building. If I’m a release engineer, ask how you can help support my team. If I’m a manager, ask how you can help my team deliver what they need to deliver, faster. Have a 10-second pitch, but start the conversation right in order to earn trust so you can ask your questions.


2. Help Me Buy What I Need Without Lock-in

(Sales / Customer Success)
Even after we’re prepared to sign a check, we’re still dating. Tools that provide real value will spread and grow in usage over time. Let us buy what we need, do a PoC (which we will likely need some initial help with), then check in with us occasionally (customer success) to keep the account on the right train tracks.

(Sales / Marketing)
Help us make the case for your tool. Have informational materials, case studies, competitive sheets, and cost/value breakdowns that we may need to justify an expenditure that exceeds our discretionary budget constraints. Help us align our case depending on whether it will be coming out of a CapEx or OpEx line. Help us make its value visible and promote what an awesome job we did picking the right solution for everyone it benefits. Don’t wait for someone to hand you what you need; try things and share your results.

Pick a pricing model that meets both your goals and mine. Yes, that’s complicated. That’s why it’s a job for the Product Team. As professional facilitators and business drivers, seek input from everyone: sales, marketing, customers!!!, partners, and friends of the family (i.e. trusted advisors, brand advocates, professional services). Don’t be greedy; be realistic. Have backup plans at the ready, and communicate pricing changes proactively.

Depending on your pricing model, really help us pick the right one for us, not the best one for you. Though this sounds counter-intuitive to your bottom line, doing this well will increase our trust in you. When we trust you, not only will we likely come back to you for more in the future, but we’ll also excitedly share this with colleagues and our networks. Some of the best public champions for a technology are those that use it and trust the team behind it.

3. Integrate Easily Into My Landscape

Let us see you as code. If your solution is so proprietary that we can’t see underlying code (like layouts, test structure, project file format), re-think your approach because if it’s not code, it probably won’t fit easily into our delivery pipeline. Everything is code now…the product, the infrastructure configuration, the test artifacts, the deployment semantics, the monitoring and alerting…if you’re not in there, forget it.

Integrate with others. If you don’t integrate into our ecosystem (i.e. plugins to other related parts of our lifecycle), you’re considered a silo and we hate silos. Workflows have to cross platform boundaries in our world. We already bought other solutions. Don’t be an island, be a launchpad. Be an information radiator.

(Product / Sales / Marketing)
Actually show how your product works in our context…which means you need to understand how people should/do use your product. Don’t just rely on screenshots and product-focused demos. Demonstrate how your JIRA integration works, or how your tool is used in a continuous integration flow in Jenkins or Circle CI, or how your metrics are fed into Google Analytics or Datadog or whatever dashboarding or analytics engine I use. The point is (as my new friend Lauri says it)…”show me, don’t tell me”.

(Sales / Marketing)
This goes for your decks, your videos, your articles, your product pages, your demos, your booth conversations, and even your pitch. One of the best technical pitches I ever saw wasn’t a pitch at all…it was a technical demo from the creator of Swagger, Tony Tam at APIstrat Austin 2015. He just showed how SwaggerHub worked, and everyone was like ‘oh, okay, sign me up’.

Truth be told, I only attended to see what trouble I could cause.  Turns out he showed a tool called Swagger-Inflector and I was captivated.
– Darrel Miller on Bizcoder

(Sales / Product)
If you can’t understand something that the product team is saying, challenge them on it and ask them for help to understand how and when to sell the thing. Product, sales enablement is part of your portfolio, and though someone else might execute it, it’s your job to make sure your idea translates into an effective sales context (overlap/collaborate with product marketing a lot).

(Product / Customer Support)
As part of on-boarding, have the best documentation on the planet. This includes technical documentation (typically delivered as part of the development lifecycle) that you regularly test to make sure is accurate. Also provide how-to articles that are down to earth. Show me the ‘happy path’ so I can use it as a reference to know where I’ve gone wrong on my integration.

(Product / Developers / Customer Support)
Also provide validation artifacts, like tools or tests that make sure I’ve integrated your product into my landscape correctly. Don’t solely rely on professional services to do this unless most other customers have told you this is necessary, which indicates you need to make it easier anyway.

(Customer Support / Customer Success / Community / Relations)
If I get stuck, ask me why and how I’m integrating your thing into my stuff, so you get broader context on my ultimate goal. Then we can row in that direction together. Since I know you can’t commit unlimited resources to helping customers, build a community that helps each other and reward contributors when they help each other. A customer gift basket or Amazon gift card to the top external community facilitators goes a long way toward bootstrapping a second-level support system to handle occasional support overflows.

4. Improve What You Do to Help Me With What I Do

(Product / Development / Customer Support)
Fix things that are flat out broken. If you can’t now, be transparent and diplomatic about how your process works, what we can do as a work-around in the mean time, and receive our frustration well. If we want to contribute our own solution or patch, show gratitude not just acknowledgement, otherwise we won’t go the extra mile again. And when we contribute, we are your champions.

Talk to us regularly about what would work better for us, how we’re evolving our process, and how your thing would need to change to be more valuable in our ever-evolving landscape. Don’t promise anything, but also don’t hide ideas. Selectively share items from your roadmap and ask for our candid opinion. Maybe even hold regional user groups or ask us to come speak to your internal teams as outside feedback from my point of view as a customer.

Get out to the conferences, be in front of people, and listen to their reactions. Do something relevant yourself and don’t be just another product-headed megalomaniac. Be part of the community; don’t just expect to use them when you want to say something. Host things (yes, it may cost money), volunteer occasionally, and definitely make people feel heard.

Be careful that your people-to-people engagements don’t suffer from technical impedance mismatch. Sales and marketing can be at booths, but should have a direct line to someone who can answer really technical questions as they arise. We engineers can smell marketing/sales from a mile away (usually because they smell showered and professional). But it’s important for us to have our questions answered and to feel welcome. This is what’s great about having your Developer Relations people there…we can nerd out and hit it off great. I come away with next steps that you (marketing / sales) can follow up on. And make sure you have a trial I can start in on immediately. Use every conversation (and conference) as a learning opportunity.

Build the shit out of your partner ecosystem so it’s easier for me to get up and running with integrations. Think hard before you put your new shiny innovative feature in front of a practical thing like a technical integration I and many others have been asking for.

(Development / Community / Marketing / Relations)
If there is documentation with code in it and you need API keys or something, inject them into the source code for me when I’m logged in to your site (like SauceLabs’ Appium tutorials). I will probably copy and paste, so be very careful about the code you put out there, because I will judge you for it when it doesn’t work.

(Marketing / Product)
When you do push new features, make sure that you communicate to me about things I am sure to care about. This means you’ll have to keep track of what I indicate I care about (via tracking my views on articles, white paper downloads, sales conversations, support issues, and OPT-IN newsletter topics). I’m okay with this if it really benefits me, but if I get blasted one too many times, I’ll disengage/unsubscribe entirely.

Summary: Help Me Get to the Right Place Faster…Always

None of us have enough time for all the things. If you want to become a new thing on my plate, help me see how you can take some things off of my plate first (i.e. gain time back). Be quick to the point, courteous, and invested in my success. Minimize transaction (time) cost in every engagement.

(Sales, et al: “Let’s Get Real or Let’s Not Play” is a great read on how to do this.)

As often as appropriate, ask me what’s on my horizon and how best we can get there together. Even if I’m heads-down in code, I’ll remember that you were well-intentioned and won’t write you off for good.

NEXT STEPS: share your opinions, thoughts, and suggestions in the comments section below! Or ping me on Twitter @paulsbruce or LinkedIn.

More reading:

Stop Using the ‘Staging’ Server – DevOps Days Boston

Chloe Condon presented on how containers and IaC (infrastructure as code) can help us skip the ‘staging server’ part of traditional deployment strategies. This article is a loose transcript of talking points from her talk at DevOps Days Boston 2017.

What’s Wrong with a Staging Environment?

Feedback from a traditional staging environment is too slow. The only thing the reviewer knows is whether unit tests passed; the rest of the tests run after that. “Staging” is usually reserved for integration, functional, UI, and performance testing (i.e. complete feedback). Too little, too late.

We’re all too familiar with the question “who broke staging?”. The fragility and centrality of this staging model creates bottlenecks. Also, the very first time something is brought into the pipeline usually happens in staging, and that’s when ‘broken’ occurs.

There’s lots of “friction” between environments. Dev/test/staging are often not equivalent and are configured differently, causing deployment between environments to be a hassle. Flows across these environments are time-consuming (environment variables and files go missing).

Code changes are being tested more extensively in staging, which means there’s little room for timely feedback.

Ephemeral Environments

The great thing is, now we have containers. We can run every build, package it in a container, then run tests on it in the same pipeline. Microservices are well-suited to this model, but distributed stacks (like a web app, database, and supporting APIs) benefit from it too.

Additionally, most stages of testing can be containerized. Leaving performance and scalability aside for a moment, that enables us to run integration, functional, and security testing as part of a complete containerized package.
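To make that concrete, here is a sketch of an ephemeral, per-build stack expressed as a Compose-style definition generated in Python. The registry and image names are hypothetical; in practice you would render something like this to a docker-compose file per pipeline run.

```python
# Illustrative sketch: a throwaway stack (app + db + API + tests) pinned to
# one build, so every pipeline run tests exactly the bits it produced.
# Image names and registry are assumptions, not real endpoints.

import json

def ephemeral_stack(build_tag: str) -> dict:
    """Describe a per-build, disposable test environment."""
    return {
        "services": {
            "webapp": {"image": f"registry.example.com/webapp:{build_tag}"},
            "api":    {"image": f"registry.example.com/api:{build_tag}"},
            # Database state is tmpfs-backed: gone when the stack is torn down.
            "db":     {"image": "postgres:15", "tmpfs": ["/var/lib/postgresql/data"]},
            # The test runner lives in the same stack, so integration,
            # functional, and security suites run against this exact build.
            "tests":  {"image": f"registry.example.com/tests:{build_tag}",
                       "depends_on": ["webapp", "api", "db"]},
        }
    }

if __name__ == "__main__":
    stack = ephemeral_stack("build-1234")
    assert stack["services"]["webapp"]["image"].endswith("build-1234")
    print(json.dumps(stack, indent=2))
```

Because the whole stack is addressed by one build tag, “who broke staging?” becomes “which build failed its own stack?”, which is a much easier question.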

The problem still remains: we have a rule that staging has to be as close to prod as possible. This might serve some of those tests (like performance and security), but it is largely suboptimal for unit, integration, and functional tests. Performance tests could also be run earlier to give us a better heads-up about degradations that creep in over time. In practice, late-stage environments don’t match reality, and this causes friction.

So let’s reconsider the premise that all of our non-unit testing has to be run in a shared environment that bottlenecks us. This helps us shift feedback to the left. (Chloe says to insert Beyonce clip here.)

Containers = Consistency & Composition & Completeness

So now the container we’re handing off is much more complete: it carries a set of self-testing capabilities that we can ask our pipeline to run for us.

You can hand off containers to your customers (usually internal but maybe even external) and with composition, you have confidence that the bits they’re running are the same as what you tested and what you want them to have.

Infrastructure as Code

Teams should define what code is part of the process. When people are able to spin things up automatically on their own, this streamlines an important part of their process. Visualizations help a lot, which is why Codefresh and other platforms have visual controls over the package and deploy process.

Infrastructure-as-Code (IaC) includes Dockerfiles, but also deployment scripts. If it’s code, treat it like it’s important because otherwise it’s outside the flow of delivery.

Paul’s take: IaC also includes a whole bunch of other stuff too. For example:

  • Composition scripts (like Docker compose, Kubernetes scripts)
  • Secrets management configuration
  • Network configuration
  • Database configuration (might include data)
  • Tests and test data
  • Feature flag configuration
  • Monitoring configuration & scripts
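One way to keep a list like that honest is a small repo lint that fails when expected IaC artifacts are missing from version control. A minimal sketch; the expected paths are illustrative and should be adapted to your repo layout:

```python
# Check that the IaC artifacts we claim to manage actually live in the repo.
# Paths are illustrative assumptions about a typical layout.

from pathlib import Path

EXPECTED_ARTIFACTS = [
    "Dockerfile",
    "docker-compose.yml",   # composition scripts
    "deploy/",              # deployment scripts
    "tests/",               # tests and test data
    "monitoring/",          # monitoring configuration & scripts
]

def missing_artifacts(repo_root: str) -> list:
    """Return expected IaC paths that are absent from the repository."""
    root = Path(repo_root)
    return [p for p in EXPECTED_ARTIFACTS if not (root / p).exists()]

if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as repo:
        (Path(repo) / "Dockerfile").write_text("FROM alpine\n")
        gaps = missing_artifacts(repo)
        assert "Dockerfile" not in gaps
        assert "tests/" in gaps
```

Wired into the pipeline as an early step, this makes “if it’s code, treat it like it’s important” enforceable rather than aspirational.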

Implementing IaC requires a few things:

  1. Your team agrees on, and has in-depth knowledge of, how to push healthy code artifacts into the pipeline. No one is an island; others’ contributions need to be readable and easily debuggable.
  2. A resilient process (i.e. pipeline) including dynamic build/package/test semantics enables contributors to focus on the ‘push’ and feedback rather than the semantics.
  3. Information radiators along the process must deliver feedback as granularly as possible: individual contributor first, then channel, then team. ChatOps bots give you immediate feedback about breakage as soon as it occurs.

A complete IaC artifact list will require collaboration between multiple contributors, which facilitates communication. Just make sure that empathy and positive reinforcement are part of your management strategy.

Questions from the Audience:

Q: “How do you describe the state of the code in PRs?”

Chloe: “Badges in the repo, some conventions, success flags on Codefresh.”

Q: “How often do people actually use this for pre-stage vs. just going to prod?”

Chloe: “For lots of people, they maintain separate branches for multiple environments. Then you can introduce new versions dynamically.”

Q: “In more complex systems, is there a composition management layer?”

Chloe: “This is the beauty of the compose files. When you treat them like code, this makes management a lot easier.”

More reading:

Iterative Security – DevOps Days Boston 2017

Tom McLaughlin presented on iterative security: incorporating security into DevOps cycles through early detection and prevention of vulnerabilities. His slides are here. This is a loose transcript of talking points from his talk at DevOps Days Boston 2017.

Breaches in Practice vs. Theory

Tom made the point that breaches often occur in areas that aren’t covered by development or security teams because vulnerabilities escape due to a lack of objective and continuous risk assessment.

Code still has passwords and tokens in it. There’s a lot of assumed knowledge going from dev to prod. Account access, password policies, and patching are usually handled by someone else, which leads to “good luck, it’s up to you” syndrome.

There’s also security paralysis. When we don’t think we know how to do something, we just won’t. And we’re rewarded for accomplishing things. So long as disaster doesn’t strike, we get by.

Why Do We Suffer from Security Breaches?

Mostly, we get distracted: 0-day exploits, crypto weaknesses, hash collisions. We get distracted by logos and discussion threads instead of patching the system. We get caught up in all of this stuff instead of actually doing what improves security.

Think about all the publicly exposed MongoDB and Elasticsearch instances you’ve seen…being proactive isn’t always hard, but it is rarely incentivized well.

We don’t do a good job explaining how to get from where you are to where you should be. We also don’t always practice critical thinking. What is your goal? What is your posture on security: proactive, reactive?

We also don’t always have a wealth of layered instructional content. There’s a lot of information at the extremes (101 and advanced tutorials), but most of us are in the middle.

Solve the Problem Like You’re At Work

So then let’s develop a threat model together, as an example. Let’s start by being realistic. What kind of org and product matters? Align with your company on risk management policies and processes.

Prioritize. Use DREAD (or STRIDE) for rating threats and modeling risk.
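For reference, DREAD rates each threat on Damage, Reproducibility, Exploitability, Affected users, and Discoverability. A minimal scoring sketch; the 1-10 scale and simple averaging used here are one common convention, not the only one:

```python
# DREAD threat rating: average five 1-10 ratings into a single risk score.

def dread_score(damage, reproducibility, exploitability, affected, discoverability):
    """Average the five DREAD ratings (each 1-10) into one risk score."""
    ratings = [damage, reproducibility, exploitability, affected, discoverability]
    assert all(1 <= r <= 10 for r in ratings), "ratings must be 1-10"
    return sum(ratings) / len(ratings)

if __name__ == "__main__":
    # Hardcoded credentials in a public repo: easy to find, easy to exploit.
    leaked_creds = dread_score(8, 10, 10, 9, 10)   # 9.4
    # An exploit requiring physical datacenter access rates far lower.
    ceiling_man = dread_score(9, 2, 2, 3, 1)       # 3.4
    assert leaked_creds > ceiling_man
```

Scoring even a rough threat list this way forces the “easy stuff first” prioritization Tom advocates.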

Also take care of the easy stuff first: USB sticks over the man in the ceiling.

Do you still use a service after it’s been breached? I leave that up to you.

Decompose the system. Map out your architecture and understand the systems. Look at the perimeters: how are credentials proliferated? Understand your data pipeline: where is your really valuable data stored?

Take time to consider things like exposed network ports, unpatched containers, and weak secrets…there are tools for this. These tools can be found in the later slides here.

Putting a Response to Security Threats into Action

Two words: impose constraints. To find which constraints work for you, start with a simple discovery process that includes:

  • Time: how long will it take to solve? Timebox solutions; make defensible use of existing time.
  • Complexity: how hard is it? Ask deep questions; iterate over which ones help.
  • Risk: how risky are the problem and the solution?

Secrets management is a good first start. Tom pretty much pwns this space, and I encourage you to seriously check out his extensive work on the topic here.

In terms of tactical actions you can take today, Tom mentioned these few, but of course there are more:

  • In code: start with at least something like git-crypt. Ask yourself what should be thrown out before it goes anywhere else.
  • In configuration management scripts: developing a master re-key strategy is a great exercise to flesh this out.
  • In storage: a tool like sneaker for S3 really makes you ask questions about who manages buckets and how.
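On the “what should be thrown out before it goes anywhere else” question, even a naive scan catches the obvious cases. A deliberately simple sketch; real tools (git-crypt, git-secrets, truffleHog) do this far better, and the patterns here are illustrative only:

```python
# Naive regex scan for obvious hardcoded secrets. Illustrative only;
# use a real secret scanner in practice.

import re

SECRET_PATTERNS = [
    # assignments like password = "..." / secret = '...'
    re.compile(r"(password|passwd|secret)\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE),
    # the shape of an AWS access key ID
    re.compile(r"AKIA[0-9A-Z]{16}"),
]

def find_secrets(source: str) -> list:
    """Return lines that look like hardcoded credentials."""
    return [
        line.strip()
        for line in source.splitlines()
        for pattern in SECRET_PATTERNS
        if pattern.search(line)
    ]

if __name__ == "__main__":
    code = 'db_user = "app"\npassword = "hunter2"\nregion = "us-east-1"\n'
    assert find_secrets(code) == ['password = "hunter2"']
```

Hooked into a pre-commit hook or pipeline step, a check like this catches the easy stuff before it ever leaves a developer’s machine.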


We need to be better at security, continuous or otherwise. We need to act. There are simple things you can do, but they need to be aligned to your team/organization risk strategy. And make it easy for others to do the right thing, so that it’s far more likely to happen without imposing huge effort cost.

Tom’s a great speaker, engaging and fun to listen to. He is also a huge community contributor and even runs a distributed DevRel (developer relations) Slack group. Tom is currently working on the CloudZero team.

More reading:

Enterprise Wild West – DevOps Days Boston 2017

Rob Cummings‘ keynote at DevOps Days Boston 2017 explored how Simon Wardley’s Pioneers, Settlers, and Town Planners model applies in enterprise engineering and large organizations. The general idea is:

  • Pioneers: explorers of new ideas, create prototypes, prove the need
  • Settlers: stealers of new ideas, move prototypes to MVP, prove feasibility
  • Town Planners: manufacturers of ideas, move MVPs to industrialized services, prove scalability

(Reference: Simon Wardley’s ‘Bits or pieces?’ post, On Pioneers, Settlers, Town Planners and Theft)

The Problem: Overly Simplistic Approaches

Bi-modal IT splits the org into Mode 1 (systems of record) and Mode 2 (systems of innovation). Mode 1 has less line of sight to customers and is governed by enterprise architecture and governance. Mode 2 often runs into Mode 1 when …. The problem is that often, there’s no flow between Mode 1 and Mode 2. Bi-modal is overly simplistic.

The book “Thinking in Systems” is a great place to start your journey beyond these modes. Transition states and feedback loops exist already in your org, but realizing where they are and how they could be improved takes practice and group engagement.

Paul’s advice: Systems thinking is a much broader topic; if you haven’t actually studied it, it would serve you well to listen to The Fifth Discipline by Peter Senge. As context for my presentation in April on IoT testing, this made me realize that systems thinking was a necessary mental tool moving forward.

Everyone Innovates…Sometimes.

Pioneers live outside standards, fail often, and don’t necessarily make decisions based on metrics. They find the new horizon. That’s how they innovate: they bring ideas from the outside in.

Settlers make prototypes real, build trust in the org, and kick off ecosystems around the adoption of ideas, but sometimes suffer from adoption problems. They bring ideas further into the org.

Town Planners focus on ops efficiency, build services and platforms that Pioneers rely on for future innovations. They’re metrics heavy and bring reality to the operation of ideas.

Fostering Friendly Theft

The Wild West is a “theft-based pull model”. There are no mandates. Theft occurs from right to left (Pioneers on the left); re-use flows from left to right. This is a good thing. Everyone is excellent, and everyone should participate with empathy. Foster feedback loops and maintain a pull culture.

The Wild West model exists within a team, not as separate departments. Again, for DevOps we’re not talking about traditional cost centers and departments; we’ve got mixed teams that are aligned on a shared goal, with their own perspectives on how best to do things together.

Paul’s Take: DevOps Requires Buy-in from Everyone

For DevOps to work, a team needs to understand and adapt to their organizational ecosystem. So while the micro-mechanics of the Wild West help us pull new ideas in on a continual basis, there has to be an understanding that extends across the whole org.

Many conversations at DevOps Days Boston 2017 on day 1 expressed the need for “buy-in from the top”, but effective DevOps also requires buy-in from everyone. Teams need to align the virtues of DevOps to how they can positively impact the organization. It does no good for an SMB VP of Engineering to apply DevOps if the purpose of doing so hasn’t been clearly articulated in terms that other dependencies (like developers, operations, sales, marketing, finance, and support) understand. But when you do so, it’s much easier to carry people with you in planning and execution.

DevOps is Organizational, Operational, and Orthogonal. Applying it in isolation only decreases the value it brings to us.

Scaling to the Enterprise

Rob shared an anonymized anecdote from a large company where the Wild West model was adopted:

A small group of pioneers realized “we need to fix this, can’t meet customer needs”. They knew how to do it and got CIO sponsorship. The team got to MVP status with code. Unfortunately, the Wild West model was not immediately adopted beyond that initial release.

“We were trying to push the model onto the team.” Even though everything done up to that point focused on ease-for-enterprise (weekly demos, code was open sourced, process transparency), adoption took time.

Eventually, another team took the ideas and model, shipped their thing to production, then other teams followed. “Now we have a ‘proliferation problem’…people started customizing tools and artifacts.” Teams often stuck with some favorite tools, and in DevOps culture, tailoring is huge.

But not everyone wants to build their own house. For example, code pipelines…yuk. So Planners came in and built a commodity pipeline platform. This requires talent: people who have skill, can scale, and understand operational efficiency.


Here are a few anti-patterns to avoid; steering clear of them will reduce friction and increase your flow.

  1. Using enterprise architecture to prevent waste and force adoption.
    Don’t use it as a gate to get to production!
  2. Relying on innovation labs or a CoE for pioneers.
    Teams outside your org toss things in that often don’t work inside the org. Be super-public so settlers are likely to steal. Change “Center of Excellence” to “Center of Practice”: inclusive, so everyone can be excellent.
  3. Forgetting that your org requires a systems thinking approach.
    Create flows, not barriers. Each role is filled with excellent people.


More reading:

No Root Cause in Emergent Behavior – DevOps Days Boston 2017

At DevOpsDays Boston 2017, Matthew Boeckman presented on how emergent behavior in complex systems requires us to re-think our root cause analysis paradigms. His slides are here. I also had a great time talking meta with Matthew afterhours, but that’s for a later post.

Traditional RCA in Complex Systems

Unfortunately, traditional RCA focuses on what and who. Despite its roots at NASA, in the software world RCA is misaligned: it looks for only one channel of causality. A fishbone diagram shows this:

This might be okay for simple systems (e.g. 3-tier web/app/data servers), but there’s much more to consider: networking, hosting, and operating environments. Beyond that, users access systems in both benign and malicious ways.

Waterfall encouraged us to minimize complexity by locking down state (i.e. promoting a “don’t change” mentality). Waterfall (think 12-month cycles) encourages us to think that change is the developer’s fault. There were also a lot of constraints in the ’80s and ’90s, most of which no longer hold.

Root cause is fine for static models, but it breaks down when it comes to “lots of boxes”: cloud-based, dynamic, distributed systems. It’s very hard to trace the source of problems in this new world. Change vectors (A/B testing, reconfigurations, migrations, feature flags) abound; in fact, they’re encouraged.

Our systems are far more complex than they were 20 years ago. They involve the whole stack, the whole team, and the whole organization.

Paul’s Take: Occam’s Anti-Razor

A heuristic we often employ is Occam’s razor: in general, the simplest answer is often the right one. Coupled with confirmation bias, we (humans) often look for a single causal root to the problems we see. Then we build processes that inherit our bias. But what if operational failures occur because of multiple causes, chain reactions that exceed the typical “5 whys” RCA model?

As quickly as the concept of the razor was introduced, Chatton, a contemporary, countered the idea with: “If three things are not enough to verify an affirmative proposition about things, a fourth must be added, and so on.” Similarly, many ascribe a balance of simplicity and complexity in problem solving to the quote “Make things as simple as possible, but no simpler,” often attributed to Einstein.

The idea is right fit: the right fit of simplicity and complexity to the problem at hand. With complex systems, we can’t assume that the simplest answer will be the most useful one in future scenarios.

Our Systems Aren’t Trees, They’re Forests

Emergence is about collective behaviors, systems we connect and integrate over time, and not simply the aggregate of behaviors emitted by individual subcomponents and nodes.

We need to develop, test, deploy, monitor, and resolve issues in them like the complex, semi-organic systems they are: part of an ecosystem of services and fallible subsystems. We can no longer afford to ignore better paradigms for dealing with them.

Enter systems thinking. Understanding why things emerge takes more than an ops dashboard and intuition. Sometimes analysis of complex problems requires a multi-variate perspective.

Paul’s advice: Systems thinking is a much broader topic; if you haven’t actually studied it, it would serve you well to listen to The Fifth Discipline by Peter Senge. As context for my presentation in April on IoT testing, this made me realize that systems thinking was a necessary mental tool moving forward.

Systems thinking helps us to identify the activities, interactions, and ultimately the change vectors contributing to emergent behaviors. Understanding which dials and levers are involved in a problem enables later actions to resolve the issue. This feeling of being at home in the problem space is also similar to “cynefin”, a Welsh term (close to my Scottish heritage!) meaning:

“a place to live and belong. where the nature of what’s around you feels right and welcoming”

Not at all coincidentally, the Cynefin framework as applied to emergent behavior helps us make quick decisions during and about incident management situations.

Staying Ahead of Emergent Behavior

The fact is that most workforces, small or large, are a revolving door. So is your current system state after multiple releases and infrastructure migrations. There be monsters. Software is dynamic, and so should be your product discovery process, your learning loops, your incident management model, and so on.

The Cynefin framework gives us this quadrant visual to show that various issues need to be addressed differently:

The fact is, each of these quadrants assumes two things:

  1. The issue occurred already, so you need to fix it and learn from it
  2. Information needs to be radiated (sensed) to make “sense” of it

In my after-hours chat with Matthew, we dived into the issue of metrics. Measuring issue resolution goes beyond mean-time-to-resolution (MTTR). Issues flagged with *how* they were resolved, using Cynefin categories, create a new opportunity for improvement.

Paul’s Take: could this be a JIRA custom field? Just thinking out loud.

Tracking the delta on a specific issue (what approach someone thought should be used at first vs. what would have been better after the fact) is a way to measure successfulness and improvement on a spot basis.

Then, over time, aggregates can be used to show team and organizational reflexiveness to dynamic, emergent behavior. Though neither of us has customer anecdotes or proof-of-concept clients, I challenge you, the reader, to try it out for a few sprints or whatever interval you use.
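As a minimal sketch of that spot-check idea (entirely hypothetical: the field names and category strings are my own assumptions, not an existing JIRA schema), the per-interval delta could be computed like this:

```javascript
// Hypothetical sketch: measure how often the initial Cynefin categorization
// of an issue differed from the category judged correct after the fact.
// Each issue is a plain object like:
//   { initialCategory: 'simple', actualCategory: 'complex' }
function categoryDeltaRate(issues) {
  if (issues.length === 0) return 0;
  const mismatches = issues.filter(
    (issue) => issue.initialCategory !== issue.actualCategory
  ).length;
  return mismatches / issues.length; // 0 = perfect sensing, 1 = always wrong
}
```

Tracked per sprint, a falling rate would suggest the team is getting better at sensing what kind of problem it is actually facing.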


We need to embrace emergent behavior and learn how to approach incidents better using systems thinking and frameworks like Cynefin. Unlike traditional RCA, we’ll need to step out of our comfort zones, see what works, and learn from our mistakes.

Matthew is a Denver, Colorado native, and has spoken at other conferences like Gluecon (wicked!). If you have questions, ping him (and me) on Twitter and let’s get a dialog going.

DevOps is Organizational, Operational, and Orthogonal

Some people seem to think that DevOps is a buzzword. It is not. At all.

As part of my research for integrating concepts of risk, quality assurance, and continuous testing into the IEEE 2675 working group (DevOps standard), I am realizing that there is no one single articulation of DevOps that seems to fit all contexts. However, in the spirit of DevOps, I’ll continue to iterate past this issue to explore aspects of the paradigm to provide value to people I meet and conversations we have. Here are three I’ve been pondering:

DevOps Is an Organizational Paradigm

DevOps is about breaking apart established paradigms: structures that worked for the prior generation of problems, management methodologies that are no longer the optimal solution for tomorrow’s problems. DevOps is challenging institutional values that don’t actually lead to value. DevOps is helping us to take ownership over our success.
We are learning for the first time, every time, and deliberately discovering what we should know as we build the future.
Collaboration, contribution, sharing, learning, improvement, alignment, focus, value. These are words that describe our homegrown methodology, one whose aim is to meet the pace of innovation better than agility alone. Self management, self organization, self improvement. Shared understanding, shared goals, shared vision. Many experiments, many failures, and many wins.
It is fine if a single team wants to “try out DevOps”, but unless the organization is prepared to support and change to value the positive outcomes of that team, initiatives won’t go very far. In this way, it is a relational paradigm that applies between individuals, between teams, and between organizations too.

If you want to go fast, go alone. If you want to go far, go together.

There’s juice to that statement. Fast and far are relative to what needs to get accomplished (see this link for quote etymology).

DevOps Is an Operational Paradigm

Software tools are a huge part of DevOps conversations now. Why? Because of automation and efficiency, sure, but also because it’s easier to feel confident and efficient in our own ignorance than to face the fact that most of software is about finding the right people to build the right software for other people.
Tools are only a part of our conversation. And more often, it’s tools (I mean outspoken assholes here) that dictate how [little] we understand about DevOps. Just because someone buys all the ski equipment and reads a lot about which slopes are best doesn’t make them an expert. Practice matters, and practice means knowing the software landscape.
But tools and automation are only an enabler to the work, an outcome of good decisions; it is the team together which holds the capability to make better decisions tomorrow. And every day has new challenges which yesterday’s solutions won’t overcome, not to mention known challenges that demand experience and perseverance.
If you are automating the shit out of your pipeline, good for you. Is this truly helping you learn how to provide people value, or do you more often find employees arguing about which tool and approach is better? This is an example of how hyper-focusing on tools is counter to the goal of DevOps, to iteratively improve our ability to provide value to (and with) people.

DevOps Is an Orthogonal Paradigm

In this way, DevOps is a mindset that also encompasses those who may not necessarily think it applies to them. It must include everyone, each with our own skills and perspectives. It is not simply about developers and operations. It is about connecting contributors to consumption just as much as it connects consumers to contributions. It is about the whole supply chain, the whole delivery pipeline, and the whole collective of people impacting each other.
For DevOps to be really successful, its execution must be inclusive across boundaries. That includes more than just engineering teams, it involves recruitment, marketing, sales, customer support, HR, PR, and finance. In the IEEE 2675 working group, we are finding that these other groups are a necessary part of the supply chain that DevOps teams depend on. A few examples of the need for an orthogonal approach to DevOps are:
  • How can you go “faster” if your acquisition process takes many months?
  • How can you go “faster” if a supplier doesn’t provide a way to validate that you integrated their product or service correctly?
  • How can testing be continuous if it isn’t automated and thereby scalable?
  • How can you expect marketing to crush numbers if you don’t integrate them into your sprints (their work often needs weeks/months of lead time)?
  • If your onboarding process doesn’t train new engineering recruits (dev/test/ops/PM/IT) on your lifecycle, how can you expect them to “go fast”?
  • If it takes days/weeks for customer feedback to reach development cycles, how can you expect to be building the “right thing” tomorrow?
Every one of these questions takes some kind of answer that includes collaboration, which you can’t expect unless you foster positive work culture and encourage people to improve professionally and personally.

DevOps Is Just a Word

DevOps is the word we have right now for the next set of ideas about how to sustainably move fast in the right direction. Unlike a manifesto, its goal isn’t to constrain, but to evolve. Hopefully we can come up with a better name in the future, which is highly likely because we iteratively learn. But DevOps is what we have now, and so far it’s doing us a lot of good.

Streaming Tweets to InfluxDB in Node.js

This week, I’ve been exploring the InfluxData tech stack. As a muse, I decided to turn some of my social media sharing patterns into formal algorithms. I also want to use my blog as a source for keywords, filter out profanity, and apply sentiment analysis to clarify relevant topics in the tweets.

Github repo for this article:

What Is InfluxData?

Simply put, it’s a modern engine for metrics and events. You stream data in, then you have a whole host of options for real-time processing and analysis. Their platform diagram speaks volumes:

From the Telegraf Overview

Based on all open source components, the InfluxData platform has huge advantages over other competitors in terms of extensibility, language support, and its community. They have cloud and enterprise options when you need to scale your processing up too.

For now, I want to run stuff locally, so I went with the free sandbox environment. Again, the sandbox is a completely open source stack, which is very cool of them, as lots of their own work ends up as OSS contributions to those components.

Why Process Twitter Events in InfluxDB?

Well, frankly, it’s an easy source for real-time data. I don’t have a 24/7 Jenkins instance or pay-for stream of enterprise data flowing in right now, but if I did, I would have started there because they already have a Jenkins data plugin. 🙂

But Twitter, just like every social media platform, is a firehose of semi-curated data. I want to share truly relevant information, not the rest of the garbage. To do this, I can pre-filter based on keywords from my blog and ‘friendlies’ that I’ve trusted enough to re-share in the past.

The point is not to automatically re-share content (which would be botty), but to queue up things in a buffer that would likely be something I would re-tweet. Then I can approve or reject these suggestions, which in turn can be a data stream to improve the machine learning algorithms that I will build as Kapacitor user-defined functions later on.

Streaming Data Into InfluxDB

There’s a huge list of existing, ready-to-go plugins for Telegraf, the collection agent. They’ve pretty much thought of everything, but I’m a hard-knocks kind of guy. I want to play with the InfluxDB APIs, so for my exploration I decided to write a standalone process in Node.js to insert data directly into InfluxDB.

To start, let’s declare some basic structures in Node to work with InfluxDB:

  • dbname: the InfluxDB database to insert into
  • measure: the measurement (correlates to relational table) to store data with
  • fields: the specific instance data points to collect on every relevant Tweet
  • tags: an extensible list of topic-based keywords to associate with the data
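A minimal sketch of those declarations (the specific database, measurement, and keyword names here are illustrative assumptions, not the original repo’s values):

```javascript
// Basic structures for working with InfluxDB.
// All names below are illustrative assumptions.
const dbname = 'tweets';      // the InfluxDB database to insert into
const measure = 'tweetdata';  // the measurement (analogous to a relational table)

// The specific instance data points to collect on every relevant tweet
const fieldNames = ['tweet_id', 'author', 'text', 'relevance'];

// An extensible list of topic-based keywords to associate with the data
const tags = ['devops', 'influxdb', 'testing'];
```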

Making Sure That the Database Is Created

Of course, we need to ensure that there’s a place and schema for our Twitter data points to land as they come in. That’s simple:
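A sketch of that check using the node-influx client’s `getDatabaseNames` and `createDatabase` methods (the client instance is passed in, which is my own structuring choice to keep the helper testable):

```javascript
// Create the database only if it doesn't already exist.
// `influx` is assumed to be a node-influx InfluxDB client instance.
async function ensureDatabase(influx, dbname) {
  const names = await influx.getDatabaseNames();
  if (!names.includes(dbname)) {
    await influx.createDatabase(dbname);
  }
  return dbname;
}
```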

Saving Pre-screened Tweets as InfluxDB Data Points

Minus the plumbing of the Twitter API, inserting Tweets as data points into InfluxDB is also very easy. We simply need to match our internal data structure to that of the schema above:
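As a sketch (the field and tag names are my own assumptions, not the original schema), a pre-screened tweet can be mapped to a point in the shape node-influx’s `writePoints` expects:

```javascript
// Map a pre-screened tweet to an InfluxDB point (node-influx shape).
// All field/tag names here are illustrative assumptions.
function tweetToPoint(tweet) {
  return {
    measurement: 'tweetdata',
    tags: { keyword: tweet.keywords.join(',') }, // matched topic keywords
    fields: {
      tweet_id: tweet.id,
      author: tweet.user,
      text: tweet.text,           // raw data, kept for later analysis
      relevance: tweet.relevance, // aggregated score for InfluxQL queries
    },
    timestamp: new Date(tweet.created_at),
  };
}

// Writing is then a one-liner against a node-influx client instance:
// influx.writePoints([tweetToPoint(tweet)]).catch(err => console.error(err.stack));
```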

Notice that the keywords (tags) can be a simple Javascript array of strings. I’m also optionally inserting the raw data for later analysis, but aggregating some of the most useful information for InfluxQL queries as fields.

The InfluxDB Node.js client respects ES6 Promises, as we can see with the ‘.catch’ handler. Huge help. This allows us to create robust promise chains with easy-to-read syntax. For more on Promises, read this article.

Verifying the Basic Data Stream

To see that the data is properly flowing into the InfluxData platform, we can use Chronograf in a local sandbox environment and build some simple graphs:

To do this, we use the Graph Editor to write a basic InfluxQL statement:
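In the sandbox, the statement might look something like this (the database, measurement, and tag names are illustrative assumptions matching the earlier sketches, not the exact query from the original post):

```sql
SELECT count("text") FROM "tweets"."autogen"."tweetdata"
WHERE time > now() - 1h
GROUP BY time(10m), "keyword"
```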

The simple graph shows a flow of relevant tweets grouped by keyword so we can easily visualize as real-time data comes in.

A Few Ideas and Next Steps

Of the many benefits of processing data on the InfluxData platform, stream processing in Kapacitor seems to be one of the most interesting areas.

Moving forward I’d like to:

  1. Move Sentiment Analysis with Rosette from Node into Kapacitor
  2. Add Machine Learning into Kapacitor for
    A) clarifying relevance of keywords based on sentiment entity extraction
    B) extracting information about the positivity / negativity of the tweet
  3. Catch high-relevance notifications and send to Buffer ‘For Review’ queue
    A) accepts and rejects factor back into machine learning algorithm
    B) follow-up statistics about re-shares further inform ML algorithm
  4. Have Kapacitor alert when:
    A) specific high-priority keywords are used (use ML based on my tweets)
    B) aggregate relevance for a given keyword spikes (hot topic)
    C) a non-tracked keyword/phrase is used in multiple relevant tweets
    (could be a related topic I should track, event hashtag, or something else)

As You Build Your Own, Reach Out!

I’m sure that as I continue to implement some of these ideas, I’ll need help. Fortunately, Influx has a pretty active and helpful Community site. Everything from large exports to plugin development and IoT gateways is discussed there. Jack Zampolin, David Simmons, and even CTO Paul Dix are just a few of the regular contributors to the conversation over there.

And as always, I like to help. As you work through your own exploration of InfluxData, feel free to reach out via Twitter or LinkedIn if you have comments, questions, or ideas.