Stop Using the ‘Staging’ Server – DevOps Days Boston

Chloe Condon presented on how containers and IaC (infrastructure as code) can help us skip over the ‘staging server’ part of traditional deployment strategies. This article is a loose transcript of taking points from her talk at DevOps Days Boston 2017.

What’s Wrong a Staging Environment?

Feedback from a traditional staging environment is too slow. The only thing the reviewer knows is if unit tests passed, the rest of the tests are run after that. “Staging” is usually reserved for integration, functional, UI, and performance testing (i.e. complete feedback). Too little, too late.

We’re all too familiar with the question “who broke staging?”. The fragility and centrality of this staging model creates bottlenecks. Also, the very first time something is brought into pipeline usually happens in staging and that’s when ‘broken’ occurs.

There’s lots of “friction” between environments. Dev/test/staging are often not equivalent and are configured differently, causing deployment between environments to be a hastle. Flows across these environments are time-consuming (environment variables and files missing).

Code changes are being tested more extensively in staging, which means there’s little room for timely feedback.

Ephemeral Environments

The great thing is now, we have containers. We can run every build, package it in a container, then run tests on it in the same pipeline. Microservices are well-suited for this type of model, but also distributed stacks (like a web app, database, and supporting APIs) benefit from this model too.

Additionally, most stages of testing can be containerized. Leaving performance and scalability off for a moment, that enables us to run integration, functional, and security testing as part of a complete containerized package.

The problem still remains: we have the rule that staging has to be as close to prod as possible. This might serve some of those tests (like performance and security), but is largely dis-optimal for unit, integration, and functional tests. Performance tests could also be run earlier to provide us a better heads-up about degradations that creep in over time. In practice, late-stage environments don’t match reality and this causes friction..

So let’s reconsider the premise that all of our non-unit testing has to be run in a shared environment that bottlenecks us. This helps us shift feedback to the left. (Chloe says to insert Beyonce clip here.)

Containers = Consistency & Composition & Completeness

So now the container we’re handing off is much more complete: it includes a more complete set of self-testing capabilities that we can ask our pipeline to run for us.

You can hand off containers to your customers (usually internal but maybe even external) and with composition, you have confidence that the bits they’re running are the same as what you tested and what you want them to have.

Infrastructure as Code

Team should define what code is part of the process. When people are able to spin things up automatically on their own, this streamlines an important part of their process. Visualizations help a lot, which is why CodeFresh and other platforms have visual controls over the package and deploy process.

Infrastructure-as-Code (IaC) includes Dockerfiles, but also deployment scripts. If it’s code, treat it like it’s important because otherwise it’s outside the flow of delivery.

Paul’s take: IaC also includes a whole bunch of other stuff too. For example:

  • Composition scripts (like Docker compose, Kubernetes scripts)
  • Secrets management configuration
  • Network configuration
  • Database configuration (might include data)
  • Tests and test data
  • Feature flag configuration
  • Monitoring configuration & scripts

Implementing IaC requires a few things:

  1. Your team agrees and has an in-depth knowledge of how to push healthy code artifacts into the pipeline. No one is an island, others’ contributions need to be readable and easily debuggable.
  2. A resilient process (i.e. pipeline) including dynamic build/package/test semantics enables contributors to focus on the ‘push’ and feedback rather than the semantics.
  3. Information radiators along the process must cater feedback as granularly as possible: individual contributor first, then channel, then team. ChatOps bots give you immediate feedback about breakage as soon as it occurs.

A complete IaC artifact list will require collaboration between multiple contributors, which facilitates communication. Just make sure that empathy and positive reinforcement is part of your management strategy.

Questions from the Audience:

Q: “How do you describe the state of the code in PRs?”

Chloe: “Badges in the repo, some conventions, success flags on Codefresh.”

Q: “How often do people actually use this for pre-stage vs. just going to prod?”

Chloe: “For lots of people, they maintain separate branches for multiple environments. Then you can introduce new versions dynamically.”

Q: “In more complex systems, is there a composition management layer?”

Chloe: “This is the beauty of the compose files. When you treat them like code, this makes management a lot easier.”

More reading: