Beyond DevOps: The ‘Next’ Management Theory

In a conversation today with Ken Mugrage (organizer of DevOps Days Seattle), the scope of the term ‘DevOps’ came up often enough that we purposely double-clicked into it.

‘DevOps’ Is (and Should Be) Limited In Scope

Ken’s view is that the primary context for DevOps is in terms of culture, as opposed to processes, practices, or tools. To me, that’s fine, but there’s so much not accounted for that I feel I have to generalize a bit to get to where I’m comfortable parsing the hydra of topics in the space.

Like M-theory, which attempts to describe how fundamental particles interact with each other, I think DevOps is just a single view of a particular facet of the technology management gem.

DevOps is an implementation of a more general theory, a ‘next’ mindset over managing the hydra. DevOps addresses how developers and operations can more cohesively function together. Injecting all-the-things is counter to the scope of DevOps.

Zen-in: A New Management Theory for Everyone

Zen-in (ぜんいん[全員]) is a Japanese term that means ‘everyone in the group’. It implies a boundary, but challenges you to think about who is inside that boundary. Is it you? Is it not them? Why not? Who decides? Why?

By ‘management’ theory, I don’t mean another ‘management silo’. I literally mean the need to manage complexity: personal, technological, and organizational. Abstracting up a bit, the general principles of this theory are:

  • Convergence (groups come together to accomplish a shared goal)
  • Inclusion (all parties have a voice, acceptance of constraints)
  • Focus (alignment to shared goal, strategies, and tactics)
  • Improvement (learning loops, resultant actions, measurement, skills, acceleration, workforce refactoring, effective recruiting)
  • Actualization (self-management, cultural equilibrium, personal fulfillment)

I’ll be writing more on this moving forward as I explore each of these concepts, but for now I think I’ve found a basic framework that covers a lot of territory.

I Need Your Help to Evolve This Conversation

True to Zen-in, if you’re reading this, you’re already in the ‘group’. Your opinions, questions, and perspectives are necessary to iterate over how these concepts fit together.

Share thoughts in the comments section below! Or ping me on Twitter @paulsbruce or LinkedIn.


How to Be a Good DevOps Vendor

This article is intended for everyone involved in buying or selling tech, not just tooling vendors. The goal is to paint a picture of what an efficient supply and acquisition process in DevOps looks like. Most of this article is phrased from an ‘us’ (acquirer) to ‘you’ (supplier) perspective, but it’s written out of admiration for all involved.

Developers, Site-Reliability Engineers, Testers, Managers…please comment and add to this conversation because we all win when we all win.

MP3: https://soundcloud.com/paulsbruce/how-to-be-a-good-devops-vendor

I’ll frame my suggestions across a simplified four-stage customer journey:

Make It Easy for Me to Try Your Thing Out

(Product / Sales)
Make the trial process as frictionless as possible. This doesn’t mean hands off, but rather a progressive approach that gives each of us the value we need iteratively to get to the next step.

(Sales / Marketing)
If you want to know what we’re doing, do your own research and come prepared to listen to us about our immediate challenge. Know how that maps to your tool, or find someone who does, fast. If you don’t feel like you know enough to do this, roll up your sleeves and engage your colleagues. Lunch-n-learns with product/sales/marketing really help make you more effective.

(Sales)
I know you want to qualify us as an opportunity for your sales pipeline, but we have a few checkboxes in our heads before we’re interested in helping you with your sales goals. Don’t ask me to ‘go steady’ (i.e. regular emails or phone calls) before we’ve had our first date (i.e. I’ve validated that your solution meets basic requirements).

(Product / Marketing)
Your “download” process should really happen from a command line, not from a 6-step website download process (that’s so ’90s), and don’t bother us with license keys. Handle the activation process for us. Just let us get in to the code (or whatever) and fumble around a little first…because we’re likely engineers and we like to take things apart to understand them. So long as your process isn’t kludgy, we’ll get to a point where we have some really relevant questions.

(Marketing / Sales)
And we’ll have plenty of questions. Make it absurdly easy to reach out to you. Don’t be afraid if you can’t answer them, and don’t try to preach value if we’re simply looking for a technical answer. Build relationships internally so you can get a technical question answered quickly. Social and community aren’t just marketing outbound channels, they’re inbound too. We’ll use them if we see them and when we need them.

(Marketing / Community / Relations)
Usage of social channels varies by person and role, so have your ears open on many of them: Github, Stack Overflow, Twitter, (not Facebook pls), LinkedIn, your own community site…make sure your marketing+sales funnel is optimized to accept me in the ‘right’ way (i.e. don’t put me in a marketing list).

Don’t use bots. Just don’t. Be people, like me.

(Sales / BizDev)
As I reach out, ask me about me. If I’m a dev, ask what I’m building. If I’m a release engineer, ask how you can help support my team. If I’m a manager, ask how you can help my team deliver what they need to deliver, faster. Have a 10-second pitch, but start the conversation right in order to earn trust so you can ask your questions.


Help Me Buy What I Need Without Lock-in

(Sales / Customer Success)
Even after we’re prepared to sign a check, we’re still dating. Tools that provide real value will spread and grow in usage over time. Let us buy what we need, do a PoC (which we will likely need some initial help with), then check in with us occasionally (customer success) to keep the account on the right train tracks.

(Sales / Marketing)
Help us make the case for your tool. Have informational materials, case studies, competitive sheets, and cost/value breakdowns that we may need to justify an expenditure that exceeds our discretionary budget constraints. Help us align our case depending on whether it will be coming out of a CapEx or OpEx line. Help us make its value visible and promote what an awesome job we did to pick the right solution for everyone it benefits. Don’t wait for someone to hand you what you need, try things and share your results.

(Product)
Pick a pricing model that meets both your goals and mine. Yes, that’s complicated. That’s why it’s a job for the Product Team. As professional facilitators and business drivers, seek input from everyone: sales, marketing, customers!!!, partners, and friends of the family (i.e. trusted advisors, brand advocates, professional services). Don’t be greedy; be realistic. Have backup plans at the ready, and communicate pricing changes proactively.

(Sales)
Depending on your pricing model, really help us pick the right one for us, not the best one for you. Though this sounds counter-intuitive to your bottom line, doing this well will increase our trust in you. When I trust you, not only will I likely come back to you for more in the future, we’ll also excitedly share this with colleagues and our networks. Some of the best public champions for a technology are those that use it and trust the team behind it.

Integrate Easily Into My Landscape

(Product)
Let us see you as code. If your solution is so proprietary that we can’t see underlying code (like layouts, test structure, project file format), re-think your approach because if it’s not code, it probably won’t fit easily into our delivery pipeline. Everything is code now…the product, the infrastructure configuration, the test artifacts, the deployment semantics, the monitoring and alerting…if you’re not in there, forget it.

(Product)
Integrate with others. If you don’t integrate into our ecosystem (i.e. plugins to other related parts of our lifecycle), you’re considered a silo and we hate silos. Workflows have to cross platform boundaries in our world. We already bought other solutions. Don’t be an island, be a launchpad. Be an information radiator.

(Product / Sales / Marketing)
Actually show how your product works in our context…which means you need to understand how people should/do use your product. Don’t just rely on screenshots and product-focused demos. Demonstrate how your JIRA integration works, or how your tool is used in a continuous integration flow in Jenkins or Circle CI, or how your metrics are fed into Google Analytics or Datadog or whatever dashboarding or analytics engine I use. The point is (as my new friend Lauri says it)…”show me, don’t tell me”.

(Sales / Marketing)
This goes for your decks, your videos, your articles, your product pages, your demos, your booth conversations, and even your pitch. One of the best technical pitches I ever saw wasn’t a pitch at all…it was a technical demo from the creator of Swagger, Tony Tam at APIstrat Austin 2015. He just showed how SwaggerHub worked, and everyone was like ‘oh, okay, sign me up’.

Truth be told, I only attended to see what trouble I could cause.  Turns out he showed a tool called Swagger-Inflector and I was captivated.
– Darrel Miller on Bizcoder

(Sales / Product)
If you can’t understand something that the product team is saying, challenge them on it and ask them for help understanding how and when to sell the thing. Product: sales enablement is part of your portfolio, and though someone else might execute it, it’s your job to make sure your idea translates into an effective sales context (overlap/collaborate with product marketing a lot).

(Product / Customer Support)
As part of on-boarding, have the best documentation on the planet. This includes technical documentation (typically delivered as part of the development lifecycle) that you regularly test to make sure it is accurate. Also provide how-to articles that are down to earth. Show me the ‘happy path’ so I can use it as a reference to know where I’ve gone wrong on my integration.

(Product / Developers / Customer Support)
Also provide validation artifacts, like tools or tests that make sure I’ve integrated your product into my landscape correctly. Don’t solely rely on professional services to do this unless most other customers have told you this is necessary, which indicates you need to make it easier anyway.

(Customer Support / Customer Success / Community / Relations)
If I get stuck, ask me why and how I’m integrating your thing into my stuff to get some broader context on my ultimate goal. Then we can row in that direction together. Since I know you can’t commit unlimited resources to helping customers, build a community that helps each other and reward contributors when they help each other. A customer gift basket or Amazon gift card to the top external community facilitators goes a long way toward turning your community into a second-level support system that handles occasional support overflows.

Improve What You Do to Help Me With What I Do

(Product / Development / Customer Support)
Fix things that are flat out broken. If you can’t now, be transparent and diplomatic about how your process works, what we can do as a work-around in the mean time, and receive our frustration well. If we want to contribute our own solution or patch, show gratitude not just acknowledgement, otherwise we won’t go the extra mile again. And when we contribute, we are your champions.

(Product)
Talk to us regularly about what would work better for us, how we’re evolving our process, and how your thing would need to change to be more valuable in our ever-evolving landscape. Don’t promise anything, but also don’t hide ideas. Selectively share items from your roadmap and ask for our candid opinion. Maybe even hold regional user groups or ask us to come speak to your internal teams as outside feedback from my point of view as a customer.

(Product)
Get out to the conferences, be in front of people and listen to their reactions. Do something relevant yourself and don’t be just another product-headed megalomaniac. Be part of the community, don’t just expect to use it when you want to say something. Host things (it may cost money), volunteer occasionally, and definitely make people feel heard.

(Everyone)
Be careful that your people-to-people engagements don’t suffer from technical impedance mismatch. Sales and marketing can be at booths, but should have a direct line to someone who can answer really technical questions as they arise. We engineers can smell marketing/sales from a mile away (usually because they smell showered and professional). But it’s important for our questions to get answered and for the conversation to feel friendly. This is what’s great about having your Developer Relations people there…we can nerd out and hit it off great. I come away with next steps that you (marketing / sales) can follow up on. And make sure you have a trial I can start in on immediately. Use every conversation (and conference) as a learning opportunity.

(Product)
Build the shit out of your partner ecosystem so it’s easier for me to get up and running with integrations. Think hard before you put your new shiny innovative feature in front of a practical thing like a technical integration I and many others have been asking for.

(Development / Community / Marketing / Relations)
If there is documentation with code in it and you need API keys or something, inject them into the source code for me when I’m logged in to your site (like SauceLabs Appium tutorials). I will probably copy and paste, so be very careful about the code you put out there, because I will judge you for it when it doesn’t work.

(Marketing / Product)
When you do push new features, make sure that you communicate to me about things I am sure to care about. This means you’ll have to keep track of what I indicate I care about (via tracking my views on articles, white paper downloads, sales conversations, support issues, and OPT-IN newsletter topics). I’m okay with this if it really benefits me, but if I get blasted one too many times, I’ll disengage/unsubscribe entirely.

Summary: Help Me Get to the Right Place Faster…Always

None of us have enough time for all the things. If you want to become a new thing on my plate, help me see how you can take some things off of my plate first (i.e. gain time back). Be quick to the point, courteous, and invested in my success. Minimize transaction (time) cost in every engagement.

(Sales, et al: “Let’s Get Real or Let’s Not Play” is a great read on how to do this.)

As often as appropriate, ask me what’s on my horizon and how best we can get there together. Even if I’m heads-down in code, I’ll remember that you were well-intentioned and won’t write you off for good.

NEXT STEPS: share your opinions, thoughts, and suggestions in the comments section below! Or ping me on Twitter @paulsbruce or LinkedIn.


Performance Is (Still) a Feature, Not a Test!

Since I presented the following perspective at APIStrat Chicago 2014, I’ve had many opportunities to clarify and deepen it within the context of Agile and DevOps development:

It’s more productive to view system performance as a feature than to view it as a set of tests you run occasionally.

The more teams I work with, the more I see that performance is a critical aspect of their products. But why is performance so important?

‘Fast’ Is a Subconscious User Expectation

Whether you’re building an API, an app, or whatever, its consumers (people, processes) don’t want to wait around. If your software is slow, it becomes a bottleneck to whatever real-world process it facilitates.

Your Facebook feed is a perfect example. If it is even marginally slower to scroll through it today than it was yesterday, if it is glitchy, halting, or janky in any way, your experience turns from dopamine-inducing self-gratification to epinephrine-fueled thoughts of tossing your phone into the nearest body of water. Facebook engineers know this, which is why they build data centers to test and monitor mobile performance on a per-commit basis. For them, this isn’t a luxury; it’s a hard requirement, as it is for all of us whether we choose to address it or not. Performance is everyone’s problem.

Performance is as critical to delighting people as delivering them features they like. This is why session abandonment rates are a key metric on Cyber Monday.

‘Slow’ Compounds Quickly

Performance is a measurement of availability over time, and time always marches forward. Performance is an aggregate of many dependent systems, and even just one slow link can cause an otherwise blazingly fast process to grind to a halt long enough for people to turn around and walk the other way.

Consider a mobile app; performance is everything. The development team slaves over which list component scrolls faster and more smoothly, and spends hours getting asynchronous calls and spinners to provide the user critical feedback so that they don’t think the app has crashed. Then a single misbehaving REST call to some external web API suddenly slows by 50% and the whole user experience is untenable.

The performance of a system is only as strong as its weakest link. In technical terms, this is about risk. You at least need to know the risk introduced by each component of a system; only then can you choose how to mitigate that risk accordingly. ‘Risk’ is a huge theme in ISO 29119 and the upcoming IEEE 2675 draft I’m working on, and any seasoned architect would know why it matters.

Fitting Performance into Feature Work

Working on ‘performance’ and working on a feature shouldn’t be two separate things. Automotive designers don’t separate the two when they build car engines; performance is paramount throughout design and even the assembly process. Neither should it be separate in software development.

However, in practice, if you’ve never run a load test, tracked power consumption of a subroutine, or analyzed aggregate results, it will feel different from building features, for sure. Comfort and efficiency come with experience. A lack of experience or familiarity doesn’t remove the need for something critical to occur; it accelerates the need to ask how to get it done.

A reliable code pipeline and testing schedule make all the difference here. Many performance issues take time or dramatic conditions to expose, such as battery degradation, load balancing, and memory leaks. In these cases, it isn’t feasible to execute long-running performance tests for every code check-in.

What does this mean for code contributors? Since they are still responsible for meeting performance criteria, it means that they can’t always press the ‘done’ button today. It means we need reliable delivery pipelines to push code through that check its performance pragmatically. As pressure to deliver value incrementally mounts, developers are taking responsibility for the build and deployment process through technologies like Docker, Jenkins Pipeline, and Puppet.

It also means that we need to adopt a testing schedule that meets the desired development cadence and real-world constraints on time and infrastructure (a pipeline sketch follows the list below):

  • Run small performance checks on all new work (new screens, endpoints, etc.)
  • Run local baselines and compare before individual contributors check in code
  • Schedule long-running (anything slower than 2 minutes) performance tests into a pipeline stage that runs in parallel after build verification
  • Schedule nightly performance regression checks on all critical risk workflows (i.e. login, checkout, submit claim, etc.)
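
As a rough illustration of the last two scheduling points, here is a minimal declarative Jenkins pipeline sketch: short checks run on every build, the long-running load test sits in a parallel stage after build verification, and a cron trigger covers nightly regression runs. The Gradle tasks, jMeter plan, and cron schedule below are placeholders, not a prescription:

  // Sketch only: the Gradle tasks, the .jmx plan, and the cron spec are illustrative.
  pipeline {
      agent any
      triggers {
          // nightly regression runs on critical-risk workflows (often a separate job/trigger)
          cron('H 2 * * *')
      }
      stages {
          stage('Build & Verify') {
              steps { sh './gradlew assembleDebug testDebugUnitTest' }
          }
          stage('Performance') {
              parallel {
                  stage('Quick perf checks') {
                      // small checks on new screens/endpoints, kept under ~2 minutes
                      steps { sh './gradlew perfSmokeTest' }
                  }
                  stage('Long-running load test') {
                      // anything slower runs here, off the build's critical path
                      steps { sh 'jmeter -n -t perf/critical-workflows.jmx -l results.jtl' }
                  }
              }
          }
      }
  }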

How Do You Bake Performance Into Development?

While it’s perfectly fine to adopt patterns like ‘spike and stabilize’ on feature development, stabilization is a required payback of the technical debt you incur when your development spikes. To ‘stabilize’ isn’t just to make the code work, it’s to make it work well. That includes meeting performance (not just acceptance) criteria before the work is considered complete.

A great place to start making measurable performance improvements is to measure performance objectively. Every user story should contain solid performance criteria, just as it should contain acceptance criteria. In recent joint research, I found that higher-performing development teams include performance criteria on 50% more of their user stories.

In other words, embedding tangible performance expectations in your user stories bakes performance in to the resulting system.

There are a lot of sub-topics under the umbrella term “performance”. When we get down to brass tacks, measuring performance characteristics often boils down to three aspects: throughput, reliability, and scalability. I’m a huge fan of load testing because it helps to verify all three measurable aspects of performance.

Throughput: from a good load test, you can objectively track throughput metrics like transactions/sec, time-to-first-byte (and last byte), and distribution of resource usage (i.e. are all CPUs being used efficiently?). These give you a raw and granular level of detail that can be monitored and visualized in stand-ups and deep-dives equally.

Reliability: load tests also exercise your code far more than you can independently. It takes exercise to expose if a process is unreliable; concurrency in a load test is like exercise on steroids. Load tests can act as your robot army, especially when infrastructure or configuration changes push you into unknown risk territory.

Scalability: often, scalability mechanisms like load balancing, dynamic provisioning, and network shaping throw unexpected curveballs into your user’s experience. Unless you are practicing a near-religious level of control over deployment of code, infrastructure, and configuration changes into production, you run the risk of affecting real users (i.e. your paycheck). Load tests are a great way to see what happens ahead of time.


Short, Iterative Load Testing Fits Development Cycles

I am currently working with a client to load test their APIs, to simulate mobile client bursts of traffic that represent real-world scenarios. After a few rounds of testing, we’ve resolved many obvious issues, such as:

  • Overly verbose logs that write to SQL and/or disk
  • Parameter formats that cause server-side parsing errors
  • Throughput restrictions against other 3rd-party APIs (Google, Apple)
  • Static data that doesn’t exercise the system sufficiently
  • Large images stored as SQL blobs with no caching

We’ve been able to work through most of these issues quickly in test/fail/fix/re-test cycles, where we conduct short all-hands sessions with a developer, a test engineer, and myself. After a quick review of significant changes since the last session (i.e. code, test, infrastructure, configuration), we use BlazeMeter to kick off a new API load test written in jMeter and monitor the server in real-time. We’ve been able to rapidly resolve a few anticipated, backlogged issues as well as learn about new problems that are likely to arise at future usage tiers.

The key here is to ‘anticipate iterative re-testing’. Again I say: “performance is a feature, not a test”. It WILL require re-design and re-shaping as the code changes and system behaviors are better understood. It’s not a one-time thing to verify how a dynamic system behaves given a particular usage pattern.

From a business perspective, the outcome of this load testing is that the new system is perceived as far less of a risky venture and more as the innovation investment needed to improve sales and the future of their digital strategy.

Performance really does matter to everyone. That’s why I’m available to chat with you about it any time. Ping me on Twitter and we’ll take it from there.

A Jenkins Pipeline for Mobile UI Testing with Appium and Docker

In theory, a completely Docker-ized version of an Appium mobile UI test stack sounds great. In practice, however, it’s not that simple. This article explains how to structure a mobile app pipeline using Jenkins, Docker, and Appium.

TL;DR: The Goal Is Fast Feedback on Code Changes

When we make changes, even small ones, to our codebase, we want to prove that they had no negative impact on the user experience. How do we do this? We test…but manual testing takes time and is error-prone, so we write automated unit and functional tests that run quickly and consistently. Duh.

As Uncle Bob Martin puts it, responsible developers not only write code that works, they provide proof that their code works. Automated tests FTW, right?

Not quite. There are a number of challenges with test automation that raise the bar on complexity before tests can successfully provide us this feedback. For example:

  • How much of the code and its branches actually gets covered by our tests?
  • How often do tests fail for reasons other than the code not working?
  • How accurate was our implementation of the test case and criteria as code?
  • Which tests do we absolutely need to run, and which can we skip?
  • How fast can and must these tests run to meet our development cadence?

Jenkins Pipeline to the Rescue…Not So Fast!

Once we identify what kind of feedback we need and match that to our development cadence, it’s time to start writing tests, yes? Well, that’s only part of the process. We still need a reliable way to build/test/package our apps. The more automated this can be, the faster we can get the feedback. A pipeline view of the process begins with code changes, includes building, testing, and packaging the app so we always have a ‘green’ version of our app.

Many teams choose a code-over-configuration approach. The app is code, the tests are code, server setup (via Puppet/Chef and Docker) is code, and, not surprisingly, our delivery process is now code too. Everything is code, which lets us extend SCM virtues (versioning, auditing, safe merging, rollback, etc.) to our entire software lifecycle.

Below is an example of ‘process-as-code’ as a Jenkins Pipeline script. When a build project is triggered, say when someone pushes code to the repo, Jenkins will execute this script, usually on a build agent. The code gets pulled, the project dependencies get refreshed, a debug version of the app and its tests are built, then the unit and UI tests run.
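
The original post embedded the full script; a trimmed-down sketch of its shape looks something like the scripted pipeline below. The Gradle task names are illustrative, and real emulator management takes considerably more plumbing than shown here:

  // A sketch of the pipeline shape described above, not the exact script from the post.
  node {
      stage('Checkout')     { checkout scm }
      stage('Dependencies') { sh './gradlew --refresh-dependencies dependencies' }
      stage('Build Debug')  { sh './gradlew assembleDebug assembleDebugAndroidTest' }
      stage('Unit Tests')   { sh './gradlew testDebugUnitTest' }
      stage('Instrumented Tests') {
          // assumes an emulator is booted (and torn down) by surrounding steps
          sh 'adb wait-for-device'
          sh './gradlew connectedDebugAndroidTest'  // runs the Espresso suite on the emulator
      }
  }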

Notice that last step? The ‘Instrumented Tests’ stage is where we run our UI tests, in this case our Espresso test suite using an Android emulator. The sharp spike in code complexity, notwithstanding my own capabilities, reflects reality. I’ve seen a lot of real-world build/test scripts which also reflect the amount of hacks and tweaks that begin to gather around the technologically significant boundary of real sessions and device hardware.

A great walkthrough on how to set up a Jenkinsfile to do some of the nasty business of managing emulator lifecycles can be found on Philosophical Hacker…you know, for light reading on the weekend.

Building a Homegrown UI Test Stack: Virtual Insanity

We have lots of great technologies at our disposal. In theory, we could use Docker, the Android SDK, Espresso, and Appium to build reusable, dynamic nodes that can build, test, and package our app dynamically.

Unfortunately, in practice, the user interface portion of our app requires hardware resources that simply can’t be executed in a timely manner in this stack. Interactive user sessions are a lot of overhead, even virtualized, and virtualization is never perfect.

Docker runs under either HyperKit (a lightweight virtualization layer on Mac) or within a VirtualBox host, but neither of these solutions supports nested virtualization, and neither can pass raw access to the host machine’s VT-x instruction set through to containers.

What’s left for containers is a virtualized CPU that doesn’t support the basic specs the Android emulator needs to use the host GPU, requiring us to run QEMU with ARM images instead of native x86/x86_64 AVD-based images. This makes timely spin-up and execution of Appium tests so slow that it renders the solution infeasible.

Alternative #1: Containerized Appium w/ Connection to ADB Device Host

Since we can’t feasibly keep emulation in the same container as the Jenkins build node, we need to split out the emulators to host-level hardware assisted virtualization. This approach also has the added benefit of reducing the dependencies and compound issues that can occur in a single container running the whole stack, making process issues easier to pinpoint if/when they arise.

So what we’ve done is decoupled our “test lab” components from our Jenkins build node into a hardware+software stack that can be “easily” replicated:

Unfortunately, we can no longer keep our Appium server in a Docker container (which would make the process reliable, consistent across the team, and minimize cowboy configuration issues). Even after you:

  • Run the Appium container in privileged mode
  • Mount volumes to pass build artifacts around
  • Establish an SSH tunnel from container to host to use host ADB devices
  • Establish a reverse SSH tunnel from host to container to connect to Appium
  • Manage and exchange keys for SSH and Appium credentials

…you still end up dealing with flaky container-to-host connectivity and bizarre Appium errors that don’t occur if you simply run Appium server on bare metal. Reliable infrastructure is a hard requirement, and the more complexity we add to the stack, the more (often) things go sideways. Sad but true.

Alternative #2: Cloud-based Lab as a Service

Another alternative is to simply use a cloud-based testing service. This typically involves adding credentials and API keys to your scripts, and paying for reserved devices up-front, which can get costly. What you get is hassle-free, somewhat constrained real devices that can be easily scaled as your development process evolves. Just keep in mind, aside from credentials, you want to carefully manage how much of your test code integrates custom commands and service calls that can’t easily be ported over to another provider later.

Alternative #3: Keep UI Testing on a Development Workstation

Finally, we could technically run all our tests on our development machine, or get someone else to run them, right? But this wouldn’t really translate to a CI environment and doesn’t take full advantage of the speed benefits of automation, neither of which helps us parallelize coding and testing activities. Testing on local workstations is important before checking in new tests to prove that they work reliably, but doesn’t make sense time-wise for running full test suites in continuous delivery/deployment.

Alternative #4: A Micro-lab for Every Developer

Now that we have a repeatable model for running Appium tests, we can scale that out to our team. Since running emulators on commodity hardware and open source software is relatively cheap, we can afford a “micro-lab” for each developer making code changes on our mobile app. The “lab” now looks something like this:

As someone who has worked in the testing and “lab as a service” industries, I can say there are definitely situations where teams and organizations outgrow the “local lab” approach. Your IT/ops team might just not want to deal with per-developer hardware sprawl. You may not want to dedicate team members to be the maintainers of container/process configuration. And, while Appium is a fantastic technology, like any OSS project it often falls behind in supporting the latest devices and hardware-specific capabilities. Fingerprint support is a good example of this.

The Real Solution: Right { People, Process, Technology }

My opinion is that you should hire smart people (not one person) with a bit of grit and courage who “own” the process. When life (I mean Apple and Google) throws you curveballs, you need people who can quickly recover. If you’re paying for a service to help with some part of your process as a purely economic trade-off, do the math. If it works out, great! But this is also an example of “owning” your process.

Final thought: as more and more of your process becomes code, remember that code is a liability, not an asset. The less of it and the leaner your approach, generally the better.


Automating the Quality of Your Digital Front Door

Mobile is the front door to your business for most, if not all, of your users. But how often do you use your front door, a few times a day? How often do your users use your app? How often would you like them to? It’s really a high-traffic front door between people and you.

This is how you welcome people into what you’re doing. If it’s broken, people don’t feel welcome.

[7/27/2017: For my presentation at Mobile Tea Boston, my slides and code samples are below]


Slides with notes: http://bit.ly/2tgGiGr
Git example: https://github.com/paulsbruce/FingerprintDemo

The Dangers of Changing Your Digital Front Door

In his book “On Intelligence”, Hawkins describes how quickly our human brains pick up on minute changes with the analogy of someone replacing the handle on your front door with a knob while you’re out. When you get back, things will seem very weird. You feel disoriented, alienated. Not emotions we want to invoke in our users.

Now consider what it’s like for your users to have you changing things on their high-traffic door to you. Change is good, but only good changes. And when changes introduce problems, forget sympathy, forget forgiveness, people revolt.

What Could Possibly Go Wrong?

A lot. Even for teams that are great at what they do, delivering a mobile app is fraught with challenges that lead to:

  • Lack of strategy around branching, merging, and pushing to production
  • Lack of understanding about dependencies, impacts of changes
  • Lack of automated testing, integration woes, no performance/scalability baselines, security holes
  • Lack of communication between teams (Front-end, API, business)
  • Lack of planning at the business level (marketing blasts, promotions, advertising)

Users don’t care about our excuses. A survey by Perfecto found that more than 44% of defects in mobile apps are found by users. User frustrations aren’t just about what you designed; they’re also about how the app behaves in the real world. Apps that are too slow will be treated as broken apps and uninstalled just the same.

What do we do about it?

We test, but testing is a practice unto itself. There are many test types and methodologies like TDD, ATDD, and BDD that drive us to test. Not everyone is cut out to be a great tester, especially when developers are driven to write only things that work, and not to test for when they shouldn’t (i.e. a lack of negative testing).

Alister Scott – Test ‘Ice Cream Cone’

In many cases, automation gaps and issues make it easier for development teams to fall back to manual testing. This is what Alister Scott (of WatirMelon) calls the ‘ice cream cone’ anti-pattern, an inversion of the ideal test pyramid, and Mike Cohn has good thoughts on this paradigm too.

To avoid this downward spiral, we need to prioritize automation AND which tests we choose to automate. Testing along architecturally significant boundaries, as Kevlin Henney puts it, is good; but in a world full of both software and hardware, we need to broaden that idea to ‘technologically significant boundaries’. The camera, GPS, biometric, and other peripheral interfaces on your phone are a significant boundary…fault lines of the user experience.

Many development teams have learned the hard way that not including real devices in automated testing leaves these UX fault lines at risk of escaping defects. People in the real world use real devices on real networks under real usage conditions, and our testing strategy should reflect this reality too.

The whole point of all this testing is to maintain confidence in our release readiness. We want to be in an ‘always green’ state, and there’s no way to do this without automated, continuous testing.

Your Code Delivery Pipeline to the Rescue!

Confidence comes in two flavors: quality and agility. Specifically, does the code we write do what we intend, and can we iterate and measure quickly?

Each team comes with their own definition of done, their own acceptable levels of coverage, and their own level of confidence over what it takes to ship, but answering both of these questions definitively requires adequate testing and a reliable pipeline for our code.

Therein lies the dynamic tension between agility (nimbleness) and the messy world of reality. What’s the point of pushing out something that doesn’t match the needs of reality? So we try to pull reality in little bits at a time, but reality can be slow. Executing UI tests takes time. So we need to code and test in parallel, automate as much as possible, and be aware of the impact of changes on release confidence.

The way we manage this tension is to push smaller batches more frequently through the pipeline and bring the pain forward; in other words, continuous delivery and deployment. Far from a monolithic approach, we shrink the whole process down to the individual contributor level. Always green at the developer level…merge only code that has been tested automatically, thoroughly.

Even in a Perfect World, Your Front Door Still Jams

So automation is crucial to this whole thing working. But what happens when we can’t automate something? This is often why the “ice cream cone” exists.

Let’s walk through it together. Google I/O or WWDC drops new hardware or platform capabilities on us. There’s a rush to integrate, but a delay in tooling and support gums up development all the way through production troubleshooting. We mock what we have to, but fall back to manual testing.

This not only takes our time, it robs us of velocity and any chance to reach that “always green” aspiration.

The worst part is that we don’t even have to introduce new functionality to fall prey to this problem. Appium was stuck behind a lack of iOS 10 support for months, which meant most companies had no automated way to validate on a platform that was already out.

And if anything, history teaches us that technology advances whether the last thing is well-enough baked or not. We are still dealing with camera (i.e. driver stack) flakiness! Fingerprint isn’t as unreliable, but it’s still part of the UI/UX. And many of us now face an IoT landscape with very few standards that developers follow.

So when faced with architectural boundaries that have unpolished surfaces, what do we do? Mocks…good enough for early integration, but who will stand up and say testing against mocks is good enough to go to production?

IoT Testing Provides Clues to How We Can Proceed

In many cases, introducing IoT devices into the user experience means adding architecturally significant boundaries. Standards like BLE, MQTT, CoAP and HTTP provide flexibility to virtualize much of the interactions across these boundaries.

In the case of Continuous Glucose Monitoring (CGM) vendors, their hardware and mobile app dev teams run on very different cycles. But to integrate often, they virtualize BLE signals to real devices in the cloud as part of their mobile app test scripts. They also add “IoT ninjas” to the experience team, hardware/firmware engineers who are in charge of prototyping changes on the device side, to make sure that development and testing on the mobile app side is as enabled as possible.

Adding IoT to the mix will change your pyramid structure, adding pressure to rely on standards/interfaces as well as manual testing time for E2E scenarios.

[For more on IoT Testing, see my deck from Mobile/IoT Dev+Test 2017 here]

Automated Testing Requires Standard Interfaces

There are plenty of smart people looking to solve the busy-work problem with writing tests. Facebook Infer, Appdiff, Functionize, and mabl are just a few of the new technologies that integrate machine learning and AI to reduce time spent on testing busy-work.

But any and all programmatic approach, even AI, requires standard interfaces; in our case, universally accepted development AND testing frameworks and technologies.

Tool ecosystems don’t get built without foundational standards, like HTML/CSS/JS, Android, Java, and Swift. And when they want to innovate on hardware or platform, there will always be some gaps, usually in automation around the new stuff.

Example Automation Gap: Fingerprint Security

Unfortunately for those of us who see the advantages of integrating with innovative platform capabilities like biometric fingerprint authentication, automated testing support is scarce.

What this means is that we either don’t test certain critical workflows in our app, or we manually test them. What a bummer to velocity.

The solution is to have people who know how to implement multiple test frameworks and tools in a way that matches the velocity requirements of development.

For more information on this, see my deep-dive on how to use Appium in Android development to simulate fingerprint activities in automated tests. It’s entirely possible, but it requires experience and planning over how to integrate a mobile lab into your continuous integration pipeline.
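
To give a flavor of what that looks like, here is a hedged sketch (not the exact code from that deep-dive): on Android emulators, a successful fingerprint scan can be simulated by sending an adb ‘emu finger touch’ event while the app is sitting on its fingerprint prompt. A small Groovy helper for a test or pipeline step might look like this, assuming a local emulator with a fingerprint already enrolled:

  // Hedged sketch: works only against emulators (with an enrolled fingerprint), not real
  // devices; that hardware gap is exactly the automation problem described above.
  def simulateFingerprintTouch(String emulatorSerial = 'emulator-5554', int fingerId = 1) {
      // sends the same event the emulator's extended controls use for fingerprint auth
      ['adb', '-s', emulatorSerial, 'emu', 'finger', 'touch', fingerId.toString()]
              .execute()
              .waitForOrKill(5000)
  }

  // e.g. call simulateFingerprintTouch() after the UI test drives the app to its prompt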


Tailoring Fast Feedback to Resources (and vice versa)

As you incrementally introduce reality into every build, you’ll run into two problems: execution speed and device pool limits.

To solve for execution speed, most development teams parallelize their testing against multiple devices at once, and split their testing strategy across different schedules. This is just an example of a schedule against various testing types.

For more on this, I published a series of whitepapers on how to do this.

TL;DR recap

Automating the quality of our web and mobile apps keeps us accurate, safe, and confident; but it isn’t easy. Fortunately, we have many tools and a lot of thought already put into how to do this. Notwithstanding the ignorance of some individuals, automation continues to change the job landscape over and over again.

Testing always takes tailoring to the needs of the development process to provide fast feedback. The same is true in reverse: developers need to understand where support gaps exist in test frameworks and tooling, otherwise they risk running the “ship” aground.

This is why, and my mantra remains, it is imperative to velocity to have the right people in the planning room when designing new features and integrating capabilities across significant technological boundaries.

Similarly, in my research on developer efficiency, we see a correlation between how well non-functional criteria are covered on features and overall test coverage. Greater completeness in upfront planning saves time and effort; it’s just that simple.

Just like Conway’s “law” suggests, your team’s structure, communication patterns, functions, and dysfunctions all show up in the final product. Have the right people in the room when planning new features, running retros, and determining your own definition of done. Otherwise you end up with more gaps than just in automation.

Meta / cliff notes:

  • “Everyone owns quality” means that the whole team needs to be involved in testing strategy
    • To what degree are various levels of testing included in Definition of Done?
    • Which test sets (i.e. feedback loops) provide the most value?
    • How are various tests triggered, considering their execution speed?
    • Who’s responsible for creating which types of tests?
    • How are team members enabled to interpret and use test result data?
    • When defects do escape certain stages, how is RCA used to close the gap?
    • Who manages/fixes the test execution framework and infrastructure?
    • Do the benefits of the current approach to testing outweigh the cost?
  • Multiple testing frameworks / tools / platforms is 200 OK
    • We already use separate frameworks for separate test types
      • jUnit/TestNG (Java) for unit (and some integration) testing
      • Chakram/Citrus/Postman/RestAssured for API testing
      • Selenium, Appium, Espresso, XCTest for UI testing
      • jMeter, Dredd, Gatling, Siege for performance testing
    • Tool sprawl can be a challenge, but proper coverage requires plurality
    • Don’t overtax one framework or tool to do a job it can’t, just find a better fit
  • Incremental doses of reality across architecturally significant boundaries
    • We need reality (real devices, browsers, environments) to spot fragility in our code and our architecture
    • Issues tend to clump around architecturally significant boundaries, like API calls, hardware interfaces, and integrations to monolithic components
    • We stub/mock/virtualize to speed development; signs of “significant” boundaries, but it only tells us what happens in isolation
    • A reliable code pipeline can do the automated testing for you, but you still need to tell it what and when to test; have a test execution strategy that considers:
      • testing types (unit, component, API, integration, functional, performance, installation, security, acceptance/E2E, …)
      • execution speed (<2m, <20m, <2h, etc) vs. demand for fast feedback
      • portions of code that are known-fragile
      • various critical-paths: login, checkout, administrative tasks, etc.
    • Annotations denote tests that relate across frameworks and tools
      • @Signup, @Login, @SearchForProduct, @V2Deploy
      • Tag project-based work (like bug fixes) like: JIRA-4522
  • Have the right people in the room when planning features
    • Future blockers like test framework support for new hardware capabilities will limit velocity, so have test engineers in the planning phases
    • Close the gap between what was designed vs. what is feasible to implement by having designers and developers prototype together
    • Including infrastructure/operations engineers in planning reduces later scalability issues; just like testers, this can be a blocker to release readiness
    • Someone, if not all the people above, should represent the user’s voice


Don’t Panic! (or how to prepare for IoT with a mature testing strategy)

Thanks everyone for coming to my talk today! Slides and more links are below.

As all my presentations are, this is meant to extend a dialog to you, so please tweet to me any thoughts and questions you have to become part of the conversation. Looking forward to hearing from you!

More links are in my slides, and the presenter notes are the narrative in case you had to miss part of the presentation.

AnDevCon: Espresso Is Awesome, But Where Are Its Edges?

For my presentation at AnDevCon SF 2016, I focused on how Espresso represents a fundamental change in how we approach the process of shipping software that provably works on a mobile ecosystem that is constantly changing.

The feedback was overwhelmingly good, many people who stopped by the Perfecto booth before or after our talk came to me to discuss topics I raised. In other words, it did what I wanted, which was to provide value and strike up conversation about how to improve the Android UI testing process.

If you’re pressed for time, my slides can be found below or at:
bit.ly/espresso-edges-andevcon

Schrödinger’s Box and other complexities of scaling the software delivery lifecycle

Google Slides

Thank you to everyone at Defrag for attending my session. Great conversations afterwards, and looking forward to any more input you have in the future!



You Must Be This High to Ride the Continuous Bandwagon

There’s a lot of hype when it comes to continuous deployment (CD). The fact is that in large organizations, adopting CD takes changes to process, responsibilities, and culture (both technical and management). The right skills really help, but more often the determining factor to success is having the right attitude and vision across the whole team.

(Continuous delivery vs. continuous deployment diagram via Yassal Sundman)

At a carnival, you may have seen a sign that says “you must be this tall to ride”, an indication that the attraction is designed in such a way that it is dangerous to ride for those who don’t meet the specification. Similarly, continuous deployment sets the bar of requirement high, and some teams or products aren’t set up to immediately fit into this new methodology.

Mobile Continuous Delivery Requires Micro-climates

Mobile apps go through a validation process in an app store or marketplace before being generally available to customers, so product feedback loops take a hit from the delay between shipping an update and seeing the market respond to it. Mobile apps also typically rely on back-end infrastructure, which may require a synchronous roll out of both the front-end app and server-side components such as APIs and database schema. This is not trivial, especially for apps with thousands to millions of users.

Because of this delay, there’s huge emphasis on getting mobile app changes right before submitting them for review. Internal and beta testing platforms like TestFlight for iOS and HockeyApp for Android become vital to a successful app roll out and update strategy. For organizations that are used to 3-month release cycles and that control their whole stack, being prepared to release perfection every week requires a completely different mentality, and often a completely different team too.

This is what I call product ‘micro-climates’: an ecosystem of people, processes, tools, and work that evolves independently of the larger organization. Mobile and API teams are perfect examples. A product needs to go at its own pace, accelerating and improving based on its own target audience. Only when organizations align product teams to business goals does this really take hold and become effective.

Prove Your Success, Aim for a Shared Vision

I’ve never seen a Fortune 500 organically evolve to CD without buy-in from a C-level or at least VP. A single group can implement it, but will ultimately run into cultural challenges outside their group (like IT and infrastructure) unless they have the support of someone who controls both groups.

If you’re trying to move in that direction but are hitting barriers outside your team, you may have bitten off too much for now, and need buy-in from above (i.e. an executive sponsor). For that, you need:

  • Proof that what you’re doing is actually improving your velocity
    ‘DevOps’ is a buzzword, but metrics that show how doing kanban/scrum with both teams in the room every day actually matters are not. If you aren’t already capturing these metrics, I’d suggest you start. The point is to have quantifiable, objective measures that undeniably show success.
  • How your success maps to your executive sponsor’s goals
    An executive often balances potential opportunities with opportunity costs. If you’re changing process, what’s the risk to your actual project? How can this be replicated to other teams? What’s at risk if you don’t do this? Why are other competitors doing it this way too? What strategic objectives does this change enable (i.e. faster releases == competitive advantage)? Take a few moments to think about what your sponsor is measured on, and map your goals to theirs.
  • A clear plan and schedule, not just a bunch of activity
    Adding one or two process improvements is one thing (that’s actually our responsibility anyway), but to move to a model like continuous delivery/deployment you need a plan that includes objectives, strategy, and then tactics. For instance:

    • Objective: meet demand for new features, obtain competitive advantage in market
    • Strategy: streamline the delivery process to achieve 1-2 week release cycles
    • Tactics:
      • Continuous integration of code, multiple commits per developer per day
      • Minimum 80% automated test coverage
      • Test coverage over 5 key platforms and 3 geographic markets
      • Automated security reviews before each release (i.e. like this)
      • Traceability of code changes to production user impact metrics

If you’ve been bitten by the CD bug, it’s more than just an itch to scratch. It takes some concerted effort, particularly in large organizations, but don’t let that hinder you. Get your own team on board, find your velocity metrics, link your proposal to executive goals, get that sponsor, and commit to an implementation plan. Others have done it, and so can you.