At its most general, “releasing software” is the process by
which software is delivered from the engineers creating it to its
users. This can take such forms as:
- A boxed product shipped on some kind of physical media
- A downloadable executable on a website
- A mobile application from the official store for your device
- A remotely accessible network service, especially web applications
All releases are managed in some way, but how thorough that
management is varies widely. At one extreme is the software used on
an aircraft or space shuttle. Such software is rigorously tested,
following a strict process. It requires sign-off from many different
people to ensure that quality metrics and functional requirements
have been met. Only after this thorough and well-documented process
is complete will the software touch the actual hardware in flight.
By contrast, if you’ve got a website, you may telnet (or, if
you’re really fancy, SSH) into your web server, use nano to open up
index.php, make some changes, and “release” your software by hitting
save.
The former process is vital when lives and millions of dollars
of equipment are in the balance. However, for run-of-the-mill
software projects, such a high overhead would likely be daunting
and overly costly. The latter process (if we can even call it that)
is just barely sufficient for managing the most amateur of
projects, but introduces substantial risk of security holes,
downtime, and lost work, which on almost all projects will be
unacceptable.
Understanding the trade-offs in release management and the
requirements of your project is an important part of designing its
release process. As you’ve likely noticed, the trade-off here is
between the cost of high quality and its benefit. In addition to
navigating the overall trade-off, this post will give some advice on
tooling and approaches that reduce the overall cost of good process,
making it possible, to some extent, to have your cake and eat it
too.
Release Management Process Flow
Taking the broadest view, release management is a process that
starts at the requirements phase, and lasts until a successful
release. As a motivating example, we’ll use the case of adding some
basic functionality to an existing website. The process looks
something like this:
1. Define the requirements for the upcoming release. In our example:
   add a visitor counter to the homepage (we’re going retro). The
   visitor counter must work on both desktop browsers and mobile
   devices.
2. State some concrete acceptance criteria. These will be testable
   assertions based on the requirements. They may take the form of
   automated test cases (perhaps using Selenium for our example) or a
   manual test script to be run through. They would include details
   like:
   - Counter displays in browser
   - Refreshing the page increments the visitor count
   - Counter displays on mobile device
3. Deploy to a staging server. Such deployments can be triggered by
   pushes to a special branch or by manual intervention. The goals
   should be:
   - The staging environment is as close as possible to the
     production environment, to minimize surprises when switching to
     production.
   - Deployments are a cheap activity, requiring as little human
     involvement as possible.
4. Quality Assurance (QA) reviews the staging environment to see if
   it meets the acceptance criteria. QA should test not only the new
   features but also all existing features, to avoid regressions.
   Once again, automated is better than manual, but some manual
   testing may be necessary. Don’t just perform success tests;
   include failure tests, randomized tests, and much more in your
   test scripts.
5. After proper sign-off from QA, deploy the new software to
   production. Once again, devops comes into play, and an ideal
   deployment will include these ideas:
   - Use the same binary artifacts on production as were used on the
     staging system. Containerization helps make this easy by
     packaging up all of your artifacts into a single image.
   - Minimize (or eliminate) downtime by performing a blue/green
     deployment. Update some of the machines in your cluster to the
     new image, ensure they are responsive, and after a critical mass
     is up and running, atomically switch your load balancer to point
     at the machines running the new image. Kubernetes’s deployment
     concept can help with this (though you’ll still need to consider
     shared mutable resources, like how to update a database schema).
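The acceptance criteria from step 2, together with the success, failure, and randomized tests mentioned in step 4, can be sketched as automated checks. This is only a minimal illustration in Python, not the Selenium-based suite a real project would likely use: `VisitorCounter` and `render_homepage` are hypothetical stand-ins for the real site, and the user-agent strings are made up for the example.

```python
import random


class VisitorCounter:
    """Hypothetical in-memory stand-in for the site's visitor counter."""

    def __init__(self, start: int = 0) -> None:
        if start < 0:
            raise ValueError("visitor count cannot be negative")
        self.count = start

    def visit(self) -> int:
        """Record one page view and return the updated count."""
        self.count += 1
        return self.count


def render_homepage(counter: VisitorCounter, user_agent: str) -> str:
    """Hypothetical homepage renderer: every request counts as a visit."""
    layout = "mobile" if "Mobile" in user_agent else "desktop"
    return f'<body class="{layout}"><span id="visitors">{counter.visit()}</span></body>'


def test_counter_displays_in_browser() -> None:
    page = render_homepage(VisitorCounter(), "ExampleDesktopBrowser/1.0")
    assert '<span id="visitors">1</span>' in page


def test_refresh_increments_count() -> None:
    counter = VisitorCounter()
    render_homepage(counter, "ExampleDesktopBrowser/1.0")
    refreshed = render_homepage(counter, "ExampleDesktopBrowser/1.0")
    assert '<span id="visitors">2</span>' in refreshed


def test_counter_displays_on_mobile() -> None:
    page = render_homepage(VisitorCounter(), "ExampleBrowser/1.0 Mobile")
    assert 'class="mobile"' in page and 'id="visitors"' in page


def test_negative_start_is_rejected() -> None:
    # Failure test: invalid input must be refused, not silently accepted.
    try:
        VisitorCounter(start=-1)
    except ValueError:
        return
    raise AssertionError("expected ValueError for a negative start")


def test_random_visit_sequences() -> None:
    # Randomized test: n page loads from any start always end at start + n.
    rng = random.Random(0)  # fixed seed keeps the run reproducible
    for _ in range(100):
        start, visits = rng.randint(0, 1000), rng.randint(0, 50)
        counter = VisitorCounter(start)
        for _ in range(visits):
            render_homepage(counter, "ExampleDesktopBrowser/1.0")
        assert counter.count == start + visits


if __name__ == "__main__":
    for test in (
        test_counter_displays_in_browser,
        test_refresh_increments_count,
        test_counter_displays_on_mobile,
        test_negative_start_is_rejected,
        test_random_visit_sequences,
    ):
        test()
    print("all acceptance checks passed")
```

Each function maps to one criterion, so a failing check points directly at the requirement that was missed. In a real project the same assertions would run against the staging deployment rather than an in-memory stand-in.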
As you can see from these steps, we’re defining release
management to be a broad role. The release management objectives in
this sense are to ensure that the code that ends up being released
to users properly implements the new requirements, without
introducing new regressions. This is a process that encompasses
devops, QA, and software development. Importantly:
No one team can be solely responsible for a successful release.
You may (rightfully) point out that not all of the steps above
are present in the release management of a lot of software in the
world. What I’m describing is a minimal set of steps that maximizes
the benefit of a process while minimizing its costs, for projects
of common size and common quality requirements. If you’re throwing
together a static homepage and don’t care if the layout sometimes
breaks, this process is likely overkill. If you’re NASA, it’s far
too little.
Development
You may have noticed that development is conspicuously missing
from the process above. In practice, it occurs between steps 2 and
3 above. I’ve left it out of this list, however, as it’s not truly
part of release management. The development process is external to
release management and provides its input (source code to build,
test, and deploy), in much the same way as business decision
activities are external to release management, interacting with the
process only by ultimately defining clear requirements.
Roles and Responsibilities
Some roles naturally fall out of the steps listed above.
The product owner is responsible for defining
the requirements for a release.
The quality manager is responsible for
ensuring that the acceptance criteria accurately represent the
requirements. It should be impossible for a piece of code to pass
the acceptance criteria but leave the product owner unhappy with
the result. The quality manager is also responsible for ensuring
that tests are both properly and efficiently run, avoiding false
positives and false negatives, and giving the development team
accurate feedback on outstanding work items.
The devops team is responsible for maintaining
a continuous integration/continuous deployment setup that allows
the development team to iterate rapidly, and the quality team
to have confidence that their tests on staging will carry through
to production. The team is also responsible for providing a
high-availability production deployment that keeps downtime minimal
during new deployments.
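The minimal-downtime deployment that the devops team is responsible for can be sketched as a blue/green switch. The Python model below is an illustration only: `Environment`, `LoadBalancer`, and the `healthy` flag are hypothetical names, not the API of Kubernetes or of any real load balancer. It shows the core idea that traffic moves to the new image only after the new environment passes its health check, and that the move is a single reference swap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Hypothetical group of machines all running one image."""
    name: str
    image: str
    healthy: bool  # stands in for a real health check against the machines


class LoadBalancer:
    """Hypothetical load balancer that routes all traffic to one environment."""

    def __init__(self, live: Environment) -> None:
        self.live = live

    def switch_to(self, candidate: Environment) -> bool:
        """Point traffic at `candidate` only if it is healthy.

        Returns True when the switch happened. On failure the old
        environment keeps serving traffic, so users see no downtime.
        """
        if not candidate.healthy:
            return False
        self.live = candidate  # one assignment: the switch is all-or-nothing
        return True


if __name__ == "__main__":
    blue = Environment("blue", "website:1.0", healthy=True)
    balancer = LoadBalancer(blue)

    # An unhealthy candidate is refused and blue keeps serving.
    broken = Environment("green", "website:1.1-broken", healthy=False)
    print(balancer.switch_to(broken), balancer.live.image)

    # A healthy candidate takes over in a single step.
    green = Environment("green", "website:1.1", healthy=True)
    print(balancer.switch_to(green), balancer.live.image)
```

Keeping the old environment around after the switch also gives a cheap rollback path: switching back is the same operation in the other direction. Shared mutable resources such as a database schema are outside this model and still need separate planning.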
Software Release Management Best Practices
The process above leads to some natural best practices.
Automate as much as possible, on both the
devops and testing sides. Automation cuts down on the human cost,
allowing more rapid iteration, and reduces the chance of false
positives and negatives (computers tend to make fewer mistakes than
humans).
Have clear requirements, and from these make
testable acceptance criteria. There should be no
ambiguity about whether the software is ready to ship.
Minimize user impact of a release by minimizing or
eliminating downtime and testing for
regressions before release.
Make things immutable wherever possible.
Instead of modifying the configuration of an existing machine,
deploy a complete image that contains all of the configuration.
This prevents bugs that appear due to an unexpected series of
actions. (It’s probably not surprising for a functional programming
team to recommend immutability.)
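The immutability practice can be illustrated in miniature. In this Python sketch, `Image` is a hypothetical frozen record standing in for a deployable image such as a Docker image; the field names and the `with_config` helper are assumptions made for the example. Changing configuration produces a new image value rather than editing the one already deployed, so whatever was tested on staging cannot drift afterwards.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Tuple


@dataclass(frozen=True)
class Image:
    """Hypothetical deployable image: code version plus baked-in configuration."""
    version: str
    config: Tuple[Tuple[str, str], ...] = ()  # sorted (key, value) pairs

    def with_config(self, key: str, value: str) -> "Image":
        """Return a NEW image with one setting changed; self is untouched."""
        settings = dict(self.config)
        settings[key] = value
        return Image(self.version, tuple(sorted(settings.items())))


if __name__ == "__main__":
    staging = Image("1.1").with_config("db_pool_size", "10")

    try:
        staging.version = "1.2"  # in-place edits are rejected
    except FrozenInstanceError:
        print("cannot modify a built image")

    tuned = staging.with_config("db_pool_size", "20")
    print(dict(staging.config))  # the tested image is unchanged
    print(dict(tuned.config))    # the change lives in a separate image
```

The configuration is stored as a tuple of pairs rather than a dictionary because a frozen dataclass only blocks attribute assignment; a dictionary field could still be mutated in place.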
Terraform allows a devops engineer to
declaratively state what their cloud architecture should look like,
and takes care of the mutable API calls behind the
scenes.
Kubernetes manages the deployment of your
Docker images across a cluster of machines, and helps avoid
downtime with its rolling deployment system.
Selenium and other automated testing tools are
invaluable for a quality team. The exact tool will depend
substantially on the problem domain being faced. Selenium, for
example, is great for front-end web development, but does nothing
to help with stress testing a server. Don’t forget to use your
language’s own functionality for testing.
A continuous integration system with in-repo
configuration. We’ve had a lot of success with
GitLab, though other tools like Jenkins are
certainly options as well.
Prometheus provides monitoring and alerting throughout the release
process, giving invaluable insight into the running system in case
something goes wrong.
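The declarative approach described for Terraform can be illustrated with a toy reconciliation step: compare the desired state with the actual state and derive the mutating actions from the difference. This Python sketch is not Terraform’s algorithm or API; the `plan` function, the resource names, and the action tuples are all hypothetical and only show the general idea.

```python
from typing import Dict, List, Tuple

State = Dict[str, str]    # resource name -> configuration (simplified to a string)
Action = Tuple[str, str]  # (verb, resource name)


def plan(desired: State, actual: State) -> List[Action]:
    """Derive the actions that turn `actual` into `desired`."""
    actions: List[Action] = []
    # Resources that are declared but do not exist yet.
    for name in sorted(desired.keys() - actual.keys()):
        actions.append(("create", name))
    # Resources that exist but whose configuration differs.
    for name in sorted(desired.keys() & actual.keys()):
        if desired[name] != actual[name]:
            actions.append(("update", name))
    # Resources that exist but are no longer declared.
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(("delete", name))
    return actions


if __name__ == "__main__":
    actual = {"web-server": "small", "old-cache": "small"}
    desired = {"web-server": "large", "load-balancer": "standard"}
    for verb, name in plan(desired, actual):
        print(verb, name)
    # prints:
    # create load-balancer
    # update web-server
    # delete old-cache
```

The engineer only ever edits `desired`. When the two states already match, the plan is empty, which is what makes repeated runs safe.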
If you’d like to learn more about any of these topics, FP
Complete offers training and support on a
wide range of devops and software development topics.