Open letter/discussion: Hotfixes & Patches

I'd like to start some constructive community discussion around patching and hotfixes in response to the latest hotfix announcement earlier this week (12.5.1 hotfix 12) as well as Wednesday's development update. I haven't seen too much of these topics in the general discussions and want to hear how other IT folks' thoughts differ (or not) from mine.

HF12 appears to be MUCH larger than what I've become accustomed to for typical hotfixes. It is not just a SQL script; it requires client files and even updating services with TIM. Why, I wonder, was this not presented instead as a new patch release at version 12.5.2? I am most curious what the network's typical habit (of doing releases only once or twice per year and then continually issuing hotfixes) is modeled after. By comparison, a great many software projects follow, roughly, "Semantic Versioning" (see http://semver.org), where the version number alone can tell you whether a release contains bug fixes, new features, or large-scale non-backwards-compatible changes. Tessitura's pattern of development seems to follow these types of changes, but cumulative bug fixes are never (or very rarely) released as a new version, and I wonder how people feel about this.
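
To illustrate how a version number alone can signal the kind of change, here is a minimal Python sketch of the Semantic Versioning convention (MAJOR.MINOR.PATCH). The function names are hypothetical, and reading 12.5.1 as major 12, minor 5, patch 1 is an assumption made for illustration, not a statement about how the network numbers its releases.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Split a 'MAJOR.MINOR.PATCH' string into three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def classify_bump(old: str, new: str) -> str:
    """Return which semver component changed between two versions.

    'major' -> may contain backwards-incompatible changes
    'minor' -> new, backwards-compatible features
    'patch' -> backwards-compatible bug fixes only
    """
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError("new version must be greater than old version")
    if new_v[0] != old_v[0]:
        return "major"
    if new_v[1] != old_v[1]:
        return "minor"
    return "patch"


if __name__ == "__main__":
    # Hypothetical upgrade paths, for illustration only.
    for target in ("12.5.2", "12.6.0", "13.0.0"):
        print("12.5.1 ->", target, "is a", classify_bump("12.5.1", target), "release")
```

Under this convention, a licensee seeing 12.5.1 -> 12.5.2 could expect bug fixes only, which is the guarantee asked for later in this letter.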

One of the issues I see is that hotfixes can be selectively applied by the licensees being supported by the network. Were I in the network's position, I would see this as a huge problem: as hotfixes increase, you have a geometrically expanding number of possible configurations to support (each hotfix is either installed or not, so N hotfixes allow up to 2^N combinations), whereas with cumulative patches this growth is only linear, and the only option for licensees experiencing an issue with any version is to apply the latest cumulative patch.
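
The arithmetic behind that comparison can be sketched in a few lines of Python. This assumes the worst case, where every hotfix is independent and any subset may be installed; in practice some hotfixes depend on others, so the real number would be lower. The function names are hypothetical.

```python
def selective_configurations(hotfix_count: int) -> int:
    """Each hotfix is either installed or not, so there are 2**n subsets."""
    return 2 ** hotfix_count


def cumulative_configurations(patch_count: int) -> int:
    """A site is on the base release or on exactly one cumulative patch level."""
    return patch_count + 1


if __name__ == "__main__":
    for n in (1, 4, 8, 12):
        print(
            f"{n:2d} fixes: "
            f"{selective_configurations(n):5d} selective configurations vs "
            f"{cumulative_configurations(n):2d} cumulative"
        )
```

With 12 hotfixes, the worst case is 4096 possible configurations to support, against 13 for a cumulative model.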

As a licensee, I would MUCH rather have a single cumulative patch version to update to than to have dozens of different hotfixes to evaluate separately and decide if they are worth my time to install each one. But, I would want to have a good level of confidence in two things: 1. That the changes made in the patch releases have been developed and QA'd to the same standards as any other release, and 2. That patch versions (12.5.x) ONLY contain bug fixes, and no feature changes that would require my organization to do a complete upgrade testing sequence. I think it goes without saying that in this release model, upgrade deadlines would only apply to major versions.

Finally, I recall some discussion from the network around making hotfixes mandatory, or at least easier to install, but also a notion that a hotfix was different from a release because it wasn't held to the same QA standards. I don't know how I would feel about mandatory hotfixes if I knew they weren't being held to the same standards as code for general release. Even in a perfect world, I think hotfixes will continue to exist, but I would like to see a shift from them being an awkward institutional vehicle for delivering patches, to being a temporary-by-design means of getting an organization back on their feet when they can't wait a couple of DAYS for the next patch release.

I have a suspicion that the network's release workflow is a bit more process-intensive than clicking a button. And I imagine that might pose some difficulties for the more frequent releases I am advocating. If that is indeed the case, then I would certainly be speaking for my own organization when I say "Please, spend our money on QA resources, streamlining, automating; whatever it takes to make releases easy and fun and not dreadful and terrifying!" (See also: Why every development team needs continuous delivery) I think it's better for everyone when fixes as well as new features get into our hands when they are ready, and not just when there's a critical mass of stuff to justify an arduous release process.

  • I think this is an excellent topic to bring up with both the Board and MAC.

  • Hey all,

    Thanks for the feedback.  A couple of points I'd like to make.  First on HF12--we certainly recognize that this is bigger than a normal hotfix and there is a reason behind that.  The changes that we made in HF12 were necessitated by our PA-DSS security auditors, who required some changes to meet specifications that were added in the latest version of the PCI standards.  Those changes were pretty pervasive throughout the software as they had to do with our storage of passwords and therefore the way we authenticate those.  Normally a change of that size would be a point release, like 12.5.2.  However (and here is the sticky part), we had mistakenly told our security auditors that we used a 4-part release number x.x.x.x, which meant that any change to any of those numbers would have required that we restart the entire audit process.  And those of you who have been through one of these audits know that that was the last thing we wanted to happen!  So in this case the way around that restriction was to call the change a hotfix. We will make that mistake in our next audit.

    I know that this is really all semantics anyway, but I did want to explain it a bit. In addition, the scope of changes that we had to make in this case caused us to release this as a cumulative hotfix and also to require that it be installed before any future hotfixes can be applied.

    Which leads to my second point.  We are working towards a model of regular cumulative hotfixes (more like Service Packs) that have a regular QA and deployment methodology behind them, with each one a complete software package with all the components.  As you can imagine, there are lots of moving parts behind such an effort, including better automation of installation, which is especially important in the RAMP world where we have to roll these out to a large number of sites.  As Nick points out, the current practice of mixing and matching hotfixes is not easily sustainable from a support point of view, and that is why we are moving away from it.  At the moment I don't have an estimate of when this new model will take effect, but this is a very active project for us right now.

    --chuck

  • +1 for a shift towards more of a cumulative update approach, rather than numerous individual hotfixes. That being said, I do hope the new model still allows for individual hotfixes when a critical issue needs to be dealt with; however, if there are more regular cumulative updates, the need for emergency hotfixes should go down.

    +1 for improvements to the automation of installs. In our environment, we have numerous VMs involved in our Tessitura setup as well as multiple test environments: having to visit each one individually and - on top of that - run a TIM installation for each individual service separately results in a process that takes longer than it should. I would love to get to a point where I can have a central configuration describing my entire Tessitura deployment, then update those components from one central console with full logs captured of the results. 

    +1 to Nick's point of hotfixes not being mandatory. I think that should apply to cumulative updates as well. 

    Great dialog - thanks, Nick, for starting up this conversation!

  • And (hopefully) obviously, I meant to say that we will NOT make that mistake in our next audit!