Sunday, November 28, 2010

Anatomy Of A Continuous Integration System Part One

I occasionally give my eleven-year-old a particular piece of advice - "keep your eye on the prize" - and that's a good way to approach continuous integration. The architecture of a well-functioning msbuild script consists of milestones (significant events) tied together in a linear fashion with a clearly defined "prize" result. That's prize, singular. If you find you need something more complex, cut your build up into smaller builds and set up logical relationships between them.
As I said in part zero, I wanted to stay away from philosophy, but I want to talk about that last sentence a little more. Trying to address multiple outcomes/deliverables in a single container approaches a worst practice. You will need to check for and track multiple fatal failure conditions and, conversely, multiple success conditions, and you're going to fail trying. It's not worth it. You can't effectively serve two masters at the same time - listen to what Yoda said in the movies. A successful msbuild script has just one required output; everything else is frosting. Take your uber-complex build monster script, break down the requirements until you have a list of the prizes you need, and silo each one into its own build. Capisce?

Back to our build script. Remember Wikipedia's definition of CI from part zero of this series?
"small pieces of effort, applied frequently"
Identify your prize, break it down to the smallest piece of discrete effort, and write it so you can apply it frequently. That's the magic formula. In fact, 99% of the build scripts I write boil down to the same three-step skeleton: cleanup, build (clean) and deploy (to Test). Step 1, Step 2 and Step 3. Everything else in your build is window dressing, and any failure of those three steps is fatal.
If other factors come into play, such as multiple environmental targets in a single build (deploy to Test, QA, Staging, offsite tape and burn to a DVD gold master), I may pragmatically add an extra piece or two at first, but it doesn't take much to push me into cutting one build into multiple builds and chaining them together in some fashion.
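
One way to chain builds, sketched here on the assumption that the chaining is done in MSBuild itself rather than by the CI server, is a thin parent script that calls each single-prize build in turn with the MSBuild task. The file names deploy-test.build, deploy-qa.build and deploy-staging.build and the target name DeployAll are hypothetical, made up for illustration:

```xml
<?xml version="1.0" encoding="utf-8" ?>
<Project DefaultTargets="DeployAll" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <!-- Hypothetical child scripts: each one is its own small build with one prize. -->
  <Target Name="DeployAll">
    <!-- Each MSBuild task call runs one child build. By default a failed child
         fails this target, so later environments are skipped after a failure. -->
    <MSBuild Projects="deploy-test.build" Targets="Publish" />
    <MSBuild Projects="deploy-qa.build" Targets="Publish" />
    <MSBuild Projects="deploy-staging.build" Targets="Publish" />
  </Target>
</Project>
```

Each child script stays small enough to run and troubleshoot on its own, and the parent holds nothing except the order of the chain.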

The skeleton below is a molecular build script: under normal circumstances you don't want to break it down any further than this, and conversely you don't want to get much more complicated than this. If you're deep in a thorny issue, step back and take a look at your creation. If your msbuild script no longer has a direct, familial relationship with the Clean, Compile, Deploy pseudo-code skeleton below, then you probably made a left turn at Albuquerque along the way and some refactoring needs to occur, perhaps along the lines of creating multiple builds out of your problem build.

 Starter stub skeleton of 99% of my build scripts:


<?xml version="1.0" encoding="utf-8" ?>
<Project DefaultTargets="Publish" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <!-- Step 1: cleanup -->
  <Target Name="MasterClean">
    <!-- Cleanup actions go here -->
  </Target>

  <!-- Step 2: compile; runs MasterClean first -->
  <Target Name="MasterBuild" DependsOnTargets="MasterClean">
    <!-- MSBuild (compile) actions go here -->
  </Target>

  <!-- Step 3: deploy; runs MasterBuild first -->
  <Target Name="Publish" DependsOnTargets="MasterBuild">
    <!-- Publish actions go here -->
  </Target>
</Project>
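
To make the skeleton concrete, here is a minimal filled-in sketch. Everything specific in it is an assumption for illustration, not something from a real project: the solution name MyApp.sln, the output folder, the \\testserver\MyApp share and the notify.cmd script are all hypothetical. The tasks themselves (RemoveDir, MSBuild, Copy, Exec) are standard MSBuild tasks.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<Project DefaultTargets="Publish" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <!-- Hypothetical names and paths; substitute your own. -->
    <SolutionFile>MyApp.sln</SolutionFile>
    <OutputDir>$(MSBuildProjectDirectory)\output</OutputDir>
    <TestShare>\\testserver\MyApp</TestShare>
  </PropertyGroup>

  <!-- Step 1: remove the previous build output. -->
  <Target Name="MasterClean">
    <RemoveDir Directories="$(OutputDir)" />
  </Target>

  <!-- Step 2: clean compile of the solution into the output folder. -->
  <Target Name="MasterBuild" DependsOnTargets="MasterClean">
    <MSBuild Projects="$(SolutionFile)" Targets="Rebuild"
             Properties="Configuration=Release;OutDir=$(OutputDir)\" />
  </Target>

  <!-- Step 3: copy the output to the Test environment, keeping subfolders. -->
  <Target Name="Publish" DependsOnTargets="MasterBuild">
    <ItemGroup>
      <PublishFiles Include="$(OutputDir)\**\*.*" />
    </ItemGroup>
    <Copy SourceFiles="@(PublishFiles)"
          DestinationFiles="@(PublishFiles->'$(TestShare)\%(RecursiveDir)%(Filename)%(Extension)')" />
    <!-- Frosting: with ContinueOnError, a failure of this hypothetical
         notification script is logged as a warning and the build still passes. -->
    <Exec Command="notify.cmd" ContinueOnError="true" />
  </Target>
</Project>
```

The three core steps use the default error handling, so any failure in them stops the build; only the window-dressing step is allowed to fail quietly.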

For troubleshooting purposes, I leave three manual cmd files committed to source control in the root of each project: one to kick off the cleanup, one to kick off the compile (which has a prereq of cleanup) and one to kick off publish (which has a prereq of compile, which has a prereq of cleanup - you get the idea). Here's the meat of the Publish.cmd I use in a .NET 4.0 project, for instance:

%SYSTEMROOT%\Microsoft.NET\Framework\v4.0.30319\msbuild.exe master.build /t:Publish %*

Very occasionally developers get some use out of them, but they are really for me: I use them when I'm initially setting up a build, troubleshooting build issues or build server/CI system issues, or doing maintenance on a build.

-Kelly Schoenhofen

Saturday, November 27, 2010

Anatomy Of A Continuous Integration System Part Zero

Wikipedia puts the definition of software engineering's Continuous Integration (CI) simply as "small pieces of effort, applied frequently", and that's pretty accurate. The various slices of what is packaged up these days as continuous integration have been around for decades, re-invented every few years under various names and methodologies; putting these particular best practices together under the common name of Continuous Integration is a semi-recent branding effort.
Rather than being a flash-in-the-pan methodology, CI has become the underlying platform - the bedrock - that all the trendy methodologies of the last 10 years have been built on, including Extreme Programming (XP), Agile, RUP, Scrum, etc. Unless your methodology is Cowboy Coding - and I mean the classic definition of cowboy coding: a lone developer or a handful of developers acting alone, practicing anarchy for argument's sake ("your rules stifle me!") - I bet your methodology has most or all of the continuous integration basics at the bottom of it.

The industry textbook definition of CI is a handful of best practices:
  • Using a source code repository system
  • Everyone gets latest (code) on a frequent basis
  • Frequent commits
  • Every source code commit generates a build*
  • Build automation
  • Build fast (and fail fast!)
  • Build results should be public
  • Automated unit tests in the build
  • Test environment should mimic Production environment
* To many people, this is all continuous integration is. It's important, and achieving that standard (every commit being built) means you practically have to be doing half of the other CI best practices. It's a good milestone to shoot for when you are converting to CI or building a CI system from scratch, but it's not the destination. Heck, I don't think achieving every facet and aspect of CI means you've reached your destination. CI is just a means - a tool - to make your software production more efficient and productive, with the end result of higher quality at every output.
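
As a rough illustration of "automated unit tests in the build" and "fail fast", here is a minimal msbuild sketch. The runner tools\testrunner.exe and the assembly output\MyApp.Tests.dll are hypothetical placeholders, the target names are illustrative, and the sketch assumes your test runner returns a non-zero exit code when any test fails:

```xml
<?xml version="1.0" encoding="utf-8" ?>
<Project DefaultTargets="MasterTest" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <!-- Hypothetical paths; point these at your own runner and test assembly. -->
    <TestRunner>tools\testrunner.exe</TestRunner>
    <TestAssembly>output\MyApp.Tests.dll</TestAssembly>
  </PropertyGroup>

  <Target Name="MasterBuild">
    <!-- Compile actions go here -->
  </Target>

  <!-- Runs only after a successful compile. -->
  <Target Name="MasterTest" DependsOnTargets="MasterBuild">
    <!-- Exec fails the target when the command exits non-zero, so a failing
         test (under the assumption above) fails the whole build right here. -->
    <Exec Command="&quot;$(TestRunner)&quot; &quot;$(TestAssembly)&quot;" />
  </Target>
</Project>
```

Because the test step is an ordinary target, any later target that depends on it never runs when a test fails.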

All of these best practices can be compared to and validated against the boiled-down idea of "a small iteration - a small complete effort - increases the quality of a larger effort". If a practice doesn't pay huge dividends by the time your project is hitting its stride, let alone its delivery date, then simply remove that practice from your playbook. CI shouldn't stifle or handcuff the developer. Everything about it is about improving the quality of the software being engineered; this is not about a Configuration Manager or System Administrator power tripping.

For the next few blog posts, I want to write up some basic articles on constructing & running a CI environment for a work group or entire shop that develops in .NET but isn't using TFS* (Team Foundation Server). I'll mainly focus on the art of the msbuild file and how your entire CI architecture can and should be seen in the way you craft your msbuild scripts.

* Maybe you don't use TFS because you're the lone group in your company doing .NET and you can't get the budget for a TFS farm, or maybe you are required to use source control system XYZ for regulatory purposes and TFS isn't going to happen for purely technical reasons.

None of the other posts are going to be as wordy or philosophical as this one - I promise they will be short and straight to the point. They are also all works in progress. They are the best practices I have today; not only can't I guarantee they won't evolve a year from now, I certainly hope they will.

-Kelly Schoenhofen