Saturday, January 22, 2011

Anatomy Of A Continuous Integration System Part Two

No matter what generation you grew up in - Mary Poppins tried to entice you with a spoonful of sugar, Barney used out-of-copyright melodies and hypnotising lyrics, and Dora just remixed what Barney came up with - the message is the same: you have to clean up, and it doesn't have to be painful.

Most people don't get around to writing cleanup routines until long after everything is set up and running (if at all!), but you, the Wise Reader, are going to start with a cleanup routine. Waiting until you have even a basic build compiling before you start cleaning up makes little sense. If you're going to fail, fail early and fail fast. You don't want a binary file that a developer accidentally committed sitting pre-loaded in a project's object folder, giving you a false sense of build success, only to have QA report very odd and hard-to-reproduce errors. Hours or days later you finally track it down to an artifact that was inadvertently added to your source control system from the developer's PC.

There are lots of ways to limit your exposure to a developer (or anyone else) accidentally adding and committing artifacts you don't want or need in source control, and it's fine to spend a little time implementing walls and filters, but they are pretty easily bypassed. A lot of the time they have to be set at the project folder level for each and every project, so they are ripe for being forgotten: when the project is first set up (because the developer isn't going to remember to do it), in the migration of code from one branch to another, or when someone's commit clears the properties of the parent folder. They also don't address rebuild scenarios on your Continuous Integration (CI) implementation. The word "Continuous" in CI literally says each build isn't a one-time event, so every ounce of robustness you can put into your build strategy & implementation is going to pay off quickly. While most CI systems have the option of a clean build each time, I've never seen anyone utilize that option. It adds a lot of time to your builds - especially on your flagship product - when your build servers have to pull everything down fresh each and every time. And if you're a virtualized shop, all that I/O on your build server farm is going to get you the stink-eye from your infrastructure engineers when the SAN starts to get bogged down.

To avoid this, go for the simplest routines. I remove specific folders and do a cleanup from the top of the project down to its furthest corners.
Here's a sample cleanup target block.
<Target Name="MasterClean">
  <Exec Command="BuildCleanup.cmd"/>
  <RemoveDir Directories="%(ProjectList.RelativeDir)bin" />
</Target>
BuildCleanup.cmd is a script that does a recursive delete of specific file types mere moments before the compile starts. I love simple - here's the entire contents of my basic BuildCleanup.cmd -
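@echo off
rem Recursively delete build outputs from the current directory down
del *.dll /s
del *.pdb /s
del *.cache /s
del *.exe /s
rem Bring the local library of precompiled/3rd party software up to date
call CompanyName-LibraryUpdate.cmd
rem Always hand a return code of 0 back to the caller
exit 0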
It does three things:
  1. Deletes the bits you absolutely don't want (dll/pdb/cache/exe).
  2. Makes sure your local library of precompiled/3rd party software is up to date.
  3. Exits with a return code of 0 so nothing calling it mistakes a successful run for a failure.
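
The same three steps could also be expressed with MSBuild's own tasks. This is a minimal sketch and not the script shown above: the target name MasterCleanNative and the item name StaleBinaries are made up for illustration, and it assumes MSBuild 3.5 or later so that the ItemGroup can sit inside the target.

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="MasterCleanNative">
  <Target Name="MasterCleanNative">
    <!-- Step 1: collect the unwanted file types from the .build file's folder down.
         StaleBinaries is a hypothetical item name. -->
    <ItemGroup>
      <StaleBinaries Include="**\*.dll;**\*.pdb;**\*.cache;**\*.exe" />
    </ItemGroup>
    <!-- A file that can't be deleted is logged as a warning instead of failing the build -->
    <Delete Files="@(StaleBinaries)" TreatErrorsAsWarnings="true" />
    <!-- Step 2: refresh the precompiled/3rd party library.
         Step 3: IgnoreExitCode keeps a non-zero exit code from failing the build. -->
    <Exec Command="CompanyName-LibraryUpdate.cmd" IgnoreExitCode="true" />
  </Target>
</Project>
```

Whether this beats a seven-line cmd file is a judgment call; the cmd version has the advantage of being runnable by hand outside a build.
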
The other piece I try to accomplish in my cleanup blocks is to remove specific subdirectories that "need" removing (such as obj and bin folders) if it's a quick hit. Some solutions call 12 project files, and scripting out a removal of each project's obj & bin folder has a low return rate for your effort. At that point, just rely on your cleanup cmd to remove the bits you absolutely need to get rid of.
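
If the projects are already available as an MSBuild item, batching may keep the per-project removal down to a single line. A minimal sketch, assuming a hypothetical ProjectList item built from a wildcard (the real list could just as well be spelled out by hand) and a made-up target name:

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="CleanProjectFolders">
  <ItemGroup>
    <!-- Hypothetical: every C# project under the .build file's folder -->
    <ProjectList Include="**\*.csproj" />
  </ItemGroup>
  <Target Name="CleanProjectFolders">
    <!-- %(ProjectList.RelativeDir) batches the task, so RemoveDir runs
         once for each distinct project folder -->
    <RemoveDir Directories="%(ProjectList.RelativeDir)bin;%(ProjectList.RelativeDir)obj" />
  </Target>
</Project>
```
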
Side note - your starting directory by default in an MSBuild script is the location of the .build file, so you don't even need to feed BuildCleanup.cmd any arguments or pre-load it with fixed directories. It's going to execute at the level of your .build file, which is where you are building out of. You don't need to worry about parent paths because you're not calling any other projects higher than your current level, so a simple script that cleans downhill from your starting point is just what the doctor ordered.
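
If you'd rather not lean on the default, the starting point can be stated outright. A minimal sketch, assuming BuildCleanup.cmd can be found from the .build file's folder or on the path; the Message line is only there to show where the cleanup starts:

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="MasterClean">
  <Target Name="MasterClean">
    <!-- $(MSBuildProjectDirectory) is the folder that holds this .build file -->
    <Message Text="Cleaning downhill from $(MSBuildProjectDirectory)" />
    <!-- Spelling out WorkingDirectory makes the starting point explicit -->
    <Exec Command="BuildCleanup.cmd" WorkingDirectory="$(MSBuildProjectDirectory)" />
  </Target>
</Project>
```
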

This is really good stuff - you want to make each subsequent build regenerate all local code from scratch so that there's no doubt in QA's mind that a new build of XYZ is going to have exactly what the build notes say it has. Letting a dll from a prior build or a different environment creep unintentionally into a compile is literally horrifying to me. QA's tests start showing odd results, people collectively spend a tremendous amount of time trying to reproduce and understand the issue, and when they track it down to an old dll that is getting carried along with each build, I feel like I have personally wasted everyone's time, energy, thought processes, bits of their lives, etc., because I could have prevented it if I had been a little more careful or a little more thorough.

You can do this. If you can do this consistently, eventually development, QA and project management/BAs won't jump to the build being the issue when a defect that was reported as fixed doesn't show up as fixed (or show any change in its behavior) in the deployment it was reportedly fixed in. They will start to believe the builds are rock solid because you've shown them they are (or at least, that they are as good as the input going into them - GIGO).

As a bonus, put all these utility scripts in a folder on your build server - I use C:\BuildTools. Then add C:\BuildTools to the system path and reboot the server. From then on, all builds will be able to call any .cmd file in C:\BuildTools without specifying the path, and you won't have to make multiple copies of your cmd files and put them in the root of all your projects. Setting up that single folder with some simple cleanup & deployment-assisting scripts is part of the short checklist we have for making a Production build server.

We'll leave the clean-up block behind us for a bit, but it will come back for a brief howdy-do when we go over automated deployments.

-Kelly Schoenhofen
