a little madness

A man needs a little madness, or else he never dares cut the rope and be free. -Nikos Kazantzakis

Zutubi

Archive for the ‘Continuous Integration’ Category

Article: Optimise Your Acceptance Tests

In a similar vein to my previous post, I’ve revived some old posts about acceptance testing — and made significant additions. The end result is a new article:

Many developers have a love-hate relationship with automated acceptance tests. One major sticking point is the time acceptance tests take to execute, which can easily cause a blow out of project build times. In this article I’ll review 7 successful techniques we’ve put to work in our own projects to optimise our acceptance testing.

You can read the full article at zutubi.com.

Pulse 2.1.11: Get More From Your Build Agents

The latest Pulse 2.1 beta build, 2.1.11, has just been freshly baked. This build includes several new features and improvements. Prominent among them is a new “statistics” tab for agents. This tab lists various figures such as the number of recipes the agent executes each day and how long the average recipe keeps the agent busy. Statistics are also shown for agent utilisation, including a pie chart that makes it easy to visualise:

[Image: agent utilisation chart]

This allows you to see if you are getting the most out of your agent machines. If you do notice a machine is underutilised, another new feature could help identify the cause: compatibility information for projects and agents. Pulse matches builds to agents by considering if the resources required for the project are all available on the agent. Now when you configure requirements, Pulse shows you which agents those requirements are compatible with. On the flip side, when configuring an agent’s available resources, Pulse shows you which projects those resources satisfy.
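To make the matching rule concrete, here is a minimal sketch of the idea in Python. It is not Pulse's implementation: the function names, the agent and project names, and the modelling of resources as plain strings are all assumptions made for illustration. An agent is treated as compatible when every required resource appears in its set of available resources.

```python
from typing import Dict, List, Set


def compatible_agents(required: Set[str], agents: Dict[str, Set[str]]) -> List[str]:
    """Return the agents whose available resources cover every requirement."""
    return sorted(name for name, available in agents.items() if required <= available)


def satisfied_projects(available: Set[str], projects: Dict[str, Set[str]]) -> List[str]:
    """Return the projects whose requirements are all met by one agent's resources."""
    return sorted(name for name, required in projects.items() if required <= available)


# Hypothetical data: two agents and the resources configured on each.
EXAMPLE_AGENTS = {
    "linux-1": {"ant", "java", "git"},
    "win-1": {"java", "nant"},
}

if __name__ == "__main__":
    print(compatible_agents({"ant", "java"}, EXAMPLE_AGENTS))
```

The same subset check answers both questions from the post: which agents can build a project, and which projects an agent can build.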

Other highlights in this build:

  • Optional compression of large build logs (on by default).
  • Visual indicators of which users are logged in, and last access times for all users.
  • Support for Subversion 1.6 working copies for personal builds.
  • Actions can now be performed on all descendants of a project or agent template (e.g. disable all agents with one click).
  • New options to terminate a build early if a critical stage or number of stages have already failed.
  • The system/agent info tabs now show the Pulse process environment (visible to administrators only).
  • Use of bare git repositories on the Pulse master to save disk space.

Yes, we have been busy :) Get over to our website and download the beta now: it’s free to try, and a free upgrade for customers with current support contracts!

Boost.Test XML Reports with Boost.Build

My previous post Using Boost.Test with Boost.Build illustrated how to build and run Boost.Test tests with the Boost.Build build system. For my own purposes I wanted to take this one step further by integrating Boost.Test results with continuous integration builds in Pulse.

To do this, I needed to get Boost.Test to produce XML output, at the right level of detail, which can be read by Pulse. This is another topic I have covered to some extent before: the key part is to pass the arguments "--log_format=XML --log_level=test_suite" to the Boost.Test binaries. The missing link is how to achieve this using Boost.Build’s run task. Recall that the syntax for the run task is as follows:

rule run (
sources + :
args * :
input-files * :
requirements * :
target-name ? :
default-build * )

Notice in particular that you can pass arguments just after the sources. So I updated my Jamfile to the following:

using testing ;
lib boost_unit_test_framework ;
run NumberTest.cpp /libs/number//number boost_unit_test_framework
: --log_format=XML --log_level=test_suite
;

and lo, the test output was now in XML format:

$ cat bin/NumberTest.test/gcc-4.4.1/debug/NumberTest.output
<TestLog><TestSuite name="Number"><TestSuite name="NumberSuite"><TestCase name="checkPass"><TestingTime>0</TestingTime></TestCase><TestCase name="checkFailure"><Error file="NumberTest.cpp" line="15">check Number(2).add(2) == Number(5) failed [4 != 5]</Error><TestingTime>0</TestingTime></TestCase></TestSuite></TestSuite></TestLog>
*** 1 failure detected in test suite "Number"

EXIT STATUS: 201

The output will not exactly win awards: it has no <?xml …?> declaration, no formatting, and, thanks to Boost.Test, it contains trailing junk after the closing tag. We’ve made sure that the processing in Pulse 2.1 takes care of this, though.
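If you want to read such a file outside Pulse, one way to cope with the trailing junk is to cut out everything between the opening and closing TestLog tags before parsing. The following Python sketch does that using only the standard library. It is an illustration, not the processing Pulse performs; the function names are made up, and it only looks for the Error element seen in the output above, so other failure elements Boost.Test may emit are not handled.

```python
import xml.etree.ElementTree as ET
from typing import List

# Sample taken from the output shown above, including the trailing lines.
SAMPLE_OUTPUT = (
    '<TestLog><TestSuite name="Number"><TestSuite name="NumberSuite">'
    '<TestCase name="checkPass"><TestingTime>0</TestingTime></TestCase>'
    '<TestCase name="checkFailure"><Error file="NumberTest.cpp" line="15">'
    'check Number(2).add(2) == Number(5) failed [4 != 5]</Error>'
    '<TestingTime>0</TestingTime></TestCase></TestSuite></TestSuite></TestLog>\n'
    '*** 1 failure detected in test suite "Number"\n'
    '\n'
    'EXIT STATUS: 201\n'
)


def parse_test_log(text: str) -> ET.Element:
    """Extract the <TestLog> element, ignoring anything before or after it."""
    open_tag, close_tag = "<TestLog>", "</TestLog>"
    start = text.find(open_tag)
    end = text.rfind(close_tag)
    if start == -1 or end == -1:
        raise ValueError("no complete <TestLog> element found")
    return ET.fromstring(text[start:end + len(close_tag)])


def failed_cases(root: ET.Element) -> List[str]:
    """Names of test cases that contain at least one <Error> child."""
    return [case.get("name") for case in root.iter("TestCase")
            if case.find("Error") is not None]


if __name__ == "__main__":
    print(failed_cases(parse_test_log(SAMPLE_OUTPUT)))
```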

If you are a Pulse user looking to integrate Pulse and Boost.Test, you might also be interested in a new Cookbook article that I’ve written up on this topic.

Fencing Selenium With Xephyr

Earlier in the year I put Selenium in a cage using Xnest. This allows me to run browser-popping tests in the background without disturbing my desktop or (crucially) stealing my focus.

On that post Rohan stopped by to mention a nice alternative to Xnest: Xephyr. As the Xephyr homepage will tell you:

Xephyr is a kdrive based X Server which targets a window on a host X Server as its framebuffer. Unlike Xnest it supports modern X extensions ( even if host server doesn’t ) such as Composite, Damage, randr etc (no GLX support now). It uses SHM Images and shadow framebuffer updates to provide good performance. It also has a visual debugging mode for observing screen updates.

It sounded sweet, but I hadn’t tried it out until recently, on a newer box where I didn’t already have Xnest set up. The good news is the setup is as simple as with Xnest in my prior post:

  1. Install Xephyr: which runs an X server inside a window:
    $ sudo apt-get install xserver-xephyr
  2. Install a simple window manager: again, for old times’ sake, I’ve gone for fvwm:
    $ sudo apt-get install fvwm
  3. Start Xephyr: choose an unused display number (most standard setups will already be using 0) — I chose 1. As with Xnest, the -ac flag turns off access control, which you might want to be more careful about. My choice of window size is largely arbitrary:
    $ Xephyr :1 -screen 1024x768 -ac &
  4. Set DISPLAY: so that subsequent X programs connect to Xephyr, you need to set the environment variable DISPLAY to whatever you passed as the first argument to Xephyr above:
    $ export DISPLAY=:1
  5. Start your window manager: to manage windows in your nested X instance:
    $ fvwm &
  6. Run your tests: however you normally would:
    $ ant accept.master

Then just sit back and watch the browsers launched by Selenium trapped in the Xephyr window. Let’s see them take your focus now!
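If you run this often, steps 3 to 6 can be wrapped in a small script. Here is a rough Python sketch under some assumptions: Xephyr and fvwm are already installed and on the PATH, the function names are made up, and the two-second pause is a crude guess at how long the nested server needs to start, not a proper readiness check.

```python
import os
import subprocess
import time
from typing import Dict, List, Optional


def xephyr_command(display: int, width: int = 1024, height: int = 768) -> List[str]:
    """Build the Xephyr command line from step 3 (-ac disables access control)."""
    return ["Xephyr", f":{display}", "-screen", f"{width}x{height}", "-ac"]


def nested_env(display: int, base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and point DISPLAY at the nested server (step 4)."""
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = f":{display}"
    return env


def run_in_xephyr(test_command: List[str], display: int = 1) -> int:
    """Start Xephyr and fvwm, run the tests inside them, then tidy up."""
    xephyr = subprocess.Popen(xephyr_command(display))
    try:
        time.sleep(2)  # assumption: enough time for the nested server to come up
        env = nested_env(display)
        window_manager = subprocess.Popen(["fvwm"], env=env)
        try:
            # Returns the exit code of the test command.
            return subprocess.call(test_command, env=env)
        finally:
            window_manager.terminate()
            window_manager.wait()
    finally:
        xephyr.terminate()
        xephyr.wait()
```

Calling run_in_xephyr(["ant", "accept.master"]) would then reproduce the manual steps above, with the added benefit that the nested server is shut down when the tests finish.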

Pulse 2.1 Beta Rolls On

We’ve reached another significant milestone in the Pulse 2.1 beta: the release of 2.1.9. This latest build rolls up a stack of fixes, improvements and new features. Some of the much-anticipated improvements include:

  • Support for NAnt in the form of a command and post-processor.
  • Support for reading NUnit XML reports.
  • Support for reading QTestlib XML reports.
  • The ability to mark unstable tests as “expected” failures: they still look ugly (so fix them!) but won’t fail your build.
  • Better visibility of what is currently building on an agent.
  • New refactoring actions to “pull up” and “push down” configuration in the template hierarchy.
  • The ability to specify Perforce client views directly in Pulse.

I’ll expand upon some of these in later posts. In addition we’ve made great progress on the new project dependencies support, which should be both easier to use and more reliable in this build.

We’d love you to download Pulse 2.1 and let us know what you think!

Are Temp Files Slowing Your Builds Down?

Lately one of our Pulse agents has been bogged down, to the extent that some of our heavier acceptance tests started to genuinely time out. Tests failing due to environmental factors can lead to homicidal mania, so I’ve been trying to diagnose what is going on before someone gets hurt!

The box in question runs Windows Vista, and I noticed while poking around that some disk operations were very slow. In fact, deleting even a handful of files via Explorer took so long that I gave up (we’re talking hours here). About this time I fired up the Reliability and Performance Manager that comes with Vista (Control Panel > Administrative Tools). I noticed that there was constant disk activity, and a lot of it centered around C:\$MFT — the NTFS Master File Table.

I had already pared back the background tasks on this machine: the Recycle Bin was disabled, Search Indexing was turned off and Defrag ran on a regular schedule. So why was my file system so dog slow? The answer came when I looked into the AppData\Local\Temp directory for the user running the Pulse agent. The directory was filled with tens of thousands of entries, many of which were directories that themselves contained many files.

The junk that had built up in this directory was quite astounding. Although some of it can be explained by tests that don’t clean up after themselves, I believe a lot of the junk came from tests that had to be killed forcefully without time to clean up. It was also evident that every second component we were using was part of the problem: Selenium, JFreeChart, JNA, Ant and Ivy all joined the party.

So, how to resolve this? Of course any tests that don’t clean up after themselves should be fixed. But in reality this won’t always work — especially given the fact that Windows will not allow open files to be deleted. So the practical solution is to regularly clean out the temporary directory. In fact, it’s quite easy to set up a little Pulse project that will do just that, and let Pulse do the work of scheduling it via a cron trigger. With Pulse in control of the scheduling there is no risk the cleanup will overlap with another build.
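The cleanup itself can be a very small script. The following Python sketch is one possible shape for it, not the project we actually run: the function name and the one-day age threshold are assumptions. It removes top-level entries older than the threshold and simply skips anything that cannot be deleted, which covers the open-file case on Windows.

```python
import os
import shutil
import time
from typing import Optional, Tuple


def clean_temp(directory: str, max_age_days: float = 1.0,
               now: Optional[float] = None) -> Tuple[int, int]:
    """Delete top-level entries older than max_age_days.

    Returns (removed, skipped). Entries that raise OSError, for example
    files still held open by another process, are counted as skipped.
    """
    current = time.time() if now is None else now
    cutoff = current - max_age_days * 86400
    removed = skipped = 0
    # Take a snapshot of the listing so deletions do not disturb the iteration.
    with os.scandir(directory) as listing:
        entries = list(listing)
    for entry in entries:
        try:
            if entry.stat(follow_symlinks=False).st_mtime >= cutoff:
                continue  # recent enough to belong to a running build
            if entry.is_dir(follow_symlinks=False):
                shutil.rmtree(entry.path)
            else:
                os.remove(entry.path)
            removed += 1
        except OSError:
            skipped += 1
    return removed, skipped
```

On the agent this could be invoked as clean_temp(tempfile.gettempdir()) from the scheduled project; the age threshold is there so that a slow-starting build's fresh files are left alone.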

A more general solution is to start with a guaranteed-clean environment in the first place. After all, acceptance tests have a habit of messing with a machine in other ways too. Re-imaging the machine after each build, or using a virtual machine that can be restored to a clean state, is a more reliable way to avoid the junk. Pulse is actually designed to allow reimaging/rebooting of agents to be done in a post-stage hook — the agent management code on the master allows for agents to go offline at this point, and not try to reuse them until their status can be confirmed by a later ping.

CITCON Paris 2009: Mocks, CI Servers and Acceptance Testing

Following up on my previous post about CITCON Paris, I thought I’d post a few points about each of the other sessions I attended.

Mock Objects

I went along to this session as a chance to hear about mock objects from the perspective of someone involved in their development, Steve Freeman. If you’ve read my Four Simple Rules for Mocking, you’ll know I’m not too keen on setting expectations, or even on verification. I mainly use mocking libraries for stubbing. Martin Fowler’s article Mocks Aren’t Stubs had made me think that Steve would hold the opposite view:

The classical TDD style is to use real objects if possible and a double if it’s awkward to use the real thing. So a classical TDDer would use a real warehouse and a double for the mail service. The kind of double doesn’t really matter that much.

A mockist TDD practitioner, however, will always use a mock for any object with interesting behavior. In this case for both the warehouse and the mail service.

So my biggest takeaway from this topic was that Steve’s view was more balanced and pragmatic than Fowler’s quote suggests. At a high level he explained well how his approach to design and implementation leads to the use of expectations in his tests. I still have my reservations, but was convinced that I should at least take a look at Steve’s new book (which is free online, so I can try a chapter or two before opting for a dead tree version).

A few more concrete pointers can be found in the session notes. A key one for me is to not mock what you don’t own, but to define your own interfaces for interacting with external systems (and then mock those interfaces).
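As a small illustration of that last pointer, here is a Python sketch that borrows the mail service from Fowler's example. All the names (MailService, OrderProcessor, fill) are hypothetical and not taken from the session notes. The code under test depends on an interface we define ourselves; a real implementation would be a thin adapter around whatever third-party mail library is in use, and only our own interface is ever mocked.

```python
from typing import Protocol
from unittest.mock import Mock


class MailService(Protocol):
    """Our own interface for sending mail; we own it, so we can mock it."""

    def send(self, to: str, subject: str) -> None:
        ...


class OrderProcessor:
    """Code under test: talks to MailService, never to a mail library directly."""

    def __init__(self, mail: MailService) -> None:
        self._mail = mail

    def fill(self, customer: str, item: str, in_stock: bool) -> bool:
        if not in_stock:
            self._mail.send(customer, f"{item} is out of stock")
            return False
        return True


def example_test() -> None:
    # The mock stands in for our interface, not for any third-party class.
    mail = Mock(spec_set=["send"])
    processor = OrderProcessor(mail)
    assert processor.fill("alice@example.com", "widget", in_stock=False) is False
    mail.send.assert_called_once_with("alice@example.com", "widget is out of stock")


if __name__ == "__main__":
    example_test()
```

The adapter that implements MailService against the real library is then covered by a handful of integration tests rather than by mocks.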

The Future of CI Servers

I wasn’t too keen on this topic, but since it is my business, I felt compelled. I actually proposed a similar topic at my first CITCON back in Sydney and found it a disappointing session then, so my expectations were low. Apart from the less interesting probing of features on the market already, conversation did wander onto the more interesting challenge of scaling development teams.

The agile movement recognises the two main challenges (and opportunities) in software development are people and change. So it was interesting to hear this recast as wanting to return to our “hacker roots” — where we could code away in a room without the challenges of communication, integration and so on. Ideas such as using information radiators to bring a “small team” feel to large and/or distributed teams were mentioned. A less tangible thought was some kind of frequent but subtle feedback of potential integration issues. Most of the time you could code away happily, but in the background your tools would be constantly keeping an eye out for potential problems. What I like about this is the subtlety angle: given the benefits it’s easy to think that more feedback is always better, without thinking of the cost (e.g. interruption of flow).

Acceptance Testing

This year it seemed like every other session involved acceptance testing somehow. Not terribly surprising I guess, since it is a very challenging area both technically and culturally. As I missed most of these sessions, they are probably better captured by other posts.

One idea I would call attention to is growing a custom, targeted solution for your project. I believe it was Steve Freeman that drew attention to an example in the Eclipse MyFoundation Portal project. If you drill down you can see use cases represented in a custom swim lane layout.

Water Cooler Discussions

Of course a great aspect of the conference is the random discussions you fall into with other attendees. One particular discussion (with JtF) has given me a much-needed kick up the backside. We were talking about the problems with trying to use acceptance tests to make up for a lack of unit testing. This is a tempting approach on projects that don’t have a testable design and infrastructure in place — it’s just easier to start throwing tests on top of your external UI.

Even though I knew all the drawbacks of this approach, I had to confess that this is essentially what has happened with the JavaScript code in Pulse. We started adding AJAX to the Pulse UI in bits and pieces without putting the infrastructure in place to test this code in isolation. Fast forward to today and we have a considerable amount of JavaScript code which is primarily tested via Selenium. So we’re now going to get serious about unit testing this code, which will simultaneously improve our coverage and reduce our build times.

Conclusion

To wrap up, after returning from Paris I plan to:

  1. Give expectations a fair hearing, by reading Steve’s book.
  2. Look for ways to improve our own information radiators to help connect Zutubi Sydney and London.
  3. Get serious about unit testing our JavaScript code.
  4. Get PJ and JtF to swap the dates for CITCON Asia/Pacific and Europe next year so I can get to both instead of neither! ;)

If I succeed at 4 (sadly not likely!) then I’ll certainly be back next year!

CITCON Paris 2009

As mentioned, Daniel and I both attended CITCON Paris the weekend before last. I’ve not had a chance to post a follow-up yet as we also took the opportunity to eat the legs off every duck in France (well, we tried).

Firstly a huge thanks to PJ, Jeff, Eric and all the other volunteers for another great conference. Thanks again to Eric and Guillaume for acting as local guides on Saturday night. As always, the open spaces format and mix of attendees delivered a great day. It was also great to see a few familiar faces from the year before in Amsterdam (and a familiar shirt thanks to Ivan :) ).

This year I proposed and facilitated a single topic: Distributed SCM in the Corporate World. I finally added a full write-up on the conference wiki earlier in the week for those who are interested. For the impatient, here are my take-aways from the session:

  1. Of the distributed SCMs, there is not much traction in the corporate world just yet, although git appears to have gained a foothold. (Obviously our sample size is small, but I also expect CITCON attendees to be closer to the edge than the average team.)
  2. Where distributed SCMs are used, the topology is still like the centralised model. However, the ability to easily clone and move changes between repositories presents opportunities to work around issues like painful networks (contrast this to special proxy servers which are needed in similar scenarios with centralised SCMs).
  3. The people using git liked it primarily for its more flexible workflow and better merging. It’s conceivable to have this in the centralised model too, but no single centralised contender was mentioned.
  4. So far the use of distributed SCMs didn’t seem to have practical implications for CI – probably due to the use of a centralised topology.

Looks like we’re still waiting to see more creative use of distributed SCMs in corporate projects – perhaps it is something worth revisiting in future conferences. I hope to post on some of the other sessions I attended at a later date.

Zutubi @ CITCON Paris 2009

Any excuse is good enough to get me to Paris, especially while it is only a train ride away. Daniel has actually been tempted all the way from Sydney! [1] So you’ll find us both at CITCON Europe 2009 tomorrow night and Saturday. We’re both looking forward to a great weekend, after nothing but positive experiences at previous events. Hopefully we’ll even get a few questions about the new Pulse 2.1 Beta while we’re there!

[1] Although combining it with a well-deserved holiday may have been a factor…

Pulse Continuous Integration Server 2.1 Beta!

Exciting news: today we’ve pushed the latest version of Pulse, namely 2.1, to beta! This is the culmination of months of hard work on a ton of new features and improvements, including:

  • Project dependency support.
  • Easier multi-command projects.
  • Personal build improvements.
  • Fine-grained cleanup rules.
  • Built-in reference documentation.
  • Pluggable commands (build tool support).
  • A simpler, faster configuration UI.

The new features are described in more detail on the 2.1 beta page. The largest are the first two: dependencies and multi-command projects.

Project Dependencies

The ability to deliver artifacts from one build to another is a long-standing feature request. Pulse 2.1 supports this as part of a larger dependencies feature. Essentially you can declare one project to be dependent on another, allowing the downstream project to use artifacts built by the upstream one. Artifacts are delivered through an internal artifact repository.

The dependencies feature goes beyond artifact delivery. It also includes smarter triggering for dependent projects, the ability to rebuild a full dependency tree and a new “dependencies” tab which allows you to visualise the dependency graph.

Dependency support is built on top of Apache Ivy. Our aim is for interoperability with existing tools like Ivy and Maven, but without being Java-specific.

Multi-Command Projects

We’ve always had support for multi-command projects in the Pulse build core. However, to access this full flexibility you previously had to write an XML “pulse file” by hand. As of 2.1, the configuration GUI exposes the full flexibility of the underlying build core. This allows you to define multiple recipes per project, each of which can have multiple commands. All of the advanced command options once restricted to XML files are now also accessible in the GUI.

A key feature related to this is the ability to plug in new commands (e.g. to support a new build tool), and have the plugin seamlessly integrated into the add project wizard. If you plug in support for a command, you get simplified wizard creation of single-command projects using your plugin for free.

Give It A Go!

You can download Pulse today to try it out. Free licenses are available for evaluation, open source projects and small teams.