Selenium on Ubuntu Hardy Heron

May 1st, 2008

Ubuntu 8.04, the Hardy Heron, was released last week. I eagerly upgraded to see if my previous wireless networking issues had been resolved, and joy of joys they had! This allowed me to finally switch back to Ubuntu as my primary development environment.

Most of the switch went without a hitch, however I did encounter one problem. The acceptance tests for Pulse rely heavily on Selenium. When trying to launch Selenium on Hardy, Selenium RC always got stuck at:

Preparing Firefox profile…

The default version of Firefox installed with Hardy is Firefox 3 beta 5 - as opposed to Firefox 2 in previous releases. So it seems that Selenium and Firefox 3 don’t agree with each other - or at least not when using the current release of Selenium (1.0-b1). My best guess is that Selenium is trying to install extensions into Firefox but is failing as the compatible versions for the extensions don’t include Firefox 3. A better diagnostic would be nice!

In any case, the easiest resolution was to install Firefox 2, which can sit alongside Firefox 3 on Hardy. Just install the firefox-2 package:

$ sudo apt-get install firefox-2

Then make sure Selenium uses this version. One way to do this is to create a link named firefox-bin in your PATH that points to the Firefox 2 binary. However, I prefer to leave my system clean and make an exception just for Selenium. To allow this, our tests support an optional SELENIUM_BROWSER environment variable that specifies in Selenium format the browser to use (this comes in handy in other cases too). So before running tests, I ensure that SELENIUM_BROWSER is set:

export SELENIUM_BROWSER=”*firefox /usr/lib/firefox/firefox-2-bin”

To make this my default with no extra effort, I have actually just added it to my .bashrc.

Ext Discovers Step 2 of the Slashdot Business Model?

April 24th, 2008

For many years, step 2 has eluded the slashdot community:

1: Start your project.
2: …
3. Profit!

Now, however, Ext, the company behind the popular ExtJS Javascript library, seem to believe they have pinned down this step. In the latest 2.1 release, Ext has switched from a dual LGPL/commercial license model to a dual GPL/commercial license model. Despite claims from the company that this was done due to community concerns about ExtJS not being fully open-source, the majority have seen what this really is:

1: Start your project under a LGPL/commercial license model.
2: Switch to a GPL/commercial license model.
3: Profit!

The model works because step 1 allows you to build a community around the more liberal LGPL license. In particular, as the LGPL is commercial-friendly, the community will include many people building commercial applications. Once the community is suckered in and committed, the license is changed, leaving them high and dry. Well, not quite: they can continue to use new versions of the library by buying a commercial license. Hence the profit!

Not surprisingly, this sudden license switch has caused a large community backlash. People don’t like to feel that they have been baited in this way. The defense from Ext runs along a couple of lines:

  1. They are just applying the “Quid-Pro-Quo” principle: i.e. if you get something out of Ext, you should give back. This is fine: Ext have created a great library and have every right to charge for it if they wish. But it isn’t the use of a commercial model that is wrong here: it is the sudden switching of models. Ext have massively benefitted themselves from liberal licenses, both in terms of the libraries Ext was originally built around (e.g. YUI which is BSD licensed) and the community that appeared around Ext itself, submitting bug reports, patches and extensions. What are they now giving back to these communities they have used?
  2. The previous licensing was confusing and criticised for not being “open” enough. Simplifying the licensing scheme is fine, but there was no need to change to the GPL to do this. In fact, switching to the GPL has caused more problems for open source projects built around Ext than it has solved. Because of the viral nature of the GPL, those projects must now reconsider their own licenses to be able to upgrade to the latest version of ExtJS. The Ext team surely realise this, so trying to give this reason for the license switch is misleading on their part.

The backlash has been all the greater because of how the switch has been handled:

  1. The community was given no warning or consultation, despite this change being ostensibly for their benefit.
  2. Ext were not forthright with the real motive for the change.
  3. Ext are claiming that a fork of the existing 2.0 version is not legal, due to the way they applied the LGPL. This is likely to be incorrect, and if correct then their use of the name LGPL was grossly misleading.
  4. There is still confusion around the licensing. For example, the commercial license does not clearly spell out what happens when upgrading. Debates continue around the applicability of the GPL to hosted applications and those that generate code. Ext have been far from responsive on these matters.
  5. This is not the first major license change for ExtJS, leaving many anticipating similar upheaval in the future.

The saddest part about this is that the Ext team really have built a fantastic library, and a vibrant community around it. The library had all the hallmarks of an open source success story. Now, however, Ext have committed the cardinal sin of an open source project: they have undermined the trust of their own community. I can see the strategy making a short term profit, and perhaps even a long term business. I cannot, however, see the community returning to full strength. In the long run, although Ext may have a successful commercial offering, they will lose a large part of their current and potential community to competing and still-open projects. The power of the community will see those competitors flourish.

On Fast Feedback

April 16th, 2008

A major focus of agile development is fast feedback - of many kinds. As this idea is taken to its extreme, it is worthwhile asking if feedback can be too fast. That is, is there a point at which shrinking the feedback loop becomes counterproductive?

The benefit of feedback is knowing if you are heading in the right direction. The faster the feedback, the less time you spend heading towards dead ends. So the benefit increases as feedback gets faster. But what of the cost? It is useful to split the cost into two categories:

  1. Setup cost: how hard is it to create the system to produce the feedback? Is it more difficult to create faster feedback?
  2. Ongoing cost: what does it cost to receive the feedback? Do you pay this every time?

Setup cost is not usually a big issue except in extreme cases, as it can be paid off over time. More interesting is the ongoing cost of receiving and processing feedback. Naturally receiving, understanding and acting on feedback takes time. The important part is distinguishing when this time is productive versus wasted. Processing feedback can be wasteful in various ways:

  1. False alarms: if the feedback is incorrect, at best it is an irritation. At worst, significant time is wasted chasing a wild goose.
  2. Distraction: if feedback arrives when you are in the middle of something, it can interrupt your flow. Even if the feedback is useful, its benefit can be outweighed by the cost of losing your train of thought.
  3. Repetition: if feedback tells you nothing new, then it has no benefit. Any time spent acknowledging it is wasted.

How do these problems relate to the speed of feedback? The cost of false alarms is roughly inversely proportional to the length of the feedback cycle. As the relationship is linear, false alarms are not compounded by making feedback faster. In any case, accuracy of feedback is always paramount: false alarms must be eliminated.

More interesting is the relationship between feedback speed and the other two potential problems: distraction and repetition. In both these cases, shrinking the feedback cycle may produce a disproportionate increase in the occurrence of problems. Thus, when shrinking the feedback cycle, care must be taken to address both of these areas.

Distraction can be addressed in multiple ways. Firstly, receipt of feedback should be optional. Users must be able to pause the feedback mechanism (or possibly switch it off altogether). Then when a user needs to for deep thought, they can ensure that they are not interrupted. Secondly, the method of feedback needs to be in keeping with both the importance and frequency of the message to convey. For example, an emergency that requires immediate attention may use a bold and hard to ignore mechanism, whereas a small status update should be less intrusive. Continuous feedback must be modeless — as an example think of the real time spell-checking functionality in many word processors. Lastly, feedback should be categorised to allow smart filtering. Different users often require different levels of feedback — what is important for some is a trivial detail to others.

Repetition is usually simpler to address: feedback can be suppressed when there is no change. Note that in some cases a change for one user may not represent a change for another user. Thus, as above, filtering that is based on individual users may be necessary.

Conclusion

The value of feedback, and indeed fast feedback, is unquestioned. Taking this principle to the extreme, it follows that faster feedback is better. I feel this is true provided that the crucial issues of distraction and repetition are addressed.

Speeding Up Acceptance Tests: Optimise Your Application

March 27th, 2008

As our new acceptance test suite for Pulse 2.0 has grown, naturally it has taken longer to execute. More problematicly, however, several tests towards the end of the suite were taking longer and longer to run. This clearly highlighted a scalability issue: as more data was piled into the testing installation, tests took longer to execute. The situation became so bad that a single save in the configuration UI could take tens of seconds to complete.

A simple way to solve this would be to blow away the data after each test (or at regular intervals) so that we don’t see the build up. This would improve the performance of the test suite, but would also hide a clear performance issue in the application. Better to keep on eating our own dog food and fix the underlying problem itself. This way not only would the tests execute more quickly, but (more importantly) the product would also be improved.

Basic profiling identified the major bootleneck was many repeated calls to Introspector.getBeanInfo. This call was known by us to be slow, but according the documentation the results should be cached internally. It turns out that the cache is ignored when a certain argument is used (quite common!), so we were triggering a large amount of reflection many times over. By adding a simple cache on top of this method, it practically disappeared from the profiles and the time taken to run our test suite was better than halved! More importantly, the latency in the UI was decreased by several times to an acceptable level.

Naturally optimisation is not always this easy. The more tuned your application, the harder it becomes to find such dramatic improvements. However, it pays to keep an eye on your test suite performance, and profile if things don’t feel right. You might just find a simple win-win situation like the one described.

C Unit Testing with CUnit

March 19th, 2008

As a part of our continued efforts to support every unit testing library we can in Pulse, I have been looking into CUnit, the imaginitively-named framework for plain C.

Initial Impressions

The first thing I noticed about CUnit was the clean website and full documentation. This is a rarity amongst the C-based libraries I have seen, and very welcome to a first-time user. The structure supported by the library is a fairly standard suite-based grouping of individual test functions. Notably, CUnit comes with a few reporting interfaces out of the box - including the all important XML reports (typically the easiest way to integrate into a continuous build).

Setup on Windows

I built the library using Visual C++ 2008 Express. Although the provided VC files were from an earlier version, 2008 converted and built the solution without issues. The static library appeared at:

VC8\Release - Static Lib\libcunit.lib

Note: I first tried my luck with an unsupported configuration by building the library under Cygwin using ftjam. This didn’t work, although I suspect porting the build would not be hard.

The First Test

Next I started with a contrived test program to see how it all fits together. The program has a single suite with setup and teardown functions, with one passing an one failing case.

#include <stdlib.h>
#include "CUnit\Basic.h"

char *gFoo;

int setup(void)
{
        gFoo = strdup(“i can has cheezburger”);
        if(gFoo == NULL)
        {
                return 1;
        }

        return 0;
}

int teardown(void)
{
        free(gFoo);
        return 0;
}

void testPass(void)
{
        CU_ASSERT_STRING_EQUAL(gFoo, “i can has cheezburger”);
}

void testFail(void)
{
        CU_ASSERT_STRING_EQUAL(gFoo, “no soup for you!”);
}

int main()
{
        CU_pSuite pSuite = NULL;

        if(CU_initialize_registry() != CUE_SUCCESS)
        {
                return CU_get_error();
        }

        pSuite = CU_add_suite(“First Suite”, setup, teardown);
        if(pSuite == NULL)
        {
                goto exit;
        }

        if(CU_add_test(pSuite, “Test Pass”, testPass) == NULL)
        {
                goto exit;
        }

        if(CU_add_test(pSuite, “Test Fail”, testFail) == NULL)
        {
                goto exit;
        }

        CU_basic_set_mode(CU_BRM_VERBOSE);
        CU_basic_run_tests();

exit:
        CU_cleanup_registry();
        return CU_get_error();
}

The code illustrates how to write test functions with assertions, as well as how to assemble a test suite and run it with the built-in basic UI. The output from this program is shown below:

  CUnit - A Unit testing framework for C - Version 2.1-0
  http://cunit.sourceforge.net/

Suite: First Suite
  Test: Test Pass ... passed
  Test: Test Fail ... FAILED
    1. c:projectscunitplayfirsttestmain.cpp:30  -
        CU_ASSERT_STRING_EQUAL(gFoo,"no soup for you!")

--Run Summary: Type      Total     Ran  Passed  Failed
               suites        1       1     n/a       0
               tests         2       2       1       1
               asserts       2       2       1       1

Generating XML Reports

Generating an XML report required a very small change. Rather than using the basic UI, I switched to the automated UI and set the output filename prefix:

#include <stdlib.h>
#include "CUnit\Automated.h"

/* Test functions the same as previous example. */

int main()
{
        /* All setup code remains the same */
        …

        if(CU_add_test(pSuite, “Test Fail”, testFail) == NULL)
        {
                goto exit;
        }

        CU_set_output_filename(“CUnit”);
        CU_automated_run_tests();
        CU_list_tests_to_file();

exit:
        CU_cleanup_registry();
        return CU_get_error();
}

Running this program produces two XML output files: CUnit-Listing.xml and CUnit-Results.xml. The former contains data about the test suites and cases themselves, the latter contains the results of running the tests. This latter report is useful for integration with other systems; e.g. it is the report that we now post-process to pull test results into Pulse.

Note that these XML files come with a DTD and stylesheet, which is useful, but also akward. As the DTD and XSL file are not present alongside the reports where they are generated, opening the files will fail.

Improvements

Starting from this vey basic test setup leaves plenty of room for improvement. For example, if you look at the sample code above, you will notice that a lot of time is spent assembling the suite and checking for errors. Thankfully the CUnit authors have noticed the same and offer simplifications:

  • There are shortcuts for managing tests. For a non-trivial suite this method of describing the test suites in nested arrays and registering in one hit would be a lot simpler.
  • You can optionally tell CUnit to abort on error. Since any error at the CUnit level is likely to be fatal, this should reduce the need for error handling.

Another improvement that could perhaps be added in the future is better reporting for assertion failures. Looking back at the assertion failure output:

  Test: Test Fail ... FAILED
    1. c:projectscunitplayfirsttestmain.cpp:30  -
        CU_ASSERT_STRING_EQUAL(gFoo,"no soup for you!")

you can see that it pulls out the assertion expression as-is. However, given that I am using a specific string equals assertion, CUnit should also be able to show me the actual values that were compared at runtime, and even the difference. This can avoid the need to run the test again under a debugger to see the values.

Conclusion

Overall I would rate CUnit as a decent choice for CUnit testing. In terms of features all the basics are covered, and important improvements have been made. There is not much new, however, and there is still room for improvement. Where this library really shines is in the out of the box experience; thanks to excellent documentation and multiple included UIs.

If you’re looking for a plain C unit testing solution this is the best open source alternative I have seen. While you’re at it, why not go for a full continuous integeration setup with the CUnit support now in Pulse ;).

Checked Exceptions: Failed Experiment?

March 12th, 2008

Checked exceptions is one of those Java features that tends to work up a lively discussion. So forgive me for rehashing an old debate, but I am dismayed to see continued rejection of them as a failed feature, and even more so to see proposals to remove them from the language (not that I ever think this will happen).

In the Beginning…

Most would agree that mistakes were made in the way checked exceptions were used in the early Java years. Some libraries even go so far as doing something about it. However, an argument against the feature itself this does not make. Like anything new, people just needed time in the trenches to learn how to get the most out of checked exceptions. A key point here is that checked exceptions are available but not compulsory. This leads to a reasonable conclusion, as made in Effective Java:

Use checked exceptions for recoverable conditions and runtime exceptions for programming errors

The key point here is recoverable. If you can and should handle the exception, and usually close to the source, then a checked exception is exactly what you need. This will encourage proper handling of an error case - and make it impossible to not realise the case exists. For unrecoverable problems, handling is unlikely to be close to the source, and handling (if any) is likely to be more generic. In this case an unchecked exception is used, to avoid maintenance problems and the leakage of implementation details.

Continued Criticism

None of the above is new, and to me it is not contraversial. But the continued attacks on checked exceptions indicate that some still disagree. Let’s take a look at some common arguments against checked exceptions that still get airtime:

Exception Swallowing/Rethrowing

Some argue that the proliferation of code that either swallows checked exceptions (catches them without handling them) or just rethrows them all (”throws Exception”) is evidence that the feature is flawed. To me this argument carries no weight. If we look at why exception handling is abused in this way, it boils down to two possibilities:

  1. The programmer is too lazy to consider error handling properly, so just swallows or rethrows everything. A bad programmer doesn’t imply a bad feature.
  2. The programmer is frustrated by a poorly-designed API that throws checked exceptions when it shouldn’t. In this case the feature is an innocent tool in the hands of a bad API designer. Granted it took some time to discover when checked exceptions were appropriate, so many older APIs got it wrong. However, this is no reason to throw out the feature now.

You need well-designed APIs and good programmers to get quality software, whether you are using checked exceptions or not.

Maintenance Nightmare

A common complaint is how changes to the exception specification for a method bubbles outwards, creating a maintenance nightmare. There are two responses to this:

  1. The exceptions thrown by your method are part of the public API. Just because this is inconvenient (and when is error handling ever convenient?) doesn’t make it false! If you want to change an API, then there will be maintenance issues - full stop.
  2. If checked exceptions are kept to cases that are recoverable, and usually close to the point when they are raised, the specification will not bubble far.

Switching every last exception to unchecked won’t make maintenance problems go away, it will just make it easier to ignore them - to your own detriment.

Try/Catch Clutter

Some argue that all the try/catch blocks that checked exceptions force on us clutter up the code. Is the alternative, however, to elide the error handling? If you need to handle an error, the code needs to go somewhere. In an exception-based language, that is a try-catch block, whether your exception is checked or not. At least the exception mechanism lets you take the error handling code out of line from your actual functionality. You could argue that the Java’s exception handling syntax and abstraction capabilities make error handling more repetitive than it should be - and I would agree with you. This is orthogonal to the checked vs unchecked issue, though.

Testing Is Better

Some would argue that instead of the complier enforcing exception handling in a rigid way, it would be better to rely on tests catching problems at runtime. This is analogous to the debate re: static vs dynamic typing, and is admittedly not a clear cut issue. My response to this is that earlier and more reliable feedback (i.e. from the compiler) is clearly beneficial. Measuring the relative cost of maintaining exception specifications versus maintaining tests is more difficult. In cases where an exception almost certainly needs to be recovered from (i.e. the only case where a checked exception should be used), however, I would argue that the testing would be at least as expensive, and less reliable.

Conclusion

I am yet to hear an argument against checked exceptions that I find convincing. To me, they are a valuable language feature that, when used correctly, makes programs more robust. As such, it is nonsense to suggest throwing them out. Instead, the focus should be on encouraging their appropriate use.

Re: Continuous Integration Is A Hack

March 7th, 2008

Ben Rady’s recent blog post Continuous Integration Is A Hack always had a title that would attract the attention of someone working on this said “hack”. One core part of Ben’s argument is that:

Agility is a set of values and principles…and no more. Practices depend on technology and technique, which can (and should) be constantly evolving. Values, on the other hand, address the fundamental issues raised when humans work together to create something, and that is truly what makes Agility worthwhile.

This I completely agree with, and it also makes me sceptical of many “certifications”. However, Ben’s attack on CI (while making for a good headline) is off the mark in many ways. Let’s take a look at what he actually says:

CI is one of many useful practices that Agile developers employ today, but fundamentally, it is a hack. That’s because although it’s very useful to know that I’ve introduced a bug 20 minutes after creating it, what I really want is to know the very second that I type the offending line in my editor. CI is just the best that we can do right now, given the technology that is at our disposal.

This shows Ben’s leaning towards Continuous Testing. However, the criticism is wrong in several ways:

  • CI is not just (or even primarily) about automated testing, it is about integration. Running tests as you edit your code does not take integration with your team members into account!
  • Running tests continuously on your local machine doesn’t help detect cross-platform or machine-specific issues - you need builds running in an independent environment.
  • Some tests will never run fast enough to be run continuously, but they still need to be run as frequently as possible.

I could go on, but there is a more fundamental problem here: Ben is boxing the definition of CI into the technical boundaries that he sees today. CI is a practice, not an implementation. I find it ironic that Ben is railing against the definition of Agile as a static set of practices, but then defines CI by today’s implementations! As time passes tools will improve and will take CI to new levels. A key area of improvement has always been reducing feedback time: so rather than being at odds with CI, continuous testing is one part of the larger practice!

Speeding Up Acceptance Tests: Write More Unit Tests

March 6th, 2008

Speed up your tests by writing more tests? No, I haven’t gone insane, there is a clear distinction between the two types of tests:

  • Acceptance tests verify the functionality of your software at the users level. In our case these tests run against a real Pulse package which we automatically unpack, setup and exercise via the web and XML-RPC interfaces.
  • Unit tests verify functionality of smaller “units” of code. The definition of a unit varies, but for our purposes a single unit test exercises a small handful of methods (at most).

From a practical standpoint, acceptance tests are much slower than unit tests. This is due to their high-level nature: each test will typically exercise a lot more code. Add to this the considerable overhead of driving the application via a real interface (e.g. using Selenium), and each acceptance test can take several seconds to execute. All this time stacks up and can considerably slow a full build and test.

The huge difference between the time taken to execute a typical unit and acceptance test leads to a simple goal for speeding up your build as a whole: prefer unit tests where possible. The important thing is to do so without compromising the quality of your test suite. Remember: at the end of the day what happens at the users level is all that matters, so you can’t compromise your acceptance testing. However, there are many examples where you can push tests down from the acceptance to the unit level.

A recent example in Pulse is testing of cloning functionality. In Pulse 2.0, the new configuration UI allows cloning of any named element. Under the hood the clone is quite powerful: it needs to deal with configuration inheritance, references both inside and outside the element being cloned, cloning multiple items at once and so on. However, none of this needs to be tested at the acceptance level. Instead, the acceptance tests focus on testing the UI (forms, validation and so on), and an example of each basic type of clone. Underneath, the unit tests are used to exercise the various combinations to ensure that the back end works as expected. As the line between the front and back ends is clear, it is relatively easy to decide which tests belong where.

So, next time you’re creating some acceptance tests for your software, remember to ask yourself if you can push some tests down to a lower level. Do all the combinations need to be tested at a high level, or are they all equivalent from the UI perspective? A little forethought can shave considerable time off your build.

Running Selenium Headless

March 5th, 2008

Our dog food Pulse server, which spends all day building itself, is a headless box. This presented no challenge for the Pulse 1.x series of releases, as our builds are all scripted and don’t require any GUI tools. With Pulse 2.0, however, things changed. As mentioned in my previous post, this new release has a whole new acceptance test suite based on Selenium. The problem is: Selenium runs in a real browser, and that browser requires a display.

Fortunately, the dog food server is also a Linux box. Thus there is no need to add in a full X setup with requisite hardware just to have Selenium running. This is thanks to the magic of Xvfb - the X virtual framebuffer. Basically, this is a stripped back X server that maintains a virtual display in memory. Hence no actual video hardware or driver is needed, and we can keep things simple.

Setting things up is straightforward. First, install Xvfb:

# apt-get install xvfb

on Debian/Ubuntu; or

# yum install xorg-x11-Xvfb

on Fedora/RedHat. Then, choose a display number that is unlikely to ever clash (even if you add a real display later) - something high like 99 should do. Run Xvfb on this display, with access control off1:

# Xvfb :99 -ac

Now you need to ensure that your display is set to 99 before running the Selenium server (which itself launches the browser). The easiest way to do this is to export DISPLAY=:99 into the environment for Selenium. First, make sure things are working from the command line like so:

$ export DISPLAY=:99
$ firefox

Firefox should launch without error, and stay running (until you kill it with Control-C or similar). You won’t see anything of course! If things go well, then you need to modify the way you launch the Selenium server to ensure the DISPLAY is set. There are many ways to do this, in our Pulse setup we actually use a resource as it is a convenient way to modify the build environment.

An alternative which may suit some setups is to use the xvfb-run wrapper to launch your Selenium server. I opted against this as Pulse has resources to modify the environment and this way I do not need to change the actual build scripts at all. However if you are not using Pulse (shame! :) ) then you may want to look into it.


1 If you are worried about access on your network, then using -ac long-term is not a good idea. Once you have things working, I would suggest tightening things up by turning access control on.

Acceptance Testing: Getting to the Gravy Phase

February 29th, 2008

Pulse 2.0 has a completely overhauled configuration UI. The current 1.2 UI uses standard web forms, whereas the new UI is ExtJS-based and makes heavy use of AJAX. This makes for a huge improvement in usability, unfortunately at the cost of making all of our existing acceptance tests for the UI redundant! So, we needed to go back to square one with our acceptance test suite. It was a painful process, but going through it again reminded me of something important:

The setup cost of your first few acceptance tests is high, but after that it is all gravy.

The process from zero to gravy is similar for most projects, so I’ve summarised the phases we experienced below.

Phase 1: Choosing the Technology

The first step was to switch out jWebUnit for Selenium. Although jWebUnit can execute much of the Javascript in our new UI, it falls down in two key ways:

  1. Executing much or even most of the Javascript is not good enough - it makes writing important tests impossible.
  2. It does nothing to test real-world differences in the various Javascript engines that are the source of many bugs.

As Selenium drives the actual browser, you can test anything that will run in the browser and can also test compatibility. The main (well known) drawback is that Selenium is slow. As important as speed can be, however, accuracy is far more important.

Phase 2: Scripting Setup and Teardown

This is where the slog started. The most accurate way to acceptance test is to start with your actual release artifact (installer, tarball, WAR file, whatever). In our case, this is the Pulse package, which comes in various forms. Before we can actually test the UI, we need to get the Pulse server unpacked, started and setup. When the tests are complete, we also need scripts to stop the server and clean up. This way we can integrate the tests into our build and, of course, run the acceptance tests using our own Pulse installation!

So, a day of scripting later, and we don’t have a single test. Sigh.

Phase 3: The First Test

Now things got serious: writing the first test was painfully hard. I constantly hit snags:

  • How do I use this newfangled Selenium thingy?
  • How do I test a UI that is asynchronous?
  • How can I verify the state of ExtJS forms?

The important thing here was to keep pushing. The amount of effort to get to the first test case is not worth it for that test alone, but I knew it was an investment in knowlege that would help us develop further tests.

Phase 4: Abstraction

As I continued to write the first few test cases, repetition crept in. Verifying the state of a form is similar for all forms. Many tests navigate over the same pages, examining the state in similar ways. The code was ripe for refactoring! I started abstracting away the actual clicks, keystrokes and inspections and building up a model of the application. The classic way to do this is to represent pages and forms as individual classes that provie high-level interfaces. Over time, our tests changed from something like:

type(”username”, “admin”)
type(”password”, “admin”)
goTo(”/admin/projects/”)
waitForPageToLoad()
click(”add.project”)
waitForCondition(”formLoaded”)
type(”name”, “p1″)
type(”description”, “test”)
click(”ok”)
waitForElement(”projects”)
assertTrue(isElementPresent(”project.p1′))

to:

loginasAdmin()
projectsPage.goTo()
wizard = projectsPage.clickAdd()
wizard.next(name, description)
projectsPage.waitFor()
assertTrue(projectsPage.isProjectPresent(name))

and then to:

loginAsAdmin()
addProject(name, description)
assertProjectPresent(name)

Phase 5: Gravy

With the right abstractions in place, adding acceptance tests is now much easier. The tests themselves are also easier to understand and maintain due to their declarative nature. Now we can reap the rewards of those painful early stages!

Conclusion

The process of setting up an acceptance test suite is quite daunting, and initially painful. But if you persevere and constantly look for useful abstractions, you’ll reap the rewards in the long run.