I wrote ages ago about developing a tool to make NUnit tests run in multiple threads (part 1 and part 2). That worked pretty well; however, it hit a slight issue: if the tests weren’t careful about their use of global statics, then things could get very messy.
So more recently I updated that program to split things into separate processes instead of separate threads. This way I got better isolation between the main runner and the tests being run, and I could work with tests that each wanted to mess with the Spring container: since each runs in its own process, they don’t interfere with one another.
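The actual runner is a .NET tool, but the process-per-assembly idea can be sketched in a few lines of Python. Here the Python interpreter stands in for the NUnit console runner, and the function names are my own invention, not the tool’s real code:

```python
import subprocess
import sys

def run_assembly_in_process(runner_path, dll_path):
    """Launch one test assembly in a fresh OS process and wait for it.

    Because each assembly gets its own process, global statics and shared
    containers set up by one set of tests can't leak into another.
    """
    result = subprocess.run([runner_path, dll_path],
                            capture_output=True, text=True)
    return result.returncode == 0  # non-zero exit means at least one failure

# Demo: use the Python interpreter itself as a stand-in "runner".
ok = run_assembly_in_process(sys.executable, "--version")
```

The key point is simply that process exit codes give the parent runner a clean pass/fail signal without sharing any in-process state.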
Now, the biggest challenge to running most of our ‘local in memory’ tests was that they used a real SQL Server (somewhat heavyweight for what are technically ‘unit tests’). However, a colleague recently altered those tests to use an in-memory database, so that each test fixture plays in its own space, and it is no longer a problem to run multiple tests that attempt to update the same things at the same time.
Now, this test bucket had been growing steadily in size and runtime. By the point it took about 45 minutes to run all the tests, we split it into 3 separate buckets based on projects, so that we could run them on 3 build agents. This obviously meant using 3 times the machines to keep the runtime down to an acceptable level. Again the runtime crept up, until each of those 3 buckets was taking about 25 minutes. It was at this point that I started experimenting with applying my multi-process runner to these tests. Until then I’d only applied it to the Selenium tests written by my test team, but I’d kept the tool generically NUnit, so it was time to play.
Our standard build server is a 4-CPU Windows 2008 machine with about 2GB of memory allocated (these are all virtual). So I took the whole run, used my parallel runner with 4 processes (one per core), and got the whole bucket (~1800 tests) to run in about 25 minutes. So it was no faster than the total elapsed time we already had, but it used 1 build engine instead of 3. This was a pretty good result, but not good enough for me…
One issue I had was that the developer-written tests are spread over a bunch of dlls, and the NUnit framework only lets you give a single runner a single dll. The way I originally wrote the code was based on a single dll containing all the tests, and when I enabled it to work with many, I basically just wrapped the whole logic in a giant foreach loop. This was pretty inefficient for dlls with relatively few tests in them, because the runner waited for a single dll to finish all its tests before starting on the next. So I refactored it to pre-load all the dll information, so that it keeps allocating work to free processes as long as there remain more dlls with tests in them. Making this change pulled the runtime down to around 22-23 minutes: a couple of minutes of runtime saved just by handling the process allocation logic a little more sensibly. (I know it doesn’t sound like much, but we are talking around a 10% improvement here!)
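The scheduling change is easier to see in code than in prose. This is a minimal Python sketch of the idea (the real tool is .NET, and `run_fixture` here is a hypothetical stand-in for launching an NUnit process): flatten every dll’s fixtures into one shared work list up front, then let a fixed pool of workers drain it, so no worker sits idle while any dll still has tests left.

```python
from multiprocessing.pool import ThreadPool

def run_all(dlls, run_fixture, workers=4):
    """Pre-load work items from every dll into one list, then hand the next
    item to whichever of a fixed set of workers frees up first, instead of
    finishing one dll completely before touching the next."""
    work = [(dll, fixture) for dll, fixtures in dlls for fixture in fixtures]
    with ThreadPool(workers) as pool:
        # pool.map preserves input order in its results, but the workers
        # themselves pull items as they become free.
        results = pool.map(lambda item: run_fixture(*item), work)
    return results

# Demo: three fixtures across two dlls, with a stub "runner" instead of NUnit.
demo = run_all([("A.dll", ["F1", "F2"]), ("B.dll", ["F3"])],
               lambda dll, fixture: f"{dll}:{fixture}", workers=2)
```

A thread pool is used here purely to keep the sketch self-contained; the real runner dispatches to separate OS processes, but the allocation logic is the same.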
At this point I asked our infrastructure team to hook me up with a build engine with 8 CPUs assigned instead of the standard 4. This allowed me to boost the number of processes to 8; in theory, double the CPUs, double the speed, so I had high hopes. I was pretty happy with the results: the change brought the runtime down to about 13.5 minutes! Now we’re talking. It’s not a linear improvement, but suddenly things are looking pretty sweet: we can use 1 build engine with the CPU allocation equivalent of 2 of the old build engines, but achieve half the runtime for the entire suite of tests compared to using 3.
Now, drunk on the possibilities of speed, I scoured the code to figure out ways to squeeze a little more performance out. In doing this I spotted some logic I had put in place for the Selenium tests: after launching a new process I actually sleep for 5 seconds to give it some time, because I found the Selenium server would fail to create its temp folders if multiple processes talked to it in very quick succession (seems like a bug to me). This sleep was totally unnecessary for the in-memory unit tests, so I made it configurable: the existing tests could keep the sleep, but the new ones would reduce it to zero. I wasn’t sure how much impact this would really have, since after the initial launch of processes we could be waiting in other places.
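Making the pause configurable is a one-parameter change. A minimal Python sketch, with names of my own choosing rather than the tool’s actual settings:

```python
import time

def launch_with_settle(launch, settle_seconds):
    """Start one worker process, then optionally pause before launching the
    next, so a slow shared dependency (like the Selenium server creating its
    temp folders) has time to settle between launches."""
    proc = launch()
    if settle_seconds > 0:
        time.sleep(settle_seconds)
    return proc

# Selenium runs would keep settle_seconds=5; the in-memory runs pass 0,
# so back-to-back launches pay no startup penalty.
started = launch_with_settle(lambda: "worker-1", settle_seconds=0)
```

With the value driven from configuration, the same runner serves both test populations without code changes.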
However, having made the change I wiped off about another minute of the runtime, now getting an average of 12.5 minutes per run. Again, 1 minute off may not sound like a big deal, but if you’ve read this far in a post about making tests go fast, I figure you are the kind of person who knows how much of a difference it can make to shave those extra seconds off a test run. At this point we’re chasing the golden possibility of having a serious commit test bucket: 1800 tests running in less than 10 minutes. (Not quite yet, but I’m so close I can taste it.)
Unfortunately, at this point I’m also out of tricks; I’ve looked the code over and there are no obvious places left to make a difference. One great hope was a change made by one of the developers to persist a version of the in-memory db and deserialise it into memory rather than recreate it from scratch each time. This seemed like a great idea, and made a big impact when running the tests serially, but in practice it yielded no appreciable change when running them in parallel.
So I was chatting with the dev architect, and told him that this was as good as it gets: there were no more software improvements I could think of, and the only thing left was throwing more brute power at it. So he did…
I came in the next morning to find an email showing the CPU load on a build agent with 24 cores and 8GB of RAM! The runner was pushing it pretty hard, and achieved a runtime of just 5 minutes 30 seconds; it has to be said I was pretty damn happy with that result. Sure, it’s not a linear improvement, but holy cow: we took a test run that took over an hour to run serially on one build engine, and dropped it to barely over 5 minutes. The product takes longer than that to compile.
It’s possible that we won’t leave a build engine with that much CPU assigned all the time; however, it is great to know that when we want to, we can turn the dial up to 11 and really crank through the changes.