Using SpecFlow for WebDriver user acceptance tests – Pros and Cons


Recently my team introduced SpecFlow to help create business-driven tests. We’ve been refining the way we work with acceptance criteria for some time and have been pushing towards the Given, When, Then format, which is the obvious precursor to using tools to help the test generation process.
There are some challenges in getting a business team to work with this structure, so initially my team has done the work of creating the feature files that contain the scenarios structured as Given When Then (GWT) sets. Ideally over time the business will see how they can write directly in this format and help accelerate the process.

However, herein lies an issue… SpecFlow comes in the form of a plugin for Visual Studio, and it aids the writing of tests because it auto-completes based on existing statements in the code base. The implication is that your business owners have to run a Visual Studio instance, with your code base, to make the most effective use of the tool. Without that they would not get any assistance from existing GWT statements.
Perhaps there are tools out there to solve this problem; in truth I’ve not had time to investigate much. But for this to be truly helpful I need some way for the business team to generate the feature files whilst still getting all the assistance of existing statements.

Undeterred by that limitation, we cracked on with our test team creating tests this way. I let the team get started and build up a few areas of testing using SpecFlow to generate NUnit tests (we use NUnit as the test runner for all our tests even though they aren’t really unit tests).

This is where things started to look wrong to me… the first hint was trying to navigate from a feature file to the code that implements it, and more importantly, from a method decorated as a Given, When or Then, finding out which tests call it. I could not.

When I had the time I sat down to really understand how things were being integrated and to review the code being produced by my team, and slowly realisation dawned… SpecFlow’s NUnit integration is really kinda hacky.

Why do I say that? Well, it seems that rather than really integrating with NUnit, SpecFlow just generates a test which invokes its own custom test runner. So every feature file winds up with a new instance of the SpecFlow test runner, and every test just invokes that test runner with some strings which are your GWT statements. So you have an NUnit test runner that performs fixture setup and test setup, then you invoke the SpecFlow test runner, which has its own feature setup and test setup, then it invokes your GWT statements, and everything tears down again. It wasn’t initially clear to me why SpecFlow didn’t just generate straight-up C# NUnit code with method invocations and associated method stubs waiting to be implemented (or linked to existing ones it can find).

It requires its own test runner because of the way they chose to find your GWT methods. Rather than calling the method directly, they require that you decorate the method with one of a couple of custom attributes, so you end up with
[Given("Something I need")]
public void SomethingINeed()

SpecFlow uses this attribute to locate the method you want to call. The ‘clever’ bit is that it can find these methods in any class, anywhere in your DLL, completely regardless of the object hierarchy… Now maybe this is fine in the world of real unit tests, but for our code base this is decidedly not cool.
That means each statement of a test might be located inside a separate class, in separate parts of a class hierarchy, and each one of them reflectively instantiates the object to call the method.
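
To make that concrete, here is a minimal sketch of attribute-based step lookup as I understand it. This is not SpecFlow's code, and real instance lifetimes may well differ: StepAttribute, WebTestBase, AccountSteps, FormSteps and StepDispatcher are all names made up for illustration, and the per-step instantiation simply models the behaviour described above.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical stand-in for a Given/When/Then binding attribute.
[AttributeUsage(AttributeTargets.Method)]
public sealed class StepAttribute : Attribute
{
    public string Text { get; }
    public StepAttribute(string text) { Text = text; }
}

// Hypothetical root of a test hierarchy; the constructor is the expensive part.
public abstract class WebTestBase
{
    public static int SessionsStarted;
    protected WebTestBase() { SessionsStarted++; } // imagine a WebDriver session starting here
}

public class AccountSteps : WebTestBase
{
    [Step("Something I need")]
    public void SomethingINeed() { }
}

public class FormSteps : WebTestBase
{
    [Step("I fill in the form")]
    public void IFillInTheForm() { }
}

public static class StepDispatcher
{
    // Finds the method bound to the step text anywhere in the assembly,
    // creates a fresh instance of its declaring class, and invokes it.
    public static void Run(string stepText)
    {
        MethodInfo method = typeof(StepDispatcher).Assembly
            .GetTypes()
            .SelectMany(t => t.GetMethods(BindingFlags.Public | BindingFlags.Instance))
            .First(m => m.GetCustomAttribute<StepAttribute>()?.Text == stepText);

        object instance = Activator.CreateInstance(method.DeclaringType);
        method.Invoke(instance, null);
    }
}

public static class Program
{
    public static void Main()
    {
        StepDispatcher.Run("Something I need");
        StepDispatcher.Run("I fill in the form");
        StepDispatcher.Run("Something I need");
        Console.WriteLine(WebTestBase.SessionsStarted); // prints 3: one construction per step
    }
}
```

Three steps, three trips through the root constructor. With a constructor that really does start a browser session, that cost adds up quickly.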

What is so bad about that? Well, in our code base all of our tests inherit within a well-defined ‘tree’ structure. The root of the tree contains all the logic necessary to establish a WebDriver session, plus anything very common, so all the helper methods likely to be used by every test. Below that level are coarse functional-level classes, which provide the things that are quite specific to a functional area; it is unlikely that a test of our communications needs access to methods for navigating around our public information pages. Beneath the coarse functional level are the specification-specific helpers, which tend to cover very specific use cases and error scenarios associated with a particular specification. And at the leaf nodes we have the tests. Every test has access to every method it needs access to. Now maybe this is just a stupid structure, and I’m not going to claim to be a great architect of software, but it works well for our needs and allows us to do some pretty nifty things (for instance, a given test run of our 1700 tests reports ~30 hours’ worth of NUnit test results but takes just 5.5 hours to run… nifty).
So in our structure the SpecFlow way of doing things causes each statement to potentially recreate and initialise the entire object tree multiple times including constructors that expect to only run once…

Our nifty test structure makes use of feature setup to do things like create users for the tests (crazy!), but that relies on NUnit feature setup methods being overridable by each feature within our structure and doing various stateful, decidedly non-static things. But SpecFlow feature setup methods must be static, and it seems it had to create a special context class just to overcome the issue of objects being spawned by reflection independently of each other. Basically, just using SpecFlow as it expects caused a massive integration headache for me, and the resulting tests performed horribly. It encouraged behaviours where we had nice granular methods to enter different fields on a form, but every one of those methods would reinitialise the WebDriver page object!
Just filling in a form could spend most of its time loading and reloading the same page for different parts of the same test, mostly because it wasn’t clear how to retain interaction state between them.

Now it may seem that I’m pretty down on SpecFlow, but I’m not. I think the idea has a lot going for it; I just don’t really agree with the way they’ve implemented NUnit integration. If it were me, I would have generated tests with direct method calls: use the reflection cleverness to go and find the methods, but don’t break the object hierarchy. If a method isn’t in the right object tree to be called from a test, just indicate where the required method currently resides and let the users refactor.
No custom SpecFlow runner doing runtime cleverness: do all of that at generation time and leave users with normal native C# NUnit code.
Effectively this is what we’re now doing. We still write feature files, and those files still benefit from finding the GWT statements in the codebase. However, we then take the generated code and turn every testRunner.Given("Something I need"); into the direct call SomethingINeed();. We do all of this in a new class file so that SpecFlow doesn’t just wipe our changes on the next regen. Then we just leave the SpecFlow-generated version of the file empty.
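
A rough sketch of the shape this takes is below. Everything in it is hypothetical (the registration scenario, RegistrationPage, and the step names are invented for illustration), and the NUnit attributes are left out so that it compiles on its own. In a real suite the leaf method would be an ordinary NUnit test.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical root of the hierarchy: owns the (simulated) WebDriver session.
public abstract class WebTestBase
{
    public static int SessionsStarted;
    protected WebTestBase() { SessionsStarted++; }
}

// Hypothetical page object, created once on arrival and reused by later steps.
public class RegistrationPage
{
    public static int PagesLoaded;
    public readonly Dictionary<string, string> Fields = new Dictionary<string, string>();
    public RegistrationPage() { PagesLoaded++; }
}

// Coarse functional level: step methods are ordinary instance methods.
public abstract class RegistrationTestBase : WebTestBase
{
    protected RegistrationPage Page;

    protected void GivenIAmOnTheRegistrationPage() { Page = new RegistrationPage(); }
    protected void WhenIEnterMyName(string name) { Page.Fields["name"] = name; }
    protected void WhenIEnterMyEmail(string email) { Page.Fields["email"] = email; }
    protected void ThenBothFieldsAreFilledIn()
    {
        if (Page.Fields.Count != 2) throw new InvalidOperationException("form incomplete");
    }
}

// Leaf: the hand-written test, kept in its own file away from the generated one.
public class RegistrationTests : RegistrationTestBase
{
    public void RegisterWithNameAndEmail()
    {
        // was: testRunner.Given("I am on the registration page"); and so on
        GivenIAmOnTheRegistrationPage();
        WhenIEnterMyName("Ada");
        WhenIEnterMyEmail("ada@example.com");
        ThenBothFieldsAreFilledIn();
    }
}

public static class Program
{
    public static void Main()
    {
        new RegistrationTests().RegisterWithNameAndEmail();
        // prints "1 session, 1 page load"
        Console.WriteLine($"{WebTestBase.SessionsStarted} session, {RegistrationPage.PagesLoaded} page load");
    }
}
```

Because every step runs on the same test instance, the page object created on arrival is still there for the steps that follow, and the compiler (and "find usages") can see exactly which tests call which step.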

With our approach we get the benefits of a business-readable feature file and a clear link between this file and the tests which implement it. But we also get a clear test structure and the ability to use our various nifty optimisations that help us keep things running in a vaguely sane timeframe (I’m still not happy with 5.5 hours; I’m sure we can do something about that).

Anyone else out there doing things with SpecFlow and WebDriver? Am I missing something? Is my structure crazy and ill-advised, and is that what leads to all my problems with SpecFlow?

Is anyone running a similar number of WebDriver/Selenium tests and prepared to share their execution times/environment details?


2 responses to “Using SpecFlow for WebDriver user acceptance tests – Pros and Cons”

  1. Interesting observations indeed. I would like to reflect on the three core questions you have raised:
    1. How to edit specs without VS?
    The generation can be also done during the build, with MsBuild. See http://go.specflow.org/doc-msbuild
    2. Why SpecFlow does “dynamic binding” of steps? (ie why don’t we generate direct method calls)
    The main argument for this is to support the BDD workflow better. Having something specified (i.e. there is already a feature/scenario for it) but not completed yet is a progress indicator rather than an error. So breaking the compilation with every added scenario is rather counterproductive in an already established process. (Of course it could be partly compensated with stubs, etc., but still. Think about non-devs editing the specs!)
    3. Why not using the test class hierarchy?
    On one hand, SpecFlow supports many unit test providers, and each works a bit differently, especially for setup/teardown and instance lifetime. It would have been quite a big effort to “standardize” that.
    The other argument is even more important IMHO: testing is a cross-cutting concern. A single test is a mix of usage of the different parts of the application at different levels (e.g. some tests require the product catalog, some tests require authentication, but these two sets are neither equal nor subsets of each other). Since C++-based languages like C# are not too good at handling mixins, forcing the tests into a hierarchical structure is a pain anyway. Patterns like the driver pattern or the page object pattern (these are kinds of strategy pattern) are commonly used to solve this problem. The core idea here is to group the automation (incl. setup/teardown) of a functional group (e.g. product catalog or authentication) into one class and use these classes as needed for the tests that require them. In simple cases, these classes can be the ones where you have the G/W/T attributes; in more complex cases, they can form another layer and use the “context injection” feature (see docs).

    In your case, as far as I can see, the problems you’ve encountered (or at least some of them) come from the fact that you already have a well-established, working structure and testing solution. Obviously it is not easy to do a big refactoring and learn a new tool at once. I hope my answers were helpful.

    • Thank you for taking the time to respond. I appreciate it.
      To clarify point 1: my problem isn’t how to generate from feature files in the absence of VS. My problem is that without access to VS and our whole code base a business user won’t get auto-complete-type assistance when writing new scenarios, so a minor difference of terminology would lead to duplicate methods with slightly different phrasing.

      On point 2, if you simply generate a stub for each method you can’t find, and have that stub throw a NotImplementedException, you get the same advantage that unimplemented things don’t break the compile, along with a much easier way to identify everything that isn’t implemented. With SpecFlow code you can’t tell until runtime that a method doesn’t actually exist yet.
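
      As a minimal sketch of what I mean (the class and step names here are made up, and this is not something SpecFlow generates):

```csharp
using System;

public class PasswordResetSteps
{
    // A step that already has an implementation.
    public void GivenIHaveAnAccount() { }

    // Generated stub for a step with no implementation yet: it compiles,
    // and it fails only when a test actually reaches it.
    public void WhenIRequestAPasswordReset()
    {
        throw new NotImplementedException("When I request a password reset");
    }
}

public static class Program
{
    public static void Main()
    {
        var steps = new PasswordResetSteps();
        steps.GivenIHaveAnAccount();
        try
        {
            steps.WhenIRequestAPasswordReset();
        }
        catch (NotImplementedException e)
        {
            // prints "Pending: When I request a password reset"
            Console.WriteLine("Pending: " + e.Message);
        }
    }
}
```

      Searching the code base for NotImplementedException then lists every step that is still waiting for an implementation.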

      As for 3, I can certainly believe my mindset and frame of reference are just too far removed from the expected usage of SpecFlow for me to clearly see why I would want that lack of hierarchy. Certainly tests are cross-cutting, but typically only the golden path of a given area is used by tests of other areas, leaving all the specific stuff isolated to just the test set that cares about negative cases etc.

      The flip side is requiring every GWT statement to do things like independently initialise the same WebDriver page object just to fill in one new field. This is very expensive from a performance standpoint compared to initialising a page when you arrive at it and having all subsequent statements able to assume that the object is already initialised for them.

      I guess from my point of view, specflow could work easily with my structure but for a few tweaks. I’m sure you feel my tests would work easily with specflow if I just tweaked them 🙂
      But I understand it’s a lot of work to support many different runners/languages, and I understand the desire to arrive at a standard approach. It just leaves us using it in a somewhat diminished and slightly odd way to get the benefits we want without the perceived drawbacks.
