Posts tagged TDD

Specs and Selenium Together

I recently had the chance to dive into a new project, this one with a rich web interface. To get tests around the (large and mostly untested) existing code, we’ve started writing acceptance tests with specs.

Once we have our specs written to express what the existing functionality is, we can refactor and work on the codebase in more safety, our tests acting as a “motion detector” to let us know if we’ve broken something, while we write more detailed low-level tests (unit tests) to allow easier refactoring of smaller pieces of the application.

What’s interesting about our latest batch of specs is that they are written to express behaviours as experienced through a web browser – e.g. “when a user goes to this link and clicks this button on the page, he sees something happen”. In order to make this work we’ve paired up specs with Selenium, a well-known web testing framework.

By abstracting out the connection to Selenium into a parent Scala object, we can build a DSL-ish testing language that lets us say things like this:


object AUserChangesLanguages extends BaseSpecification {

  "a public user who visits the site" should beAbleTo {
    "Change their language to French" in {
      open("/")
      select("languageSelect", "value=fr")
      waitForPage
      location must include("/fr/")
    }
    "Change their language to German" in {
      select("languageSelect", "value=de")
      waitForPage
      location must include("/de/")
    }
    "Change their language to Polish" in {
      select("languageSelect", "value=pl")
      waitForPage
      location must include("/pl/")
    }
  }
}

This code simply expresses that as a user selects a language from a drop-down of languages, the page should refresh (via some JavaScript on the page) and redirect them to a new URL. The new URL contains the language code, so we can tell we’ve arrived at the right page by the “location must include…” line.
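The parent object isn’t shown in this post, so here is a minimal sketch of how such a DSL layer might be put together. The names `Browser`, `FakeBrowser`, `BrowserDsl` and `ChangeLanguage` are hypothetical, and the fake stands in for a real Selenium session purely so the example runs on its own; a real base class would delegate these calls to Selenium instead.

```scala
// Hypothetical sketch: none of these names come from the project described above.
object BrowserDslSketch {

  // The small set of browser operations the DSL needs.
  trait Browser {
    def open(path: String): Unit
    def select(locator: String, option: String): Unit
    def waitForPage(): Unit
    def location: String
  }

  // An in-memory stand-in for a real browser session, used only so the sketch runs.
  // It mimics a page whose language drop-down redirects to "/<code>/".
  final class FakeBrowser extends Browser {
    private var current = "/"
    private var pending: Option[String] = None

    def open(path: String): Unit = { current = path }

    def select(locator: String, option: String): Unit =
      if (locator == "languageSelect" && option.startsWith("value="))
        pending = Some("/" + option.stripPrefix("value=") + "/")

    // The redirect only "lands" once we wait for the page, as in a real browser.
    def waitForPage(): Unit = {
      pending.foreach(p => current = p)
      pending = None
    }

    def location: String = current
  }

  // Mixing this trait in gives a specification the bare verbs used in the example above.
  trait BrowserDsl {
    protected def browser: Browser
    def open(path: String): Unit = browser.open(path)
    def select(locator: String, option: String): Unit = browser.select(locator, option)
    def waitForPage(): Unit = browser.waitForPage()
    def location: String = browser.location
  }

  // A tiny "specification" written against the DSL.
  object ChangeLanguage extends BrowserDsl {
    protected val browser: Browser = new FakeBrowser

    def switchTo(code: String): String = {
      open("/")
      select("languageSelect", "value=" + code)
      waitForPage()
      location
    }
  }
}
```

The point of the split is that the specification only ever sees the verbs; swapping the fake for a real browser session would not change a line of the spec itself.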

Simple and expressive, these tests can be run with any of your choice of browsers (e.g. Firefox, Safari, or, if you insist, Internet Explorer).

Of course, there’s lots more to testing web pages, and we’re fleshing out our DSL day by day as it needs to express more sophisticated interactions with the application.

We can get elements of the page (via XPath), make assertions about their values, click on things, type things into fields and submit forms: basically all the operations a user might want to do with a web application.

There are some frustrations, of course. The XPath implementation on different browsers works a bit differently – well, ok, to be fair, it works on all browsers except Internet Exploder, where it fails in various frustrating ways. We’re working on ways to overcome this that don’t involve having any “if browser == ” kind of logic.

It’s also necessary to start the Selenium RC server before running the specs, but a bit of Ant magic fixes this.

We’ve got these specs running on our TeamCity continuous integration server, using the TeamCity runner supplied with Specs, where we get nicely formatted reports as to what’s pending (e.g. not finished being written yet), what’s passing, and what’s failing.

The specs written with Selenium this way are also a bit slow, as they must actually wait in some cases for the browser (and the underlying app!) to catch up. When run with IE as the browser, they’re more than just a bit slow, in fact…

They are, however, gratifyingly black-box, as they don’t have any connection to the code of the running application at all. For that matter, the application under test can be written in any language at all, and in this case is a combination of J2EE/JSP and some .NET.

There’s a lot of promise in this type of testing, even with its occasional frustrations and limitations, and I suspect we’ll be doing a lot more of it.

By: Mike Nash


Scala Continuous Testing with sbt

I’ve recently had occasion to start an open source project, and the correct tool for the job appears to be Scala.

So far the project is going well, but the pain has been around the build and IDE support for rapid and convenient development in Scala. Although all three of the major IDEs I’ve worked with recently (Eclipse, IntelliJ IDEA and NetBeans) have plugins for Scala, they are all early releases, and have various degrees of pain associated with them.

I ended up using NetBeans for editing and as a Subversion client, then building with Maven when I wanted to compile and/or run tests. Calling Maven from within NetBeans to build a Scala project is still a bit creaky, so I was doing it from a terminal window directly.

This is very inconvenient, for a number of reasons. First, I’m working in a Behaviour-Driven Development mode, using specs as my BDD framework. This means I first write a specification in specs, see it fail, then write the code necessary to make it pass, then write (or extend) the next specification for the next behaviour I want, and so forth.

When I wanted to run a test, I had to flip to a command window and issue a Maven command to build and run the specified test, something like this:


mvn -Dtest=foo test

In order to make this work I had to declare my specs as JUnit tests (with the @Test annotation), even though they don’t use anything else from JUnit. This felt like a bit of a hack, albeit a useful one. Another pain point was the startup time for Maven (although I understand there’s a “console” plugin for Maven as well that can perhaps reduce this particular pain).

As I like to tinker with new stuff, I thought I’d make a departure from Maven and give sbt a try. Sbt is a build tool written in Scala that supports building both Scala and Java (and mixed) projects in a very simple way. Unlike Ant, there’s no up-front pain to write a build script, though, as sbt can make reasonable assumptions (which you can override) about where to find your classes and libraries, so you hit the ground running.

In literally seconds I was up and running after following the install instructions on the sbt site. After a bit of experimenting I found the “console” mode in sbt, where you launch sbt and leave it running.

Once in console mode you can either just type “test” every time you want to build and run all tests, or be more selective, and run only the tests that failed last time, or just a single specified test, if you’re working on just one feature. Any of these operations is fast – mostly because sbt is already loaded and running, but also because sbt does a bit less work than Maven does on every build.

Although sbt can be configured to work in conjunction with Ivy or Maven repositories, you can also just drop your dependency libs into the “lib” directory in your project. For open source this is rather nice, as it saves the user of the project the trouble of trying to find them. Even supplying a Maven pom that specifies the repositories from which to download your dependencies is not a guarantee, as repositories change over time. Many is the time I’ve gone to download a dependency (or rather, Maven has gone to do it for me), only to find it’s not where it used to be, has a different name or version, or some other problem causes my build to fail. Like Ant, sbt can avoid this problem by keeping dependencies locally. Unlike Ant, it can also go get the dependencies the first time for you from the same repositories Maven uses – perhaps giving you the best of both worlds in some situations.

Even more interesting was the command


~ test

which runs all the tests, then waits for any source code to change (test or main code). When it sees a change, it runs all the tests again (after compiling the changes, of course). Poor man’s continuous testing :)

Wait, it gets even awesomer! When you say


~ test SomeTest

sbt will wait for any changes, then run just the specified test. This is ideal when you know you’re only working on a specific set of functionality (and therefore affecting only a single test). When sbt is waiting, you can just hit any key to return to the interactive mode, so it’s easy to change it from one of these modes to another.

Other commands in sbt are also very familiar and quick, such as “compile”, which does exactly as you’d expect from the name. “Package” is another good one – it produces a jar artifact, just like the Maven command of the same name. I haven’t yet tried its deploy mechanisms properly, but early results look promising.

I also like the “console” command, which runs the Scala command-line console, but with your project on the classpath, along with all its dependencies. This lets you run ad-hoc statements quickly and easily, and see the results right away. When I’m not sure what’s going on with a failing spec, I’ve found this mode very helpful for experimenting. Scala is such an expressive language that I can write a quick experiment in one or two lines of code, see the result (as the Scala console also evaluates expressions by default), and go back to coding and testing, all without re-starting sbt. Quite nice, and somewhat reminiscent of the similar functionality in Rails and “irb” (and JRuby’s equivalent, Jirb).

There are many other things I’ve found about sbt that I like so far, but those are topics for another post later on….

By: Mike Nash


The Corporate Culture of Post-it Notes

Ahh, the ubiquitous Post-It® Note. My workspace is covered with lovely multi-coloured notes, or it was until I discovered Digital Notes. The story of the unassuming Post-it has become a legend, but I wanted to share it again, as told in The Knowledge-Creating Company by Nonaka and Takeuchi.


“Art [Fry] sang in the church  choir and noticed that the slips of paper he inserted to mark selected hymns would fall out.  He decided to create a marker that would stick to the page but would peel off without damaging it.  He made use of a peel-able adhesive that Spence Silver at the Central Research Lab had developed four years previously, and made himself some prototypes of the self-attaching sheets of paper.

Sensing a market beyond just hymnal markers, Fry got permission to use a pilot plant and started working nights to develop a process for coating Silver’s adhesive on paper. When he was told that the machine he designed could take six months to make and cost a small fortune, he single-handedly built a crude version in his own basement overnight and brought it to work the next morning. The machine worked. But the marketing people did some surveys with potential customers, who said they didn’t feel the need for paper with a weak adhesive. Fry said, “Even though I felt that there would be demand for the product, I didn’t know how to explain it in words. Even if I found the words to explain, no one would understand…” Instead, Fry distributed samples within 3M and asked people to try them out. The rest was history. Post-it Notes became a sensation thanks to Art Fry’s entrepreneurial dedication and dogged persistence.”

(Nonaka I, Takeuchi H. The Knowledge-Creating Company: How Japanese Companies Create the Dynamics of Innovation. 1995)

That entrepreneurial spirit has been part of 3M’s corporate culture almost since inception.  As stated in the William L. McKnight Management Principles:

“As our business grows, it becomes increasingly necessary to delegate responsibility and to encourage men and women to exercise their initiative. This requires considerable tolerance. Those men and women, to whom we delegate authority and responsibility, if they are good people, are going to want to do their jobs in their own way.

“Mistakes will be made. But if a person is essentially right, the mistakes he or she makes are not as serious in the long run as the mistakes management will make if it undertakes to tell those in authority exactly how they must do their jobs.

“Management that is destructively critical when mistakes are made kills initiative. And it’s essential that we have many people with initiative if we are to continue to grow.”

Chris touched on this when he blogged about Failing Should be Easy and even Why don’t people like my ideas?!.  Art Fry was given the opportunity to fail.  When people didn’t like his idea, he proceeded to find a way to prove that his idea truly was great.

I’m proud that we here at Point2 allow people to fail; in fact, using Test Driven Development we ensure that everyone fails at first.

We give people time to explore and experiment, and everyone has some time for professional development.

Incidentally, Ken Schwaber’s early paper, “SCRUM Development Process” references heavily the work of Takeuchi and Nonaka and their description of a rugby organization style.

By: Kevin Bitinsky


Levels of Testing

I’ve had reason recently to do some thinking on the various “levels” of software testing. I think there’s a rough hierarchy here, but there’s some debate about the naming and terminology in some cases. The general principles are pretty well accepted, however, and I’d like to list them here and expound on what I think each level is all about.

An important concern in each of these levels is to achieve as high a level of automation as possible, along with some mechanism to report to the developers (or other stakeholders, as required) when tests are failing, in a way that doesn’t require them to go somewhere and look at something. I’m a big fan of flashing red lights and loud sirens, myself :)

Unit
Unit testing is one of the most common, and yet in many ways, misunderstood levels of test. I’ve got a separate rant/discussion in the works about TDD (and BDD), but suffice it to say that unit-level testing is a fundamental of test-driven development.

A unit test should test one class (at most – perhaps only part of a class). All other dependencies should be either mocked or stubbed out. If you are using Spring to autowire classes into your test, it’s definitely not a unit test – it’s at least a functional or integration test. There should be no databases or external storage involved – all of those are external and superfluous to a single class that you’re trying to verify is doing the right thing.
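To make “one class, everything else stubbed” concrete, here is a small sketch. The `ExchangeRates`, `PriceConverter` and `StubRates` names are made up for illustration; the shape is the point: the class under test receives its dependency through the constructor, and the test hands it a hand-rolled stub rather than anything that touches a database or a container.

```scala
// Hypothetical example: the domain and names are invented for illustration.
object UnitTestSketch {

  // The collaborator we do not want a unit test to touch for real (a database, say).
  trait ExchangeRates {
    def rate(from: String, to: String): BigDecimal
  }

  // The single class under test.
  final class PriceConverter(rates: ExchangeRates) {
    def convert(amount: BigDecimal, from: String, to: String): BigDecimal =
      if (from == to) amount else amount * rates.rate(from, to)
  }

  // A hand-rolled stub: a canned answer, no I/O, and a count of how often it was asked.
  final class StubRates(fixed: BigDecimal) extends ExchangeRates {
    var calls = 0
    def rate(from: String, to: String): BigDecimal = { calls += 1; fixed }
  }

  // The "test": exercise one class, with its only dependency stubbed out.
  def convertsUsingTheSuppliedRate(): Boolean = {
    val stub = new StubRates(BigDecimal("1.25"))
    val converter = new PriceConverter(stub)
    converter.convert(BigDecimal(100), "CAD", "USD") == BigDecimal(125) && stub.calls == 1
  }
}
```

In a real project the check would live in a specs or JUnit test rather than a Boolean-returning method; it is written this way only to keep the sketch free of library dependencies.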

Another reason to write comprehensive unit tests is that it’s the easiest place to fix a bug: there are fewer moving parts, and when a simple unit test breaks it should be entirely clear what’s wrong, what needs to be fixed and how to fix it.

As you go up the stack to more and more complex levels of testing, it becomes harder and harder to tell what broke and how to fix it.

Generally unit tests for any given module are executed as part of every build before a developer checks in code – sometimes this will also include some functional tests as well, but it’s generally a bad idea for any higher-level tests to be run before each and every check-in (due to the impact on developer cycle-time). Instead, you let your CI server handle that, often on a scheduled basis.

Functional
Some people suggest that functional and integration are not two separate types, but I’m separating them here. The key differentiation is that a functional test will likely span a number of classes in a single module, but not involve more than one executable unit. It likely will involve a few classes from within a single classpath space (e.g. from within a single jar or such). In the Java world (or other JVM-hosted languages), this means that a functional test is contained within a single VM instance.

This level might include tests that involve a database layer with an in-memory database, such as Hypersonic – but they don’t use an *external* service, like MySQL – that would be an integration test, which we explore next.

Generally in a functional test we are not concerned with the low-level sequence of method or function calls, like we might be in a unit test. Instead, we’re doing more “black box” testing at this level, making sure that when we pour in the right inputs we get the right outputs out, and that when we supply invalid input that an appropriate level of error handling occurs, again, all within a single executable chunk.
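As a rough illustration of that black-box style, the sketch below wires a few small classes together behind one entry point and checks only inputs against outputs, including the invalid-input path. Everything here (`Listing`, `QueryParser`, `ListingFilter`, `Search`, and the `beds>=N` query format) is invented for the example.

```scala
// Hypothetical example: a tiny "module" made of several collaborating pieces.
object FunctionalTestSketch {

  final case class Listing(id: Int, bedrooms: Int)

  object QueryParser {
    // Accepts queries of the form "beds>=N" (N up to three digits); rejects anything else.
    def parse(query: String): Either[String, Int] = {
      val trimmed = query.trim
      val prefix = "beds>="
      val digits = trimmed.stripPrefix(prefix)
      val isNumber =
        digits.nonEmpty && digits.length <= 3 && digits.forall(c => c >= '0' && c <= '9')
      if (trimmed.startsWith(prefix) && isNumber) Right(digits.toInt)
      else Left("unsupported query: " + query)
    }
  }

  object ListingFilter {
    def atLeast(bedrooms: Int, all: List[Listing]): List[Listing] =
      all.filter(_.bedrooms >= bedrooms)
  }

  // The module's public entry point: the only thing a functional test talks to.
  object Search {
    def run(query: String, all: List[Listing]): Either[String, List[Int]] =
      QueryParser.parse(query).map(n => ListingFilter.atLeast(n, all).map(_.id))
  }
}
```

A functional test here would call only `Search.run`, once with a good query and once with a bad one, and would not care which internal class did the work.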

Integration
As soon as you have a test that requires more than one executable to be running in order to test, it’s an integration test of some sort. This includes all tests that verify API contracts between REST or SOAP services, for instance, or anything that talks to an out-of-process database (as then you’re testing the integration between your app and the database server).

Ideally, this level of test should verify *just* the integration, not repeat the functionality of the unit tests exhaustively, otherwise they are redundant and not DRY.

In other words, you should be checking that the one service connects to the other, makes valid requests and gets valid responses, not comprehensively testing the content of the request or response – that’s what the unit and functional tests are for.

An example of an integration test is one where you fire up a copy of your application with an actual database engine and verify that the operation of your persistence layer is as expected, or where you start the client and server of your REST service and ensure that they exchange messages the way you wanted.

Acceptance
Acceptance tests often take the same form as a functional or integration test, but the author and audience are usually different: in this case an acceptance test should be authored by the story originator (the customer proxy, sometimes a business analyst), and should represent a narrative sequence of exercising various application functionality.

They are again not exhaustive in the way that unit tests attempt to be in that they don’t necessarily need to exercise all of the code, just the code required to support the narrative defined by a series of stories.

Fitnesse, Specs, Easyb, RSpec and Green Pepper are all tools designed to assist with this kind of testing.

Concurrency
If your application or service is designed to be used by more than one client or user, then it should be tested for concurrency. This is a test that simulates simultaneous concurrent load over a short period of time, and ensures that the replies from the service remain successful under that load.

For a concurrency test, we might verify just that the response contains some valid information, and not an error, as opposed to validating every element of the response as being correct (as this would again be an overlap with other layers of testing, and hence be redundant).
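A bare-bones version of such a check might look like the sketch below. `CounterService` is a stand-in I’ve invented so the example runs in one process; a real concurrency test would fire the same kind of simultaneous requests at the running service. Note that it only counts replies that are not errors, in line with the point above.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical example: the service is a stand-in for whatever is really under test.
object ConcurrencyTestSketch {

  final class CounterService {
    private val hits = new AtomicInteger(0)

    def handle(request: String): Either[String, String] =
      if (request.isEmpty) Left("empty request")
      else Right("ok #" + hits.incrementAndGet())

    def totalHits: Int = hits.get()
  }

  // Fire `clients` requests at once and report how many came back without an error.
  def successfulReplies(service: CounterService, clients: Int): Int = {
    val replies = Future.sequence(
      (1 to clients).map(i => Future(service.handle("request-" + i)))
    )
    Await.result(replies, 10.seconds).count(_.isRight)
  }
}
```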

Performance
Performance, not to be confused with load and scalability, is a timing-based test. This is where you load your application with requests (either concurrent or sequential, depending on its intended purpose), and ensure that the requests receive a response within a specified time frame (often for interactive apps a rule is the “two second rule”, as it’s thought that users will tolerate a delay up to that level).
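In its simplest form, a timing-based check is just a stopwatch around each request and a comparison against the budget. The helper names below are my own, and the sketch assumes sequential requests; it is meant to show the shape of the assertion, not a full measurement harness.

```scala
// Hypothetical helpers for a timing-based check.
object PerformanceTestSketch {

  // Runs `request` `times` times and returns the slowest observed duration in milliseconds.
  def slowestMillis(times: Int)(request: () => Unit): Long = {
    require(times > 0, "need at least one request to time")
    (1 to times).map { _ =>
      val start = System.nanoTime()
      request()
      (System.nanoTime() - start) / 1000000L
    }.max
  }

  // True when every request finished within the budget.
  def withinBudget(times: Int, budgetMillis: Long)(request: () => Unit): Boolean =
    slowestMillis(times)(request) <= budgetMillis
}
```

With the “two second rule” the call would be something like `withinBudget(20, 2000L)(() => callTheApp())`, where `callTheApp` is whatever issues one request.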

It’s important that performance tests be run singly and on an isolated system under a known load, or you will never get consistency from them.

Performance can be measured at various levels, but is most commonly checked at the integration or functional levels.

Load/Scalability
Load and/or scalability tests are a close relative of concurrency tests, but not identical to them. This is where you (automatically) pound on the app under a simulated user (or client) load, ideally more than it will experience in production, and make sure that it does not break. At this point you’re not concerned with how slow it goes, only that it doesn’t break – e.g. that you *can* scale, not that you can scale linearly or on any other performance curve.

Quality Assurance
Many Agile and Lean teams eschew a formal quality assurance group, and the testing such a group does, in favor of the concept of “built in” QA. Quality assurance, however, goes far beyond determining if the software performs as expected. I have a detailed post in the works that talks about how else we can measure the quality of the software we produce, as it’s a topic unto itself.

Alpha/Beta deployments
Not strictly testing at all, the deployment of alpha or beta versions of an application nonetheless relates to testing, even though it is far less formalized and rigorous than mechanized testing.

This is a good place to collect more subjective measures such as usability and perceived responsiveness.

Manual Tests
The bane of every agile project, manual tests should be avoided like the undying plague, IMO. Even the most obscure user interface has an automated tool for scripting the testing of the actual user experience – if nothing else, you should be recording any manual tests with such a tool, so that when it’s done you can “replay” the test without further manual interaction.

At each level of testing here, remember, have confidence in your tests and keep it DRY. Don’t test the same thing over and over again on purpose, let the natural overlap between the layers catch problems at the appropriate level, and when you find a problem, drive the test for it down as low as possible on this stack – ideally right back to the unit test level.

If all of your unit tests were perfect and passing, you’d never see a failing test at any of the other levels, theoretically. I’ve never seen that kind of testing nirvana achieved entirely, but I’ve seen projects come close – and those were projects with a defect rate so low it was considered practically unattainable by other teams, yet it was done at a cost that was entirely reasonable.

By: Mike Nash


Mocks, Stubs and Spies! Oh my!

I’ve been pretty involved in helping out new developers at Point2.  I try to ease them in to our agile, Scrum/XP environment easily, but there are usually a few roadblocks.  So far, the most troublesome obstacle has been the use of Test Doubles.  To try and mitigate this a bit I’ve written a 3-part series on this topic.

Part I is on stubs.

Part II is about mocks.

Part III covers spies.

Enjoy!

By: Kevin Baribeau


Focused Practice

We all want to be better at what we do, right? But how do we go about improving our skills? I think the answer is the same, whether you’re an aspiring software developer, musician, martial artist, or whatever. The key is focused practice. You need to put time into developing your skill. You can’t just put time in either. It has to be focused time. This is time where you’re consciously thinking about your craft, critically analyzing your work, and looking for ways to improve it.

Some of this is possible during your day-to-day life; but in my experience, you always get better results by setting aside a special block of time to work on a skill.

Code Katas

The best way I’ve found to apply this to software is through code katas. A code kata is a problem simple enough that developers of any skill level should be able to solve it. A kata is also small enough to solve in a reasonable period of time. My learning process currently goes something like this.

  1. Pick a skill I want to improve at.
  2. Pick a kata to solve, and a language to solve it in.
  3. Solve the kata while focusing on how to apply that skill to the kata.

For example, let’s say I wanted to work on my TDD-fu. I would pick a kata that I knew reasonably well; in my case that’s the bowling problem. I would pick a language that I knew well and that had a good testing framework; that would be Java and JUnit. I would then implement a solution while thinking about things like…

  • How much code am I writing per test? Should I be writing more? Less?
  • Is my test naming clear?
  • Am I remembering to think about refactoring every time I see a green bar?
  • Am I aware enough of what’s going on to know when I’ve made a mistake and chosen a bad test?
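For anyone who hasn’t met the bowling problem, here is roughly where one pass through the kata might end up. This is a sketch in Scala rather than the Java and JUnit mentioned above, and it assumes the usual ten-pin rules and a list of rolls as input; the interesting part of the kata is the sequence of small tests that leads here, not the final code.

```scala
// One possible end state of the bowling kata (a sketch, not a reference solution).
object BowlingKataSketch {

  // Scores a game given the pins knocked down on each roll, in order.
  def score(rolls: List[Int]): Int = {
    def frames(remaining: List[Int], frame: Int): Int =
      if (frame > 10) 0
      else remaining match {
        // Strike: ten plus the next two rolls; the frame uses one roll.
        case 10 :: rest =>
          10 + rest.take(2).sum + frames(rest, frame + 1)
        // Spare: ten plus the next roll; the frame uses two rolls.
        case a :: b :: rest if a + b == 10 =>
          10 + rest.take(1).sum + frames(rest, frame + 1)
        // Open frame: just the pins knocked down.
        case a :: b :: rest =>
          a + b + frames(rest, frame + 1)
        // Incomplete input: count what is there and stop.
        case leftover =>
          leftover.sum
      }
    frames(rolls, 1)
  }
}
```

A typical test order is gutter game, all ones, one spare, one strike, then the perfect game, with each test small enough to pass in a minute or two.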

You can use code katas to improve your skills with just about any technique related to software development. Just to name a few, they work well for improving your pairing practices, new languages, writing good OO code, and learning new tools. You just have to pick a focus.

A Coding Dojo

There have been lots of success stories recently about coding dojos. A dojo gives us a social context in which to work on our katas with other like-minded people. It’s a great mechanism to get feedback on your work, but the downside I’ve found is that a lot of people are intimidated by this. I think the best way to deal with it is just to work on a kata in your own time until you’re comfortable with it, and then bite the bullet and seek some feedback.

You can find more information about code katas and dojos at http://codingdojo.org. See the kata catalogue if you just want to get started. There is another list of problems I’ve found useful here.

By: Kevin Baribeau


Hard Problems


I ran into a hard problem today at work. Here it is:

Given two strings, check that the first does not contain a sequence of 3 consecutive letters that are also contained in the second.
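For reference, one reading of the problem can be sketched in a few lines of Scala. This assumes that “contained in the second” means the same three characters appear consecutively in the second string, and that the comparison is case-sensitive; neither assumption comes from the original requirement, and the names are mine.

```scala
// A sketch under the assumptions stated above, not the solution from the pairing session.
object SharedTripletSketch {

  // True when some run of three consecutive characters in `first` also appears in `second`.
  def sharesTriplet(first: String, second: String): Boolean =
    first.sliding(3)
      .filter(_.length == 3) // sliding yields one short window when `first` has fewer than 3 chars
      .exists(window => second.contains(window))

  // The check from the problem statement: `first` must NOT share such a run with `second`.
  def isAcceptable(first: String, second: String): Boolean =
    !sharesTriplet(first, second)
}
```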

Look easy? It should. I don’t know of very many easier problems. After implementing it though, I can’t truthfully say that it was easy for me. But, I don’t think the lessons I learned today are easy either. As far as I can tell, I made three mistakes:

Lesson #1: Be careful when picking your test cases

It’s easy to pick a test case that’s either too hard, or too easy to implement. If you pick a test case that’s too easy, you and your pair risk getting bored. Oddly enough, this is almost never a problem unless you’re actively trying to avoid picking a test case that’s too hard.

So, if you’re bored, then your tests are too easy… how do you tell when your tests are too hard? Well, that’s another hard problem. I don’t think I’ve nailed this down completely yet, but here are some clues I’ve learned to spot:

  • It takes more than one try to get your newest test to pass
  • When you do get your test to pass, another test fails
  • Your pair doesn’t understand what you just did
  • You find yourself wanting to refactor against a red bar

My advice when you run into any of these indicators is to stop. Stop writing code. Revert to a green bar. Now you have two options: Refactor, to make things easier; or pick a different test.

Sound easy? Try it. There are two reasons why it’s not.

First, it’s hard to admit when you’ve picked a test case that’s too hard to implement. We’re programmers (craftsmen if you will), it’s our job to solve hard problems. We take pride in our ability to do so. We want to make that next test case work. Taking a step backward and reverting the code you just wrote (that doesn’t work) is an admission that you’ve made a mistake. Admitting this mistake is especially hard to do when you’re working on an “easy” problem.

Second, both of these options require you to THINK. It’s tempting to think that if you tweak a conditional here or extract a method there you’ll see a green bar soon. This rarely turns out to be the case. Even the easiest problem is going to require you to stop and think about the solution; probably more than you expected to.

If you run into this, Stop. Think about the problem. Discuss it with your pair. Then, take another crack at it.

Lesson #2: Commit Often

Today, my pair and I ran into all of the clues listed above, and failed to stop. It was at this point that we got burned by lesson #2. We realized what was going on, and wanted to revert to our last green bar. We hadn’t committed in 90 minutes. We were stuck with broken code. Oops.

I don’t have a strong opinion on when you should commit. Ideally, I would say you should commit on every green bar, or after every refactor step; whichever is easiest to remember for you. Of course, some teams are cursed with long running unit tests. Since you don’t want to commit without running your tests; this makes it difficult to commit so frequently. Do what you can to keep your tests running quickly, but in the meantime, find some balance between committing often and not letting your tests slow you down.

Lesson #3: Focus on your work — No Distractions

Today, during the “embarrassing pairing session”, I had an interesting email thread zipping through my phone, which dutifully beeped at me every few minutes. Every once in a while I’d try to keep up with it, meanwhile completely losing my focus on the problem and relying on my pair to get me back up to speed when I was done.

Please, be wary of checking your email or other distractions when pairing on a problem. If you’re not at the keyboard, your job is (among other things) to help figure out the next test, watch for warning signs like the ones I listed above, and maintain code quality. You probably can’t (I know I can’t) do any of these things while checking your email, or carrying on a casual conversation with a friend. Also, respect your pair. If they’re working, you should be too. They’re going to resent you if you don’t pull your weight.

By: Kevin Baribeau


Software Quality

A number of the sessions I’ve attended here at SD West 2009 cover the same theme: software quality. There are a number of practices that can ensure quality; most of them involve the same thing – feedback.

  1. Requirements analysts can build prototypes to get rapid feedback from usability tests before building any code – time frame: hours or days
  2. Developers can write a unit test before each bit of production code to get quick feedback – time frame: minutes
  3. Business analysts can pair with QA roles to develop acceptance criteria / acceptance tests for each user story – time frame: hours or days
  4. Developers can run continuous integration test suites to get feedback on whether they broke existing functionality – time frame: seconds or minutes
  5. Executives and product owners can communicate their vision for the product or feature, so the team can work toward common goals. Team members then can constantly validate what they are working on, and make design choices based on shared understanding of priorities – time frame: continuous
  6. QA teams can do exploratory testing on working code to find subtler bugs, usability problems, spelling and grammar mistakes – time frame: hours or days
  7. Developers can peer review each other’s code to find code smells and spot potential bugs early – time frame: hours
  8. Developers in some languages can use strong typing systems to their advantage, strongly typing parameters and return values to validate input and output – time frame: instant
  9. Developers can use static analysis tools that help find places where your code is likely to do something other than what you intended. Some of these tools are shipped with your compiler or IDE, but you can always get more detailed ones, and you can write your own to enforce local coding standards. These tools can be part of your continuous integration – time frame: seconds or minutes
  10. Agile teams can estimate task sizes to identify risks as early as possible – time frame: 1 hour
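To illustrate item 8, here is a small sketch of letting the type system carry a validation result. The `Bedrooms` type, its range and the `describe` function are invented for the example; the idea is simply that a value of the type cannot exist unless it passed the check, so downstream code gets its feedback from the compiler.

```scala
// Hypothetical example: the type and its valid range are made up for illustration.
object TypedInputSketch {

  // A bedroom count that can only exist if it passed validation.
  final class Bedrooms private (val value: Int) {
    override def toString: String = s"Bedrooms($value)"
  }

  object Bedrooms {
    // The only way to obtain a Bedrooms value; rejects out-of-range input.
    def from(raw: Int): Either[String, Bedrooms] =
      if (raw >= 0 && raw <= 50) Right(new Bedrooms(raw))
      else Left("bedrooms out of range: " + raw)
  }

  // Because the parameter is typed, this function never has to re-check the range,
  // and passing a raw Int by mistake is a compile error rather than a runtime bug.
  def describe(bedrooms: Bedrooms): String =
    if (bedrooms.value == 0) "studio" else s"${bedrooms.value} bedroom(s)"
}
```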

One thing that was emphasized over and over was that developer and tester are two very different roles. Developers want to write code that works; testers want to break it. Developers will always need that foil, someone trying to break their code with techniques just as creative as the ones developers used to write it in the first place.

TDD is a method of development and code design, not a method of testing. People often get this mixed up because of the name. Even if TDD reduces your bug count by 95%, your QA team will have plenty of work to fill their time, doing what should have been their role in the first place – exploratory testing: testing the unknown-unknowns (everyone who said this, first felt the need to apologize for the Rumsfeldian phrase – you know they must really mean what they are saying, because nobody’s going around quoting Donald Rumsfeld just for the fun of it).

By: Todd Sturgeon
