Thursday, 24 November 2011


Recruitment By Example

I find it hard to recruit great testers...

Really hard.


Finding appropriate candidates with an enthusiasm for the job, combined with the aptitude and skills to excel in my organisation, proves extremely difficult. One of the greatest challenges for me is working with recruitment consultants and getting them to understand my needs in testing candidates. Working in a data storage/database environment means that certain skills are particularly valuable to us, SQL/database knowledge and Unix operating system experience with shell scripting being the primary ones.

Historically the biggest drain on my time when actively recruiting has been the time taken to read CVs and attend telephone interviews in order to filter out the candidates who list interesting skills on their CVs yet whose experience proves to be very limited. This problem can even extend to the candidate having merely tested systems running on such technologies without ever having interfaced with them directly.

I recently met a very enthusiastic recruitment consultant who had taken the time to attend a Tester Gathering in Bristol. Being in the market for a new supplier, I decided to give them a try, and with a new relationship came the chance to take a slightly unconventional approach to getting them to understand our testing requirements.

Specification By Example Applied

I'm a great believer in the power of examples to help drive our software development. Using examples helps to provide a shared understanding between stakeholders with potentially different domain languages and perspectives. I wondered whether a similar approach might help in our recruitment process, bridging the gap between the recruiter's understanding of my candidate requirements and my own.

I selected 3 CVs that I had received in the last 2 years from potential candidates:-
  • The first listed Unix and SQL knowledge, but offered no evidence of direct experience to back up this claim beyond a list of technologies used in the implementation of the projects they had worked on
  • The second was from a candidate who appeared to have the relevant skills and experience but, in a short phone interview, had not been able to back this up with any demonstration of understanding
  • The third was a CV which showed clear examples of projects in which she had directly interfaced with the relevant technologies and the role that she had performed on each project. In addition she had provided examples of improvements she had introduced into her working environment to strengthen the testing effort, and of steps she had personally taken to develop her testing knowledge
The third CV was from one of the testers already in our team.

I invited the new recruitment suppliers into my office and spent about an hour working through each CV in turn. I highlighted the key points in CVs one and two that set off alarm bells for me. These included long lists of technologies that made up the solution tested rather than ones the candidate had actually worked with; a focus on certification and bug metrics; and no evidence of a drive to self-educate or improve their work. I then went through the things about the third CV that marked its author out as an exceptional candidate. I suspect this was something of a tiresome process for the consultants, however they admitted at the end that they had found the session very useful, as they now had a much clearer picture of the candidates I was looking for. I also made it clear that I would rather have one good tester put forward for the job than ten poor ones.

There is always a risk with trying something different, and I wondered if taking this approach might not endear me to my new supplier. Actually it was remarkably effective. Instead of twenty inappropriate CVs to filter through in the first few weeks, I received one, and a good one at that. Since then I have not received what I would describe as a poor CV from that agency, and soon after I recruited a fantastic candidate that they provided. Obviously I cannot prove whether the agency would have delivered such a good service without the unconventional start, but I've certainly not had such a great start with any other agency. If you are going through the pain of trying to recruit good testing candidates, I recommend getting your agents in and working through some examples of the CVs that you are, and are not, looking for, to drive your recruitment specification.

Thanks to Rob Lambert for prompting this post. Image http://www.flickr.com/photos/desiitaly/2105224119

The Thread of an idea - Adopting a Thread Based approach to Exploratory Testing

In all of our testing activities my approach is very much to treat our current practices as a point in a process of evolution. Here I write about an example of how we evolved our testing approach over time, moving away from the idea of test cases to a more flexible thread-based strategy better suited to the way we test and to our need to parallelise our testing activities.

Not the tool for the job

When I started in my current role the team was attempting to manage their testing in an open source test case management tool. The process around tool use was poorly defined and rooted very much in the principles of a staged test approach: planning test cases up front and then executing these repeatedly across subsequent builds of the software. This was a hugely unwieldy approach given that we were attempting to work with continuous integration builds. The tool was clearly not suitable for the way that we were operating, with some testers managing their work outside the tool and bulk-updating it at a later point.

Needless to say as soon as I was able I moved to replace this system.

Deferring Commitment

I had enjoyed some success in previous teams with a lightweight spreadsheet solution backed by a database. I thought that this would be an excellent replacement, so I worked to introduce a variation on it into the team. To address the issue of not having a fixed set of test cases to execute, David Evans made the good suggestion that using an estimate of test cases might alleviate the immediate problem by allowing us to track based on estimated work. This allowed us to defer commitments on the actual tests performed until such time as we were actually working on the feature, with the estimates naturally converging with the actual tests executed as the feature progressed. This did provide a significant improvement, however I still found that tracking these estimates and counts was not giving me any useful information in terms of understanding the status and progress of testing during the sprint.

You're off the Case

The conclusion that we were rapidly arriving at was that counting test cases provided us with little or no valuable information. Many of our tests resulted in the generation of graphs and performance charts for review. The focus of our testing revolved around providing sufficient information on the system to the product owners to allow them to make decisions on priorities. Often these decisions involved a careful trade-off between performance on small data and scalability over larger data; there was no concept of a pass or fail with these activities. Counts of test cases held little value in this regard, as test cases as artifacts convey no useful information to help understand the behaviour of the application under test or to visualise any relative performance differences between alternative solutions. It was more important to us to be able to convey information on the impact of changes introduced than to support meaningless measurements of progress through the aggregation of abstract entities.

In an attempt to find a more appropriate method of managing our testing activities, we trialled the use of Session Based Exploratory Testing, an approach arising from the Context Driven School of Software Testing and championed in particular by James Bach and Michael Bolton. What we found was that, for our context, this approach also posed some challenges. Many of our testing investigations involved setting up and executing long-running batch import or query processes and then gathering and analysing the resulting data at the end. This meant that each testing activity could have long periods of 'down time' from the tester's perspective, which did not fit well with the idea of intensive time-boxed testing sessions. Our testers naturally parallelised their testing activity around these down times to retest bugs and work on tasks such as automation maintenance.

The thread of an idea

Whilst I did not want to give up on the idea of a charter-driven exploratory testing approach, it was clear that we needed to tailor it somewhat to suit our operating context and the software under test. My thinking was that the approach could work for us, but we needed to support the testers working on parallel investigation streams at the same time. Wondering if anyone else had hit upon similar ideas of a "Thread-Based" approach, I did some research and found this page on James Bach's blog.

I am a great believer in tailoring your management to suit the way that your team want to work rather than the other way around, and this approach seemed to fit perfectly. Over subsequent weeks we altered our test documentation to work not in terms of cases but in terms of testing threads. As discussed here, we drive to identify Acceptance Criteria, Assumptions and Risks during story elaboration, which help to determine the scope of the testing activity. In a workbook for each user story we document these and then publish and agree them with the product management team. (BTW - the Subversion Apache module running alongside a Wiki is an excellent way of sharing documents stored in Subversion; a rough sketch follows below.)
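For anyone unfamiliar with that setup, here is a minimal sketch of the sort of configuration I mean. The repository path, URL and config file location are illustrative assumptions rather than our actual setup, and the mod_dav_svn module is assumed to be installed and loaded.

    # Create a repository to hold the testing workbooks (path is illustrative)
    svnadmin create /var/svn/testdocs

    # In the Apache config (e.g. /etc/httpd/conf.d/subversion.conf), expose the
    # repository over HTTP via mod_dav_svn so the Wiki can link straight to the
    # latest revision of each workbook:
    #
    #   <Location /svn/testdocs>
    #       DAV svn
    #       SVNPath /var/svn/testdocs
    #       # authentication directives omitted for brevity
    #   </Location>

    # Reload Apache to pick up the new location
    apachectl graceful

Workbooks committed to the repository can then be linked from the Wiki by their repository URLs, so everyone always sees the latest agreed version.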

The initial thread charters are then drawn up with the aim of providing understanding of, and confidence in, the acceptance criteria and risks identified.

Initially each testing thread has:

  • A charter describing scope - we use Elisabeth Hendrickson's excellent guide on writing charters to help us
  • An estimate of time required to complete the charter to the level of confidence agreed for acceptance of the story. This estimate covers any data generation and setup required to prepare, and also the generation of automated regression tests to cover that charter on an ongoing basis, if this is appropriate.

We still use Excel as the primary store for the testing charters. This allows me to backend onto a MySQL database for storing charters and creating dashboards which can be used to review work in progress.
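As a rough illustration of the kind of dashboard query this enables - the database, table and column names below are hypothetical, not our actual schema - a per-story summary of charter progress can be pulled straight from the command line:

    # Summarise charter progress per user story for a quick dashboard view.
    # Database, table and column names are illustrative only.
    mysql -u testdash -p charters -e "
        SELECT story_id,
               COUNT(*)            AS charters,
               SUM(estimate_hours) AS estimated_hours,
               SUM(CASE WHEN status = 'complete' THEN 1 ELSE 0 END) AS complete
        FROM   thread_charter
        GROUP  BY story_id;"

A query along these lines, refreshed into an Excel sheet, is enough to feed the review dashboards mentioned above.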



As the investigation on the threads progresses, information is generated to fill out the testing story:-

Given that we are now tracking threads rather than cases, we have greater flexibility in the format of our testing documentation - details of the testing activities performed are tracked in Mind Maps or Excel sheets.
  • Mind Maps are generated using XMind and hyperlinked from the Thread entry in the main sheet. For details on using Mind Maps to track testing activities I highly recommend Darren McMillan's blog. There is little more that I can add to his excellent work.
  • In terms of the Excel based exploration, testers can document their decisions and activities in additional entries under the Charter in the main testing sheet, using a simple macro input dialog for fast input on new testing activity as here:-
  • More detailed results, data tables and graphs can then be included in additional worksheets.
  • Details of any automated test deliverables generated as a result of that charter are also reported at the thread level.

Reeling in the Benefits

From a Test Management perspective I have found significant benefits working in this way
  • Variety of testing output is supported and encouraged - rather than gathering useful information and then stripping this down to standardise the results into a test case management structure, we can provide the outputs from the testing in the most appropriate form for understanding the behaviour and feeding the relevant decisions
  • Managing at charter level allows me to understand the activities that the testers are doing without micro-managing their every activity. We have regular charter reviews where we discuss as a team the approach to each feature and pool ideas on other areas that need considering
  • Estimate of remaining effort is maintained - this can be fed directly into a burndown chart. As an additional advantage I've found that our estimates of work outstanding are much more realistic now that they correlate with specific testing charters than they were under my previous method of estimating based on the work remaining on the story.

In the few months that we've been working with this approach it has slotted in extremely well with our testing activities. As well as making it easier for me to visualise progress on the testing activities, it has freed up the team to record their investigations in the form that they feel is most appropriate for the task at hand. The process of team review has improved team communication and really helped us to share our testing knowledge, driving our charter reviews using our shared test heuristics in a team based review session.

As with all things, the most appropriate approach to any problem is subject to context. If your context involves similarly long setup and test run times, or sufficient distractions to prevent the use of a session-based approach, then this may be a possible alternative to progress away from the restrictions of a test case based strategy.

Image: http://en.wikipedia.org/wiki/File:Spool_of_white_thread.jpg

Eurostar Talk - An Evolution Into Specification By Example

The slides for my talk at EuroSTAR on "An Evolution into Specification By Example" should be available on the EuroSTAR site.

Further Reading from Me

Regarding some of the points raised there is some further reading that may be of interest:-

On Reporting Confidence and Collecting Criteria, Assumptions and Risks

On some of the benefits of Writing your own test harness using a set of principles for test automation

Other Relevant Reading

Some external relevant links:-

Gojko Adzic's Book on Specification By Example and the summary points here

James Bach on Thread based Test Management and the Case against Test Cases

Some posts from Michael Bolton on potential problems with reporting "Done": "The Undefinition of Done" and "The Relative Rule and the Unsettling Rule"


For anyone that has attended the talk and wants to comment or ask any questions please comment on this post and I'll be happy to discuss anything with you further.

Friday, 18 November 2011

Birmingham STC Meetup

On Tuesday I attended the STC Meetup in Birmingham, the highlight of which was a talk by James Bach on transitioning to Context Driven Testing from other schools. Videos of all the evening's talks, including my own lightning talk on where testers can learn about monitoring software's interaction with its environment, are available on Simon Knight's website.

Friday, 11 November 2011


Sleepwalking into failure

"I feel happier when I have come to the same conclusion as experts in my field independently by making my own mistakes"
This was a statement that I posted on Twitter last week, and it drew responses from a few people, including a very witty one from citizencrane
"Don't skydive"
and a response from Lisa Crispin
"I'd just as soon not make the mistakes, but I guess it is a better learning experience!"
Whilst I agree that it is instinctively better to avoid making mistakes, I think that allowing ourselves the freedom to try new things and make mistakes is essential to learning. In the week after my comment I saw some interesting tweets on the same subject:-

this one from testerswain
For me, making mistakes while #testing is a great learning opportunity and more benefitial than gathering knowledge from books
and this one from Morgsterious
"Learning new ways is not a matter of being told, but one of risking and discovering in a loving, trusting context." - Satir
So there are clearly other folks out there who feel the same way. Ironically though, the mistakes that I was referring to in my post were not those that arise from trying new things, but those that arise from specifically not doing so...

A questioning approach


Rather than simply accepting the contents of textbooks and certification programs on how testing should look, I strive to question the validity of everything that I do. I try new ideas in place of existing practices that I think are founded on invalid principles. Trying new methods invariably introduces the risk of failure. I've tried new things and failed, but in every case I have learned valuable lessons. The failures that teach me little, and I therefore regret the most, are the ones that arise through specifically not questioning what I do. These were the failures I was referring to in my original quote, the failures of adopting an approach because of accepted wisdom rather than validity and appropriateness for the context. The failures of sleepwalking through false rituals and meaningless metrics. Consider the following scenarios:-
  1. I read an article by a respected tester highlighting the invalidity of an approach that I am still using, with no thought on my part for how valid it is. In this case I would naturally question my own approach and consider whether I could make any changes to improve my own testing based on what I have learned. I would also feel slightly embarrassed and potentially lose confidence in my own abilities and worth.
  2. I read an article by a respected tester highlighting the invalidity of an approach that I have myself already questioned and changed. Here I'd feel great. My confidence in my own understanding of my profession would have received a huge boost thanks to my own critical thinking being backed up by someone I have a great deal of respect for.
Both of these have happened to me, and I know which makes me happier. For example, when I read Michael Bolton questioning the concept of "Done" in this post and here, I felt justified in my own approach and in writing about my own issues with this concept.

Evolving the profession


I believe that testing as a profession is constantly evolving. The testers that I hold in high regard are not those that espouse "best practices" and rigid models for success. Instead the individuals that I most respect are those that question the status quo and constantly look to improve testing as a profession (see the blog list on the right for a starting point).

While respecting the thought leaders in my field, I note that many of the most vocal are offering consultancy services and reference books (Lisa Crispin, mentioned above, is a notable exception: whilst being a successful author she has also worked as a tester in the same agile team for many years). I believe that the art of improving testing should not be the preserve of the consultants and authors. Even with the most principled of individuals, someone trying to differentiate their own service in a market will inevitably have a different bias from a tester in permanent employment striving to ensure the medium to long term applicability of the methods they adopt. I believe that as a professional I have a responsibility to contribute to the same questioning process and to try to improve my craft through my learning on continuous product development, which is necessarily different from contract engagements. This recent rallying call from James Bach for all testers to spend time thinking, writing and coming up with new testing ideas applies equally to testers in all types of testing situation. By adopting a questioning approach and writing about my own experiences, both successes and failures, I feel that I am at least making the effort to contribute to the testing body of knowledge, and not just sitting back and relying on the excellent individuals that I refer to to question my profession for me.

The alternative, where I neither question myself, nor read any material from any other respected testers to improve my own testing, well that really isn't an option that I care to consider.

Tuesday, 25 October 2011


A Problem Halved - using a context specific cheat sheet to share product test heuristics

A term that is often used when highlighting the skilled nature of the job of software testing, particularly exploratory testing, is "testing heuristics". Heuristics are experience-based rules of thumb that we use selectively to guide our approach to a problem, such as testing software features. Over the course of a tester's career (which, as I discuss here, can be viewed as its own minor evolution) they will evolve a set of personal heuristics. We can apply these selectively in the appropriate contexts to increase our chances of quickly exposing the potential or actual problems with a software solution. The possession of an excellent set of heuristics can make a huge difference in the effectiveness of the testing performed within a given timeframe.

Every solution is different


Over the course of time working on testing a product, a tester will develop not only their general testing knowledge but also an excellent set of testing heuristics that relate solely to the context of that product. This knowledge is non-portable and valuable only within its present testing context, however within that context these heuristics can be the most valuable resources for targeted testing of the product in question. Some examples from my current context:-
  • There is an internal boundary within our NUMERIC libraries between NUMERICs of precision 18 and 19 which adds extra boundary cases to any NUMERIC type testing
  • Some validation on data imports is client side and some is server side, error handling and server logging testing needs to consider both types
  • The query system follows a different execution path when operating on few (<10) data partitions so testing needs to consider both paths
  • For our record level expiry, different paths are followed depending on whether some records in a partition, all records in a partition or all records in an import are expired
The knowledge of these facts adds a richness to any testing beyond that which might be obvious from the external, documented behaviour of the product.
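To show how a heuristic like the first one above translates into concrete test inputs, here is a minimal sketch of generating values either side of an 18/19 digit precision boundary; the load_data command at the end is a hypothetical stand-in for whatever import tool is in use.

    # Generate NUMERIC values straddling the internal precision-18/19 boundary
    # described above, ready to be loaded as test data.
    {
        echo "999999999999999999"       # 18 digits - just below the boundary
        echo "1000000000000000000"      # 19 digits - just above the boundary
        echo "-999999999999999999"      # negative counterparts
        echo "-1000000000000000000"
    } > numeric_boundary.csv

    # load_data is a hypothetical stand-in for the product's import tool
    # load_data --table numeric_test --file numeric_boundary.csv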

(Shared) knowledge is power


For some, their heuristics may be closely guarded secrets, the knowledge that elevates them above the competition or renders them invaluable within the organisation. My opinion, however, is that our goal is to produce the best software that we can as a team, and the sharing of knowledge that has arisen through our combined experience is essential.

Having read Elisabeth Hendrickson, James Lyndsay and Dale Emery's excellent Test Heuristics Cheat Sheet, I found this to be a great method of communicating simple high level heuristics. I felt that a context specific heuristics cheat sheet would be an excellent tool for sharing our internal testing knowledge and for use as a guide/checklist in exploratory testing of our product features. Using a similar layout to the testobsessed cheat sheet, I created a simple framework on our Wiki for our own context specific version, for my team to use to document our internal heuristics. We break down by product area and then by individual functions or operations. These are then annotated with key-word points that might need to be considered when testing that aspect of the product. If a concept merits further explanation then we link to a separate "glossary" where terms can be expanded upon, to avoid cluttering up the sheet. This helps to keep the sheet a lightweight reference tool rather than anything more involved.

The page is maintained by the team, with each member updating the list if they encounter any behaviour that might be useful to others testing that feature in the future. For example, an entry might look like:-

Record Level Operations
  • Record Expiry - Expire Subset of Records from Tree ; Expire All in Tree ; Expire All in Import ; Record Level Delete ; Records On Legal Hold ; Purged Records
We've been working with this tool for a few months now and have found it to be very useful. We reference the heuristics in exploratory testing, in elaboration meetings and also charter reviews of story coverage as a mental checklist and to prompt new ideas. If new problems are discussed in reviews then we suggest adding them to the sheet so that it is maintained by the team and for the team. (We have not had to maintain the sheet for long enough to drop features or consider versioning which may cause minor headaches). Of course no checklist or cheat sheet should be considered exhaustive and we have to apply our own personal intelligence to every testing challenge, however, as a simple tool for sharing testing experiences and driving new ones, I can recommend it.
image : http://de.wikipedia.org/wiki/Benutzer:KMJ

Monday, 3 October 2011


"You were supposed to draw him standing up" - testing preconceptions in design

Last weekend, whilst I was supposedly looking after my two eldest children and actually sneakily checking my email, my attention was drawn by a tweet linking to a testing challenge on the Testing Planet website.

The focus of the challenge was a great website called draw a stickman, a fun interactive site that invites the visitor to draw a stickman and then proceeds to engage this man in a series of activities requiring further graphical contributions from the visitor to complete the story.

The title of the challenge, "Play with this, I mean test it casually", indicated to me that the focus of the challenge should be around the user interactions with the application rather than any deep exploration of the html content or structure. Some comments had already been posted around what happens if you draw just a line or a blob. While all valid issues, most of them were based around interacting outside the instructions of the application. It is true that many bugs can arise when users operate in contravention of the documentation and instructions, but in my experience these can often be resolved by referring the user to the instruction that has been missed and possibly amending the details to make them clearer.

Scope for interpretation

Instead I tried to focus on working within the instructions presented, but looking for scope to interpret them in a way that the designer never intended. It was pretty easy to find some excellent ambiguity just by focussing on the first instruction (and the name of the site), "Draw a Stick Man". For most people, including me, the first idea that comes to mind when asked to draw a stick man is something like this:-
However there is a huge amount of scope in the instruction here. My first attempt at pushing that scope was pure deviousness. I drew a stick man similar to the one above, but upside-down. He duly progressed to live out the remainder of the adventure moving around the screen by means of his head pulsating and sliding him along like a snail's foot. Great fun, but probably not what we could call a bug.

Next I progressed from pure deviousness to a more serious deviation from the expected norm. I imagined a wheelchair user visiting the site and duly drew my stick man sitting in a wheelchair. Again the result was quite fun, the wheel of the wheelchair operating much as the head had in the upside-down experiment. The behaviour of the wheel, whilst understandable, might upset a more sensitive wheelchair user, so the fact that the software had not been designed with this in mind might constitute a bug.

My next attempt took the lack of specific detail around the position of the stick man a little further: I decided to draw him in a reclined position with his hands behind his head, a "lazy stick man".


The results of this were fantastic. Rather than moving down the screen to progress with the adventure, lazy stick man decided that opening boxes and fighting dragons wasn't for him and flolloped off the screen to the right, leaving a completely blank screen with no further instructions. It turns out that this was a lucky shot, as in my next 10 attempts I only managed to get 1 man to take the same lazy path off down the pub.

Whilst hilarious fun, this also constituted a definite bug in the system: someone who came to the site and followed the instructions in good faith, but deviated slightly from the expected inputs, could be left stranded with no idea what was supposed to happen next.

I find that issues where the instructions are open to interpretation but the application is limited to a single case based on the designer's preconceptions can be hugely problematic. In this case it is not possible to refer the user to any documentation or instructions highlighting their mistake, as in their mind they were operating within the instructions and had made no mistake. If not addressed carefully there is even scope for insulting the users of the system, if you mistakenly take the stance of suggesting that their ideas and notions are incorrect (had the wheelchair image decided to exit stage right this could have been a more serious issue, risking insulting a potential user).


Question Yourself

Next time you are testing inputs into a system, question yourself and your own preconceptions. Are your default test usernames all based on names of western length and structure? Are you assuming that your users all have full ability to use keyboard, screen and mouse? I once worked on a system where a key user had a muscular disorder and could not use a mouse - it certainly opened my eyes to what keyboard accessible really meant. (Darren McMillan recently wrote an excellent post on accessibility here). As well as cultural assumptions, do your technological or organisational experiences lead you to interpret instructions in a way that may be hidden from, or contorted by, users with less experience of your context?

Are you testing the actual operating instructions or following your own notions on how these instructions should be implemented? If it is the latter then having your stick man run away could be the least of your worries.

Saturday, 24 September 2011


Wopping the pie stack - demonstrating the value in questioning requirements

Over time in my current role we've made great strides in raising the profile of testing activities as a critical part of our successful development of the software. A key benefit of this is that the testers are an integral part of our requirement elaboration process. The testers' role here is to define and scope the problem domain in terms of acceptance criteria targets for the work, driving out any assumptions we are required to make to establish these criteria, and any potential risks in the initial models, a process I've detailed more fully in this post. We also have the opportunity to discuss with the developers early ideas on possible solution structures and to question whether these provide an acceptable model on which to base a feature. The ability to question the early design ideas in this way can save huge amounts of time by identifying flaws in the design before it has been developed.

Telling not asking

One anti-pattern that is common in software development, and that we sometimes encounter as part of this elaboration process, is the initial requirement being provided in the form of an already defined solution. It is a characteristic of our product market that both our outbound product team and our customers have a good understanding of the technologies relevant to the environment in which our software operates. Because of this there is a tendency for requirements to be delivered having already passed through some, possibly unconscious, phase of analysis on the part of the person making the request, based on their domain knowledge.

So what is the problem (i)

So what is the problem? Some of the work has been done for us, right? No. It is significantly more difficult to deliver and test a generic solution than it is to solve a defined and scoped problem. As well as relying upon assumptions that the solution addresses the problem at hand, lack of knowledge of the problem on which a solution is based can lead to other mistakes that have serious implications for the suitability of the final product:
  • Under Testing - I find that testing based on an existing solution suffers heavily from anchoring bias. Even when a tester understands that they need to test outside the boundaries of a solution domain, there is a subconscious tendency to anchor the testing limits on the boundaries within which the solution is operating. If tests are being designed based on the solution domain rather than the problem domain, this can be at the expense of posing relevant questions on the scope of the problem.
  • Over Testing - If the solution provides a scope of use which far exceeds that required to solve the problem at hand, then testing to the extent of the solution design will waste effort in areas likely to remain untouched once the product is implemented.
  • Missing the target - If the assumption that the solution design fully addresses all aspects of the problem is incorrect, then important aspects of the problem will remain unaddressed (this is one reason why programmers may be limited in effectiveness when testing their own solutions; there is always a confirmation bias that their design resolves the problem).

So what is the problem (ii)

Having established that trying to test solutions is not ideal, we are left with the same question: so what is the problem? As testers we have a duty to try to answer this question in order to anchor our testing scope on the appropriate domain. A very simple and effective technique, much written about in testing literature, is that of the "5 Whys" or "Popping the why stack" (which provides us with a wonderful spoonerism and a great title for a blog post). I won't revisit the details and origins of the technique here, they are well covered elsewhere, but I did encounter an excellent example in my company the other day which I felt illustrates the technique beautifully.

The story title as originally delivered to the team read something like "The ability to plan a query against X data partitions in 5 seconds", where planning is an internal phase of our querying process, and X was a big enough number for this to be a significant challenge. It was immediately apparent that this was seen as a solution to a bigger problem, so we questioned:-
Why#1: "Why planning in 5 seconds?"
Answer#1: "So that this customer query can run in 12 seconds"
OK so now we have some customer facing behaviour, but still a fairly arbitrary target.
Why#2: "Why does this need to run in 12 seconds?"
Answer#2: "The customer wants to be able to support 5 users getting their query results in a minute, so has targeted 12 seconds per query"
OK so now we have value and a reason for this, to support the targeted system user level. We've probably gone far enough with the whys (it doesn't always take 5), but it is clear that the logic from the customer is flawed:-
Why#3: "Why are we assuming that we can only run one query at a time? Would it be acceptable for queries to take slightly longer but run in parallel so that 5 complete in 1 minute"
Answer#3: "Yes"
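To make the revised target concrete, here is a rough sketch of how "5 queries completing within a minute, running in parallel" might be checked from the shell; run_query and the query files are hypothetical stand-ins for the product's actual query client and workload.

    # Run five representative queries in parallel and check the wall-clock time.
    # run_query is a hypothetical stand-in for the product's query client.
    start=$(date +%s)
    for i in 1 2 3 4 5; do
        run_query --file "query_${i}.sql" > "result_${i}.out" &
    done
    wait                                  # block until all five have completed
    elapsed=$(( $(date +%s) - start ))
    echo "5 parallel queries completed in ${elapsed}s (target: 60s)"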
So we have a new story, and a new target to develop and test towards. As it turns out we achieved planning in 5 seconds. Here is the critical part though: we also identified and resolved some resource contention issues that would have prevented 5 queries from running in 1 minute had we just focussed on the original target.

I know that in some software developments it is hard enough getting requirements at all, so it seems counter-intuitive to start challenging them. Hopefully this case shows how, by using a simple technique, it is possible to work back to the root stakeholder value in a requirement and ensure that the problem, rather than the solution, forms the basis of the testing effort.

(BTW - I try to avoid product promotion, this is an independent blog, but I also try to avoid anything disparaging, so if you are thinking that 12 seconds is a long time for a query, please bear in mind we're talking about tables containing trillions of records, so querying in seconds is no mean feat.)

Copyright (c) Adam Knight 2011 a-sisyphean-task.blogspot.com Twitter: adampknight

Monday, 5 September 2011


Mind Mapping an Evolution

I've recently introduced the idea of Thread Based Exploratory Testing into my team, with the option of using spreadsheets or XMind mind maps to document exploratory threads. Mind maps are an excellent tool for documenting a thought process, and XMind is a really intuitive and flexible tool for creating and sharing them. As a demonstration I thought I'd show how I've used XMind mind maps to plan my EuroSTAR conference talk later in the year.

General Ideas Dump



Fishbone Diagram To Work Out Presentation Flow



If you are attending EuroSTAR I look forward to seeing you there, please say hi, I'll be the one with baby sick stains on my jacket.

Copyright (c) Adam Knight 2011 a-sisyphean-task.blogspot.com Twitter: adampknight

Sunday, 4 September 2011


Plastering over the cracks - Why fixing bugs is missing the point


A few years ago I was fortunate enough to travel through China and take a tour up the Yangtze river, passing the new hydro-electric dam that was in the process of flooding the famous "Three Gorges". As our (government operated) tour boat navigated the giant locks up the side of the dam, our guide informed us that "the cracks that have been reported on the BBC news have been fastidiously fixed". My gut reaction to this statement was a feeling of mild panic, yet I knew that the cracks had been repaired - what was my problem? My confidence in the integrity of the dam was massively impacted by the fact that, although I knew the apparent fault had been fixed, I had no confidence that the underlying problem had been understood. I was travelling on a large body of water which was only being prevented from rushing down the valley below by a lump of concrete that a few weeks earlier had had cracks in it, and I had no evidence that the government had any idea why those cracks had occurred in the first place. What was to stop the fault that had caused those cracks cropping up in another part of the dam?

Planned Failure


More recently I was reading a discussion group on the subject of which statistics testers felt were the most useful or useless in a testing process. One of the figures suggested was that of actual versus predicted bug rates, the idea being that developers predicted the likely bug rates for a development and then progress was measured by how many bugs were being detected and fixed compared to this "defect potential". I dislike this concept for many reasons, but the foremost of these is exactly the same reason that my subconscious was nagging me on the Yangtze river:-

Just fixing defects is missing the point.


A bug is more than just a flaw in code. It is a problem whereby the behaviour of the system contradicts our understanding of a stakeholder's expectation of what the system should do. The key to an effective resolution is understanding that problem. Only then can the appropriate fix be implemented. I believe that the benefit that can be obtained from the resolution of a bug depends hugely on the understanding of the problem domain held by the individuals implementing and testing the fix.

Factors such as when and how the issue is resolved, and who implements and retests the fix, can impact this understanding. Even the same bug fix, applied at a different time or by a different person, can have a different impact on the overall quality of the software through the identification of further issues in one of the feature domains. If the identification or resolution of issues is delayed, either in a staged testing approach or through the accumulation of bugs in a bug tracking system in an iterative process, then the chances of related bugs going undetected or even being introduced are higher.

While the Iron is Hot


Many factors that can influence the successful understanding of the underlying cause of a bug are impacted hugely by the timing and context in which the bug is tackled.

  • Understanding of the Problem Domain - It could be that a problem actually calls into question some principle underpinning the feature model, for example an assumption about the implementation environment or the workflow. A functional fix implemented as part of a bug fixing cycle may resolve the immediate bug but leave underlying flaws in the feature model unaddressed.
  • Understanding of the Solution Domain - When a feature is being implemented, the individuals involved in that implementation will be holding a huge amount of relevant contextual information in their heads. With fast feedback cycles and quick resolution of issues, problems can be addressed while the details of the implementation are fresh in the mind of the developer, and associated issues are more likely to be identified. It could be that the most apparent resolution to a bug would compromise a related area of the feature, a fact that could be overlooked if tackling it as a standalone fix.
  • Knowledge of Related Features - It is a common situation for a developer to work on a number of similar stories or features as part of a project, often using similar approaches on related features. If an issue has been identified with the functional solution implemented, then similar unidentified problems may be present in related areas that the developer has worked on.
  • Understanding of the Testing Domain - In addition to the developers, as I discussed in this post, the tester working on a feature will have a better understanding of the testing domain when actively working on that feature area than if testing the issue cold at a later date. Addressing the retesting of the problem immediately provides the opportunity to review the testing of related features and perform further assessment of those areas, an opportunity that may not be apparent if tackling it as a point retest.

By operating with fast feedback and resolution cycles we take advantage of the increased levels of understanding of the problem, solution and testing domains affected, thereby maximising our chances of a successful resolution and of identifying related issues. If a software development process embraces the prediction, acceptance and delayed identification and resolution of issues, then many of the collateral benefits that can be gained from tackling issues in the context in which they are introduced are lost.

Copyright (c) Adam Knight 2011 a-sisyphean-task.blogspot.com Twitter: adampknight

Sunday, 14 August 2011


A disappointed sheep - don't expect your user to care about your product presentation


Last week I was overjoyed at the arrival of my third child, Franklin. Anyone who has experience of attending a birth in hospital will know that, despite the obvious excitement of the occasion, there are also significant periods of waiting and monitoring. Given the previous history of my wife's childbirth, the hospital staff were monitoring her and the baby closely, with some time spent connected to an electronic fetal monitoring (EFM) system.

It was during one of the quieter of these periods of monitoring that I found my attention wandering to the straps that were holding the detectors in place on my wife's body. There were two straps, both with a simple printed design carrying the name of the manufacturer and a logo.



Despite my desperate attempts to disable my testing radar to concentrate on more important issues, namely the health of my wife and safe arrival of my child, at the back of my mind a small tester conscience was desperately trying to get my attention every time I looked at one of these straps. Initially I ignored it, on the basis that I had far more important things to think about than matters of quality, however after nagging me persistently for a while I finally allowed myself to listen. And here is what it said:-

"Why is there a picture of a disappointed sheep on that strap?"

Sure enough, the logo on the strap was, when viewed from the appropriate angle, a picture of a smiling baby, however the straps did not have a "right way up" and in our case the midwife had used them so that the logo was inverted from my perspective, presenting me with this rather sad sheep.


Does your implementation context demand the attention you give to presentation?


This rather trivial and humorous example belies a key principle in all product design, software included: understand when the appearance of your product is and is not important to your customers, given the context in which they will be using it. An NHS midwife has absolutely no interest in the aesthetic presentation of the equipment that she uses, therefore a design which relies on the correct orientation of a monitor strap to present your logo is a flawed tactic.

A few years ago I was involved in a project to produce a marketing analysis system upgrade. At the time that I was brought into the project, a significant amount of effort had been devoted to ensuring a seamless graphical framework and icon set for the product, without any core functionality actually having been delivered. Like the midwife in the example above, the marketer trying to put together a campaign model is unlikely to attach any significance to alpha-blended graphics in the UI. Concentrating development effort on this was therefore simply wasting money.

It is my belief that most long term users of a functional product are only interested in the presentation of that product to the extent that it does not negatively impact the ability of that product to perform the required function. Making a product that helps the user to achieve their goal should be paramount. Unless the context demands it, making it look nice is a far lesser concern. Worse still, rely on a user who has no interest in presenting your product well to do exactly that, and you run a significant risk of ending up with a disappointed sheep.




Copyright (c) Adam Knight 2011 a-sisyphean-task.blogspot.com Twitter: adampknight

Wednesday, 27 July 2011


To a Tester Level - self imposed limits of the testing profession


Recently I interviewed a reasonably experienced software tester for a contract position in my company. His CV had been forwarded to me by a recruitment agency on the basis of some listed key skills that were important for adding value quickly in the position. He was a very nice chap and initially we got on well, however by the end of the interview I'd actually reached the point of being quite cross with him... (cross enough to write about it here). I'll explain why.

To a Tester Level


As the interview progressed beyond the initial discussions of his previous roles, I started to ask specific questions around his skills in the technologies of interest to me: databases, scripting and Unix operating system knowledge. In each case his response was that he was knowledgeable in that area "To a Tester Level". Apparently a tester level in these areas equates to the ability to:-

- Query a database with a SELECT statement to verify that records are present
- Navigate directories on the unix command line, run programs and open files
- Create a shell script to execute a program with basic parameters

Let me make this clear right now: I do not have an issue with the guy's level of knowledge. I'm sure that he possessed other skills which allowed him to perform certain testing roles adequately. What really did irritate me was his assumption that his level of ability could be appropriately described as a "Tester Level". He was imposing an upper limit of ability not only upon himself but on the testing profession as a whole, implying that these limited skills were a sufficient level for testers to aspire to in their profession. Although this chap did provide the most clear cut example of this attitude, he was certainly not the only person that I have spoken to with a similar mindset, which makes me very sad.

Knowledge is power


I'll admit that my current testing domain requires a high level of knowledge in the areas I've described, however I think that more advanced skills in core technologies can be just as relevant in testing any software. The greater the knowledge that we can obtain of the behaviour of the system under test, the more powerful our testing can be. Some simple examples:-

  • The ability to track memory and processor usage on an operating system can allow the identification of issues with memory leaks and CPU thrashing on a service.
  • Examining explain plans of the queries that are being executed by a system on a database backend can allow the tester to identify possible performance issues with the data tier.
  • Scripting knowledge opens up possibilities in being able to iteratively or concurrently perform repeated tasks to model behaviour
  • Being able to write a harness in a programming language can allow exploratory testing against API interfaces in addition to more involved automation

There are countless ways in which a level of expertise in relevant technologies can empower testers to perform their role more efficiently, effectively and thoroughly. By equating superficial familiarity with "a Tester Level" of knowledge we are creating self imposed limitations on our own expertise. Not only this, but we are also reinforcing a stereotype of a tester as someone who possesses limited abilities. There is no need for this. I believe that a good tester, and a good test team, possess a matrix of skills and experience (see here). It may be true that certain skills are not relevant to a specific role or team, however this does not mean that we should consider such abilities beyond the scope of the testing profession.
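As a small illustration of the first example in the list above, a tester comfortable on the command line might track a service's footprint over a long-running test with something as simple as the sketch below; the service name is a placeholder, not a real process.

    # Sample the memory (RSS) and CPU usage of a service once a minute during a
    # long-running test, so that leaks or CPU thrashing show up in the log.
    # "myserviced" is a hypothetical process name.
    while true; do
        ps -C myserviced -o pid=,rss=,%cpu= |
            while read pid rss cpu; do
                echo "$(date '+%Y-%m-%d %H:%M:%S') pid=${pid} rss_kb=${rss} cpu=${cpu}"
            done >> service_usage.log
        sleep 60
    done

Plotting the rss_kb column over the course of a soak test makes a slow memory leak immediately visible.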

Copyright (c) Adam Knight 2011 a-sisyphean-task.blogspot.com Twitter: adampknight

Thursday, 14 July 2011

Evolution of a Tester

I've been lucky enough to be selected as a speaker at EuroSTAR in Manchester in November. My talk is entitled An Evolution into Specification by Example. A related blog post on Evolution of a Tester can be found in the EuroSTAR blog site.

Happy reading, hope to see you there.

Copyright (c) Adam Knight 2011 a-sisyphean-task.blogspot.com Twitter: adampknight

Friday, 24 June 2011

Small mercies - why we shouldn't knock "it works on my machine"

I go to quite a few test events around the UK. Conferences, User groups and meetups. A thread of conversation that is a perennial favourite at such events is the ridiculing of programmers that, when faced with a bug in their code, claim
"It works on my machine"
Testers revel in the pointlessness of such a statement when faced with the incontrovertible evidence that we have amassed from our tests. We laugh at the futility of referencing behaviour on a development machine as compared to the evidence from our clean server and production-like test environments. Surely it is only right that we humiliate them for this? Well, I say one thing to this:

At least they've tried it.


At least the programmer in question has put the effort in to test their code before passing it to you. Of course this should be standard practice. More than once, however, I've encountered far worse when discussing bugs with the programmer responsible:-
"I haven't tried it out but let me know if you get any problems"
"According to the code it should work"
"I can't think of any reason why this should fail"
"You've deliberately targetted an area of code that we know has issues" (I'm not joking)
Programmers work in development environments. These are usually a far cry from the target implementation environment, hence the need for test systems (and possibly testers). The development environment is the earliest opportunity that the programmer has to run the functionality and get some feedback. If the alternative is to waste time pushing untested functionality into a build and a test harness and use up valuable testing resources on it then I'm sorry, but I will take "It works on my machine" over that any day.

Copyright (c) Adam Knight 2011 a-sisyphean-task.blogspot.com Twitter: adampknight

Tuesday, 21 June 2011


Be the We - on doing more for the team

A turn of phrase that I've heard many times working in software development teams goes like this:-
"We should really be doing ..."
followed by some practice or process that the speaker feels would improve either their job or the team's development activity as a whole. In this context the "We" enjoys a level of mysterious anonymity more commonly reserved for the "They" of "I'm surprised they allow that" and "I can't believe that they haven't done something about it" fame that are the cause of society's ills. The implication is that, by raising the issue as a "We" problem, the speaker has upheld their end of things and the failure to implement the behaviour in question is now a team responsibility.

Who is the We?


So who is the mysterious "We"? I'll tell you: it's you, or it should be. It's me, or I hope it is. It is each of the people that I have worked with in the past who have stood out by taking action, over and over again, on their own initiative to improve the team environment in which they work. It is the person who watches my back and offers to help when I get too busy, as I do for them. It is the person I want in my team and look for whenever I read a CV.

I review hundreds of tester CVs as part of my job. All of them have testing experience. All of them. Most of them contain the following:
  • a list of the projects they've worked on with project descriptions
  • lists of the technologies making up the environment that the software was implemented in
  • a list of the testing activities performed on each project
If this is the extent of the content then the CV will usually not get considered for even a quick phone interview. Testing experience and tool knowledge alone are not enough. I am a strong believer in taking a team approach to process, however relying on the team's achievements to forward your career when you are not an active participant in those achievements will not get you very far. Put simply, if I can see no evidence that an individual has acted on their own initiative to improve the working lives of themselves and their colleagues, then they won't fit into my organisation. We cannot afford to have people in the team who won't step up and be the "We" when we want to improve ourselves.

But what can I do?


Sometimes it is hard to know where to start, but if you regularly find yourself thinking "It would be so much better if we did this ..." , then you have a good starting point. Don't sit around and wait for someone else to do it, tackle it yourself.
  • It would be great if We had some test automation - Great, do some research, brush up your scripting and get started. If you don't know scripting or tools, there's nothing like having a target project to work on to develop some new skills.
  • We really should try some exploratory testing sessions - Again, a fantastic idea. Book yourself out some time, do some reading on the subject and get cracking. As I wrote about in my post on using groups to implement change, once you demonstrate the value you'll get some traction with other team members and, bingo, you've introduced a great new practice to the team.

The opportunities will depend on your context, but even (and maybe especially) in the largest and most rigid of processes there are opportunities to improve, by removing inefficiencies or improving the quality of information coming out of your team. The costs are small, maybe some of your own time to read up and learn a new skill. The benefits are huge. You'll earn the respect of your colleagues, improve your chances of promotion and, more importantly, become really good at what you do. Next time you find yourself thinking "Wouldn't it be great if we did ...", try rephrasing it to "Wouldn't it be great if I'd introduced ...". Think how much better that sounds, and start becoming the person that, when others say "We", they mean you.

Copyright (c) Adam Knight 2011 a-sisyphean-task.blogspot.com Twitter: adampknight
image: Dog-team-at-Seventh-All-Alaska-Sweepstakes

Saturday, 4 June 2011

Follow the lady - on not getting tricked by your OS when performance testing

Recently my colleagues and I were involved in working on a story to achieve a certain level of query performance for a customer. We'd put a lot of effort into generating data that would be representative of the customer's data for querying purposes. The massive size of the target installation, however, prevented us from generating data to the same scale, so we had created a realistic subset across a smaller example date range. This is an approach we have used many times before to great effect in creating acceptance tests for customer requirements. The target disk storage system for the data was NFS, so we'd created a share to our SAN, mounted from a gateway Linux server and shared out to the application server.

False confidence

Through targeted improvements by the programmers we'd seen some dramatic reductions in the query times. Based on the figures that we were seeing for the execution of multiple parallel queries, we thought that we were well within target. Care was taken to ensure that each query was accessing different data partitions and that no files were being cached on the application server.

Missing a trick

We were well aware that our environment was not a perfect match for the customer's, and had flagged this as a project risk to address. Our particular concerns revolved around using a gateway server instead of a native NAS device, as it was a fundamental difference in topology. As we examined the potential dangers it dawned on us very quickly that the operating system on the gateway box could be invalidating the test results.

Most operating systems cache recently accessed files in spare memory to improve IO performance, and Linux is no exception. We were well aware of this behaviour and for the majority of tests we take action to prevent this from happening, however we'd failed to take it into account for the file sharing server in this new architecture. For many of our queries all of the data was coming out of memory rather than from disk and giving us unrealistically low query times.
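
As an aside, one relatively simple way to spot this on a Linux file server is to watch the disk activity while a timed query runs: if the block devices show next to no reads, the data is being served from memory rather than disk. A rough sketch, assuming the sysstat tools are installed on the server:

# on the gateway server, while a timed query runs on the application server:
iostat -dx 5          # per-device IO stats every 5 seconds - near-zero reads suggest a cache hit
free -m               # shows how much memory the OS is currently using for buffers/cache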

Won't get fooled again


Understanding the operating system behaviour is critical to performance testing. What may seem to be a perfectly valid performance test can yield wildly inaccurate results if the caching behaviour of the operating system is not taken into account and excluded. Operating systems have their own agenda in optimising performance, which can conflict with our attempts to model performance and predict behaviour when operating at scale. In particular, our scalability graph can exhibit a significant point of inflection when the file size exceeds what can be contained in the memory cache of the OS.

In this case, despite our solid understanding of file system caching behaviour, we still made an error of judgement because we had not applied this knowledge to every component in a multi-tiered model. Thankfully our identification of the topology as a risk to the story, and our subsequent investigation, flushed out the deception and we were able to retest and ensure that the customer targets were met in a more realistic architecture. It was a timely reminder, however, of how vital it is to examine every facet of the test environment to ensure that we do not end up as the mark in an inadvertent three card trick.

(By the way - from RHEL 5 onwards Linux has supported a hugely useful mechanism for clearing the OS cache. As root:
sync
echo n > /proc/sys/vm/drop_caches
where n is 1 to drop the page cache, 2 to drop dentries and inodes, or 3 to drop both. Sadly, not all OSs are as accommodating.)
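
In our case the crucial point was that the cache needed clearing on the file-sharing tier, not just the application server under test. Something along these lines, run against the gateway before each timed query, would have avoided the misleadingly fast numbers (the hostname is purely illustrative):

ssh root@gateway 'sync; echo 3 > /proc/sys/vm/drop_caches'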

Copyright (c) Adam Knight 2011
a-sisyphean-task.blogspot.com
Twitter: adampknight

Friday, 20 May 2011


Automated Tests - you on a lot of good days

Automated testing is something of a contentious subject which evokes a lot of passion in the testing community, and much is written on the dangers and risks of employing automation in testing. I'm certainly of the opinion that automation is something that should not be undertaken lightly. The costs involved in getting it wrong can vary from wasted time and effort up to an artificially high level of confidence in the system under test. While the potential pitfalls of automation receive a lot of attention, on the flip side I think that one of the key potential benefits of automation can easily be overlooked.

A snapshot of knowledge



When I pick up testing on a feature, my understanding of that area and the context in which it operates increases dramatically as I immerse myself in the testing of it. For the period in which I am testing that feature, I am holding in my head far more information on it than if I was coming at it cold. As my understanding increases, my exploratory tests improve and my ability to identify the key tests to ensure the correct operation of that feature increases. If, at this point, I am able to encapsulate some of that knowledge into a set of automated tests, in essence I am capturing a snapshot of that elevated state of understanding. As I move on to other features my understanding of that feature will fade. The tests that I designed at the time will not. Well designed automation will repeatedly check the aspects of the product that I thought were important at the time I best understood that area and the customer requirements that drove its development.

On many occasions I review test suites which focus on areas that I have not worked on in a while and find myself quietly impressed at the understanding of that area that I must have had at the time of creation.

There is an advert running in the UK at the moment for a vitamin supplement that claims to help you be you on a "really good day". I'm not suggesting that automation will ever replace the insights of a skilled tester, but a well designed test pack can capture small glimmers of that insight at its highest and use them to drive ongoing checks. You on lots of "really good days".

Copyright (c) Adam Knight 2011

Saturday, 14 May 2011


Under your nose - uncovering hidden treasure in the tools you already use

At the recent STC meetup in Oxford Lisa Crispin gave a lightning talk on using an IDE to manage and edit your tests. Managing the files that constitute our automated test packs is not an easy process, particularly around the maintenance of those files in SVN, and I liked the idea of using an IDE to help the team with this. On investigating the potential benefits of an IDE for our team, I was surprised to discover that some excellent productivity benefits in a similar vein could actually be achieved by making full use of the tools that we were already using in the team.

False start


I already had IntelliJ and Eclipse installed on my laptop for developing Java test harnesses and so my first inclination was to follow Lisa's suggestion of using one of these IDEs to manage the test files.

After thinking things through and experimenting with some ideas I realised that this was not really going to work for me and my specific context. The reason is that most of the time we develop our test packs on remote test servers, not local machines. Although there are facilities for remote editing in Eclipse, for example, these rely on services running on the target machines, which would have severely diminished the flexibility of this approach. Many of the benefits of using such applications would also not apply to our context, given that our test file structure and format is specific to our harness, so facilities such as dependency and syntax checking/auto-complete would not be applicable.

A different perspective


Although the idea of using a programmer's IDE appeared to be a non-starter for us, I still felt strongly that some of the benefits of such an approach could help us if we could achieve them through other tools. On researching specifically the requirements that would directly help us to improve our remote, interactive test development, I found that many of the features that I was looking for were available through addins or configuration options for tools that we were already using.

Custom file editing and SVN management


On researching explorer tools with custom editor options I discovered that Notepad++ actually supports a solid file browsing addin, Explorer. As well as file browsing there are also excellent file search and replace features. Combining this with the icon overlay feature of TortoiseSVN and the support for the standard Explorer context menu gives Subversion integration.


In addition to the Explorer searching, the "OpenFileInSolution" addin provides indexing of the project file system for fast searching, and the "User Defined Language" feature allows me to add syntax highlighting for our harness commands, helping to pick up simple syntax errors in command input files.

Remote file editor


Another item on the hitlist was the ability to edit remote files directly in an editor. I discovered another useful Notepad++ addin, "NppFTP", that achieves exactly this. It presents a remote explorer window within the Notepad++ application that allows me to quickly access a remote directory and edit test files within my text editor.

Remote file management


Finally my search moved on to the ability to remotely manage files in SVN. Some googling around led me to this great post on Using WinSCP to work with remote SVN repositories. Again, WinSCP was a tool that we already used extensively in the team. Based on the hints available there I quickly established a custom toolbar in WinSCP to add, check status, check out, check in and revert files and directories using custom commands.
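
For illustration, the remote custom commands behind those buttons were along the following lines; WinSCP substitutes "!" with the selected remote file or directory, and the exact commands will depend on how Subversion is laid out on your test servers:

svn add "!"                          # add the selected file or directory
svn status "!"                       # check status
svn update "!"                       # check out / update to latest
svn commit -m "test pack update" "!" # check in with a simple message
svn revert -R "!"                    # revert local changes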

Custom file actions


Using the power of the custom command support in WinSCP has allowed me to progress further than I expected with the level of interaction with our test file packs. Custom commands have allowed us to create functions based on the common actions and bespoke file relationships that are unique to our test development environment. These operations include:-
  • Updating results files from test run directories into the corresponding source pack through a single button operation
  • Adding custom metadata files in for existing tests and test packs via a click and prompt operation
  • Copying existing tests with all associated meta files with a single click operation


All of these activities were obviously achievable through shell scripting, but the addition of simple commands in our remote SCP client to perform these actions makes it a far simpler process to interact remotely with the test servers and work with our test pack files.
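
As a rough sketch of the kind of script sitting behind one of those buttons, copying an existing test together with its associated meta files might look something like the following (the .meta and .expected suffixes are invented for the example; our real file relationships are bespoke to our harness):

#!/bin/bash
# copy_test.sh <existing_test> <new_test>
# Copy a test input file plus any associated meta/expected-result files.
src=$1
dst=$2
for suffix in "" .meta .expected; do
    if [ -f "${src}${suffix}" ]; then
        cp "${src}${suffix}" "${dst}${suffix}"
    fi
done

Hooked up as a WinSCP custom command, this becomes a single click and prompt rather than a round trip to a separate shell session.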

I know this is not rocket science, and I suppose in essence that is my point. We were already using both WinSCP and Notepad++ in the team, yet had not really investigated the power of the tools under our noses to make our lives easier. It was only through the process of looking for the benefits offered by other applications that I discovered the features that were already at my disposal.

Next time you fire up the tools that you use every day without thinking, why not take the time to have a closer look at how you can make them work harder for you.

Copyright (c) Adam Knight 2011

Tuesday, 3 May 2011

Situation Green - The dangers of using automation to retest bugs

I am a great believer in using a structure of individual responsibility for items of work done, rather than monitoring individual activity with fine grained quantitative measurement and traceability matrices. I do think, however, that it is part of my responsibility to monitor the general quality of our bug documentation, both in the initial issue documentation and in the subsequent retesting. On occasion I have had bugs passed to me with retest notes along the lines of "regression test now passes". Whenever I see this my response is almost always to push the bug back to the tester with recommendations to perform further testing and add the details to the bug report. While I understand the temptation to treat a passing automated check as a bug retest, this is not an activity that I encourage in any way in my organisation.

Giving in to temptation

I'm sure that we are not the only test team to feel pressure. When faced with issues under pressure, the temptation is to focus on whatever activity removes the issue and restores a state of "normality". The visible issue in a suite of automated tests (or manual checks) is the failing check, and resolving a bug to the extent that the check returns the expected result can seem to be the appropriate action for a quick resolution. The danger with this approach, however, is that it results in "gaming" the automation: we ensure that the checks pass even though the underlying issue has not been fully resolved. We can focus on resolving the visible problem without the requisite activity to give us confidence that the underlying cause of the visible behaviour has been addressed. Some simple examples:-

  • Fixing one checked example of a general case
    Sometimes a negative test case can provide an example, e.g. of our error handling in a functional area. If that check later exposes unexpected behaviour, then a resolution targeted at that specific scenario could leave other similar failure modes untested. I've seen this situation where a check deleted some files to force a failure in a transactional copy. When our regression suite uncovered a change in the transactional copy behaviour, the initial fix was to check for the presence of all files prior to copy, fixing the test case but leaving open other similar failures around file access and permissions.
  • Updating the result without ensuring that the purpose of the test is maintained
    There is a danger that in focussing on getting a set of tests "green" we actually lose the purpose of a test. I've seen this situation where a check shows up new behaviour, a tester verifies that it is the result of an intended change and updates the new result into the automation repository, but the original purpose of the check is lost in the transaction.

These are a couple of simple examples but I'm sure that there are many cases where we can lose focus on the importance of an issue through mistakenly concentrating on getting the automation result back to an expected state. No matter how well designed our checks and scenarios, this is an inherently risky activity. Michael Bolton refers to the false confidence of the "green bar".

Re-Testing

I always try to focus on the fact that retesting is still testing, and as with all testing, is a conscious and investigative process. Our checks exist to warn us of a change in behaviour which requires investigation. They are a tool to describe our desired behaviour, not a target to aim at. As well as getting the check to an expected state, our activity during retesting should, more importantly, be focussed on:-
  • Performing sufficient exploration to give us confidence that no adverse behaviour has been introduced across the feature area
  • Examining the purpose and behaviour of the check to ensure that the original intention is still covered
  • Adding any further checks that we think may be necessary given that we now know that there is a risk of regression in that area

If we fall into the trap of believing that automation equates to testing, even on the small scale of bug retests, we risk measuring, and therefore fixing, the wrong thing. I am a huge proponent of automation to aid the testing effort, however we should maintain awareness that test automation can introduce false incentives into the team that can be just as damaging as any misguided management targets.

Copyright (c) Adam Knight 2009-2011

Tuesday, 12 April 2011


A template for success - harnessing the power of patterns to document internal test structures

During a test retrospective last year, discussion among the team turned to the documentation of our automated tests and the visibility of the test pack structure. For a while we had been using a metadata structure to document the purpose of the test along with each step in the test packs. This was proving very effective in documenting the functional test case or acceptance scenario being covered, helping us to understand the reasons why the tests existed. The specific problem under discussion was a lack of visibility of the structure of the tests themselves and understanding of how the tests ran, particularly ones created by other members of the team.

More than one way to...


All of our tests are grouped under a top level element of an archive, which usually relates to a set of customer example test data or a custom developed data set to test a specific feature set. Under the archive are test packs which relate generally to specific scenarios (this is the equivalent level to a FIT page or "Given, When, Then" scenario.) Although the test packs were structured in a well understood way, the nature of different requirements resulted in significant differences in the way that individual test archives and the packs within them were structured.
  • Some archives are built from source data up front and then queried, some change through the course of the test execution
  • Some test packs were dependent on the execution of earlier packs, some could be run in isolation
  • Some test packs build the data from scratch, some restore from a backup, some have no archive data at all
  • Some tests ran each pack once, some iterated through the same packs many times e.g. to model behaviour as archives scale in size
We'd made attempts to document each archive in a free text readme file, but I felt that this was rather cumbersome and wasn't delivering the relevant information that the testers needed in a consistent, concise form.

Inspiration


Around the same time I read this post by Markus Gärtner on test patterns. I'd not really considered using patterns in the documentation of test cases previously but, giving the matter some thought, it made real sense to use the power of patterns to document our test pack design.

Taking advantage of a train journey to London, I examined our existing tests, identifying common structures that could be documented as distinct patterns of test execution. I drew up a few graphical representations of the initial patterns I identified, then started examining the test archives in more detail to identify variations on these patterns around data setup methods and test initialisation. I came up with about 5 core patterns with around 12 variations, which I felt was an appropriate number to provide value without too much differentiation; if each archive had its own pattern there would be little point in the exercise. To test the patterns I started tagging the test archives with them to see how effective a fit they were. I added the graphical representations of the patterns to the Wiki before presenting the approach to the rest of the team for some feedback.
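
The tags themselves can be very lightweight. Ours live in our bespoke harness metadata, but purely as a hypothetical illustration, a pattern entry for an archive might look something like this:

archive: customer_example_sales
pattern: build-up-front           # data built once from source, then queried
variation: restore-from-backup    # archive restored rather than rebuilt each run
packs: independent                # each test pack can run in isolation

The value is not in the format but in having a small, shared vocabulary of named structures that everyone in the team recognises.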

Although this process is in its early stages the ideas have been well received by the team and we have seen some significant benefits result from this process already:-
  • A pattern can speak a thousand words
  • The tagging of a test archive with a specific pattern yields an immediate insight into the layout, setup and mode of execution of the tests without the need to read extensive notes or examine the structure in detail.
  • Patterns drive good test design
  • Although not the primary driver behind our use of patterns, a major benefit of patterns in general is that they provide a template for implementation; understanding effective patterns can therefore help to drive good test design.
  • Identification of areas to refactor
  • While examining existing tests looking for design patterns I soon realised that some patterns resulted in far more maintainable, clean tests than others. Tagging existing test packs with patterns has helped us to identify the tests that are not in a maintainable structure, and given us a set of target patterns to aim for when refactoring.
  • Patterns identify ambiguous test usage
  • With some test archives we found that multiple patterns were applicable. This indicated an ambiguous test design, where one test archive was being used for multiple purposes. This again helped to identify a need for refactoring, separating the distinct implementations out into their own test archives.
  • Patterns become more effective as test functionality becomes more complex
  • In the past we had suffered from testers implementing the flexible capabilities of the test harness in inconsistent ways, which made maintenance more difficult; as test structures became more complex this problem was magnified. As we have recently extended the support in our test harnesses to cover more parallel and multi-server processing, patterns have provided a means of introducing the new functionality with recommended structures of implementation. It was far easier to communicate the new capabilities of the test harnesses in the context of the new patterns supported than it would have been trying to explain the features in isolation.

As I mention, we are very much in the early stages of adopting this approach and I'm sure there are more benefits to uncover and more effective ways to use the power of patterns. The benefits of creating even a very simple set of bespoke patterns relevant to our automation context were immediately apparent. If you have a rich automated test infrastructure which involves a variety of test structures with different modes of implementation then it is a technique that I would certainly recommend.

Copyright (c) Adam Knight 2011

Friday, 25 March 2011

Without stabilisers - why writing your own test harnesses really is an option

When considering utilising test automation to assist in your testing, one of the most important decisions that will need to be made at some point is what technologies or tools to use. For many the obvious approach will be to look to the software market to identify the tools which are most appropriate. Assuming that third party tools are the way to go, however, could exclude the perfectly valid approach of building custom automation in house. I can understand the perceived benefits of using an out-of-the-box tool to run your tests, and writing your own test harness can appear to be a frightening alternative, but the creation of your own test harness can be an excellent approach to introducing automation into your testing effort and should not be overlooked.

Advantages of writing your own harness


You can start as small as you like

Writing a harness for automation can be a daunting task, however one of the major benefits of in house developed test harnesses is that you can start very simply. Many systems have some kind of command line interface (See Rolling with the Punches for some good places to start.) Something as simple as a script to take data input files to drive a command line function and verify an expected result can be a powerful addition to your testing capabilities if used sensibly (I'm not going to discuss the relative risks of automated approaches here). By starting small and proving the benefits you can then use the results as evidence when requesting time/help to expand your efforts and implement something on a larger scale. No matter how small you start, I'd recommend adopting a set of automation principles like the ones we use here to ensure that you start with a structure that can work long term.
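
For example, a first iteration really can be as small as a loop that feeds input files to the command under test and diffs the output against stored expected results. A minimal sketch, with an invented tool name and file layout:

#!/bin/bash
# Minimal check runner: run each input through the command under test and
# compare the output against a stored expected result file.
tool="./myquery"                     # illustrative name for the command under test
fail=0
for input in tests/*.in; do
    expected="${input%.in}.expected"
    actual="${input%.in}.actual"
    "$tool" < "$input" > "$actual" 2>&1
    if diff -q "$expected" "$actual" > /dev/null; then
        echo "PASS $input"
    else
        echo "FAIL $input (see $actual)"
        fail=1
    fi
done
exit $fail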

Driven by your needs

It is unlikely that an out of the box tool is going to integrate seamlessly with your team methods and implementing a tool may involve a level of compromise on your ideal approach to testing your product. Any compromises and workarounds may introduce more risk of the automation not meeting your long term needs or proving more troublesome and error prone than the system being tested. While I have been writing this post (it has taken a while) a couple of posts have been published by Liz Keogh and Elisabeth Hendrickson highlighting the dangers of focussing on tools first. By writing your own tool your product and the team processes and interactions drive your automation rather than the other way around.

You control the interfaces and reporting

When creating your own test harness you have control over what information is reported from the system and the format it takes: what you want to measure, and how and where you want to put that information, such as in a format suitable for loading into a database.
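
For instance, because you own the harness, having it append every result as a row ready for bulk loading into a results database is a one-line addition (the column choice here is just an illustration):

# inside the harness, after each check completes; $input, $status and $elapsed
# are whatever variables your runner already tracks
printf '%s,%s,%s,%s\n' "$(date '+%F %T')" "$input" "$status" "$elapsed" >> results/run_$(date +%F).csv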

Issues are resolved in the tool, not the tests

If you find an issue with compatibility between the test tool and the application, you change the tool. There is no need to compromise on the test or amend your approach due to inflexibility in a tool that you are not in a position to modify.

Quick resolution of problems

If you hit an issue with the test harnesses, you can prioritise a fix and resolve it on your own timescales. You are not beholden to another organisation to resolve issues holding up the test development.

Extensible

As you extend the scope of your testing you can extend the harnesses to suit different scenarios. Having control over your own tool allows features to be added according to your own priorities. Initially our test harness covered simple query execution and results comparison. Over time it has grown to cover many other scenarios, such as:
  • system administration and user management
  • bulk repeat data load
  • parallel operations on clustered servers
  • data schema progression
  • memory monitoring and timing

Tools need development anyway

Many of the testing tools that are available still require a high level of scripting or programming in order to integrate fully into a development process. If you are going to have to do this anyway then it may be worth considering going the extra mile and creating your own harness to provide extra flexibility.

Advantages of using third party tools


There are some key benefits that third party tools do provide which need to be considered in the automation decision.

Not reinventing the wheel

If you create your own automation harness you do so safe in the knowledge that a great deal of what you are doing has been done before, and probably better. Implementing a tool saves time developing infrastructure items such as scheduling, reporting interfaces and results comparison.

Starting feature set

Using third party tools provides a set of functionality out of the box that you can (hopefully) quickly utilise to start testing different aspects of the system. When writing your own harness, functionality is pulled into the system on a priority basis, so some lower priority features may wait a long time to be implemented.

Availability of outside expertise

If it all starts to go wrong with a third party testing tool, then you have the confidence that there are forums, consultants and user groups out there with experience of implementing the exact tools that you are using, who have probably encountered many of your issues before. There is no such support network when your harness is developed in-house.

Externally Supported and Maintained

A test harness is a piece of software and comes with all of the baggage that software development entails. Third party tools are developed and maintained by people whose job is to do exactly that. Developing your own tool may be an additional overhead on an already busy development/test team.

Tools can still be modified

As a counterpoint to the last item in the list of advantages of writing your own harness: many tools are open source or highly scriptable, making them more flexible in meeting the needs of the team.

Why my circumstances suit harness creation


So what circumstances suit creating your own harness? I can only speak from my own experience but I'd suggest that creating our own harness has worked well in my current role for the following reasons:-

  • Command line/programmatic API interfaces
  • Most of our server functionality can be operated through command line and file inputs, plus through programmatic APIs. If I wanted to test through a more complex interface, such as a web UI, then a third party tool would likely be more suitable for any automation of those interfaces.
  • Unique workflow
  • The nature of our interactions with the system means that "written English" style automation structures would be very difficult to use. ("Given that I've imported this specific 10 million line data file, when I execute this 100 line SQL statement then it yields these 1000 results within 10 seconds" doesn't exactly roll off the tongue.) Instead we have implemented a layered metadata based approach which allows us to store the intention of each test and step alongside the physical input files and document the "live specification" that our tests provide.
  • Large Volume Data
  • The scale of the input data and results of our system means that even specific database test systems such as DBFit do not scale sufficiently to execute tests matching our customer requirements. By writing our own harnesses we can ensure the scalability of results generation and checking.
  • Scripting/programming experience in team
  • As I stated in my list of automation principles I don't think that testers should necessarily need coding/scripting knowledge to be able to create tests, however developing your own test harnesses does require this knowledge. This can come either from the testers themselves or from others in the organisation helping. We are lucky to possess the relevant skills in our team.

Whatever approach is taken to automation, it will certainly involve some trade-off between the benefits of using a tool and those of an in-house development. If you feel that some automation would be appropriate for your testing context but do not know where to start, doing some small scale automation in house may be the perfect way to get started. You can demonstrate whether it really is for you and discover more about your specific needs to aid the decision on how to progress. It might just be that you decide an in-house system is the right approach for you long term.

Copyright (c) Adam Knight 2009-2011
