Sunday, 9 December 2012

Lessons Learned Testing in a Startup

This month I celebrated my sixth anniversary with my company. Whilst five years is a more traditional milestone, for me six feels like a more fundamental step as it takes me into a new stage of longevity, the six to ten year group. It seems like only yesterday that I sat at my first Christmas party with twenty strangers wondering whether my gamble would pay off and the company would be successful. At this year's Christmas party I sat with many of the same colleagues, whom I would now consider friends, discussing how some of the biggest companies in the world are implementing and using our software. It has been an amazing journey so far and, whilst continued success is still not guaranteed, I have a lot more confidence in the long term success of the company now than I did when I joined. It seems, then, an opportune time to look back over the last few years at how we've grown from the early stages of a startup, through the first customer adoptions, and into the next phase of evolution as a company. In this post I highlight some lessons that I have picked up along the way. All are relevant to testing, although many have more general applicability for, as with many startup members, I've worn a few hats and been involved in many aspects of the growth of the company. I've included here both things that, with hindsight, I wish I had given greater consideration to and also things that I'm very pleased that I made the effort to get right from the start:-

 

Sapling

 

Get your infrastructure right up front

A temptation in startup companies is to postpone infrastructure work until it is really needed, instead focusing on adding features and winning those early customers. It is easy to defer building a robust infrastructure, particularly on the testing side, by convincing yourself that you'll have more time to sort such things out once you've got some customers to pay the bills and please the investors. In my experience you only get busier as your customer base grows, and new priorities emerge on a daily basis. With this in mind it pays to get the infrastructure right to support you through at least the initial phases of customer adoption:-

  • Build appropriate versioning into your build system to allow for multiple releases (e.g. using the svn revision number to label builds only works while you have a single branch) - see the sketch after this list
  • Ensure that you build support for release branching and multiple supported software releases into any testing and bug management systems
  • Ensure that you store your tests in the code control system along with the code to help with multiple version support
  • Consider the need to extend support out to multiple operating systems and try to use generic technologies (Java, compiled code, Perl, generic shell) in your test harnesses rather than OS specific ones (e.g. Linux bash)
  • Ensure that your tools and processes can scale to multiple agile teams from the single team that you are likely to start with
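To make the first point concrete, here is a minimal sketch of branch-aware build labelling so that builds from parallel release branches stay distinguishable. It assumes a Subversion working copy with a conventional /branches/ layout; the label format itself is illustrative rather than a recommendation:-

```python
"""Sketch: derive a build label that encodes the branch as well as the revision."""
import re
import subprocess


def svn_info(path="."):
    """Return the fields of 'svn info' for the working copy as a dict."""
    out = subprocess.check_output(["svn", "info", path], text=True)
    return dict(line.split(": ", 1) for line in out.splitlines() if ": " in line)


def build_label(path="."):
    """Compose '<branch>-r<revision>', e.g. 'release-2.1-r4711' or 'trunk-r4711'."""
    info = svn_info(path)
    # Derive the branch name from the repository URL; fall back to 'trunk'.
    match = re.search(r"/branches/([^/]+)", info["URL"])
    branch = match.group(1) if match else "trunk"
    return f"{branch}-r{info['Revision']}"


if __name__ == "__main__":
    print(build_label())
```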

 

Don't build up a backlog of manual tests

As with ignoring infrastructure, another temptation in early startups is to forgo the development of automated testing structures to concentrate on getting functionality out the door, instead relying on manual and/or developer led testing. Whilst this may be a practical approach with a small feature set, it soon becomes less so as the product grows and your manual testing is rightly focused around the new features. Getting a set of automated checks in place around key feature points to detect changes in the existing feature set will pay dividends once the first customers are engaged and you want to deliver the second wave of features quickly to meet their demands.
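As an illustration of the kind of check I mean, here is a minimal sketch of a 'golden file' regression check. The query_tool command, file names and layout are hypothetical placeholders for whatever interface your product exposes; the point is simply that the current behaviour of a key feature is recorded and compared automatically on every build:-

```python
"""Sketch: pin down current behaviour of a key feature against a recorded baseline."""
import subprocess
from pathlib import Path

BASELINE_DIR = Path("baselines")
CASE_DIR = Path("cases")


def run_feature(case_name):
    """Run the (hypothetical) product command line for one documented scenario."""
    result = subprocess.run(
        ["query_tool", "--input", str(CASE_DIR / f"{case_name}.sql")],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def test_basic_export_unchanged():
    """The existing export behaviour should still match the recorded baseline."""
    expected = (BASELINE_DIR / "basic_export.txt").read_text()
    assert run_feature("basic_export") == expected
```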

Also, as the company grows and you want to bring more testers on, an existing suite of tests can help to act as a specification of the existing behaviour of the system. Without this, and in the absence of thorough documentation, the only specification that testers will have available is the feature set of the product itself, an inherently risky situation. I have personally encountered the situation, when joining a maturing startup, where there was no test automation, no product documentation, and it was unclear exactly what some features were doing. The approach being taken for testing these areas was to run manual checks confirming that they did the same as the last time those checks were run, with the previous run serving as the only oracle.

 

Prepare for growth in the team

When starting out, the testing contingent of your company is probably going to be small, possibly just one tester (or none at all). This will inevitably change with growth. A great approach that I have found for planning future growth is to put in place a set of principles on key areas, such as your automation strategy, for new members to refer to. This allows individuals to ensure that they are working consistently with the other testers in the company, even when their activities are being performed in isolation.

It is also a good idea to document your bespoke tools and processes thoroughly. You and your early colleagues will learn your tools inside out as they are developed, but for new starters who don't have this experience your tools can be a confusing area. This is something that I wish I had done better at my company, as documenting tools retrospectively is difficult to prioritise and achieve. It is a shame to have put in the effort to develop powerful and flexible tools to automate testing against the system if future members of the team won't know how best to use them.

 

Prepare for growth in the product

When starting out with a product development, the focus is likely to be on moderately sized uses or implementations. This is natural, as larger scale adoption seems a long way off when starting out and larger customers often want evidence of successful smaller scale use before committing. If you are initially successful then demand for increased capacity can come very quickly and catch you out if the product does not scale. Testers can help avoid this situation by identifying and raising scalability issues. In the early days at RainStor a lot of the focus was on querying a few large data partitions for small scale use cases. I was concerned about the lack of testing on multiple partitions, so I raised the issue in the scrum and performed some exploratory testing around join query scalability, exposing serious scaling issues. Getting these addressed helped to improve the performance of querying larger sets of results, which proved timely when our first large scale customers came on board later that year.
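The shape of that kind of exploratory probe is easy to sketch: run the same query at steadily increasing partition counts and make any non-linear growth visible early. The load_partitions and run_join_query hooks below are hypothetical stand-ins for whatever loading and query interfaces your own product offers:-

```python
"""Sketch: probe query scalability across increasing partition counts."""
import time


def probe_scaling(load_partitions, run_join_query, counts=(1, 2, 4, 8, 16, 32)):
    """Return a mapping of partition count to elapsed query time in seconds."""
    timings = {}
    for count in counts:
        load_partitions(count)          # hypothetical setup hook
        start = time.perf_counter()
        run_join_query()                # hypothetical query under test
        timings[count] = time.perf_counter() - start
    return timings


def report(timings):
    """Print each timing with the growth factor over the previous size."""
    previous = None
    for count, elapsed in sorted(timings.items()):
        factor = "" if previous is None else f"  (x{elapsed / previous:.1f})"
        print(f"{count:>4} partitions: {elapsed:8.2f}s{factor}")
        previous = elapsed
```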

The development teams can also help here by ensuring that early feature designs are structured to allow for scalability in the future. If this is not done then you can end up having to rework entire feature sets when the size of your implementations grows. I've experienced cases where we've ended up with two feature sets that do exactly the same thing: the original one which we developed for early customers, and a new version for larger implementations to overcome scalability blocks in the earlier design. This is confusing for the customer and increases risk and testing effort.

 

Prepare to Patch

Yes, we are software testers and admitting that you have had to patch the software feels like admitting failure, but for most early developments it is a fact of life. One of the main challenges when testing in a startup is that you don't always know what your customers are going to do with the system. If your requirements gathering is working, and agile methods certainly help with this, then you'll know what they want to do. What you don't know, however, is the wide range of sometimes seemingly irrational things that people will expect your system to do. These will often only emerge in the context of active use, yet will merit hasty resolution. Whether running an online SaaS type system or a more traditional installed application such as the one I work on, I think a very good rule of thumb is to have the mechanism in place to upgrade to your second release version before you go live with your first.
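To make that rule of thumb concrete, here is a minimal sketch of an upgrade-path check that could sit in a nightly build. The installer scripts, version numbers and paths are hypothetical placeholders for your own packaging; the point is that the upgrade mechanism is exercised routinely before any customer ever needs it:-

```python
"""Sketch: exercise the upgrade path before the first release ever ships."""
import subprocess


def sh(*cmd):
    """Run a command and fail the check if it returns a non-zero exit code."""
    subprocess.run(cmd, check=True)


def test_upgrade_path():
    # Install the current release, upgrade to the next development build...
    sh("./install.sh", "--version", "1.0.0", "--prefix", "/tmp/appliance")
    sh("./upgrade.sh", "--to", "1.0.1-dev", "--prefix", "/tmp/appliance")
    # ...then confirm the system still answers. Replace with real smoke tests.
    sh("/tmp/appliance/bin/healthcheck")
```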

 

Consider Future Stakeholders

Startups are characterised by having a few key personnel who carry a lot of expertise and often wear multiple hats. It is not uncommon to find senior developers and architects doubling up to take on sales channel activities or implementation work in small startups. As a result, in the early stages, the people performing these roles require little support and assistance in installing and using the system. If the company enjoys successful growth, however, there will typically follow an influx of specialist individuals to fill those roles who will be operating without the background of having developed the system. Assuming that they will seamlessly comprehend all of the idiosyncrasies of the product in the same way is unrealistic. I think a great idea is to include future stakeholder roles in your testing personas as a way of preparing your product for the changes in personnel that will inevitably come as the company grows.

 

Design temporary things to be permanent

With a small and reactive company chasing a market you cannot predict the directions you will be pulled in the future. Don't assume that the 'temporary' structure you put in place won't still be in place in 5 years' time. Some of the test data that I created to test ODBC connectivity 6 years ago is still run today as a nightly test and used extensively for exploratory testing. Given that you never know which of the items that you implement will get revisited at some point in the future, it pays to design things to last. I learned this lesson the hard way from working on integrating a supplier's product two years ago. I deferred putting in place an appropriate mechanism for merging their releases into our continuous integration. The merge happens so infrequently that it is hard to prioritise revisiting it, however when we do have to do it, it is a cause of error and unnecessary manual effort. As we found in this case, the overall cost in time of repeatedly working with a poorly structured utility is inevitably greater than the time that we would have spent had we invested in getting it right from the start. Every time I have to follow this process I regret not spending more time on it when our focus was on that area.

 

These are just a few of the lessons that I've picked up from working specifically with a startup. If I could go back and give myself some advice, the points I mention here would certainly have helped me to avoid some of the pitfalls that I have faced. In another six years I may well be writing about tips for taking testing through the next phase of development from a small company to a market leader, or I may have moved on and have a new set of lessons from another context. In the meantime here's to the last six years and I hope you find some useful tips in the points presented here.

Sunday, 25 November 2012

Blood in the Water - why bug 'feeding frenzies' are a good sign


I've been fortunate that my company has enjoyed a successful year and I've been in a position to take on new testers. As the team has grown an interesting phenomenon has become more apparent relating to patterns of bugs that are raised against our system. At first glance it is an apparently worrying trend, however if we look deeper I think it is something that gives me confidence that we're hiring the right folks and ultimately provides an indication of a healthy testing culture.

A feeding frenzy


We enjoy a highly collaborative relationship between the testers and developers across the team. One way that this manifests itself is through open discussion of any issues that we find through testing. Rather than limiting our communication to playing tennis with the bugs database, we'll discuss emerging problems openly with developers. We'll chat about the risks relating to an issue and decide whether to tackle it immediately or raise it in the tracking system for future prioritisation. Most importantly, though, we'll try to identify the contributory factors behind the issue to pin down exactly what has caused it to occur.

The phenomenon that I find interesting is that, after one of the testers in the team has been discussing an issue in the office, over the next few hours or days I will often see more bugs raised which have characteristics in common with the original issue. The interesting part is that these will come, not from the original tester, but from other members of the team not immediately connected to the original discussion. These will always differ sufficiently from the original issue to not be mere bug plagiarism, but there will be a clear enough correlation between the original bug and the subsequent ones to know that the occurrence of the latter was directly influenced by the former.

A Sign of Distraction?


So are the testers getting distracted? That is one way of looking at it. Another is that they are simply copying each other, but given the fact that I place no value on the number of issues a tester raises, this would be pointless. I take a different view. An extremely useful but rarely mentioned source of testing information is simply listening in on conversations that take place in the office, eavesdropping if you will. If you were to ask me to list my top 10 most powerful testing techniques then eavesdropping would probably rank in there somewhere. When a crop of issues is raised around an area which has recently been the subject of discussion, this tells me that the testers in my team have their ears open and are listening to the information that is being generated and shared between their colleagues daily. Not only that, but they are using that information, processing it and applying it to their testing.

The Birth of a Heuristic


Testers use heuristics to generate testing ideas. These will be based on combined information from a variety of sources including documentation, conversations, articles and publications, testing theory and simple prior experience. Many of the 'major' issues that I and my team have identified in my system over recent years have been rooted very much in the specifics of the application and its design rather than in programming error. The use of context relevant testing heuristics has certainly allowed us to identify and resolve such issues that may otherwise have gone undetected through more traditional 'requirements traceability matrix' based approaches (as I discussed in this post, an effective way of sharing these heuristics between team members is to generate and maintain a team heuristics sheet on a Wiki). The fact that a discussion around a new problem area can result in a flurry of related issues being generated is, to me, an indication that the team are mentally processing new information to generate new heuristics, and then using their instincts to identify other areas of the system where their application might be relevant. This process of heuristic evolution via 'mutation' is a valuable sign of critical thinking on the part of my team and is certainly a positive behaviour that should be encouraged. This is why, whenever I see a series of issues being raised within the team that looks like the testers have been copying each other's homework, it always makes me smile. After all, you can't turn off instinct in the best hunters, and when there's blood in the water expect a feeding frenzy to follow.

Image: Wikipedia

Tuesday, 13 November 2012

Sparing the Time for Personal Development

A couple of things that I have read recently have really got me thinking about the subject of personal development. Firstly, I read on twitter that Darren MacMillan would not be attending EuroStar this year as he wanted to focus his holiday time on his family. Although he clarified later, when I quizzed him on this, that he probably could get the time off from his company to attend, it seemed strange to me that he would consider needing to use his annual leave entitlement to attend a conference. In a similar vein, I was talking to Anna Baik a while ago who said that she too used her holiday time to attend conferences.

Secondly, Huib Schoots wrote a blog post on his new job where they operate a 4+1 system: 4 days of working, then "1 day per week for the gathering and sharing knowledge and expertise". This sounds like a great culture with a healthy appreciation of the need for personal development time.

A worthwhile investment?


Perhaps I am very lucky, but I've never had to take holiday to attend a conference. I have always justified both the time and cost of attendance on the basis that it is important to develop myself and that doing so will in turn improve my work. I do apply self imposed limits though. I'm happy to justify the costs for the great one day conferences such as the Skillsmatter Agile Testing and BDD Exchange, or Ministry of Testing's TestBash. I was lucky enough to attend Eurostar 2011 as a speaker, but the full cost of a ticket is a significant investment relative to, say, the per-tester hardware budget for a small company, and may not always be worthwhile compared to smaller scale events and other internal options. Compare the cost with putting all testers through the same certification based training course, however, and suddenly it looks less expensive, and I feel that the potential benefits to the company are higher.

A culture of learning


While not quite extending to the 4+1 culture in Huib's company, we do try to promote a culture of learning in my organisation. Some of the ways that I aim to promote this:-
  • Each tester has a bi-weekly review of their personal development needs
  • I maintain a mind map of personal development areas for each team member and we work together on what interests them and tasks they can work on to build their skills and knowledge. If they have an interest in a skill or technology not directly related to their current workload then we'll try to find some tasks to get exposure to that area.

  • We allow team members to choose their own tools
  • Their laptops are their own to configure with whatever tools help them to be productive. The only restrictions are that the software must be safe and legal, and that they must share the knowledge of any useful tools with others on the Wiki and in a lunch and learn session.

  • Testing research tasks
  • I often give team members background tasks to research a relevant testing subject and present their findings to the team to prompt discussion. Subjects we have covered so far include Exploratory Testing, State Based Testing, Risk Based Testing, Feature Injection and User Stories, with upcoming sessions on Model Based Testing and Testing Oracles.

I suggest that team members take a day per sprint on their personal development items and provide the option of attending conferences and meetups. I'm sure that what we do is very limited in comparison to some companies. I have, however, encountered testers who've spent 10 years or more in the job and not met up with others to discuss their craft. One of the arguments that I've heard levelled against investing in conferences, for example, is how much of what the attendee encounters they will actually be able to apply back in their own company. I'd counter this argument in a number of ways:-

  • People will always have an eye on their career progression.
  • It is easy to feel that the world is moving on without you when in the confines of your organisation. Allowing your testers to interact with the wider community will not only give them a feeling of progression but also reassure them that they are abreast of the latest trends.

  • The grass is not always greener
  • Personally I find one of the greatest benefits of talking with other testers is to reassure myself that other people encounter the same issues as we do. I recently discussed the problems of testing large data stores with a test lead from a significantly bigger and better known organisation than my own and was pleased to find that they had hit many issues similar to the ones we face.

  • Losing staff through lack of personal development will cost more
  • I've been lucky to have had very low turnover of testers in my teams over the last 8 years and I think one of the main reasons for this is the opportunity for development in the individuals concerned, combined with working hard to find individuals who will thrive in such an environment.

  • You never know what is going to be relevant
  • Conference speakers and more importantly attendees come from a variety of companies and cultures. You never know when you'll hear an interesting technique or technology which can be directly applied to improve your own testing. Not attending conferences on the assumption that your process is fixed and nothing applicable can be learned from outside is myopic.

As with all things, I think a healthy balance and a consideration for your context needs to be applied. Given the volume and range of conferences and meetings available, it would be quite possible for a tester to spend an entire year (and a lot of money) just attending these and not actually doing any testing. I think that for a company, allowing some regular time on personal development items, combined with time and budget for a few days per year getting out of the office and learning from others, is a sensible investment in the continued development, happiness and ultimately success of your testers.

Image : https://www.flickr.com/photos/slambo_42/558579574

Tuesday, 16 October 2012

Moving Backwards by Standing Still - How Inactivity Causes Regressions



In the last couple of weeks I've encountered an interesting situation with one of our customers. They'd raised a couple of issues through the support desk with specific areas of our system, yet the behaviour in question had been consistent for some time. We were well aware of the behaviour and had not considered that it needed addressing before; in fact, it had not previously even been considered a bug. So what had changed?

The customer, through their ongoing use of the system, had grown more confident in their operation at their existing scale and had started to tackle larger and larger installations. Working on a Big Data system we are naturally well accustomed to massive scale, however in this case the size was in a different direction to our more common implementations. What the customer viewed as 'bugs' had not arisen through programming error, and had not been 'missed' in the original testing of the areas in question. They had arisen through changes over time in the customer's perception of acceptable behaviour, driven by their evolving business needs. The customer had moved in a new direction and behaviour that had previously been acceptable to them was now considered an issue.

A Static Measure of Quality


The common approach to regression testing is to identify a set of tests whose output can be checked to provide confidence that it has not changed from the time that the initial testing of that area was performed. In my post Automated Tests - you on a lot of good days I identified the benefits of such an approach, in that testers have a heightened awareness of the area under test at the point of designing these checks than they would have performing manual regression testing at a later date. On the flipside, however, automated tests represent a static 'snapshot' in time of acceptable behaviour, whereas customers' expectations will inevitably change over time. Automated regression tests in themselves will not evolve in response to changes in the customers' demands of the product. The result is that regressions can occur without any change in the functionality in question, and with all of the existing regression tests passing, through ongoing changes elsewhere which have a consequential negative impact on the perception of that functionality.

Changing Expectations


I've personally encountered a number of possible drivers for changing customer expectations; I'm sure there are many others:-

  • The Raving fan
  • This is the scenario that we encountered in my example above. The product had delivered on and exceeded the customer's expectations in terms of the scale of implementations that they were tackling. This gave them the confidence to tackle larger and larger installations without feeling the need to raise change requests with our product management team to test or advance the functionality to meet their new needs; they just expected the product to scale. In some ways we were the masters of our own downfall in this regard, both by performing so well relative to the previous targets and by not putting an upper limit on the parameters in question. Putting limits in place can tactically be a great idea, even if the software potentially scales beyond them (see the sketch after this list). It will provide confidence in what has been tested and at least provide some valuable information on the scale of use when customers request that the limit be extended.

  • The new shiny feature
  • This one is again a problem that a company can bring upon itself. One of our implementation team recently raised an issue with the character set supported for delimiting export data on the client side; he thought that there had been a regression. Actually the client side export had not changed. We had more recently extended both the import and a parallel server side export feature to support an extended delimiter set, thereby changing the expectations of the client side feature, which up until that point had been perfectly acceptable. Testers need to be on top of these types of inconsistencies creeping in. If new features advance behaviour to the extent that other areas of the system are inconsistent, or simply look a little tired in comparison, then these issues need raising.

  • The moving market
  • Markets don't stay still and what was market leading behaviour can soon get overtaken by competitors. In the wake of the Olympics some excellent infographics appeared comparing race winning times, such as the graph here. Notably, in the 2012 Olympics Justin Gatlin posted a time of 9.79 - faster than the 9.85 that won him the title in 2004, but only enough for bronze this time. The competition had moved on. In the most obvious software cases advances can come in the form of improved performance and features, however factors such as platform support, integration with other tools and technologies, and price can also be important. The failure of Friends Reunited stands out as an extreme example. Why would users pay for your service if a competitor is offering a more advanced feature set for free?
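As a sketch of the kind of explicit limit mentioned under 'The Raving fan' above (the parameter name, value and error type are all hypothetical), something as simple as the following makes use beyond the tested envelope visible rather than silent:-

```python
"""Sketch: enforce a documented, tested limit rather than silently accepting any scale."""

MAX_TESTED_PARTITIONS = 1_000  # the largest configuration actually covered by testing


class LimitExceeded(Exception):
    """Raised when a request goes beyond the tested and supported envelope."""


def check_partition_count(requested):
    """Reject configurations larger than anything we have evidence for."""
    if requested > MAX_TESTED_PARTITIONS:
        raise LimitExceeded(
            f"{requested} partitions requested, but configurations above "
            f"{MAX_TESTED_PARTITIONS} are untested; please contact support "
            "to have the limit reviewed."
        )
    return requested
```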


Keeping Up


Now, this is a testing blog, and I'm not saying that it is necessarily the tester's sole responsibility to keep up with changing markets and drive the direction of the product. One would hope that the Product Owners have an overall strategy which takes market changes into consideration and drives new targets and features in response, certainly responsibility for the decline of Friends Reunited cannot be laid at the feet of the testers. What I am saying is that we need to maintain awareness of the phenomenon and try to adapt our testing approach accordingly. Some such regressions may be subtle enough to slip through the net of requirements gathering. It may be, as in our case, that the customer up until that point has not felt the need to highlight their changing use cases. It could be the incremental introduction of inconsistencies across the product. It is our responsibility to check for bugs and regressions in our software and understand that these can arise through changes that are undetectable by automated or scripted regression tests as they occur outside the feature under test. Along with a solid set of automated regression tests there needs to exist an expert human understanding of the software behaviour and a critical eye on developments, both internal and external, that might affect the perception of it.

  • As new features are implemented, as well as questioning those features themselves, question whether their implementation could have a negative relative impact elsewhere in the system
  • Clarify the limits of what you are testing so that you can raise awareness when customers are approaching these limits
  • Talk to your support desk and outbound teams to see if customers' demands are changing over time and whether your tests or product need to change to reflect this
  • Monitor blogs and news feeds in your market to see what new products and features are coming up and how this could reflect on your product
  • Try to use relevant and current oracles in your testing. This is not always easy and I'm as guilty as anyone of using old versions of software rather than going through the pain of upgrading, however your market is constantly changing and what constitutes a relevant oracle will also need to change over time as new products and versions appear.

As I discussed in this post, bugs are a personal thing. It stands to reason, when quality is subjective, that regressions are relative, and require no change in the behaviour of the software functionality in question to occur. When interviewing candidates for testing roles I often ask the question "What differentiates good employees from great employees?". When it comes to testing, maintaining an understanding of this and having the awareness to look outside of a fixed set of pre-defined tests to ensure that the product is not suffering regressions would, for me, be one thing that marks out the latter from the former.

image : ray wewerka http://www.flickr.com/photos/picfix/4409257668

Sunday, 23 September 2012

The Problem with Crumpets - on Information and Inconsistency



My 3 year old daughter has a milk allergy. I'm not talking about an intolerance, although those can be pretty bad, I mean a full on allergic histamine reaction to any dairy products either ingested or touching her skin. When we tell other parents about this a common sentiment is that they can't imagine how they would cope in such a situation. But we do cope, just as many other parents cope with similar situations and, sadly, other conditions that are much more severe. While this was a significant hurdle for us to tackle in the weeks after we found out, over time we've adjusted our own behaviour to take Millie's condition into account to the extent that for much of the time it is nowadays not at the forefront of our minds.

Accidents aside, when we do encounter problems they can usually be attributed to one of two situations.
  1. A lack of information about a product
  2. Inconsistency in a product with our expectations

The Known Unknowns


Lack of information hits us when we cannot tell whether a product contains milk or not. Whilst I have to say in the UK the food labelling is excellent, we do still encounter situations at home and abroad where we cannot be sure whether a foodstuff contains dairy. This is incredibly frustrating when we are trying to feed my daughter. She is a great eater and will try most things so it saddens us when we are unable to give her things that are probably perfectly safe as we don't have the information on the ingredients.

Whilst very frustrating, lack of information is not specifically dangerous. We are conscious of our absence of knowledge and can take steps to improve this, or adopt a low risk strategy in response. This usually involves disrupting a shop assistant's otherwise peaceful day to fetch packaging and read out long lists of ingredients. Sometimes, as a last resort, it involves just ordering my daughter chips / french fries.

Almost as bad as the complete absence of information is the situation where allergy information has been provided, but clearly no thought has been given to the circumstances under which such information might be required. Restaurant allergy lists that mark entire meals as containing dairy when actually it is only the dressing on the side salad, or that list all of the ingredients that the restaurant uses but leave an almost impossible task of mapping these back to the meals on the menu, are prime examples. Information that is not available in the appropriate format or location when it is required can be as bad as no information at all. Burying the failings of your system deep in the documentation and proudly pulling this out shouting 'RTFM' when your users raise support requests is about as sensible a strategy as telling customers standing in your restaurant that your allergy list is only available on your website (this has happened to us).

My key point here is that a lack of information, or poor quality information, may not directly cause mistakes, but it certainly creates frustration and extra work. If your product is not intuitive to use and has poor user documentation then your customers may not necessarily be getting themselves into trouble, but they will have to work harder to find out how to achieve their goal. Your support desk is probably busier than it needs to be answering questions, just as my wife and I use up the shop assistants' time running round reading packaging. Alternatively they might act out of frustration, try to plough on regardless and get themselves into trouble. Again the result is likely to be a costly inquiry to your support team.

The danger of the unknown unknown


A potentially bigger problem that we face is inconsistency. When products, product ranges or companies are inconsistent in their use of dairy then it can have grave consequences for my daughter. At least with a lack of knowledge we are aware of our situation. When we encounter inconsistency we may not possess a similar awareness, instead falsely believing that we are acting in a position of knowledge, which is far more problematic. Some examples:

  • Asda brand crumpets are dairy free but Marks and Spencer's are not (both are UK supermarkets).
  • Jammie Dodger standard size biscuits contain no dairy, but the smaller snack size versions have milk in them.
  • Heinz ketchup does not contain milk but Daddies ketchup does
  • Hellmann's original mayonnaise has no dairy but the reduced fat mayonnaise contains cream (yes, they honestly add cream to reduce the fat content)
  • McDonald's in the UK have a full allergy list in store plus a kids meal that is dairy free, thereby providing a safe (if less than appealing) food option when travelling; McDonald's in France have no allergy list and no dairy free meal options - even the burger buns contain milk.
  • Some of the serving staff at TGI Fridays in my home town are aware of their allergy list but some are not and so do not consult it when suggesting safe dairy free options

As you can probably tell, all of these are examples we've personally encountered, with varying degrees of disaster. It is when we have encountered situations of inconsistency that Millie has been most at risk. We act, assuming ourselves to be in a position of knowledge, yet that assumption is incorrect. The impact can vary from having to disappoint our daughter that actually she can't have the meal/treat we just promised, to her having a full blown allergic reaction having eaten a crumpet that Millie's grandmother mistakenly believed to be safe.

The key point here is that when information is lacking, customers may still act out of frustration, but they will be aware that there is an element of risk. With inconsistent behaviour that awareness of the risks of their actions may not be present. As testers, a key part of our job is to understand the context in which the users are using the system and the associated behaviours that they will expect. Michael Bolton recently wrote a post extending the excellent HICCUPS mnemonic with regard to consistency heuristics that help to consider the different consistency relationships that might exist for our product.

Some approaches that I have used to try to consider other viewpoints in our testing:-

  • Developing personas and identifying the products and terminology that those personas will associate with yours.
  • Sometimes the context in which you are looking for consistency is not obvious and some care must be taken if the appropriate oracles are to be identified. I have a friend who once deleted a lot of photographs from his camera as he selected the 'format' option thinking it would allow him to format his photos. For the context of the SD card as a storage device the word format has one meaning consistent with computer hard disks and other such devices, but in the context of photography the term 'formatting' has quite another meaning, with which the behaviour was inconsistent.

  • Researching other products associated with your market and using these as oracles in your testing.
  • This may change over time. My organisation has historically worked in the database archiving space and a lot of our testing oracles have been in that domain. As we have grown into the Hadoop and Big Data markets, a new suite of associated products and terminologies has started to come into our testing.

  • Question functional requirements or requirements in the form of solutions
  • Try to understand not only the functionality required but also the scenarios in which people may need to use that functionality. As I wrote about here using techniques such as the 5-whys to understand the situations that prompt the need for a feature can help to put some perspective around the relationships that you should be testing and identify the appropriate oracles.

  • Employing team members with relevant knowledge in the market in which your product is sold
  • As I wrote about in this post it is a great idea to develop a skills matrix of relevant knowledge and try to populate the team with a mix of individuals who can test from different perspectives. Testers can share their knowledge of the expectations of the roles that they have insight into and help to construct more realistic usage scenarios.

Consideration of the Situation

Most products from supermarket bakeries have somewhere on the packaging 'may contain nuts or other allergens'. If the requirement for food labelling is that it warns of the potential presence of allergens then this label does achieve that and would pass a test of the basic criteria. As a user experience, however, it is extremely frustrating and results in perfectly suitable products not being bought, time consuming requests to staff, or unnecessary risk taking on the part of us, the customers. I'm sure that the restaurant allergy lists that I referred to above would look very different if the designers had tested the scenario of actually trying to order a meal for someone suffering an allergy, rather than just delivering the functional requirement of listing all of the allergens in their food.

A critical testing skill is the ability to be considerate of the situation and experiences of the users and their resulting expectations. Delivering and testing the required functionality is only one aspect of a solution. In addition to testing the 'what' we should also consider 'why' and 'when' that functionality will be required and the situations that demand it. If customers cannot use your features without making mistakes or needing further help, due to missing information and inconsistencies with their expectations, then neither your customers nor your support team are likely to thank you for it.

As a daily user of allergy labelling it is very clear to me which companies and products aim to meet only the basic requirements of food labelling and which go beyond this to provide consistency across their range, clear labelling and useful additional information at the point at which it is required. Needless to say, it is the latter organisations and products that we seek out and return to over and over again.

image : http://www.flickr.com/photos/theodorescott/4469281110/

Thursday, 13 September 2012

Starting Early 2 - Internship Review


The end of August marked the completion of the placement for Tom, our intern. As I wrote about in a previous post, we'd offered Tom an internship placement with my team during his summer break. He spent 8 weeks with the company, the first two spent job shadowing and doing work experience, followed by a 6 week individual test automation project.

Building Foundations


Over the first two weeks we spent time introducing Tom to the principles that we followed in developing our software, and the importance of testing to our success. We spent time on both specific development and testing concepts and also on developing more general business skills to support these. Watching Tom struggle through the first week trying to keep his eyes open after lunch reminded me of the shock to the system that I experienced when first working full time after a student lifestyle. I like to think that it was no reflection on his level of interest in what he was doing.

A critical and often overlooked factor in successful testing is the ability to communicate effectively with the business. This is a skill area that graduates aren't necessarily exposed to through their academic studies, so I spent some time with Tom on making presentations, attending meetings and managing emails. Tom practised these skills by researching and making a presentation on 'Testing in an Agile Context' to the team.

An Agile Internship


In the subsequent weeks we split Tom's test automation project into a set of user stories, each delivering value to the test team. One of our senior developers helped Tom with creating a set of unit tests and a continuous integration build. The other testers and I helped him with creating his testing charters and reviewing his testing. I introduced him to ideas around exploratory testing, using great articles by Elisabeth Hendrickson and James Bach to introduce essential principles.

I think the style of work and the level of interaction among the team was very new to Tom. He relished the working environment and took to his tasks with enthusiasm and diligence. He completed the initial story well and went on to deliver a second that added further valuable functionality.

Wrapping Up


Tom wrapped up his internship with a presentation on his experiences to our VP. He had had a fantastic time and learned a huge amount. Tom's feedback on the team was that, despite the fact that everyone was obviously busy, they always had time to help him. This is a really important part of our culture so I'm pleased that in his short time, working on a relatively low priority project, Tom still developed this perception. He felt that we were a great company and just the sort of place that he would like to work in future.

An Educational Gap


Over the summer I hope that Tom has developed a solid understanding of agile methods and the importance of testing in a successful software company. Based on conversations with Tom it was apparent that these were subjects that suffered from limited coverage in his university course. According to Tom, on his group development project there were no marks awarded for the error handling or stability of the delivered software. Demonstrating that the final solution had been tested was not a requirement for the project. In my earlier post I lamented the lack of exposure to testing as a career option in universities. The lack of any requirement to demonstrate testing at all is a more fundamental concern. One of the biggest problems I've seen in software that I have tested has been that validation and error handling have been secondary considerations after the initial functionality has been written. Not every graduate programmer will adopt such an approach (just as not every tester measured on their bug counts will focus on raising arbitrary cosmetic issues), however it relies heavily on the integrity of the individual not to slip into bad habits. How can we expect anything other than a trend towards delivery of the happy path functionality alone when it is clear that this is exactly the approach that is promoted in university projects?

Tom returns to his final year with a good understanding of testing and agile approaches, and a copy of Gojko Adzic's 'Specification by Example'. When questioned over whether his experience would assist him in his final year he was unsure, given the lack of exposure to agile in the university CS syllabus. I am more hopeful - I subsequently persuaded him that an agile style approach to his dissertation project on AI modelling, delivering incrementally more complex models, would be an ideal tactic to ensure he did not overrun. This is obviously a small pebble in a large pond, but if more companies offer these placements and expose students to the methods and skills being used in commercial software development, along with the importance of testing, then maybe the quality of our commercial applications will benefit as these students progress into their careers.

Monday, 3 September 2012

A contrast of cultures


Whilst travelling back from UKTMF in January I happened to sit opposite another of the conference attendees on the train, Steve. We got chatting and I found that Steve ran the BA and testing functions for the insurance company Unum, and we realised that we had a common acquaintance: a tester who had worked in a team I ran in a previous role now worked for Steve's department. As we talked about our common acquaintance and our own jobs it soon became apparent that, although we both ran testing operations, the cultures and approaches in our respective teams were significantly different. I described to Steve how we operated in an agile environment with high levels of collaboration and discussion and an exploratory approach to testing backed with high levels of automation. Steve's environment, on the other hand, was one of a classic staged approach with extensive documentation at each stage and heavy use of scripted manual testing. As we talked, an idea started to form of doing an 'exchange' scheme, with members of our teams visiting the other to examine their processes and learn more about their relative cultures.

An Exchange Visit


A few weeks later one of my team spent the day in Unum's Bristol offices. She spent time with testers and managers in the company learning about their approach to testing and the processes and documentation supporting this. She returned from the day with a wealth of information and a hefty set of sample documentation. Our whole department attended a session where she presented her findings from the day. As Steve had described, she explained that the process was very formal, with a strong reliance on scripted manual testing in HP Quality Centre. What was also clear was that Steve's team took quality seriously and were achieving very high levels of customer satisfaction with their approach. Legislation was a key driver of requirements, resulting in long, predictable feature timescales. The fact that their feature roadmap was fixed months in advance allowed a more rigid, documented approach to be successful. The long turnaround times on new features were both accepted and expected by the business.

The return trip


One week later we received a visit from one of the senior testers from Unum, Martin. He was a very experienced tester within the organisation, having moved to testing from a role in the business 15 years earlier (Steve later explained that around 4 in 5 of the testers they recruit are hired internally from the business). I spent the morning with Martin discussing our approach to testing and explaining our use of:-
  • continuous integration
  • automated acceptance testing
  • collaborative specification with examples
  • thread based exploratory testing
before one of my team took him through some examples of our exploratory testing charters.

Martin was really interested in our approach and I think the success we were achieving with very lightweight documentation was a real eye opener. He appreciated the fact that our approach really gave us flexibility to pursue new features and change our priorities quickly, allowing us to be competitive in fast moving markets.

We discussed whether a more lightweight agile approach might be suitable for some of Unum's projects. Our visitor's gut feeling was that they would struggle with larger projects due to the lack of management control through the documentation and surrounding metrics, and would most likely have to trial it on some smaller developments. Bugs for them, for example, were an important measurement of both tester and developer efficiency. The fact that our team often didn't raise bugs, preferring a brief discussion and demo, was not something that would have fitted well. I can understand this, however a potential pitfall of taking this approach is that the small scale projects would be likely to be short term with limited scope, yet the biggest asset of an agile approach is delivered through an ongoing process of continuous improvement. The greatest benefits that I've seen through agile have come as a result of a maintained commitment to the principles of collaboration, team autonomy and usually test automation that underpin it. Doing short term 'trials' of an agile approach would be selling the ethos short. Martin and Steve both understood this, and felt that it would hinder their adoption of agile methods.

Summing up



This week Steve was kind enough to visit our office to review the exchange and discuss any future steps. I plan to visit another of Unum's offices later this year to learn about their performance testing, and Steve has suggested to his development team that they consider sending one of their number to visit our team as well.

It may seem counter-intuitive that two teams with such different cultures would benefit from such an exchange, however I have found it really useful. I can understand why the differences exist between our relative organisations, and there is still a huge amount that we can learn from each other. Theirs is an older company with proven success with a mature methodology. Given their historical success they are naturally wary of new methods, but doing an exchange like this shows that Steve and his team have an open mind about different approaches and a pragmatic view on their ability to adopt them.

The biggest lesson that I have taken away from the exchange was that, whatever methodology people are working with, great teams can and will deliver great software. Having worked on testing projects in more traditional methodologies before, this was more of a timely reminder to me than a new idea. Having now worked in a successful agile environment I know that my preference will always be for a leaner, more iterative approach. Steve's department may be doing things 'old school' but they are doing it in a positive and professional way, which means there is a lower incentive to change. I would rather work for a department like Steve's than for an organisation that has adopted agile methods as a knee-jerk reaction to failures caused by bad management and cultural shortcomings.

image : http://www.flickr.com/photos/xtianyves/7110002847

Sunday, 29 July 2012

Putting your Testability Socks On



One of the great benefits of working in a process where testers are involved from the very start of each development, testing and automating concurrently with the programming activities, is that the testability of the software becomes a primary concern. In my organisation testability issues are raised and resolved early as an implicit part of the process of driving features through targeted tests. The testers and programmers have built a great relationship over the last few years, such that the testers are comfortable raising testability concerns and know that everyone will work together to address them.

As is natural when you have benefitted from something for a while, I confess that I'd started to take this great relationship for granted. A recent project has provided a timely reminder for me on just how important the issue of testability is...

Holes in our socks


We've recently introduced a new area of functionality into the system that has come via a slightly different route than most of our new features. The majority of our work is processed in the form of user stories which are then elaborated by the team along with the product owner in a process of collaborative specification. This process allows us to identify risks, including testability issues, and build the mitigating actions into our development of the corresponding features.

In this recent case the functionality came about from a bespoke engineering exercise by our implementation team, driven by a customer, which was then identified as having generic value for the organisation and so brought in-house to integrate. The functionality itself will certainly prove very useful to our customers, but as the initial development was undertaken in the field in a staged project, issues of testability have not been identified or prioritised in the same way as they would have been in our internal process. We've already identified a number of additional work items required in order to support the testability of the feature long term through future enhancements. Overall it is likely that the testing effort on the feature will be higher than if the equivalent functionality had been started within the team with concurrent testing activities. Given the nature of the originating development it is understandable why this happened, but the project has served as a reminder to me of the importance of testability in our work. It has also highlighted how much more effective iterative, test driven approaches are at building testability into a product than approaches where testing is a post development activity.

Why Testability?


In response to a request for improving testability, a senior programmer in a previous employment once said to me "Are you suggesting that I change the software just to make it easier for you to test?". In a word, yes.

Improving the testability of the software provides such a significant benefit from a tester's perspective that it seems surprising how many software projects I'm aware of where testability was not considered. In the simplest sense by improving testability we can reduce the time taken to achieve testing goals by making it quicker and easier to execute tests and obtain results. Testability also engenders increased confidence in our test results through better visibility of the states and mechanisms on which these results are founded, and consequently in the decisions that are informed by these.

The benefits of improved testability are not limited to testing either. From working on supporting our system I know that improved testability can drive a consequential improvement in supportability. The two have many common characteristics such as relying on the ability to obtain information on the state of the system and the actions that have been performed on it.

Adding testability can even yield an improved feature set. I read somewhere that Excel's VBA scripting component was originally implemented to improve testability, and it has gone on to become one of the key features for many users (sadly I can't source a reliable reference for this - if anyone has one please let me know).

So what does this have to do with socks?


When researching testability for a UKTMF session a few years ago I came across this presentation by Dave Catlett of Microsoft, which included a great acronym for testability categories - SOCK (Simplicity, Observability, Control and Knowledge). I'm not a great fan of acronyms for categorisations normally, as they tend to imply exhaustiveness on a subject and fall down in the case of any extensions. As with all things testing, James Bach also has a set of excellent testability heuristics which include similar categories, with the additional one of Stability. As luck would have it, in this case the additional category fits nicely onto the end of Catlett's acronym to give SOCKS (it would have been very different if the additional category was Quotability or Zestfulness). As it is, I think the result is a great mnemonic for testability qualities:-

  • Simplicity

  • Simplicity aids testability. It primarily involves striving to develop the simplest possible solutions that solve the problems at hand. Minimising the complexity of a feature to deliver only the required value helps testing by reducing the scope of functionality that needs to be covered. Creeping the feature set or gold plating may appear to be over-delivering on the part of the programmer, however the additional complexity can hinder attempts to test. Code re-use and coding consistency also fall into this category. Re-using well tested code and well understood structures improves the simplicity of the system and reduces the need for re-testing.

    I feel that simplicity is as much about limiting scope as it is about avoiding functional complexity. I've grown accustomed to delivering incrementally in small stories where scope is negotiated on a per story and per sprint basis. Working on a larger fixed scope delivery has certainly highlighted to me the value in restricting scope to target specific value within each story, and the ensuing testability benefits of this narrow focus.

  • Observability

  • Observability is the ability to monitor what the software is doing, what it has done and the resulting states. Improving log files and tracing allows us to monitor system events and recreate problems. Being able to query component state allows us to understand the system when failures occur and prevents misdiagnosis of issues. When errors do occur, reporting a distinct message from which the point of failure can be easily identified dramatically speeds up bug investigations.

    This is one area where we have recently identified a need to review and refactor in order to improve the visibility of state-changing events across the multiple server nodes of our system. In addition to being a great help to testers, this work will have the additional benefit of improving ongoing supportability.

  • Control

  • Along with observability, control is probably the key area for testability, particularly if you want to implement any automation. Being able to cleanly control the functionality, to the extent of being able to manipulate the state changes that can occur within the system in a deterministic way, is hugely valuable to any testing effort and a cornerstone of successful automation.

    Control is probably the one area in my most recent example where we suffered most. Generally when implementing asynchronous processes we have become accustomed to asking for hooks to be integrated into the software that allow them to be executed in a synchronous way. The alternative is usually implementing sleeps in the tests to wait for processes to complete, which results in brittle, unreliable automation (see the sketch after this list for a simple illustration of the difference).

    Exposing control in this way is achieved much more quickly and easily at the point of design than as a retrofitting activity afterwards. I remember working on a client-server data analysis system some years ago which, as part of its feature set, also included a VBA macro capability. This was testability gold, as it allowed me to create a rich set of automated tests which directly manipulated the data objects in the client layer. The replacement application was in development for over a year before being exposed to my testing team, by which time it was too late to build in a scripting component. We were essentially limited to manual testing, which for a data analysis system was a severe restriction.

  • Knowledge

  • Knowledge, or Information, in the context of testability revolves around our understanding of the system and the behaviour that we are expecting to see. Do we have the requisite knowledge to critically assess the system under test? This can be in the form of understanding the system requirements, but can also include factors such as domain knowledge of the processes into which the system must integrate and an understanding of similar technologies to assess user expectations.

    In the team in which I work, knowledge issues in the form of missing information or lack of familiarity with technologies are identified early in the elaboration stages. The approach to addressing these can vary from simply raising questions with the product owner or customer to clarify requirements, to a targeted research spike into specific technologies or domains. As we are seeing, with longer term developments the learning curve for the tester coming into the process becomes much steeper and testability from each tester's perspective is diminished. Additionally, with less immediate communication the testers have less visibility of the early development stages and consequently a weaker understanding of the design decisions taken and the rationale behind them. It has taken the testers some time to become as familiar with the decisions, designs, technologies and user expectations involved in our latest project as with those where they are actively involved in the requirement elaboration process.

  • Stability

  • The 'additional S' - I can see why it was not included in Catlett's acronym, as this is not an immediate testability characteristic; however, as James suggests, it is an important factor in testability. James defines stability specifically in terms of the frequency of, and control over, changes to the system. Working in an agile process where the testing occurs very much in parallel with active programming changes, functional changes are to be expected, so implementing these in a well managed and well communicated way is critical. I find that the daily stand-ups are a great help in this regard. Having had experience in the past of a code base under active development by individuals not involved in the story process, I know how much it can derail the testing effort when changes appear in the system that the testers are not expecting and that have not been managed in a controlled fashion.

    I'd also be inclined to include the stability of individual features under this category. It is very difficult to test a system in the presence of functional instability in the form of high levels of functional faults, primarily because issues mask other issues: the more faults that exist in the system, the greater the chance of other faults lying inaccessible and undetected. Additionally, investigating and retesting bugs takes significantly longer than testing healthy functionality. Nothing hinders testing, and therefore diminishes testability, like an unstable system.
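
To make the point about control hooks concrete, here is a minimal sketch in Python of the difference, as promised under Control above. The names (server, process_batch, wait_for_batch, batch_complete) are purely illustrative and not an API from our product; the contrast is between guessing at completion time with a sleep and blocking on a hook that the product exposes for testability.

import time

# Brittle: start the asynchronous work, then guess how long it will take.
def test_batch_with_sleep(server):
    server.process_batch("orders.csv")
    time.sleep(30)  # too short and the check flickers; too long and the suite crawls
    assert server.batch_complete("orders.csv")

# Deterministic: the product exposes a hook that blocks until the work finishes
# (or a timeout expires), so the check runs as fast as the product allows.
def test_batch_with_hook(server):
    server.process_batch("orders.csv")
    server.wait_for_batch("orders.csv", timeout=60)  # hypothetical testability hook
    assert server.batch_complete("orders.csv")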


In hindsight I think that the big takeaway from this experience is that a lack of testability becomes more likely the later you leave exposing your software to testing. Following an agile development process has a natural side effect of building testability into the product as you go along. As George Dinwiddie points out in his post on the subject, if you drive your development through your tests then you will naturally build testability into each feature. After enjoying this implicit benefit of our development approach for years, its value could not have been demonstrated to me more effectively than by working on a feature that had not been developed in this way.

References


I hope it is clear that I make no claim to have invented the categorisation of testability concepts; I just like the SOCKS acronym and find it a useful breakdown for discussing my own experiences on the subject. When preparing to present on this, as with most posts and presentations, my first step was to write down my own ideas on the topic before researching other references. In doing this I came up with a similar set of groupings, so I was pleased to find a good correlation with the references I've mentioned. For these, and other good links on the subject, please look below:-

Heuristics of Software Testability – James Bach
Improving Testability – Dave Catlett, Microsoft (presentation)
Design for Testability – Bret Pettichord
Design for Testability – George Dinwiddie
image: http://www.flickr.com/photos/splityarn/3132793374/

Wednesday, 27 June 2012

A Tester's Toolkit

As I mentioned in a recent post, I've intended for a while to put together a list of the testing tools that I use on a day to day basis in my testing efforts. The idea was brought back to the forefront of my mind recently when reading this post by Jon Bach on an exchange of tool ideas between Jon and Ajay Balamurugadas.

I've added this as a permanent page to my site. A link to the page can be found in the side bar or here. Please let me know what you think, and if you have any great tools to share I'd be more than happy to hear about them.

Tuesday, 19 June 2012

Starting Early - A Testing Internship



I received an email a few weeks ago from a young Computer Science undergraduate with a dilemma. He was trying to find an undergraduate placement, or internship, but was struggling. In his own words:
The main issue that I am having is that all the internships that I have found so far seem to be heavily software engineering based and whilst I would say that I am a competent programmer, it's not something that I really enjoy and I'm not sure that I want to have a career where the main focus is programming.
I was saddened by this but not entirely surprised. The impression that I get from most of the recent CS graduates that I've spoken to is that they have very little visibility of any careers other than programming when leaving University. While programming is an essential element in software development, the professional field of IT has a much richer range of important roles that can form rewarding careers. Given that this is primarily a testing blog, I'm clearly talking about testing, however I'm also thinking about technical support, systems administration, implementation and many other key roles.

In my reply I explained that I understood his position - programming careers are not for everyone - and assured him that there were many possible careers he could consider that require a range of skills in addition to, or instead of, programming, starting with my own areas of testing and technical support:-
Testing - this is not a role that people necessarily look to as a career when leaving university, however there are many of us out there who are enjoying careers in testing. Personally I am enjoying far greater success as a tester than I think I ever would have achieved as an Analyst Programmer, which is where I started out. Testing can provide a great mixture of problem solving, modelling, business and customer facing work and scripting and programming, so it is great for someone who values a mixed role. There is a trend at present to discuss the "death of testing", but I think that this is a good thing: many have historically perceived testing as a low skilled role, and what we are actually looking at is the evolution of testing roles into more interesting and challenging jobs for talented individuals.

Support - technical support comes in many forms, the main two being internal IT support within large organisations (supporting desktop and server installations) and customer support for software vendors. My team fall into the second category. It is challenging work as the customers often have high demands, however there is a lot of problem solving involved and, depending on your company, there is the potential to work with people from all around the world. Some of our customers include organisations in the USA, Italy and Japan and large mobile phone companies in the Middle East. From a support role there can be a good progression into more responsible outbound roles such as pre-sales and implementation support, where you work directly with customers, helping them with their initial software implementations and resolving the issues that they face. Again this can be a challenging role but also a very rewarding one, both in terms of money and job satisfaction.
I went on to highlight some other roles that I had not personally done but that Tom could consider; these included Pre-Sales and implementation, Systems Administration, DBA and Business Analysis.

An Educational Endeavour


Tom was enthusiastic about finding out more about testing and support. After further discussion, and some useful input from his university careers office, I was pleased to offer him an internship placement in my team. Our VP was very supportive and made all of the arrangements.

I am planning a test exploration and automation task, combined with some job shadowing, which I'm hopeful will provide an enjoyable placement and show him how much problem solving and creativity is involved in both testing and support jobs. Given that it is my first foray into running this kind of scheme, I'm sure I'll make some mistakes, however I hope to continue to offer similar placements in the future.

Internship placements are a great way to give young people access to potential careers. The US based SummerQAMP scheme, backed by the AST, is looking like a great initiative for exposing testing as a potential career to young people in America through summer internships. In the absence of a similar initiative in the UK, the more organisations that offer similar placements on an individual level to our talented undergraduates, the better.

Let's not miss out


The lack of visibility of testing and other roles to CS undergraduates seems to be a real problem. Programming is a field that does not appeal to everyone, yet there are undergraduates like Tom who appear to have little knowledge of their alternatives. The skills that make good programmers are very different from those that make good testers or implementation consultants, and successful IT organisations require a mixture of abilities and personalities. Without initiatives to educate our youngsters on their options, I worry that this lack of knowledge of the other careers available could actually deter young people with valuable aptitudes from progressing into IT.

image: http://www.flickr.com/photos/lrargerich/3187525211/

Tuesday, 29 May 2012

Spot the Difference - using DiffMerge on log files to investigate bugs

I've been meaning for a while to publish a blog post on the tools that I find useful in my day to day testing. The idea was brought back to the forefront of my mind recently when reading this post by Jon Bach on an exchange of tool ideas between Jon and Ajay Balamurugadas. The problem that I have is that there are so many great tools that help me in my testing, and I'd want to do justice to them all with a good level of detail, as I did with WinSCP and Notepad++ in this post. I've started to compile a summary list, which I'll hopefully post sometime soon, but in the meantime I thought I'd do a more detailed post on a fantastic tool by SourceGear called DiffMerge, and how I use it to investigate software problems by comparing log files between systems.

There are a number of diff tools available. The reason that I choose DiffMerge is that it has the nicest interface I've found for visually representing differences between files in a clean and simple manner. It is this visual element that is most important for my use case. It is not immediately obvious why someone working primarily with big data, manipulating huge log files and SQL via command line interfaces, would worry too much about visual elegance. But as my data warehouse tester friend Ranjit wrote on his blog, being able to visualise key characteristics of information is a critical skill in testing in our domain, and here is a great technique that demonstrates it.

An inaccessible problem


Both through my testing work and my role running technical support I am often faced with investigating why certain operations that work fine in one context fail to do so in another. The classic "it works on my machine" scenario is a common example. If the installation appears to be sound and the cause of the issue is not obvious, then it can be difficult to know where to look when trying to recreate and diagnose problems. This is the situation that I found myself in a while ago when someone found an issue trying to use our software via an Oracle Gateway interface and encountered an error returning LONG data types. Having set up a similar environment myself a few weeks before, I was well placed to investigate. Unfortunately, after checking all of the configuration files and parameters that I knew could cause issues if not set correctly, I'd got no nearer to the cause of the problem. I had requested and received full debug tracing from the Oracle gateway application, however I was not overly familiar with the structure of the logs, which contained a number of bespoke parameters and settings.


Staring at the log file was getting me nowhere. I knew that my own test environment could perform the equivalent operation without issue - it worked on my machine. Figuring that I could at least take the logs from my healthy system to refer to, I performed the same operation, got the trace files and then compared the two equivalent logs in DiffMerge.


I was unsure about how much useful information this would provide, or even whether the logs would be comparable at all, however the initial few rows looked promising. A quick examination showed that
  • The logs were from equivalent systems
  • The logs were comparable
At this point my flawed but still amazing human brain came into its own. The tool was showing me differences on almost every line between the two files, yet my brain was quickly identifying patterns among them. Within a few seconds I was able to identify and categorise the common ones: date stamps, database names and session ids were easily spotted and filtered out. Very quickly I was scanning through the files, able to ignore the vast majority of differences being shown.

The first few pages revealed nothing interesting, then my eye was drawn to some differences that did not conform to any of the mental patterns I'd already identified.



It was clear that some of the settings being applied in the gateway connection were different to my installation. Although not familiar with the keys or values, I could make a reasonable inference that these settings related to the character sets in the connection. Definitely something worth following up on - I noted it and pressed on.

I rattled through the next couple of hundred lines, my brain now easily identifying and dismissing the repeated date and connection id difference patterns. I could, with a little effort, have written a script to abstract these out. I decided that this wasn't necessary given how easy it was to scan through the diffs, and I was also still in a position to spot a change in the use or pattern of these values.
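
For what it's worth, here is a minimal sketch of that kind of normalisation script, assuming Python. The file names and regular expressions are illustrative rather than the actual patterns from the gateway traces; the idea is simply to replace the noise you already know about with fixed placeholders before diffing, so that only the interesting differences remain.

import difflib
import re

# Differences we already know are noise: replace them with fixed placeholders.
NOISE = [
    (re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(\.\d+)?"), "<TIMESTAMP>"),
    (re.compile(r"session id[:=]\s*\d+", re.IGNORECASE), "session id=<ID>"),
]

def normalise(path):
    """Read a log file and blank out the known noisy values."""
    lines = []
    with open(path) as log:
        for line in log:
            for pattern, placeholder in NOISE:
                line = pattern.sub(placeholder, line)
            lines.append(line)
    return lines

# Diff the normalised logs; only the differences that matter should remain.
diff = difflib.unified_diff(
    normalise("healthy_gateway.trc"),
    normalise("failing_gateway.trc"),
    fromfile="healthy",
    tofile="failing",
)
print("".join(diff))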


The next set of differences was interesting again - it showed some settings being detected or applied for the data columns in my query. Again the differences that caught my eye appeared to relate to the character set (CSET) of the columns.


With a scan to the end of the file revealing nothing else that appeared interesting, I investigated the character set changes. Having an area of interest to focus on, I was able to very quickly recreate the issue by altering the character set settings of the Oracle database, which I discovered were then passed by default into the gateway and resulted in incorrect language settings being used on the ODBC connection. I verified that my new settings exhibited the same behaviour by re-diffing the log files from my newly failing system against the ones I'd been given. A bit of internet research revealed how to explicitly configure the gateway connection language settings to force the correct language, and the problem was resolved.

An addition to my testing toolkit


I've used this technique many times since, often to great effect. It is particularly useful in testing to investigate changes in behaviour across different versions of software, or when faced with a failing application or test that has worked successfully elsewhere. It is similarly effective for examining why an application starts failing when a certain setting is enabled - only today I was using it to examine a difference in behaviour when a specific flag was enabled on the ODBC connection.

Give it a go - you might be surprised at how good your brain is at processing huge volumes of seemingly inaccessible data once it is visualized in the correct way.

Monday, 21 May 2012

Automation and the Observer Effect, or Why I Manually Test My Installers

As anyone who has read my blog before will know, I make extensive use of automation in the testing at my organisation. I believe that the nature of the product and the interfaces into it make this a valid and necessary approach when backed up with appropriate manual exploration of new features and risk areas. There is one area of functionality, however, that I ensure always undergoes manual testing with every new release, and that is the installer packages. I've had occasion to defend this approach in the past, so I thought I'd share a great example that highlights why I think this is so important, whilst also providing some excellent examples of the Observer effects that can be particularly apparent in test automation.

Still waters run ... slowly


As part of our last release, one of my team was testing the installers on Linux and found that it was taking an inordinately long time to install the server product. In one test it took him 15 minutes. The programmers investigated and found that a new random number library used to generate keys was relying on machine activity as its source of randomness. On a system with other software running, the random data generation was very fast. On a quiet machine with no activity other than the user running the installer, it could take minutes to generate enough random data to complete the process. By its very nature our automated testing had not uncovered the problem, as the monitoring harnesses were generating sufficient background activity to feed the random data generation and reduce the install time. Through manual testing on an isolated system we uncovered an issue which could otherwise have seriously impacted customers' first impressions of our software.
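
For anyone who wants to see this effect for themselves, here is a small sketch. It is not our installer's code, just an illustration of the underlying mechanism on a Linux machine: on older kernels /dev/random blocks until the system has gathered enough entropy from activity such as disk and interrupt events, so the same read that returns instantly on a busy box can stall for a long time on an idle one.

import time

# How much entropy does the kernel think it has right now?
with open("/proc/sys/kernel/random/entropy_avail") as counter:
    print("Entropy available:", counter.read().strip())

# On an idle machine with an older kernel this read can block for a long time;
# on a busy machine, or a recent kernel, it returns almost immediately.
start = time.time()
with open("/dev/random", "rb") as source:
    source.read(64)
print(f"Read 64 random bytes in {time.time() - start:.2f}s")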

This is a great example of the phenomenon of Observer effects, most commonly associated with physics but applicable in many fields, notably psychology and Information Technology: the act of observing a process can affect the actual behaviour and outcome of that process. In another good example, earlier this year a customer using an old version of one of our drivers reported a problem with missing library dependencies. It turned out that the tool that had been used to test the connectivity of the installation actually put some runtime libraries on the library path that were needed for the drivers to function but were not included in the install package. The software used to perform the testing had changed the environment sufficiently to mask the true status of the system. Without the tool and its associated libraries, the drivers did not work.
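
A minimal sketch of how one might check for this class of problem, assuming Linux and a driver shipped as a shared library; the paths and library name are hypothetical. The point is to inspect the driver's dependencies both with the test tool's environment and with a clean one, since only the latter reflects what a customer actually gets.

import os
import subprocess

DRIVER = "/opt/ourproduct/lib/libdriver.so"  # hypothetical path to the shipped driver

def missing_dependencies(env):
    """Run ldd against the driver and return any dependencies reported as 'not found'."""
    result = subprocess.run(["ldd", DRIVER], env=env, capture_output=True, text=True)
    return [line.strip() for line in result.stdout.splitlines() if "not found" in line]

# The environment as the test tool sees it: its own runtime libraries are on the path.
tool_env = dict(os.environ, LD_LIBRARY_PATH="/opt/testtool/lib")

# The environment as a customer sees it: nothing but a basic PATH.
clean_env = {"PATH": "/usr/bin:/bin"}

print("Missing with the tool's environment:", missing_dependencies(tool_env))
print("Missing in a clean environment:", missing_dependencies(clean_env))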

Such Observer effects are a risk throughout software testing, where the presence of observing processes can mask problems such as deadlocks and race conditions by changing the execution profile of the software. The problem is particularly apparent with test automation, due to the presence of another software application that is accessing and monitoring exactly the same resources as the application under test. The reason I'm discussing Observer effects specifically in a post on installers is that I've found this to be an area where they can be most apparent. Software installation testing is by its nature particularly susceptible to environmental problems, and the presence of automation products and processes can fundamentally change the installation environment. Relying on automation alone to perform this kind of testing seems particularly risky.

Falling at the first


The install process is often the "shop window" of software quality, as it provides people with their first perception of working with your product. A bad first impression on an evaluation, proof of concept or sales engagement can be costly. Even when the severity of the issue is low, the impact in terms of customer impressions at the install stage can be much higher. If your install process is full of holes then this can shatter confidence in what is otherwise a high quality product. You can deliver the best software in the world but if you can't install it then this gets you nowhere.

This week I was using a set of drivers from another organisation as part of my testing. The Unix based installers worked fine, however the Windows packaged installers failed to install, throwing an exception. It was clear that the software had come out of an automated build system and no-one had actually tested that the installers worked. No matter how well the drivers themselves had been tested, I wasn't in a position to find out because I couldn't use them, and my confidence in the software had been shattered by the fact that the delivery had fallen at the first hurdle.

I can't claim that our install process has never had issues, however I do know that we've identified a number of problems when manually testing installations that would otherwise have made it into the released software. I've also seen install issues from other software companies that I know wouldn't have happened for us. Reports from our field guys are that in most cases our install is one of the easier parts of any integrated delivery, which gives me confidence that the approach is warranted. Every hour spent on testing is an investment, and I believe that making that investment in the install process is time very well spent.

Image: http://www.flickr.com/photos/alanstanton/3339356638/

Wednesday, 25 April 2012

If this ain't a bug - why does it feel so bad?


I had a mildly embarrassing experience with Twitter the other day. I had tweeted a link to a new blog post that I'd written, as I've done dozens of times before. I then spent a couple of hours in a meeting and returned to my desk to find a message from Mohinder Khosla (who I can heartily recommend as a fantastic source of knowledge and an excellent testing person to follow at @mpkhosla) telling me that the link in my tweet targeted a non-existent page. Mohinder was, of course, correct: the link in the tweet was broken. Annoyed with myself and embarrassed that I'd made such a mistake in a social media forum, I thought over how it had happened and soon came to thinking that it might actually have been the result of a bug...

A bug or not?


Having a busy job and three young kids, my time for reading and blogging tends to get squeezed into the late evening. Once I finish a post I tend to tweet a link straight away, but then follow up the next morning with a "New post last night ..." tweet for those sane UK folks who are in bed by 12am. For this second tweet I had got into the habit of just copying and pasting the text of the first and editing it appropriately. With the Tweetdeck desktop client this worked fine, because the link shortening in that application results in a link in which the display text matches the link target. As a result, my use of the application had grown to depend on this relationship. As I found to my cost, the Chrome version of the Tweetdeck application that I happened to be trying out that day did not uphold this relationship, so when I copied the display text it did not match the original target and the link was broken.

Was this a bug? The obvious answer is no. The relationship between a link's display text and its target is not guaranteed, and the application makes no claims to support copying and pasting links between applications. As someone who works in IT I knew this full well and, if I had thought about it, I would not have made that mistake.

I was not thinking about it, however. I was following a process which was very familiar to me, and which I'd evolved to give the desired results when using the specific tools in question. Firstly, I don't want to have to think about what a piece of software is doing when I am using it, and the best software is designed in such a way that I shouldn't have to; in this case I would have had to apply my understanding of the underlying technology to avoid the mistake. Secondly, I was using the Chrome application based on expectations that I had developed around the desktop version of the same product; inconsistencies between the two applications resulted in a failure to meet my expectations and consequently a mistake in my use of the former. Lastly, and most importantly of all, the behaviour of the Chrome application made me, a long time user of Tweetdeck, feel embarrassed and angry to the extent that I stopped using that version of the product. No matter whether the behaviour was technically correct and conformed to the target design and use cases for that version, to me personally it was a bug.

A bug is in the eye of the beholder


What constitutes a bug can be one of the most contentious discussions in software development, causing particular stress for relationships such as between tester and programmer, and supplier and customer. When it comes to straightforward programming errors then the definition will hopefully be clear and unequivocal. When we consider a bug in terms of the personal feelings and opinions that it elicits then the waters of what constitutes a bug are a lot more murky.

I would not personally limit the definition of a bug solely to a coding defect. To me, a bug is a behaviour which elicits a negative emotional response in a positive stakeholder (I use 'positive stakeholder' to exclude those who have a negative relationship with a product, such as a competitor, whose emotional response is likely to be inversely proportional to our successful behaviours). Once the personal, emotional influence is acknowledged, the decisions on what to fix and which behaviours to change become human ones rather than technical ones, raising some interesting implications:-
  • The decisions around fixing an issue can be made not on technical severity but on how many people will be impacted by the behaviour and how bad it will make them feel. My problem with the links in Tweetdeck might not constitute a bug on the basis of my feelings alone, however if half the user community felt the same way then a change in behaviour would certainly be justified. If one marathon runner had had their details published on the internet this week, I doubt that the story would have merited the BBC News coverage that it actually received.

  • Two users of the same system may have mutually exclusive opinions on how that product should behave, such that a fix for one will constitute a bug for another. I've certainly encountered this scenario when attending user groups for software products in use by a number of different organisations with contrasting business workflows. In Microsoft Outlook, the fact that the meeting reminders took focus from other applications was a sufficient problem for Microsoft to remove that behaviour. For me, and many others, the fact that this prevents reminders from popping up is a major problem, as an icon on my taskbar turning orange is not sufficient to distract me from what I am doing and remind me that I have a meeting to attend. Both behaviours constitute bugs to certain individuals, resulting in frustration and anger, and in my case in installing another application (the excellent Growl for Windows) to work around the issue.

  • The impact of a bug may not occur now but at some point in the future, even after the application has ceased to operate, for example if the data is ported to another application. I saw a system a few years ago which defined a primary key on its non-normalised transactions table based on forename, surname and DOB - obviously not a good basis for a primary key. The system had not encountered an issue with this up until that point, but the nature of the system and the data could have had very serious negative consequences for individuals in the future. Working in the data storage and analysis space I'm all too aware of the frustration that is caused when trying to transfer badly validated data from one system to another.

  • The emotional impact of a bug may not be caused directly by the software itself, to the extent that the victim of the bug is unaware that their emotions are the result of a software behaviour. If a security bug, such as the London Marathon one, releases someone's details resulting in fraud or other undesirable outcomes for that individual, then they might be totally unaware of where their details were obtained.

  • A bug that is very hard to justify in quantifiable terms might be much more easily justified in emotional ones. Whenever I use the ticketing machines in UK train stations I feel that they should be faster. I have no supporting evidence for this other than that I see the same emotions in all those who try to use them - frustration, impatience and anger that it takes so long to achieve what should be a simple operation. Attempting to quantify this in terms of transaction times makes it hard to justify a change. When, as I saw today, it results in individuals missing their trains, the emotional justification can be much more powerful.

  • A coding mistake can exist in a product for years and not constitute a bug, as long as no-one at any point, either now or in the future, has a desire that it should be, or should have been, changed.

  • No matter how much testing you do to ensure you have requirements traceability and functional correctness in relation to the specification, your product can still be full of bugs if the users hate the way that it works.

So I've moved back to using the Tweetdeck desktop application for now. It does mean an extra application running on my laptop, but at least I can copy and paste my links with impunity. As to whether my problem was caused by user error or was actually a bug - I'll leave that for you to decide. Bugs are in the eye of the beholder, after all.
