Sunday 1 December 2013

Potential and Kinetic Brokenness - Why Testers Do Break Software

I'm writing this to expand on an idea that I put forward in response to a twitter conversation last week. Richard Bradshaw (@friendlytester) stated that he disliked saying that testers "break software", as the software is already broken. His comments echo a recent short blog post by Michael Bolton, "The Software is already broken". I know exactly what Richard and Michael are saying. Testers don't put problems into software; we raise awareness of behaviour that is already there. It sometimes feels that the perception of others is that the software is problem free until the testers get involved and suddenly start tearing it apart, like the stereotypical beach bully jumping on the developers' carefully constructed sandcastles.

I disagree with this statement in principle, however, as I believe that breaking software is exactly what we do...

Potential and Kinetic Failure

I'm not overly keen on using the term 'broken' in relation to software as it implies only two states - in all but the simplest programs, 'broken' is not a bit. I'm going to resort here to one of my personal dislikes and present a dictionary definition - what I believe to be the relevant definition of the word 'break' from the Oxford Dictionary:-

Break: - make or become inoperative: [with object] he’s broken the video

The key element that stands out for me in this definition is the "make or become" - the definition implies that a transition of state is involved when something breaks. The software becomes broken at the point when that transition occurs. I'd argue that software presented to testers is usually not so inoperative that I'd describe it as broken when we receive it for testing. A more representative scenario is that the basic functionality is likely to work, at least in a happy path scenario and environment. The various software features may be rendered inoperative through the appropriate combination of environment changes, actions and inputs. In the twitter conversation I likened this to energy:-

It's like energy. A system may have high 'potential brokenness', testers convert to 'kinetic brokenness'

What we do in this case is search for the potential for the system to break according to a relevant stakeholder's expectations of, and relationship with, the product. In order to demonstrate that this potential exists we may need to force the system into that broken state, thereby turning the potential into what could be described as kinetic failure. Sometimes this is not necessary - simply highlighting the potential for a problem to occur can be sufficient to demonstrate the need for rework or redesign - but in most cases forcing the failure is required to identify and demonstrate the exact characteristics of the problem.

Anything can be broken

In the same conversation I suggested that:-

Any system can be broken, I see #testing role to demonstrate how easy/likely it is for that to happen.

With a sufficient combination of events and inputs, pretty much any system can be broken in the definitive sense of being 'rendered inoperative' - for example, by taking the operating factors to extremes of temperature, resource limits, hardware failure or file corruption. I suggest that the presence or absence of bugs/faults depends not on the existence of factors by which the system can be broken, but on whether those factors fall within the range of operation that the stakeholders want or expect it to support. As I've written about before in this post, bugs are subjective and depend on the expectations of the user. Pete Walen (@PeteWalen) made the same point during the twitter conversation:-

It may also describe a relationship. "Broken" for 1 may be "works fine" for another; Context wins

The state of being broken is something that is achieved through transition, and is relative to the expectations of the user and the operating environment.

An example might be useful here.

A few years ago I had my first MP3 player. When I received it, it worked fine: it uploaded and played my songs, and I was really happy with it. One day I put the player in my pocket with my car keys and got into my car. When I took the player out of my pocket the screen had broken. On returning it to the shop I discovered that the same thing had happened to enough people that they'd run out of spare screens. I searched the internet and found many similar examples where the screen had broken in bags or pockets. It seems reasonable that if you treat an item carelessly and it breaks then that is your responsibility, so why had this particular model generated such feedback? The expectation among the users whose experiences I read was that the player would be as robust as other mobile electronic devices such as mobile phones or watches. This was clearly not the case, which is why its breaking in this way constituted a fault. I've subsequently had a very similar MP3 player which has behaved as I would expect and stood up to the rigours of my pockets.

  • So was the first player broken when I got it? No. It worked fine and I was happy with it.
  • Who broke the first MP3 player? I did.
  • Was the first player broken for everyone who bought it? - No. My model broke due to the activity that I subjected it to. I'm sure that many more careful users had a good experience with the product.
  • Was the second player immune to breaking in this way? - No. I'm pretty sure that if I smacked the one I have now with a hammer the screen would break. But I'm not planning to do that.

The difference was that the first player had a weaker screen and thereby a much higher potential for breaking, such that it was prone to failure well within the bounds of most users' expected use. This constituted a fault from the perspective of many people, and could have been detected through appropriate testing.

Any technology system will have a range of operating constraints outside of which it will break. It will also have a sphere of operation within which the person using it, and the operating environment, expect it to function correctly. If the touch screen on the ticket machine in this post had failed at -50 degrees Celsius I wouldn't have been surprised and would certainly not have written about it. The fact that it ceased working at temperatures between -5 and 0 degrees is what constituted a breakage for me, given the environment in which it was working. It wasn't broken until it got cold. It possessed the potential to break given the appropriate environmental inputs, and this potential manifested itself in 'kinetic' form when the machine was re-installed outside and winter came along. Interestingly it had operated just fine inside the station for years, and would not have broken in this way had it not been moved outside.

Taking a software example, a common problem for applications arises when a specific value is entered into the system data. An application which starts to fail when a surname containing an apostrophe is entered into the database becomes 'broken' at the point that such a name is entered. If we never need to enter such a name then there is no problem. At the point that we enter such data into the system we change its state and realise the potential for that breakage to occur. Testers entering such data and demonstrating the problem are intentionally 'breaking' the software in order to show that the potential exists in live use, so that the decision can be made to remove that potential before it hits the customers.
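
To illustrate, here is a minimal Python sketch - the schema and function names are hypothetical, not from any particular system - of one classic way such a failure arises: a query built by string concatenation works for most surnames, but becomes malformed the moment an apostrophe appears, while a parameterised query accepts the same input without complaint.

```python
import sqlite3

# Hypothetical minimal schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (surname TEXT)")

def add_staff_naive(surname):
    # Builds the SQL by string concatenation. This works for 'Smith',
    # but for "O'Brien" the apostrophe closes the string literal early
    # and the statement becomes malformed - the potential breakage.
    conn.execute("INSERT INTO staff (surname) VALUES ('" + surname + "')")

def add_staff_safe(surname):
    # A parameterised query: the driver handles the escaping, so the
    # same input no longer realises the breakage.
    conn.execute("INSERT INTO staff (surname) VALUES (?)", (surname,))

add_staff_naive("Smith")    # fine - the potential stays potential
add_staff_safe("O'Brien")   # fine - parameter binding copes
try:
    add_staff_naive("O'Brien")  # this input makes the failure 'kinetic'
except sqlite3.OperationalError as e:
    print("Broken by a single input:", e)
```

The potential breakage is present in the concatenating version from the day it is written; it takes the right input to convert it into a kinetic failure.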

State change can occur outside the software

You could argue that not all bugs involve a change of state in the software as I describe above. What about the situation, for example, where a system will simply not accept a piece of data, such as a name with an apostrophe, and rejects it with no change of state in the software before or after this action? Surely then the software itself was already broken?

In this situation I'd argue that the change of state occurred not in the software itself but in its operating environment. A company could be using such an application internally for years without any issues until they hire an "O'Brien" or an "N'jai". It is at the point at which this person joins the company and someone attempts to enter that employee's details that the state of the software changes from "can accept all staff names" to "cannot accept all staff names", and it breaks. Given that testers create models to replicate possible real world events in order to exercise the application and obtain information on how it behaves, the point at which we add names containing apostrophes to our test data and expose the application to them is the point at which we 'break' the software and realise the potential for this problem to occur.
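
A sketch of what that looks like in test terms - assuming a pytest-style suite, with save_surname as a hypothetical naive stand-in for the code path under test - shows that extending the test data is precisely the moment the modelled environment, and with it the state of the software, changes:

```python
import pytest

def save_surname(surname: str) -> bool:
    # Hypothetical stand-in for the application code path that persists
    # an employee's surname. Its naive validation rejects apostrophes,
    # mirroring the latent fault described above.
    return surname.replace(" ", "").isalpha()

# Adding "O'Brien" and "N'jai" to the test data models the real-world
# event of hiring them - the point at which the potential breakage
# becomes a visible, 'kinetic' failure.
@pytest.mark.parametrize("surname", ["Smith", "O'Brien", "N'jai"])
def test_surname_is_accepted(surname):
    assert save_surname(surname) is True
```

Run under pytest, the first case passes and the two apostrophe cases fail - the suite has converted the potential breakage into a demonstrable one.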

As well as event-based changes such as these, breakages can also occur over time through a changing market or user environment and our lack of response to it. To take a character-set example like the apostrophe one above, if our software cannot handle hangul names then the potential for breaking increases dramatically the moment that it starts being used in eastern markets. I won't expand on this subject here as I covered it previously in this post.

So Testers Do Break Software

I can understand why we'd want to suggest that the software was already broken. From a political standpoint we don't want to be seen as the point at which it broke. But I think that saying it was already broken can have political ramifications in other ways, such as with the development team. I'd argue that when we receive software to test it is usually not 'broken' in the sense of having been rendered inoperative, and suggesting that it was may affect our relationships with the people coding it. Instead I think a more appropriate way of looking at it is that the software possesses the potential to break in a variety of ways, and that it is our job to come up with ways to identify and expose that potential. If we need to actually 'break' the software to do so then so be it. We need to find the limits of the system and establish whether they sit within or outside the expected scope of use.

If we have a level of potential breakability looming in our software as precariously as the rock in the picture above, then it is the tester's job to 'push it over' and find out what happens, because if we don't exercise that potential then someone else will.

Joe said...

I understand your viewpoint, but respectfully disagree. Saying that Testers break software in a "kinetic versus potential" kind of way, would be like saying you add energy by nudging a rock off of a cliff. http://www.allthingsquality.com/2010/04/software-testing-is-not-breaking-things.html http://www.allthingsquality.com/2012/10/software-testing-is-not-breaking-things.html

Adam Knight said...

Joe

I appreciate your taking the time to comment and your difference of opinion. The fact that I can present ideas and have people of your experience discussing these in a critical way gives me a great motivation to write.

I am prepared to accept that there may be flaws in the 'potential vs kinetic' argument, but I don't believe what you have presented is one of them - quite the opposite, in fact. The point of the argument is that, like energy, we cannot add bugs into a system. The potential energy is already in the rock due to its elevation. What we can do is perform the correct sequence of events to change the state of those bugs from 'potential breakages' to 'actual breakages' - we kick the rock to demonstrate how it can fall. I like this idea as it maintains the principle of the bugs already being there, whilst avoiding the rather extreme statement that the software was broken, which I believe can have negative implications for both developers and the business. Any software can be broken, but one system may be easier to break than we would hope, constituting a high potential for failure and a risk to the business.

Thanks for your comments and the link to your post, this is clearly a subject which you have given a lot of thought to.

Adam

Joe said...

So testers don't break the software, they just change the "state of brokenness"?

Adam Knight said...

Yes, I suppose so.

My biggest concern is avoiding the 'already broken' tag, which I think carries negative connotations: it undermines the subtlety and skill involved in identifying and realising the potential for breakage within the product, and it gives an overly negative perspective on the state of the software that we receive from developers, when the steps required to realise a 'broken' state may actually be quite unlikely.

Thanks

Adam.
