Monday, 19 March 2018


Disruptive Work

Just over a year ago I instigated a move in my company to stop recording bugs. It was a contentious move; for the context in which we worked, tracking bugs just didn't seem to fit. I'd planned to write about it at the time but wanted to see the results first. In light of some recent internal discussion in my company in favour of re-instating bugs, now seems like as good a time as any to write about what we do instead.

A scheduling problem

My current company, River, has historically worked on an agency basis. Rather than having development teams devoted specifically to one product or feature area, each team will have its time allocated on a sprint-by-sprint basis to one of the client programmes that we develop and maintain software for.

This presents a number of challenges to effective agile development, probably the most significant being scheduling work. When product teams work in agile sprints they commit to delivering a backlog of work for each sprint. Inevitably, in my experience, things didn't always go as planned, and work sometimes needed to be brought into a sprint urgently from another channel such as a Proof of Concept (POC) or the customer support team. In product teams I've worked on there was always an understanding across the business that these disruptive items would impact on the team's ability to tackle its scheduled work and that subsequent sprints would be affected. As all of the sprints related to the same product this was manageable - we may have had to reduce the scope of a long term release deliverable, but for the most part the work could be absorbed as an overhead of the long term development work of the team.

The challenge we faced at River with the agency model was that for any given sprint for a team there was a good chance that the work would be completely unrelated to what they had done in the previous one. Any items being raised through other channels, such as ad-hoc customer requests or support tickets, may have come from a different programme from the one that was the subject of the active sprint, and therefore the emergent work could not simply be slotted into the team backlog at the appropriate priority level.

We saw some challenging tensions between undertaking new planned developments and maintaining existing programmes. I'd personally struggled to make progress on two high profile projects due to the impact of disruptions from existing ones, so I was keenly aware of the impact that unplanned work could have. A particular problem that I'd encountered was issue snowballing. As one sprint was disrupted by unplanned work, it would inevitably have a knock-on effect as promised work would be rushed or carry over and impact a later iteration. Timescales on the new programme would get squeezed, resulting in issues on that programme, which would consequently come in as disruptions which impacted on later sprints.

Being Careful with Metrics

Last year across the development group we set about establishing a set of meaningful goals around improving performance. Years of working in software testing have given me a strong mistrust of metrics, and I was keen to avoid false targets when it came to striving to improve the quality of the software. I'd never suggest that counting bugs provided any kind of useful metric to work to, however I did feel that examining the amount of time spent on bugs across the business could provide us with some useful information on where we could valuably improve, and so I set about investigating the use of bugs across the company.

What I found on examining the backlogs of the various teams was that each team took a different approach on recording bugs. Some teams would capture bugs for issues they found during development, others would restrict their 'bug' categorisation for issues that had been encountered during live use. Some teams went further and raised nearly everything as a user story - yielding such interesting titles as "As a user I don't want the website to throw an error when I press this button".

I and the group involved in looking at useful testing metrics took a first-principles approach: we stepped back from what to track on bugs and examined the problem that we were actually trying to solve. Whilst the absence of bugs was obviously a desirable characteristic in our programmes, more important was reducing the amount of disruption encountered within teams relating to software that they weren't working on at the time, whatever the cause. We therefore decided that looking at disruptions would give us more value than looking at bugs alone.

The Tip of the Iceberg

What became apparent was that the causes of disruptions went far deeper than just classic 'bugs' or coding flaws. The work that was causing the most disruption included items such as ad-hoc demands for reports from customers that we weren't negotiating strongly on, or changes requested as a result of customers demanding additional scope to that which had been delivered. I would have been personally happy to consider anything that came in this way as a bug, however I'm politically aware enough to know that describing an ad-hoc request from an account director as a 'bug' might be erring on the side of bloody-mindedness.

The approach that we decided to take was to categorise any item that had to be brought into an active sprint (i.e. something that had to impact the current sprint and could not be scheduled into a future sprint for the correct programme) as 'disruptive'. The idea was that we would then track the time spent on the teams' disruptive items and look to reduce this time by targeting improvements, not just in the coding/testing, but in all of the processes that had an impact here, including prioritisation, product ownership and customer expectation management. A categorisation of 'bug' was insufficient to identify the range of different types of disruptive work that we encountered. We therefore established a new set of categories to better understand where our disruptive backlog items were coming from:

  • Coding flaws
  • Missing behaviour that we hadn't previously identified (but the customer expected)
  • Missing behaviour that we had identified but hadn't done (due potentially to incorrect prioritisation)
  • Missing behaviour that would be expected as standard for a user of that technology
  • Performance not meeting expectation
  • Security vulnerability
  • Problem resulting from customer data

Each item raised in the backlog would default to be a 'normal' backlog item, but anything could be raised as 'disruptive' if a decision was made to bring it in to a team and to impact a sprint in progress.
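The tracking described above can be sketched in code. This is a minimal illustrative model, not our actual tooling - the field names, categories and hours are hypothetical - but it shows the shape of the idea: every item defaults to 'normal', anything pulled into an active sprint is flagged disruptive with a cause category, and we then total the time spent per category to see where improvement effort would pay off.

```python
from dataclasses import dataclass
from collections import defaultdict
from typing import Optional

@dataclass
class BacklogItem:
    title: str
    hours_spent: float
    disruptive: bool = False        # default: a 'normal' backlog item
    category: Optional[str] = None  # cause category, set only when raised as disruptive

def disruption_by_category(items):
    """Total hours of disruptive work, grouped by cause category."""
    totals = defaultdict(float)
    for item in items:
        if item.disruptive:
            totals[item.category] += item.hours_spent
    return dict(totals)

# Hypothetical sprint backlog: one planned item, two items pulled in mid-sprint.
items = [
    BacklogItem("New reporting page", 16),
    BacklogItem("Checkout error on submit", 4,
                disruptive=True, category="Coding flaw"),
    BacklogItem("Ad-hoc customer report", 6,
                disruptive=True, category="Missing behaviour (customer expected)"),
]
print(disruption_by_category(items))
# → {'Coding flaw': 4.0, 'Missing behaviour (customer expected)': 6.0}
```

Summing time rather than counting items is the point: ten trivial disruptions may matter far less than one that consumes a week, which is exactly the distinction a raw bug count hides.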

Did it work?

After over a year of capturing disruptive work instead of bugs we are in a good place to review whether it worked. Typically and frustratingly the answer is both yes and no.

Some of the things that did really help

  • No-one argued any more about whether something was a bug. I have gone over a year in a software company without hearing anyone argue about whether something was a bug or not. I cannot overstate how important I think that has been for relationships and morale.
  • We don't have big backlogs of bugs. The important stuff gets fixed as disruptive. The less important stuff gets backlogged. At my previous company we built up backlogs of hundreds of bugs that we knew would never get fixed as they weren't important enough; treating all items as backlog items avoids this 'bug stockpiling'.
  • All mistakes are equal. I've always disliked the situation where mistakes in coding are categorised and measured as bugs but mistakes in customer communication that have far bigger impact are just 'backlog'. There is a very murky area prompting some difficult conversations when separating 'bugs' from refactoring/technical debt/enhancements. These conversations are only necessary if you force that distinction.

What didn't work so well

  • People know where they are with bugs. In many cases they are easy to define - many issues are so clearly coding flaws that categorising them as bugs is easy and allows clear conversations between roles simply down to familiarity with bug management.
  • There was still inconsistency. As with bugs, different teams and product owners applied subtly different interpretations of what disruptive items were. Some were treating any items that impacted on their planned sprint as disruptive, even if they related to the programme that was the subject of that sprint, others only raised "disruptives" if the work was related to a different programme.
  • The disruptive category led to a small degree of gaming. Folks started asking for time at short notice to be planned into the subsequent sprint rather than the current one. This was still significantly disruptive to planning and ensuring profitable work, however it could be claimed that the items weren’t technically "disruptive items" as the time to address them had been booked in our planning system.

In Review

Right now the future of disruptive items is uncertain, as one of the product owners in my team this week raised the subject of whether we wanted to consider re-introducing bugs. Although I introduced the concept I'm ambivalent on this front. Given the problems that we specifically faced at the time, tracking disruptive items was the right thing to do. Now that we have a larger product owner team and some more stability in the scheduling, disruptive work is not the debilitating problem that it was. At the same time I'm not convinced that moving back to 'bugs' is the right move. Instead my inclination is to once again go back to first principles and look at the problems that we are facing now, and track appropriately to address those, rather than defaulting to a categorisation that, for me, introduces as many problems as it solves.

Sunday, 28 January 2018


Aligning with the Business

I have over the last few months been concurrently involved in some of the most and least inspiring work of my career. Naturally having a software tester mindset I decided to write about the negative stuff as a priority. What can I say? My glass is usually half empty.

Engaging Through Alignment

I recently had the pleasure of inviting an inspiring lady named Susie Maguire to run a workshop at River. Susie has a wealth of experience in the field of engagement and motivation and was the perfect person to discuss, question and refine our own expertise in this area. In one discussion during the workshop Susie highlighted the importance of aligning the goals of the individual with those of the team, and the goals of the team with those of the organisation, in achieving true employee engagement.

As with many of the most powerful ideas in successful work, this is a blazingly simple concept yet surprisingly difficult to achieve and therefore depressingly rare. The divided and hierarchical nature of many organisational structures means that teams can aggressively optimise to their own established goals, which, over time, can deviate drastically away from those of the wider company. As we were talking I couldn't help thinking about a process that I was working through at the time which was an extreme case of when such a deviation of goals occurs.

A Painful Process

As I mentioned at the start, I've also recently been involved in some of the least inspiring work of my career, in relation to implementing a software programme into a large organisation. This is not in itself inherently painful, and the relationships with the immediate client at the start of the programme were healthy and long standing. As part of the implementation we were required to work with an internal team from the wider global organisation to 'certify' that one element of our software met their standards. We were happy to do this on the basis that we'd successfully delivered software to the client before and felt confident in meeting these standards based on our programmes with other global clients. Our confidence proved to be misplaced, however, when we discovered the details of the process.

  • It became apparent early on that the certification team were not going to engage with us in any kind of collaborative relationship at all but instead would operate primarily through the documented artefacts of the process
  • The requirements that we had to meet were captured in lengthy and convoluted documentation from which we had to extract the relevant information and interpret for our situation. Much of the documentation was targeted at in-house development in different technologies to our stack.
  • Some parts of the process involved different people submitting the same lengthy and detailed information into separate documents or systems, which were then all required to align exactly across the submission
  • Many of the requirements documented were either impractical or actually not possible in the native operating systems we were supporting
  • The process involved no guidance or iteration towards a successful outcome, instead certification involved booking a scheduled 'slot' which had to be reserved weeks in advance based on the predicted delivery date.
  • Any failures to meet the standards discovered during the certification slot were not fed back during the slot in time to be resolved towards a successful outcome, but were communicated via a lengthy PDF report once the process was complete
  • Items as minor as a repeated entry in a copyright list or a slight difference in naming between a help page and guidance prompts were classified as major failures resulting in failing the certification
  • Approaches were presented in the specification as reference examples, yet any deviation from the behaviour of the 'example' was treated as a major failure, even if the logical behaviour was equivalent.
  • The inevitable failure in the certification slot required a second 'slot' to be booked for a retest

The final straw came when, as part of the second booked review slot, new requirements were identified which we hadn't been told about in the initial certification, yet our failure to meet them still constituted a failure of the overall certification. Software components not raised in the first review were newly identified as 'unacceptable' in the second, and the missing behaviours stated in the first review were frustratingly in themselves insufficient to pass when it came to the second.


What was clear to me in going through this process was that here was a team where the goals of the team had diverged significantly from the goals of the company.

The goals of the team appeared to be

  • Ultimately protect the team budget by maintaining a healthy stream of failures and retests (the internal purchase only covered two test slots - a third retest resulted in an additional internal purchase and internal revenue for the team)
  • Tightly document the requirements of software solutions irrespective of value or practical applicability
  • Maximise failure through maintaining a position of zero tolerance for ambiguity or delivering value in different ways.
  • Maintain an internal view - limiting communication outside the team and interfacing primarily through artefacts, such as requirements and failure reports

If this only affected us as a supplier then I would probably not be writing this. What was more frustrating was that I was working on behalf of a client company that was a component of the larger global organisation. The behaviours of this team were directly preventing the progression of an exciting and engaging programme. Instead of adding value to their programme, the team were consuming valuable budget on frustrating bureaucratic processes and inane adjustments that delivered very little value and ultimately placed the programme at risk.

It could be argued that the team were protecting the company to ensure standards. I'd argue that a process of collaborative guidance and ongoing review would have been easier, cheaper in terms of both their costs and ours, and far more likely to achieve a successful outcome. The process as designed was not aligned with the needs of the wider company, including my client.

The other side of the fence

I get very frustrated in situations like the one above as they affect me on two fundamental levels.

Firstly, we only have a limited amount of time on the earth. Seeing so many talented people wasting their valuable time on such pointless activities is very frustrating. For me work is about more than making money. If intelligent and capable people are spending their time on undertakings that add little value beyond meeting the specific idiosyncrasies of a self-propagating process then they will start to question themselves and their work deeply. The nature of the process caused tension across all the people involved and caused anxiety for people that simply wouldn't have been required if the process had been structured differently. We wanted to deliver work that benefitted the programme and pleased the customer, yet we were unable to do so due to the effort required simply to adhere to the process imposed on us.

Secondly, the process that I described above was essentially a testing process. It's true that the process was so unrecognizable from what I would describe as testing that it took me a while to appreciate it, but testing it was. The process fitted exactly the pattern of:

  • a strict requirements document written before the software was developed
  • a predefined sequence of checks based on adherence to the documentation performed subsequent to and in isolation from the development process
  • the absence of feedback loops that would allow issues to be resolved in a timely fashion
  • communication via artefacts and failure reports rather than direct communication between the person performing the checks and the developing team

Which could describe testing processes in many organisations throughout the world of software development.

Not what I call Testing

Being on the other side of such a testing process was a new and enlightening experience. It gave me an insight into how frustrating it can be working with a testing unit that refuses to engage. I can understand some of the suspicion and even hostility that developers historically have felt towards isolated testing teams. When your best efforts to meet the documented expectations are fed back via reports covered in metaphorical red pen, it's hard to harbour positive feelings for the people involved.

It's heartening then, that each year I read the results of the "State of Testing" survey and see a testing community that is in parts rejecting this kind of approach and embracing more communication and collaboration. In fact the testing skill rated as most important in both 2016 and 2017 survey reports was communication. Whilst this is encouraging, the level of importance placed on communication for testers did drop from '16 to '17 - which is not a trend that I'd want to see continue.

I recommend, if you are a tester reading this, that you take the time to take part in this year's "State of Testing" survey here

Our ability to communicate risks and guide and inform decisions is paramount in delivering a testing activity that prioritises the needs of the business over the delivery of the testing process. Going back to Susie's lessons on employee engagement - if alignment with the goals of the wider business is the key to successful engagement, then testers whose approach is focussed more on continuous communication and helping to guide towards business goals are ultimately the ones who will improve not only their own satisfaction at work, but also that of the others whose lives they impact in the process.