Tuesday, 21 October 2014

The Workaround Denier

Your software is a problem, to someone. It may be uncomfortable to accept but for somebody that uses your software there will be behaviours in it that inhibit their ability to complete the task that they are trying to achieve. Most of the time people don't notice these things as the limitations are within the realm of accepted limitations of computer technology (the word processor doesn't type the words as I think of them; the online shop doesn't let me try the goods before ordering). In some cases the limitation falls outside of accepted technological inadequacy, with the most likely result being a mildly disgruntled user having to reluctantly change their behaviour or expectations. In cases where the difference is more profound, but not sufficient for them to move to another product, it may be that the person involved will attempt to work around the limitation. The result in such situations is that the software features can be used in a manner that they have not been designed or tested for.

As a tester I find myself slipping towards hypocrisy on the subject of workarounds. Whilst I am happy to consider any (legal) workarounds at my disposal to achieve my goals with the software of others, when testing my own software my inclination is to reject any use of the system outside of the scope of operation for which it has been designed and documented. I think this position of 'workaround denial' is something that will be familiar to many testers. Any use of the product that sits outside our understanding of how it will be used is tantamount to cheating on the part of the user. How are we expected to test for situations that we didn't even know were desirable or practicable uses of the product?

An example last week served as a great contradiction to the validity of such a position, demonstrating how important some workarounds are to our customers. It also reminded me of some of the interesting and sometimes amusing workarounds that I have encountered both in software that I use and which I help to produce.

The software I use

I am constantly having to find a way around the problem when one of the many software programs that I use on a daily basis doesn't quite do what I want. In some cases the results of trying to work around my problems are a revelation. As I wrote about in this post on great flexible free tools, the features that you want might be right under your nose. In other cases the actions taken to achieve my goals are somewhat more convoluted and probably sit well outside the original intention of the development team when designing the software.

  • The support tracking system that I implemented for our support teams does not support the ability for our own implementation consultants to raise or track tickets on behalf of their clients. It will assume that any response from our company domain is a support agent response and forward it to the address of the owner of the ticket. To avoid problems of leakage we restrict the incoming mails to the support mailbox on the exchange account, but this does limit the possibility of including our own implementation consultants as 'proxy customers' to help in progressing investigations into customer problems.
  • The bug tracking system that I currently use has poor support for branching. Given the nature of our products and implementations we are maintaining a number of release branches of the software at any one time and sometimes need to apply a fix back to a previous version on which the customer is experiencing a problem. With no inherent support for branching in our tracker I've tried a number of workarounds, including at times directly updating the backend database, all with limited success.
  • As I wrote about in this post, I was intending to replace the above tracking system with an alternative this year. Interestingly the progress on this project has stalled on the presence of exactly the same limitation, poor branching support in the tool we had initially targeted to move to, Jira. The advice within the Jira community suggested that people were resorting to some less than optimal workarounds to tackle this omission.
  • Outlook as an email client has serious workflow limitations for the way that I want to manage a large volume of email threads. I've found the need to write a series of custom macros in order to support the volume and nature of emails that I need to process on a daily basis. These include a popup for adding custom follow up tags so I can see not only that a follow up is required but a brief note of what I need to do, a macro to pull this flag to more recent messages in the conversation so that it will display on the collapsed conversation, and also the ability to move one or more mails from my inbox to the directory that previous mails in that conversation are stored.
  • The new car stereo that I purchased has a USB interface to allow you to plug in a USB stick of mp3 files to play. On first attempting to use this I found that all of the tracks in each album were playing in alphabetical order rather than the order that the songs appeared in the albums. A Google search revealed that the tracks were actually being played in the order that they are added to the FAT32 file system on the stick. Renaming the files using a neat piece of free software and re-copying to the stick resolved the issue. On reading various forums it appears that this was a common problem, but the existence of the workarounds of using other tools to sort the files was apparently sufficient for the manufacturer not to feel the need to enhance the behaviour.
  • The continuous integration tool Jenkins has a behaviour whereby it will 'helpfully' combine identical queued jobs for you. This has proved enough of a problem for enough people that it prompted the creation of a plug in to add a random parameter to Jenkins jobs to prevent this from happening, which we have installed.
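
The Jenkins case in the last bullet can be illustrated with a toy model. This is only a sketch of the idea, not Jenkins code or the plug-in's API: the `BuildQueue` class, the job name, and the `RANDOM_TOKEN` parameter are all hypothetical. It assumes a queue that treats two pending requests as identical when the job name and parameters match, and shows why a random parameter stops them from being merged.

```python
import uuid


class BuildQueue:
    """Toy queue that merges a new request into an identical pending one."""

    def __init__(self):
        self.pending = []

    def submit(self, job, params):
        # A request is identified by its job name plus its parameters.
        request = (job, tuple(sorted(params.items())))
        if request not in self.pending:
            self.pending.append(request)


def with_random_parameter(params):
    # Hypothetical parameter name; a unique value makes every request differ.
    return {**params, "RANDOM_TOKEN": uuid.uuid4().hex}


merged = BuildQueue()
distinct = BuildQueue()
for _ in range(3):
    merged.submit("nightly-tests", {"BRANCH": "release-1"})
    distinct.submit("nightly-tests", with_random_parameter({"BRANCH": "release-1"}))

print(len(merged.pending), len(distinct.pending))  # prints "1 3"
```

Three identical requests collapse into one queued build, while the three requests carrying a random value all stay in the queue.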

The software I test

Given that my current product is a very generic data storage system with API and command line interfaces, it is natural that folks working on implementations will look to navigate around any problems using the tools at their disposal. Even on previous systems I've encountered some interesting attempts to negotiate the things that impede them in their work.

  • On a financial point of sale system that I used to work on it was a requirement that the sales person completed a fresh questionnaire with the customer each time a product was proposed. Results of previous questionnaires were inaccessible. Much of the information would not have changed and many agents tried to circumvent the disabling of previous questionnaires to save time on filling in new ones. The result was an arms race between the salespeople trying to find ways to reuse old proposal information, and the programmers trying to lock them out.
  • On a marketing system I worked on we supported the ability to create and store marketing campaigns. One customer used this same feature to allow all of their users to create and store customer lists. This unique usage resulted in a much higher level of concurrent use than the tool was designed for.
  • The compression algorithms and query optimisations of my current system require that data be imported in batches, ideally a million records or more, and be sorted for easy elimination of irrelevant data for querying. We have had some early implementations where the end customer or partner has put in place an infrastructure to achieve this, only for field implementation teams to try to reduce latency by changing the default settings on their systems to import data in much smaller batches of just a few records.
  • One of our customers had an issue with a script of ours that added some environment variables for our software session. It was conflicting with one of their own variables for their software, so they edited our script.
  • One of our customers uses a post installation script to dynamically alter the configuration of query nodes within a cluster from the defaults.
  • In the recent example that prompted this post a customer using our standalone server edition across multiple servers did not have a concept of a machine cluster in their implementation. Instead they performed all administration operations on all servers, relying on one succeeding and the others failing fast. In a recent version we made changes with the aim of improving this area such that each server would wait until they could perform the operation successfully rather than failing. Unfortunately the customer was relying on failing fast rather than waiting and succeeding so this had a big impact on them and the workaround they had implemented.
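
To make the last example concrete, here is a minimal sketch of why switching from failing fast to waiting can hurt a customer who issues every operation on every server. It is a simplified model rather than our product's behaviour: the function name, the server names, and the ten minute operation time are all made up, and it assumes the operation can only run on one server at a time.

```python
def run_on_all_servers(servers, op_seconds, fail_fast):
    """Model one admin operation issued to every server at the same moment.

    Returns (outcome per server, seconds until the last server is done).
    """
    outcomes = {}
    lock_taken = False
    busy_until = 0  # time at which the single operation slot becomes free
    for server in servers:
        if not lock_taken:
            # The first server to arrive performs the operation.
            lock_taken = True
            outcomes[server] = "ran"
            busy_until = op_seconds
        elif fail_fast:
            # Old behaviour: give up immediately, costing no extra time.
            outcomes[server] = "failed fast"
        else:
            # New behaviour: queue up and repeat the operation afterwards.
            outcomes[server] = "waited, then ran"
            busy_until += op_seconds
    return outcomes, busy_until


servers = ["server-a", "server-b", "server-c"]  # hypothetical names
old_outcomes, old_total = run_on_all_servers(servers, 600, fail_fast=True)
new_outcomes, new_total = run_on_all_servers(servers, 600, fail_fast=False)
print(old_total, new_total)  # prints "600 1800"
```

Under these assumptions the same nightly job takes three times as long after the change, even though every individual call now "succeeds".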

My former inclination as a tester encountering such workarounds was to adopt the stance of 'workaround denier', being dismissive of any non-standard uses of my software. More recently, thanks partly to circumstance and partly to the very positive attitude of my colleagues, I've grown to appreciate the idea that being aware of and considering known workarounds is actually beneficial to the team. It is far easier to cater for the existence of these edge uses during development than to add support later. In my experience this is one area where the flawed concept of the increasing costs of fixing may actually hold true, given that we are discussing having to support customer workflows that were not considered in the original design, rather than problems with an existing design that might be cheaply rectified late in development.

What are we breaking?

Accepting the presence of workarounds and slightly 'off the map' uses of software raises an interesting question of software philosophy. If we change the software such that it breaks a customer workaround, is this a problem? I don't believe that there is a simple answer to this. On one hand our only formal commitment is to the software as delivered and documented. From the customer perspective, however, their successful use of the software includes the workaround, and therefore their expectation is to be able to maintain that status. They have had to implement such a workaround to overcome limitations on the existing feature set and therefore could be justifiably annoyed if the product changes to prevent that. Moving away from the position of workaround denial allows us to predict problems that, whilst possibly justifiable, still have the potential to cause negative relationships once the software reaches the customer.

Some tool vendors, particularly in the open source world, have actively embraced the concept of the user workaround to the extent of encouraging communities of plug-in developers who can extend the base feature set to meet the unique demands of themselves and subsets of users. In some ways this simplifies the problem in that the tested interface is the API that is exposed to develop against.

(While this does help to mitigate the problem of the presence of unknown workarounds, it does result in a commitment to the plug-in API that will cause a much greater level of consternation in the community should it change in the future. An example that impacted me personally was when Microsoft withdrew support for much of the Skype Desktop API. At the time I was using a tool (the fantastic Growl for Windows) to manage my alerts more flexibly than was possible with the native functionality. The tool relied upon the API and therefore no longer works. Skype haven't added the corresponding behaviour into their own product and the result is that my user experience has been impacted and I tend to make less use of Skype as a result.)

Discovering the workarounds

The main problem for the software tester when it comes to customer workarounds is knowing that they exist. It is sometimes very surprising what users will put up with in software without saying anything, as long as somehow they can get to where they want to be. The existence of a workaround is the result of someone tackling their own, or somebody else's, problem and it may be that they don't inform the software company that they are even doing this. It can take a problem to occur for the presence of the workaround to be discovered.

  • For the sales agents on the point of sale system trying to unlock old proposals, we used to get support calls when they had got stuck, having exposed their old proposal data but finding themselves unable to edit it or save any changes.
  • The customer of the marketing campaign system reported issues relating to concurrency problems in saving their campaign lists.
  • For the team that edited the environment script, we only discovered the change when they upgraded to a later version and the upgrade had a problem with the changes they'd made and threw errors. Again this came in via the support desk.
  • For the team who reduced the size and latency of their imports, we only discovered the problem when they reported that the query performance was getting steadily worse.
  • For the recent customer who was taking a 'fail fast' approach to their multi-server operations, again the problem came in via the support desk exhibiting as a performance issue with their nightly expiry processes.

So the existence of a workaround is often only discovered when things go wrong. In my organisation the channels are primarily through the technical support team, and in running that team I get excellent visibility of the issues that are being raised by the customers and any workarounds that they have put in place.

For other organisations there may be additional channels through which information on customer workarounds can be gleaned. As well as being a useful general source of tester information on how your software is perceived, public and private forums are also the places where people will share their frustrations and workarounds. I've already mentioned that I used a public forum to discover a solution to my problem with my car stereo. My colleague also discovered the 'random parameter' add-in to Jenkins on a public forum, and it was public threads that we looked to in order to identify workarounds to the lack of branch support in Jira.

Prevention is Better than Cure

Responding to problems is one thing. If we work to gain an understanding of where the customers are getting into trouble I think that it is possible to anticipate places where customers will try to work around limitations in the software, and testers can be on the lookout for the potential consequences if they do. Doing this, however, requires an understanding of customer goals and frustrations that sit outside of the defined software behaviour. I believe that the signs are usually there in advance if you look for them in the right places: perhaps a question from an implementation consultant working on a customer site about why a feature was designed in a certain way, an unsatisfied change request on the product backlog, or a post on a user forum. If we can make efforts to understand where the customers are getting frustrated then we can test scenarios where they might try to get around the problem themselves and establish how they might get themselves into trouble if they do. There are some testers in my team who actively help out with support and so gain good visibility of problems. In order to encourage a wider knowledge of customer headaches throughout the test team we have started to run regular feedback sessions where a support agent discusses recent issues and we discuss any underlying limitations that could have led to these.

Of course, ideally we would have no need for workarounds at all; the software should simply work in the way that the users want. Sadly it is rarely possible to satisfy everyone's expectations. The responsibility to prioritise changes that remove the need for specific workarounds is probably not one that falls directly on testers. In light of this, are workarounds something that testers need to maintain awareness of? As I've stated, my inclination was to dismiss them as something that should not concern testing - after all, we have enough challenges testing the core functionality. It is tempting to adopt the stance that the testing focus should be solely on the specified features, that we need to limit the scope of what is tested, and that the documented feature set used 'as designed' should be our sole concern. This is a strong argument; however, I think that it belies the true nature of what testing is there to achieve. Our role is to identify things that could impact quality, or the value the product provides to some stakeholder, to use Weinberg's definition. From this perspective the customer value in such scenarios is obtained from having the workarounds in place that allow them to achieve their goals. Rather than dismissing workarounds, I'm coming around to the idea that software testers would better serve the business if we maintained an understanding of those that are currently in place, and raised awareness of any changes that may impact on these. In our latest sprint, for example, we introduced some integrity checking that we knew would conflict with the post installation script mentioned above, so as part of elaborating that work the team identified this as a risk and put in place an option to disable the check. This is exactly the kind of pragmatism that I think we need to embrace. If we are concerning ourselves with the quality of our product, rather than adherence to documented requirements, it appears to me to be the right thing to do.

Image: https://www.flickr.com/photos/jenlen/14263834826
