Friday, 19 December 2014

Textual description of firstImageUrl

New Beginnings, Old Desk

This week it was announced publicly that RainStor has been acquired by Teradata. For anyone not familiar with the name, Teradata is a US based company and its one of the largest data warehouse companies in the world with over 10 thousand employees - a marked difference to the roughly 50 employees of RainStor.

Having worked for RainStor for 8 years (I celebrated my 8 year anniversary at the start of December) this is a change that will have a huge impact on me and my colleagues. At the same time I am confident that, for the most part, the changes will be positive.

8 Years a startup

RainStor has been a fantastic small company to be a part of. Through the whole time that I've been lucky enough to be on board we've managed to maintain the vibrant intensity of a startup company. During this sustained period of intense activity, the team ethic has been amazing. It almost became predictable that the 'teamwork' subject in retrospective meetings would be full of happy yellow post-it notes praising the great attitude shared amongst the team.

Being a perennial startup has its disadvantages though. A small company inevitably has to 'chase the ball' in terms of market, with the result that focus of testing can be distributed too thinly across the capabilities and topologies supported to meet each customer's need. I'm optimistic that a company with a wider portfolio of products will allow us to focus on fewer topologies and really target our core strengths.

There have been ups and downs in fortune over the years. The credit crunch was a troubled time for organisations relying on funding and the fact that we got through it was testament to the strong belief in the product and the team. It seemed scant consolation though for the colleagues who lost their jobs. Counter that with the jubilant times when we won our first deals with household name companies the likes of which we'd previously only dreamed of, and you are reminded that those working in a small software company can typically expect to face a turbulent spectrum of emotions (tweet this).

So what now?

I've every reason to be positive about the future of RainStor. From my conversations with the new owners they seem to have an excellent approach to the needs of research and development teams. The efforts that have been undertaken to provide an open and flexible development environment give a lot of confidence. Combine this with an ethic of trying to minimise the impact that the acquisition has on our development processes and I'm optimistic that the approaches that we've evolved over time to manage testing of our big data product in an agile way will be protected and encouraged.

The resources available to me in running a testing operation are way beyond what we could previously access. As well as access to a greater scale of testing hardware, I'm also going to relish the idea of interacting with others who have a wealth of experience in testing and supporting big data products. I know that my new company have been in the business for a long time and will have a well developed set of tools and techniques in testing data products that I'm looking forward to pinching learning from.

What about me?

Whilst I've known about the change for some time, it is only since this week's public announcement that I've realised how long it has been since I worked for a large company. I've been well over 14 years in small companies, or as a contractor, so the change is going to take some getting used to. I love the uncertainty and autonomy of working for small companies, where every problem is shared and you never know what you'll have to take on next. As a smaller part of a large product portfolio I imagine we'll sacrifice some of that 'edge of the seat' excitement for more stability and cleaner boundaries.

A change such as this, particularly at this time of year, inclines one towards retrospection over the events leading up to this, and the possibilities for the future. I'm both proud of what I've achieved at RainStor the small company, and regretful of the areas that I know I could have achieved more. Nothing will bring back those missed opportunities, however the fact that, with just a handful of individuals we've written tested, documented and supported a big data database to the point of being used in many of the largest telecommunications and banking companies in the world leaves me with very few regrets. I hope that as we grow into our new home I can take the chance to work with my colleagues in delivering more of the amazing work that they are capable of.

So that's it, my days of testing in a startup well and truly are over for the foreseeable future. I obviously don't know for certain what's to come. One thing I am confident in is that we couldn't have found a better home. As well as having a great attitude to development my new company also have an excellent ethical reputation, which at least mitigates some of the uncertainty over not personally knowing who is running the company. I am already enjoying building relationships with some excellent people from our new owners.

I imagine that few folks reading this will be ending the year with quite as much upheaval, but I'm sure some are looking forward to new jobs or new opportunities or experiencing sadness at leaving cherished teams. Whatever your position, thanks for reading this post and thanks to those who've followed this blog through the last few years, I'm sure I'll have some interesting new experiences to write about over the coming months. I hope that, if you take some moments in the coming days to look at the past years and what is to come, you have as many reasons to be proud and excited as I do. .

Wednesday, 10 December 2014

Textual description of firstImageUrl

Not Talking About Testing, and Talking About Not Testing

I recently spent 3 very pleasurable days at the EUROSTAR Conference in Dublin. I was genuinely impressed with the range and quality of talks and activities on offer. For me the conference had taken great strides forward since I last attended in 2011 and congratulations have to go to Paul Gerrard and the track committee for putting the excellent conference together.

As with all conferences, while the tracks were really interesting, I found that I personally took away more from the social and informal interactions that took place around the event. One of the subjects that came up for discussion more than once was the fact that, for many of the permanently employed folks attending, our roles had changed significantly in recent years. For some of us our remits had extended beyond testing to include other activities such as technical support, development or documentation. For others the level of responsibility had changed more to management and decision making and less on testing activities involving interfacing directly with our products.

The new challenges presented by these new aspects of our roles dominated much of the conversation over the pub and dinner table, for example Alex Schladebeck in this post mentions exactly such a conversation between Alex , Rob Lambert , Kristoffer Nordström and I. Subjects discussed during these interactions included

  • How to motivate testers in flat organisational hierarchies.
  • How to advance the testing effort and introduce new technologies and techniques, sometimes in the face of demand to deliver on new features.
  • Making difficult decisions when there are no clear right answers.
  • Balancing the demands of testing when you have other responsibilities
  • How to ensure that areas of responsibility are tackled and customers needs met without putting too much pressure on individuals or teams. In my case this can include how to fit maintenance releases for existing products into the agile sprint cycle.
  • How to approach the responsibilities of managerial roles in organisations with self managing scrum teams.
  • Exerting influence and exhibiting leadership whilst avoiding issues of ego and 'pulling rank'.

It was clear in conversations that, whilst testing was an area that we had studied, talked on, debated and researched, these areas of 'not testing' were ones where we were very much more exploring our own paths. Traditional test management approaches seem to be ill suited to the new demands of management in the primarily agile organisations in which we worked.

The problem with not testing

Whilst testing is a challenging activity, for some of us 'not testing' was presenting as many problems. One of the things that I know was concerning a number of us was balancing the up to date testing and product knowledge with performing a range of other duties. As the level of responsibility for an individual increases, inevitably the time spent actively working on tasks testing the product itself can diminish such that it is easy to lose some of the detailed knowledge that comes from interacting with the software.

One thing that I personally have found definitely does not work is kidding myself that I can carry on regardless and take on testing activities like I always have. Learning to accept the fact that I'm not able to pick up a significant amount of hands on testing whilst facing a wealth of other responsibilities has been a hard lesson at times. It is a sad truth though that in recent years when I have assigned myself sprint items to test these have ended up being the stories most at risk in that sprint, due to my having not been able to focus enough time to test them appropriately. Whilst I enjoy testing on new features, the reality that I have had to accept is that the demands of my other responsibilities mean that others in the team are better placed to perform the testing of new features than I am.

Dealing with not testing

I won't pretend to be entirely winning the battle of not testing, however I find that most of the other activities that I do perform contribute to my general product knowledge and can balance and complement the time spent not testing:

  • Attending elaboration and design meetings for as many features as possible
  • Reviewing acceptance criteria for user stories, both at the start and also the confidence levels at the end
  • Having regular discussions with testers on test approaches taken and reviewing charters

Additionally some of the areas that I have become responsible for help to give different perspectives on the product

  • Discussing the format and content of technical documentation reveals how we are presenting our product to different users
  • Scheduling update releases and compiling release notes helps me to maintain visibility of new features added across scrum teams
  • Time spent on support activities helps give an insight into how customers use the product, how they perceive it (and any workarounds that folks are using)

These all help, however I think that as a company and product grow it is inevitable that individuals will be unable to maintain an in depth understanding of all areas of the product. Something that I have had to get accustomed to is deferring to the levels of expertise that exists in others on specific areas of product functionality which are much higher than I can expect to maintain across the whole product.

The future

I came away with the feeling that I'm part of a generation who are having to provide their own answers to questions on managing within agile companies. We are seeing a wave of managers who haven't had agile imposed on their existing role structures. (tweet this) Instead we are growing into managerial roles within newly maturing agile organisations which don't necessarily have a well defined structure of management on which to base our goals and expectations. We are attempting to provide leadership both to those who have come from more formal cultures and to those who have only ever known work in agile teams, and have very different expectations as a result. Consequently there is no reference structure of pre-defined roles to base our expected career paths around. Some, like Rob, have taken on the mantle of managing development. Others, like myself, have the benefit of colleagues at the same level who are focused on programming activities, however this does mean learning to work in a matrix management setup where the lines of responsibility are not always clear given the mixed functions of the team units.

It was clear from the conversations at EuroSTAR that for many of us our perspective on testing, and test management was changing. Whether this was simply the result of individuals with similar roles and at similar points in their careers gravitating towards each other, or the signs of something more fundamental, only time will tell. The conversations that I was involved in at the event were only a microcosm in relation to the wider discussions and debates over testers and testing that were going on amongst the conference attendees.

If anything can be taken from the range of different management roles and responsibilities that the individuals that I know have evolved to adopt, it is this - there is no clearly defined model for test management in modern agile companies - (tweet this). It is really up to the testers in those organisations to champion the testing effort in a way that works for them, and to build a testing infrastructure that fits with the unique cultures of their respective organisations.

Thursday, 20 November 2014

Textual description of firstImageUrl

In The Spirit of Windows

As I wrote in my post on The FaceBook effect, as you progress in your career you build up more examples of when you got it wrong, or you exhibited opinions which change over time to the extent that you later come to disagree with your former self. Perhaps the most fundamental example of this for me was my position on programmer/tester collaboration that I wrote about in this post. Another good example comes from an earlier role when I was testing a marketing data analytical client/server system, when my position on the need for explicit requirements was very different...

Demanding the Impossible

At the time the process that I was working under was a staged waterfall style process. We didn't follow any specific process model but it was compared on occasion to RUP. There was a database which contained formal requirements written by the product managers, and a subset of these were chosen for each long term (6-9 month) release project.

Whilst the focus of most of the development, and all of the testing, was on the main engine, a parallel development had been underway for some time to deliver a customer framework for housing the client components and managing the customers data rules, objects and models. This had been done on a much more informal, interactive manner than the core features with the result that requirements had been added to the database thick and fast as the product managers thought of them. These were much less rigidly specified up front than was usual in the company with behaviour established instead through ongoing conversations between the programmes and the product manager.

Enter the testers

After a long period of development, it was decided to perform a first release of the new framework. At this point the test team were brought into the project. What we found was a significant accumulation of requirements in the database, some of which were delivered as written, many had changed significantly in the implementation, and for many of them it was unclear whether they had been delivered or not. To clear up the situation the product owners, testers and architects day down for a couple of lengthy meetings to work through this backlog and establish the current position.

One requirement in particular caused the most discussion, and the most consternation for the testers. I don't have the exact wording but the requirement stated something like

“The security behaviour within the system will be consistent with the security of the windows operating system”

We pored over this one, questioning it repeatedly in the meetings. What did it mean specifically and how could we test it? What were the exact characteristics of the security model that we are looking to replicate? How could we map the behaviour of an operating system to a client server object management framework? In somewhat exasperated response the Development Manager tried to sum up the requirement in his own words :

“It should be done in the spirit of windows”

This caused even more consternation from the testers. Behind closed doors we ranted about the state of the requirements and the difficulties we faced. How could we test it when it was so open to interpretation? How could you write explicit test cases against a requirement “in the spirit of” something? How did you know whether a specific behaviour was right or wrong? We complained and somewhat ridiculed the expectation that we were to test "in the spirit of" something.

A rose by any other name

Looking back on that time, I can see that most of the requirements were closer to user stories than the formal requirements that we were accustomed to write our test cases against . They were not intended to explicitly define the exact behaviour, but to act more as placeholders for conversations between the product owners and the relevant development team members on how the value was to be delivered. These were small summaries of target value which were open to interpretation on their implementation and required discussion to establish the detailed acceptance criteria. The problem was that, within agile context the expectation here would be that this conversation be held between the '3 Amigos' of Product Owner, Programmer and Tester. Unfortunately in the process that we were working, the 'Third Amigo', the tester, had been missed from the conversation, with the result that the testers only had the requirement as written to refer to, or so we felt.

The Spirit of an Oracle

Let us examine the requirement that so vexed me and the other testers so much at that time - that the user security model should work in the 'spirit of windows'. As a formal requirement yes it was ambiguous, however as a statement of user value there was a lot of information here for testers to make use of. The requirement instantly provides a clear test oracle, that of the windows file system security model, from which we could establish some user expectations.

  • Windows security on files is hierarchically inherited such that objects can inherit their permissions from higher level containers
  • inherited properties can be overridden such that an object within a container has distinct security settings from its parent container
  • Permissions are applied based on the credentials of the user that is logged into windows
  • Permissions on objects may be applied either to individuals or at a group level
  • Ability to perform actions is based on roles assigned to the user
  • Allow permissions are additive so a user will have the highest level of permissions applied to any group that they are a member of
  • Deny permissions for any object overrode Allow permissions And so on.

The important concept that we failed to recognise here is that it didn't really matter what the exact behaviour was. The need for explicitly defined behaviour up front in this scenario was not present. The value was in the software providing as familiar and consistent a user experience to windows as possible whilst also providing an appropriate security structure for the specific product elements.

It is true that the ambiguity of the requirement made it more difficult to design explicit test cases and expected results in advance of accessing the software. What we were able to do was examine another application which possessed the characteristics of our target system to establish expectations and compare actual behaviour. As Kaner points out in this authoritative post on the subject, and Bolton explains eloquently through this fictitious conversation - oracles are heuristic in nature in that they help guide us in making decisions. Through the presence of such a clear testing oracle we were able to explore the system and question the behaviour of any one area through comparison with an external system. If there were behaviours that were inconsistent with our expectation, based on the windows system, then we were able to discuss the behaviour directly with the product owners who sat in close proximity to the testers. This required judgement, and sometimes compromise given that the objects managed in our system and the relationships between them were inherently different to Windows files and directories. As with all heuristic devices, our oracle was fallible and required the judgement of the testers to decide whether inconsistencies corresponded to issues or acceptable deviations. In many ways it was a very forward thinking setup, it just wasn't what we were accustomed to, and the late introduction of the testers into this process resulted in our exclusion from important discussions over which of the above behaviours we needed to deliver, and therefore restricted limited our judgement in relation to the oracle system. This, combined with our unfamiliarity with this way of working, resulted in our resistance to the approach taken.

The spirit of testing

I find great personal value in examining situations from the past to see how my opinions have changed over time. Not least this provides some perspective on my current thinking around any problems and can act as a reminder that my position on an issue may not be constant as my experience I grows. In the years since my 'spirit of windows' incident I've grown more pragmatic around the need in testing for rigidly specified requirements. In particular experience has demonstrated that specifications are themselves fallible, yet are treated as if they should be unambiguous and exhaustive instead of being treated as simply another type of oracle, to be used with judgement in making decisions. This can have damaging consequences - I have seen the situation where incorrect behaviour was implemented on the basis of a specification, when there was an excellent test oracle available that was not referenced during testing as there was no perceived need to do so.

The availability of a clear testing oracle provides an excellent basis for exploring a system that is being actively developed with the fluidity of minimising documentation to focus on asking questions and discussing design decisions through the development process, such as when working with agile user stories. What the example above clearly highlights is the importance of early tester engagement in this process if the testers are to understand the value in the feature, the decisions that go into the design and, crucially, the characteristics of the test oracle that we are looking to replicate.



Tuesday, 21 October 2014

Textual description of firstImageUrl

The Workaround Denier

Your software is a problem, to someone. It may be uncomfortable to accept but for somebody that uses your software there will be behaviours in it that inhibit their ability to complete the task that they are trying to achieve. Most of the time people don't notice these things as the limitations are within the realm of accepted limitations of computer technology (the word processor doesn't type the words as I think of them; the online shop doesn't let me try the goods before ordering). In some cases the limitation falls outside of accepted technological inadequacy, with the most likely result being a mildly disgruntled user having to reluctantly change their behaviour or expectations. In cases where the difference is more profound, but not sufficient for them to move to another product, it may be that the person involved will attempt to work around the limitation. The result in such situations is that the software features can be used in a manner that they have not been designed or tested for.

As a tester I find myself slipping towards hypocrisy on the subject of workarounds. Whilst I am happy to consider any (legal) workarounds at my disposal to achieve my goals with the software of others, when testing my own software my inclination is to reject any use of the system outside of the scope of operation for which it has been designed and documented. I think this position of 'workaround denial' is something that will be familiar to many testers. Any use of the product that suits outside our understanding of how it will be used is tantamount to cheating on the part of the user. How are we expected to test for situations that we didn't even know were desirable or practicable uses of the product?

An example last week served as a great contradiction to the validity of such a position, demonstrating how important some workarounds are to our customers. It also reminded me of some of the interesting and sometimes amusing workarounds that I have encountered both in software that I use and which I help to produce.

The software I use

I am constantly having to find ways around when one of the many software programs that I use on a daily basis doesn't quite do what I want. In some cases the results of trying to work around my problems are a revelation. As I wrote about in this post on great flexible free tools, the features that you want might be right under your nose. In other cases the actions taken to achieve my goals are somewhat more convoluted and probably sit well outside the original intention of the development team when designing the software.

  • The support tracking system that I implemented for our support teams does not support the ability for our own implementation consultants to raise or track tickets on behalf of their clients. It will assume that any response from our company domain is a support agent response and forward it to the address of the owner of the ticket. To avoid problems of leakage we restrict the incoming mails to the support mailbox on the exchange account, but this does limit the possibility of including our own implementation consultants as 'proxy customers' to help in progressing investigations into customer problems.
  • The bug tracking system that I currently use has poor support for branching. Given the nature of our products and implementations we are maintaining a number of release branches of the software at any one time and sometimes need to apply a fix back to a previous version on which the customer is experiencing a problem. With no inherent support for branching in our tracker I've tried a number of workarounds, including at times directly updating the backend database, all with limited success.
  • As I wrote about in this post, I was intending to replace the above tracking system with an alternative this year. Interestingly the progress on this project has stalled on the presence of exactly the same limitation, poor branching support in the tool we had initially targeted to move to, Jira. The advice within the Jira community suggested that people were resorting to some less than optimal workarounds to tackle this omission.
  • Outlook as an email client has serious workflow limitations for the way that I want to manage a large volume of email threads. I've found the need to write a series of custom macros in order to support the volume and nature of emails that I need to process on a daily basis. These include a popup for adding custom follow up tags so I can see not only that a follow up is required but a brief note of what I need to do, a macro to pull this flag to more recent messages in the conversation so that it will display on the collapsed conversation, and also the ability to move one or more mails from my inbox to the directory that previous mails in that conversation are stored.
  • The new car stereo that I purchased has a USB interface to allow you to plug in a USB stick of mp3 files to play. On first attempting to use this I found that all of the tracks in each album were playing in alphabetical order rather than the order that the songs appeared in the albums. A Google search revealed that the tracks were actually being played in the order that they are added to the FAT32 file system on the stick. Renaming the files using a neat piece of free software and re-copying to the stick resolved the issue. On reading various forums it appears that this was a common problem, but the existence of the workarounds of using other tools to sort the files was apparently sufficient for the manufacturer not to feel the need to enhance the behaviour.
  • The continuous integration tool Jenkins has a behaviour whereby it will 'helpfully' combine identical queued jobs for you. This has proved enough of a problem for enough people that it prompted the creation of a plug in to add a random parameter to Jenkins jobs to prevent this from happening, which we have installed.

The software I test

Given that my current product is a very generic data storage system with API and command line interfaces, it is natural that folks working on implementations will look to navigate around any problems using the tools at their disposal. Even on previous systems I've encountered some interesting attempts to negotiate the things that impede them in their work.

  • On a financial point of sale system that I used to work on it was a requirement that the sales person completed a fresh questionnaire with the customer each time a product was proposed. Results of previous questionnaires were inaccessible. Much of the information would not have changed and many agents tried to circumvent the disabling of previous questionnaires to save time on filling in new ones. The result was an arms race between the salespeople trying to find ways to reuse old proposal information, and the programmers trying to lock them out.
  • On a marketing system I worked on we supported the ability to create and store marketing campaigns. One customer used this same feature to allow all of their users to create and store customer lists. This unique usage resulted in a much higher level of concurrent use than the tool was designed for.
  • The compression algorithms and query optimisations of my current system require that data be imported in batches, ideally a million records or more, and be sorted for easy elimination of irrelevant data for querying. We have had some early implementations where the end customer or partner has put in place a infrastructure to achieve this, only for field implementation teams to try to reduce latency by changing the default settings on their systems to import data in much smaller batches of just a few records.
  • One of our customers had an issue with a script of ours that added some environment variables for our software session. It was conflicting with one of their own variables for their software, so they edited our script.
  • One of our customers uses a post installation script to dynamically alter the configuration of query nodes within a cluster from the defaults.
  • In the recent example that prompted this post a customer using our standalone server edition across multiple servers did not have a concept of a machine cluster in their implementation. Instead they performed all administration operations on all servers, relying on one succeeding and the others failing fast. In a recent version we made changes with the aim of improving this area such that each server would wait until they could perform the operation successfully rather than failing. Unfortunately the customer was relying on failing fast rather than waiting and succeeding so this had a big impact on them and the workaround they had implemented.

My former inclination as a tester encountering such workarounds was to adopt the stance of 'workaround denier', being dismissive of any non-standard uses of my software. More recently, thanks partly due to circumstance and partly in light of the very positive attitude of my colleagues, I've grown to appreciate the idea that being aware of and considering known workarounds is actually beneficial to the team. It is far easier to cater for the existence of these edge uses during development than to add support later. In my experience this is one area where the flawed concept of the increasing costs of fixing may actually hold true given that we are discussing having to support customer workflows that were not considered in the original design, rather than problems with an existing design that might be cheaply rectified late in development.

What are we breaking?

Accepting the presence of workarounds and slightly 'off the map' uses of software raises an interesting question of software philosophy. If we change the software such that it breaks a customer workaround, is this a problem? I don't believe that there is a simple answer to this. On one hand our only formal commitment is to the software as delivered and documented. From the customer perspective, however, their successful use of the software includes the workaround, and therefore their expectation is to be able to maintain that status. They have had to implement such a workaround to overcome limitations on the existing feature set and therefore could be justifiably annoyed if the product changes to prevent that. Moving away from the position of workaround denial allows us to predict problems that, whilst possibly justifiable, still have the potential to cause negative relationships once the software reaches the customer.

Some tool vendors, particularly in the open source world, have actively embraced the concept of the user workaround to the extent of encouraging communities of plug-in developers who can extend the base feature set to meet the unique demands of themselves and subsets of users. In some ways this simplifies the problem in that the tested interface is the API that is exposed to develop against.

(While this does help to mitigate the problem of the presence of unknown workarounds, it does result in a commitment to the plug-in API that will cause a much greater level of consternation in the community should these change in the future. An example that impacted me personally was when Microsoft withdrew support for much of the Skype Desktop API. At the time I was using a tool (the fantastic Growl for Windows) to manage my alerts more flexibly than was possible with the native functionality. The tool relied upon the API and therefore no longer works. Skype haven't added the corresponding behaviour into their own product and the result is that my user experience has been impacted and I tend to make less use of Skype as a result.)

Discovering the workarounds

The main problem for the software tester when it comes to customer workarounds is knowing that they exist. It is sometimes very surprising what users will put up with in Software without saying anything, as long as somehow they can get to where they want to be. The existence of a workaround is the result of someone tackling their own, or somebody else's, problem and it may be that they don't inform the software company that they are even doing this. It can take a problem to occur for the presence of the workaround to be discovered.

  • For the sales agents on the point of sale system trying to unlock old proposals, we used to get support calls when they had got stuck having exposed their old proposal data but now unable to edit them or save any changes.
  • The customer of the marketing campaign system reported issues relating to concurrency problems in saving their campaign lists.
  • For the team that edited the environment script, we only discovered the change when they upgraded to a later version and the upgrade had a problem with the changes they'd made and threw errors. Again this came in via the support desk.
  • For the team who reduced the size and latency of their imports, we only discovered the problem when they reported that the query performance was getting steadily worse.
  • For the recent customer who was taking a 'fail fast' approach to their multi-server operations, again the problem came in via the support desk exhibiting as a performance issue with their nightly expiry processes.

So the existence of a workaround is often only discovered when things go wrong. In my organisation the channels are primarily through the technical support team, and in running that team I get excellent visibility of the issues that are being raised by the customers and any workarounds that they have put in place.

For other organisations there may be additional channels through which information on customer workarounds can be gleaned. As well as being a useful general source of tester information on how your software is perceived, public and private forums are also the places where people will share their frustrations and workarounds. I've already mentioned that I used a public forums to discover a solution to my problem with my car stereo. My colleague also discovered the 'random parameter' add-in to Jenkins on a public forum, and it was public threads that we looked to in order to identify workarounds to the lack of branch support in Jira.

Prevention is Better than Cure

Responding to problems is one thing. If we work to gain an understanding of where the customers are getting into trouble I think that it is possible to anticipate places where customers try to work around limitations in Software and testers can be on the lookout for potential consequences of they do. Doing this, however, requires an understanding of customer goals and frustrations that sit outside of the defined software behaviour. I believe that the signs are usually there in advance if you look for them in the right places, perhaps a question from an implementation consultant working on customer site on understanding why a feature was designed in a certain way, an unsatisfied change request on the product backlog, or a post on a user forum. If we can make efforts to understand where the customers are getting frustrated then we can test scenarios where they might try to get around the problem themselves and establish how they might get themselves into trouble if they do. There are some testers in my team who actively help out with support so gain good visibility of problems. In order to encourage a wider knowledge of customer headaches throughout the test team we have started to run regular feedback sessions where a support agent discusses recent issues and we discuss any underlying limitations that could have led to these.

Of course, ideally we would have no need for workarounds at all, the software should simply work in the way that the users want. Sadly it is rarely possible to satisfy everyone's expectations. The responsibility to prioritise changes that remove the need for specific workarounds is probably not one that falls directly on testers. In light of this, is it something that they need to maintain awareness of? As I've stated my inclination was to dismiss them as something that should not concern testing - after all, we have enough challenges testing the core functionality. It is tempting to adopt the stance that the testing focus should be solely on the specified features, that we need to limit the scope of what is tested and the documented feature set used 'as designed' should be our sole concern. This is a strong argument, however I think that this belies the true nature of what testing is there to achieve. Our role is to identify things that could impact quality, or value the product provides to some stakeholder, to use Weinberg's definition. From this perspective the customer value in such scenarios is obtained from having the workarounds in place that allow them to achieve their goals. Rather than dismissing workarounds, I'm coming around to the idea that software testers would better serve the business if we maintained an understanding of those that are currently in place, and raise awareness of any changes that may impact on these. In our latest sprint, for example, we introduced some integrity checking that we knew would conflict with the post configuration script mentioned above, so as part of elaborating that work the team identified this as a risk and put in place an option to disable the check. This is exactly the kind of pragmatism that I think we need to embrace. If we are concerning ourselves with the quality of our product, rather than adherence to documented requirements, it appears to me to be the right thing to do.


Monday, 8 September 2014

Textual description of firstImageUrl

The FaceBook Effect

I recently celebrated the birth of my 4th child. Whilst my wife was recovering from the birth I enjoyed the opportunity to take my older children to school and to speak to friends and other parents wanting to pass on their congratulations and wishes. One such day I was chatting with a friend of my wife's and the conversation strayed into an area which she felt particularly passionate about, which also struck a chord with me both in a personal and professional capacity. The friend was telling me that she was so excited to hear news of the birth that she had logged onto Facebook for the first time in months to check my wife's status. She explained that she had previously stopped using Facebook as she felt that it compelled her to present a false image of her life. Whilst I occasionally use Facebook I was inclined to agree with her that there is a lot of pressure on social media to present a very positive image of yourself and your life. Indeed this pressure is such that many people seem more focused on staging personal occasions to take photographs and post details to social media than on actually enjoying the occasion themselves.

But What's this got to do with Testing?

Whatever your position on social media, you're probably wondering why I'm recounting this conversation on a testing post. The reason that the conversation struck a chord with me on a professional level was because I think that there is a similar pressure in professional communities and social media groups, with those associated with Software Testing being particularly prone for reasons I'll go into.

With the advent of professional social media in the last decade, professionals now enjoy far greater interaction with others in their field than ever possible before. In general I think that this is a hugely positive development. It allows us to share opinions and discuss ideas and accelerates the distribution of new methods and techniques through the industry. Social media channels such as Twitter, LinkedIn and discussion forums also provide less experienced members of the community far greater access to experienced individuals with a wealth of knowledge and expertise than ever possible than when I started testing. More importantly social media allows us to be vocal and criticise activities which could damage our profession and find other individuals who share the same concerns. The recent rallying behind James Christie's anti ISO 29119 talk would simply not have been possible without the social media channels that allowed like minded individuals to find a collective voice in the resulting online petition (I don't suggest that you go immediately and sign the petition, I suggest that you read all of the information that you can, and make up your own mind. I'd be surprised if you decided not to sign the petition ). Social media has the power to give a collective voice where many individual voices in isolation would not be heard.

On the flipside of the positive aspects, Social Media carries an associated pressure - let's call it the 'Facebook effect' - where those contributing within a professional community feel the need to present a very positive image of what they are doing. It is easy to compare one's own work in a negative light and engender feelings of professional paranoia as a consequence. This is not something that is specific to testing and the phenomenon has been highlighted by those writing on other industries, such as this post highlighting problems in the marketing community.

For the many who make their living in or around the social media industry, the pressure to be or at least appear to be an expert, the best, or just a player is reaching a boiling point.

The message is clear. Advancing in the modern professional work is as much about climbing the social media ladder as the corporate one and in order to do that we need to present the right image.

Living up to the image

Based on the article above, and the references at the end of this post, it is clear that the negative side of social media affects other industries, so what is it about testing in particular that I think makes us particularly conscious of how we present ourselves?

Before putting this post together I approached a number of testers to ask whether they had ever experienced or observed the symptoms of the Facebook effect in their testing careers. I received some very interesting and heartfelt responses.

Most accepted the need to present a sanitised, positive image on social media

You don't want to wash your dirty linen in public

And the need for constant awareness that everything posted publicly was open to scrutiny

I know that anything I say is open to public 'ridicule' or open for challenge

Some went further to admit that they had experienced occasions where they felt paranoid or intimidated as a result of interactions with testing based social media. One tester I spoke to highlighted the problem of opening dialogues requesting help or input from others and being made to feel inferior

when you make an opening to someone in our industry asking their thoughts or opinions and they seem to automatically assume that this means you are a lesser being who has not figured it all out already

I don't think either myself or the person writing that feel that assumptions of this type are always the case, but I've certainly experienced the same thing. A few years ago, as an experienced tester finding my feet in the agile world, I found myself frustrated by the 'you are doing it wrong' responses to questions I posted in agile testing lists. I don't want a sanctimonious lecture when asking questions on my problems, I want some open help that acknowledges the fact that if I'm asking for assistance in one area it doesn't mean that I don't know what I am doing in others.

Why is testing so affected?

I think there are a number of factors that contribute to testing being particularly prone to the 'Facebook effect'.

  • what are we admitting?

    I obviously can't comment on how it is for other professions, but I think for testing the pressure of presenting a positive image is particularly prevalent due to the implications of any negative statements on perception of our work or organisations. Any admission of fault in our testing is implicitly an admission of the risk of faults in our products, or worse, of risks to their data as customers. Whilst we may be prepared to 'blame the tester' in the light of problems encountered, it is not the case that we want to do this proactively by openly admitting mistakes. Some testers also have the additional pressure of having competitor companies who can pick up on revelations of mistakes to their advantage. As a tester working for a product company with a close competitor told me:

We are watching them on social media as I am assuming they are watching us. So I do need to be guarded to protect the company (in which) I am employed.
  • what are we selling?

    Some of the most active participants in any professional communities will be consultancy companies or individuals, and testing is no different. These folks have both a professional obligation not to criticise their clients, and also a marketing compulsion to present the work that they were involved in as successful so as to present value for money to other prospective customers. The result is a tendency towards very positive case studies from our most vocal community members on any engagements, and an avoidance of presenting the more negative elements to protect business interests.

  • Where are we from?

    Testing is a new profession. I know that some folks have been doing it for a long time, but it just doesn't have the heritage of law, medicine or accountancy that provides stability and a structure of consistency across the industry. Whereas this does result in a dynamic and exciting industry in which to work, it also means that It workers operate in a volatile environment where new methodologies compete for supremacy. Attempts to standardize the industry may appear to be an attractive option in response to this, offering a safety net of conformity it a turbulent sea of innovation, however the so far flawed attempts to do so are rightly some of the greatest points of contention and result in the most heated debate in the world of testing today. The result is an industry where new ideas are frequent and it can be hard to tell the game-changing innovations from the snake-oil. Is it really possible to exhaustively test a system based on a model using MBT? Are ATDD tools a revolutionary link between testing and the business or really a clumsy pseudocode resulting in inflexible automation? In such an industry it is naturally hard to know whether you have taken the right approaches, and easy to feel intimidated by others' proclamations of success.

  • Where are we going?

    For some an online presence is part and parcel of looking for advancement opportunities. LinkedIn is particularly geared towards this end. Therefore presenting only the most successful elements of your work is a prudent approach of you want to land the next big job. Similarly for companies who are recruiting, if you want to attract talented individuals then presenting the image of a successful and competent testing operation is important.

Facing Your Fears

One of the problems that we face individually when interacting with professional social media is the fact that the same 'rose tinted' filtering that is applied to the information that we read on other testers and their organisations is not applied to our own working lives. We see our own jobs 'warts and all' which, during the leaner times, can lead to professional paranoia. This is certainly something that I have experienced in the past when things have not been going as well as I would like in my own work. I found that this was more of a problem earlier in my career and has got less as I gain more experience which provides perspective on my work in relation to others. The FaceBook Effect does still rear is head during periods of sustained pressure when I have little chance to work on testing process improvements, as I have experienced this year.

The manner in which we deal with these emotions will inevitably depend on the individual. I think that, whilst easy to fall into a pattern of negativity, there are responses that show a positive attitude and that can help to avoid the negative feelings that can otherwise haunt us.

  • look to the experienced

    Ironically it seems to be the most experienced members of a profession that are most willing to admit mistakes. This could be because many of those mistakes were made earlier in their careers and can be freely discussed now. It could be that having a catalogue of successful projects under your belt furnishes folks with the confidence to be more open about their less successful ones. It could also be that the more experienced folks appreciate the value of discussing mistakes to help a profession to grow and build confidence in its younger members. These are all things that I can relate to and as my experience grows I find an increasing number of previously held opinions and former decisions that I can now refer to personally, and share with others, as examples of my mistakes.

  • get face to face

    I wrote a while ago about an exchange that I did with another company to discuss our relative testing approaches. I have since repeated this exercise with other companies and have another two exchange visits planned for later this year. The exchanges are done on the basis of mutual respect and confidentiality, and therefore provide an excellent opportunity to be open about the issues that we face. There is an element of security about being face to face, particularly within the safe environment of the workplace, which allows for a open conversations even with visitors that you have known for a short time.

  • consultancy

    I don't rely extensively on external consultancy, however I have found it useful to engage the services of some excellent individuals to help with particular elements of my testing and training. In addition to the very useful 'scheduled' elements to the engagement, almost as useful is having an expert with a range of experiences available to talk to in a private environment. As I mention above, consultants should maintain appropriate confidentiality, and they will also have a wealth of experience of different organisations to call on when discussing your own situation. Having had the benefit of a 'behind the doors' perspective of other companies provides a far more balanced view of people's relative strengths and can put your own problems into a more realistic context as a result. There are few more encouraging occasions for a test leader in an organisation than being told that your work stands up in a positive light to other organisations (even if they aren't at liberty to tell you who these are) .

  • closed groups

    I was fortunate to be involved in a project recently that involved being a member of a closed email list. I found this to be a liberating experience. The group discussed many issues that affect testers and test managers openly and without fear of our words being misinterpreted by our organisations or others in the community. There were disagreements on a number of the subjects and I personally found it to be much easier discussing contentious issues with reference to my own position in the closed group environment. The problem with discussing internal issues in an open forum is obviously the risk that your candid talk is seen by the wrong eyes, a closed group avoids this problem and allows for open and candid discussion with sympathetic peers. In fact I obtained some really interesting input from exactly that group prior to writing this post.

  • trusted connections

    I am lucky to have some fantastic testers on my private email address list who I can turn to in times of uncertainty or simply to bounce ideas off before putting them into the public domain. For example I recently had some questions around Web testing. This is not something that I've looked at for some time, having been focussed on big data systems. I received some invaluable guidance from a couple of people in my contacts list without any stigma around my 'novice' questions, as the individuals I spoke to know me and respect the testing knowledge that I have in other areas. Their advice allowed me to provide an educated view back to my business and make better decisions on our approach as a result. As with the closed group, I approached a number of personal contacts for their experiences and opinions to contribute to writing this post.

Don't worry be happy

When my brother left his last job in water treatment engineering his colleagues gave him one piece of parting advice.

Lose the nagging self doubt - you are great at your job

So it could well be that I suffer from some familial predilection to being self critical. You may not suffer from the same. Whether you do or not, I think that when using social media as a community we should maintain awareness of how others will feel when reading our input. We should try to remember that just because others ask for help doesn't mean that they don't know what they are doing. We should consider admitting failures as well as successes so others can learn from our mistakes and gain confidence and solace in making their own.

If interacting with social media personally leaves you with a taste of professional paranoia - I recommend reading this excellent short post from Seth Godin, and remind yourself that the simple fact that you are looking outside your work to the wider professional community to improve your testing, probably means you're doing a fine job.

Other links

Image: Sourced from twitter @michaelmurphy

Monday, 14 July 2014

Textual description of firstImageUrl

A Map for Testability

Here Be Dragons

Chris Simms (@kinofrost) asked me on Twitter last week whether I'd ever written anything about raising awareness of testability using a mind map. Apparently Chris had a vague recollection of me mentioning this. This is certainly something that I did, however I couldn't remember where I had discussed it. I've not posted about it, which is surprising as it is a good example of using a technique in context to address a testing challenge. As I mentioned in my Conference List, I have a testability target for this year. It therefore feels like an opportune moment to write about the idea of raising awareness of testability and an approach to this that I found effective.

Promoting Testability

As I wrote in my post Putting your Testability Socks On there are a wealth of benefits to building testability into your software. Given this, it is somewhat surprising that many folks working in Software don't consider the idea of testability. In environments where this is the case it is a frustrating task getting testability changes incorporated into the product, as these are inevitably perceived as lower priority than more marketable features. As Michael Bolton stated in his recent post, testers should be able to ask for testability in the products they are testing. The challenge comes in promoting the need for testability, particularly in products where it has not been considered during early development. This is a responsibility which will, in all likelihood, fall on the tester.

A great way that I found, almost by accident, to introduce the idea of testability in my company was to run a group session to the whole department on the subject. I say by accident as I'd initially prepared the talk for a UKTMF quarterly meeting and so took the opportunity to run a session on the subject internally in a company off-site meeting by way of a rehearsal for that talk. The internal presentation was well received. It prompted some excellent discussions and really helped to introduce awareness of the concept of software testability across the development team.

The Way Through the Woods

Even with a good understanding of testability in the organisation it is not always plain sailing. As I mentioned in my previous post, developments that persist without the involvement of testers are most at risk of suffering from lack of the core qualities of testability. It is hard to know how to tackle situations, such as the one I was facing, where lack of testability qualities are actually presenting risks to the software. The job title says 'software tester' so as long as we have software we can test, right?

On that occasion I took a somewhat unconventional approach to raise my concerns with the management team and present the problems faced in attempting to test the software. I created a mind map. As anyone who has read To Mind Map or not to Mind Map will know that I don't tend to use mind maps to present information to others. In this case I generated the map for personal use to break down a complex problem, and the result turned out to be an appropriate format for demonstrating the areas of the system that were at risk due to testability problems to others.

The top level structure of the map was oriented around the various interfaces or modes of operation of the software features. This orientation was a critical element in the map's effectiveness as it naturally focussed the map around the different types of testability problem that we were experiencing. The top level groupings included command line tools, background service processes, installation/static configuration and dynamic management operations such as adding or removing servers.

  • The installation/static configuration areas suffered from controllability problems due to their difficulty to automate and harness
  • The asynchronous processes suffered from lack of controllability and visibility around which knowing which operations were running at which time
  • The dynamic management operations lacked simplicity and stability due to inconsistent workflows depending on the configuration.

One of the key benefits of mind maps, as I presented in my previous post on the subject, is to allow you to break down complexity. After creating the map I personally had a much clearer understanding of the specific issues that affected or ability to test. Armed with this knowledge I was in a much better position to explain my concerns to the product owners, so the original purpose of the map had been served.

Presenting the Right Image

What I have said in my previous post on mind maps was that I don't tend to use them to present information to others, but if they are to be used for this purpose then they need to be developed with this in mind. In this case I felt that the map provided a useful means to assist in developing a common understanding between the interested parties and so tailored my personal map into a format suitable for sharing. I used two distinct sets of the standard Xmind icons, one to represent the current state of the feature groups in terms of existing test knowledge and harnessing, and the second representing the testability status of that area.

Mind Map Key

The iconography in the map provided a really clear representation of the problem areas.

Mind Map

Driving the conversation around the map helped to prompt some difficult decisions around where to prioritise both testing and coding efforts. I won't claim that all of the testability problems were resolved as a result. What I did achieve was to provide clear information as to the status of the product and the limitations that were imposed on the information we could obtain from our testing efforts as a result.

Highlighting the testability limitations of a system in such a way opens up the possibility to getting the work scheduled to address these shortfalls. It is difficult to prioritise testability work without an understanding amongst the decision makers of the impact of these limitations on the testing and development in general.

In an agile context such as mine then legacy testability issues can be added to the backlog as user stories. These may not get to the top of the priority list, but until they do there will at least be an appreciation that the testing of a product or feature will be limited in comparison to other areas. What's more it is far more effective to reference explicit backlog items, rather than looser desirable characteristics, when trying to get testability work prioritised.


Hopefully this post has prompted some ideas on how raise awareness of testability, both proactively and in light of problems that inhibit your testing. As well as this I think that the key lesson here is about coming up with the most appropriate way to present information to the business. In this case, for me, a mind map worked well. In all likelihood a system diagram would have been just as effective. Some of the customers that I work with in Japan use infographic type diagrams to great effect to represent the location of problems within a system in a format which works across language boundaries - something similar could also have been very effective here.

Testing is all about presenting information and raising awareness. The scenarios that we face and the nature of the information that we need to convey will change, and it pays to have a range of options at your disposal to present information in a manner that you feel will best get it across. There's absolutely no reason why we should restrict these skills to representing issues that affect the business or end user. We should equally be using our techniques to represent issues that affect us as testers, and testability is one area where there should be no need to suffer in silence.


Both my previous post Putting your Testability Socks On and Michael Bolton's recent Testability post ask for testability contain good starting references for further research on Testability.

Monday, 23 June 2014

Textual description of firstImageUrl

The Conference List

Conference List Notebook

I'm really pleased to be presenting a talk at EuroSTAR again this year. Having spoken before, I know this is a great opportunity and with Paul Gerrard as conference chair I'm sure it will be a fantastic event in Dublin.

There are many benefits to speaking at a conference, the most obvious being the opportunity attend a high profile testing events without having to pay for a ticket. There is also a lot to gain from discussing your work with your peers, as I discussed in my post sparing the time .

There are some less obvious benefits too. These are useful to be aware of, particularly for permanent employees such as myself who are not looking to achieve any marketing value for their product or service from attending. One particularly subtle positive for me results from the shift in perception that arises at the thought of presenting my work to others, and my response to looking at my work more critically.

Getting your house in order

One of the hidden benefits for me in speaking at a conference comes from the extra effort that I put in to completing my in-house background projects leading up to a speaking engagement. In order to attend and present to other testers I need to be coming from a position of confidence in the work that I'm doing. Whilst it should be the case that I have confidence in the testing that we do at all times, it is also the case that for inherently self critical individuals like me, things are rarely exactly as I want them. I always have projects in the pipeline that are aimed at improving the way that we work and filling the gaps that I see in our testing approach. Some of these may be background improvements to our processes and tools, some may be areas of testing areas that I think need attention. Whatever the situation, a speaking deadline provides an excellent incentive to get my house in order and progress those areas that I feel need improvement before I can discuss them with others.

What's on my Conference List?

Here are a few of the things that are on my list to try to do before EuroSTAR this November :

  • Team adoption of stochastic SQL generation

    Last year I set myself the task of creating a tool capable of 'randomly' generating SQL queries based on a data and syntax model designed by the tester. In order to do this I spent some time teaching myself Ruby, as I find including the learning of a new skill helps to maintain enthusiasm for any personal project. I'll save the details for another post, but just an at the stage now where I'd like to get more people involved and enthused about this, with the aim of making it part of our standard testing activities. I'm kicking this off this week with an introductory session with query team.

  • Customised Ganglia monitoring on all test machines

    [Ganglia] ( "ganglia") cluster monitoring has become a core element in our soak and scale testing activities. It supports both generic monitoring of operating system resources and also custom monitoring of metrics relevant to our software operation. In our case this includes, amongst other things, the memory of our processes; the disk space utilised in key areas; the numbers of tasks in our processing and pending work queues. Of course we need to be careful of the Observer Effect here. The collection of metrics on the behaviour and performance of processes and resources inevitably impacts that which it is monitoring. From discussions with our customers I know that the software can start to impact application performance as you increase metrics and machines,. Ganglia has proved very useful in pinpointing resource problems, and I intend to get it installed and running on all of our other test servers in the next few weeks.

  • Improved testability in background services.

    As I wrote about in my post Putting your Testability Socks On we did introduce some testability issues into parts of the system a while ago. Whilst many of these have since been tackled, there are some background processes which still exhibit issues with controllability and observability that I hope to resolve. Splitting a maintenance process down into a series of individually callable operations will allow us to explicitly trigger each of the operations managed by the process. This should provide more control for tests involving those operations, but also prevent the introduction of non-deterministic behaviour into other tests which are currently volatile. Needless to say we will also have the full process running for many other tests.

  • Replace our bug tracking system

    Our bug tracking system was adequate when we were a single team company with few customers. Now that we have a larger engineering department with multiple teams, many customers and, most importantly, multiple supported release branches, the system is struggling and I want to replace it with something more suitable. (Of course we could move towards not using a tracking system at all - Gojko Adzic's old post Bug Statistics are a Waste of Time is a good starting point for anyone wanting to go down that road).

What's your Conference List?

in my experience it is often the background projects that individuals work on that provide the greatest advances in the way we work. The user stories or developments that we are working rarely improve our work processes directly. It is the ideas that arise laterally out of my work that triggers the best ideas, yet these need cultivating by motivated individuals and can languish uncompleted without some targets to deliver.

Of course, does not have to be conferences that we might establish as arbitrary targets to deliver some longer term goals. Presenting on testing at internal company meetings, meeting other testers at testing meetups, recruiting new team members or even family personal milestones (My wife and I have baby number 4 due in July) can act as a target for completing long term tasks. Even in an open culture it is often hard to prioritise personal or background goals in the face of more immediate business needs. Establishing a personal deadline helps me in giving the incentive to deliver, particularly when that target involves presenting on your testing in front of a room full of testing experts. If you are struggling to deliver your background projects - try looking at your calendar and put some targets in place around events that you have coming up - it might just provide the push you need to turn that great idea into reality, and may even provide some material for your next talk.

Wednesday, 28 May 2014

Textual description of firstImageUrl

The Friday Puzzle


Every Friday I put together an email entitled the 'Friday Report' which I send round to all of my teams. In it I include relevant articles and blog posts of interest to the various roles within the teams, including testing, documentation and support. As well as these links I also try to include a puzzle for people to tackle.

The original intention of including a puzzle was, as one might expect, as a bit of fun to encourage people to discuss the problem and find a solution. As I researched and presented more puzzles I started to see the scope for a less obvious, yet potentially more valuable benefit to these. I found that the most interesting discussions around the puzzles arose when individuals move beyond searching for the expected answer and start to apply critical thinking around the premises of the puzzles themselves. Through some simple questions I've been able to encourage some excellent debate around the puzzles themselves, calling into question the validity of the assumptions, the integrity of the constraints, and the viability of the solutions themselves.

  • Is the puzzle logically sound?
  • is the solution "on the card" the only one that fits?
  • Is it possible to come up with a different solution if we remove the assumed constraints?

Puzzles share many common characteristics with the situations that we are presented with working in a software company. They are usually based around some simplistic model of the world and the individuals within it that restrict behaviour, in a similar way that behaviour is modelled when developing software solutions. Puzzles may be presented with a limited set of information on the scenario in question, just as problems requiring diagnosis are rarely presented with all of the salient facts up front. Puzzles are presented with carefully chosen language to lead or mislead, just as customers, suppliers and other parties we interact with may sometimes use language to define their problems with the intention of leading us to the conclusions that they feel are relevant. In questioning the puzzles we are practising valuable skills in critical thinking that can be applied directly in our working contexts to ensure we maintain an open mind and consider options beyond the obvious ones presented to us.

I'm really pleased how enthusiastically everyone tackles these puzzles. On most Fridays now the discovery of the given answer is only the beginning of an excellent display of questioning and critical thinking around the problem. I've included here a few choice examples of puzzles and subsequent discussions that I think exhibit the exact characteristics that I'm really hoping for in this kind of exercise.

Choose your execution

This was the first puzzle in which I really started to see and introduce the idea of questioning the integrity of the puzzle itself as opposed to simply looking for the 'correct' answer. The interesting thing here is that the basis of the puzzle is that the prisoner escapes through manipulating the logic of his situation, however for me the language of the puzzle scenario provided scope for exploring other ideas.

Puzzle: General Gasslefield, accused of high treason, is sentenced to death by the court-martial. 
He is allowed to make a final statement, after which he will be shot if the statement is false 
or will be hung if the statement is true. Gasslefield makes his final statement and is released. 
What was his statement?**
  • A: I think the General said “I’m going to be shot”.*
  • Me: Well done, that’s the solution given. I personally would not have let him go – anyone see a way round letting him go?
  • J: Yes – I don’t see any reason why the alternative to “shot” or “hung” had to be “set free”. That’s not stated anywhere. It could have been continued incarceration – or some other method of execution. Anyway, given the statement, as soon as you take one or other of the two prescribed actions, either the statement or the action becomes invalid under the rules. If you don’t take an action, then we don’t yet know whether the statement is true of false – so he could continue to be held.
  • Me: They say they’ll shoot him if his statement is false but don’t explicitly say they won’t shoot him if it is true (i.e. it is an OR not an XOR) I’d have hung him then shot him.
  • S: Or may be shoot him on the leg and then hang him.

The Smuggler

This is a great example of something that is seen a lot in software circles. A solution that, on first examination, appears plausible, but that does not stand up well to closer scrutiny considering the practicalities of the situation.

A man comes up to the border of a country on his motorbike. He has three large sacks on his bike. The customs officer at the border crossing stop him and asks, “What is in the sacks?” 
“Sand,” answered the man.
The guard says, “We’ll see about that. Get off the bike.”
The guard takes the sacks and rips them apart; he empties them out and finds nothing in them but sand. He detains the man overnight and has the sand analysed, only to find that there is nothing but pure sand in the bags. The guard releases the man, puts the sand into new bags, lifts them onto the man’s shoulders and lets him cross the border.
A week later, the same thing happens. The customs officer asks, “What have you got?”
“Sand,” says the man.
The officer does another thorough examination and again discovers that the sacks contain nothing but sand. He gives the sand back to the man, and the man again crosses the border.
This sequence of events repeats every day for the next three years. Then one day, the man doesn’t show up. The border official meets up with him in a restaurant in the city. The officer says, “I know you’re smuggling something and it’s driving me crazy. It’s all I think about. I can’t even sleep. Just between you and me, what are you smuggling?”
What was he smuggling?
  • A: Smuggling gold dust in the sand?
  • L: I think he’s smuggling motorbikes.
  • Me: That is the 'answer on the card' - he was smuggling motorbikes. Any flaws in the puzzle?
  • O: Finding pure sand seems amiss as sand has absorbing properties. The analysis should have uncovered the sand plus all that it had absorbed e.g. water, sack fibre etc.
  • N: Surely the border guard would have noticed that it was a different bike each time?
  • S: could be the same model/colour with false number plates?
  • A: The official must have been suspicious if he was riding what appeared to be the same bike for 3 years but not showing any signs of wear and tear
  • S: He drove at night time; with poor lighting :)
  • Me: If he was smuggling motorbikes into the country - how did he get back? Would the lack of a motorbike on the return journey not have triggered some suspicion?
  • N: Maybe he smuggled cars or bicycles in the opposite direction? Or maybe he came back by bus or train.
  • S: The customs officer won’t be working 24 hours, so on his return on foot, a different officer is working and always sees him on foot.
  • Me: So the guy goes across every day and the customs officer is convinced that he's smuggling something to the extent that it is "driving him crazy" - but he doesn't tell his colleagues to check how the guy gets back and what he has with him? I wouldn't want to hire that customs officer for anything important.
  • S: We know little about him coming back – maybe he is doing the same in reverse…..(different model/make) and the customs officers did talk about it, but the ‘crazy’ one assumed it would be the same bike.

Predicting the score

This is a great example of a puzzle that is trying so hard to be clever that it has omitted the glaringly obvious, that the simplest answer may just be correct. Also included for a great piece of lateral thinking in the final response which is actually more pleasing than the rather unsatisfactory answer on the card.

Bill bets Craig £100 that he can predict the score of the hockey game before it starts. 
Craig agrees, but loses the bet. Why did Craig lose the bet?
  • S: Because the score of a hockey game before it starts is 0:0
  • Me: That's the answer on the card - everyone happy with that?
  • W: Because Bill guessed the correct score.
  • Me: That is some seriously twisted lateral thinking.
  • J: You can predict anything. Bill predicted the score. He didn't bet he would predict it correctly.

The Woman with half boys

This is a great example of where the language of the puzzle included unintended ambiguity. The answer on the card is based on an unsatisfactory premise that you can have half a boy. In this case, as with the previous one, I think that the team came up with a fast more elegant solution than the answer on the card.

 A woman has 7 children, half of them are boys. How can this be possible?
  • A: Maybe “half” means at least half, in which case she has at least 4 boys. Maybe she is expecting another one but doesn’t know the gender, so 4 of the 7 are boys, 3 are girls, and the one on its way could be either.
  • J: The statement is only that half are boys, there's no saying what the other half are - they could be boys too. If so, half are boys, a quarter are boys - any fraction. However, half of 7 of anything seems odd when the individual items are logically indivisible. Of course it could be in a specific application of the term "boy". So if one of the woman's children was a man or a woman rather than a boy, that could leave 3 and 3 boys and girls. That seems really inconsistent, though - being pedantic about "boy" but loose with "children" (and indeed loose with "them").
  • J: The "them" could include the woman, at which point, there are 8 people and if 4 are boys we're there.
  • Me: Well done J for coming up with an answer which, in my opinion, is actually neater than the one on the card (which was that they were all boys)

Carrying Sacks

This one I chose specifically due to ambiguity in the question that I thought would prompt interesting debate. It is a poorly worded puzzle and the team had no problem in dismantling it, and interestingly using it as an opportunity to explore the socio-political implications of the problem scenario!

A farmer and his hired help were carrying grain to the barn. 
The farmer carried one sack of grain and the hired help carried two sacks. 
Who carried the heavier load and why?
  • A: The farmer… the hired help had empty sacks… well it doesn't specify there was grain in them.
  • N: Even if there WAS grain in the hired help’s sacks, it would depend on how much grain was in each sack. So it could be either of them.
  • S: "A farmer and his hired help were carrying grain to the barn." This suggests that both were carrying grain. We don’t know the size or content of the sacks, or the weight of an empty sack. As it stands I do not think we can tell.
  • A: My thoughts exactly. Or maybe they were carrying different types of grain that weighed different amounts. Or maybe one was carrying dry grain and one was carrying waterlogged grain. Without further information it’s all just guesswork!
  • L: I’d hope that if the farmer had paid the guy, the hired man would have been of some help so therefore should be carrying at least as much weight as the farmer.
  • N: Depends what he or she was hired for. It might have been to feed the chickens and collect their eggs.
  • Me: Well done to N and L for getting the answer on the card. I'm with S and N - the first sentence states that they were both carrying grain. The second does not exclude the possibility of the hired help carrying grain, so the question is ambiguous and unanswerable. For me the most sensible answer is the 'obvious' one that the hired help carried more as the probability is that the sacks were full and of equal size, and as L points out, he was hired to help not just mess about with sacks.
  • E: But isn’t it more of a political/economic/philosophical question? The “hired help” represents the oppressed masses, forced to sell their labour for little or no reward. The “farmer” represents the capitalist oppressor. The burden of the hired help will always be the greater, in the exploitative capitalist society in which the farmer and the hired help notionally co-habit. My only question is why the farmer is carrying anything at all.

Riding into the Forest

Another one where the simplistic model of the problem doesn't stand up to practical considerations. Whilst something of an easy target, I really like the answers here.

How far can a horse run into a forest?
  • N: The answer to the puzzle is “half way”. Any further and the horse will be running OUT of the forest.
  • M: Depending on the density of the forest, I’d have to say “as far as the first tree”
  • Me: N has the answer on the cards, although Michael has applied some practical considerations to the issue….any other thoughts?
  • S: With regard N’s answer, that assumes the horse is running in a
  • straight line. It might be going in circles. A few answers from me:
    • As soon as it crosses the boundary into the forest, it is in, so it can’t run into it any further.
    • Until it stops running
    • Until it runs out of the forest
  • N: If the horse was riderless, it probably wouldn’t run into the forest at all. It would run over open ground if it could – less hazardous!
  • L: I agree with N - if a horse was riderless, it would not run through a forest. It would walk. Unless it was being chased, at which point I guess the answer is ‘until it got eaten’.

A Valuable Distraction

Whilst you could argue that these exercises distract from 'real' work I think that they are invaluable in promoting team debate around a subject and, more importantly, engender a team culture of collaborative criticism. I believe in encouraging an environment where it is acceptable to question the invisible boundaries that may have been established. Whether through our own assumptions or the dictate of others, were are often faced with the situation where we are introduced to scenarios with pre-established constraints, and the easiest option is usually to accept these and operate within them. I hope that by practising the process of questioning the scope and boundaries of every situation helps to ensure that we don't just blindly accept the information presented to us. I don't want to work with testers who only look for problems in the functionalities presented to them, I want to work with testers who question the functionality itself. I want to work with support agents who question the conclusions on that have been made by customers on what has caused the problems that they are facing. I want to work with technical authors who question behaviour and how it fits with the product set as a whole rather than simply documenting what has been done. Luckily for me that is exactly what I have, something I'm reminded of every Friday.


Some nice puzzle links can be found here:-


Saturday, 10 May 2014

Textual description of firstImageUrl

Testing with International Data

I've recently been working on a new database utility that my company are releasing, helped by our intern-turned-employee Tom. One of the areas that I've been focussing on, for the first time in a while, is testing support for international characters and different character encodings. Tom is new to testing of this nature and working with him on this has reminded me of two things: firstly how difficult testing support for international characters can be, and secondly how enjoyable it is.

Tackling the challenge of ensuring data preservation and consistency across different data applications is a tricky job that I've seen trip many people up. In that difficulty comes a great deal of satisfaction when you understand behaviours and identify problems. I thought that, while working in this area, it might be useful to document some of the tips and gotchas that I have encountered.

Don't Trust Characters

The single biggest mistake that I've seen people make working on data verification is to base their deductions on the characters that are presented to them by their application interfaces. I have found the characters that are presented in an application interface to be an extremely unreliable means of checking the data that has been transferred or stored in a data system.

The reason is simple. Data is not stored or transferred as characters. This is done in the form of streams of bytes. The decision on how to present those bytes as characters is typically only made at the point of presenting it to the interface. The thing is, much as the metaphorical tree in the forest, the data won't actually 'appear' unless someone views it, and in order to do that it must be mapped to characters. The decision on how to present a byte or set of bytes as a character will depend on the encoding that the application is using and that can vary between applications and interfaces.

This exposes the tester to the possibility of two fundamental characteristics of character data which, in my opinion, are the most important to understand when working with data and systems of this nature:

1. The same data can appear as different characters when presented using a different encoding.

2. Different data can appear the same when presented using a different encoding.

What really confuses the matter is that the decision on which encoding to use is not made in one place, it will depend on factors internal to the application itself and in its operating environment. These include the data, the operating system, the database settings, the application settings and even the method of connection.

All about encodings

I'm not going to cover character sets and encodings in detail here as it is much too broad a subject. I will present some of the key encodings and their characteristics, but I suggest if working on any data system to read up and gain a solid understanding of character encodings. Character Encodings are a mechanism to map stored information to specific characters. Each encoding discussed here maps a byte or set of bytes to the code points in the encoding, each of which typically then represents a character, punctuation mark or control.

Single byte encodings.

Single byte encodings involve the storing of data in which a single byte is used to represent any individual character or control code. Hence single byte encodings are limited to the range of unique single bytes, at most 256 characters. Most are based around using the 7-bit Ascii encoding to map the byte range 1-127 ( or 0x01 to 0x7F ) to the basic Latin characters, numerics and punctuation required for the English language. Clearly there were many languages based on other accentuation, characters and alphabets and different encodings evolved around the use of 8-bits, or 1 byte, which support different ranges of characters in the 128-255 ( 0x80 to 0xFF ) in an attempt to support these. Many different single byte encodings exist which map this small range of possible single byte 'extended' code points to different characters. Some examples of common ones that I've encountered.

Unicode encodings

Unicode encodings are different from single byte ones in that they are designed to support unique representations of every possible character in every language, and Unicode uniquely defines a specific unique code point for every character. In order to represent this many unique points it is clear that more than one byte per character is required, and different Unicode encodings exist to map multiple bytes to each unique unicode code point.

  • UTF-8
  • This is the unicode encoding most used in Linux and Unix based operating systems. One of the greatest assets of UTF-8 can also prove to be a cause of great confusion when it comes to testing. That is that the first 127 points are single byte, and match exactly to the Ascii single byte encoding. This means that you can convert data encoded in UTF-8 with a single byte encoding and much of the data will appear as you expect. It is only when code points above the 0-127 range are used that the confusion starts as UTF-8 uses 2 or more bytes for all characters outside this range. Many times I've been presented with something like this

    UTF-8 data viewed as Ascii as an example of corrupt data, when in actual fact it was UTF-8 data being viewed in an application using a single byte encoding. (As a rule of thumb, if when looking at your data you see a lot of "Â" characters - you're viewing multi-byte data using a ISO-8859 encoding).
  • UTF-16
  • This is the encoding used for Unicode operations on Windows operating systems. Many windows applications will use the term 'Unicode' to refer to UTF-16 which can be confusing if you are working across operating systems. UTF-16 uses at least 2 bytes, sometimes 4, for every code point. Therefore it is incompatible with single byte encodings which, whilst making it sometimes less efficient, at least avoids confusion and ambiguity between UTF-16 data and single byte data.
  • UTF-32
  • I've not had a great deal of experience with UTF-32 - I know that it uses exactly 4-bytes for each code point, there is no variability in the number of bytes used for each code point, but I've not encountered this much in commercial use.

Given the many different encodings that can be used it is hardly surprising that confusion occurs. Whilst the increasing use of Unicode encodings, in preference to the legacy single byte encodings, is solving many of the problems involved with abiguity in character encodings, anyone testing with legacy data or across multiple systems will need to maintain awareness of both the single-byte and Unicode encodings that are are in play.

As I've mentioned, character encodings can be influenced by a number of factors. The primary one is the locale that is being used by the operating system or session. In Windows this is controlled via the "Control Panel - Region and Language" panel under "Language for non-Unicode programs". This setting controls the system locale which, amongst other things, sets the encoding used to interpret single byte data. Linux uses UTF-8 as its standard encoding and will typically, in my experience, be configured with a UTF-8 locale. As I describe above, this has the advantages of any data containing bytes solely in range of Ascii being consistent with most single byte source encodings, it can be confusing if dealing with single byte data from different locales e.g. data coming from other systems. The locale in Linux is controlled by the LC_* and LANG environment settings - a decent summary of which can be found here.

Lets look at some areas where I've encountered problems.

Don't trust applications

Application interfaces are the first place that we might look to verify whether the data that we expect is present, however these can be the least reliable. Application interfaces are likely to yield the least visibility and control over what you are seeing. The application may be using operating system settings or internal defaults to perform character mapping, and the absence of access to underlying data makes it hard to be sure what is stored.

In client-server or n-tier systems the problem is made worse due to the potential for both client and server side settings to be in play and for data transformations to be occurring in internal layers that you have no visibility of.

As I demonstrated in this post it can be hard to understand what s going on between the various interfaces of a data system. If the characters being presented are not what you expect then you might need to examine the data settings of the application and use application tracing to help understand the situation (of course, here you are beholden to the quality of that tracing).

Another thing that you need to bear in mind when viewing character data in an application is whether the font that is being used by that application is capable of displaying the characters represented by your data. Some fonts are limited to specific ranges of characters, with support for Chinese characters often missing due to the sheer number of characters required. If you are using an application that allows control over fonts then changing to a font that is appropriate for the data that you are viewing may be necessary to view the characters it represents.

When looking to test data systems to check the validity or conservation of data, the application interface can provide useful immediate feedback that something is wrong , however if you want to really see what is stored then getting access to the underlying data is essential.

Don't trust Databases

Accessing a database directly should be a much less ambiguous operation than working through an application interface, however it can still be confusing. I've seen a number of issues raised with our support teams resulting from confusion as to what a customer is seeing.

When accessing databases directly you are still using an application. This may be a GUI based application such as SQL Server Management studio or Toad, or a command line like SQLplus. It is still a client application and it pays to understand how the data is being presented. Inserting character data via a console or GUI interface using SQL is particularly unreliable as the bytes that are stored may differ from those intended based on the database settings and the interface used.

Databases will store data typically as single-byte CHAR/VARCHAR columns, or as unicode/variable-byte NCHAR/NVARCHAR columns. There are two important properties that need to be understood when working with character data in databases.

  • The database will use a CHARACTER SET to map the single byte column data. This character set is usually chosen from the same set of single-byte character sets that are used in most operating systems, however as we will see this is not always the case.
  • A database will also support a COLLATION which dictates sort order and equivalence when comparing character data - for example whether upper case and lower case letters are considered equivalent when comparing and sorting, and whether accented letters match to non-accented ones or not.

Different databases have different implementations with regard to character sets and collations - Sqlserver , for example, uses the collation of a database, table or column to cover both the character set of data on the server and the SQL collation. Oracle, on the other hand, supports different properties for the character set and collation of a database, table or column.

The effect of character sets on character columns is nicely demonstrated in SQLServer if we run a simple script to populate the bytes from 1 to 255 into columns, each using a different collation.

CREATE TABLE MixedEncodings
     Arabic_CS_AS CHAR(1) COLLATE Arabic_CS_AS NULL,
     Cyrillic_General_CS_AS CHAR(1) COLLATE Cyrillic_General_CS_AS NULL,
     Latin1_General_CS_AS CHAR(1) COLLATE Latin1_General_CS_AS NULL
     Chinese_PRC_CS_AS CHAR(1) COLLATE Chinese_PRC_CS_AS NULL,
     Vietnamese_CS_AS CHAR(1) COLLATE Vietnamese_CS_AS NULL

INSERT INTO MixedEncodings(code) VALUES (1),(2),(3),(4),(5),(6), ... you get the idea ... (253), (254),(255)

UPDATE MixedEncodings
  SET Arabic_CS_AS=CAST(code AS BINARY(1)),
  Cyrillic_General_CS_AS=CAST(code AS BINARY(1)),
  Latin1_General_CS_AS=CAST(code AS BINARY(1)),
  Chinese_PRC_CS_AS=CAST(code AS BINARY(1)),
  Vietnamese_CS_AS=CAST(code AS BINARY(1))

We can see that the result of querying this data is that the "code" byte is mapped to different characters at the point of presenting the results of querying the different columns, based on the collation of each column.

SQLserver bytes encoded with different collations

Yet, based on a CAST of the data to Binary we can see that the actual bytes stored in the column are the same for each column, it is the collation that is dictating that these bytes be presented as different characters, a neat demonstration of fundamental characteristic number 1 that I highlighted at the start.

Bytes matching different collations in SQLserver

This will not always be the case. The same text data, stored in NCHAR or WCHAR columns will have different underlying byte encodings from single byte CHAR data. I've had to deal with confused customers who were comparing HEX output from their source database with my product, not realising that our underlying UTF-8 server encoding resulted in different bytes to represent the same character data as their source single byte encoding, fundamental characteristic number 2. To avoid confusion when creating test data in databases I recommend, as with the example above, explicitly injecting data using binary format rather than attempting to insert characters using SQL string INSERT statements.

As I hinted above, some databases do not use standard character encodings but have their own custom encodings, which can trip up the unwary. Teradata uses different encodings for the data on the server and the client, and the Teradata Latin server encoding is a custom one which does not match any of the ISO or windows single byte latin ones. The byte 0xFF in ISO-8859-1 is the character ÿ, yet that byte loaded into Teradata using the Teradata custom Latin encoding is the Euro symbol. Therefore the resulting characters when a single byte file is loaded into a teradata database will be slightly different from those we might expect from looking at the source file from the operating system terminal.

In fact, while we are on the subject of terminals...

Don't trust Terminals

Working with command line connections can be particularly misleading, as folks often don't realise that these are applications which will have their own encodings to map bytes to characters, which are sometimes independent from the system being connected to. I think the fact that it is a command line leads people to place more trust that they are seeing the data 'as it appears on the server'.

Some terminals will not support unicode by default, and are often limited to specific fonts which, as mentioned, can result in the inability to present certain characters. Here, for example, is a small file containing a mixture of characters from different alphabets stored in a unicode file, viewed via the windows command line.:-

Remote console connections can be particularly bemusing. I've dealt with confused clients who were trying to view multi-byte data using a putty terminal, not realising that this was using a single byte encoding to present characters, when the database server they were connecting to was using a utf-8 encoding. The character set setting for putty is found in the connection properties - when working with Linux I have found that changing this to UTF-8 by default avoids a lot of mistakes.

Putty Encoding settings

here are some examples of the same data viewed in different putty sessions with changed encodings - the first with a default Latin encoding and the second with the correct UTF-8 setting:-

Data viewed with Latin console settings

Putty file viewed as UTF8

Sometimes it is easier rather than viewing remotely to get the files onto your machine and view them there, or is it.....?

Don't trust files

Even looking at files on a local machine is fraught with peril. Copying files needs to be done in binary transfer mode to avoid changing characters 'helpfully' for your operating system. Then you need to make sure that you view the file using an appropriate tool. Using a text editor brings you back to the problem of applications.

One of the classic mistakes that I've seen people make is to open a unicode file in a text editor, not seen the characters that they expected, so performed a File - Save As and re-encoded the data as presented, i.e. the incorrectly interpreted set of characters, back into Unicode format thus permanently losing the original data. Luckily text editors are getting better at automatically detecting Unicode encodings, however Windows achieves this throuh the use of a "BOM" (Byte order marker) at the start of the file which can be confusing when viewing in other operating systems or as Hex. There are excellent editors such as Notepad++ which allow you to select the encoding that you want to use to view the data, without modifying the file.

Encoding settings in Notepad++

There is still need for care here - the "Encode" menu items here do not change the data, only the encoding used to present the data as characters. The lower "Convert" options however will encode whatever characters are currently in the application window into the encoding selected, very much as I've described above. A useful option but one to be used with care.

So What Do I trust

The main thing that you need to be able to trust when working with international data and different encodings is your own understanding. If you are confident in what you are looking at, what are expecting to see and where you can trust the characters that you are seeing, then the battle is almost won. Working out where data is stored, what translations it undergoes and what encoding is being used to map that data to representable characters is critical to make sure that your testing is based on the appropriate target behaviours.

The most reliable way that I've found to verify the content of data is to view using a Hex Editor. The hexdump command on Linux with the -c option is a great way to view character files. On Windows I've found HHD Hex Editor Neo very useful as it allows you to view Hex data alongside different encoding representations of the same bytes, including Unicode ones. Most Hex editors are limited to single byte representations.

In addition to allowing the viewing of the underlying bytes in a data file, many Hex editors also support the comparison of files. The downsides of comparing in hex is that you don't tend to get the nice matching and alignment that comes with textual comparisons. If you can obtain supposedly identical data in a matching encoding and format, including line endings and delimiters, then hex comparison can be a very precise and surgical means of checking for consistency.

Hex Editor Neo HDD

As I mentioned in the 'database' section you can't necessarily rely on a difference in bytes indicating a difference in the character data that is represented. Remember the second of the key principles presented at the start. However if you can obtain data in a consistent format then viewing as hex can remove all manner of confusion as to differences that you are seeing. I also suggest that, where possible, data comparisons will be done programatically based on the byte content of the data rather than relying on visualising the characters. I have a Java program that I use to compare output files from our application with the source data loaded in to validate data conservation within the system.

One trick that I also find particularly helpful for testing character data conservation is to create a unicode copy of any character data that I test with. Then I can use this copy to compare back to on a character basis if the transformations that have occurred to the data within the system make binary comparisons to my source data inappropriate, for example if I've imported single byte data into a UTF-8 based system. To check for character support I keep copies of the entire byte ranges for both single byte and unicode encodings as test source data files, these are incredibly useful to push through a data system or process to check for conservation and consistency.

Working with character data can be hugely confusing field if you aren't sure what you are doing. By understanding the key concept that character representations of data are untrustworthy and inconsistent we have a much better chance of avoiding getting caught out. Developing a familiarity with the concepts of encodings and the various settings that control these provides an invaluable foundation for anyone involved in data testing.

Further Reading

As with all things in computing there are multiple layers of complexity to understanding the impact and complexity of character sets, encodings and locales. I've not found the need to delve to the depths, for example, of how the Linux kernel processes to perform data validation operations myself but there is plenty of information available.

  • This site has a good summary of the settings and considerations when working with Unicode in Linux.
  • A good introduction to code pages on Microsoft can be found here.
  • For more generic cross platform considerations this page provides a good concise history of encodings and the implications of mixing between unicode and non-unicode ones
  • this blog post provides a good in depth explanation of character encoding considerations, targetted at programmers.