As we become more comfortable with technology, it is easy to forget how reliant we are on the systems that we use until something goes wrong. Before it’s too late, we need to take a step back and analyse how important our data is to us and how likely it is to disappear off the face of the earth!
Recently I was in a car and drove past someone with two sat-navs. At first I thought it was over the top, but then I remembered the time that my sat-nav froze, nearly sending me straight past where I needed to go! This scenario shows the difference between me making a short trip to the shop and the other driver with two sat-navs, who potentially had a lot more riding on his trip. Contingency is about mitigating the impact or probability of an event by using what we know to prepare. Perhaps this is exactly what the other driver was doing.
Working with systems comes with many pains, e.g. data becoming irretrievable once lost. However, there are various methods that we can follow to reduce the chances of this happening.
We can begin by identifying the known issues and understanding the probability and impact attached to them. The following example lists a few risks associated with the development of software, along with our best estimates of the probability of each event and its impact if it were to occur.
|Risk||Probability||Impact|
|Hard drive failure||Low||High|
|Late changes to requirements||Significant||Significant|
This list should help us analyse and prioritise the most serious risks. To help us understand when we need to take action and reduce the risks, we can use the matrix below:
Depending on which tile the risk lands on we need to take a different course of action.
If the risk lands on a ‘monitor’ tile then we need only observe the issue over time to ensure that the risk does not grow in importance. Risks landing on a ‘mitigation’ tile need to be dealt with in order of urgency, by reducing the probability and/or impact until the risk moves to a ‘monitor’ tile.
Looking at our risks above, we can see that the ‘network loss’ risk lands in the ‘monitor’ zone. We can look back on this issue at the start of each sprint, checking that the probability and/or impact has not increased. If the network has become more unreliable since the last check then we may have to look at alternative methods of connection, such as a mobile dongle. The ‘hard drive failure’ risk, on the other hand, lands on a ‘mitigation’ tile. Rather than monitoring this risk we need to look at ways to mitigate it. One way of doing this would be to back up any important files to reduce the impact, or to invest in a more reliable storage medium to reduce the probability. The same goes for ‘late changes to requirements’: we should look at developing the software incrementally so that we can monitor, plan and test more effectively.
The above matrix is aimed at changing your thinking towards risks so that you can plan ample contingency and ensure that known risks do not come to affect your work. It is not set in stone and can be adapted to fit your needs better. You may find that an ordered list of risks proves more useful. The important aspect here is keeping track of your risks and monitoring for change.
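As a rough illustration of both ideas (looking up the tile for a risk and keeping an ordered list), here is a minimal Python sketch. It is not the matrix itself: the numeric scores, the extra ‘Medium’ level, the threshold and the names `Risk`, `LEVELS`, `MITIGATION_THRESHOLD` and `prioritise` are all assumptions made for this example, chosen only so that the two rated risks above come out as ‘mitigation’, as described.

```python
from dataclasses import dataclass

# Assumed ordering of the rating labels. The article only uses "Low",
# "Significant" and "High"; "Medium" and the numeric scores are hypothetical.
LEVELS = {"Low": 1, "Medium": 2, "Significant": 3, "High": 4}

# Hypothetical cut-off: scores at or above this count as a 'mitigation' tile.
MITIGATION_THRESHOLD = 4


@dataclass
class Risk:
    name: str
    probability: str
    impact: str

    @property
    def score(self) -> int:
        # Simple probability x impact product, one common way to rank risks.
        return LEVELS[self.probability] * LEVELS[self.impact]

    @property
    def action(self) -> str:
        return "mitigation" if self.score >= MITIGATION_THRESHOLD else "monitor"


def prioritise(risks):
    """Return the risks ordered from most to least serious."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


risks = [
    Risk("Hard drive failure", "Low", "High"),
    Risk("Late changes to requirements", "Significant", "Significant"),
]

for risk in prioritise(risks):
    print(f"{risk.name}: score {risk.score}, action: {risk.action}")
```

Re-running a script like this at the start of each sprint, with updated ratings, is one way to spot a ‘monitor’ risk that has drifted into ‘mitigation’ territory. The ‘network loss’ risk is left out because its ratings are not listed above.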
Example Mitigation Techniques
Below I have listed some common risks that you may encounter on a daily basis, along with possible mitigation techniques. These are not definitive answers but may highlight an area that you have not experienced issues with yet (let’s keep it that way!).
|Risk||Mitigation|
|Loss of source code||Commit early and often using source control tools, e.g. Git or SVN.|
|Loss of online form data||Use the Chrome plugin ‘Lazarus: Form Recovery’, which is useful when your browser freezes during an online application.|
|Loss of documents – e.g. text files containing credentials, useful data or notes||Use one of the many file/password storage tools, e.g. Dropbox, Google Drive, LastPass, Trello, Evernote.|
|Absence of team member||Share knowledge more often, cross-train, give daily updates and store shared information on tools such as Trello or Google Docs.|
|Unable to travel to work||Ensure that you have access to your data from home, e.g. whitelisting of your home IP address, devices available to work on and methods to communicate with the team.|
|Unwell||Ensure that another team member has access to your work and knowledge of your day-to-day tasks so that they can continue whilst you are away. It is also important to commit or share your work before you finish for the day.|
|Theft of device||Ensure no credentials are stored on your device and check that your laptop is password protected, locking after 5 minutes of inactivity.|
The list is far from exhaustive but hopefully sparks ideas for some of the risks that you come across.
It is important to ensure that what we do is reliable, as it is not always possible to plan for every risk that may occur. However, we can prepare for the ones that we know about by building more contingency into our actions, such as using two sat-navs, however stupid it may seem to others.
By Sam Hendriksen