The six risks of dark data
It’s sensible to think that dark data, which is out of sight and out of mind, could contain unknown risks. […]
Every business has its own individual IT problems, depending on the service it provides and the nature of its client base, but when it comes to active digital preservation the dilemma affects every sector and every size of company.
Technology now moves at such a fast pace that anyone, from families who want their photographs and home videos to be accessible in 10 years’ time to global corporations trying to protect their corporate memory, could find that even information which needs to be read five years from now is left behind if no plan is in place to prevent obsolescence.
In the corporate world, the danger of failing to upgrade servers and hardware is far more serious: what is at stake could be a company’s entire corporate memory.
One of the worrying statistics from the Crown Records Management Survey was that only 60% of businesses reported that they were regularly upgrading their servers, which almost certainly leaves the others open to the risk of being unable to read key data in future.
Every business should be concerned, but there are certain sectors where this risk is higher, either because of legal requirements for data to be kept for long periods or because of the need to keep information to protect against litigation.
Building regulations require that data in this sector be kept long into the future, for at least 30 to 40 years. Detailed plans for the building of a bridge or flyover may be vital in future when emergency repairs are suddenly required – or, in the case of collapse, to prove that proper procedures were followed.
Not being able to access or read this data would have serious consequences. Contracts, AutoCAD diagrams and maps may all need to be indexed and searchable.
There are clear and strict guidelines for the retention of data in this sector; however, the focus of law firms is often on live data rather than on historical files. Just finding this information can be a problem, let alone reading it if the format is not regularly updated. In fact, on some occasions it has proved impossible for the defence to open information provided by prosecuting solicitors, even before you begin to think about how it can be accessed in years to come.
In our survey, 78% in the legal sector were concerned that data may not be readable in future. Patent and trademark data needs to be kept for 35 years from the point at which a patent is granted.
There is heavy pressure on the NHS to ensure patient data is digital by 2020, a target which looks unlikely to be met. You would think keeping patient data at the very least for the duration of a patient’s life would be vital – and yet our survey showed that 34% of respondents described ensuring data could be read for more than 50 years as ‘not that important’.
By contrast 77% admitted they were worried data in their industry would not be readable in future. There is a strong compliance issue in this industry – radiotherapy records, for instance, need to be kept for 75 years after the patient dies.
Records in this sector need to be kept indefinitely and that places a lot of pressure on departments to ensure they remain readable and accessible.
The development of new drugs is a data-heavy industry, not least because research has to be kept forever. Keeping that data readable is once again a challenge. When surveyed, 86% in this industry rated it ‘very important’ that data could be read in five years’ time – but that figure dropped to as low as 19% for data to be readable 30 years into the future.
Data in this industry should be kept for at least seven years, although in some circumstances the requirement is for much longer. What about a customer who joins a bank as a teenager and is still a customer at 80? No wonder 100% of respondents said it was important data should be readable in future.
Nevertheless, only half said they upgrade servers regularly – and a worrying 5% said they had no systems in place to preserve electronic information stored for more than five years.
This industry can often be overlooked in the active digital preservation debate but, surprisingly, PhDs need to be kept forever – a rather easier task when they were stored on paper and left in boxes. Digital versions are more accessible and searchable, which is progress – but will they still be readable for future generations?
Schools, which are always facing a budgetary squeeze, also face the challenge of keeping information readable for the future.