Safeguarding the nation’s digital memory: Towards a Bayesian model of digital preservation risk





The National Archives’ digital strategy commits us to ‘becom[ing] a digital archive by instinct and design’, to ‘measur[ing] preservation risks and publish[ing] the results’, and to being ‘transparent about our practice as the basis for trust in the digital archive’.

Current models for managing digital preservation are top-down, defining the functions that a system requires for long-term preservation of digital objects. As early as 2005, Rosenthal et al. suggested that we should additionally develop risk models and describe how a system is designed to mitigate and protect against those risks. We now have models such as SPOT (Simple Property-Oriented Threat) and DRAMBORA (Digital Repository Audit Method Based on Risk Assessment), but these are essentially qualitative and do not lend themselves to comparing very different types of risk, nor to examining the relationships and interplay between risks.

This presentation outlines our work, from initial experimentation and prototyping through to the co-development and design of a framework fit for managing digital preservation risk in the age of the next-generation, disruptive digital archive. Using a Dynamic Bayesian Network, a technique already employed in a wide variety of fields of comparable complexity (from banking capital adequacy assessment to improving pollinator abundance for food security), we can:

  • Formulate archivists’ and other practitioners’ expert judgement into robust quantitative measures of risk and combine them with available hard statistical evidence
  • Describe and explain complex and interdependent risk events and mitigations and their impact on preservation outcomes
  • Compare and prioritise diverse threats to the digital archive
  • Present understanding of risk in an intuitive graphical form, and easily compare the impact of different actions
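
The ideas in the list above can be illustrated with a minimal sketch. It is not the model under development: it is a small static Bayesian network (a single time slice rather than a dynamic network), and every node name, probability and count in it is a hypothetical value chosen only for illustration. It assumes an expert prior on storage media failure is updated with observed failure counts, a second risk (format obsolescence) comes from expert judgement alone, and two mitigations (replication and migration) change the conditional probability of loss.

```python
from itertools import product

def beta_mean(prior_a, prior_b, events, trials):
    """Posterior mean of a Beta(prior_a, prior_b) prior after `events` in `trials`."""
    return (prior_a + events) / (prior_a + prior_b + trials)

# Hypothetical: experts judge annual media failure to be near 10% (Beta(2, 18));
# hypothetical monitoring logs then record 3 failures in 80 media-years.
P_MEDIA_FAILURE = beta_mean(2, 18, events=3, trials=80)

# Hypothetical expert-elicited probability that a file format becomes obsolete.
P_OBSOLETE = 0.2

def p_loss(media_failure, obsolete, replicated, migrated):
    """Conditional probability table: P(content lost | risk events, mitigations)."""
    p_bits = 0.0
    if media_failure:
        p_bits = 0.1 if replicated else 0.9    # replication softens a media failure
    p_render = 0.0
    if obsolete:
        p_render = 0.05 if migrated else 0.6   # migration softens obsolescence
    # Loss occurs if either pathway causes it (a noisy-OR combination).
    return 1 - (1 - p_bits) * (1 - p_render)

def risk(replicated, migrated):
    """Marginal P(content lost), summing over every combination of risk events."""
    total = 0.0
    for media_failure, obsolete in product([True, False], repeat=2):
        weight = (P_MEDIA_FAILURE if media_failure else 1 - P_MEDIA_FAILURE) \
               * (P_OBSOLETE if obsolete else 1 - P_OBSOLETE)
        total += weight * p_loss(media_failure, obsolete, replicated, migrated)
    return total

# Compare the effect of different actions on the same scale.
for replicated, migrated in product([False, True], repeat=2):
    print(f"replicated={replicated}, migrated={migrated}: "
          f"P(loss)={risk(replicated, migrated):.4f}")
```

Because both threats are expressed as probabilities of the same outcome, very different risks and mitigations can be ranked against each other, which is what a purely qualitative model makes difficult.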

To gain as broad a view of the risks as possible, we are co-creating the risk model with a group of archives of varying types and sizes, including county records offices, university archives and corporate archives, as well as contributing our own experience.

As well as improving record keepers’ own understanding and decision-making, the framework aims to allow us to influence stakeholders and budget allocators by demonstrating how different preservation approaches affect risk and how resource constraints affect our ability to deliver the best possible preservation outcomes. It will also bring new skills into the archives sector: archivists will be equipped with statistical skills and with techniques for eliciting expert judgement, so that we can continue to develop the model as the risk landscape changes.