Digital Services and carbon emissions in the heritage sector: some preliminary findings

Introduction

Digital services make a contribution to an organisation’s carbon footprint greater than simply the electricity use in the building from powering servers (and their associated cooling), laptops, monitors, Wi-Fi and terminals and screens for visitors. On-site electricity use is relatively easy to monitor, thanks to smart meters and billing and it is relatively easy to derive carbon data from such figures.

Digital services also make use of cloud storage, where electricity use cannot be directly monitored, and it is emissions from these services that will be particularly considered here.

The purpose of this report is to examine the carbon footprint of off-site digital services used by The National Archives using real, live data from those services and to consider this footprint in the context of overall emissions for large heritage organisations. In carrying out this exercise recommendations can be made on carbon reduction and monitoring.

There are many aspects to sustainability in the context of the climate emergency. However, this paper is specifically interested in understanding the scale of carbon emissions, so that they can be reduced quickly and effectively and so that extensive effort is not spent on achieving only modest reductions.

Compiling this report required the assistance of colleagues from across The National Archives – and this will be true of all future carbon monitoring at The National Archives. Very considerable thanks are due to colleagues in IT Services, Web Archiving, Digital Services and the Chief Executive’s Office in supporting the research for this report. Particular thanks are due to Steve Hirschorn, Tom Storrar, James Noble and Kate Todd.

1. Scoping digital emissions

The Greenhouse Gas (GHG) Protocol, developed by the World Resources Institute and the World Business Council for Sustainable Development, is a global corporate accounting and reporting standard for measuring emissions and is widely adopted, including by the largest members of the National Museum Directors’ Council.

Under the protocol, emissions are grouped into three ‘scopes’. Scope 1 emissions essentially encompass physical energy use on site: fuel oil, natural gas, or coal at a heritage railway. It is difficult to see how digital activity can generate scope 1 emissions, though exotic server cooling systems using particularly damaging gases could theoretically meet the criteria. Generally, IT will not be a contributor to scope 1 emissions.

Scope 2 emissions are those from energy that is purchased but generated elsewhere – so electricity use falls into scope 2 emissions and on-site IT infrastructure (servers, monitors, terminals, hubs, charging laptops etc.) will be a contributor here.

Scope 3 emissions are emissions generated throughout an organisation’s supply chain and are usually said to be the largest part of an organisation’s carbon footprint. It would be normal for them to constitute 80-90% of an organisation’s emissions. Cloud computing services and any other off-site IT infrastructure would form part of this. Business travel, such as flights, also falls under scope 3, and here IT can make a contribution to carbon reduction by providing services such as Zoom or Teams to reduce the need for staff travel. Waste generated, purchased goods and services, and investments also form part of an organisation’s scope 3 emissions.

2. Heritage organisations and their carbon footprints

The National Archives does not publish details of its carbon emissions by scope as part of its annual reporting. This is increasingly unusual amongst large cultural heritage organisations and some that have not done this in the past, such as the V&A, have announced that they will commence this shortly.

Examining what is published shows significant variation between organisations.

Emissions (tonnes of CO2)
Institution 2017-18 2018-19 2019-20 2020-21
National Trust 863,838 605,751
British Film Institute 19,684
Imperial War Museum 24,639 22,763 20,605 11,278
Natural History Museum 11,196 11,139 10,616 11,258
Royal Botanic Gardens, Kew 8,993 7,717 9,284 6,994
British Library 10,000 9,000 7,500 6,000
British Museum 8,516 7,080 7,164 5,861
National Gallery 5,762 5,391 4,716
English Heritage 3,993 3,591
Science Museum 3,548 3,563 2,541
National Maritime Museum 4,659 2,186
Historic Royal Palaces 4,605 1,993
National Library of Scotland 1,226 995 984 777

 

This data is a mixture. Some organisations work across multiple sites, some just one. Some organisations scrupulously consider many elements of their scope 3 emissions – figures designed to represent their whole supply chain activity. Many ignore this. The Imperial War Museum, for example, explicitly only includes business travel and no other impacts in its scope 3 emissions. This means that even though it appears to be one of the worst performing institutions, its emissions are actually significantly higher than they appear to be.

These figures mean that, even though similar figures for The National Archives are not immediately available, we can estimate The National Archives’ annual carbon emissions (calculated on a similar basis – i.e., involving very incomplete scope 3 data) as likely lying in the range of 2-10k tonnes of CO2 annually.

This will help us put the other figures in this report into perspective.

3. The National Archives in the Cloud

The National Archives uses a range of cloud services to provide storage and compute capacity in different parts of the business. Microsoft’s Azure platform and a range of Amazon Web Services are used. Fortunately, both Microsoft and Amazon supply tools to measure the carbon emissions from the use of their platforms. The data in this simple analysis has been extracted from these tools: Microsoft’s Sustainability Calculator and Amazon’s Customer Carbon Footprint tool. These tools are freely available to users of these platforms. Carbon is calculated based on the organisation’s actual use of storage and compute in the context of the data centre infrastructure in which the information is being held.

Examination of these services provides the following data:

Function | Service | Period | Period (months) | Emissions (MTCO2e) | Monthly emissions | 2022 projection
Digital services | Amazon services | Feb – Apr 2022 | 3 | 2.8 | 0.93 | 11.2
Web archive (The National Archives internal) | Amazon EMEA | Jan 2020 – Jan 2022 | 25 | 0.5 | 0.02 | 0.24
Web archive (Mirrorweb) | Amazon S3 | Sept 2021 – Jan 2022 | 5 | 0.2 | 0.04 | 0.48
Web archive (Mirrorweb-Main) | Amazon services | Jan 2020 – Jan 2022 | 25 | 0.8 | 0.03 | 0.38
TOTAL (web archive) | | | | | 0.09 | 1.10
IT services – general | Azure | Jan – June 2022 | 6 | 0.89 | 0.15 | 1.78
IT services – Microsoft 365 | Azure | Jan – June 2022 | 6 | 1.177 | 0.20 | 2.35
TOTAL (IT services) | | | | | 0.34 | 4.13
Projected total for 2022 | | | | | | 16.44

 

This gives us an annual total of around 16 tonnes of CO2 for The National Archives’ cloud services – or a mere 0.8% of the organisation’s emissions even if we take the lowest estimate in the range above (2,000 tonnes).
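
The projections in the table can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the method used by the Microsoft or Amazon tools: it simply scales each reported figure to twelve months, assuming usage stays constant over the year, and compares the total with the 2-10k tonne range estimated earlier. The function and variable names are illustrative.

```python
# Rows from the table above: (function, period in months, reported emissions in tonnes CO2e).
ROWS = [
    ("Digital services (Amazon)", 3, 2.8),
    ("Web archive, internal (Amazon EMEA)", 25, 0.5),
    ("Web archive, Mirrorweb (Amazon S3)", 5, 0.2),
    ("Web archive, Mirrorweb-Main (Amazon)", 25, 0.8),
    ("IT services, general (Azure)", 6, 0.89),
    ("IT services, Microsoft 365 (Azure)", 6, 1.177),
]


def annual_projection(emissions_tonnes, months):
    """Scale emissions reported over `months` to 12 months, assuming constant usage."""
    return emissions_tonnes / months * 12


projections = {name: annual_projection(e, m) for name, m, e in ROWS}
projected_total = sum(projections.values())

# Share of the organisation's estimated footprint at both ends of the 2-10k tonne range.
share_of_low_estimate = projected_total / 2_000 * 100    # footprint of 2,000 tonnes
share_of_high_estimate = projected_total / 10_000 * 100  # footprint of 10,000 tonnes

for name, value in projections.items():
    print(f"{name}: {value:.2f} tonnes/year")
print(f"Projected total: {projected_total:.2f} tonnes/year")
print(f"Share of footprint: {share_of_high_estimate:.2f}% to {share_of_low_estimate:.2f}%")
```

Because the reporting periods differ (from 3 to 25 months), the projection is only as good as the assumption that each service's monthly usage is stable.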

On the other hand, 16 tonnes of CO2 is roughly the amount emitted in the manufacture of two Ford Focus Titaniums or twice the average carbon emissions of a citizen of the People’s Republic of China (Berners-Lee, ‘How bad are bananas: the carbon footprint of everything’, 2nd edition, p.145 and p.148). Colleagues must ensure they are making responsible and efficient use of these services.

4. What is being covered by these numbers and what is not?

These figures represent a broad, but by no means complete, slice of The National Archives’ digital activity. Many digital services are still run ‘on-prem’ on local servers and not in the cloud. The figures from Digital Services cover the main static content of The National Archives’ website and much of legislation.gov.uk. But the Discovery catalogue and Archives Media Player, for example, have not yet been migrated. So, this figure is very likely to rise in the future.

IT Services largely use Azure for infrastructural purposes, but our Microsoft 365 instance is also here: email (Exchange), Teams and SharePoint are key functions covered by these quantities. (In fact the tool is still in development, so these and OneDrive are the key services included, with some other low-intensity or low-use services, such as PowerBI and Office, excluded.) We can see that The National Archives’ direct communications are a far lower contributor of emissions than web-based services.

Finally, the data from the Web Archives covers, as far as I have been able to ascertain, every aspect of the storage and provision of content. These emissions, for a web archive with a size on disk today of around 285 TB, are highly reassuring because recent CIPFA data (2020/1) gives some sizes of born-digital archival holdings for repositories with established digital preservation programmes. Gloucestershire’s holdings, for example, were given as 1.3 TB, Essex as 2.2 TB, Dorset as 2.6 TB, Norfolk (the largest) as 3.3 TB. This suggests that carbon emissions from these digital collections are likely very small. We see here the value of accurate data over estimates. The consultancy Wholegrain Digital have estimated the carbon footprint of 1 TB of cloud storage as being 2.7 kg of CO2 per year. Their source data makes clear in context that this includes ‘compute’ (i.e., serving or operating over some portion of the data) and not just the data ‘at rest’ on some server or drive somewhere. For seldom accessed data, this value, it seems fair to say, would be lower. For volatile (rapidly changing) or very regularly accessed sources of data it would likely be higher.

On Wholegrain Digital’s metric, storing and accessing a web archive of around 285 TB would be expected to generate carbon emissions equivalent to about 770 kg of CO2 per year (0.77 tonnes), but we see here that the projected figure from the providers’ own tools (about 1.1 tonnes) is higher than this rough value, by a factor of roughly 1.4.

Some digital archives are quite large. The UK Web Archive, based at the British Library, consisted of around 500 TB in 2017 and probably exceeds 850 TB today. (I assume this value incorporates some data compression.) This is still less than 1 petabyte and would not be considered especially large by a company focused around ‘big data’. Even at this scale, however, it is hard to see that the quantities involved are of much significance in the current context of the British Library’s overall emissions.
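
The per-terabyte estimate discussed above can be applied to any of the archive sizes quoted in this section. The sketch below assumes the 2.7 kg per TB per year figure applies uniformly, which the text has already noted is unlikely to hold for seldom accessed or very heavily accessed data; the function name and dictionary are illustrative only.

```python
KG_CO2_PER_TB_YEAR = 2.7  # Wholegrain Digital's estimate quoted above (includes some compute)


def storage_emissions_kg(size_tb, kg_per_tb_year=KG_CO2_PER_TB_YEAR):
    """Rough annual emissions in kg of CO2 for `size_tb` terabytes of cloud storage."""
    return size_tb * kg_per_tb_year


# Archive sizes in TB, as quoted in the text.
ARCHIVES_TB = {
    "Gloucestershire": 1.3,
    "Essex": 2.2,
    "Dorset": 2.6,
    "Norfolk": 3.3,
    "The National Archives web archive": 285,
    "UK Web Archive (British Library)": 850,
}

estimates_kg = {name: storage_emissions_kg(tb) for name, tb in ARCHIVES_TB.items()}

for name, kg in estimates_kg.items():
    # Print both kg and tonnes so the figures can be compared with the tables above.
    print(f"{name}: {kg:,.1f} kg CO2/year ({kg / 1000:.3f} tonnes)")
```

On this rough metric the county record office holdings each come out at under 10 kg of CO2 per year.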

5. Reducing the carbon footprint of public-facing web services

If we did want to tackle what appears to be the core component of The National Archives’ cloud emissions, is there obvious scope to reduce this? The data would suggest that there may be, but modelling this might be a project in and of itself. A tool such as websitecarbon.com aims to provide figures for the carbon emitted by the data transfer of a webpage and some representation of its footprint at rest in a data centre. Calculating these values remotely relies on a set of approximations different from those above. The tool is perhaps helpful in establishing some sense of relative emissions between page types and websites, but less robust in deriving absolute figures, as we have seen above.

As an illustration, loading The National Archives’ homepage is said to generate 1.47g of carbon, compared to 2.93g for the National Library of Wales. The National Archives’ education home page weighs in at 3.14g. However, a low-level page on The National Archives’ website (our opening times, for example) gives a reading of 0.08g. Figuring out how many pages are ‘dirtier’ and how many ‘cleaner’, and what could be done to make the former more resemble the latter (frequency of use would also obviously be a factor), could be an opportunity to embed lower-carbon coding practice into the organisation now so that, as carbon reduction at The National Archives gathers pace, colleagues in Digital Services are well placed to reduce the footprint of web services accordingly. In reducing page sizes, these changes will benefit users with slower connections or tight data limits (likely those in rural or low-income households) and may result in more accessible design: in other words, there are wider benefits to ‘decarbonising’ web front end services beyond the carbon reduction itself.
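
One way to combine per-page readings with frequency of use is to weight each page's grams per load by its page views and rank the results. The sketch below uses the per-load readings quoted above, but the monthly view counts are hypothetical placeholders, not real analytics for The National Archives' website; the function name is likewise illustrative.

```python
# Grams of CO2 per page load, as reported in the text.
GRAMS_PER_LOAD = {
    "homepage": 1.47,
    "education home page": 3.14,
    "opening times": 0.08,
}

# HYPOTHETICAL monthly page views: placeholders for illustration only, not real analytics.
MONTHLY_VIEWS = {
    "homepage": 100_000,
    "education home page": 20_000,
    "opening times": 50_000,
}


def rank_pages_by_footprint(grams_per_load, monthly_views):
    """Return (page, kg of CO2 per month) pairs, largest footprint first."""
    totals_kg = {
        page: grams * monthly_views[page] / 1000
        for page, grams in grams_per_load.items()
    }
    return sorted(totals_kg.items(), key=lambda item: item[1], reverse=True)


ranking = rank_pages_by_footprint(GRAMS_PER_LOAD, MONTHLY_VIEWS)
for page, kg in ranking:
    print(f"{page}: {kg:.1f} kg CO2/month")
```

With real view counts substituted, a ranking of this kind would show whether a heavy but rarely visited page matters more or less than a light but popular one.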

At the same time, the data from Amazon already makes clear what the total impact of these page emissions actually is, and we can see that in the context of The National Archives’ total emissions it is relatively trivial.

It may also be considered worthwhile to assess the impact of The National Archives’ use of third-party social platforms such as YouTube and TikTok. Uploading and downloading content to and from these platforms by our audiences also forms part of The National Archives’ scope 3 emissions and is not as easily monitored as these other cloud-based services. However, rough values can be calculated for a major platform for illustration. In the last year (2021/22), visitors to The National Archives’ YouTube channel consumed roughly 10,000 hours of streaming video produced and published by us. As an apparently resource-hungry activity, this total might give us pause. However, studies by the International Energy Agency and Carbon Trust have put the emissions associated with an hour of streaming video at 36g and 55g respectively. This puts The National Archives’ emissions in this area at roughly 0.36 to 0.55 tonnes per year, lower than the projected total for any of the three functions discussed above (digital services, web archive and IT services).
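
The streaming estimate is a straightforward unit conversion, sketched below under the assumption that the per-hour factors quoted above apply evenly to all viewing. The constant and function names are illustrative.

```python
HOURS_STREAMED = 10_000  # approximate YouTube viewing hours for 2021/22, from the text
GRAMS_CO2_PER_HOUR = {"low": 36, "high": 55}  # per-hour estimates quoted in the text


def streaming_emissions_tonnes(hours, grams_per_hour):
    """Convert viewing hours and a grams-per-hour factor into tonnes of CO2."""
    return hours * grams_per_hour / 1_000_000  # 1,000,000 g in a tonne


streaming_range = {
    label: streaming_emissions_tonnes(HOURS_STREAMED, grams)
    for label, grams in GRAMS_CO2_PER_HOUR.items()
}

print(
    f"Streaming video: {streaming_range['low']:.2f} to "
    f"{streaming_range['high']:.2f} tonnes CO2 per year"
)
```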

These impacts are important to consider and calculate but clearly – without new and significant live data – The National Archives’ streaming video use should not lead its carbon reduction efforts.

6. Digital licensing – how far does scope 3 extend?

One potential aspect of scope 3 emissions not measured here is the footprint of licensed digital services. The scope 3 standard makes clear that a company’s franchises should be included in its scope 3 emissions. This makes sense – clearly McDonald’s is responsible for the carbon emissions generated not just by its corporate headquarters and supply chain but also all of its restaurants, however the business is structured. However, it is less clear that licensing IP creates a direct responsibility for any resultant carbon emissions generated, particularly online. In other words, it is not obvious, under the Greenhouse Gas Protocol, whether or not The National Archives is responsible in whole or in part for the carbon emissions of Ancestry or Find My Past. It is also not obvious whether or not The National Archives should count carbon emissions from a product such as Gale-Cengage’s State Papers Online as part of its carbon footprint, even where the site is wholly produced from The National Archives’ content.

It is clearly the case that without The National Archives’ IP, these sites and their resulting emissions would not exist or would be reduced. Determining the precise boundaries of The National Archives’ digital carbon footprint will require external adjudication.

At present, as far as I am aware, The National Archives has no mechanism for obtaining carbon data from licensees.

Conclusion

Upon examination, the specific kinds of digital services examined in this report are to some extent a distraction. They represent a very small proportion of a significantly resourced heritage institution’s carbon footprint. If we are looking for areas where significant carbon reductions could be made quickly, they are not to be found here.

The evidence is that hosting digital services on site results in more carbon emissions than a sensibly located (i.e., in a territory with a high proportion of electricity generated from renewables) cloud host and that, where it might be felt that migrating services simply migrates emissions from scope 2 to scope 3, in practice cloud providers can offer the same storage and compute with lower emissions. Amazon in particular reports its view of the carbon ‘saved’ by using its services rather than your own, but these are estimates and should not be regarded as robust.

In the transition phase, as services are migrated, it is possible digital emissions will rise as ‘on-prem’ servers and cloud compute are run together, where before only ‘on-prem’ were used. Organisations should strive for a sensible balance between using physical media (such as LTO magnetic tape), cloud storage and ‘on-prem’ storage. From a climate perspective, tape has a strong advantage, but it may offer less ease of access and not be suitable in every setting.

The lessons for archives would appear to be:

  • Do not regard your digital assets as an immediate priority for carbon reduction – your organisation’s footprint lies elsewhere. ‘Greening’ your IT is more likely to be a matter of reducing hardware waste and using more energy efficient devices than changing the way you handle data.
  • At the same time, do be mindful about what digital assets are being stored and where.
  • Consider moving digital services into the cloud rather than running carbon intensive infrastructure on site.
  • Recall standard archival practice about reducing duplication, assessing and weeding material.
  • Make use of tools which can measure cloud carbon use.
  • Collect and publish emissions data in robust and transparent ways.

If you do not have an accurate picture of the carbon intensive parts of your organisation you risk publishing and acting upon unreliable data and failing to make the most crucial changes for tackling the climate emergency.

First published and last edited in August 2023