Information on web archiving
The National Archives' role for the 21st century is to collect and secure the future of the public record, both digital and physical, to preserve it for generations to come, and to make it as accessible as possible. Public records can exist in any format and the open digital record is made available through a wide variety of online publishing platforms. Government now uses a range of digital media to engage, consult and interact with the citizen online.
The National Archives is preserving this rich seam of information by archiving UK central government websites and their social media presence in the UK Government Web Archive (UKGWA).
On this page you can find information that will help you locate what you need in the web archive, use the archived websites and the social media archive, and use Memento to see what a website looked like at a given point in the past. You can also read our statement on the re-use of content accessible through the web archive, and about the development of the UK Government Web Archive, its technical limitations and the takedown policy.
Finding what you need in the Web Archive
You can locate a specific website using our A-Z index.
You can perform a full-text search across our collection using the advanced search (beta).
You can browse the collection by category or via 'Themed collections'.
You may be automatically redirected to the UK Government Web Archive if the web page you're looking for is no longer current. We've put a banner at the top of each archived page and [ARCHIVED CONTENT] in your browser's title bar so you'll know that you are not on a live site.
The UK Government Web Archive contains some partial, or incomplete, copies of sites. If you find an archived site which seems incomplete you may be able to find a more complete version by trying a different date.
Tips on finding content in the UK Government Web Archive (PDF, 0.47Mb)
We have also developed a beta version of an application programming interface (API) to enable searching at scale. This blog post gives an example of how it can be used. Please get in touch if you would like to use the API.
If you need any advice about finding the most complete archived version of a site or have any other queries about finding information in the UK Government Web Archive, please email: email@example.com.
Using archived websites
You can often use and browse archived websites as you would on a live site. Due to limitations in the software that harvests website content, some functionality is not available in archived versions of sites. This includes search and most drop-down menus as well as some interactive features such as forms and questionnaires.
All the sites in our collection have been archived individually. External links in archived websites will not work.
Some of the early archived websites were collected using experimental technology and therefore only gathered partial content, often without images.
You may find that some archived websites have more than one listing. This is either because the name of the department and website have changed (for example, the Public Record Office is now The National Archives), or because the website address has changed (for example, the Department of Health website was hosted at http://www.doh.gov.uk/ and http://www.dh.gov.uk/).
Contact details on archived websites, such as email addresses and telephone numbers, may not be current. Using outdated contact details may result in a delay in obtaining the information you require. We recommend that you use contact details found on the live web for organisations which still exist or contact the department now responsible for the functions of an organisation which has closed. If you have any queries please contact the Web Continuity team who will be happy to advise.
Memento in the UK Government Web Archive
Memento is a tool which allows users to see a version of a web resource as it existed at a certain point in the past. It was originally developed by researchers at Los Alamos National Laboratory in the USA and has now been made available for use in several web archives.
In order to use Memento you will need to install the Firefox web browser on your machine, which can be downloaded from the Mozilla website. You will then need to install the MementoFox Add-On and follow the instructions below to use Memento in the UK Government Web Archive.
- Click on the '?' button on the far right of the Memento timeline bar. The MementoFox Preferences dialogue box will open.
- Select the 'Timegates' tab.
- Double click on the line in the 'Timegate' window which reads 'New'.
- Copy and paste the following: http://webarchive.nationalarchives.gov.uk/timegate/
- Click on the 'Up' button to move the text you have just added to the top of the list. Two timegate URLs should now appear in the list with the webarchive.nationalarchives URL at the top.
- Click the 'Save' button in the 'Timegate' window and then close the 'MementoFox Preferences' window.
MementoFox is now ready to use. Type the URL of the resource you want to view in the Firefox address bar. Then either slide the slider or alter the date in the date box to move back and forward through time.
For example, to view all versions of the HM Revenue and Customs website, type the URL of the website (http://www.hmrc.gov.uk/) in the address bar and then move the slider, or enter a specific date, to view the website at some time in the past.
The plugin is also available for Chrome.
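The browser add-on works by sending the date you choose to the TimeGate in an Accept-Datetime request header, as defined by the Memento protocol (RFC 7089). The following is a minimal sketch of the same request built by hand. It assumes that the TimeGate accepts the original URL appended directly to the TimeGate address given above; the function name is our own and the request is only constructed, not sent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request

# TimeGate address quoted in the MementoFox instructions above.
TIMEGATE = "http://webarchive.nationalarchives.gov.uk/timegate/"


def build_timegate_request(original_url: str, when: datetime) -> Request:
    """Build (but do not send) a Memento TimeGate request.

    Assumption: the TimeGate takes the original URL appended to its base
    address. build_timegate_request is an illustrative name, not part of
    any published tool.
    """
    # Accept-Datetime carries an HTTP-date expressed in GMT (RFC 7089).
    http_date = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return Request(TIMEGATE + original_url,
                   headers={"Accept-Datetime": http_date})


request = build_timegate_request(
    "http://www.hmrc.gov.uk/", datetime(2010, 6, 1, tzinfo=timezone.utc)
)
print(request.full_url)
# urllib stores header names with only the first letter capitalised.
print(request.get_header("Accept-datetime"))
```

Passing the request to urllib.request.urlopen would send it. A Memento-compliant TimeGate is expected to respond with a redirect to the archived copy closest to the requested date.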
Using the social media archive
The National Archives has developed automated tools to efficiently capture and provide suitable access to social media content. Thousands of videos and over 65,000 tweets originally published online by UK central government organisations were captured during the pilot stages of a two-year project. This collection will continue to grow alongside our wider web archiving activities. We expect to capture new video content annually, and we plan to monitor the volume of tweets produced by government and archive Twitter accordingly.
The social media archive follows the principles of our approach to web archiving by preserving the open digital record in a way that keeps it accessible, retains its context and makes it available for reuse. The earliest archived content available dates from 2006 and covers some major events in our recent history, including the London 2012 Olympic Games, and gives an insight into how government is using these digital tools to communicate.
Our Twitter archiving activity has been guided by the following rules that have informed our approach to building effective technical solutions that can work at scale:
- In: The tweets made by UK central government organisations and the official London 2012 Olympic and Paralympic Games accounts are captured. Where these tweets contain a link to web content that is included in the UK Government Web Archive users can generally expect the link to behave almost as it would on the live web as it will resolve in full to an archived version of that website.
- Out: Re-tweets made by these government accounts are excluded. Tweets sent from non-government accounts that form part of a conversation on Twitter but don't appear in the API for the accounts we're collecting (e.g. replies, or tweets directed at the government accounts) have not been preserved. Tweeted links that point to web domains outside the scope of our other web archiving activity (e.g. newspaper websites) will lead to a 404 or 410 error message. The destination of the link remains visible, either in the address bar of the browser or within the error message itself, so it is possible to locate the material elsewhere.
The beta version of the video archive includes a search function that searches across the video titles, as given by the publishing department. The Twitter content does not have a search option at present but it is possible to use the JSON and XML files we have published to interrogate and analyse the information contained in the tweets.
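As an illustration of the kind of analysis the published files allow, the sketch below counts tweets per month from a JSON list. The sample data and the field name created_at (with its date format) are assumptions modelled on common Twitter API output; the structure of the published files may differ, so check a file before adapting this.

```python
import json
from collections import Counter
from datetime import datetime

# Hypothetical sample records; the real published files may use a
# different structure and different field names.
SAMPLE = """[
  {"created_at": "Fri Jul 27 20:00:00 +0000 2012", "text": "example tweet one"},
  {"created_at": "Sun Aug 12 21:00:00 +0000 2012", "text": "example tweet two"},
  {"created_at": "Wed Aug 29 19:30:00 +0000 2012", "text": "example tweet three"}
]"""


def tweets_per_month(raw_json: str) -> Counter:
    """Count tweets by year and month (YYYY-MM) of posting."""
    counts = Counter()
    for tweet in json.loads(raw_json):
        # Assumed timestamp layout: weekday, month, day, time, offset, year.
        posted = datetime.strptime(tweet["created_at"],
                                   "%a %b %d %H:%M:%S %z %Y")
        counts[posted.strftime("%Y-%m")] += 1
    return counts


counts = tweets_per_month(SAMPLE)
for month, total in sorted(counts.items()):
    print(month, total)
```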
Re-use of content accessible through the UK Government Web Archive
Most, but not all, of the websites accessible through the UK Government Web Archive (UKGWA) were created by Crown bodies and are Crown copyright. Most of the archived content of these websites and services is also Crown copyright. Unless otherwise stated, you may re-use Crown copyright material obtained from the UKGWA freely under the terms of the Open Government Licence.
Where websites have used third party (non-Crown) material the copyright status of this material should be clearly stated on the site, either attached to or embedded within the material itself or on the copyright page on said site. In such cases the third party content is not re-usable under the Open Government Licence and the onus for obtaining the consent of the copyright owner rests with the person or organisation who wishes to re-use it.
Please note that the Open Government Licence does not permit the re-use of personal information and that photographs that depict an identifiable individual can constitute personal data for the purposes of the Data Protection Act.
In addition to the above, further restrictions apply to the reuse of material originally published by government bodies, such as the Ministry of Defence (MoD), that have been granted a delegation of authority by the Controller of Her Majesty's Stationery Office (HMSO). For example, material published on MoD websites may be reproduced for the purposes of non-commercial research or private study and for the purposes of reporting current events only, unless other terms are set out against the respective content. You should check the relevant MoD copyright licensing information before assuming it is acceptable for you to copy and / or re-use the material under the Open Government Licence.
The National Archives does not warrant that all third party content is appropriately marked. The re-use of any copyright material that is not clearly identified as being Crown copyright is not authorised by The National Archives. It is your responsibility to ensure that you have any necessary permission for the re-use of copyright material obtained from the UK Government Web Archive.
Development of the UK Government Web Archive
Continuing access to online documents
Archiving websites helps to ensure continuing access to government's online information. We ask government webmasters to help us provide a web continuity service that enables online access to government information over time. Webmasters can do this by installing a simple piece of software that will redirect users if the information they are seeking has been moved or removed from its original location.
Web continuity means not getting a 'page not found' error message when you click on a web link on a government website, even if the information linked to has been removed, or moved.
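The redirect decision can be sketched in a few lines. This is an illustration only, not the software supplied to webmasters: the function name is hypothetical, and the archive prefix simply reuses the TimeGate address given in the Memento instructions on this page as an assumed entry point.

```python
from typing import Optional

# Assumed archive entry point: the TimeGate address quoted in the Memento
# instructions on this page. The deployed redirect software may use a
# different address.
ARCHIVE_PREFIX = "http://webarchive.nationalarchives.gov.uk/timegate/"


def continuity_redirect(status: int, requested_url: str) -> Optional[str]:
    """Return the archive URL to redirect to, or None to serve the live page.

    continuity_redirect is a hypothetical name used for illustration.
    """
    # 404 (not found) and 410 (gone) mean the page has been moved or removed.
    if status in (404, 410):
        return ARCHIVE_PREFIX + requested_url
    return None


print(continuity_redirect(200, "http://www.example.gov.uk/report.pdf"))
print(continuity_redirect(404, "http://www.example.gov.uk/report.pdf"))
```

A live page is served as normal, while a missing one sends the user to the archive instead of showing a 'page not found' message.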
The web continuity project, which devised the technical solution now managed by The National Archives, was established in 2007 to address concerns raised by Jack Straw, then Leader of the House of Commons, about broken links in Hansard. Membership included The National Archives, the British Library, the Parliamentary Libraries, the Central Office of Information, and website managers from several government departments. Find out more about web continuity.
We also provide advice to government website managers on how best to design and maintain their websites for archiving purposes. We deliver guidance on the software that will enable more comprehensive capture of website content. Please see the Information for webmasters page for further details.
Archiving of government websites
Our web archiving programme began in 2003. We originally harvested around 50 selected government websites using the Internet Archive, a not-for-profit specialist organisation.
When we entered into a contract with the Internet Archive, we gained access to their back catalogue. This means that you can find some sites dating as far back as 1997. Please note that archived sites hosted by the Internet Archive do not display the red UK Government Web Archive banner.
Since 2005 archiving has been carried out under contract to the Internet Memory Foundation, a not-for-profit specialist web archiving organisation, founded to build an 'internet library' for researchers, historians and scholars.
We started taking 'snapshots' of government websites due to close under the government's website review programme, which started in January 2007. The programme aimed to reduce the increasing number of government sites in order to provide a clearer and more user-friendly service for the public.
From November 2008 we began to archive a larger number of sites and to archive some sites at an increased frequency to support our Web Continuity Initiative.
More recently, the UK Government Web Archive has been supporting the Government Digital Service in closing websites and moving them to GOV.UK.
We archive sites according to a regular schedule. If you need any further information about our archiving schedules, please email firstname.lastname@example.org.
Work with the UK Web Archiving Consortium
The National Archives was a founder member of the UK Web Archiving Consortium which, between 2004 and 2009, worked to develop a common shared infrastructure for the selective archiving of websites. Partner organisations were The British Library, The Wellcome Trust, the Joint Information Systems Committee (JISC), The National Library of Wales and The National Library of Scotland.
Websites selected for archiving by The National Archives during this period are available through the UK Web Archive or the UK Government Web Archive.
The UK Web Archiving Consortium disbanded in 2009 and The National Archives now works with its successor organisation, the Web Archiving and Preservation Task Force, which aims to build on the work of the consortium by drawing on the expertise and experience of many organisations involved in web archiving. The task force has been operational since 2010. Find out more about this group.
Limitations of web crawling technology
New web technologies and trends develop rapidly, so solutions have to be found to enable their successful archiving. Web archiving requires continual research into both capture (retrieving and storing data) and access (presenting the archived data in the archived site).
Web crawlers are often unable to capture dynamic web pages (pages that are generated via a database in response to a user's request) and search functionality, as these require interaction with a 'hidden' database. To give web archiving the best chance of success, webmasters are encouraged to use a variety of means for both accessing and presenting content.
The National Archives has a takedown policy, which explains the circumstances in which material will be taken down from websites. See a list of pages removed from the UK Government Web Archive under the policy.
We welcome feedback from you on any issues relating to particular sites. If you want to tell us about any issues, or have any comments on the UK Government Web Archive in general, please email us: email@example.com
This page contains PDF files. See plug-ins and file formats for help in accessing these file types.