Episode one: Knowledge organisation and visibility

In this episode of the Annual Digital Lecture audio series, we hear from a range of colleagues about the ways in which care is embedded in archival and knowledge organisation practices, as well as in online engagement.

We explore this by thinking about how we create and preserve records alongside accessibility and inclusivity from the perspective of metadata and research, and through the creation of online content published on our website, such as blogs, articles, and online galleries.


Transcript

[Start of recorded material at 00:00:00]

Lily: Hello and welcome to the Annual Digital Lecture podcast. My name is Lily and I’m the academic engagement officer here at The National Archives in Kew. For the past six years, The National Archives has hosted the Annual Digital Lecture, an event where we welcome leading speakers to talk about digital research and practice in addition to highlighting some of the innovative digital work happening here at The National Archives.

This past year’s lecture, hosted on the 28th of November 2023, was delivered by the Creative Studio Identity 2.0 who explored self-care and memory making in a digital age. Building on these discussions from November, this podcast explores how care is part of the innovative digital work happening at The National Archives. In each episode, we’ll have a conversation with colleagues across a range of different departments and teams who will lend their expertise to talk about how the digital and care intersect in their everyday work.

From the care for our records through preservation and processes, to the care for people who work with and use our records. Join us to reflect and to look to the future. In this first episode, we’ll be hearing from a range of colleagues about the ways in which care is embedded in archival and knowledge organisation practices, as well as through engagement. We’ll explore this by thinking about how we create and preserve records, alongside accessibility and inclusivity from the perspective of metadata and research and through the creation of online content published on our website, such as blogs, articles and online galleries.

So today we’re joined by Andrew Janes, who is head of archival practice and data curation; Grace van Mourik, who is a senior archivist; Rachel Gardner, who is a senior digital transfer advisor; and Ashley March, who is a content designer. Our first guest: Andrew, I hope that you’re happy to kick us off. As head of archival practice and data curation, can you first of all tell us a little bit more about your work?

Andrew: Yes. So, my job description says that I am the service owner for cataloguing operations, policies, and practices. So I spend a lot of time advising colleagues about cataloguing-related things. So, the catalogue is essentially the inventory of our collections and it’s made available to search on our website. And it’s something that we are constantly adding to and improving. Much of our cataloguing work at The National Archives takes place within my own team but quite a lot is also done by other people.

When other government bodies transfer records to us, they do some cataloguing as part of that process. And many of our record specialists also run projects to improve the cataloguing of our existing collections. And that often involves supervising groups of volunteers to do that. The way that I see it, pretty much all of our cataloguing activity is collaborative in some way. So it’s not just individuals battling through mountains of old papers on their own. Definitely not. Many of our colleagues have very specialised combinations of knowledge and skills that are highly valuable.

And it often needs the time and effort of more than one person to create high quality metadata that describes our collections in a meaningful and helpful way and to get that into our online catalogue. So I’ve used the word metadata there because we think of our catalogue as being made up of data. And the historical record is also a form of data. So we regard catalogue entries as data, about data or metadata. We could also say that catalogues are meta records because they are records about records.

Or even we could say that they are paratexts because they are texts that support and provide a virtual framing for the texts of our archival collections. So hearing myself say paratext like that out loud, it sounds like quite over the top academic jargon. But I think there is a real sense in which a catalogue tells a kind of story about the records that we have in our care and it gives them a context.

Lily: So how do you think about care within the context of your work and the wider archival profession?

Andrew: Yes, I suppose if you ask people to name a caring profession, I expect the most popular answer would be nursing. I don’t think many people would say archiving. But as an archivist, I very much think of myself as belonging to a caring profession. I have a duty of care towards the collections of historical records, both physical and digital and also towards the people affected by those records. That’s people who want to use the records now and in the future but it’s also the people who made those records originally and the people whose lives and experiences that they reflect.

The traditional ways of thinking and talking about the role of the archivist can come across as quite combative, I think. For instance, we say that we have custody of records, which I always think makes us sound rather like we’re the history police. I think it’s now quite old fashioned, but I definitely still think it’s true to say that one of the archivist’s duties is the physical and moral defence of the records. Any archivist listening to this will likely recognise that phrase from the work of Sir Hilary Jenkinson, who was one of our predecessors.

Nowadays though, many of us prefer to think of our work to protect the records as being a form of care. So when I think about what Jenkinson called the moral defence of the records, I see it as part of our duty of care. So we need to preserve the integrity of the records as systematic evidence, through knowledge of the original context. In archivists’ jargon, we use words like original order and provenance. But in plainer language, the point is really that to understand our collections properly, people will often want to know the answers to questions like: where did these records come from? Who created them? How did they originally use them, and why?

Because that’s part of their story. I think many people would say that they love history but hate bureaucracy. But for those of us working with the historical records of state institutions, our collections are very much products of the machinery of government. Well, I say machinery, but what I really mean is that the records exist because of the actions of many generations of people who worked in public service. And that work certainly continues today all across government.

Lily: Thank you, Andrew. I think it’s really wonderful to start off this series of conversations thinking about care through the lens of the archivist and the archive which, as you’ve shown, can be understood in so many different ways. You also mentioned that The National Archives’ collections are products of the machinery of government, and we are the official archive and publisher for the UK government and for England and Wales. So I’d now like to take a step back and think about how records end up at The National Archives. And the small caveat here is that we’re talking about contemporary digital records as, Rachel, you advise on the transfer of digital records to the archives. So could you tell us a little bit about this process?

Rachel: Yes, absolutely. And I think it’s worth starting with the Public Records Act. So bodies subject to the Act, such as government departments and public inquiries, are required to select and transfer records of historical value to The National Archives for permanent preservation. And this is required to happen before those records are 20 years old. And if you think back to record creation 20 years ago, it’s really not a surprise that an increasing amount of these public records are now digital, rather than paper. And that’s really only set to increase.

So for both paper and digital records, there are steps that have to be carried out before we get to transfer, and that includes appraising and selecting the material, carrying out a sensitivity review and formalising any closures, and then getting to the preparation of material for transfer. So many records and information management teams across government might be really familiar with this process for physical records. But they might be looking at transferring digital records for the first time. And while there are the same broad steps to follow, the fact that these records are digital does have some pretty big implications for how these steps are carried out.

So Andrew referred to preserving the integrity of records, and the principle here is the same whether the record is physical or digital. We need to know the records are what they say they are, that there are measures to protect them physically, and that they’re accompanied by enough catalogue information or metadata to enable TNA to preserve them permanently. And also for the public to be able to find them and understand their context and content.

So to go a little bit further into the details, so for digital records, the transfer process is underpinned by several tools. So the most important of these is a tool called DROID which is a file format identification tool that was developed by The National Archives. This extracts key technical metadata for the files and then transferring bodies will add additional, what we might call, human generated metadata. So following a validation process, the records and metadata CSV are then transferred to us on an encrypted hard drive.

And at that point, they’re ingested into our preservation system, and catalogue entries and open records are made available on our catalogue. So speaking of the role of care, for me it’s really not only about taking care of the digital public record; there’s also an important role of care in supporting transferring bodies through this quite technical process. But we’re also working to make this process easier and more intuitive. For example, we’ve developed an online transfer tool called Transfer Digital Records, or TDR for short. And that’s replacing the need for transferring bodies to download these tools or to use hard drives.
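The validation step described here can be sketched in outline. This is a minimal illustration only, not TNA’s actual tooling: the column names and the SHA-256 checksum field are invented for the example, since real DROID exports and TDR define their own headers and rules.

```python
import csv
import hashlib
from pathlib import Path

# Hypothetical column names, for illustration only.
REQUIRED_COLUMNS = {"file_path", "checksum_sha256", "date_last_modified"}

def validate_metadata_csv(csv_path):
    """Check that the metadata CSV has the required columns, that every
    listed file exists, and that each file's checksum matches the one
    recorded in the metadata, before transfer. Returns a list of problems
    (empty if everything validates)."""
    problems = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            return [f"missing columns: {sorted(missing)}"]
        for row in reader:
            path = Path(row["file_path"])
            if not path.is_file():
                problems.append(f"{path}: file not found")
                continue
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            if digest != row["checksum_sha256"]:
                problems.append(f"{path}: checksum mismatch")
    return problems
```

The point of a check like this is that integrity problems are caught before ingest, while the transferring body can still fix them.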

I would definitely say that this engagement is a two way street as well, because we’re learning from our colleagues across government about all the challenges that digital records are posing and also the innovative solutions they’re finding to these challenges.

Lily: So do all of the transfers look the same or do you have to develop different bespoke procedures?

Rachel: In my experience, no transfer looks quite the same. And I think that’s because no digital record really looks quite the same. A digital record might be a PDF or an Excel spreadsheet, DVD video files, an email with a long web of attachments, or a database. There’s a huge range of complexity for digital objects and we’re encountering new challenges all the time. So no transfer looks quite the same, but I think the principle of what we’re doing does remain the same: transfer the authentic digital record, preserving its integrity, with enough information about the record for it to be preserved and understood.

So to do this, we start with a standard minimum set of essential metadata and then work with transferring bodies to help them think about any other information they might hold about the records that would help future users. So at this point, it’s really a question of where that information sits best. Do we have a metadata field already that can accommodate that, for example, a description or former reference field? Or do we need to consider if there’s a new field or a way of structuring that data that would be appropriate to capture that information? And of course, we’re working really closely with our cataloguing colleagues to do that.

So we have a central set of metadata that we need for every single digital record. And this is primarily intrinsic metadata held within the file itself which can be auto-extracted. So some examples of this include the file name. So for us, that’s a really important intrinsic part of that original digital record. So we don’t want that file name to be renamed or tidied up for transfer. The name it was given during its creation, during its use, really tells us a lot about that record’s context. However, we also understand that file names might not be very descriptive. For example, you might get a report.pdf or a doc1.doc. So we do enable transferring bodies to provide free text descriptions for records at this level as well.

But what might also give contextual information for that report.pdf is the folder structure it was saved in. And we really encourage transferring bodies to preserve this original folder structure at transfer. And this will be reflected in the file path identifier and displayed on the catalogue as the arrangement. That’s another really key piece of intrinsic metadata. And then that brings me on to the date last modified. The date of the record is, of course, a really key piece of contextual information as well. However, we know that digital records are at risk of losing their intrinsic date metadata, for example through a system migration.
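The idea of preserving the original folder structure as the record’s arrangement can be shown with a tiny sketch. The function name and example path here are invented for illustration; the point is simply that the path carries the hierarchy, and the file name is kept verbatim.

```python
from pathlib import PurePosixPath

def arrangement_from_path(file_path):
    """Split a preserved file path into the folder hierarchy the record
    was saved in (its arrangement) and the original, untouched file name."""
    parts = PurePosixPath(file_path).parts
    return list(parts[:-1]), parts[-1]

# The original file name is never renamed or tidied up for transfer:
folders, name = arrangement_from_path("Policy/Consultations/2003/report.pdf")
# folders == ["Policy", "Consultations", "2003"], name == "report.pdf"
```

This also illustrates the cloud-storage problem Rachel raises later: if files are organised through tagging rather than folders, a path like this no longer captures the real arrangement.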

Where this is the case, although we preserve that date last modified still, it’s more useful if a transferring body can provide a more accurate, more meaningful date for their record as they would when preparing physical files. It’s really a combination of preserving this intrinsic metadata, and that’s part of our principle of preserving the integrity of the record, combined with enabling the transferring body to provide additional descriptive information so that we can preserve and provide access to the richest record possible.

Lily: And what would you say would be one of the key challenges in this process?

Rachel: I think the key risk to be mitigated is really the loss or corruption of metadata or file integrity. And that could either be in preparation for transfer or even earlier in the life cycle of the digital record. For example, we’ve talked about how central date metadata is here and how easily it’s lost. So while it’s important that transferring bodies can provide a more curated date for their records in these cases, what’s even better is if that intrinsic date metadata can be preserved. A key issue here is where the records are before transfer, if they need to be exported from a particular system, what is that export functionality like? What’s happening to the records in this process?

For example, are they all given today’s date on export? Do their file names change? And also, does that system include a lot of rich metadata, and can this also be exported, preserved and transferred to us? These are really big considerations for us, and they include the difficulties of preserving record and metadata integrity for files in cloud environments, such as SharePoint and Google Workspace. And here, I would say it’s not only a question of how we manage the record’s integrity on export but how records are being created and managed in new ways. For example, if files are no longer structured in folders but are organised through tagging and custom metadata, the file path is no longer going to best represent that arrangement and preserve that contextual information.

So it’s not only a case of thinking, is it possible to meet our minimum metadata standard, but is our metadata standard actually enough for these?

Lily: So Grace, I’d now like to turn to you as Rachel just mentioned the importance of preserving the content and context of a record when it’s being transferred. I’d love to hear from your perspective as senior archivist, what work is subsequently done to continually preserve that content and context.

Grace: Yeah, absolutely. So as Rachel’s covered, within my role, I work closely with digital archive colleagues for digital transfers and government transfer colleagues for physical transfers. And the key aim of this work is ensuring preservation of a record’s content and context before, during and after transfer. So firstly, to ensure context preservation, we make sure that transferred records are placed within their agreed structures in the catalogue hierarchy. We agree on those prior to transfer and we base decisions on the preservation of context and content.

So it might just be useful to briefly discuss what we mean by a catalogue hierarchy and how we believe this preserves content and context. So at The National Archives, all of our records sit within catalogue structures. So there are three high-level groupings which are more conceptual: these are department, division and series. And then we’ve got the lower levels, piece and item, and these describe the individual records. So we arrange the records into these higher-level conceptual groupings because we believe they give users important information about the context, i.e. how and why the record was created, and who and what was responsible for creating it.

So we’ve got department, which is the top level, and that contains the information about government departments and bodies that created or accumulated all the records that sit within it. Underneath this, we’ve got the division level. So this is optional and it’s used most often when departments have large and complex organisational structures, such as directorates or to reflect the activities of a body’s various functions. And then underneath this, we have the series level. So this sits underneath departments, and where appropriate, they sit within divisions in TNA’s online catalogue. So there’s usually a number of series under each department and each of these documents specific activities and events carried out by these government departments.

So series is also a conceptual grouping and it’s at this level that transferred records are usually conceptually placed. Managing this is a really key area of our work for preserving context and content before, during, and after transfer. I like to visualise each series as a drawer within a filing cabinet which contains a range of records that share provenance, as they were created through the course of the same business, activity, function or purpose. So the records themselves are catalogued within the series as piece and item, and these catalogue entries describe the specific record information and are what the end user interacts with.
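The department → division → series → piece/item hierarchy Grace describes can be pictured as a simple nested structure. This is an illustrative sketch only; the class and the example references are invented, not TNA’s actual data model.

```python
from dataclasses import dataclass, field

@dataclass
class CatalogueNode:
    """One level of the catalogue hierarchy: department, (optional)
    division, series, piece or item."""
    level: str
    reference: str
    description: str
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a lower level and return it, so levels chain naturally."""
        self.children.append(child)
        return child

# Invented example references, for illustration only. Department and
# series are the conceptual groupings; the piece describes an actual record.
dept = CatalogueNode("department", "XX", "Example government department")
series = dept.add(CatalogueNode("series", "XX 1", "Correspondence files"))
piece = series.add(CatalogueNode("piece", "XX 1/1", "Letters, 1925"))
```

The nesting is what carries the context: a user who finds the piece can walk up through its series and department to learn who created it and why.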

So quite often, we make the decision to transfer records into a completely new series. This maintains context, as the records which were created and/or accumulated together stay grouped together. And this ultimately communicates to users how the records relate to one another and the common history, function and purpose they share. So when creating these, we aim to create descriptive content that provides an accurate summary of the subjects covered in the records, as well as key contextual information about how, when, and why the records were created.

So we aim to be consistent when creating these, so that the same level of information is being captured and displayed for all record transfers. Sometimes we might transfer records into existing series, which we call accruals. So we do this where there is a clear continuity in content and context with records already at TNA. So when we do this, we’re really keen to make sure that we update the series content so it accurately reflects the inclusion of new records. We update the format and extent of records and the date of creation. We might also add in new creators, which allows users to have a fuller understanding of all bodies that are active in record creation.

Occasionally, and definitely more frequently at the moment, we’ve got a number of series which contain physical and digital records. So we call these hybrid series. We tailor content within these to give users vital information about the different record components. For example, we explain how the references are different for physical and digital records. So once we’re happy that records are in the correct arrangement, then we go about checking the content of new descriptions. So each description runs through archival workflows and is quality assured before release onto Discovery.

For example, our series content is checked to ensure it matches that supplied by the transferring department, that it meets cataloguing standards and that it contains the key information for users. The contextual grouping and accurate descriptive content is combined and presented together in our online catalogue. This gives users an insight into the content of the records, why they came to be created and how the records relate to one another and other material held at TNA. We believe this preserves and displays content and context and best provides users with the information they need to identify records of interest.

Lily: Grace, can you speak to the tensions and complexities that can arise when preserving the context in which a record was created and then thinking about our duty of care and the needs of those individuals who are interacting with those records today?

Grace: So when we catalogue records, we aim to describe them in a way which is accurate, inclusive and respectful to individuals and communities that created the records, those who use our collections, as well as those represented within them. We also work to ensure that our descriptions preserve the integrity of the historical record and the context in which they were created. So this does create a natural tension: applying our duty of care to our users by making sure that descriptions are accessible and inclusive to all, whilst also preserving and displaying the content and context of records through those descriptions.

So an example of this tension, or rather what I consider as a need for a balanced approach, is our established practice of using original file titles as assigned by the record creator for file-level descriptions of records at TNA. So we do this because we believe this information is representative of the time and context in which the records were created and is also part of the record. However, we are aware that the language and terminology used at the time of the file’s creation may now be considered harmful, offensive, pejorative, abusive or inappropriate.

So we know, therefore, that we really do need to establish and follow an approach that balances the preservation of this context with an awareness of our need to present descriptions that are accurate and inclusive to all. So given that it’s a very complex and sensitive area of work, we’ve conducted research within The National Archives to consider approaches. And have liaised with archive and heritage sector colleagues to share knowledge and review outcomes of tried and tested approaches for handling this tension.

Lily: We’ve heard about the behind the scenes journey of a record at The National Archives and thought about how care is enacted throughout this process. But at The National Archives, we don’t just preserve. We also research and share the stories behind our records and one of the ways in which we can do this is through digital content. So Ashley, you are a content designer for our website. I was wondering whether, first of all, you could just share a little bit about what this involves.

Ashley: Sure. So essentially a content designer is a website editor, but there’s a bit more to it than that. I help plan and make changes to the general information that sits on our website, and I work with The National Archives’ record experts to produce interpretative online content about our collections, meaning telling the stories in our records for the general public to engage with. I commission, edit and publish things like articles and blogs, helping shape our authors’ research into the right online formats, making sure they’re readable, consistent and error-free.

So in some workplaces, web editors are responsible for uploading whatever content they’re given, however they receive it. But a content designer, a term that’s only been around for about 10 years, is also responsible for making sure that that content, whether it’s financial information, historical narratives, anything really, is formatted in the most appropriate, accessible and effective way for its users, rather than defaulting to whatever’s convenient for the organisation. Since that’s the case, content designers get involved as early as possible when changes are happening to a website that affect any user’s experience, like when new features, formats or page types are developed.

We do things like working out what content is needed, what formats are most useful, whether something, for example, should be a written address or a map on a webpage. We feed into the designs, we write briefs for authors, we don’t just push the button to publish. So my role really is all about bearing in mind the needs of others and ensuring that they’re addressed properly. I’d say caring runs through it.

Lily: So I know that there’s been quite a lot of exciting work that’s been happening around this recently. Could you tell us more about how the public can engage with The National Archives records online?

Ashley: Of course. So my department, Digital Services, has been working on a project to reimagine how the public engage with archives online by building a new and improved website and online catalogue. It’s called Project Etna, from ‘Exploring the Nation’s Archives’, and I’d say it’s fundamentally driven by care because it’s all about improving people’s experiences of using our website. And encouraging audiences who might not know much about us or archives generally to engage with our collections and expertise.

Why is that care? I think because of the social and cultural importance of the material we hold. It’s so important to broaden the range of people who understand what we are, what we have and what we do and feel comfortable and positive about using our services. Whether that’s reading an article as a one off to help with some homework or starting a research journey that leads to visiting our search room at Kew, maybe regularly. The project is based around user stories like that, specific needs that certain kinds of people have that we’re aiming to address.

And a huge amount of care goes into identifying and validating these. Instead of making assumptions about what people want from us, we have user researchers dedicated to interviewing, surveying and testing prototypes with people from target audiences, and data analysts who collect and interpret information about how people have behaved on our website. So we have what people say and what they do to inform what we produce and publish. So what have we made? A new range of editorial formats: two different kinds of short-form articles, a long-form article template and image galleries. Previously, all our editorial content for general audiences was confined to our blog and varied greatly in style, length and format.

So, someone turning up and viewing a blog might not know exactly what they’re going to get. We’ve also created a section of the website called Explore the Collection which offers multiple ways of browsing these articles and image galleries, including by topic, time period or relevance to the content that you’re looking at. We’re aiming to give users a visual way to quickly get a sense of the range of our collections and some of our most intriguing and important records, encouraging them to go on journeys of discovery through our record stories in ways that suit their interests and habits.

For example, someone might google suffragette leader Emmeline Pankhurst and visit our article about her life, which contains images and transcripts of key documents and lots of links to our online catalogue. Then, follow a link to an article that looks at one specific record in more detail: the list of suffragettes arrested from 1906 to 1914, which has revealing images, curious details and historical context about that record. Then they might follow a link to an image gallery showing highlights from our collection about the fight for women’s suffrage in the UK. Then, onward to a page that shows more content on the theme of democracy and protest, and so on. Hopefully, users can happily follow their curiosity.

Lily: Are there other ways that you see care in your work as a content designer?

Ashley: Yeah, I absolutely have to talk about accessibility here, making sure that everything that we publish online can be used by as many people as possible, with an equivalent amount of effort. So the Web Content Accessibility Guidelines, or WCAG, state that websites must be perceivable, operable, understandable and robust for all kinds of users. So that goes for the functional things on our website, but also the editorial content that I work on. We’re legally obliged to meet WCAG standards, but we aim to go above and beyond them.

We know that nearly 30% of visitors to The National Archives website have some form of accessibility need, whether temporary or permanent, visible or invisible, from motor difficulties and sensory impairments to learning disabilities. Considering these is fundamental to how we publish content day to day and develop things for the future. For example, while working on our new editorial formats, we have user stories including people with specific needs. They might be something like: as a user who is sight impaired, I want access to a transcript of an image that contains text so that I can learn what it says.

We’ve met this need by developing and testing a transcript button users can press to see a full or partial transcription of images featured in our articles on the same page. User testing revealed how much all of our users value having a transcript available, especially for handwritten or typed text, to easily and confidently understand the contents of a record in an image. Working on accessibility provides care for all kinds of users. It’s not just about meeting additional needs either. It’s about ensuring that people using our website across different devices, operating systems, etc., all have an equally good experience.

Although that sounds like it’s just technical, it’s about social equality. Many aspects of our identities shape our preferences, behaviours and the technology that we use. And we all deserve the same opportunities and quality of experience. Here are some other ways that I work on making content more accessible. Ensuring that everything we publish is written in plain English, not using academic or technical language, if it can possibly be avoided.

Assuming no prior knowledge, e.g. adding the years that the term Georgian period refers to, rather than expecting people to know. And explaining acronyms, of which there are many in political history, the kind of thing that crops up in our records all the time. We also don’t format text in italics, which research shows is hard to read. And we work on subtle things like ordering lists in clear and consistent ways wherever they appear on our website, so that there’s a little less mental load on users whenever they’re having to do something on a website, whether that’s for their enjoyment or for a practical purpose.

I also want to talk about representation and inclusivity in what we publish. Content design involves caring about how different people might feel when encountering our content in a way that Grace mentioned earlier. The National Archives collections can illuminate the lives of all kinds of people that live in the UK today. But is that apparent to somebody viewing our website for the first time in the topics we discuss, the imagery we use? When we cover potentially contentious or upsetting topics, is it clear that we’ve considered the responses they might evoke in people with personal connections to the issues and identities discussed? We’re currently doing work on content warnings as well.

Lily: I have one final question that I’d love to ask you all before I let you go. So building on the conversations that we’ve been having today, what are the most relevant conversations around digital technologies that are currently taking place in your field, or that you’re personally particularly interested in? James, would you like to go first?

James: One thing that’s quite big at the moment is the move towards having linked data technologies underlying archives catalogues. And one of the things that’s driving this is the recognition that the provenance of sets of records can be more complex than we’ve traditionally been able to acknowledge within archives catalogues. And I think that’s particularly true for born-digital material. Many archives, large and small, throughout the world are likely to find implementing this kind of approach quite challenging though. It’s certainly what we currently see as our future within The National Archives of the UK.

But transitioning to a completely different kind of database is not going to be at all quick or easy. And I think as well, it’s something that feels like quite a technical, behind-the-scenes thing, which our users may not really notice, at least not straight away. And I think it’s also worth us thinking that, for people working in archives, there can be some tension between attempts to make catalogues easy for people to use and understand, and attempts to reflect the nuanced complexity of archival collections more fully in the way that linked data technologies would allow us to.

So both of those things are very much needed but they seem to me, at least, to be pulling us in slightly different directions. And I think we should take care that they don’t.

Grace: Yes, I agree with James on that. And I think what’s also becoming apparent with the cataloguing of born-digital material is how our descriptive standards, or the standards which guide our descriptive practices at the moment, are no longer fit for purpose, although some might contest that. And I think that’s what James was talking about in terms of the linked data and the idea that a record could have a shared provenance. So I think there’s a conversation going on about that. And obviously there is the development of the Records in Contexts standard, or data model. That’s one possible way of addressing that.

But I think, as James says, the question is how that can be implemented for much smaller repositories. And it’s going to demand that the archive profession shift from what has been the status quo of ISAD(G) as a cataloguing standard, and consider how we can work to describe born-digital records, which are far more complex and nowhere near as static in nature.
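[Editorial note: the shared-provenance idea discussed here can be sketched as a toy example. This is not how The National Archives’ catalogue is actually implemented; it is a minimal illustration in plain Python of why linked-data-style triples can express something a strictly hierarchical catalogue cannot. All record and agent identifiers below are invented.]

```python
# Editorial sketch: catalogue entries as subject-predicate-object
# triples, so a single record can carry more than one provenance.
# All identifiers and agent names here are invented for illustration.

triples = set()

def add(subject, predicate, obj):
    triples.add((subject, predicate, obj))

# A born-digital record with a shared provenance: two creating bodies.
# A strictly hierarchical catalogue (one fonds, one creator) cannot
# express this; as triples, the record simply carries two links.
add("record/1", "type", "Record")
add("record/1", "hasProvenance", "agent/home-office")
add("record/1", "hasProvenance", "agent/border-force")
add("agent/home-office", "name", "Home Office")
add("agent/border-force", "name", "Border Force")

def provenance_of(record):
    """Return every agent linked to a record as a provenance."""
    return sorted(o for s, p, o in triples
                  if s == record and p == "hasProvenance")

print(provenance_of("record/1"))
# → ['agent/border-force', 'agent/home-office']
```

Real implementations would use an RDF store and a model such as Records in Contexts rather than a Python set, but the structural point is the same: relationships become first-class data rather than a single fixed hierarchy.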

Lily: Is there something that ties in with your work there, Rachel?

Rachel: Yes, I think that’s really interesting, and I encounter that a lot in my discussions with transferring bodies, with government departments who want to understand how their digital records are structured and how users are going to be interacting with them. And I think a big part of this is the scale of digital records, when we’re looking at digital records at that kind of granular, record-by-record level, and the sheer quantities that we expect to be transferred to us by transferring bodies as the years progress.

I think there’s a bit of a tension here, because to manage such huge amounts of records, and to manage that process where we’re preserving their integrity and making them available, part of the solution will be exploring more automated ways of doing this. And there’s a lot of discussion about the potential role of AI, for example. But there is also that balance between encountering complex digital records that perhaps we’ve not encountered before, and thinking about the challenges that they might pose. How do we strike that balance between bespoke care for the digital records coming to us and being prepared to manage them at scale?

Lily: And Ashley, what are your thoughts?

Ashley: I’m going to follow on from your point about AI. I’m glad you mentioned it, because that certainly is a topic for discussion among people involved in content work and publishing, in the heritage sector specifically. AI promises ways of streamlining our workflows: AI could generate a draft of marketing copy or introductory content, and in some areas I’ve seen AI being used to generate draft alt text for images. Alt text, or alternative text, is what people using screen readers to access the content of a page rely on for a description of what is in an image, when they may not be able to see that image themselves, whether because of a sight impairment or for some other reason.

And in that specific area, AI is already being applied to draft alt text, but I don’t feel enormously comfortable about that, because writing good alt text is a skill. AI can learn skills, but really it comes down to the thing that AI struggles with most, which is identifying the core meaning of something. Bad alt text describes exactly what is in a picture: there is a sun and there are some clouds and there is a hill. Good alt text is able to say it is such-and-such hill on a summer’s day, offering a level of interpretation, what the image is attempting to convey, rather than just technically what appears in the frame.

And AI there could potentially lower the quality of people’s experiences if it’s used exclusively to generate that kind of content. There’s another tension as well that I see. My latest version of Photoshop is telling me to experiment with the generative AI capabilities that it has, and that’s extremely tempting. Just last week we published an article with an image of a famous person that had a hole punched right out of the middle of that person’s face. And it’s very frustrating to think that it’s nearly a perfect image that shows them in their element, but it’s not a high-quality image because a key part of it is gone.

And that’s exactly the kind of area where generative AI could be useful. It could complete that picture, but that would not be accurate or authentic. And we know that our users value authenticity, accuracy and trustworthiness. That is one of our USPs; we’re always trying to distinguish ourselves from Wikipedia and the information sources that other people go to more readily. And what we specifically can offer is the record as it is and was. So in those cases, AI, I don’t know, it may not be for us for a while, or at all.

Rachel: Yeah, I think it’s about using AI to support our work, but there is always going to be the need for that human interpretation, that human evaluation of the record. We’ve been looking at how we can use computational analysis to run the audit on offensive terminology. We can give it a lexicon to work off, and it can give us a number for how many descriptions contain this language, but it’s still going to need that human level of interaction, like you say, for those descriptions to be viewed in context.
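[Editorial note: the lexicon-based audit Rachel describes can be sketched as a short example. This is not The National Archives’ actual tooling; it is a minimal illustration of the idea, counting how many catalogue descriptions contain terms from a supplied lexicon and flagging them for human review. The descriptions, references and lexicon below are invented.]

```python
# Editorial sketch: scan catalogue descriptions for terms from a
# lexicon and flag matches for a person to review in context.
# All descriptions, references and lexicon terms are invented.

import re

def flag_descriptions(descriptions, lexicon):
    """Return (count, flagged), where flagged maps each matching
    description reference to the lexicon terms found in it."""
    flagged = {}
    for ref, text in descriptions.items():
        hits = [term for term in lexicon
                if re.search(r"\b" + re.escape(term) + r"\b",
                             text, re.IGNORECASE)]
        if hits:
            flagged[ref] = hits
    return len(flagged), flagged

descriptions = {
    "CO 1/1": "Correspondence concerning the natives of the colony.",
    "CO 1/2": "Minutes of the Board of Trade, 1672.",
}
lexicon = ["natives"]

count, flagged = flag_descriptions(descriptions, lexicon)
print(count, flagged)
# → 1 {'CO 1/1': ['natives']}
```

As the conversation notes, a tool like this only surfaces candidates and produces a headline count; deciding whether flagged language is offensive in its archival context still requires a human reader.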

And so I think it is going to be a balancing act, handling that tension: welcoming AI to support our work, but not viewing it as a one-size-fits-all sort of thing.

Reader: Thank you for listening to our Annual Digital Lecture podcast, and thank you to our experts for taking the time to talk to us today. To learn more about the Annual Digital Lecture and watch recordings of our previous lectures, click the link in the text on the episode page, or visit nationalarchives.gov.uk and search for Annual Digital Lecture. If you’re interested in learning more about our research, as well as our work as an independent research organisation, visit our website nationalarchives.gov.uk and search for research and academic collaborations. Follow us on X at UK National Archives Research to stay up to date with our research projects, upcoming events and other opportunities, and don’t forget to read our blogs at blog.nationalarchives.gov.uk. This audio recording from The National Archives is crown copyright. It is available for reuse under the terms of the Open Government Licence.

[End of recorded material 00:36:18]