A few months ago, at the ICA2016 conference in Seoul, the Expert Group on Archival Description (EGAD) released their first draft model (or standard) for a new relationally enhanced mode for archival description. There’s an email list for comments so I thought I would start there…


Jenny Bunn from the Department of Information Studies, University College London, kicked off by asking what format was preferred for responses, as she and some colleagues were getting together to work through the standard:

“Primary Entities
1. Do you agree with the membership of the list? Should anything else be included as a primary entity? Should anything be taken off this list?
2. Do you have any specific comments on any of the entities in particular, e.g. changes to wording, additional examples, confusion about usage?

1. Do you agree with the lists of properties for each entity? Should anything be added/taken away?
2. Do you have any specific comments on any of the properties in particular, e.g. changes to wording, additional examples, confusion about usage?

1. Do you agree with the lists of relations? Can you suggest further relations?
2. How should these relations be presented? What information do you need/would you like about each relation?

General comments
1. Anything else you want to say.”

Daniel Pitti, who appears to have been the driving force, agreed to that format, suggesting that general comments come first.

Australia’s Chris Hurley immediately picked up on the relationships, noting that he had identified “792 relationships and still counting.” He then suggested that there needed to be an understanding of the different relationship types, and also a glossary. Chris provided some examples of relationship categories, but I think it would be useful to go back to the original standards and work from there.

RiC is based on the four standards produced by the ICA – ISAD(G) for archival descriptions (fonds, series, items, etc.), ISAAR(CPF) for archival authorities (organisations, families and individuals), ISDF for the functions which are the reason for records to be created, and ISDIAH, which describes the archival institutions and collecting organisations. Of these, only ISAAR(CPF) has relationships included in it, which are hierarchical (which organisation controls or owns which), temporal (which organisation preceded which), associative and related, which is mostly used for families and individuals. The different relationships are described in a follow-on field. The Australian series system recognises relationships within and between archival descriptions, authorities and functions, and identifies that they may be reciprocal. In amending Access to Memory software for use with series registration, my colleagues and I at the State Records Office of Western Australia worked with the relationships and created subsets within the temporal and hierarchical relationships – controlled and controlling, subordinate and superior; succeeding and preceding. Relationships among individuals were not well defined, but in a private archive or manuscript library scenario I can see how these too may be developed. There are also relationships such as custodian, creating and transferring, which describe the relationships between authorities and descriptions.
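To make the idea of reciprocal relationships a little more concrete, here is a minimal sketch in Python. The pairs and their grouping into hierarchical and temporal come from the paragraph above; the names `RECIPROCALS`, `CATEGORY` and `Relationship` are my own inventions for illustration and are not taken from Access to Memory, RiC or any of the ICA standards.

```python
from dataclasses import dataclass

# Reciprocal pairs named in the text. The dictionary layout is an assumption.
RECIPROCALS = {
    "controlling": "controlled",
    "controlled": "controlling",
    "superior": "subordinate",
    "subordinate": "superior",
    "preceding": "succeeding",
    "succeeding": "preceding",
}

# Which broad category each relationship type belongs to.
CATEGORY = {
    "controlling": "hierarchical",
    "controlled": "hierarchical",
    "superior": "hierarchical",
    "subordinate": "hierarchical",
    "preceding": "temporal",
    "succeeding": "temporal",
}


@dataclass(frozen=True)
class Relationship:
    """Read as: `source` is `kind` in relation to `target`."""
    source: str
    kind: str
    target: str

    def reciprocal(self) -> "Relationship":
        # Swap the two ends and look up the matching reverse type.
        return Relationship(self.target, RECIPROCALS[self.kind], self.source)


# Hypothetical agencies: A is the preceding agency, B the succeeding one.
link = Relationship("Agency A", "preceding", "Agency B")
reverse = link.reciprocal()
```

Recording only one direction and deriving the other is one way to keep the two ends of a relationship from drifting out of step, although a real system may well store both.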

George Charonitis (Georgia State Archives) concurred with Chris with respect to identifying relationship types and also advocated for some more definitions, particularly with respect to context/s, provenance, creation, accumulation and selection. Chris’s next post looked at and suggested some common properties that could be used across all description types – identifier, dates and relationships, as well as looking at and reminding us of the relationships used in series registration, between deed, doer and document.

John Machin, also from Australia, picked up on the next part of the RiC process – the creation of ontologies, asking whether any existing ontologies would be used and how closely they would be followed. Florence Clavaud, from the EGAD group, responded that RiC-O (for ontologies) would probably be unique, and that they would then work on linkages and alignments.

Professor George Bak, from the University of Manitoba, also commented on the new standard, pointing out that the introduction is very Eurocentric (a point also made by the InterPARES Trust, of which slightly more below) and asking whether much thought had been given to indigenous perspectives and to the perspective of social memory. He queried how much of the standard had been aligned with current data visualisation practice, and looked at the scholarship in this area. He then followed up with a summary of a discussion held by some Winnipeg-based archivists, looking at digital systems and raising the question of definitions and understandings of the way in which information is created and understood, by pointing to the OAIS model for representation information, information objects and so on (he also writes beautifully, so it’s worth reading his posts just for the language).

Finally, the InterPARES Trust have released a bit of a broadside, however politely phrased, against EGAD online, pointing to the lack of communication over the past few years (the RiC project was instigated in 2012). One of their criticisms, which I agree with, is that although RiC is based on the four current standards produced by ICA, there is no evidence of a review of those standards, or how they have been implemented by different archival cultures. Like Bak and Machin, they are concerned that there is no higher ontological model or ‘anchor’ on which the new standard is based. They also suggest that looking at current relational database models, rather than focusing on data visualisation, may be of more use to both users of archives and those describing them for use. Indeed, the lack of user representation or awareness of the new model is also an issue.

Should you wish to review RiC-CM or add your voices to the mix, you have until the end of December to do so. For Australian archivists, the ASA is looking at presenting a combined response, so please do contact them.


STOP PRESS – deadline for comment now extended to 31 January, 2017

Archival description and discovery layers

Some years ago, Campbell Soups ran a campaign about their thick and rich soup range, one of which included Australia’s own Rose Porteous (I can’t find the link, perhaps someone cottoned on?). Anyway, I always think of that and, more academically, of Clifford Geertz’s ‘thick description’ when I think about the ways in which archives can describe their holdings. It’s not always the case, of course. Sometimes, time and pressure mean that holdings and archival authorities are described in minimalistic terms, but the potential for rich and thick description still exists, especially when contextual relationships between creators, functions and records are fully developed. It’s this that sets us apart from library description, and why archives generally don’t use the library MARC (MAchine-Readable Cataloging) formats, even though there is a special set for archives (MARC-AMC). Libraries describe the individual elements of the soup – the pea, the bean, the meaty chunk, the liquid – on their own merit. The author statement can bring these elements together but doesn’t give a sense of how they interact. Archives describe the soup, and then the elements.

Given this difference, it’s been interesting to see how different archives have been included in broader, generally library-based discovery layers. Our own TROVE is one such instance, and I’ve previously flagged how both the ANU Archives and PROV have added content to TROVE in my #GovHack posts. I’ve not seen much about what compromises had to be made, so I was very interested when the Digital Repository of Ireland brought out its guide to including archival description a few months ago. The Digital Public Library of America has recently released a white paper for similar content. Both the DRI guidelines and the DPLA white paper use EAD (Encoded Archival Description) as the major tool for exploring and exporting information. Both work within a fonds-based hierarchical descriptive framework, and focus on the archival object or levels of description. The links made to archival authority and to function (Chris Hurley’s doer and deed) are minimal at best.

The DRI guideline is, by its nature, prescriptive. If you are looking for a good description of the elements within EAD and how they can be matched to standard elements in descriptive practice, then this is a good place to start. The descriptions of each required and recommended element are clear, and provide some food for thought in Australian practice with regard to name, place and subject indexing of archival holdings. I think it would be relatively easy to implement the recommendations for a TROVE-like discovery system (although we have yet to investigate why or whether we want one, and what we would expect to get out of it).
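As a rough illustration of what matching EAD elements to descriptive fields can look like, here is a minimal Python sketch using only the standard library. The element names (`archdesc`, `did`, `unitid`, `unittitle`, `unitdate`, `origination`) are genuine EAD tags, but the sample record, the `FIELD_MAP` labels and the `describe` function are all made up for this post and are not the DRI's own mapping. It also assumes an EAD file without an XML namespace, which real files often have.

```python
import xml.etree.ElementTree as ET

# An invented, heavily cut-down EAD record (no namespace, no header).
SAMPLE_EAD = """
<ead>
  <archdesc level="fonds">
    <did>
      <unitid>EX-001</unitid>
      <unittitle>Example family papers</unittitle>
      <unitdate>1900-1950</unitdate>
      <origination><famname>Example family</famname></origination>
    </did>
  </archdesc>
</ead>
"""

# Hypothetical mapping from EAD <did> children to generic descriptive labels.
FIELD_MAP = {
    "unitid": "identifier",
    "unittitle": "title",
    "unitdate": "date",
    "origination": "creator",
}


def describe(ead_xml: str) -> dict:
    """Pull the top-level description out of an EAD document as a flat dict."""
    root = ET.fromstring(ead_xml.strip())
    archdesc = root.find("archdesc")
    did = archdesc.find("did")
    record = {"level": archdesc.get("level")}
    for tag, label in FIELD_MAP.items():
        element = did.find(tag)
        if element is not None:
            # itertext() also gathers text from nested name elements.
            record[label] = "".join(element.itertext()).strip()
    return record


record = describe(SAMPLE_EAD)
```

Note how thin the result is: the creator comes through as a plain string, with nothing of the authority record or function behind it, which is the compromise discussed above.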

The DPLA white paper is, also by its nature, more complex, looking at comparative descriptive practices, meditating on the differences between library and archival description, and aggregated (fonds, collection, series, even Australian item level) description. It focuses, however, on individual digital objects, either a product of digitisation or natively created in the digital environment, such as pages of books or individual photographs in an album. The working group looked at both description at a higher aggregated level (using the term ‘collection’) and for individual objects. Again, a number of examples are given for both, and some recommendations come from that. The working group is to be commended for the way in which they have approached the task at hand. Like the DRI guidelines, the white paper raises some important questions for Australian archivists looking at either a federated system, as proposed by Chris Hurley and others at the recent ASA 2016 conference, or in support of further work with TROVE.


Digital Preservation and sustainability

Over the past few months, there have been a couple of interesting events in the realm of digital preservation. The first was the publication of the new UNESCO digital preservation guidelines – PERSIST (although UNESCO uses the term sustainability rather than preservation). The second was the updated Digital Preservation Coalition Handbook.

PERSIST (Platform to enhance the sustainability of the information society transglobally) looks at guidelines for selecting digital materials – it’s necessarily rather broad and full of good intentions and motherhood statements. The guidelines look at national institutions, such as archives, museums and libraries, and suggest that where legislation exists regarding the deposit of materials, such legislation should be broadened, if required, to include digital content. Both national and international bodies should be engaged in setting standards for the collection and maintenance of these materials. Copyright and digital rights management are briefly addressed in the next section on the legislative environment.

The next three sections look at libraries, archives and museum collections from the ‘think global: act local’ perspective. The first section, Thinking globally, suggests that libraries, faced with the ubiquity of social media, websites and internet content, will need to manage their legal deposit and selection criteria for ephemera carefully. It also suggests that libraries may need to focus on user requirements for maintaining content, rather than continually acquiring new content with a view to preservation. Museums and galleries are flagged as needing to think about metadata for digital and digitised content and also for records about the collection. Archives, like libraries, face problems with shifting formats and systems. Libraries have the luxury of many copies, but archives may lose content that is not ‘born archival’ but which garners significance over time, simply because the formats in which the items are created are, in themselves, ephemeral. Although specific issues are identified for each institutional type, the guidelines stress that many of these concerns cross collection boundaries.

The second section, Act locally, provides a range of selection techniques and criteria which are probably already familiar to institutions looking at collection policies and processes: comprehensive collections, focused on a region, time or person/organisation; representative sampling; criteria-based selection – format, topic, and so on. It also suggests that there can be delayed appraisal in some circumstances: collect now, select later.

In addition, the guidelines provide a simple decision tree (sadly, not illustrated) which suggests institutions consider the following:

  • Identify
  • Legislative framework
  • Select
    • significance
    • sustainability
    • availability
  • Decide
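Since the guidelines don't illustrate the tree, here is one way the steps listed above might be sketched in Python. Only the step names and the three selection criteria come from the list; everything else is my own assumption, in particular that a legal obligation settles the question on its own, that each criterion can be reduced to a yes/no answer, and that all three must be met. The names `Candidate` and `decide` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A digital item that has already been identified (step 1)."""
    title: str
    legal_obligation: bool  # step 2: does legislation require collection?
    significance: bool      # step 3: simplified yes/no answers (assumption)
    sustainability: bool
    availability: bool


def decide(item: Candidate) -> str:
    """Step 4: turn the earlier answers into a decision."""
    # Assumption: a legislative requirement overrides the selection criteria.
    if item.legal_obligation:
        return "collect"
    # Assumption: all three selection criteria must be satisfied.
    criteria = (item.significance, item.sustainability, item.availability)
    if all(criteria):
        return "collect"
    return "do not collect"


# Two invented examples.
deposit = Candidate("Legal deposit e-journal", True, False, False, False)
blog = Candidate("Community blog", False, True, True, False)
```

In practice each criterion is a matter of judgement rather than a tick box, so this is only the skeleton of the process.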

Possibly of more interest and more utility are the appendices – the first looks at metadata for digital preservation, and manages to do so without using the PREMIS acronym. Three types of metadata are identified as useful for digital preservation: structural, descriptive and administrative. The second appendix provides useful terms and definitions.

The Digital Preservation Coalition Handbook is an online document (which can also be saved and printed as a PDF), designed for managers and executives who are either new to the concepts of digital preservation or, through the handbook and other learning, feel that they have a good grasp of the essentials but are by no means experts. Each section states the level of experience the section is aimed at, and provides some clear, simple discussions before going on to more nuts and bolts information like choosing providers, identifying formats, working through digitisation processes and decisions and more. This is a far more detailed and practical work than the Guidelines, but the two work well together.

Use the Guidelines to promote the importance of digital management and then follow up with the Handbook.




Archives New Zealand – 2057

Archives New Zealand have just released their new long-term vision (which seems appropriate for an archives) for comment. It starts with a stirring quote from Sir Arthur Doughty about the value of archives for posterity, which can be taken as something of a trumpet call in these fiscally challenging times.

You, and I, have until 4 November to respond (the day after that, we are blowing up Parliament, after all).

A draft standard, Egad!

ICA 2016 is about to begin, and as noted in my post a few months ago, a new draft descriptive standard is now available for review – http://ica-egad.org/ric/conceptual-model/RiC-CM-0.1.pdf. Looking forward to hearing more about it, and all the presentations via the #ICASeoul16 hashtag. (My very rudimentary French and Italian are getting a workout already!)

Stop press: a new email list has been set up for comment on the standard – http://lists.village.virginia.edu/mailman/listinfo/ica-egad-ric


Reflecting on #govHack

A fortnight ago, I gave up a little bit of time to see if I could engage hackers in using cultural heritage data, either to enhance a project or to be the basis for one.

This year’s #govHackWA was held in a new space, and included a link to a regional centre, Geraldton. After four years, it has become far more slick and professional, which was needed with the large number of entrants, but meant that some of the more social components of the weekend had gone by the wayside (the introduction and welcome from the central committee sounded more like phoning a government organisation with a long phone menu than the somewhat quirky presentation by @pia_waugh of earlier years). We shared information via Slack, an internet relay chat system with pretensions of grandeur, and the datasets needed to be on the various data portals a week ahead of the competition (rather than on a thumb drive or hard drive brought in at the last minute).

The Slack channels worked well, enabling information, advice and requests to be shared with a large or small group as required. I have some concerns about these sorts of channels for more formal communication, particularly from a government recordkeeping perspective, but it was an effective tool for a specific project. There was a specific channel for project ideas, so I was able to suggest a few things, one of which, I think, was incorporated into the ihero project, about facial recognition of WWI photographs.

The data portals are clearly identified on the various government websites, with a link to each state from the Commonwealth portal, which shows how data can be connected across jurisdictions. However, I found the quality of the datasets to be variable, and I do wonder how many of them have longevity or usefulness, either because of the specificity of the data collected or the format in which the data is presented (but this is a discussion for another day). Nevertheless, by searching keywords in the data portals I was able to identify a range of useful datasets, and also links to databases, which provide more complex data. I collated some datasets and sources and also printed off my previous post on some #govHack tools.

I was able to help two groups with identifying data and suggesting some ways of working with the data that they had – colourfulpast and ihero. I had more involvement with the colourfulpast team, because they had worked with cultural data in the past and they included a colleague from the State Library of WA, but it was great to see how both projects evolved over the course of the weekend. I was able to promote both projects via Twitter and on relevant Facebook groups after the event, so that the target audience could identify and work with the projects and, hopefully, provide feedback and vote!

That said, there are some things that I would do differently next time. The WA Fisheries Department were there all weekend, with just one dataset – their shark data. Their ability to work with multiple groups and to provide both data and technical expertise meant that three groups elected to work primarily with their data. Had I been more switched on, I could have had a look at the WA Museum and SRO trial discovery layer which Andrew brought to the weekend and identified additional shark data. Similarly, working with Trove to develop some complementary data might also have been useful for them. The teams are time-poor, so helping by providing some easily used and pre-collated data is worth considering. And I would work to have some specific datasets, which I was really familiar with, identified in the portal.

Next year, I hope to return to GovHack with a fully working SROWA catalogue and some datasets derived from the collection. I’ll also have a look at the other data provided by cultural organisations, and work on identifying projects and problems with them.  Having specific datasets and clearly identified projects is of benefit to both the organisations and the hackers.

Survey for Volunteers in Australian Archives

A great initiative! Well done to the team.

