Reflecting on #govHack

A fortnight ago, I gave up a little bit of time to see if I could engage hackers in using cultural heritage data, either to enhance a project or to be the basis for one.

This year’s #govHackWA was held in a new space, and included a link to a regional centre, Geraldton. After four years, it has become far more slick and professional, which was needed with the large number of entrants, but meant that some of the more social components of the weekend had gone by the wayside (the introduction and welcome from the central committee sounded more like phoning a government organisation with a long phone menu than the somewhat quirky presentation by @pia_waugh of earlier years). We shared information via Slack, an internet relay chat system with pretensions of grandeur, and the data sets needed to be on the various data portals a week ahead of the competition (rather than on a thumb drive or hard drive brought in at the last minute).

The Slack channels worked well, enabling information, advice and requests to be shared with a large or small group as required. I have some concerns about these sorts of channels for more formal communication, particularly from a government recordkeeping perspective, but it was an effective tool for a specific project. There was a specific channel for project ideas, so I was able to suggest a few things, one of which, I think, was incorporated into the ihero project, about facial recognition of WWI photographs.

The data portals are clearly identified on the various government websites, with a link to each state from the Commonwealth portal, which shows how data can be connected across jurisdictions. However, I found the quality of the datasets to be variable, and I do wonder how many of them will have any longevity or usefulness, given either the specificity of the data collected or the format in which the data is presented (but this is a discussion for another day). Nevertheless, by searching keywords in the data portals I was able to identify a range of useful data sets, and also links to databases, which provide more complex data. I collated some datasets and sources and also printed off my previous post on some #govHack tools.

I was able to help two groups, colourfulpast and ihero, with identifying data and suggesting some ways of working with the data that they had. I had more involvement with the colourfulpast team, because they had worked with cultural data in the past and they included a colleague from the State Library of WA, but it was great to see how both projects evolved over the course of the weekend. I was able to promote both projects via Twitter and on relevant Facebook groups after the event, so that the target audience could identify and work with the projects and, hopefully, provide feedback and vote!

That said, there are some things that I would do differently next time. The WA Fisheries Department were there all weekend, with just one dataset – their shark data. Their ability to work with multiple groups and to provide both data and technical expertise meant that three groups elected to work primarily with their data. Had I been more switched on, I could have had a look at the WA Museum and SRO trial discovery layer, which Andrew brought to the weekend, and identified additional shark data. Similarly, working with Trove to develop some complementary data might also have been useful for them. The teams are time-poor, so helping by providing some easily used and pre-collated data is worth considering. I would also work to have some specific datasets, ones I was really familiar with, identified in the portal.

Next year, I hope to return to GovHack with a fully working SROWA catalogue and some datasets derived from the collection. I’ll also have a look at the other data provided by cultural organisations, and work on identifying projects and problems with them.  Having specific datasets and clearly identified projects is of benefit to both the organisations and the hackers.

Survey for Volunteers in Australian Archives

A great initiative! Well done to the team.

Personal Recordkeeping, Identity and Archives

You can all stop holding your breath, as we are now launching the survey which was proposed at the 2015 ASA conference! *collective intake of breath*

We can’t wait for you to pass it on to your volunteers in order to gain a better insight into their collective experiences and motivations, knowing that this knowledge would help to improve your volunteer program.

Volunteers within the Australian archival/records sector are invited to complete the following survey:

take the survey

Please note: Volunteers are not asked to identify which institution they volunteer for, and their involvement and responses will be kept private and anonymous.

More details

This study will address the following research questions:

1. Who are our volunteers?
2. What motivates our volunteers?
3. What type of experiences/support does the Australian archival community offer?

An understanding of the above will assist in improving the experience of volunteers within the archives, resulting in the creation of…


Sticky fingers, or: do we need to revisit the gloves debate?

For quite some time, archivists, conservators and special collections staff have been telling people that they don’t need gloves to handle paper records. The wonderful Rebecca Goldman (@derangedescribe) even did a handy (pun intended) flow chart – https://derangementanddescription.wordpress.com/2014/01/13/do-i-need-to-wear-gloves-in-the-archives-a-helpful-flow-chart/.

Last night, at the Australian Society of Archivists WA Branch AGM, we learnt that the matter needs a lot more research. Professor Simon Lewis, of Curtin University, and his research students are involved in forensic chemistry research, and are looking more closely at paper and fingerprints, to see what they can determine.

Paper porosity is ideal for capturing some drugs; American dollars show up cocaine traces quite well, apparently. Paper is also evidence itself, or rather a carrier of a different sort of information, as archivists well know. In the forensics field, there has been a concentration on the authenticity of documents used to prove identity – a passport may well be authentic, but there may be questions about the documents used to obtain it, for example. Paper is also used as a carrier for some cost-effective medical analytical tests. Because of this, there is an increasing focus on paper as an area of research. Can they date paper, for example, to say when a document was created? (Turns out the answer is, umm, not really, or, it’s quite tricky.)

Paper responds to particular events in interesting ways. Bleaching and laser ablation to remove stains or colour leads to weakening of paper fibres. Light also changes paper, as we know. But there may be other things going on, within the paper. An Indiana based art museum identified a set of artworks created by Gustave Baumann, which they have in their collection. Baumann is known to have used turquoise inks to sign prints and artwork. Because they took photos of their collection when it was accessioned, they knew they had some turquoise signatures. However, when they went to retrieve the art for a display, despite being stored in the dark for a significant number of years, the ink signatures had disappeared. Something in the paper may have been interacting with the ink.

There’s been a bit of research into rag based papers and even early wood pulp papers, but not a lot, for example, on recycled papers. Simon and his team have recently received paper samples from the Shoalhaven paper mill when it closed, going back 50 years. The paper is well described and its storage conditions are known. This means that they can start looking at some different experiments with paper.

But they also need to find out about the things that interact with paper, like the turquoise inks, and those fingerprints. While they could find quite a lot of research on fingerprints, they discovered a bit of a gap in the literature – the way in which fingerprints interact with and affect, or are affected by, paper. Indeed, when they started to look into it, they found that most of the material on the issue had been written by archivists, librarians and conservators, and was about handling issues for cultural heritage materials. Suddenly, their research took on a whole new aspect.

Professor Lewis dates the gloves controversy to a 2005 paper by Baker and Silverman, Misperceptions about white gloves. The paper argued that the majority of fingerprint residue was water, so little amino acid or fat remained to contaminate the paper. But it turns out that is not strictly accurate.

National Archives of Australia senior conservator Prue McKay wrote about her experiments with paper and gloves in 2008. She found that bare hands did leave residue, but there was some doubt as to the effect of the marks, particularly on older papers. More recently, Terry Kent, a UK-based forensics analyst, reported on the water content of fingerprints, again confirming that there were sufficient amino acids and fats to make a deposit. Apparently, tests conducted at Curtin show that amino acids migrate into the paper substrate and then bind to the paper. They are doing some continuing work to see how long fingerprint amino acids remain and, eventually, will try to find out what the effects are on the paper. They also looked at the fats, which can come both from secretions and from things like soap, gels and hand creams.

The main result from the experiments so far shows that fats and acids return to the skin very quickly, within around 5 minutes of hand washing. However, the jury is still out on whether or not gloves should be worn. Based on the research to date, I’m sticking with the no gloves policy until the other alternatives are fully investigated, although, if I know someone is a head scratcher or finger licker, I may reconsider.

Getting ready for #govHack 2 – tools, other data sources and examples

In this post, I’m going to point to some of the tools that I know from digital humanities and the like. They are mostly used in the cultural sphere, but that is not to say that they aren’t useful for exposing and manipulating other sorts of data. I’ll also try and provide some examples of the way data has been used for some simple and not so simple projects. GovHack is all about getting something up and running in 24 hours so, like a thesis, the parameters of time, space and subject need to be clearly defined. However, also like a thesis, the project should show some potential for further work, research and avenues for publication.

I’ve already provided a link to the TROVE API, and to some of the blogs that discuss using it. The API has been acknowledged as a source of inspiration for the Europeana and Digital Public Library of America (DPLA) APIs, too (a good way of incorporating some international data) – http://help.nla.gov.au/trove/building-with-trove/api; http://labs.europeana.eu/api; https://dp.la/info/developers/codex/. Library cataloguing data, including Australian libraries, can be found on WorldCat – https://www.worldcat.org/affiliate/tools?atype=wcapi, while archival and manuscript collections can be found via ArchiveGrid – http://beta.worldcat.org/ArchiveGrid.

Libraries and some archives use a format called MARC (MAchine-Readable Cataloging) to describe resources. It’s a standard developed by the Library of Congress, and about halfway down their MARC documentation page, you’ll find a list of crosswalks and mappings to other formats including Dublin Core (developed by OCLC, the people who run WorldCat) and geospatial data – http://www.loc.gov/marc/marcdocz.html
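For anyone who hasn’t met a crosswalk before, here is a rough sketch in Python of what one does: it relabels fields from one scheme with the nearest equivalent in another. The handful of tag-to-element pairs below is a simplified subset chosen for illustration, and the sample record and function name are made up; the Library of Congress crosswalk remains the place to check the real mapping.

```python
# Illustrative subset of a MARC-to-Dublin-Core crosswalk.
# These pairings are a simplification for demonstration only; the
# Library of Congress crosswalk is the authoritative mapping.
MARC_TO_DC = {
    "100": "creator",      # main entry, personal name
    "245": "title",        # title statement
    "260": "publisher",    # publication details
    "520": "description",  # summary note
    "650": "subject",      # topical subject heading
}


def marc_to_dublin_core(record):
    """Convert {marc_tag: [values]} into {dc_element: [values]}.

    Tags without an entry in MARC_TO_DC are skipped.
    """
    dc = {}
    for tag, values in record.items():
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue
        dc.setdefault(element, []).extend(values)
    return dc


# A made-up record, already parsed into tag/value lists.
sample = {
    "245": ["A hypothetical title"],
    "100": ["Surname, Given"],
    "650": ["Archives", "Western Australia"],
    "999": ["local field with no mapping"],
}

print(marc_to_dublin_core(sample))
```

A real MARC record also has indicators and subfields, which this sketch ignores, so treat it as a way of seeing the idea rather than something to run over a catalogue export.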

Other archives use Encoded Archival Description (EAD) and Encoded Archival Context (EAC) to create and share descriptions. Although these were developed independently, the Library of Congress also maintains documentation to support the standards, and again has some crosswalks – http://www.loc.gov/ead/ag/agappb.html. EAC is used by the SNAC Project and the eScholarship Research Centre at University of Melbourne (which is a data provider for ANDS) to create connections between organisations and individuals – http://socialarchive.iath.virginia.edu/; http://www.esrc.unimelb.edu.au/about-us/informatics-lab/

Beyond the world of library and archives description (and you just wanted some simple headers to capture data, right?), there is Zotero, an open source citation tool developed by the Roy Rosenzweig Center for History and New Media (CHNM) – https://www.zotero.org/. Zotero comes with some nice tools, including a simple timeline, and is also something I’d like to play with to open up referencing from archival sources. The CHNM spends a lot of time creating neat tools for historians and cultural curation, so they also have Omeka, an online exhibition tool, and Scripto for transcription purposes – http://chnm.gmu.edu/research-and-tools/.

You can also use the open source project Blacklight (including Spotlight) to play with library described data – http://projectblacklight.org/; http://www.rubydoc.info/gems/blacklight-spotlight/0.19.1. (Turns out Blacklight, Spotlight and other delights are the work of Stanford University Libraries – https://library.stanford.edu/blogs/topic/blacklight).

There are some good tutorials on Zotero and other tools on the Programming Historian site – http://programminghistorian.org/lessons/

The ever fabulous and creative Tim Sherratt has a whole host of tools, and examples of how to use them, on his wraggelabs site. The focus is on TROVE and the National Archives of Australia – http://wraggelabs.com/emporium/: e.g. http://troveconsole.herokuapp.com/  and http://faceapi.herokuapp.com/

Finally, I’d like to point to some interesting uses of cultural data, both as part of govHack and more generally.

Not open source, but fun, there’s HistoryPin and NowandThen https://www.historypin.org/en/ and http://nowandthen.net.au/Main_Page. Pixstory, from the 2013 Govhack, explored some of these ideas – https://www.youtube.com/watch?v=pUDxyrOhVQs

As part of the WW1 centenary project, the RSL teamed up with a local TAFE to create a virtual ‘Digger’ app – http://rslwahq.org.au/News/Well-done-Tom.aspx

Last year, at least two projects used cultural data for govhack – http://2015.hackerspace.govhack.org/content/citizen-culture-heritage-lest-we-forget

http://2015.hackerspace.govhack.org/content/exploring-indigenous-culture

And, there are all those geospatial projects, e.g. https://www.gaiaresources.com.au/sro-archive-maps/

Getting ready for #govHack – cultural datasets

Next week, the largest hackathon in the world, GovHack, takes place in Australia and New Zealand. There are govhack sites at universities and regional centres, and in all the major centres. Each site has participants, who make the things, and mentors who provide advice and guidance on tools and datasets. There’s even a specific cultural hack node in Canberra, run by Tim Sherratt.

This year, I’ve signed up as a data mentor for WA, which has a central city site and a regional node at Geraldton. This will be my third year as a data mentor, and my first year as a general mentor, talking cultural data generally (mostly archival, of course) rather than representing the SROWA. It’s a lot more organised than I anticipated, and people are already asking for more information to help them prepare. To this end, I’m going to use this post to talk about some datasets.  Participants need to use at least one official data set, but can then also look for other data that they can mash together or reuse. This way, I can print off the page as a guide, and provide a link to it for #govhackwa participants. I’ll do an additional post or two on some tools for analysing them and provide some examples of how data has been used.

Official cultural datasets

These datasets are taken from the various government data portals.

Searching the data.gov.au portal reveals 144 datasets for the keyword ‘library’, 117 for ‘archive’, 52 for ‘museum’ and 129 for ‘cultural’. The latter includes some GIS datasets, including the “coarse cultural topographical data”, showing where major population centres are, and the CPI. In addition to collection links and collection subsets, State and National Libraries have contributed statistical datasets relating to location of libraries, user statistics and so on.
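If you would rather do this sort of keyword count from a script than from the search box, here is a minimal sketch in Python. It assumes the portal exposes the standard CKAN action API (the package_search action with q and rows parameters) at the base address shown; that address, the function names and the sample response are my assumptions for illustration, so check the portal’s own API notes before relying on them.

```python
import json
from urllib.parse import urlencode

# Assumption: the portal runs CKAN and serves the action API under this
# base address. Other portals (or newer versions of this one) may differ.
PORTAL = "https://data.gov.au"


def search_url(keyword, rows=10, portal=PORTAL):
    """Build a CKAN package_search URL for a keyword."""
    query = urlencode({"q": keyword, "rows": rows})
    return f"{portal}/api/3/action/package_search?{query}"


def summarise(response_text):
    """Return (total matches, dataset titles) from a package_search response."""
    payload = json.loads(response_text)
    result = payload["result"]
    return result["count"], [dataset["title"] for dataset in result["results"]]


# A made-up response in the CKAN shape, so the sketch runs without a network.
sample_response = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"title": "Example dataset A"},
            {"title": "Example dataset B"},
        ],
    },
})

print(search_url("museum"))
print(summarise(sample_response))
```

To try it against a live portal, fetch the URL with urllib.request.urlopen and pass the decoded body to summarise; swapping the portal argument should let the same two functions work against any state portal that also runs CKAN.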

My top picks, outside of TROVE (from the National Library) and ANDS (Australian National Data Service) are –

The National Portrait Gallery – https://data.gov.au/dataset/portraits-and-people

The Antarctic artefacts bibliography – https://data.gov.au/dataset/aad-aa-bibliography and Commonwealth Bay artefacts survey data – https://data.gov.au/dataset/aad-cden-artefacts-gis

Indigenous protected areas – https://data.gov.au/dataset/indigenous-protected-areas-ipa-declared

The National Archives of Australia “Memory of a nation” – digitised content from online exhibitions – https://data.gov.au/dataset/memory-of-a-nation-data and the Commonwealth Agencies dataset, which provides a comprehensive set of federal government departments, ministries, offices and so on. Because of the way archives link data, some state and local government agencies are also included. This dataset was last updated in April, 2016 – https://data.gov.au/dataset/commonwealth-agencies.

The State Records Office of New South Wales has a number of indexes available as csv files in the NSW data portal – including convicts, soldier settlement indexes and wills and probate, not to mention their Flickr dataset. SRNSW collection information can also be searched via their online catalogue. Queensland State Archives has 55 datasets in the data.qld.gov.au portal – https://data.qld.gov.au/dataset?q=archive&tags=Queensland+State+Archives&groups=historical. State Records South Australia has 5 datasets – http://data.sa.gov.au/data/organization/state-records.
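Index files like these are about the easiest thing to start with, because a csv file can be read with nothing more than the Python standard library. The sketch below filters an index by surname; the column names and rows are invented for the example, so expect to adjust them to whatever headers the downloaded file actually has.

```python
import csv
import io

# Made-up rows standing in for a downloaded index file.
# Real index files will have their own column names.
SAMPLE_CSV = """Surname,FirstName,Year,Series
Smith,Mary,1832,Hypothetical series A
Jones,Thomas,1840,Hypothetical series B
Smith,John,1851,Hypothetical series A
"""


def find_by_surname(csv_file, surname):
    """Return every row whose Surname column matches, ignoring case."""
    reader = csv.DictReader(csv_file)
    wanted = surname.strip().lower()
    return [row for row in reader if row["Surname"].strip().lower() == wanted]


matches = find_by_surname(io.StringIO(SAMPLE_CSV), "smith")
for row in matches:
    print(row["FirstName"], row["Surname"], row["Year"])
```

With a real download, replace the io.StringIO wrapper with open("index.csv", newline="", encoding="utf-8"), where the file name is whatever you saved the index as.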

The Powerhouse Museum API – http://data.nsw.gov.au/data/dataset/bf9df234-7890-4907-94f6-e7872c8f4258

Other museum datasets include the gorgeous Scott Sisters collection from the Australian Museum, itself the subject of a remix competition in 2013/2014 – http://data.nsw.gov.au/data/dataset/4e57d134-79e9-42ad-a0a9-83fc91e1091c

There’s a plethora of WW1 related datasets – searching for ‘World War’ returns 24 datasets, of which only two are not clearly related, and the majority of which are from State Libraries.

It’s worth remembering that data in TROVE is harvested from all public libraries, and includes data from museums and archives. The content can be filtered via the TROVE API. Both the Public Record Office Victoria and the Australian National University’s Noel Butlin Archives have contributed data to TROVE. The State Library of Queensland not only has data in TROVE, but also contributed over 50,000 photographs to Wikipedia.

http://help.nla.gov.au/trove/building-with-trove/api

TROVE has some useful examples and help sheets – http://help.nla.gov.au/trove/building-with-trove/examples
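As a starting point for that filtering, here is a small Python sketch that builds a TROVE search request limited to one zone. The endpoint and the key, zone, q and encoding parameters are my recollection of version 2 of the API and should be treated as assumptions; the help pages above are the authority, and you will need your own API key in place of the placeholder.

```python
from urllib.parse import urlencode

# Assumption: version 2 of the TROVE API, as I understand it from the help
# pages. Check the current documentation before relying on this address.
TROVE_ENDPOINT = "https://api.trove.nla.gov.au/v2/result"


def trove_search_url(api_key, query, zone="picture", encoding="json"):
    """Build a search URL restricted to a single zone (e.g. picture, newspaper)."""
    params = {"key": api_key, "zone": zone, "q": query, "encoding": encoding}
    return f"{TROVE_ENDPOINT}?{urlencode(params)}"


# "YOUR_API_KEY" is a placeholder, not a working key.
url = trove_search_url("YOUR_API_KEY", "Geraldton jetty", zone="newspaper")
print(url)
```

Restricting the zone is the simplest filter; the documentation describes further limits (by contributor, date and so on) that can be added to the same parameter list.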

The Australian National Data Service is similarly rich and complex. Again, the Public Record Office Victoria (PROV) has contributed data to ANDS, along with the State Records Office of NSW. The PROV’s semantic wiki is available as an xml formatted download – https://www.data.vic.gov.au/data/dataset/public-record-office-victoria-semantic-wiki.

Postscript – I’ve just been advised that the Curtin Library has made weather observation data from Jon Sanders’ 1986 – 1988 circumnavigation available through ANDS – https://researchdata.ands.org.au/search/#!/rows=15/sort=list_title_sort%20asc/class=collection/q=jon%20sanders/p=1/group=Curtin%20University/. There’s also a nice blog – http://triplesolo.library.curtin.edu.au/ – and you can follow along on Twitter #triplesolo #noonsummary.

Weather aficionados may also be interested in the digitised daily observations from colonial Perth, now in the NAA collection – http://recordsearch.naa.gov.au/SearchNRetrieve/Interface/ListingReports/ItemsListing.aspx?series=PP430/1

 

Finally, in the WA datasets, you will find a range of historical maps and plans, taken from the State Records Office digitised collection – each map links to the series at the top, but there are some older links to the previous catalogue. For better searching and exporting of data, it’s best to go straight to the new catalogue – https://archive.sro.wa.gov.au/

WA theme parks – taken from the Landgate “locate the 80s” site – http://catalogue.beta.data.wa.gov.au/dataset/wa-theme-parks

State Heritage Office datasets – http://catalogue.beta.data.wa.gov.au/organization/state-heritage-office

WA Maritime Archaeology datasets, provided by the WA Museum http://catalogue.beta.data.wa.gov.au/organization/western-australian-museum

 

Updating #inthemailbox

It’s now just over a fortnight since I last blogged. I keep checking my WordPress site, as though words might suddenly start appearing without me adding anything; so far, nothing.

I’ve not been idle though. I’ve given it a new theme, which I think works better with the menus I now have. I’ve uploaded a new image, which is based on one from the Government photographer’s collection, currently held in the State Library of WA. My daughter has all the Adobe software loaded to this machine, which has not been updated in over a year, so updates were an issue. The image shows in Preview on the Mac, which allows me to view a sepia version. However, I couldn’t figure out how to save it in those tones. Eventually, I opened it in LibreOffice as a drawing and changed the red and green colour values, which I could then save and upload. (Getting the updates for Adobe may have been easier!)

The theme comes with a set of colour templates, including bright yellow, a mauve and purple set, red, wishy washy blue and a black set. I chose the yellow template, and have been blinded at each of those nervous wordpress checks I mentioned at the start. Finally, on Friday night, I worked out how to change the base colour to a more dignified bronze/gold.

In anticipation of the coming accreditation at Curtin, and just to play with the idea, I’ve also created an online teaching portfolio. Whether I stick with academia, in whatever capacity, or return fully to my substantive position at the end of the year, I now have something that will let people know what I have done for the past few years.

Let me know what you think of the changes, and what else you would like to see.

#blogJune 101 blogs are done

(with apologies to South Pacific)

 

My blog is now growing with some follows; I’ve written more fully with delight.

When I’m blogging, I’m a blogger with the fellows

When I write, it’s a rite to be right

 

One hundred and one blogs are done

(Bet you thought I’d just begun!)

Cos I’m blogging every night in June

 

I’ve written about speaking out

Conference tweets and copyright

Thank goodness it will all be over soon

 

The CIDOC blog was fav’rite

The others also made it

The hits are pips

From follows and visits

 

A little bot helped a lot

while archive ning did its thing

with tweeting and sharing

And I’m having so much fun with blogging June