Three members of the Names Project team attended the Open Repositories conference, OR2012, in Edinburgh this week. It’s a really packed conference, with fascinating sessions, a Developers’ Challenge and this year a side helping of Repository Fringe with its challenging format of Pecha Kucha presentations.
The conference was very ably live-blogged by Nicola Osborne and Zack O’Leary. I won’t attempt to compete with their thorough work of describing the sessions but instead will mention some of the presentations I attended which touched on the name authority space, of which there were quite a few.
Most of the national name identifier systems presented on or mentioned during the conference were familiar to me as ones we covered in the report commissioned by JISC last year. One that we somehow missed in that report, though, was the Portuguese Curriculum DeGóis researcher CV service, which is integrated with the national repository service, RCAAP. You can see José Carvalho’s paper on RCAAP here [PDF].
Kei Kurakawa presented on the Japanese Researcher Name Resolver in a Pecha Kucha session, which was also the format for Natasha Simons’ talk on the researcher identifier activity in Australia. Natasha noted that institutions needed both sticks and carrots to engage with national researcher identifier systems.
Simeon Warner set the scene for the need for researcher identifiers in his presentation on progress and plans for ORCID (liveblogged here). This was followed by the talk I gave on Names (that link is to the YouTube video – you can also get the slides from Slideshare), which referenced both ORCID and ISNI and in which I attempted to characterise the different national and international researcher identifier services currently operating or in development. It is a rapidly evolving area in which we’re all working, and linking between the various systems is going to be essential.
Just a quick post to mention that the videos of talks at the Personal Digital Archiving conference are now available online. The event, organised by the Internet Archive, was held in San Francisco in February and attracted a large number of speakers from a broad range of fields. I wouldn’t have thought that the content of the conference would be directly relevant to the Names Project, but one of the talks was on the Social Networks and Archival Context project, which is working in a similar space: extracting names from archival finding aids and turning them into authority records. Clifford Lynch’s keynote was also of interest: he mentioned the current interest in unique identification of authors within educational institutions more than once.
The Names Project ran a survey of UK institutional repository managers in July, asking about their experiences of name-related issues in relation to repositories. We had a good response, with 65 people completing the questionnaire (we’ll be sharing the overall findings soon). The last question in the survey was ‘Would you be willing to discuss your answers further or to be a case study for the Names project?’. Over half of the respondents said ‘yes’ to this, so we’re now taking them up on that offer. I’m spending the next two weeks trying to visit as many of those people as possible.
It’s a great opportunity to see how people are dealing with name-related issues at first hand and to explain in more detail what we’re trying to achieve with the Names project. The tour starts today in Aberdeen and I’ll be visiting ten repositories in eight cities over the next two weeks (a test of the UK rail network, if nothing else). I will also be talking about Names at the Internet Librarian International conference in London on Friday 15th October, so if I haven’t managed to visit your repository, perhaps there will be a chance to catch up then. I am also attending JISC’s The Future of Research? event on 19th October in London and will be demonstrating the Names pilot system at the Mimas stand there.
The JISC Innovation Forum (JIF2010) at Royal Holloway, University of London presented the project team with another great opportunity to demonstrate the Names prototype while periodically stuffing ourselves with free food. I personally left JIF2010 with a notepad full of new contacts and a handbag full of mini blueberry muffins (which didn’t survive the journey home).
The event itself focused on the key tenets of innovation – openness, collaboration, and user involvement – which collectively foster creativity and encourage overall sustainability. Delegates were given the chance to learn about other JISC projects through interesting project showcases and jazzy exhibition stands. Innovative ‘Thunderbolts & Lightning’ sessions enabled audience interaction and facilitated idea sharing (which resulted in some impressive illustrations!).
Other hands-on sessions focused on project management, community engagement, and sustainability and access. I attended sessions on ‘writing to get your project noticed’, ‘promoting your project to different audiences’ and ‘engaging with communities – social media and networking’ during which I picked up some useful tips and practical advice on identifying and delivering to target audiences.
Amanda Hill gave a very well-received presentation and demonstration of the Names prototype to a roomful of authority control and identity management enthusiasts. A live blog summary of the session is available here. We got some useful feedback and suggestions from the demo, including the possibility of incorporating an ‘Erdős number’ type rating system for authors appearing in Names (the Erdős number being a measure of the co-authorship distance between a given author and the prolific mathematician and writer Paul Erdős; see Wikipedia definition here). Unfortunately, we don’t yet have any articles authored by Erdős in the Names prototype but maybe in the future…
The event wrapped up with a talk from Professor John Potter, who emphasised the significance of effective leadership, negotiation and collaboration within the higher education environment, all crucial endeavours in these uncertain times.
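For the curious, an Erdős-style number is simply the length of the shortest chain of co-authorships linking one author to another. Here is a minimal sketch of how such a rating might be worked out, assuming a hypothetical list of papers, each given as a list of author names. The function name and the sample data are invented for illustration and have nothing to do with how the Names prototype actually stores its records.

```python
from collections import deque


def collaboration_distances(papers, root):
    """Return {author: co-authorship steps from root} using breadth-first search."""
    # Build an undirected graph: two authors are linked if they share a paper.
    graph = {}
    for authors in papers:
        for author in authors:
            graph.setdefault(author, set()).update(
                other for other in authors if other != author
            )

    if root not in graph:
        return {}

    # Breadth-first search visits authors in order of increasing distance,
    # so the first time we reach someone is via a shortest chain.
    distances = {root: 0}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for coauthor in graph[current]:
            if coauthor not in distances:
                distances[coauthor] = distances[current] + 1
                queue.append(coauthor)
    return distances


# Hypothetical sample data (not real Names records).
papers = [
    ["Erdős, P.", "Author A"],
    ["Author A", "Author B"],
    ["Author C"],
]
distances = collaboration_distances(papers, "Erdős, P.")
print(distances.get("Author B"))  # 2
print(distances.get("Author C"))  # None: no chain of co-authors reaches Erdős
```

Authors with no chain of co-authors back to the starting point are left out of the result, which corresponds to an undefined (or infinite) Erdős number. Of course, a rating like this is only as good as the disambiguation of the names underneath it.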
All in all, JIF2010 was a great opportunity to meet new people, learn about interesting projects, and develop practical skills for sustainability and community engagement. The blueberry muffins were really just an added bonus…
Two members of the Names Project team attended the NaCTeM/UKOLN text mining workshop in Manchester on 28-29th October. The event was an opportunity for us to find out how text mining tools have been used within the academic community and to understand their relevance to repositories and publishers, which are important stakeholders for the Names Project.
The Director of the National Centre for Text Mining (NaCTeM), Sophia Ananiadou, gave a good introduction to the event, explaining that text mining provides annotations to unstructured textual materials which allow semantic enrichment of the text, making implicit knowledge within the materials explicit. A range of perspectives on text mining was then represented, from the academic (linguistics, biology, chemistry and social science) to publishers (Elsevier and the Nature Publishing Group) and service providers (Mimas, EDINA and Microsoft Research).
A theme mentioned by Tony Hey of Microsoft was that if tools like text mining are to be taken up widely by the scientific community (and, I presume, by extension the wider academic world), then they need to be as simple to use as the Web 2.0 tools that are being widely used by general web users. This was echoed in two subsequent talks: Rafael Sidi of Elsevier (who got through an eye-boggling 180 slides in 30 minutes!) emphasised the importance of openness in encouraging innovation, and Paul Walk of UKOLN gave us the developers’ point of view, pointing out that access to data without unnecessary obstacles was essential to get the developer community to make use of services.
The closing session allowed a panel of six experts to give their view of the future of text mining, particularly in the context of institutional repositories. Areas that were seen as important were involving end-users in evaluating the effectiveness of text-mining tools (comparing results to those that can be obtained using manual methods); improving repository metadata by using automatic classification of full-text materials such as theses and papers; searching across multiple repositories; developing standards for semantically annotating materials and recording the provenance of those annotations; and capturing work-in-progress information generated by researchers that does not get formally published (e.g. laboratory workbooks recording unsuccessful experiments). One issue that (inevitably) generated a lot of discussion was the problem of getting permission to use full-text materials for text-mining purposes, given restrictions imposed by copyright laws and by publishers who put limits on annotation of their articles.
Thanks to UKOLN and NaCTeM for organising an interesting event which gave all the attendees plenty to think about and to discuss.
Dan Needham, the developer working on the Names Project, attended the Web Services and Repositories workshop that was organised by the EThOS project and held at the British Library on 2nd June.
He gave a presentation [PowerPoint format, 205KB] on the project and the aims behind the web services for the Names prototype that he’s been working on and recently testing with colleagues from Cranfield University.
UPDATE: the audio from Dan’s presentation and all the other materials from the day are now available on the EThOS site.
In the afternoon I attended the session on e-theses, which was chaired by Owen Stephens and also thoroughly blogged by him (which is quite an impressive feat). Author identities were only touched upon in passing here, but the Entry to EThOS (E2E) project at King’s College is using student record systems to populate name (and other) metadata associated with electronic theses, which sounded interesting. The overlap between the people involved in the creation of theses and those who are producing research outputs is clearly high, meaning that there will be good reasons in the near future for the Names Project to work together with those involved in managing e-theses and digitising the paper versions.
The conference has been interesting. The papers have been a real mixture of the highly theoretical and abstract and those which are more pragmatic and based on solving problems. The session which included the presentation on the Names project was a case in point – the speaker before me was explaining how she had been analysing different thesauri that covered issues relating to water, to see whether they’d be useful for her organisation (the Mexican Institute of Water Technology), while the one who came afterwards was explaining how theories in the social sciences (Marxism, liberalism and feminism, for example) might be classified.
Jenn Riley of Indiana University talked about the disconnect between theoreticians and practitioners in her excellent presentation about the Variations3 Digital Music Library project. She lamented the fact that information professionals often neglect to think about the conceptual models that their content standards (e.g. MARC and MODS) are representing. Names is using the conceptual model of FRAD as the basis of the data structure for the prototype, thanks to the work that the British Library team have undertaken in the Data Analysis Report, so I was pleased to hear Jenn’s views on this.
The questions at the end of my talk included one from a librarian who took exception to the Names project’s aim of not having a preferred form of name for individuals. I had thought that this would be more acceptable now than it had been in the past – it’s a distinction between authority control and access control that has been fairly widely discussed in the library literature, but apparently it is still controversial. Interesting!
The keynote was given by Jonathan Furner of UCLA. Jonathan talked about the philosophy of identity and the process of identifying individuals and objects. He noted that all knowledge organisation systems reflect the world-view of the designers of the system (this is unavoidable), and stated that all such systems should aim to be ‘just’ (i.e. not violate the rights of any particular group). He also felt that such systems need to be responsive and dynamic in order to adapt to the needs of users of the system. This cannot be achieved with rigid hierarchical and centrally-controlled systems alone, although these do have their place. More social approaches such as user-tagging can help to reflect other world-views. This seemed to me to fit in well with the way we are planning that the Names prototype will work. We’ll be creating records for individuals, but also encouraging them to take ownership of the information within them.