Two members of the Names Project team attended the NaCTeM/UKOLN text mining workshop in Manchester on 28-29 October. The event was an opportunity for us to find out how text mining tools have been used within the academic community and to understand their relevance to repositories and publishers, which are important stakeholders for the Names Project.
The Director of the National Centre for Text Mining (NaCTeM), Sophia Ananiadou, gave a good introduction to the event, explaining that text mining adds annotations to unstructured textual materials, allowing semantic enrichment of the text and making implicit knowledge within the materials explicit. A range of perspectives on text mining was then represented, from the academic (linguistics, biology, chemistry and social science) to publishers (Elsevier and the Nature Publishing Group) and service providers (Mimas, EDINA and Microsoft Research).
A theme mentioned by Tony Hey of Microsoft was that if text mining tools are to be taken up widely by the scientific community (and, I presume, by extension the wider academic world), then they need to be as simple to use as the Web 2.0 tools that are widely used by general web users. This was echoed in two subsequent talks: Rafael Sidi of Elsevier (who got through an eye-boggling 180 slides in 30 minutes!) emphasised the importance of openness in encouraging innovation, and Paul Walk of UKOLN gave us the developers’ point of view, pointing out that access to data without unnecessary obstacles was essential to get the developer community to make use of services.
The closing session allowed a panel of six experts to give their views on the future of text mining, particularly in the context of institutional repositories. The areas seen as important were: involving end-users in evaluating the effectiveness of text-mining tools (comparing their results with those obtained by manual methods); improving repository metadata through automatic classification of full-text materials such as theses and papers; searching across multiple repositories; developing standards for semantically annotating materials and recording the provenance of those annotations; and capturing work-in-progress information generated by researchers that does not get formally published (e.g. laboratory workbooks recording unsuccessful experiments). One issue that (inevitably) generated a lot of discussion was the problem of getting permission to use full-text materials for text-mining purposes, given the restrictions imposed by copyright laws and by publishers who put limits on annotation of their articles.
Thanks to UKOLN and NaCTeM for organising an interesting event which gave all the attendees plenty to think about and to discuss.
The British Library are recruiting an Analyst for the Names Project. The information below is taken from the job details on their recruitment site.
Location: Boston Spa, Yorkshire
Position Type: Fixed Term
Salary: £22,063 – £23,896
Fixed term appointment for 2 years
Closing date: 18 October 2009
A. Rose by any other name might be “Alex” to her friends, “Dr. Alexandra Rose” to her students, “Dr. Alexandra N. Rose” to her funders and “A.N. Rose, PhD” to her publishers; and she is not the only A. Rose. For the higher education and research communities, identification of researchers and authors is difficult. The Names 2 Project aims to develop innovative and scalable solutions to problems of identification, attribution and affiliation.
We are recruiting an Analyst to help turn this project from a concept into a service. This is a full-time, fixed-term post, funded for 2 years by JISC (Joint Information Systems Committee). Names 2 is led by Mimas, based at the University of Manchester.
The successful candidate will have excellent communication skills and work effectively to deadlines. Experience of cataloguing at a professional level, using internationally recognised standards, is essential. First-hand knowledge or experience of institutional repositories or authority control will be an advantage. The post holder will work as part of a distributed project team.
For an informal discussion about this role please contact Alan Danskin on 01937 546669.