The source code for the Names project’s disambiguation service and user front end is now available from the Bitbucket code-sharing service. The various components are:
The user interface to the Names data has been updated to use the code available through Bitbucket. Here’s an example of a Names record in the new web view of the data:
The Names project came to an end in July 2013 – the final report [PDF, 525KB] is available here. The conclusions of the report were:
- The Names project has demonstrated that automated or semi-automated solutions can be applied to bulk-process complex authority control tasks traditionally undertaken by cataloguers on an item by item basis. This approach offers the potential to extend authority control to types of resource, such as journal articles, which have previously been neglected on grounds of cost.
- The quality of the outcome is directly affected by the range and quality of the metadata available. Publishing conventions, such as use of initials rather than full names, hinder accurate identification and comprehensive disambiguation of individuals. Human intervention is still necessary, but filtering enables the human intervention to be focused on ambiguous and anomalous identities.
- Retrospective author disambiguation is complex and costly, even when partially automated, and should be regarded as the solution to a legacy problem rather than the preferred way forward. The Names database and the components of the Names system are resources which can be used by other services to improve their own efficiency.
- Integration between national systems such as Names and international services like ISNI is possible, with the national system offering the opportunity of liaising with institutions to feed data into the international level and with the potential for saving the research community the fees for institutional membership of ORCID and the registration agency costs for ISNI. Further investment in Names would be required to establish an automatic updating mechanism between the Names system and ISNI and/or ORCID.
- The major achievements of the Names project have been the development of the disambiguation algorithm and the quality assurance process for the resulting data. These have enabled the creation of a useful set of information in the Names database which offers free and flexible access to its contents. By making the database structure, the data, and the disambiguation algorithm available through a code-hosting service, it will be possible for other services to make use of these elements in the future. It should be noted that the quality assurance expertise provided by the Names project team is not something that can be made available externally.
As we wind up the project, I would like to acknowledge the huge amount of work that Dan Needham at Mimas has put into developing this code and into sharing it so that others can benefit from his expertise in this area. Many thanks also to our colleagues at the British Library: Alan Danskin, Stephen Andrews, Michael Docherty, Alison Wood, Richard Moore, Susan Skaife, Jasper Jackson and Andrew MacEwan, whose time and efforts contributed to the success of Names, particularly in the development of a data model and in the quality assurance of data. They also helped to ensure that the results of the project live on in the form of ISNI identifiers for many UK researchers.
Last month the work of the NISO I2 (Institutional Identifier) group culminated in the publication of a NISO Recommended Practice document entitled Institutional Identification: Identifying Organizations in the Information Supply Chain [PDF]. The I2 group was established in 2008 with the task of looking at the issue of uniquely identifying institutions and other organizations.
From the report:
As the digital information landscape grows increasingly crowded and customized, and as institutions achieve economies of scale through increased collaboration, the need to unambiguously identify organizations engaged in any aspect of information acquisition, supply, archiving, and discovery becomes a critical enabler for efficient and trustworthy information practices.
The use of the International Standard Name Identifier (ISNI) (ISO 27729) for institutional identification is recommended to achieve both of these goals.
Of the 157 UK research institutions currently identified within Names, 93 also already have ISNIs (in other words they were already identified as creators or publishers in library systems which have contributed data to ISNI). We have now added those ISNIs to the Names records of those institutions and will be requesting identifiers for the remaining 64 institutions in the coming weeks.
Our aim by the end of the current phase of the project (July 2013) is to have ISNIs assigned to all of the individuals and organisations identified within Names. ISNI disambiguates and assigns unique identifiers to institutions and individuals internationally. Where ORCID provides a service for individuals to identify themselves, ISNI relies on data from third parties and combines it to create merged records. This means that, in contrast to ORCID, it can include records for organisations and for individuals who may be unable (or unwilling) to manage an online profile.
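For anyone storing these identifiers alongside their own records, a quick sanity check is possible without contacting either service: ISNI and ORCID identifiers are both 16 characters long and end in a check character calculated with ISO 7064 MOD 11-2. The sketch below is illustrative only and is not taken from the Names code base; the function names are made up for this example, and it checks only that an identifier is well formed, not that it has actually been assigned to anyone.

```python
def check_character(base_digits: str) -> str:
    """Return the ISO 7064 MOD 11-2 check character for the 15 base digits."""
    total = 0
    for ch in base_digits:
        # Add each digit to the running total, then double it.
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A result of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_well_formed(identifier: str) -> bool:
    """Check length and check character; hypothetical helper, not a lookup."""
    compact = identifier.replace(" ", "").replace("-", "").upper()
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return check_character(base) == compact[15]


if __name__ == "__main__":
    # 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
    print(is_well_formed("0000-0002-1825-0097"))
    # Changing the final character should make the check fail.
    print(is_well_formed("0000-0002-1825-0098"))
```

A check like this catches most transcription errors, such as a mistyped or transposed digit, before an identifier is added to a record.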
The latest issue of Information Standards Quarterly (ISQ) is devoted to the topic of identifiers for people and organisations. There are featured articles on ISNI and ORCID, an update from the NISO I2 group and one from the Names Project.