Archive for the ‘Uncategorized’ Category

CLOCK notes – 8 May 2012

Posted on May 8th, 2012 by Paul Stainthorp

This is what the CLOCK project team are currently up to (from meetings over the past couple of weeks and from notes made at the recent Discovery: making sure your resources are discovered, used and reused event in Birmingham):

  • Andrew Beeken has been exploring the Cambridge COMET data via its SPARQL endpoints and has already blogged about the process of using SPARQL to “build kind of a ‘Hello World’ of open data querying”. He’s now looking at the recently-released Harvard open bib data and comparing the speed, the use of matching namespaces, and the use of JSON vs RDF/XML.
  • This work is leading up to unified search and presentation of records from several sources (Cambridge/COMET, Harvard, Lincoln/Jerome, OpenLibrary, etc.). Andrew and Trevor Jones are collaborating on drawing up a high-level architecture for CLOCK, and a strategy for expressing Linked Data, which will be shared with the rest of the project team (and publicly) for discussion.
  • To support this, Alex Bilbie in ICT services at Lincoln is helping to get the original Jerome application up and running on the CLOCK server (jerome.library.lincoln.ac.uk), where it can be used as a stable platform for developing and RDF-ifying Lincoln’s own bib data.
  • Trevor Jones and Ed Chamberlain will work together on developing the work with users (in parallel, at the University of Lincoln and the University of Cambridge) to clarify their requirements for bibliographic data:
    • For cataloguers, based around a rethink of copy cataloguing workflows, we will try to tease out requirements from talking to cataloguers (and associated subject librarians) asking to be ‘positively disrupted’: what do they need to do? What is missing from their data?
    • For researchers, we will build on some initial user walkthrough analysis done by Trevor and Andrew in Lincoln, with performing arts students in LPAC (the Lincoln Performing Arts Centre). What are the research questions that users are trying to answer? How does bib data help them answer those questions? What’s missing? Ed and Trevor will agree on a set of questions and tasks;
    • These requirements will be used to feed the remaining cycles of platform development for CLOCK.
  • Ed Chamberlain will act as the conduit between CLOCK and related projects in the Discovery strand, looking for points of shared interest/technology, and blogging (or asking others to blog) about aspects of one project which can inform the others. The other projects in which Ed is involved are: the Open Education Metadata UK (OEM-UK) project at the Institute of Education (shared interest in new user interfaces for cataloguing – possibly use screencasts to demonstrate alternative workflows?) and the Open Bibliography 2 project (lots of potential technical overlap – BibJSON, JSON-LD, BibSoup.net, expression in RDF container formats).
  • Ed and I (Paul Stainthorp) will work on developing the ‘business case’ / sustainability of CLOCK and data.*.ac.uk, following up on themes discussed in the recent Discovery event, and thinking not only about institutional funding / high-level support for open bib data, but also what it takes to move open bib data publishing from a development environment into an institutionally-supported, ICT-run service.
  • Finally, PS is arranging a couple of internal CLOCK ‘hack days’ (to take place on 17th-18th May, in Cambridge) – more details to follow.
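
As a rough illustration of the ‘Hello World’ style of SPARQL querying mentioned in the first bullet above, here is a minimal Python sketch. It only builds the HTTP request and does not send it. The endpoint URL is a placeholder (an assumption, not the real COMET or Harvard endpoint), and the query simply asks for any ten triples.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint, for illustration only: substitute the real SPARQL
# endpoint of whichever dataset you want to query.
ENDPOINT = "http://example.org/sparql"

# A 'Hello World' query: return any ten triples held in the store.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Build (but do not send) a GET request carrying the query in the
    standard 'query' parameter and asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query.strip()})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


req = build_sparql_request(ENDPOINT, QUERY)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would run the query against a live endpoint. Changing the `Accept` header (for example to `application/sparql-results+xml`) is one way to compare JSON and XML result formats.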

Notes on last week’s KB+ meeting

Posted on May 8th, 2012 by Paul Stainthorp

As I start to understand the aims of JISC Collections’ KB+ (KnowledgeBase+) project a bit better, it’s starting to seem more and more relevant to the real-life problems of e-resources management. At last week’s meeting of the Technical Advisory Group, here were the things I found particularly interesting:

  • The proposed database model for journal package data, which does a neat job of distinguishing between the various ‘layers’ of ERM (in allowing data to be recorded separately for the issue, title, package and platform involved in a particular subscription deal);
  • The proposed links with the GOKb project in the USA, including the possibility (and it’s only a possibility at present) for sharing/co-designing data import processes; and the aims of the GOKb project itself in building and publishing collaboratively-maintained journal package data openly for the Kuali Open Library Environment;
  • The plans for live user testing of the first KB+ data release later in May, which will include e-resources librarians from 10 institutions getting their hands on the data and initial UI. This seems like a really useful and rare opportunity to do some near-real-world testing with groups of experts in the field of ERM. (N.B. this first group of 10 users doesn’t include the University of Lincoln – but I’ve asked Liam Earney if we could have ‘observer status’!);
  • The interesting questions (raised by Owen Stephens) around the complexities involved in representing overlapping journal package deals to e-resources managers – how will the librarians react to having their assumptions (and their mental model of what a journal ‘deal’ is) … challenged? My gut instinct is that we ought to want to know the underlying detail of multiple access rights in a single journal package – to dispel the ‘myths’ that might have grown up about our holdings over several years, even if it makes things look more complicated than we thought they were. (Naturally we need a way of re-presenting / simplifying this complexity to our users.)

I’ll continue to make notes about KB+ on this blog.

Imminent domain

Posted on May 4th, 2012 by Paul Stainthorp

With various new services arising out of the ongoing Library ICT systems review, we’re amassing a nice little collection of library-related 2nd-level subdomains. Here’s a list, which I’ll edit as they become live.

  1. http://library.lincoln.ac.uk/ (i.e. the ‘bare’ library subdomain: this isn’t used at the moment, but we intend that it will become the Library’s ‘root’ web presence)
  2. http://www.library.lincoln.ac.uk/ (currently used for our SirsiDynix Horizon Information Portal OPAC, which we intend to move to catalogue.library… in order to free up www for our web pages hosted on WordPress)
  3. http://catalogue.library.lincoln.ac.uk/ (the future home of the library catalogue)
  4. http://findit.library.lincoln.ac.uk/ (a launch point for our new Discovery system, still to be announced, and with a name yet to be decided!)
  5. http://lists.library.lincoln.ac.uk/ (Talis Aspire reading lists, currently being developed)
  6. http://archives.library.lincoln.ac.uk/ (Axiell Calm archives and special collections software)
  7. http://jerome.library.lincoln.ac.uk/ (Jerome is our innovation platform and a home for experimental search services, being re-developed as part of the CLOCK project)
  8. http://auth.library.lincoln.ac.uk/ (OpenAthens LA v2.1 authentication software)
  9. http://proxy.library.lincoln.ac.uk/ (EZProxy authentication software)

We also have two core systems which aren’t on the library subdomain:

  1. http://eprints.lincoln.ac.uk/ (the Lincoln Repository on EPrints – it’s appropriate that this isn’t on library, as we’ve always managed the Repository as a shared/collaborative project between CERD, ICT services, the Library, and the Research Office)
  2. http://ill.lincoln.ac.uk/ (CLIO inter-library loans software)

RefWorks citation output styles for the University of Lincoln – added IEEE

Posted on May 4th, 2012 by Paul Stainthorp

At the request of the School of Engineering, we have added a new citation output style to the ‘University of Lincoln Specific‘ list of styles preferred and/or supported at the University of Lincoln.

The IEEE (Institute of Electrical and Electronics Engineers) citation reference style is a broadly-recognised format for writing research papers in technical fields, including computer science as well as engineering.

Screenshot of RefWorks create bibliography options

It’s now available to select within RefWorks’ “Create Bibliography” menu, as well as in the Write-N-Cite application. The list of Lincoln-specific output styles now consists of five options:

  1. APA (American Psychological Association) style, used in psychology;
  2. Harvard (University of Lincoln) – a generic version of Harvard created by the Library which you may have to modify using the Output Style Editor to meet the preferred referencing style for your course;
  3. IEEE, commonly used in engineering;
  4. ISO 690 numeric style, which is permitted as an alternative to Harvard by some subjects;
  5. MLA (Modern Language Association) style, used in some humanities subjects.

For help with referencing style and with using RefWorks, contact your subject librarian or email: RefWorks@lincoln.ac.uk

KB+ TAG meeting

Posted on May 3rd, 2012 by Paul Stainthorp

I’m in London today for a meeting of the Technical Advisory Group (TAG) of the JISC KnowledgeBase+ (KB+) project.

KB+ is an ambitious project to create a “shared service knowledge base for UK academic libraries to support the management of e-resources by the UK academic community”. Project leader Liam Earney blogged recently about what KB+ ought to look like on ‘day one’ (1 September 2012). It’s quite an impressive list of features. The KB+ blog is at: knowledgebaseplus.wordpress.com

I’m particularly interested in the project because of the overlap with our own internal Discovery selection & implementation work, as part of which we’re reviewing our serials acquisitions and ERM procedures, looking for simplification and efficiency/automation wherever possible. Liam’s blog post on the possible future impact of KB+ is worth a read here. Sample quote:

“The benefits of focusing on the data is that the Knowledge Base+ service will ‘add value’ to a whole range of other local databases, ERMs, link resolvers and knowlegebases[…]”

I’ve written in the past about the difficulties we have at Lincoln (difficulties which appear to be shared by most academic libraries) in reconciling data provided by publishers/e-journal platform providers with what exists in commercial knowledgebase software (such as Lincoln’s current EBSCO A-to-Z service), and with what we think we should be entitled to based on our subscription agreement! So many journal subscription packages are common to lots of libraries, if not standard across the whole of the UK – it seems obvious to centralise this information.

One of the functions of the TAG is to: “Provide advice and guidance on the technical architecture, infrastructure, software, standards and tools to be adopted and implemented by the project”

As part of that, I’ve been reading up on the KBART (Knowledge Bases And Related Tools) guidelines (uksg.org/kbart), which provide a useful framework for understanding how ERM data should propagate through library systems. Key quote: with “small adjustments to the format of their title lists, content providers can greatly increase the accessibility of their products”. This is certainly true. We waste a lot of time formatting and re-formatting publisher data to make it fit our knowledgebase.
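
To give a flavour of what a consistently formatted title list makes possible, here is a minimal Python sketch that reads a tab-delimited list and checks for a handful of KBART-style column names. The column names are my understanding of the KBART field names (check them against the guidelines), the choice of which columns to check is arbitrary, and the sample journal, ISSNs and URL are invented.

```python
import csv
import io

# A small subset of KBART-style column names, chosen for this sketch only;
# the guidelines define the full list.
CHECKED_FIELDS = ["publication_title", "print_identifier",
                  "online_identifier", "title_url"]

# Invented sample data: not a real journal, ISSN or URL.
SAMPLE_TITLE_LIST = (
    "publication_title\tprint_identifier\tonline_identifier\t"
    "date_first_issue_online\ttitle_url\n"
    "Journal of Examples\t1234-5678\t8765-4321\t"
    "1997-01-01\thttp://example.org/joe\n"
)


def read_title_list(text):
    """Parse a tab-delimited title list.

    Returns the rows (as dictionaries keyed by column name) and a list of
    any checked columns that are missing from the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    header = reader.fieldnames or []
    missing = [name for name in CHECKED_FIELDS if name not in header]
    return list(reader), missing


rows, missing = read_title_list(SAMPLE_TITLE_LIST)
print(rows[0]["publication_title"], missing)
```

A publisher list with its own column headings (say `Title` and `ISSN`) would come back with every checked column reported as missing, which is exactly the kind of re-formatting work a shared format avoids.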

The University of Lincoln Library is on Twitter

Posted on May 2nd, 2012 by Paul Stainthorp

Follow @GCWLibrary for updates.

Screenshot of the Library on Twitter

The technical approach: a CLOCK dev stack

Posted on May 2nd, 2012 by Paul Stainthorp

A note on technical development:

We’re beginning to make some progress towards a framework for development in the CLOCK project. Project developers Trevor Jones and Andrew Beeken, with the support of the other developers in LNCD, now have the following at their fingertips:

That list should give you an idea of LNCD’s approach to development. [N.B. some links may not be publicly accessible.]

CLOCK implementation: key themes (the Peterborough meeting)

Posted on May 2nd, 2012 by Paul Stainthorp

Screengrab of our notes from the CLOCK Peterborough meeting

This blog post is a comment upon the formal project implementation plan, and gives some more detail about how the CLOCK project intends to meet its project aims.

In February, 2012, the project team (EC, CL, PS, OS) met at Peterborough Regional College (roughly equidistant between Lincoln and Cambridge!) to discuss the implementation plan and our CLOCK ‘first steps’. We made copious notes using an interactive whiteboard. Here’s what we agreed for CLOCK…

Most of the day’s discussion was spent attempting to define more clearly the users/audience for CLOCK, narrowing down the field of study a bit as we went along, and looking for potential ways to engage those audiences in the research. We agreed that our users consist of:

1. Cataloguers and library managers looking to innovate their resource description workflows as well as contribute to the corpus of Open Bib Data, through improving/correcting/augmenting existing records as well as submitting new records, “adding to the story” by allowing libraries to incorporate data elements outside the boundaries of traditional resource description.

We spent a while discussing how the project might approach the problem of proposing new “…minimal workflows for cataloguing around individual, disaggregated RDF elements” (taken from the project plan). We’ve also since discussed this back at Lincoln with staff in the Library and LNCD – I’ll shortly be blogging some diagrams which illustrate several different possible approaches to cataloguing workflow, as part of the ‘Users and use cases’ thread. We’ll also be speaking to cataloguers at Lincoln and at Cambridge to try and get a clearer picture of the ‘pinch points’ in existing cataloguing, where applications using OBD might make a difference to their work.

Key quotes:

“Matching / negotiating of the best available Open bib data through common identifiers; the importance of a social/reputational aspect in identifying authoritative data; [use of] associated social/reputational metadata making explicit the provenance, history, and ‘pagerank’ measurements of each data element. [The phrase 'a narrative verdict on the catalogue record' was used…]“

2. Researchers (qualified as “the ‘serious’ and tech-savvy researcher“), who may be keen to incorporate Open Bib Data in user tools (e.g. citation/reference management software). We agreed to concentrate within the CLOCK project on a specific discipline—that of Drama/Performing Arts—because of the interesting challenges posed by the description of performance resources in existing bibliographic data. (“Almost anything you’d want to know about a play isn’t recorded in the MARC record!”). We identified a number of potentially useful resources and sources of data, including:

  • The play’s the thing
  • TheatreDB
  • Resources in institutional repositories
  • Theatricalia
  • Dutch Culture Link
  • Wikipedia/DBpedia

We agreed that we’ll set up a series of interviews/structured tasks for researchers in performing arts at Cambridge and Lincoln; also for subject librarians in the discipline (as a proxy to the researchers themselves). CLOCK will look at how well existing catalogue data describes performance and related resources (perhaps by sampling MARC records at both institutions), and how external sources of ‘non-library’ data might complement and enhance those records.

3. Developers attached to academic libraries, who are looking to build applications exploiting available Open Bib Data, and techniques for interrogating and exploiting that data. The engagement with this audience is probably more at a strategic level than the first two – what are the technology choices and the decisions around the design of APIs and data endpoints – can we make a case study on developing using OBD?

We also discussed CLOCK’s overlap with other projects (in particular the Open Biblio 2 and the Open Education Metadata UK projects). This work will be picked up by Ed Chamberlain, who is a common factor in all three projects!

“The project team believe that an important aspect of this innovation will be serious consideration given to the development of an awesome, national, open scholarly catalogue knowledgebase for the UK (“data.ac.uk/library” or “library.data.ac.uk”).”

Members of the CLOCK project team have since signed up to the new DATA-AC-UK mailing list and we will use the project as an opportunity to propose first steps in publishing national bibliographic data to data.ac.uk. This will be the topic of a future blog post.

“CLOCK will explore options for updating and maintaining the shared platform on data.lincoln.ac.uk as an eventual service”

University of Lincoln developer Alex Bilbie has blogged about the future of 5★ open data publishing at Lincoln: “As part of the Jerome project, we cracked open the university library’s digital catalogues and stored the data in a sane format (i.e. not MARC). Now through the CLOCK project the data will be semantically marked-up and compatible with other institutions bibliographic data“. This will also be the topic of a future blog post.

Return of the Mash

Posted on April 27th, 2012 by Paul Stainthorp

As I write this, there are just 8 tickets left for #Mashcat, the next Mashed Library event taking place on 5 July 2012 in Cambridge (…and the first mashlib since Pancakes and Mash in Lincoln a year ago? – silly me, I forgot about #ChrisMash). Because of the topic and the location, this is a particularly interesting one for the CLOCK project. Mashcat is:

A mashed library event focussing on cataloguing data. For cataloguers, developers and anyone else with an interest in how library catalogue data can be created, manipulated, used and re-used by computers and software. It will be an invaluable opportunity for cataloguers, developers and others to meet and share knowledge, thoughts, and ideas. Possible topics participants could explore on the day include the principles behind the data, tools and code for working with it, and real examples of work on bibliographic data.

Mashcat is a free one day event, which is supported by DevCSI. After refreshments, the first session will start at 10am.

For more information, see http://www.mashcat.info, email us at info@mashcat.info, contact @orangeaurochs and follow the hashtag #mashcat on Twitter.

On the “Z” list

Posted on April 26th, 2012 by Paul Stainthorp

Tshirt "Bad Decision Mr Z"

My colleagues in e-Library Services at Lincoln have been spending the last few weeks updating our Library Management System (LMS) – SirsiDynix Horizon. This work included upgrading from v7.34 to v7.51b of the Horizon software itself (and from v3.08 to v3.21 of our library catalogue HiP) as well as moving Horizon off an internal Lincoln server to external SaaS, and re-connecting all the associated systems (access control; Keystone; the 2CQR Lucid self-service touchscreen software, etc.).

We’ve also changed our connection details for remote searching of our library catalogue via the Z39.50 protocol. Our new Z39.50 URL is z3950s://www.library.lincoln.ac.uk:210/lincoln (replacing the old z3950s://194.80.48.4:210/horizon).
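
Most Z39.50 clients ask for the host, port and database name as separate settings rather than as a single URL. As a minimal sketch, assuming only Python’s standard library, the two URLs above can be split into those parts like this (the function name is made up for the example):

```python
from urllib.parse import urlparse


def split_z3950_url(url):
    """Split a z3950s:// style URL into the host, port and database name
    that a Z39.50 client typically asks for as separate settings."""
    parts = urlparse(url)
    return {
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


new_target = split_z3950_url("z3950s://www.library.lincoln.ac.uk:210/lincoln")
old_target = split_z3950_url("z3950s://194.80.48.4:210/horizon")
print(new_target)
print(old_target)
```

The port (210) is unchanged between the two; what has changed is the host and the database name.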

A couple of people on Twitter asked why I was bothering. Z39.50 is a national and international (ISO 23950) standard defining a protocol for computer-to-computer information retrieval – and is pretty much the definition of dinosaur library tech:

But secretly I ❤ Z39.50. Also, a few services listed below—most notably RefWorks—still make use of it. The full details of our new Z39.50 setup are:

And here’s a very short list of registries and services that list and/or make use of our Z39.50 profile. I’ll add to this list if any more come to light.

  1. Copac, the UK union catalogue, uses Z39.50 in order to include results from Lincoln in Copac “@yourlibrary” searches.
  2. IESR, the MIMAS-run free and machine-readable catalogue of electronic resources.
  3. RefWorks’ “Search Online Catalog or Database” feature uses Z39.50 to import results from our catalogue. (We also list a small number of e-databases that can be searched via Z39.50 in RefWorks – I wonder if anyone uses these?)
  4. The Library of Congress‘s Z39.50 gateway list of library catalogues accessible via Z39.50.

For testing Z39.50 in the past, I have used the free-to-download Mercury Z39.50 Client from Basedow Information Systems. Other client software is available.