Posts Tagged ‘Jamie Mahoney’

1.8 million library loans from the University of Lincoln under CC0 – Copac Activity Data/SALT2 project

Posted on May 16th, 2012 by Paul Stainthorp

Today we published data on approximately 1.8 million items loaned from the University of Lincoln’s libraries since 2001. The data is available to re-use under a CC0 licence, and can be downloaded from:

We’ve done this as part of our involvement in the Copac Activity Data Project, a.k.a. SALT2. Along with data from the universities of Manchester, Sussex, Cambridge and Huddersfield, our circulation data will be used to power a ‘recommender API’, which libraries will be able to use to build “People who borrowed X also borrowed Y”-type services. The API will benefit from the power of aggregated data from multiple institutions of different types, containing tens of millions of circulation events.
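To give a feel for what a “People who borrowed X also borrowed Y” service does with circulation data, here is a minimal sketch in Python. It is not the SALT2 recommender itself, whose internals aren't described here. It simply counts how many borrowers two works have in common, and the function names (`build_co_borrow_counts`, `also_borrowed`) are made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_borrow_counts(loans):
    """Count, for each pair of works, how many borrowers took out both.

    `loans` is an iterable of (borrower_id, work_id) pairs.
    """
    # Collect the distinct works each borrower has taken out.
    works_by_borrower = defaultdict(set)
    for borrower_id, work_id in loans:
        works_by_borrower[borrower_id].add(work_id)

    # Every pair of works on one borrower's record counts as one co-borrow.
    co_counts = defaultdict(Counter)
    for works in works_by_borrower.values():
        for a, b in combinations(sorted(works), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return co_counts


def also_borrowed(co_counts, work_id, limit=5):
    """Return up to `limit` (work_id, shared_borrowers) pairs, most shared first."""
    return co_counts.get(work_id, Counter()).most_common(limit)


# Toy data: two borrowers, three works.
loans = [("u1", "A"), ("u1", "B"), ("u2", "A"), ("u2", "B"), ("u2", "C")]
co_counts = build_co_borrow_counts(loans)
suggestions = also_borrowed(co_counts, "A")
```

With raw counts like these, very popular works tend to dominate the suggestions, which is one reason a real service would probably want more data (and some weighting) than a single library can supply.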

You’ll notice as well that we’ve chosen to host the data on our brand-new Orbital (v0.1) research data management application. Each dataset has a persistent citable URI. We’ll be keeping the data up-to-date, and generating a new activity data file from our library circulation logs shortly after the end of each academic year.

The data consists of a number of CSV files (one for each academic year since 2000-01, plus a huge file of all the data), containing the following fields:

Field index | Field name | Description
0 | CREATE_DATE | The date and time of the loan event, in the format: dd/mm/yyyy hh:mm
1 | BORROWER_ID | A cryptographic hash of the internal system ID associated with the borrower of the item, as used in the University of Lincoln’s library system.
2 | WORK_ID | A cryptographic hash of the internal system ID associated with the bibliographic work borrowed, as used in the University of Lincoln’s library system.
3 | CONTROL_NUMBER | The ISBN of the work borrowed (10 or 13 digits).
4 | AUTHOR_DISPLAY | The main author of the work borrowed.
5 | TITLE_DISPLAY | The title of the work.
6 | PUB_DATE | The publication year of the work in the form: yyyy
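If you want to load the files, something like the following Python sketch should be a reasonable starting point. It assumes the files are comma-separated with standard CSV quoting and have no header row. Neither is confirmed here, so check a file first (there is a `has_header` switch just in case). The `Loan` type and `read_loans` function are illustrative names, not part of the dataset.

```python
import csv
import io
from datetime import datetime
from typing import NamedTuple


class Loan(NamedTuple):
    created: datetime   # field 0, CREATE_DATE
    borrower_id: str    # field 1, BORROWER_ID (hashed)
    work_id: str        # field 2, WORK_ID (hashed)
    isbn: str           # field 3, CONTROL_NUMBER
    author: str         # field 4, AUTHOR_DISPLAY
    title: str          # field 5, TITLE_DISPLAY
    pub_year: str       # field 6, PUB_DATE (kept as text in case it is blank)


def read_loans(lines, has_header=False):
    """Yield one Loan per CSV row, skipping rows with fewer than 7 fields."""
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)
    for row in reader:
        if len(row) < 7:
            continue
        created = datetime.strptime(row[0], "%d/%m/%Y %H:%M")
        yield Loan(created, *row[1:7])


# A made-up row in the documented field order.
sample = io.StringIO(
    '16/05/2012 09:30,abc123,def456,9780141439518,"Austen, Jane",Pride and prejudice,2003\n'
)
loans = list(read_loans(sample))
```

For a real file, pass an open file object instead of the `StringIO`, e.g. `open(path, newline="", encoding="utf-8")`. The encoding is also an assumption.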

I’ll blog in detail another time about exactly how we created the data extracts. In short:

  1. There is a table in the SirsiDynix Horizon library management system called circ_tran which records every instance of item number X borrowed by user number Y at time Z. [#1]
  2. There is another table which provides a lookup between item numbers and the numbers of the bibliographic works of which they are a copy. [#2]
  3. Dave Pattern at the University of Huddersfield wrote a Perl script which scrapes all the bibliographic data (title, author, ISBN) for each work from our OPAC (Horizon Information Portal) and writes it to a text file. [#3]
  4. Developer Jamie Mahoney of CERD/LNCD then stepped in, using some pretty heavy SQL on the original 3 data extracts, to:
    • Hash the internal Horizon user and work ID numbers to provide anonymity;
    • Convert the internal Horizon date and time stamps in extract [#1] from a version of Unix time into a readable datestamp (formula hint: cko_date*86400 + cko_time*60);
    • Use the item/work lookup table [#2] to pull in the bibliographic details for each loan in [#1] from the bibliographic table [#3] (an epic SQL JOIN query), removing items which are no longer represented in our library system;
    • Remove any items without an ISBN, which are of no use to the SALT recommender API;
    • Tweak the punctuation and formatting;
    • Split the data into separate files for each year.
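The real work was done in SQL, but the steps above can be sketched in Python to show the logic. Treat this as an illustration under assumptions rather than the actual queries: only `cko_date` and `cko_time` come from Horizon, and the formula hint implies they hold days since 1 January 1970 and minutes since midnight. The other field names, the salted SHA-256 hash, reading the times as UTC, and an academic year starting on 1 August are all guesses made for the example.

```python
import hashlib
from collections import defaultdict
from datetime import datetime, timezone

SALT = "replace-with-a-secret"  # hypothetical; the real hashing scheme isn't stated


def horizon_to_datestamp(cko_date, cko_time):
    """Apply cko_date*86400 + cko_time*60 and format as dd/mm/yyyy hh:mm."""
    seconds = cko_date * 86400 + cko_time * 60
    # Assumption: no timezone shift is applied to the Horizon values.
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%d/%m/%Y %H:%M")


def hash_id(internal_id):
    """One-way hash of an internal ID (assumed here to be salted SHA-256)."""
    return hashlib.sha256(f"{SALT}:{internal_id}".encode("utf-8")).hexdigest()


def build_extract(circ_rows, item_to_work, bib_by_work):
    """Join loans [#1] to works [#2] and bibliographic data [#3]."""
    for row in circ_rows:
        work = item_to_work.get(row["item_id"])
        bib = bib_by_work.get(work)
        # Drop loans whose item or work has gone, or which have no ISBN.
        if bib is None or not bib.get("isbn"):
            continue
        yield {
            "CREATE_DATE": horizon_to_datestamp(row["cko_date"], row["cko_time"]),
            "BORROWER_ID": hash_id(row["borrower_id"]),
            "WORK_ID": hash_id(work),
            "CONTROL_NUMBER": bib["isbn"],
            "AUTHOR_DISPLAY": bib["author"],
            "TITLE_DISPLAY": bib["title"],
            "PUB_DATE": bib["pub_year"],
        }


def academic_year(datestamp, start_month=8):
    """Label a datestamp like '2011-12' (assumes the year starts on 1 August)."""
    d = datetime.strptime(datestamp, "%d/%m/%Y %H:%M")
    start = d.year if d.month >= start_month else d.year - 1
    return f"{start}-{(start + 1) % 100:02d}"


def split_by_year(rows):
    """Group extract rows by academic year, ready to write one file per year."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[academic_year(row["CREATE_DATE"])].append(row)
    return by_year


# Toy extracts: one good loan, one without an ISBN, one for a withdrawn item.
circ = [
    {"item_id": 1, "borrower_id": 42, "cko_date": 15476, "cko_time": 570},
    {"item_id": 2, "borrower_id": 42, "cko_date": 15476, "cko_time": 600},
    {"item_id": 9, "borrower_id": 7, "cko_date": 15476, "cko_time": 610},
]
item_to_work = {1: 100, 2: 200}
bib_by_work = {
    100: {"isbn": "9780141439518", "author": "Austen, Jane",
          "title": "Pride and prejudice", "pub_year": "2003"},
    200: {"isbn": "", "author": "Anon", "title": "Pamphlet", "pub_year": "1990"},
}
extract = list(build_extract(circ, item_to_work, bib_by_work))
files = split_by_year(extract)
```

Doing the join with dictionaries keeps the sketch short. On several million rows, the SQL JOIN is the more sensible tool.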

Once again, the data is at:

Thanks are due to Chris Leach and Dave Pattern for Horizon-fu, and to Jamie Mahoney for his patient wrangling of several million lines of data!

You can find out more about the Copac Activity Data Project/SALT2 at: http://copac.ac.uk/innovations/activity-data/

Smartening up the catalogue for September

Posted on August 4th, 2011 by Paul Stainthorp

We’re making a few changes to the home page of our library catalogue in time for the new academic year. Changes include:

  • Reduced ‘tabset’ browsing to only the most important elements of the catalogue.
  • Use of the newest version of the University’s Minerva logo and colour scheme.
  • Home page used for ‘top 10’ (…ish) links to Library services elsewhere on the web – these are served up using an RSS feed via Feed2JS (so that we can display the same links in other environments such as Blackboard). All placed in one of HiP’s lovely XSL stylesheets.

Very many thanks to the new LNCD intern Jamie Mahoney for help with styling this!

Here’s the current, ‘old’ front page:

Screenshot of the old catalogue home page

And here’s the new, redesigned page – still in development!

Screenshot of the new catalogue home page

You can have a look at it, if you like, at:

This isn’t intended as a long-term solution for the question of the Library’s web presence. There’s a lot more we need to do to consolidate and simplify the information we present to users across different environments (open web, intranet/Portal, Blackboard VLE, etc.). But it’s a good short-to-medium-term fix which makes the most of the tools we have available at the moment, and recognises the value of establishing www.library.lincoln.ac.uk as the home of our ‘primary’ presence on the web. If nowt else, that’s the address we’re printing on our induction materials :-)

We also had to work out a way of testing this on one of our public-access OPAC kiosks. I was particularly proud of this little MARC hack which allowed us to navigate to the test version of the home page without having to use the browser navigation bar (which is disabled on the kiosks).