Posts Tagged ‘activity data’

EMALINK seminar on activity data and the LIDP

Posted on June 15th, 2012 by Paul Stainthorp

Here are the details of a workshop I’m speaking at in July. It relates to the University of Huddersfield-led Library Impact Data Project (LIDP), in which Lincoln and DMU participated last year.

EMALINK Seminar

Library Impact Data Project

De Montfort University
Kimberlin Library
Lecture Theatre, 00.11
Tuesday 17th July 2012
2.00 – 4.00 pm
(Light refreshments available from 1.45pm)

The JISC Library Impact Data Project demonstrated a statistically significant correlation between library usage and student attainment. Two universities in the region, De Montfort and Lincoln, participated in the project and will present on their approaches to the collection of library activity data and the analysis and dissemination of the results. There will also be an opportunity for participants to discuss the practicalities and value of gathering and using such data, within our libraries and the wider institution.

Session leaders:
Phil Adams, Senior Assistant Librarian, De Montfort University

Marie Letzgus, Senior Assistant Librarian, De Montfort University

Paul Stainthorp, Electronic Resources Librarian, University of Lincoln

Contact your EMALINK representative to book a place by Wednesday 11th July. There are three places available per institution.

The DMU campus is a 15-20 minute walk from Leicester train station. Limited visitor parking is available on campus – please advise your EMALINK rep on booking if you wish to request a parking space. Campus maps available from: www.dmu.ac.uk/about-dmu/how-to-find-us.aspx

1.8 million library loans from the University of Lincoln under CC0 – Copac Activity Data/SALT2 project

Posted on May 16th, 2012 by Paul Stainthorp

Today we published data on approximately 1.8 million items loaned from the University of Lincoln’s libraries since 2001. The data is available to re-use under a CC0 licence, and can be downloaded from:

We’ve done this as part of our involvement in the Copac Activity Data Project, a.k.a. SALT2. Along with data from the universities of Manchester, Sussex, Cambridge and Huddersfield, our circulation data will be used to power a ‘recommender API’, which libraries will be able to use to build “People who borrowed X also borrowed Y”-type services. The API will benefit from the power of aggregated data from multiple institutions of different types, containing tens of millions of circulation events.
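To give a flavour of what “People who borrowed X also borrowed Y” means in practice, here is a minimal sketch in Python. It is not the SALT recommender itself (I don’t know how that is implemented); it simply counts how often two works share a borrower. The function name `co_borrowing` and the toy borrower and work identifiers are made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_borrowing(loans):
    """For each work, count the other works taken out by the same borrowers.

    `loans` is an iterable of (borrower_id, work_id) pairs.
    """
    works_by_borrower = defaultdict(set)
    for borrower, work in loans:
        works_by_borrower[borrower].add(work)

    also = defaultdict(Counter)
    for works in works_by_borrower.values():
        # Every pair of works sharing a borrower scores one point each way.
        for a, b in combinations(sorted(works), 2):
            also[a][b] += 1
            also[b][a] += 1
    return also


# Toy data: three borrowers, three works (all identifiers are invented).
loans = [
    ("u1", "X"), ("u1", "Y"),
    ("u2", "X"), ("u2", "Y"), ("u2", "Z"),
    ("u3", "X"), ("u3", "Z"), ("u3", "Y"),
]
also = co_borrowing(loans)
print(also["X"].most_common(2))  # [('Y', 3), ('Z', 2)]
```

A real service would presumably need to weight or filter these raw counts (very popular textbooks co-occur with everything), but the counting step is the heart of it.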

You’ll notice as well that we’ve chosen to host the data on our brand-new Orbital (v0.1) research data management application. Each dataset has a persistent citable URI. We’ll be keeping the data up-to-date, and generating a new activity data file from our library circulation logs shortly after the end of each academic year.

The data consists of a number of CSV files (one for each academic year since 2000-01, plus a huge file of all the data), containing the following fields:

Field index | Field name | Description
0 | CREATE_DATE | The date and time of the loan event, in the format: dd/mm/yyyy hh:mm
1 | BORROWER_ID | A cryptographic hash of the internal system ID associated with the borrower of the item, as used in the University of Lincoln’s library system.
2 | WORK_ID | A cryptographic hash of the internal system ID associated with the bibliographic work borrowed, as used in the University of Lincoln’s library system.
3 | CONTROL_NUMBER | The ISBN of the work borrowed (10 or 13 digits).
4 | AUTHOR_DISPLAY | The main author of the work borrowed.
5 | TITLE_DISPLAY | The title of the work.
6 | PUB_DATE | The publication year of the work in the form: yyyy
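If you want to load one of the files, something like the following Python sketch should work, going by the field order listed above. A few assumptions on my part: that the files are plain comma-separated with standard quoting, and that a header row (if there is one) starts with CREATE_DATE. The `Loan` tuple, the `parse_loans` function and the two sample rows are invented for illustration and are not taken from the real data.

```python
import csv
import io
from datetime import datetime
from typing import NamedTuple


class Loan(NamedTuple):
    created: datetime   # field 0, CREATE_DATE
    borrower_id: str    # field 1, BORROWER_ID (hashed)
    work_id: str        # field 2, WORK_ID (hashed)
    isbn: str           # field 3, CONTROL_NUMBER
    author: str         # field 4, AUTHOR_DISPLAY
    title: str          # field 5, TITLE_DISPLAY
    pub_year: str       # field 6, PUB_DATE


def parse_loans(handle):
    """Yield one Loan per well-formed CSV row."""
    for row in csv.reader(handle):
        if len(row) != 7 or row[0] == "CREATE_DATE":
            continue  # skip malformed rows and a possible header row
        created = datetime.strptime(row[0], "%d/%m/%Y %H:%M")
        yield Loan(created, *row[1:])


# Two invented rows in the documented layout.
sample = io.StringIO(
    '03/10/2008 14:25,ab12,cd34,9780131103627,"Kernighan, Brian W.",The C programming language,1988\n'
    '04/10/2008 09:05,ab12,ef56,0201633612,"Gamma, Erich",Design patterns,1995\n'
)
loans = list(parse_loans(sample))
print(loans[0].created.isoformat(), loans[0].title)
```

To read a downloaded file instead of the sample, pass `open(path, newline="")` to `parse_loans`; you may also need to specify an encoding, which I haven’t checked.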

I’ll blog in detail another time about exactly how we created the data extracts. In short:

  1. There is a table in the SirsiDynix Horizon library management system called circ_tran which records every instance of item number X borrowed by user number Y at time Z. [#1]
  2. There is another table which provides a lookup between item numbers and the numbers of the bibliographic works of which they are a copy. [#2]
  3. Dave Pattern at the University of Huddersfield wrote a Perl script which scrapes all the bibliographic data (title, author, ISBN) for each work from our OPAC (Horizon Information Portal) and writes it to a text file. [#3]
  4. Developer Jamie Mahoney of CERD/LNCD then stepped in, using some pretty heavy SQL on the original 3 data extracts, to:
    • Hash the internal Horizon user and work ID numbers to provide anonymity;
    • Convert the internal Horizon date and time stamps in extract [#1] from a version of Unix time into a readable datestamp (formula hint: cko_date*86400 + cko_time*60);
    • Use the item/work lookup table [#2] to pull in the bibliographic details for each loan in [#1] from the bibliographic table [#3] (an epic SQL JOIN query), removing items which are no longer represented in our library system;
    • Remove any items without an ISBN, which are of no use to the SALT recommender API;
    • Tweak the punctuation and formatting;
    • Split the data into separate files for each year.
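The real work was done in SQL, which I’m not reproducing here, but two of the steps above (the date conversion and the hashing) can be sketched in a few lines of Python. Big caveats: I’m assuming that cko_date counts whole days since 1 January 1970 and cko_time counts minutes since midnight, which is what the formula hint implies; and the choice of SHA-256 with a secret salt is purely my illustration, not necessarily the hash Jamie used. The function names and sample values are made up.

```python
import hashlib
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumed starting point for Horizon's day count


def horizon_to_datestamp(cko_date: int, cko_time: int) -> str:
    """Apply cko_date*86400 + cko_time*60 and format as dd/mm/yyyy hh:mm."""
    seconds = cko_date * 86400 + cko_time * 60
    return (EPOCH + timedelta(seconds=seconds)).strftime("%d/%m/%Y %H:%M")


def anonymise(internal_id: int, salt: str) -> str:
    """Replace an internal ID with a salted SHA-256 digest (illustrative choice)."""
    return hashlib.sha256(f"{salt}:{internal_id}".encode("utf-8")).hexdigest()


print(horizon_to_datestamp(14155, 865))          # 03/10/2008 14:25
print(anonymise(123456, "keep-this-secret")[:12])
```

The salt matters: internal ID numbers are short and guessable, so an unsalted hash could be reversed by simply hashing every possible number.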

Once again, the data is at:

Thanks are due to Chris Leach and Dave Pattern for Horizon-fu, and to Jamie Mahoney for his patient wrangling of several millions of lines of data!

You can find out more about the Copac Activity Data Project/SALT2, at: http://copac.ac.uk/innovations/activity-data/

Ook Nog! Ook Nog! University of Liverpool student team win #DevXS library activity data prize

Posted on November 19th, 2011 by Paul Stainthorp

Four students from the University of Liverpool calling themselves Team Ook Nog took the prize for the best use of library activity data at last weekend’s DevXS student hackathon in Lincoln. Their application took the openly-licensed national OpenURL router data from EDINA and used it to build a search/recommendation tool for scholarly journal articles. You can see the fruits of their labour here.

#DevXS - Team Boss Ook Nog

Jude-Thaddeus Ojiaku, Andrew Collins, Arnoud Pastink and Thomas Gorry built the Ook Nog site in a marathon development session over 30 hours in the Engine Shed. A simple Google-like search box (very Google-like!) displays results of articles and books derived solely from the OpenURL router data (example); each result has context-sensitive links out to dx.doi.org, OCLC firstsearch, CORE repository search, and Google Scholar. Clicking on any search result shows a chart of activity for that article, along with “See Also…” suggestions for other articles accessed by the same user in a similar timeframe. Take a look at the results.

From the DevXS wiki:

“Ook Nog is an interface for the data provided by openurl allowing you to search all of the data for any term and find search terms within their archive. By selecting any prior search term, you can then browse all search terms that were also performed by that user(s) within a small time period.

“All publications/searches are nodes. A node shares an edge with another node if a user has searched both nodes. We try to increase the chance of relevance by only showing neighbours of a node that were formed +- 90 days (a semester!).

“Despite no further tests of relevancy, the searches/publications found can be surprisingly similar (or amusing).”
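The nodes-and-edges idea in the wiki description can be sketched roughly like this in Python. This is my own reconstruction from the description, not Team Ook Nog’s code: the event tuples, the `build_graph` function and the sample data are all invented, and I’m reading “+- 90 days” as “two items are linked if the same user accessed both within 90 days of each other”.

```python
from collections import defaultdict
from datetime import date, timedelta
from itertools import combinations

WINDOW = timedelta(days=90)  # "a semester!"


def build_graph(events):
    """Link two items if one user accessed both within WINDOW of each other.

    `events` is an iterable of (user, item, date) tuples.
    Returns a dict mapping each linked item to the set of its neighbours.
    """
    by_user = defaultdict(list)
    for user, item, when in events:
        by_user[user].append((when, item))

    graph = defaultdict(set)
    for visits in by_user.values():
        for (t1, a), (t2, b) in combinations(visits, 2):
            if a != b and abs(t1 - t2) <= WINDOW:
                graph[a].add(b)
                graph[b].add(a)
    return graph


# Invented sample events.
events = [
    ("u1", "article A", date(2011, 1, 10)),
    ("u1", "article B", date(2011, 2, 20)),
    ("u1", "article C", date(2011, 9, 1)),   # too late to link to A or B
    ("u2", "article B", date(2011, 3, 1)),
    ("u2", "article D", date(2011, 3, 15)),
]
graph = build_graph(events)
print(sorted(graph["article B"]))  # ['article A', 'article D']
```

The “See Also…” suggestions for an item are then just its neighbours in the graph.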

The team from Liverpool pipped their traditional regional rivals to the library prize – Team MCR, made up of student developers from 3 different Manchester universities (University of Manchester, Manchester Metropolitan University, University of Salford). Team MCR built a working DevXS library app based around course reading lists with some interesting social ranking features, designed with great care using the Balsamiq wireframe UI tool, and making use of several open bibliographic datasets including the MOSAIC project data and Cambridge University Library’s search APIs. For their trouble, they picked up the #DevXS ‘social’ prize, awarded by the University of Lincoln Social Research Centre (LiSC).

DevXS was brilliant. Thanks again to Ian Snowley for the idea of donating a University of Lincoln Library prize. £250 in Amazon vouchers are on their way to Liverpool now.

DevXS – library activity data challenge

Posted on November 11th, 2011 by Paul Stainthorp

DevXS starts in less than 5 hours! From the DevXS wiki:

Competitions

Library activity data

The University of Lincoln Library (http://www.library.lincoln.ac.uk/) are sponsoring a £250 Amazon voucher prize (5 x £50 vouchers), which will be awarded to the team making the best use of library activity data as part of the application(s) they develop over the weekend. See the Data page on the wiki for examples of freely-available library activity data.

(DevXS is a developer marathon spread across three days, where students from across the UK and beyond are encouraged to team up and build cool things that contribute to university life. DevXS is about students sharing their ideas, mashing up data and building prototypes that improve, challenge and positively disrupt the research, teaching and learning landscapes of further and higher education.)

The data! The data!

Posted on October 3rd, 2011 by Paul Stainthorp

The Library Impact Data Project (LIDP), which ran from February-July this year, and in which the University of Lincoln took part, has now released a subset of the library activity data used in the analysis (which, you’ll remember, showed a statistically significant correlation across a number of universities between library activity data and student attainment).

Lincoln’s data is included in the release, which is available for re-use under an open licence, from:

http://eprints.hud.ac.uk/11543/

This data set is made available under the Open Data Commons Attribution License
http://opendatacommons.org/licenses/by/1.0/

The data contains final grade and library usage figures for 33,074 students studying undergraduate degrees at UK universities. More information on the data, and how it’s been generalised in order to preserve students’ anonymity, is on the LIDP project blog.

  • There’s also a detailed report about the statistical breakdown of Lincoln’s own share of the data (this wasn’t published as part of the project reports, as it was down to each individual institution whether to make it public or not) – I’ve made the report available here [PDF].

The LIDP blog also contains information about the project ‘toolkit’, developed to assist other institutions who may want to test their own data against the LIDP’s hypothesis, here and here.

Thanks again to Graham, Bryony and Dave at the University of Huddersfield for inviting Lincoln to take part in the project, and for their help along the way!

On to the next one…

The Library to sponsor developer prize at DevXS

Posted on September 16th, 2011 by Paul Stainthorp

At the suggestion of the University Librarian Ian Snowley, the University of Lincoln Library are sponsoring a £250 developer prize at the DevXS student developer hackathon in November. The moolah will go to the winners of a library-flavoured developer competition at DevXS, based around the best use of activity data (details tba).

Screenshot of the DevXS website

DevXS is free! It’s open to all undergraduate and postgraduate students, and it’s taking place in Lincoln on the 11th, 12th and 13th of November. Registration is now open. Find out more at devxs.org or by following @devxsconf on Twitter.

Developers Unite!

DevXS is a developer marathon spread across three days, where students from across the UK and beyond are encouraged to team up and build cool things that contribute to university life.

DevXS is about students sharing their ideas, mashing up data and building prototypes that improve, challenge and positively disrupt the research, teaching and learning landscapes of further and higher education.

We’re going to award prizes to the best ideas, prototypes and collaborations and there are going to be developers from universities around the country hanging around to help you out.

Sound awesome? Register now! It’s free!

Activity data workshop, Leeds, 5th September

Posted on September 13th, 2011 by Paul Stainthorp

Last week Elif and I attended a half-day workshop at the University of Leeds, entitled ‘Improving processes by using activity data’, which was organised as part of the JISC Activity Data ‘Synthesis’ programme, as a pre-conference event before the 2011 ALT-C conference.

I got the impression that only about half the expected delegates turned up, which seems a bit poor form, but perhaps all too common for a free workshop.

Presentations from:

One thing (of many useful things) that came up in the discussions surrounding these presentations was around the “usefulness” (utility?) of activity data, and how that usefulness is ‘sold’ to the parent institution: shades of business case-type arguments around recruitment, retention, impact, resource management, etc., but what about the user experience? What about the service quality (sez Ben Scoble of Staffordshire University)?

There’s a danger that these aspects could be missed in the drive to produce a convincing ‘traditional’ business case for activity data, when they are the things we ought to be concentrating on the most (and I tend to assume that, as long as I still have a job, the overall case for providing a quality library service has already been accepted by my institution… at least for the time being).

Then on to activities (ha ha), and a group discussion around the way forward in making it easier for libraries to gather and use activity data. Placed on the spot by David Kay (a consequence of “Lincoln having done all of this sort of stuff already”!! – i.e. participated in the MOSAIC and LIDP projects), I reiterated my point that we should really only be concerned with trying to build a better service for the library. We shouldn’t have to constantly refer up to, e.g., the effect on student satisfaction, retention, or attainment. Take them (for practical purposes, anyway) as a given. The case has already been made, and as long as your library is open for business, your institution wants you to use activity data. They do. They really do.

All that remains now is for all of us (esp. the Synthesis project) to come up with a sane, usable, ultra-lightweight event-based (WWWWWH) data-exchange format which would allow institutions to easily share and re-use activity data: practical interoperability for libraries and l-users across all library domains. There are some good ideas floating around (they pretty much scream Linked Data), and I’m sure you’ll be hearing about them soon.

The JISC Activity Data Synthesis project blog is at: http://blog.activitydata.org/

LIDP: end of project. Using libraries = good.

Posted on July 28th, 2011 by Paul Stainthorp

I was in Huddersfield last week for the final project meeting of the Library Impact Data Project (LIDP).

LIDP was successful in proving that:

“There is a statistically significant relationship between both book loans and e-resources use and student attainment. And this is true across all of the universities in the study that provided data in these areas.

“We want to stress here again that we realise THIS IS NOT A CAUSAL RELATIONSHIP!  Other factors make a difference to student achievement, and there are always exceptions to the rule, but we have been able to link use of library resources to academic achievement.”

An initial (outline) report on how the University of Lincoln’s own activity-attainment holds up to this same statistical inspection is available to download from here [PDF]. As much as possible of the library activity data used in the project will be released under an Open Data Commons Attribution License in the near future, and hosted on the project blog.

Thanks are due to Graham Stone, Dave Pattern, Bryony Ramsden, and all the project partners for the opportunity for Lincoln to participate in this project. We had fun getting our data together. The end-of-project blog post for LIDP is here – it suggests some very interesting areas for further investigation.

Personally, I’m very interested in looking for cross-institutional comparisons – perhaps trying to explain particular levels of activity-attainment attached to individual subject areas, irrespective of which university the student is at (i.e. does a Lincoln computing student have more in common with a Lincoln business student, or with a Huddersfield computing student?). I’d also be interested in looking particularly at those students whose library activity behaviour changes through the life of their course, and who then go on to get a better degree than they might have been predicted based on their library activity in their first year.

“Finally, we have been astonished by how much interest there has been in our project. To date we have two articles ready for publication imminently and have another 2 in the pipeline. In addition by the end of October we will have delivered 11 conference papers on the project. All articles and conference presentations are accessible at: http://library.hud.ac.uk/blogs/projects/lidp/articles-and-conference-papers/”

I can see this project getting cited, and cited again, simply every time anyone wants to argue that academic libraries are A Good Thing.

What I been up to

Posted on July 7th, 2011 by Paul Stainthorp

Apologies: this is one of those generic catch-all blog posts. I attended four separate events last week: here’s a short report from each one.

~~~

1. CILIP UC&R Members’ Day: Making an Impact

De Montfort University, Leicester. 28 June, 2011

This workshop for CILIP members was looking at various ways in which libraries can have (and can measure) their ‘impact’. I spoke first about Lincoln’s involvement in the University of Huddersfield’s Library Impact Data Project (LIDP), and how that project is trying (successfully, it seems) to measure the relationship between students’ library use and their degree ‘success’.

Then DMU subject librarian Jason Eyre talked about his PITSTOP project, which built a mediated forum for online discussion between Social Work students on placement, their lecturers, and their practice educators (in the NHS and local authorities). Jason explained that while the online discussion forum itself was not very well used, the impact of the project was that it acted as a catalyst for building a better relationship between students, academics, practice educators, and the library.

After a very well-run World Café session, where we moved around between different tables, each themed with a different aspect of ‘impact’ in libraries – and then lunch, information management consultant David Streatfield presented on the difficulties of measuring and evaluating the impact that academic libraries can have. He outlined some of the different approaches that have been taken in the past, and how those approaches can be less than successful in an environment of government pressure to control public service provision.

Lastly, Maria Cotera, former president of the CILIP Career Development Group, told us several anecdotes about the ways she has seen library workers make an impact themselves, through their involvement in staff development, social, and extra-professional activities. In an exercise, all the delegates came up with an example of a shared pressure or circumstance in our home institutions that could be turned into an opportunity for staff development.

Thanks to Marie Nicholson and the UC&R East Midlands committee for inviting me to speak! Twitter hashtag: #UCREMimpact.

~~~

2. EMALINK event on collection development

University of Lincoln. 29 June, 2011

This was another East Midlands event, and the first EMALINK event held in Lincoln since we joined that network. It was organised, jointly, by the University of Lincoln, our neighbours Bishop Grosseteste University College, and Nottingham Trent University (NTU). The theme was the lifecycle of collection management: from selection and acquisition, through analysis and review of collections, and finally disposal.

NTU kicked off with a look at their work to incorporate Talis Aspire into the DNA of their library: they’re building a set of resource selection and allocation processes that are strongly driven by the resource lists built by academics using Aspire. Lincoln responded with two short presentations about collection analysis: our project to compare the strengths and weaknesses (in size, breadth, and age) of the various subject collections in our physical bookstock with the relative sizes of the student body in different subject areas; and our work to determine value for money in ‘Big Deal’ database subscriptions. Finally, Susan Rodda from Bishop Grosseteste talked about the options for disposing of unwanted physical library stock, and how BG have managed, for several years, to weed their collection without sending any paper to landfill.

~~~

3. JISC Managing Research Data Programme (#jiscmrd) community briefing event

Goodenough College, London. 1 July, 2011

On Friday, I attended this briefing event for the current JISC research data funding call for proposals, on Joss Winn’s behalf. The JISC programme manager ran through the requirements and expectations for the various strands of this current call. Kevin Ashley of the Digital Curation Centre also presented: about how the DCC can support and work with institutions who are running research data management projects. See hashtag: #jiscmrd for information about the programme.

~~~

4. JISC Innovations in Activity Data workshop

The Open University, Milton Keynes. 4 July, 2011

After a long, Sunday-afternoon train journey to Milton Keynes, I paid my first ever visit to the OU’s Walton Hall campus for another activity data-related event, this time organised and hosted by the team behind the JISC-funded RISE (“Recommendations Improve the Search Experience”) project.

The day began with three presentations from projects funded under the current JISC activity data strand:

  1. Joy Palmer of MIMAS and the SALT project (“Surfacing the Academic Long Tail”: MIMAS working with the John Rylands University Library of the University of Manchester);
  2. RISE themselves (Richard Nurse of the OU) talking about how they are using EZProxy log data to power a recommendation service (“…users who looked at this, also looked at these…”);
  3. Via video link, live from Huddersfield: Dave Pattern talking about LIDP.

Then, another World Café-type exercise (two in one week!). We moved about the room, scribbling on the tablecloths, making notes about: [a] what activity data universities have at their disposal; [b] what use we might put it to; and [c] what barriers are in our way.

In the afternoon: two more presentations. The OU’s Tony Hirst (a.k.a. @psychemedia), rattling and rambling through various techniques for visualising activity data. This is really valuable stuff… what I’m less clear about is: where’s the first rung of the dataviz ladder? How does a muggle start thinking about data visualisation? Tony says that many of the techniques he writes about are things he “didn’t know how to do a couple of hours before…”, but that doesn’t necessarily mean that the rest of us will find them as easy to pick up! Tony’s coming to Lincoln soon, so I’m going to try and talk to him about data visualisation a bit more then.

Last of all, David Kay (of SERO and the JISC activity data Synthesis Project: kind of an umbrella for all of these separate activity data initiatives) summed things up nicely, with an excellent slide listing the kinds of skills library workers are going to have to develop in order to do justice to activity data (data visualisation, again!). I’ll post that slide here, if and when I can find it.

There was a little bit of activity on Twitter for this workshop: look for the hashtag #iad11.

~~~

Anonymised library activity data for the academic years 2007/08, 2008/09 and 2009/10: collected for the JISC Library Impact Data Project

Posted on June 13th, 2011 by Paul Stainthorp

These data consist of entries for 4,268 anonymised students who graduated from the University of Lincoln with a named award at the end of the academic year 2009/10, along with a selection of their library activity over three years (2007/08, 2008/09, 2009/10): library item circulation, visits to the main GCW University Library, and e-resources usage represented by authentication against AthensDA.

View this item on the University Repository: http://eprints.lincoln.ac.uk/4540/