Posts Tagged ‘RDF’

CLOCK notes – 8 May 2012

Posted on May 8th, 2012 by Paul Stainthorp

This is what the CLOCK project team are currently up to (from meetings over the past couple of weeks and from notes made at the recent Discovery: making sure your resources are discovered, used and reused event in Birmingham):

  • Andrew Beeken has been exploring the Cambridge COMET data via its SPARQL endpoints and has already blogged about the process of using SPARQL to “build kind of a ‘Hello World’ of open data querying”. He’s now looking at the recently-released Harvard open bib data and comparing the speed, the use of matching namespaces, and the use of JSON vs RDF/XML.
  • This work is leading up to unified search and presentation of records from several sources (Cambridge/COMET, Harvard, Lincoln/Jerome, OpenLibrary, etc.). Andrew and Trevor Jones are collaborating on drawing up a high-level architecture for CLOCK, and a strategy for expressing Linked Data, which will be shared with the rest of the project team (and publicly) for discussion.
  • To support this, Alex Bilbie in ICT services at Lincoln is helping to get the original Jerome application up and running on the CLOCK server (jerome.library.lincoln.ac.uk), where it can be used as a stable platform for developing and RDF-ifying Lincoln’s own bib data.
  • Trevor Jones and Ed Chamberlain will work together on developing the work with users (in parallel, at the University of Lincoln and the University of Cambridge) to clarify their requirements for bibliographic data:
    • For cataloguers, based around a rethink of copy cataloguing workflows, we will try to tease out requirements from talking to cataloguers (and associated subject librarians) asking to be ‘positively disrupted’: what do they need to do? What is missing from their data?
    • For researchers, we will build on some initial user walkthrough analysis done by Trevor and Andrew in Lincoln, with performing arts students in LPAC (the Lincoln Performing Arts Centre). What are the research questions that users are trying to answer? How does bib data help them answer those questions? What’s missing? Ed and Trevor will agree on a set of questions and tasks.
    • These requirements will be used to feed the remaining cycles of platform development for CLOCK.
  • Ed Chamberlain will act as the conduit between CLOCK and related projects in the Discovery strand, looking for points of shared interest/technology, and blogging (or asking others to blog) about aspects of one project which can inform the others. The other projects in which Ed is involved are: the Open Education Metadata UK (OEM-UK) project at the Institute of Education (shared interest in new user interfaces for cataloguing – possibly use screencasts to demonstrate alternative workflows?) and the Open Bibliography 2 project (lots of potential technical overlap – BibJSON, JSON-LD, BibSoup.net, expression in RDF container formats).
  • Ed and I (Paul Stainthorp) will work on developing the ‘business case’ / sustainability of CLOCK and data.*.ac.uk, following up on themes discussed in the recent Discovery event, and thinking not only about institutional funding / high-level support for open bib data, but also what it takes to move open bib data publishing from a development environment into an institutionally-supported, ICT-run service.
  • Finally, PS is arranging a couple of internal CLOCK ‘hack days’ (to take place on 17th-18th May, in Cambridge) – more details to follow.
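
As a rough illustration of the kind of ‘Hello World’ SPARQL query mentioned above, here is a minimal Python sketch. It is not Andrew’s code: the endpoint URL is a placeholder, the function names are made up for this example, and the sample response is hand-written in the standard SPARQL JSON results format so that the sketch runs without network access.

```python
import json
from urllib.parse import urlencode

# Placeholder endpoint, for illustration only (not the real COMET address).
ENDPOINT = "http://example.org/sparql"

# The simplest possible query: fetch any five triples.
HELLO_WORLD_QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 5
"""


def build_query_url(endpoint, query):
    """Build a GET URL that passes the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


def rows_from_sparql_json(text):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(text)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # Each bound variable is an object with 'type' and 'value' keys.
        rows.append({v: binding[v]["value"] for v in variables if v in binding})
    return rows


# Hand-written sample response (invented data) in SPARQL JSON results format.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["s", "p", "o"]},
    "results": {"bindings": [{
        "s": {"type": "uri", "value": "http://example.org/book/1"},
        "p": {"type": "uri", "value": "http://purl.org/dc/terms/title"},
        "o": {"type": "literal", "value": "Hello World"},
    }]},
})

if __name__ == "__main__":
    print(build_query_url(ENDPOINT, HELLO_WORLD_QUERY))
    for row in rows_from_sparql_json(SAMPLE_RESPONSE):
        print(row["s"], row["p"], row["o"])
```

Asking an endpoint for JSON rather than RDF/XML mostly changes the parsing step: the JSON results format maps directly onto dictionaries, as above, whereas RDF/XML needs an RDF parser.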

USTLG meeting on research data management

Posted on November 29th, 2011 by Paul Stainthorp

Yesterday I was at Clare College, University of Cambridge for a meeting organised by USTLG, the University Science & Technology Librarians Group. The group (open to any librarians involved with engineering, science or technology in UK universities) has meetings once or twice a year. The theme of yesterday’s meeting (free to attend, thanks to sponsorship from the IEEE) was data management, with an implied focus on research data.

The meeting consisted of a series of presentations (plus a fantastic lunchtime diversion, below) with plenty of time for networking – there were about 40 people there, all with an interest in research data management – though interestingly, a show of hands suggested very few people were actively engaged in looking after their own institution’s researchers’ data.

As usual, this blog post has been partially reconstructed from the Twitter stream (hashtag #ustlg).

First up, Laura Molloy, substituting for Joy Davidson of the Digital Curation Centre (DCC), on a project called the Data Management Skills Support Initiative (DaMSSI), looking at the [shades of information literacy] skills needed by different people involved in the research data curation process. “DaMSSI aims to facilitate the use of tools like Vitae’s Researcher Development Framework (RDF) and the Seven Pillars of Information Literacy model” developed by SCONUL. Key question: how do you assess the effectiveness of research data management training?

Second, Yvonne Nobis of Cambridge’s Central Science Library talked about supporting researchers at Cambridge: data sharing and the role of librarians; including her project (funded through CUL’s Arcadia library staff research scheme) looking at the issues involved in curating not research data per se, but the software code and techniques used to analyse that source data. Key points: [1] there are disincentives (time, and lack of recognition within one’s own field) to researchers’ spending time on code/software for research data manipulation. [2] But without that investment in code, the transparency–openness–replicability of computational-data science is at risk. [3] “Librarians are missing a trick” by not engaging in research data software curation issues. Yvonne also talked about the work of the eScience Centre.

Before lunch we also got a chance to inspect the USTLG’s brand new website (and smashing new logo), at ustlg.org

Then the highlight of the day… we were invited, in groups, to go over to the adjacent University Library, where we were treated to a display and commentary on some of Cambridge University’s rare science manuscripts and early printed books. All laid out in a reading room were Isaac Newton’s notebooks containing his notes on the method of fluxions (i.e. early calculus), Darwin’s field notes from the Beagle, Ernest Rutherford’s lab diaries (still slightly radioactive! – “…not ever so, but Health & Safety made us do a risk-assessment…”), plus Prof. Stephen Hawking’s typed and ring-bound first draft of A brief history of time, along with several early printed herbals and a book containing the first known technical drawings (of machines of warfare). Inspiring stuff, and really quite brilliant of them to lay it out for us to see!

In the afternoon—not directly connected with research data, but certainly of interest to the engineers involved in the Orbital project—we heard from Rachel Berrington of the IEEE, about the work of the organisation and some of the planned developments to the IEEE Xplore platform: new journal titles in 2012, a mobile platform, the inclusion of CrossRef data, and new interactive HTML content.

Finally, a useful presentation from Anna Collins, Research Data and Digital Curation Officer (good job title) for Cambridge’s DSpace repository. Anna spoke about the Incremental project, a joint exercise between Cambridge and the University of Glasgow, aimed at providing a best practice approach to supporting data management techniques amongst research communities. This is really good practical nuts & bolts stuff (e.g. when’s the right time to broach the subject of data curation with a PhD student? Too early, and they won’t care – too late, and the best you can do is help pick up the pieces!). I’ll be recommending my colleagues at Lincoln take a look at the materials on both institutions’ websites. Top quote: “be the boss of your hard drive”!

(An aside: after the USTLG meeting had ended, I was lucky enough to get a quick tour of [about 1% of] the Cambridge University Library, along with a cup of tea in the staff room(!), thanks to a “badly-encoded” colleague. I won’t blog about it in any detail now—hopefully I should be back in Cambridge in January for another Orbital-related event—but it’s just a jaw-dropping library.)

The new USTLG website is at ustlg.org, and you can follow them on Twitter at @USTLG.

A LNCD booklist

Posted on June 14th, 2011 by Paul Stainthorp

We have been able to buy a number of useful books on agile software development / rapid innovation of technology for education, aimed particularly at developing student skills and participation in institution-wide projects: they’re all in the GCW University Library now.

  • Allamaraju, S. (2010) RESTful web services cookbook. Sebastopol, CA: O’Reilly.
  • Chacon, S. (2009) Pro Git. New York, NY: Springer-Verlag.
  • Chodorow, K. and Dirolf, M. (2010) MongoDB: the definitive guide. Farnham: O’Reilly.
  • Cohn, M. (2010) Succeeding with agile software development using Scrum. Upper Saddle River, NJ: Addison-Wesley.
  • Flanagan, D. and Matsumoto, Y. (2008) The Ruby programming language. 1st edition. Beijing; Farnham: O’Reilly.
  • Lawson, B. and Sharp, R. (2010) Introducing HTML5. Berkeley, CA; London: New Riders.
  • Lutz, M. and Ascher, D. (2004) Learning Python. 2nd edition. Beijing; Cambridge: O’Reilly.
  • Plugge, E., Membrey, P., and Hawkins, T. (2010) The definitive guide to MongoDB: the NoSQL database for cloud and desktop computing. New York, NY: Apress.
  • Powers, S. (2003) Practical RDF. Beijing; Cambridge: O’Reilly.
  • Richardson, L. and Ruby, S. (2007) RESTful web services. Beijing; Farnham: O’Reilly.
  • Segaran, T., Evans, C., and Taylor, J. (2009) Programming the Semantic Web. Beijing; Farnham: O’Reilly.

There’s a live copy of the same booklist on RefShare, available to download/export.

This little collection of books is designed to support the work of the new cross-University technology-for-education group, the existence of which Joss Winn announced last month. Since then, the group has been given a name: LNCD (it’s a partial pun on “linked”, suggesting “Lincoln”, and also a recursive acronym: see below and at: http://lncd.org/)

LNCD

LNCD’s Not a Central Development group

LNCD is a progressive group that includes educational developers, technologists, teachers, researchers and students and was set up to support the objectives of Student as Producer through the research and development of technology for education. The work of LNCD is informed by the progressive pedagogy of Student as Producer so as to engender critical, digitally literate staff and students. Core principles of the group are that we recognise students and staff have much to learn from each other and that students can be agents of change in the use of technology in education.