Posts Tagged ‘#OrbitalMRD’

Orbital notes, 24 May 2012

Posted on May 24th, 2012 by Paul Stainthorp

The Orbital project team met today (24 May 2012) and agreed the following:

  • Documentation
    • User documentation will focus on the “why”s of Research Data Management, rather than being a point-and-click guide to the Orbital UI (which should not require detailed explanations).
    • JW will create a changelog (human readable text file) for each major release of Orbital, so that documentation for each feature is reviewed if that feature is updated.
    • PS will lead on writing documentation (as HTML pages, stored in the GitHub repository), with documentation for release v0.N completed and available by the launch of v0.N+1.
    • PS will email colleagues from the Library and Research/Enterprise for assistance on writing documentation.
  • Training
    • JW will invite Melanie Bullock and David Sheppard on to the Orbital working group. He is meeting Annalisa Jones to discuss RDM training for staff.
  • Releases/development
    • Orbital v0.1.1 (including bug fixes) met all of the initial ‘minimum viable product’ requirements specified by Dr Tom Duckett, and also includes the basics of project administration.
    • v0.2 will include improvements to the file upload/management, project management, and license management interfaces, as well as clearer distinction between language files and operating code.
    • NJ demoed the current version of Orbital to Siemens staff. He now has access to Siemens machine data for testing within Orbital.
    • The group discussed the LNCD plans for internal servers/private cloud, and the disk space requirements and costs.
  • Integration
    • The current version of the DMPOnline tool has been installed on a test server. The group discussed our approach to integration between external tools/software (such as DMPOnline, R, Gephi) and Orbital.
    • NJ is going to email Adrian Richardson at the DCC to ask when the DMPOnline APIs will become available.
  • RDM policy
    • JW presented the draft policy to the University RIEC committee. The committee have been asked to send comments to Joss. (One comment at the committee meeting was that our having a policy too geared around the requirements of the Research Councils may not be appropriate for Lincoln, which generates a lot of non-RC income. However it was noted that the good practice specified by the RCs is good practice for management of all research data, whatever the funding source.)
  • Conferences and meetings
  • Data Asset Framework survey
    • The group discussed the recent DAF survey which we conducted at the University of Lincoln.
    • JW will convene a sub-group to consider the responses in detail, and plan follow-up interviews.
  • Business case
    • JW is currently gathering costs for long-term data storage. This will form the first strand of the Orbital business case, which will be presented to University SMT (along with the agreed RDM policy) in September 2012.

Over the moon (again): Orbital v0.1 released

Posted on May 16th, 2012 by Paul Stainthorp

Joss Winn has blogged this morning about a significant milestone in the Orbital project. Today we released Orbital v0.1, and, from today, invited researchers at the University of Lincoln have access to an alpha ‘minimum viable product’ environment for managing their research data.

For the time being, sign-in access to Orbital (http://orbital.lincoln.ac.uk/) is restricted to invited individuals only.

Orbital v0.1 allows a researcher to:

  • sign in using their University of Lincoln credentials
  • create and describe a project
  • upload their data to the project
  • choose a license for the data
  • add a Google Analytics code to measure project analytics
  • publish data at a persistent URI (id.lincoln.ac.uk)
  • leave feedback on the Orbital application

You can read more about this first release on the Orbital blog. We’ve also written a basic development roadmap for Orbital which gives an idea of the kind of features you should see becoming available between now and Christmas 2012.

RDY* in orbit! Or, three go to Cardiff

Posted on December 14th, 2011 by Paul Stainthorp

(*RDY = Rheoli Data Ymchwil, i.e. Research Data Management.) This is my first (…short) blog post in Welsh. I am in South Wales this week with my colleagues Nick Jackson and Joss Winn – we are going to the ‘roadshow’ of the CCD (Canolfan Curadu Digidol; Digital Curation Centre), at the Royal Welsh College of Music and Drama in Cardiff. This is my first trip to the city: what should I see/do there?

USTLG meeting on research data management

Posted on November 29th, 2011 by Paul Stainthorp

Yesterday I was at Clare College, University of Cambridge for a meeting organised by USTLG, the University Science & Technology Librarians Group. The group (open to any librarians involved with engineering, science or technology in UK universities) has meetings once or twice a year. The theme of yesterday’s meeting (free to attend, thanks to sponsorship from the IEEE) was data management, with an implied focus on research data.

The meeting consisted of a series of presentations (plus a fantastic lunchtime diversion, below) with plenty of time for networking – there were about 40 people there, all with an interest in research data management – though interestingly, a show of hands suggested very few people were actively engaged in looking after their own institution’s researchers’ data.

As usual, this blog post has been partially reconstructed from the Twitter stream (hashtag #ustlg).

First up, Laura Molloy, substituting for Joy Davidson of the Digital Curation Centre (DCC), on a project called the Data Management Skills Support Initiative (DaMSSI), looking at the [shades of information literacy] skills needed by different people involved in the research data curation process. “DaMSSI aims to facilitate the use of tools like Vitae’s Researcher Development Framework (RDF) and the Seven Pillars of Information Literacy model” developed by SCONUL. Key question: how do you assess the effectiveness of research data management training?

Useful links:

Second, Yvonne Nobis of Cambridge’s Central Science Library talked about supporting researchers at Cambridge: data sharing and the role of librarians; including her project (funded through CUL’s Arcadia library staff research scheme) looking at the issues involved in curating not research data per se, but the software code and techniques used to analyse that source data. Key points: [1] there are disincentives (time, and lack of recognition within one’s own field) to researchers’ spending time on code/software for research data manipulation. [2] But without that investment in code, the transparency–openness–replicability of computational-data science is at risk. [3] “Librarians are missing a trick” by not engaging in research data software curation issues. Yvonne also talked about the work of the eScience Centre.

Links and articles…

Before lunch we also got a chance to inspect the USTLG’s brand new website (and smashing new logo), at ustlg.org

Then the highlight of the day… we were invited, in groups, to go over to the adjacent University Library, where we were treated to a display and commentary on some of Cambridge University’s rare science manuscripts and early printed books. All laid out in a reading room were Isaac Newton’s notebooks containing his notes on the method of fluxions (i.e. early calculus), Darwin’s field notes from the Beagle, Ernest Rutherford’s lab diaries (still slightly radioactive! – “…not ever so, but Health & Safety made us do a risk-assessment…”), plus Prof. Stephen Hawking’s typed and ring-bound first draft of A brief history of time, along with several early printed herbals and a book containing the first known technical drawings (of machines of warfare). Inspiring stuff, and really quite brilliant of them to lay it out for us to see!

In the afternoon—not directly connected with research data, but certainly of interest to the engineers involved in the Orbital project—we heard from Rachel Berrington of the IEEE, about the work of the organisation and some of the planned developments to the IEEE Xplore platform: new journal titles in 2012, a mobile platform, the inclusion of CrossRef data, and new interactive HTML content.

Handful of interesting links:

Finally, a useful presentation from Anna Collins, Research Data and Digital Curation Officer (good job title) for Cambridge’s DSpace repository. Anna spoke about the Incremental project, a joint exercise between Cambridge and the University of Glasgow, aimed at providing a best practice approach to supporting data management techniques amongst research communities. This is really good practical nuts & bolts stuff (e.g. when’s the right time to broach the subject of data curation with a PhD student? Too early, and they won’t care – too late, and the best you can do is help pick up the pieces!). I’ll be recommending my colleagues at Lincoln take a look at the materials on both institutions’ websites. Top quote: “be the boss of your hard drive”!

Links from Anna’s presentation:

(An aside: after the USTLG meeting had ended, I was lucky enough to get a quick tour of [about 1% of] the Cambridge University Library, along with a cup of tea in the staff room(!), thanks to a “badly-encoded” colleague. I won’t blog about it in any detail now—hopefully I should be back in Cambridge in January for another Orbital-related event—but it’s just a jaw-dropping library.)

The new USTLG website is at ustlg.org, and you can follow them on Twitter at @USTLG.

Notes from RDMF7 workshop

Posted on November 3rd, 2011 by Paul Stainthorp

I’ve been at the University of Warwick today, for a workshop organised by the Digital Curation Centre (DCC), entitled RDMF7: Incentivising Data Management & Sharing. There appeared to be a wide range of attendees, from data curators & data scientists, ICT/database folk, actual researchers and academics, as well as at least one fellow library/repository rat.

Unfortunately I was only able to attend part of the event (which ran over two days). The following notes have been reconstructed from the Twitter stream (hashtag #RDMF7)!

The first speaker I heard was Ben Ryan of the funding council, the EPSRC. He talked about the “long-established” principles of responsible data management [links below]… this may be my own interpretation of Ben’s presentation, but I don’t think I was imagining undertones of “…so there’s really no excuse!”. He also covered individual and institutional motivations for taking care of data [much more about which later], policy and the enforcement of policy, dataset discoverability/metadata, funding (including the EPSRC’s expectation that institutions will make room in existing budgets to meet the costs of RDM), and embargo periods (inc. researchers’ entitlement to a period of “privileged use of the data they have collected, to enable them to publish” first – important to stress this in order to allay fears/get researchers on board?).

Some links:

Next up was Miggie Pickton, ‘queen bee’ of the University of Northampton’s repository (and self-described RDM “novice”, indeed!), talking about their participation in the multi-institution, JISC-funded KeepIt project, which aimed to design “not one repository but many that, viewed as a whole, represent all the content types that an institutional repository might present (research papers, science data, arts, teaching materials and theses).” This work led almost by chance to Northampton’s undertaking of a university-wide audit of its research data management processes using the DCC’s Data Asset Framework (DAF) methodology. This helped them to make the case for an institutional research data management working group and [eventually, and not without resistance] to establish a mandatory, central policy for RDM. (Show of hands at this point: how many other institutions have completed a DAF? I counted perhaps only three, Lincoln certainly not being amongst them. Q. Should the University of Lincoln complete a Data Asset Framework exercise as part of the Orbital project?)

After coffee, we heard a third presentation from Neil Beagrie of (management consultancy partnership) Charles Beagrie Ltd. Neil delivered a very comprehensive explanation of the KRDS (“Keeping Research Data Safe”) project, which has developed both an activity model and a benefits analysis toolkit for the management and preservation-of-access to ‘long-lived data’. I have to come clean here and admit that I was a little bewildered by the detail: much of it went through both ears without sticking to the brain on the way through. I need to go back over the tweets more carefully and have a look at the KRDS toolkit and reports at: beagrie.com/krds.php

The morning’s presentations over, we split into three groups for breakout discussion.

I attached myself to the second of the three groups, led by (JISC programme manager for Orbital) Simon Hodson; our job was to consider the question: “What really are the sticks and carrots that will make a long-term difference to the pursuit of structured data management processes?”. After spending some time picking apart the terminology, and what each of the various ‘processes’ might include, we had a wide-ranging (and allocated-time-overrunning) discussion about the things that genuinely motivate scientists, universities, and funding councils(!) to care about RDM; about some of the problems caused by the complexity and inconsistency of metadata for datasets; also about the issue of citations/digital object identifiers for data (how those citations might be treated by publishers and citation data services) and how that relates to any notions of ‘peer review’ in experimental data.

As requested, our group came up with three actions which we believe will help address the question of motivation:

  1. Data citation – publishers should consistently include e.g. DOIs for datasets in final published articles, so that citations of the data can be measured.
  2. Measurement of RDM “maturity” – departments and whole institutions should adopt a standardised quality mark for research data management, to give [potential] researchers, funding bodies, and the public confidence in their ability to handle data appropriately.
  3. Discovery – the research councils (probably) should push for common metadata standards for describing datasets and underlying data-generating research/experimental processes.
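
To make action 1 a bit more concrete, here is a minimal Python sketch of how dataset citations could be counted once DOIs appear consistently in published reference lists. It was not presented at the workshop and is not part of Orbital: the function names, the simplified DOI pattern, and the example DOIs (on the 10.5555 prefix, used here purely as made-up placeholders) are all assumptions for illustration.

```python
import re
from collections import Counter

# Simplified DOI shape: "10.<registrant>/<suffix>". This is an assumption
# for illustration; real-world DOIs can contain characters this handles badly.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def extract_dois(text):
    """Return lower-cased DOIs found in a block of reference text."""
    dois = []
    for match in DOI_PATTERN.findall(text):
        # Strip trailing sentence punctuation that is unlikely to belong to the DOI.
        dois.append(match.rstrip(".,;)").lower())
    return dois


def count_dataset_citations(reference_lists, dataset_dois):
    """Count how many articles cite each known dataset DOI.

    reference_lists: one string of references per article.
    dataset_dois: the DOIs we know to identify datasets.
    """
    wanted = {doi.lower() for doi in dataset_dois}
    counts = Counter()
    for references in reference_lists:
        # A set ensures an article citing the same dataset twice counts once.
        for doi in set(extract_dois(references)) & wanted:
            counts[doi] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical reference lists from two articles.
    articles = [
        "Smith (2012). Field data. doi:10.5555/example-dataset-1.",
        "Jones (2012). See https://doi.org/10.5555/EXAMPLE-dataset-1 and 10.5555/other-2;",
    ]
    print(count_dataset_citations(articles, ["10.5555/example-dataset-1"]))
```

The point of the sketch is only that this kind of counting becomes trivial when publishers print dataset DOIs consistently; without them, matching a citation to a dataset means guessing from free text.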

Lunch followed, and I had time to hear two more presentations in the afternoon before I had to run for a bus:

Catherine Moyes of the Malaria Atlas Project: in effect, demonstrating what really clear and consistent management of large-scale (geo)data looks like. This seems to consist of an extremely rigorous approach to requesting, tracking, and licensing data from the contributors of the project’s data… and an equally strict (but in a good way) expectation of clarity when dealing with requests from third parties to use the data. If that all comes across as restrictive, I’d point to Catherine’s slide on ‘legalities’ of the data that the Malaria Atlas Project has released openly – it’s about as open as it gets, with no registration needed, no terms & conditions placed on re-use of the published data, and all software/artefacts released under very permissive and free licences (Creative Commons or GNU). N.B. the Orbital project should look at the Malaria Atlas Project’s “data explorer”, available via map.ox.ac.uk, as an example of a really nifty set of applications built on top of openly accessible and re-usable data.

Finally (and I’m sorry I only got to hear part of his presentation), University of So’ton chemistry professor Jeremy Frey on their IDMB (Institutional Data Management Blueprint) Project—southamptondata.org—and some rather funny anecdotes about the underlying knowledge, expectations, and problems faced by researchers managing their own data, which emerged when they were surveyed as part of the above project.

Lots to take in (lots). But some useful suggestions for Orbital, which I’ll be bringing to the next project meeting: and plenty more reading material which I’ll add to the project reading list asap.

Paul Stainthorp, lead researcher on the Orbital project.