Posts Tagged ‘Microsoft Excel’

New content on the e-journals A-to-Z

Posted on October 10th, 2012 by Paul Stainthorp

A few things that have been added/updated recently on the Electronic Journals A-to-Z. New and updated full-text holdings should shortly be reflected in Find it at Lincoln.

Brand-new e-journal packages and titles:

Holdings updated:

Authentication changes:

Notes:

[1] I’ve not been able to find (by searching through Cambridge’s “Account Administrator” pages) a holdings file for our Cambridge University Press subscriptions—at least, not in a format that we are able to use in the A-to-Z—so the 40-odd titles in this package have been checked individually against the Cambridge Journals website. For that reason, I can’t guarantee that they are 100% accurate.

[2] The ScienceDirect Freedom Collection package in the A-to-Z knowledgebase does not have any holdings defined – libraries have to add their own custom holdings dates. I added ours this by ordering an “Electronic Holdings Report” from Elsevier’s admin tool, then downloading the A-to-Z holdings and using an Excel =LOOKUP() formula to match against ISSNs common to both spreadsheets. This is very fiddly and unfortunately will have to be re-done at intervals.
Screenshot from Elsevier

[3] Created using SwetsWise’s “Download Publication List” feature, re-formatted for the A-to-Z. Again, this has to be re-done at intervals as our Swets subscriptions change.
Screenshot from SwetsWise

[4] Links to HeinOnline journals/articles will now automatically log the user in via OpenAthens (federated access). However there are a couple of residual problems with these links: some of the OpenURL data for an individual article is not being passed through correctly (leading to the occasional error), and also the authentication does not work properly in non-Microsoft browsers – e.g. Chrome, Firefox. For the time being (while HeinOnline technical support address the issue) there is a note on the A-to-Z advising people to use Internet Explorer if they can. This is obviously not ideal.
Screenshot from the A-to-Z

Rationalising multiple lists of e-resources (ERM)

Posted on June 25th, 2012 by Paul Stainthorp

As an offshoot of our discovery (Find it at Lincoln), authentication, and library website projects, we’re trying to impose a little bit more order on the various lists of electronic resources we present to users – aiming at a single version of the truth.

For historical reasons, users can browse several different lists of e-resources at Lincoln:

  1. The ‘e-Library‘ page on the University Portal
  2. A list of packages on the e-journals A-to-Z
  3. Resources available through the MyAthens portal
  4. Other (minor) authentication systems, each listing its own subset of resources

Frustratingly for our users (and for slightly obsessive-compulsive librarians like myself), no one of these lists exactly corresponds with any other. Each one includes a slightly different set of resources. For example, looking at a Venn diagram of resources listed on two platforms only – the e-Library and electronic journals A-to-Z:

Screenshot comparing two lists of e-resources

  • The e-Library contains 163 distinct resources (usually referred to as “databases”). 106* of them also appear on the e-journals A-to-Z: but there are 57, mostly non-bibliographic, resources on the e-Library which aren’t on the A-to-Z. I kind of expected this.
  • Conversely, the A-to-Z contains 162 packages (including a number of titles which don’t form part of a package). 112* of these are reflected on the e-Library, but there are 50 A-to-Z packages which aren’t on the Portal. This was less expected, and is more worrying!
  • The name given to a resource on one platform doesn’t necessarily correspond to the name given to the same resource on the other platform.
  • We use a Google Spreadsheet to [try and] keep tabs on this mess.
  • *The reason why only 106 resources on the e-Library correspond to 112 packages on the A-to-Z is that one “database” can be represented by a number of packages. For example: the Portal lists “JSTOR” as a single resource, whereas the A-to-Z lists three separate packages: JSTOR Arts & Sciences I, …Arts & Sciences II, and …III.

Drop in the other two platforms which list e-resources, and the Venn diagram will look something more like this:

Screenshot of a Venn diagram of e-resource platforms

Rationalising these various lists has to be a way toward better e-resources management, and we need to get to a stage where we present only one version of the truth at our users. As part of the ‘Find it at Lincoln‘ project, we’ll be re-populating the A-to-Z knowledgebase from scratch, reviewing our acquisitions/ERM procedures along the way. And for our new website, we’re looking for better ways of presenting lists of resources than the current e-Library page on the Portal.

Side note: it’s possible to use the MS Excel =Match function to compare two lists of resource names that nearly, but don’t exactly, correspond. Formula is:

  • =MATCH(“*”&LEFT(<value in native list>,12)&”*”,<foreign list array>,0)

Java, John and JournalTOCs

Posted on April 17th, 2012 by Paul Stainthorp

I heard recently that the ticTOCs journal tables-of-contents service will close down in the next month or so. ticTOCs was a JISC-funded project which hasn’t been developed for several years now.

Screenshot of ticTOCs

It’s effectively been superseded by the JournalTOCs service, “the largest, free collection of scholarly Tables of Contents (TOCs)”. The outgoing service has published some advice for users on transferring saved lists of TOCs between ticTOCs and JournalTOCs.

ticTOCs did have one particularly useful feature: a text file of all the TOCs it contained (at http://www.tictocs.ac.uk/text.php), which I’ve been filtering and using since 2009 to create a custom package of RSS feeds for upload to the e-journals A-to-Z at Lincoln.

While JournalTOCs doesn’t provide the same simple text list feature, it does have a fully-documented API. This is much more powerful and flexible for developers, but it’s not quite so straightforward as /text.php to create my list (a subset of all the feeds in JournalTOCs, matching only those journals to which the University has full-text access) using desktop tools and no programming.

A chance comment from a colleague at another university about Lincoln having “developers coming out of its ears“(!) inspired me to ask on the LNCD development group for help.

Dr John Murray of the Lincoln School of Computer Science responded, and very kindly supplied a Java program which I can use to identify which journals in our A-to-Z are represented in JournalTOCs, and so build a list of links to valid RSS feeds. Starting with a comma-separated list of ISSNs (which I downloaded from the A-to-Z), the program takes each ISSN in turn and makes a call to the JournalTOCs journals API. Depending on the data returned by JournalTOCs, the program records each ISSN as ‘VALID’ or ‘INVALID’ (i.e. no RSS feed available) in a new .csv file.

Thank you very much, John!

[Aside: to use John's code I had to learn how to compile and run Java programs on my laptop (running Ubuntu 11.10). For the record—and because I imagine it'll be useful again in the future—I first had to install OpenJDK 6 by going to the terminal and running the command:

sudo apt-get install openjdk-6-jdk

…then, once OpenJDK had installed, using the following command to select the correct version of Java:

sudo update-alternatives --config java

…before compiling and running the program itself.]

Once all the ISSNs had been checked against the API and the validated list constructed (this took ~5hrs to run!), I used Microsoft Excel to filter out only the ‘VALID’ ISSNs matched in JournalTOCs, and used Excel’s =LOOKUP() function to pull in enough information about each journal from our managed title list (previously downloaded), to create a custom upload text file.

Screenshot of the A-to-Z

The updated package of journal article RSS feeds is now available to view on the A-to-Z. We’ll review and re-generate this every few months, as we do with all custom and publisher-generated e-journal packages. At the time of writing, it contains just over 10,000 journal article RSS feeds, each one corresponding to one of our full-text journals. I’ve also added an orange RSS icon and link to JournalTOCs for each one, using the A-to-Z’s public notes feature.

So: which other library APIs will accept an ISSN as an input, and what other custom packages could I create using John Murray’s code in the same way?

Library Impact Data Project: good news, everybody!

Posted on June 18th, 2011 by Paul Stainthorp

I think this is worth re-posting from the LIDP blog:

LIDP graphicWe are very pleased to report that we have now received all of the data from our partner organisations and have processed all but two already!

Early results are looking positive and our next step is to report back with a brief analysis to each institution. We are planning to give them our data and a general set of data so that they can compare and contrast. There have been some issues with the data, some of which has been described in previous blogs, however, we are confident we have enough to prove the hypothesis one way or another!

In our final project meeting in July we hope to make a decision on what form the data will take when released under an Open Data Commons Licence. If all the partners agree, we will release the data individually; otherwise we will release the general set for other to analyse further.

I submitted Lincoln’s data on 13 June. It consists of fully anonymised entries for 4,268 students who graduated from the University of Lincoln with a named award, at all levels of study, at the end of the academic year 2009/10 – along with a selection of their library activity over three* years (2007/08, 2008/09, 2009/10).

The library activity data represents:

  1. The number of library items (book loans etc.) issued to each student in each of the three years; taken from the circ_tran (“circulation transactions”, presumably) table within our SirsiDynix Horizon Library Management System (LMS). We also needed a copy of Horizon’s borrower table to associate each transaction with an identifiable student.
  2. The number of times each student visited our main GCW University Library, using their student ID card to pass through the Library’s access control gates in each of the three* years; taken directly from our ‘Sentry’ access control/turnstile system. These data apply only to the main GCW University Library: there is no access control at the University of Lincoln’s other four campus libraries, so many students have ’0′ for these data. Thanks are due to my colleague Dave Masterson from the Hull Campus Library, who came in early one day, well before any students arrived, in order to break in to the Sentry system and extract this data!
  3. The number of times each student was authenticated against an electronic resource via AthensDA; taken from our Portal server access logs. Although by no means all of our e-resources go via Athens, we’re relying on it as a sort of proxy for e-resource usage more generally. Thanks to Tim Simmonds of the Online Services Team (ICT) for recovering these logs from the UL data archive.

I had also hoped to provide numbers of PC/network logins for the same students for the same three years (as Huddersfield themselves have done), but this proved impossible. We do have network login data from 2007-, but while we can associate logins with PCs in the Library for our current PCs, we can’t say with any confidence whether a login to the network in 2007-2010 occurred within the Library or elsewhere: PCs have just been moved around too much in the last four years.

Student data itself—including the ‘primary key’ of the student account ID—was kindly supplied by our Registry department from the University’s QLS student records management system.

Once we’d gathered all these various datasets together, I prevailed upon Alex Bilbie to collate them into one huge .csv file: this he did by knocking up a quick SQL database on his laptop (he’s that kind of developer), rather than the laborious Excel-heavy approach using nested COUNTIF statements which would have been my solution. (I did have a go at this method—it clearly worked well for at least one of the other LIDP partners—but it my PC nearly melted under the strain.)

The final .csv data has gone to Huddersfield for analysis and a copy is lodged in our Repository for safe keeping. Once the agreement has been made to release the LIDP data under an open licence, I’ll make the Repository copy publicly accessible.

*N.B. In the end, there was no visitor data for the year 2007/08: the access control / visitor data for that year was missing for almost all students. This may correspond to a re-issuing of library access cards for all users around that time, or the data may be missing for some other reason.

We got print journals in your electronic journals! We got electronic journals in your print journals!!

Posted on September 14th, 2010 by Paul Stainthorp

While my colleagues are busy adding 856 $u tags to MARC records, I’ve been using our Library’s list of current subscriptions to update information about print journal holdings on our e-journals website.

I first did this in May 2009: I wasn’t even sure if it was a job worth doing until I discovered that links to current print holdings wers accounting for one in every fifteen links followed out from the University’s [supposedly "electronic"] journals A-to-Z site.

Photo - me, "reading" a "journal", "yesterday"

(Photo - me, “reading” a “journal”, “yesterday”.)

The Library has only recently transferred many of its print-only (or print+”e”) journal subscriptions to electronic-only, so a freshly-dumped file of our print holdings data was needed. (The shift in subscription format aside, dozens of minor title changes had left confusing, semi-orphaned records in the A-to-Z.)

Screenshot of print holdings on the University of Lincoln's e-journals A-to-Z websiteIt turns out that, from 600 print+”e” titles in May 2009, the University now has 319: the rest, presumably, having gone “e”-only or having been cancelled outright.

The custom ‘package’ of data for our e-journals A-to-Z site was created by taking a list of the ISSNs, plus holdings, for all of our current subscriptions; filtering out those ISSNs which don’t appear in our A-to-Z (i.e. those which are print-only); then using a =LOOKUP() function in Microsoft Excel to pull across a matching title from our e-journals knowledgebase.

Next: to see if usage of this package, as a percentage of total e-journal usage at the University of Lincoln, changes [from 7% last year] over the coming academic year, now that—presumably—the relative demands for electronic vs. print journals have themselves changed at the University …and now that there are only half the print+”e” titles to choose from.

Google magazines (slight return)

Posted on August 20th, 2010 by Paul Stainthorp

I’ve just recreated my list of magazines from Google Books for the University’s e-journals site.

Google now hosts 199 digitised magazine titles, and for the sake of 10 minutes’ work every few months it would be a shame to miss out on the extra full-text coverage, which often complements the “library” sources for a title.

E.g. for the frankly un-put-downable Estonian Journal of Archaeology (available as an Open Access (OA) journal from 2006-, and indexed in Art Full Text), Google provides the missing articles from 1997 (vol.1) up to 2006.

Record for the 'Estonian Journal of Archaeology' on the University of Lincoln's list of e-journals.

I’d like to be able to harvest the Google Books content to build my list using the standard mashlib toolkit (Google spreadsheets; Yahoo! Pipes; some coffee)… but while use of Google’s =ImportHtml() function is limited to 50 per spreadsheet, and because Google search pages block robots.txt files, I can’t figure out a way of doing so.

Instead, I’ve been copying-and-pasting the search results pages into an ordinary Microsoft Excel spreadsheet (thanks, again, Google, for making this possible through your magazine browse page), then using a custom Excel function to ‘unmask’ the URL hidden behind each hyperlinked magazine title.

Google Books magazine browse page, pasted into an Excel spreadsheet.

Finally, I use a bit of text-to-column splitting, search/replace, and filling-in of package-wide fields, to give me a compatible, tab-delimited text file which I then upload to our e-journals knowledge base (which happens to be EBSCO A-to-Z) – I used EBSCO’s custom notes feature to link to Google’s cover image to each entry in the file.