A group of about 50 persons gathered for a half day at the ALA meeting in Washington, DC, to discuss linked library data. These are the notes from the discussions that took place at that meeting.
Although this discussion began with application profiles (APs), it soon became clear that the topic was broader: it included, or was primarily about, the question of how you determine which vocabularies you can use and when it is necessary to create properties of your own.
We talked about the importance of DEFINITION and CONTEXT in the re-use of defined elements.
You may also want to determine the possible SUSTAINABILITY of the schema before making use of it. Is the schema being currently maintained? Is there an active user group?
Then we talked about what you do if you do not find a property that fits your need. In particular, we talked about two possibilities:
In the case of 1), it may be necessary for you to define a property under your domain that represents the un-registered property. In this case, it is best to create a separate namespace under your domain that makes it clear that this is a "borrowed" property. If the property is later defined by its owner, you can create a link between the two definitions that declares them as "equivalent" (e.g. owl:sameAs, or owl:equivalentProperty, which is the OWL term intended for properties).
In the case of 2), if you can find a broader property that has been defined, you can define your property as a sub-property of a defined property, thereby connecting your vocabulary to an existing one. As an example, if you need to express a journal title, you may be able to define your property as a sub-property of dc:title.
dc:title
journalTitle: subproperty of dc:title
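The journalTitle example can be sketched in a few lines of plain Python, treating triples as tuples. This is only an illustration of what the sub-property link buys you: the `http://example.org/` namespace, the sample journal, and the function name `infer_superproperties` are all assumptions made up for the sketch, and a real project would more likely use an RDF library.

```python
# Hypothetical local namespace for locally defined properties (an assumption).
LOCAL = "http://example.org/vocab/"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
SUBPROPERTY_OF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"

# Schema statement: journalTitle is a sub-property of dc:title.
schema = [(LOCAL + "journalTitle", SUBPROPERTY_OF, DC_TITLE)]

# Instance data that uses only the local property (made-up example record).
data = [("http://example.org/journal/1", LOCAL + "journalTitle", "Library Journal")]


def infer_superproperties(data, schema):
    """Return the data plus every triple implied by rdfs:subPropertyOf."""
    parents = {}
    for s, p, o in schema:
        if p == SUBPROPERTY_OF:
            parents.setdefault(s, set()).add(o)

    result = set(data)
    frontier = list(data)
    while frontier:
        s, p, o = frontier.pop()
        # Restate the triple with each broader property, following chains.
        for parent in parents.get(p, ()):
            triple = (s, parent, o)
            if triple not in result:
                result.add(triple)
                frontier.append(triple)
    return result


expanded = infer_superproperties(data, schema)
```

A consumer that only knows dc:title can then find the journal title in `expanded` without knowing anything about the local vocabulary.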
Why Create APs?
APs have (at least) two major functions:
Notes from Unconference session on literacy and training
Where are we aiming our efforts?
Curricula considerations
Structures
Clarify future vision
Address the tyranny of the record
Essentials
Tools
The first issue we discussed was that in our current model we tend to create metadata that is static. Data, however, is constantly changing and we are very poor about reflecting this. Linked data gives us a chance to make use of dynamic data that more accurately reflects the real world. The web has made our users much more accepting of constant change.
This led to a discussion of the use of linked data in our ILS. Most (all?) are not currently able to make use of it. If we want to make use of linked data in a library environment it will have to be outside the ILS. Stanford is interested in making use of linked data for a digital map project. Because we will be working outside of the ILS, we cannot put the controlled headings under authority control. Linked data will allow us to do this as the headings will be dynamically updated. It was pointed out that a simple link to data would not be enough. At some point, you would need to capture the data for indexing, faster display, preservation, etc.
Is there an incentive for publishers to share their ONIX data? It's a by-product of their business but they make it for internal use. OCLC has developed a project in which they receive ONIX data from publishers to enhance catalog records in WorldCat and in return publishers receive quality "work" information that they can use to enhance their metadata. Both sides win.
Do we really want all the ONIX data a publisher creates? They start creating it for the book at a concept level and it grows until production, and then post production with reviews etc. Often, the metadata is very poor by our standards because it's not meant to do the same things. If publishers are aware, however, that people can make use of this data to sell books, they will be more motivated to curate it. When do we tap into the stream? Do we want it all, and do we allow it to grow in our discovery environments?
Will linked data change the way catalogers do their work? Can we rely on linked data to create a basic descriptive record? In a world of links, we should no longer need to create unique text strings to identify entities. This will allow, for instance, name headings to be registered (in VIAF?) for all those millions of names we cannot put through the NACO process. Our focus will shift from the hand crafting of individual bibliographic records for individual items to an evolving cluster of links that dynamically draws in data as an item evolves with time.
We ended our discussion talking about the importance of a testbed for developing ideas, some place that we can get in and play. The complete record examples in RDA would be interesting to use, also OCLC data on works and identities. OCLC’s place as an aggregator of information could be very important.
Also, we need to develop the apps that will sell the concept of linked data. It's the classic chicken and the egg. Perhaps by just exposing the data the apps will be developed? Linked data is like the web before Mosaic. There is some practical ONIX data that would be of great use to us currently (TOCs, author bios, reviews, etc.), but if we could load them into our ILS how would we display them? Would they make the bibliographic display unusable?
The last issue we discussed was the persistence of data, particularly when dealing with publishers: they don't commit, they go out of business, or they are absorbed. We need to think about preservation of all this data as well. Should we become the archivers? Could we afford this?
The group introduced themselves and mentioned their interest in joining the discussion. Responses included:
Further questions, thoughts, and ideas were floated:
* -- retrospective linkage between the last idea and some related concerns mentioned earlier in the discussion
This combined two proposed break-out topics:
- Linked Data Use Cases for Scholars (non-library uses)
- Usage of existing linked library data sources (id.loc.gov, VIAF, etc.)
Use Cases generally fell into two categories, which might be labeled
"inward" and "outward" facing.
Conversation started around discussion of existing usage statistics:
- LC just started collecting stats and aren't seeing much use [1]
- OCLC's VIAF is seeing a doubling of resolution of URIs each month:
- 30,000 "303" redirects in June
- This raised the question "Why?"
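For readers unfamiliar with why "303" redirects are the thing being counted: a common linked data practice is for a URI that names a real-world thing (a person, a subject) to answer with HTTP 303 See Other, pointing at a document about that thing. The sketch below shows the idea only. The `/id/` and `/doc/` path patterns and the function name `resolve` are assumptions for illustration, not VIAF's or LC's actual URL layout, and the Accept-header matching is deliberately naive (no q-values).

```python
def resolve(path, accept="text/html"):
    """Return (status, headers) for a request to a hypothetical /id/<key> URI."""
    if not path.startswith("/id/"):
        return 404, {}

    key = path[len("/id/"):]
    # Pick a representation from the Accept header (naive substring match).
    if "application/rdf+xml" in accept:
        location = f"/doc/{key}.rdf"
    else:
        location = f"/doc/{key}.html"

    # 303 See Other: the thing itself is not a document, but here is one about it.
    return 303, {"Location": location, "Vary": "Accept"}


rdf_response = resolve("/id/12345", accept="application/rdf+xml")
html_response = resolve("/id/12345")
```

Under this pattern, each count of a "303" corresponds to one client dereferencing an identifier, whether that client is a browser or a program asking for RDF.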
Discussion of Value Proposition of Linked Data
- Europeana Whitepaper mentioned [2]
- Discussion of "Rationalizing Serendipity" as being the primary use
case for scholarship
- Scholars finding each other's works, related works, looking at
interdisciplinary collaboration
- Cornell/Florida "Vivo" project to expose university faculty on the
web [3]
- Driven largely by Grant-Tracking needs?
- Was also subject of a session on Sunday morning (along with
id.loc.gov)
Discussion of general use case to expose our authorities for disambiguation
- NY Times has approached both LC and OCLC to discuss using linked
names and subjects in their own linked data apps.
Use case for matching and controlling names in dissertations
In general, largest needs seem to be for working with Names of People
and Places (Geo-coordinates?)
- IETF has a URI Scheme for Places [4]
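As a rough illustration of the IETF scheme for places mentioned above (the "geo:" URI of RFC 5870), here is a minimal parsing sketch. It assumes the basic form `geo:lat,lon[,alt][;name=value...]`; the function name `parse_geo_uri` and the sample coordinates are made up for the example, and the sketch does not validate coordinate ranges or handle percent-encoding.

```python
def parse_geo_uri(uri):
    """Split a 'geo:' URI into latitude, longitude, altitude and parameters."""
    scheme, _, rest = uri.partition(":")
    if scheme.lower() != "geo" or not rest:
        raise ValueError(f"not a geo URI: {uri!r}")

    # Coordinates come first; optional parameters follow, separated by ';'.
    coords, *params = rest.split(";")
    numbers = [float(part) for part in coords.split(",")]
    if len(numbers) not in (2, 3):
        raise ValueError("expected latitude,longitude[,altitude]")

    parsed_params = {}
    for param in params:
        name, _, value = param.partition("=")
        parsed_params[name.lower()] = value

    return {
        "lat": numbers[0],
        "lon": numbers[1],
        "alt": numbers[2] if len(numbers) == 3 else None,
        "params": parsed_params,
    }


place = parse_geo_uri("geo:38.8977,-77.0365;u=50")
```

Here `u=50` is the scheme's uncertainty parameter in meters; the sketch keeps it as an uninterpreted string.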
Generic Use Case for pulling cross-references from existing vocabularies
into non-ILS/non-MARC indexes
Generic Use Case for connecting Names (via, e.g., Vivo) to Topics (areas
of interest) using id.loc.gov
- Note that this is in part what WC Identities is doing, though not
via Linked Data
Brief discussion of Intellectual Property Applications
Use case re: Exposing Library Vocabs to the web for purposes of Search
Engine Optimization
- Enhance participation in programs like Google's "Rich Snippets" and
Facebook's "Open Graph" [5],[6]
Quick Win: Putting RDFa in OPACs - Google has indicated they would be
interested in seeing this
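To make the RDFa "quick win" concrete, here is a minimal sketch of how an OPAC display template might wrap a title and creator in RDFa attributes. It assumes RDFa 1.1 attribute syntax and Dublin Core terms; the function name `record_to_rdfa`, the record URI, and the sample record are hypothetical, and nothing here reflects what any particular OPAC or search engine actually requires.

```python
from html import escape

DCTERMS = "http://purl.org/dc/terms/"


def record_to_rdfa(record_uri, title, creator):
    """Render a tiny bibliographic display as HTML carrying RDFa attributes."""
    return (
        f'<div prefix="dc: {DCTERMS}" about="{escape(record_uri)}">\n'
        f'  <span property="dc:title">{escape(title)}</span>\n'
        f'  <span property="dc:creator">{escape(creator)}</span>\n'
        f"</div>"
    )


snippet = record_to_rdfa(
    "http://example.org/record/42",   # hypothetical record URI
    "Moby-Dick & Other Tales",        # made-up sample data
    "Melville, Herman",
)
```

The visible display is unchanged for patrons; the added attributes simply let a crawler read the same page as structured data.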
In Summary:
* Inward facing use cases are typically about supply chain issues,
improving re-use of data and aggregating search
* Outward facing use cases are harder to discuss but allow other users
to leverage linked data sources from the library world.
[1] http://bit.ly/id-loc-gov-chart
[2] http://version1.europeana.eu/web/europeana-project/whitepapers
[3] http://vivoweb.org/
[4] http://tools.ietf.org/html/rfc5870
[5] http://code.google.com/apis/customsearch/docs/snippets.html#structured_data
[6] http://developers.facebook.com/docs/opengraph
Names:
Subjects:
General: