Notes from Linked Library Data Unconference

American Library Association Conference, June, 2010

A group of about 50 persons gathered for a half day at the ALA meeting in Washington, DC, to discuss linked library data. These are the notes from the discussions that took place at that meeting.


Application Profiles

Literacy and Training

ONIX Data as Linked Data

FRBR + Linked Data

Use Cases

Authorities and Linked Data

Discussion of Application Profiles

Although this discussion began with application profiles (APs), it soon became clear that the topic was broader: it included (or was primarily about) the question of how to determine which vocabularies you can use, and when it is necessary to create properties of your own.

We talked about the importance of DEFINITION and CONTEXT in the re-use of defined elements.

  1. you should only re-use an element if your use is compatible with the definition of the element
  2. in determining if your use is compatible, you need to look at the context in which the element is defined and used. This is important because formal definitions may not fully express the usage of the property, so looking at the larger context can fill in information relating to the property's meaning
  3. is the data model of the vocabulary compatible with your usage?

You may also want to determine the possible SUSTAINABILITY of the schema before making use of it. Is the schema being currently maintained? Is there an active user group?

Then we talked about what you do if you do not find a property that fits your need. In particular, we talked about two possibilities:

  1. there is a vocabulary that fits your needs (at least to use some of its data elements) but it has not been registered or defined in a linked-data compatible way
  2. there is no vocabulary that has a property that meets your need

In the case of 1), it may be necessary for you to define a property under your domain that represents the un-registered property. In this case, it is best to create a separate namespace under your domain that makes it clear that this is a "borrowed" property. If the property is later defined by its owner, you can create a link between the two definitions that declares them "equivalent" (e.g. owl:sameAs or, for properties specifically, owl:equivalentProperty).
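
As a rough illustration of the "borrowed" namespace approach described above, the following Python sketch mints a stand-in property and later links it to the owner's definition. Every URI here (example.org, publisher.example, the `pageCount` property) is hypothetical, and the choice of owl:equivalentProperty for the link is an assumption rather than something the group settled on.

```python
# Hypothetical namespaces: none of these URIs come from the discussion.
BORROWED_NS = "http://example.org/borrowed/onix/"   # our clearly labeled stand-in namespace
OWNER_NS = "http://publisher.example/vocab/"        # the owner's later, official namespace
OWL_EQUIVALENT_PROPERTY = "http://www.w3.org/2002/07/owl#equivalentProperty"
RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"


def borrowed_property(local_name, note):
    """Return the stand-in URI plus triples describing it."""
    uri = BORROWED_NS + local_name
    return uri, [(uri, RDFS_COMMENT, note)]


def link_to_owner(borrowed_uri, owner_uri):
    """Return the triple declaring the stand-in equivalent to the owner's property."""
    return (borrowed_uri, OWL_EQUIVALENT_PROPERTY, owner_uri)


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples; objects starting with http are treated as URIs."""
    lines = []
    for s, p, o in triples:
        if o.startswith("http"):
            obj = f"<{o}>"
        else:
            obj = '"' + o.replace("\\", "\\\\").replace('"', '\\"') + '"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


# Step 1: define the borrowed property under our own domain.
uri, triples = borrowed_property("pageCount", "Borrowed from an unregistered vocabulary")
# Step 2 (later): the owner publishes the property, so we link the two definitions.
triples.append(link_to_owner(uri, OWNER_NS + "pageCount"))
print(to_ntriples(triples))
```

Keeping the stand-in in its own namespace means existing data does not have to be rewritten once the owner publishes; only the linking statement is added.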

In the case of 2), if you can find a broader property that has been defined, you can define your property as a sub-property of a defined property, thereby connecting your vocabulary to an existing one. As an example, if you need to express a journal title, you may be able to define your property as a sub-property of dc:title

     journalTitle: subproperty of dc:title
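
To show why the sub-property link matters, here is a minimal Python sketch (not a real RDF library) in which a query for dc:title also finds values stated with the narrower property. The `ex:` prefix, the sample records, and the helper names are all hypothetical.

```python
RDFS_SUBPROPERTY_OF = "rdfs:subPropertyOf"
DC_TITLE = "dc:title"
JOURNAL_TITLE = "ex:journalTitle"   # hypothetical locally defined property

# Schema statement: journalTitle is a sub-property of dc:title.
schema = [(JOURNAL_TITLE, RDFS_SUBPROPERTY_OF, DC_TITLE)]

# Sample data (invented for illustration).
data = [
    ("ex:issue42", JOURNAL_TITLE, "Example Journal of Metadata"),
    ("ex:book7", DC_TITLE, "Cataloging Basics"),
]


def superproperties(prop, schema):
    """Return prop plus every property it is (transitively) a sub-property of."""
    found, todo = {prop}, [prop]
    while todo:
        current = todo.pop()
        for s, p, o in schema:
            if s == current and p == RDFS_SUBPROPERTY_OF and o not in found:
                found.add(o)
                todo.append(o)
    return found


def values_for(prop, data, schema):
    """Find values stated with prop itself or with any of its sub-properties."""
    return [(s, o) for s, p, o in data if prop in superproperties(p, schema)]


print(values_for(DC_TITLE, data, schema))       # both records
print(values_for(JOURNAL_TITLE, data, schema))  # only the journal issue
```

A consumer that only understands dc:title can still use the journal title, while applications that know the local vocabulary keep the finer distinction.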

Why Create APs?

APs have (at least) two major functions:

  1. communication: an AP expresses your metadata realm, and makes it possible for you to communicate this to others
  2. validation: an AP allows you to validate that data meets your requirements
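
The validation function can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the profile format (a dictionary of minimum and maximum occurrences per property), the chosen properties, and the sample record are all hypothetical, and real application profiles usually carry more than cardinality rules.

```python
# A hypothetical application profile: property -> occurrence constraints.
PROFILE = {
    "dc:title": {"min": 1, "max": 1},
    "dc:creator": {"min": 0, "max": None},   # None means no upper bound
    "dc:date": {"min": 1, "max": 1},
}


def validate(record, profile):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for prop, rule in profile.items():
        count = len(record.get(prop, []))
        if count < rule["min"]:
            problems.append(f"{prop}: expected at least {rule['min']}, found {count}")
        if rule["max"] is not None and count > rule["max"]:
            problems.append(f"{prop}: expected at most {rule['max']}, found {count}")
    # Properties outside the profile are reported rather than silently accepted.
    for prop in record:
        if prop not in profile:
            problems.append(f"{prop}: not part of this profile")
    return problems


record = {
    "dc:title": ["Linked Data Notes"],
    "dc:creator": ["A. Librarian"],
    "ex:shelf": ["B2"],
}
print(validate(record, PROFILE))
```

The same `PROFILE` structure doubles as documentation, which touches on the communication function as well: it tells a partner which properties to expect and how often.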

Literacy and Training

Notes from Unconference session on literacy and training

Where are we aiming our efforts?

  • Working librarians
  • Library schools
  • Vendors
  • Curricula considerations


    Clarify future vision

    Address the tyranny of the record



    ONIX Data as Linked Data

    The first issue we discussed was that in our current model we tend to create metadata that is static. Data, however, is constantly changing and we are very poor about reflecting this. Linked data gives us a chance to make use of dynamic data that more accurately reflects the real world. The web has made our users much more accepting of constant change.

    This led to a discussion of the use of linked data in our ILS. Most (all?) are not currently able to make use of it. If we want to make use of linked data in a library environment it will have to be outside the ILS. Stanford is interested in making use of linked data for a digital map project. Because we will be working outside of the ILS, we cannot put the controlled headings under authority control. Linked data will allow us to do this as the headings will be dynamically updated. It was pointed out that a simple link to data would not be enough. At some point, you would need to capture the data for indexing, faster display, preservation, etc.

    Is there an incentive for publishers to share their ONIX data? It's a by-product of their business but they make it for internal use. OCLC has developed a project in which they receive ONIX data from publishers to enhance catalog records in WorldCat and in return publishers receive quality "work" information that they can use to enhance their metadata. Both sides win.

    Do we really want all the ONIX data a publisher creates? They start creating it for the book at a concept level and it grows until production, and then post-production with reviews, etc. Often, the metadata is very poor by our standards because it's not meant to do the same things. If publishers are aware, however, that people can make use of this data to sell books, they will be more motivated to curate it. When do we tap into the stream? Do we want it all, and do we allow it to grow in our discovery environments?

    Will linked data change the way catalogers do their work? Can we rely on linked data to create a basic descriptive record? In a world of links, we should no longer need to create unique text strings to identify entities. This will allow, for instance, name headings to be registered (in VIAF?) for all those millions of names we cannot put through the NACO process. Our focus will shift from the hand-crafting of individual bibliographic records for individual items to an evolving cluster of links that dynamically draws in data as an item evolves with time.

    We ended our discussion talking about the importance of a testbed for developing ideas, some place that we can get in and play. The complete record examples in RDA would be interesting to use, also OCLC data on works and identities. OCLC’s place as an aggregator of information could be very important.

    Also, we need to develop the apps that will sell the concept of linked data. It's the classic chicken-and-egg problem. Perhaps by just exposing the data the apps will be developed? Linked data is like the web before Mosaic. There is some practical ONIX data that would be of great use to us currently (TOCs, author bios, reviews, etc.), but if we could load them into our ILS, how would we display them? Would they make the bibliographic display unusable?

    The last issue we discussed was the persistence of data, particularly when dealing with publishers: they don't commit, they go out of business, or they are absorbed. We need to think about the preservation of all this data as well. Should we become the archivers? Could we afford this?

    FRBR + Linked Data

    The group introduced themselves and mentioned their interest in joining the discussion. Responses included:

  • How do we transition to another format?, that:
  • Standards? Specifically with respect to the hierarchical nature of library data
  • The transition - deconstructing current data structures to have "hooks" to which to link.
  • How to bring this vision/future practices to the "simple, country cataloger"?

    Further questions, thoughts, and ideas were floated:

    * -- retrospective linkage between the last idea and some related concerns mentioned earlier in the discussion

    Use Cases Discussion Group

    This combined two proposed break-out topics:
    - Linked Data Use Cases for Scholars (non-library uses)
    - Usage of existing linked library data sources (, viaf, etc)
    Use Cases generally fell into two categories, which might be labeled "inward" and "outward" facing.
    Conversation started around discussion of existing usage statistics:
     - LC just started collecting stats and aren't seeing much use [1]
     - OCLC's VIAF is seeing URI resolutions double each month:
        - 30,000 "303" redirects in June
        - This raised the question "Why?"
    Discussion of Value Proposition of Linked Data
     - Europeana Whitepaper mentioned [2]
     - Discussion of "Rationalizing Serendipity" as being the primary use case for scholarship
     - Scholars finding each other's works, related works, looking at interdisciplinary collaboration
     - Cornell/Florida "Vivo" project to expose university faculty on the web [3]
       - Driven largely by Grant-Tracking needs?
       - Was also subject of a session on Sunday morning (along with
    Discussion of general use case to expose our authorities for disambiguation
     - NY Times has approached both LC and OCLC to discuss using linked names and subjects in their own linked data apps.
    Use case for matching and controlling names in dissertations
    In general, largest needs seem to be for working with Names of People and Places (Geo-coordinates?)
     - IETF has a URI Scheme for Places [4]
    Generic Use Case for pulling cross-references from existing vocabularies into non-ILS/non-MARC indexes
    Generic Use Case for connecting Names (via, e.g., Vivo) to Topics (areas of interest) using id.loc
      - Note that this is in part what WC Identities is doing, though not via Linked Data
    Brief discussion of Intellectual Property Applications
    Use case re: Exposing Library Vocabs to the web for purposes of Search Engine Optimization
      - Enhance participation in programs like Google's "Rich Snippets" and Facebook's "Open Graph" [5],[6]
    Quick Win: Putting RDFa in OPACs - Google has indicated they would be interested in seeing this
    In Summary:
    * Inward facing use cases are typically about supply chain issues, improving re-use of data and aggregating search
    * Outward facing use cases are harder to discuss but allow other users to leverage linked data sources from the library world.
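
The "quick win" of putting RDFa in OPACs, mentioned above, might look roughly like the following Python sketch, which renders a record display as an HTML fragment carrying RDFa attributes. The record URI, the field values, and the `record_to_rdfa` helper are hypothetical; the sketch uses the RDFa 1.1 `prefix` attribute with Dublin Core terms, whereas RDFa 1.0 pages would declare the prefix with an `xmlns:` attribute instead.

```python
from html import escape

DC_TERMS = "http://purl.org/dc/terms/"


def record_to_rdfa(record_uri, fields):
    """Render a record as an HTML fragment with RDFa attributes.

    fields maps a Dublin Core term name (e.g. "title") to its display value.
    """
    rows = [
        f'  <span property="dc:{escape(term)}">{escape(value)}</span>'
        for term, value in fields.items()
    ]
    opening = f'<div prefix="dc: {DC_TERMS}" about="{escape(record_uri)}">'
    return "\n".join([opening, *rows, "</div>"])


# Hypothetical OPAC record, invented for illustration.
print(record_to_rdfa(
    "http://opac.example.org/record/123",
    {"title": "Cataloging & Classification", "creator": "A. Librarian"},
))
```

The visible text of the page is unchanged; the added attributes are what let a crawler read the title and creator as structured statements about the record URI.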

    Authorities and Linked Data