PNLA Quarterly home
 

The Catalogers' Revenge: Unleashing the Semantic Web

Virginia Schilling

 

Virginia (Ginger) Schilling works at the Center for Bibliographic Studies & Research at the University of California at Riverside. She can be reached at: virginia.schilling@ucr.edu

Introduction

One theme that seems to be a consistent undertone in the library-related literature, and one that is reinforced by popular culture, is that cataloging, that meticulous process of identifying bits of data, is no longer necessary. Depending on the perspective, the tone varies between dread and glee. Cataloging is dead, they say. Catalogers are an anachronism and it is just a matter of time before that job function disappears entirely. We don't need catalogers; we have simple keyword searches and social tagging. We can just throw all that undifferentiated text out to the world and let the users sort it out. Look, these proponents say, how well it works on the World Wide Web. So I find it to be the supreme irony that the next revolution in web technology is slated to be exactly the opposite of this supposed trend away from the careful description and precise identification of "things" and the relationships between them.

The World Wide Web first revolutionized the presentation of text. People thousands of miles away from each other could suddenly see the same exact text or data at the same time in the same format. (Well, sort of. It was identical when there was only one browser. These days, of course, presentation of the text can vary between web browsers with different capabilities. Ask anyone who has tried to code a web page for both Microsoft's Internet Explorer and Mozilla's Firefox.) It is hard, now, to remember when it was pretty amazing just to be able to see web pages. The second wave of data handling has come with the collaboration technologies: social tagging, networking websites like Facebook and the interactivity of Wikis and blogs. Both of these technology sea changes have been aimed at making data accessible to people. Both have improved a person's ability to read the text or data presented and interpret meaning from it, for good or for ill.

The next wave of technology will make data accessible to computers as well as people. Instead of undifferentiated text presented on a web page, each data point will be coded in a way that computer programs will be able to understand and interpret. This next wave of technology change will lead us into the semantic web. The World Wide Web Consortium (W3C) defines the semantic web as providing "a common framework that allows data to be shared and reused across application, enterprise, and community boundaries" (2009b, para 1). Campbell and Fast note "information will be machine understandable, as well as machine readable, enabling intelligent agents to draw sophisticated inferences from the metadata attached to Web-based information" (2004, p. 383).

Basic Structure of the Semantic Web

Web pages are generally coded using either HTML or the stricter XHTML markup languages (collectively known as X/HTML). However, these languages only tag data on the web page for presentation purposes (i.e. they say things like "make this word bold"), not for the actual meaning delivered by the content (they don't say "this word is the name of a city"). Using markup languages that code for meaning in addition to presentation will allow software to find and use specific bits of information on the web page, such as a date or a person's name, rather than just understanding everything on the page as one gigantic mass of text. Each bit becomes a separate piece of information with its own individual meaning. In some ways, the concept is like taking everything on the Internet and putting it into a gigantic distributed database.

The real power of semantic markup, however, is that implicit relationships between bits of data can be established by the computer. People can read the text of two different web pages, for example, and be able to interpret implicit relationships between the data in each one. A computer cannot do this. If on one web page, a city is stated to be in a particular country and on another separate web page, a person is stated to be in that same city, then the implicit statement that the person is located in that same country can be understood easily by a person. Semantic markup will allow that implicit relationship to be also understood by a computer.
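The city-and-country example can be sketched in a few lines of code. What follows is only an illustration under stated assumptions: the person's name, the "locatedIn" label and the single inference rule are all invented here, and real semantic web software would work with RDF data and a proper reasoner rather than a hand-written loop.

```python
# Each statement is a (subject, predicate, object) triple, the shape RDF uses.
# The names and the "locatedIn" predicate are invented for illustration.
page_one = [("Riverside", "locatedIn", "United States")]
page_two = [("Alice", "locatedIn", "Riverside")]


def infer_locations(triples):
    """Apply one rule until nothing new appears:
    if A locatedIn B and B locatedIn C, then A locatedIn C."""
    known = set(triples)
    changed = True
    while changed:
        changed = False
        for (a, p1, b) in list(known):
            for (b2, p2, c) in list(known):
                if p1 == p2 == "locatedIn" and b == b2:
                    new_fact = (a, "locatedIn", c)
                    if new_fact not in known:
                        known.add(new_fact)
                        changed = True
    return known


# Statements from two separate pages are merged, then the rule is applied.
merged = infer_locations(page_one + page_two)
print(("Alice", "locatedIn", "United States") in merged)
```

Neither page states that the person is in the country; the statement appears only after the two sets of triples are combined and the rule is applied.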

Semantic web markup will consist of a vocabulary used within a defined syntax similar to the way that HTML is implemented. A software package designed to "understand" what the markup means will be able to extract and use the tagged information. For example, a search engine designed to "read" the tags that indicate a person's name versus those that indicate a corporate entity will be able to distinguish between a web page containing biographical information about the person Abraham Lincoln and a web page for an elementary school named Abraham Lincoln.
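A minimal sketch of the Abraham Lincoln case follows. It assumes a search index in which every name has already been tagged with a type; the page addresses, the type labels "person" and "organization" and the search function are hypothetical and do not come from any particular standard.

```python
from dataclasses import dataclass


@dataclass
class TaggedName:
    page_url: str     # hypothetical page address
    entity_type: str  # illustrative labels: "person" or "organization"
    name: str


# Two pages carry the same text string but different semantic tags.
index = [
    TaggedName("http://example.org/lincoln-biography", "person", "Abraham Lincoln"),
    TaggedName("http://example.org/lincoln-elementary", "organization", "Abraham Lincoln"),
]


def search(name, entity_type):
    """Return pages where the name was tagged with the requested type."""
    return [t.page_url for t in index
            if t.name == name and t.entity_type == entity_type]


print(search("Abraham Lincoln", "person"))
```

A plain keyword search would return both pages; the type tag is what lets the search engine tell the president from the school.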

As shown in the example above, at the most general level, semantic tagging will improve the precision of search results for Internet search engines designed to use it. Beyond basic search results, it will also allow users to tune the search engine, or agent as it is often called, so that it understands their context for searching. Harper and Tillett (2007) state:

Another large part of the Semantic Web vision is about enabling "agents" or systems to insert a searcher's/user's individual context or perspective into a search for information. This necessarily involves interacting with the elements that make up that context, such as schedules, contacts, group membership, profession, role, interests, hobbies, location, etc. Systems can then be developed that "understand" the searcher's needs, based on who the searcher is and the searcher's "context" or demographics. (p. 65)

An Overview of Current Research: Standards Development

Research in this field is led by the W3C. The actual technology to carry out the goals of the semantic web is still in its infancy, where it exists at all. Current research is being directed primarily towards establishing standards and developing basic specifications to ensure interoperability in the future and to allow the construction of the tools and components that will form the invisible backbone of the semantic web.

W3C has developed standards/specifications for an abstract model to describe relationships between "things", expressed as Resource Description Framework (RDF) (2004a), a semantic schema to allow the description of other vocabularies in RDF (RDFS) (2004b) and a syntax for RDF in XML (RDF/XML) (2004c). Gleaning Resource Descriptions from Dialects of Languages (GRDDL) is a specification for extracting RDF content from marked up XML or XHTML pages (2007). Simple Knowledge Organization System (SKOS) is a specification for converting existing controlled vocabularies into an RDF-compliant form (2009b). SPARQL Query Language for RDF is designed to do exactly what it says: query RDF-compliant data (2008c). The Web Ontology Language (OWL) is yet another extension of semantics for RDF, allowing for much more sophisticated use than that supported by the basic model and RDFS (2009c). RDFa is a specification for representing RDF in XML and XHTML documents (2008a). The specification for Protocol for Web Description Resources (POWDER) builds on these other specifications to allow the description of groups of web resources for purposes such as customized retrieval of resources or the identification of resource authenticity (2009a).
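To make the idea of querying RDF data a little more concrete, the sketch below stores a few statements as triples and matches them against a pattern in which None acts as a wildcard. This is not SPARQL and does not use any RDF library; it is only a rough analogy for what a query language over triples does. The page addresses are invented, while the Dublin Core namespace is the one used in the XHTML examples later in this article.

```python
DC = "http://purl.org/dc/elements/1.1/"

# A tiny set of statements: (resource, property, value).
triples = [
    ("http://example.org/page1", DC + "title", "Naked Metadata"),
    ("http://example.org/page1", DC + "creator", "Jonathan O'Donnell"),
    ("http://example.org/page2", DC + "title", "Another Page"),
]


def match(pattern, data):
    """Return every triple that agrees with the pattern.
    None in any position means "anything"."""
    s, p, o = pattern
    return [t for t in data
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


# "Give me the title of every resource."
titles = [value for (_, _, value) in match((None, DC + "title", None), triples)]
print(titles)
```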

However, W3C is not the only organization developing tools and standards that will underpin the semantic web. There are organizations with projects contributing to development all over the world. Library of Congress has made its subject headings data available in RDF/XML (n.d.). The DCMI/RDA Task Group has started a project to convert Resource Description and Access (RDA) into RDF (2008). The International Federation of Library Associations and Institutions (IFLA) is busy translating its Functional Requirements for Bibliographic Records (FRBR) into RDF (2008).

Completely separate from W3C but with the same idea in mind, the open source community has developed a set of formats called "Microformats" (About microformats, n.d.). Just like RDFa, these allow the use of existing XHTML tags to add meaning to the data they mark up. hCalendar allows events to be tagged in such a way that the information can be extracted and, for example, added to a calendar somewhere else. hCard allows contact information to be marked up in the same way. Formats exist to describe resumes, reviews and Atom feeds. Other formats are under construction to describe audio, recipes and citations (Microformat, 2009). Talis provides another way of adding RDF-compliant tags to a web page with eRDF: Embeddable (or Embedded) RDF (Talis, 2006).
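As a rough illustration of how a tool might pull hCard data out of a page, the sketch below uses Python's standard HTML parser to collect the text of elements whose class is one of a few hCard property names (fn, org, email, tel). It is a simplification built on assumptions: it does not check that the properties sit inside a "vcard" root element, it ignores most of the hCard specification, and the sample markup was written for this example.

```python
from html.parser import HTMLParser

# A small subset of hCard property names.
HCARD_PROPERTIES = {"fn", "org", "email", "tel"}


class HCardParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.card = {}
        self._stack = []  # one entry per open element: a property name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        prop = next((c for c in classes if c in HCARD_PROPERTIES), None)
        self._stack.append(prop)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Text belongs to the innermost open element, if it carries a property.
        prop = self._stack[-1] if self._stack else None
        if prop and data.strip():
            self.card[prop] = self.card.get(prop, "") + data.strip()


sample = """
<div class="vcard">
  <span class="fn">Virginia Schilling</span>,
  <span class="org">University of California at Riverside</span>
  <a class="email" href="mailto:virginia.schilling@ucr.edu">virginia.schilling@ucr.edu</a>
</div>
"""

parser = HCardParser()
parser.feed(sample)
card = parser.card
print(card)
```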

The National Archives in the United Kingdom has developed PRONOM, an authoritative registry of digital file formats for use in the RDF/XML environment (n.d.). The Global Digital Format Registry (GDFR), developed by Harvard (n.d.), is merging with PRONOM to become the UDFR or Unified Digital Formats Registry (2009). The Dublin Core Metadata Initiative (DCMI) has a registry of metadata schemes, The Dublin Core Metadata Registry (2008), as does the National Science Digital Library (NSDL). The NSDL Metadata Registry "provides services to developers and consumers of controlled vocabularies and is one of the first production deployments of the RDF-based Semantic Web Community's Simple Knowledge Organization System (SKOS)" (2009, Welcome to The Registry! section). The JISC IE Metadata Schema Registry (IEMSR) "will act as the primary source for authoritative information about metadata schemas recommended by the JISC IE Standards framework" (2009, About IEMSR section).

There are numerous descriptive vocabularies that can be used for semantic markup of resources. DCMI designed Dublin Core (2005) specifically with web resources in mind. The Gateway to Educational Materials (GEM) describes web-based educational resources (2009). The Public Health Information Network (PHIN) vocabulary developed and maintained by the Centers for Disease Control and Prevention (CDC) "enables data from different programs to be consistently documented" (2005, p. 4). Hundreds of other vocabularies exist both within the focus of library work and completely outside of and unrelated to it: TEI, EAD, FOAF, DOAP and so on.

Finally, a great many projects documented in the literature test the possibilities of semantic web technology. Tonkin & Strelnikov (2009) discuss the JISC metadata registry mentioned above and Heery & Wagner (2002), the DCMI metadata registry. Hildebrand et al. (2009), Angjeli et al. (2009) and Guzmán Luna, Torres Pardo & López García (2006) each discuss projects to develop or use specific thesauri. Talantikite, Aissani & Boudjlida (2008), Arch-int & Sophatsathit (2003) and Uddin & Janecek (2007) all discuss the general use of ontologies. Chavarriaga & Macias (2009) look at modeling a semantic web-based interface. Damiani & Fugazza (2007) discuss the management of intellectual property rights using semantic web technologies.

Implementation and Use of the Technology

RDF is not used in web pages directly. It is simply a model for describing the relationship between two things. Just as data can be coded into XML and displayed by a program that reads XML, it can be coded into RDF and read by RDF-compatible software. RDF, however, cannot be read by standard web browsers at this time. Instead, the web pages are marked up by some other method and used by a separate software program that can extract that information and translate it into RDF. Similar to the method used by OAIster, the metadata in the web page is harvested and stored by a separate tool. Currently, there are two basic methods for marking up data semantically in a web page.

The first, most basic, method is to use the existing meta tags available in the X/HTML header to directly code values for one or more descriptive metadata vocabularies, like Dublin Core. The second method is to extend existing X/HTML coding in the web page body using semantic tag attributes defined by one or more profiles, such as eRDF, RDFa or Microformats, to hold the metadata vocabularies. The choice of which method to use depends on the resource being described and for what purpose. Does a metadata harvester, for example, expect to find the information in meta tags or does it look for a link to a profile that defines the elements used in the body of the page?

This first method of using existing meta tags seems somewhat like a stop-gap measure to be used until better standards can supplant it. Meta tags don't identify data within the text in the body of the page. The data has to be pulled out and separately tagged in the head, meaning that if changed, it must be changed in two places, one of which may not be visible in WYSIWYG X/HTML editors. The functionality is also more limited than embedding the tags in the body because the metadata is often understood by RDF extraction software to refer to the web page as a whole and not to individual pieces within the web page. In his brief discussion about embedding metadata in X/HTML, O'Donnell gives an example of embedded metadata. Adapted from his example, the use of Dublin Core in the meta tags might look something like:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
<meta name="DC.title" content="Naked Metadata" />
<meta name="DC.creator" content="Jonathan O'Donnell" />
<meta name="DC.rights" content="http://purl.nla.gov.au/net/jod/tutorial/naked-metadata.html © Jonathan O'Donnell 23 October 2005" />
<meta name="DC.date" content="23 October 2005" />
</head>
<body>
<h1>Naked Metadata</h1>
<h2>Jonathan O'Donnell</h2>
<p>http://purl.nla.gov.au/net/jod/tutorial/naked-metadata.html © Jonathan O'Donnell 23 October 2005</p>
</body>
</html>
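A harvester that expects metadata in the head could read the meta tags of a page like the one above along the following lines. This is a minimal sketch using Python's standard HTML parser; the class name is invented, the sample page is a shortened copy of the example, and a real harvester would also need to handle character encodings, link elements and qualified element names.

```python
from html.parser import HTMLParser


class DublinCoreMetaParser(HTMLParser):
    """Collect <meta name="DC.*" content="..."> values from a page."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name") or ""
        if tag == "meta" and name.lower().startswith("dc."):
            content = (attrs.get("content") or "").strip()
            self.metadata.setdefault(name, []).append(content)


page = """<html>
<head>
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
<meta name="DC.title" content="Naked Metadata" />
<meta name="DC.creator" content="Jonathan O'Donnell" />
</head>
<body><h1>Naked Metadata</h1></body>
</html>"""

parser = DublinCoreMetaParser()
parser.feed(page)
metadata = parser.metadata
print(metadata)
```

Note that nothing in the result says where in the body the title or creator appears; the values describe the page as a whole, which is the limitation discussed above.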

The second method uses the existing X/HTML structure to include semantic tags within the body of a web page. In the head, a reference is made to a pre-existing profile that defines how the elements are used. Bits of data are enclosed in elements such as div or span, or the element that already contains them (a heading or paragraph, for example) is reused. Semantic values, carried in attributes such as class, then link specific elements of particular metadata vocabularies to the tagged text. This method has the advantage of tagging the information right in the text where it occurs. The example below from O'Donnell (2006, "Example" section) shows how Dublin Core might be used within the body of an XHTML page.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head profile="http://purl.org/NET/erdf/profile">
<link rel="schema.dc" href="http://purl.org/dc/elements/1.1/" />
</head>
<body>
<h1 class="dc-title">Naked Metadata</h1>
<h2 class="dc-creator">Jonathan O'Donnell</h2>
<p class="dc-rights">http://purl.nla.gov.au/net/jod/tutorial/naked-metadata.html © Jonathan O'Donnell <span class="dc-date">23 October 2005</span></p>
</body>
</html>
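The sketch below suggests how extraction software might read markup of this kind. It assumes the convention visible in the example: a link element with rel="schema.dc" declares a prefix, and a class value such as "dc-title" joins that prefix to an element name. The parser class is invented for illustration and covers only this one convention, not the full eRDF rules; the sample page is a shortened variant of the example.

```python
from html.parser import HTMLParser


class EmbeddedDCParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.namespaces = {}  # prefix -> namespace URI, from <link rel="schema.PREFIX">
        self.values = {}      # full property URI -> text content
        self._open = []       # per open element: the property URIs it carries

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = attrs.get("rel") or ""
        if tag == "link" and rel.startswith("schema."):
            self.namespaces[rel[len("schema."):]] = attrs.get("href") or ""
        props = []
        for token in (attrs.get("class") or "").split():
            prefix, sep, local = token.partition("-")
            if sep and prefix in self.namespaces:
                uri = self.namespaces[prefix] + local
                props.append(uri)
                self.values.setdefault(uri, "")
        self._open.append(props)

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        # Text counts toward every tagged element that is currently open,
        # so a nested date is also part of the enclosing rights statement.
        for props in self._open:
            for uri in props:
                self.values[uri] += data


page = """<html><head profile="http://purl.org/NET/erdf/profile">
<link rel="schema.dc" href="http://purl.org/dc/elements/1.1/" />
</head><body>
<h1 class="dc-title">Naked Metadata</h1>
<p class="dc-rights">Copyright Jonathan O'Donnell <span class="dc-date">23 October 2005</span></p>
</body></html>"""

parser = EmbeddedDCParser()
parser.feed(page)
extracted = {uri: text.strip() for uri, text in parser.values.items()}
print(extracted)
```

Unlike the meta tag approach, each value here is tied to the exact text it describes, so the page content and its metadata cannot drift apart.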

A variety of tools exist to either generate or use semantically tagged data. DC-dot (n.d.) generates semantic markup in X/HTML, RDF or XML for an existing web page that has none. Other projects, like GEM and PHIN, have implemented search interfaces for their own vocabularies. Extensions for Firefox, like Operator (n.d.) and Piggy Bank (2008), can extract data from web pages tagged with Microformats markup. Tools run the gamut of sophistication from simple scripts like eRDF detector (Alexander, 2007) to full-fledged data processors like ARC (2007).

However, despite the seeming multitude of tools available, this is where the infancy of the semantic web technology is most obvious. No one standard dominates the industry. There are still a variety of ways to have data semantically encoded without any correspondence, necessarily, between them. Many of the tools available are aimed at programmers and not end-users who just want to code a web page to provide access for other users, not write an entire customized suite of scripts to create and process the metadata generated. Development has begun, but has a long way to go before the semantic web will be ready for the mainstream.

How Does this Relate Specifically to Libraries?

The obvious question, for me, is how does the semantic web relate to libraries? It turns out very closely. First, the issue nearest to my heart: the semantic web has the potential to revolutionize cataloging. Cataloging could take place in software built using the same type of structure that would be used to build metadata into and extract it from web pages. Then cataloging metadata would be interoperable with all other RDF-compliant metadata on the web and, furthermore, library resources would be findable and usable in the same way as all other RDF-compliant web resources. The headaches brought on by trying to aggregate data from different, proprietary systems could become a distant memory. Different software packages to decode the individual markup schemes might not be necessary. (And there will be flowers and unicorns. People make the same claims about open source software and that has yet to make major inroads into the ILS market. However, hope springs eternal.)

It could, in fact, come to mean that instead of using text strings to assign meaning to things like the author's name, cataloging would be a process of identifying the URI where an author's name is defined and including that URI in the cataloging record in place of a text string. (An example from the Library of Congress Subject Headings would be the URI "<http://id.loc.gov/authorities/sh98007973#concept>" in place of the text "Smith-Purcell effect.") The actual text of the author's name would not be retrieved until a user calls up that record for viewing. The advantage of this scenario is that name authority information only has to be updated once, in one place.

In this bewitching vision, we would share in the creation of Uniform Resource Identifiers (URIs) for works, expressions, manifestations, persons, corporate bodies, places, subjects, and so on. At the URI would be found all of the data about that entity, including the preferred name and the variant names, ... If any of that data needed to be changed, it would be changed only once, and the change would be immediately accessible to all users, libraries, and library staff by means of links. (Yee, 2009, p. 55)
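The "change it once" idea can be sketched as follows. The URI is the Library of Congress example quoted above; everything else is an assumption made for illustration: the authority file is simulated with an in-memory dictionary rather than fetched from the web, the two catalog records are hypothetical, and the revised heading is invented to show the effect of an update.

```python
LCSH_URI = "http://id.loc.gov/authorities/sh98007973#concept"

# Stand-in for the data that would be found at the URI.
authority = {LCSH_URI: "Smith-Purcell effect"}

# Hypothetical catalog records store the URI, not the text string.
records = [
    {"title": "Record A (hypothetical)", "subject": LCSH_URI},
    {"title": "Record B (hypothetical)", "subject": LCSH_URI},
]


def display(record):
    """Resolve the URI to its current label only when the record is viewed."""
    return f'{record["title"]}: {authority[record["subject"]]}'


before = [display(r) for r in records]

# One change at the authority (an invented revision, for illustration only) ...
authority[LCSH_URI] = "Smith-Purcell radiation"

# ... is reflected in every record without editing any of them.
after = [display(r) for r in records]
print(before)
print(after)
```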

Tillett (2003) envisions an international virtual name authority file. "One proposal is to link the personal name authority file of the Library of Congress and that of the Deutsche Bibliothek (DDB) in Germany" (p. 115). The authority data linkages would be harvested and stored at a central location but the "day-to-day record maintenance activities continue to be managed as they are now by the national bibliographic agency (or regional authority)" (p. 116).

Another way that the semantic web fits in with the mission of libraries is that the core of the semantic web is built on controlled vocabularies, something librarians have been working with forever. Library work is full of pre-existing thesauri and controlled vocabularies that need only to be fitted into the appropriate structure for use as part of the semantic web.

The Semantic Web communities and library communities have both been working toward the same set of goals: naming concepts, naming entities, and bringing different forms of those names together. ... The tools and vocabularies developed in libraries, particularly those developed by the Library of Congress, are sophisticated and advanced. When translated into Semantic Web technologies they will help to realize Berners-Lee's vision. (Harper & Tillett, p. 48)

The more connections that are made between things described in various places, the more value they have collectively. Tillett supports that idea. "We already have controlled vocabularies in our various authority files. Those could be linked with other controlled vocabularies of abstracting and indexing services, of biographical dictionaries, of telephone directories, and many other reference tools and resources to help users navigate" (pp. 116-17).

Finally, Harper & Tillett quote Miller in stating that libraries have a role in developing the "layer of trust" for the semantic web. They are not clear about what that role might be, simply stating that "libraries have long standing trusted position that is applicable on the Web [sic]" (p. 50). Hillmann (2008), on the other hand, provides a hint as to how libraries might help build the trust layer:

A description consisting of aggregated sourced statements is susceptible to a variety of processes designed to provide downstream users with configurable descriptions based on their needs and capabilities. Over time, multiple statements can be rated using various criteria, and only the "best" used for exposure to downstream users who would rather not do the rating themselves. (p. 75)

By rating the descriptive statements that apply to any given resource, the library sets itself up as an authority on what information is "good" or "bad."
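One way to picture this rating process is sketched below. It is an illustration only: the statement fields, the source names and the 0.0 to 1.0 rating scale are assumptions made here and are not taken from Hillmann.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    prop: str      # the property being described, e.g. "title"
    value: str
    source: str    # who made the statement (hypothetical labels)
    rating: float  # 0.0 to 1.0, assigned by the library (hypothetical scale)


# Aggregated statements about one resource, each with its source and rating.
statements = [
    Statement("title", "Naked Metadata", "author-supplied", 0.9),
    Statement("title", "naked metadata tutorial", "harvested", 0.4),
    Statement("date", "2005-10-23", "library", 0.8),
    Statement("date", "2005", "harvested", 0.5),
]


def best_statements(all_statements):
    """Keep only the highest-rated value for each property."""
    best = {}
    for s in all_statements:
        if s.prop not in best or s.rating > best[s.prop].rating:
            best[s.prop] = s
    return {prop: s.value for prop, s in best.items()}


description = best_statements(statements)
print(description)
```

Downstream users who do not want to do the rating themselves would see only the filtered description, while the full set of sourced statements remains available to anyone who wants to apply different criteria.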

Issues and Problems

Unfortunately, it is never as simple as just issuing some standards and letting the developers take over. First of all, metadata has traditionally been centrally created by trained staff and, additionally, created after the fact. So two immediate challenges present themselves: getting the people who create the resources to also create standardized, coherent metadata to go with those resources, and getting them to create that metadata at the same time as they create the resources. Greenberg, Sutton & Campbell (2003) state that "the glories of the Semantic Web will ultimately depend on tools that will enable authors to create with very little effort RDF annotations and other useful semantic metadata on their Web pages" (p. 18). At the current time, the tools allowing the web page developer to easily mark up applicable metadata are still being developed. As Coyle (2008) notes, the semantic web is still stuck in "engineer mode."

The documents on the [W3C] Semantic Web site develop concepts, set rules, and illustrate code. But even the most basic explanatory document, the RDF Primer, lacks examples of what services could be provided and how it might look to a user of the Semantic Web. (p. 264)

A second issue with the model of the semantic web is the question of bandwidth and accessibility. If there is no such thing, for example, as an "in-house" catalog record, what level of bandwidth is required for a system to retrieve and/or process each piece of data that must be retrieved from some other source on the Internet? A library, again for example, does not have one person calling up one record at a time; it would be operating on the scale of dozens of people calling up dozens of records each. How would this impact a place that can barely even afford its Internet connection with the existing bandwidth load? D'Arcus as quoted by Yee seems to imply that the way to get around this is to not retrieve the information in real-time. According to D'Arcus it can be retrieved in off-hours and stored locally for retrieval by users (Yee, p. 65). However, this solution creates its own problems, primarily that the data is now stored in two separate places and, if changed in one, needs to be changed in the other as well.
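The trade-off can be seen in a small sketch. Everything in it is hypothetical: the remote authority source is simulated with a dictionary, the URI and names are invented, and the harvest function stands in for whatever off-hours job a library might run.

```python
# Stand-in for a remote authority source; in practice this would be
# fetched over the network.
AUTH_URI = "http://example.org/authority/1"
remote_authority = {AUTH_URI: "Preferred Name, First Version"}

local_copy = {}


def harvest():
    """Run in off-hours: copy the remote data into local storage."""
    local_copy.clear()
    local_copy.update(remote_authority)


def lookup(uri):
    """Serve users from the local copy, with no network traffic per request."""
    return local_copy.get(uri)


harvest()

# The remote source changes during the day ...
remote_authority[AUTH_URI] = "Preferred Name, Second Version"

# ... but users keep seeing the old value until the next harvest.
stale = lookup(AUTH_URI)
harvest()
fresh = lookup(AUTH_URI)
print(stale, "->", fresh)
```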

Related to this is the very ephemeral existence of many sites on the Internet. Presumably, anyone who sets themselves up as an authority, i.e. creates a vocabulary, profile, registry etc., can somehow guarantee their continuing presence for some time at least, but unlike with traditional print authorities, once the website goes down, the authority information will be completely gone as well. Abrams (2005) also worries about fragmentation of authority information. In the context of a discussion about registries for digital file formats, he states:

It appears likely that many similar format registries may be developed or at least deployed at institutions around the world. This could result in an undesirable fragmentation of important format representation information that would unnecessarily complicate the process of discovery of relevant data. (pp. 132-3)

Another issue is that there is no one technology or metadata standard that fits everything. Unlike with MARC, there is no one ubiquitous standard that everyone will use. Some standards are not even developed yet, while others are still evolving. Hillmann notes that even within communities, agreement on just a representative vocabulary can be difficult.

For the most part communities using metadata are still floundering in their attempts to figure out where the best balance between "rich and comprehensive" and "efficient and functional" can be defined. Part of the challenge is that few communities of practice have been able to define their needs as a community and take the next steps to implement services that support their goals. (p. 68)

Finally, the model for metadata exchange outside of MARC, as exemplified by Dublin Core, currently, is to develop detailed and careful descriptions for in-house use and make only simplified data available for exchange. Dublin Core is designed with core elements and qualifiers. Institutions can use the qualifiers for their own cataloging, but anything designed to be transferable has to make sense using only the core elements. The DCMI page Using Dublin Core states that the "element value (minus the qualifier) must continue to be generally correct and useful for discovery" (section 1.2). Campbell & Fast note that:

Such an approach is highly useful to the development of interoperability standards. ... However, while this approach goes a long way towards ensuring smooth and effective delivery of information across the Web, it does not necessarily allow libraries to exploit and contribute to the emerging Semantic Web in a full and exciting way. (p. 383)

Campbell and Fast continue:

If we look at the Semantic Web merely as a cheap mechanism for exchanging metadata between approved metadata providers, we are shutting ourselves out from its potential richness. A Web that boasts a wealth of information, semantically coded and with a global addressing system, could be a source of cataloguing data in and of itself. (p. 386)
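The exchange model described earlier in this section, in which qualified in-house data is reduced to core elements before it is shared, can be sketched as follows. The dotted "element.qualifier" naming and the sample values are assumptions made for illustration; they are not a prescribed Dublin Core serialization.

```python
def dumb_down(record):
    """Reduce qualified elements such as "date.created" to their core
    element ("date"), keeping every value under the core name."""
    simple = {}
    for qualified_name, value in record.items():
        core = qualified_name.split(".", 1)[0]
        simple.setdefault(core, []).append(value)
    return simple


# A detailed in-house description (sample values invented for illustration).
in_house = {
    "title": "Naked Metadata",
    "title.alternative": "Embedding Dublin Core in XHTML",
    "date.created": "2005-10-23",
    "description.abstract": "A short tutorial on embedding metadata.",
}

for_exchange = dumb_down(in_house)
print(for_exchange)
```

Each value still makes sense under its core element, as the DCMI guideline requires, but the distinction between, say, a date of creation and any other date is lost in the exchanged record. That loss is the simplification Campbell and Fast caution against relying on exclusively.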

Conclusion: Challenges for the Future

The semantic web is coming. There is a multitude of developers all over the world working busily to ensure that. The first challenge is to get the tools into place for users of all levels. No one but the geekily inclined is going to be willing to invest the time and effort necessary to code semantically tagged web pages from scratch. The second challenge is getting semantic data into the everyday workflow of librarians. Semantic linkages with web content will enrich current content in ways yet unknowable. It is my intent to explore a small corner of the semantic web and contribute to the creation of linkages with a project to add metadata to pages on the site of cbsr.ucr.edu.

While decentralizing cataloging by pushing facets of it out to the content-creators is not going to actually put catalogers out of business, as so many fear, I do believe it will change how the job is done. Catalogers will have to become metadata experts, not just MARC experts. Knowledge of RDF, XML and a variety of other metadata vocabularies (and the tools using them) will be necessary. Professional cataloging might be more a job of aggregating and improving harvested or contributed metadata, rather than developing new metadata, like MARC records, for resources. Hillmann proposes that metadata evaluation will also be a large part of the future cataloger's job description.

Increasingly ... they may find themselves managing data from multiple sources, aggregating that metadata to serve a particular purpose, often not the one for which the metadata was originally created. Because most metadata available for aggregation, whether within an institution or via harvest, was created in a context most likely rife with assumptions that it would be used narrowly and only in a specific context, any aggregation project automatically involves some confrontation with metadata quality issues. (p. 66)

Cataloging has already entered a transitional phase, with more and more positions being advertised for metadata librarians rather than traditional cataloging librarians. It will be a challenge going forward for librarians and the content-creators to navigate this new re-imagined world. Cataloging is dead. Long live cataloging.

References

About microformats. (n.d.). Retrieved October 4, 2009 from http://microformats.org/about

Abrams, S. L. (2005). Establishing a Global Digital Format Registry. Library Trends, 54 (1), 125-143.

Alexander, K. (2007). eRDF detector. Userscripts.org. Retrieved October 17, 2009 from http://userscripts.org/scripts/show/8260

Angjeli, A., Isaac, A., Cloarec, T., Martin, F., Meji, L. van der, Matthezing, H., et al. (2009). Semantic web and vocabulary interoperability: an experiment with illumination collections. ICBC, 38 (2), 25-29.

ARC: RDF classes for PHP. (2007). Retrieved October 17, 2009 from http://arc.semsol.org/home

Arch-int, N. & Sophatsathit, P. (2003). A semantic information gathering approach for heterogeneous information sources on WWW. Journal of Information Science, 29 (5), 357-374.

Campbell, D. G. & Fast, K. V. (2004). Academic libraries and the semantic web: What the future may hold for research-supporting library catalogues. The Journal of Academic Librarianship, 30 (5), 382-390.

Centers for Disease Control and Prevention. (2005). Public Health Information Network vocabulary metadata standards. Version 1.2. 08/08/2005. Retrieved October 11, 2009 from http://www.cdc.gov/phin/library/documents/pdf/PHIN%20Vocabulary%20Metadata%20V1.2.pdf

Chavarriaga, E. & Macias, J. A. (2009). A model-driven approach to building modern semantic web-based user interfaces. Advances in Engineering Software, 40, 1329-1334.

Coyle, K. (2008). Meaning, Technology, and the Semantic Web. The Journal of Academic Librarianship, 34 (3), 263-4.

Damiani, E. & Fugazza, C. (2007). Toward semantics-aware management of intellectual property rights. Online Information Review, 31 (1), 59-72.

DC-dot: Dublin Core metadata editor. (n.d.). Retrieved October 17, 2009 from http://www.ukoln.ac.uk/cgi-bin/dcdot.pl?n=0&guesspublisher=yes

DCMI/RDA Task Group. (2008). DCMI/RDA Task Group Wiki. Retrieved October 10, 2009 from http://dublincore.org/dcmirdataskgroup/

Dublin Core Metadata Initiative. (2005). Using Dublin Core. Retrieved October 17, 2009 from http://dublincore.org/documents/usageguide/

Dublin Core Metadata Initiative. (2008). The Dublin Core metadata registry: Promoting the discovery and reuse of metadata. Retrieved October 10, 2009 from http://dcmi.kc.tsukuba.ac.jp/dcregistry/

Gateway to Educational Materials information. (2009). Retrieved October 17, 2009 from http://www.thegateway.org/about

Global Digital Format Registry. (n.d.). Retrieved October 17, 2009 from http://www.gdfr.info/

Greenberg, J., Sutton, S. & Campbell, D. G. (2003). Metadata: A fundamental component of the Semantic Web. Bulletin of the American Society for Information Science and Technology, 29 (4), 16-18.

Guzmán Luna, J., Torres Pardo, D. & López García, A. N. (2006). Desarrollo de una ontología en el contexto de la web semántica a partir de un tesauro documental tradicional. Revista Interamericana de Bibliotecología, 29 (2), 79-95.

Harper, C. & Tillett, B. (2007). Library of Congress controlled vocabularies and their application to the semantic web. Cataloging & Classification Quarterly, 43 (3), 47-68.

Heery, R. & Wagner, H. (2002). A metadata registry for the semantic web. D-Lib Magazine, 8 (5). Retrieved September 20, 2009 from http://www.dlib.org/dlib/may02/wagner/05wagner.html

Hildebrand, M., Ossenbruggen, J. van, Hardman, L. & Jacobs, G. (2009). Supporting subject matter annotation using heterogeneous thesauri: A user study in Web data reuse. International Journal of Human-Computer Studies, 67, 887-902.

Hillmann, D. (2008). Metadata Quality: From Evaluation to Augmentation. Cataloging & Classification Quarterly, 46 (1), 65-80.

International Federation of Library Associations and Institutions. (2008). Declaring FRBR entities and relationships in RDF. Retrieved October 10, 2009 from http://www.ifla.org/files/cataloguing/frbrrg/namespace-report.pdf

JISC IE Metadata Schema Registry. (2009). Retrieved October 17, 2009 from http://www.ukoln.ac.uk/projects/iemsr/

Library of Congress. (n.d.). About [authorities & vocabularies]. Retrieved October 17, 2009 from http://id.loc.gov/authorities/about.html

Microformat. (2009). Wikipedia. Retrieved October 4, 2009 from http://en.wikipedia.org/wiki/Microformats

The National Archives. (n.d.). The technical registry: PRONOM. Retrieved October 10, 2009 from http://www.nationalarchives.gov.uk/PRONOM/Default.aspx

National Science Digital Library. (2009). NSDL registry: Supporting metadata interoperability. Retrieved October 10, 2009 from http://metadataregistry.org/

O'Donnell, J. (2006). Naked Metadata. Retrieved October 17, 2009 from http://jod.id.au/tutorial/naked-metadata.html

Operator. (n.d.). Mike's Musings. Retrieved October 17, 2009 from http://www.kaply.com/weblog/operator/

Piggy Bank. (2008). Retrieved October 17, 2009 from http://simile.mit.edu/wiki/Piggy_Bank

Talantikite, H. N., Aissani, D. & Boudjlida, N. (2008). Semantic annotations for web services discovery and comparison. Computer Standards & Interfaces, 31, 1108-1117.

Talis. (2006). RDF in HTML. Retrieved October 12, 2009 from http://research.talis.com/2005/erdf/wiki/Main/RdfInHtml

Tillett, B. (2003). AACR2 and Metadata: Library opportunities in the global semantic web. Cataloging & Classification Quarterly, 36 (3), 101-119.

Tonkin, E. & Strelnikov, A. (2009). Spinning a semantic web for metadata: Developments in the IEMSR. Ariadne, 59. Retrieved October 17, 2009 from http://www.ariadne.ac.uk/issue59/tonkin-strelnikov/

Uddin, M. N. & Janecek, P. (2006). Faceted classification in web information architecture: A framework for using semantic web tools. The Electronic Library, 25 (2), 219-233.

Unified Digital Format Registry (UDFR). (2009). Retrieved October 17, 2009 from http://www.udfr.org/

World Wide Web Consortium. (2004a). RDF primer. Retrieved October 10, 2009 from http://www.w3.org/TR/rdf-primer/

World Wide Web Consortium. (2004b). RDF vocabulary description language 1.0: RDF Schema. Retrieved October 4, 2009 from http://www.w3.org/TR/rdf-schema/

World Wide Web Consortium. (2004c). RDF/XML syntax specification (revised). Retrieved October 17, 2009 from http://www.w3.org/TR/rdf-syntax-grammar/

World Wide Web Consortium. (2007). Gleaning resource descriptions from dialects of languages (GRDDL). Retrieved October 4, 2009 from http://www.w3.org/TR/grddl/

World Wide Web Consortium. (2008a). RDFa primer: Bridging the human and data webs. Retrieved October 4, 2009 from http://www.w3.org/TR/xhtml-rdfa-primer/

World Wide Web Consortium. (2008b). SKOS simple knowledge organization system primer. Retrieved October 10, 2009 from http://www.w3.org/TR/skos-primer/

World Wide Web Consortium. (2008c). SPARQL query language for RDF. Retrieved October 10, 2009 from http://www.w3.org/TR/rdf-sparql-query/

World Wide Web Consortium. (2009a). Protocol for web description resources (POWDER): W3C working group note 1 September 2009: Primer. Retrieved October 4, 2009 from http://www.w3.org/TR/powder-primer/

World Wide Web Consortium. (2009b). Semantic web: W3C semantic web frequently asked questions. Retrieved October 4, 2009 from http://www.w3.org/RDF/FAQ

World Wide Web Consortium. (2009c). Semantic web: Web ontology language (OWL). Retrieved October 4, 2009 from http://www.w3.org/2004/OWL/

Yee, M. (2009). Can Bibliographic Data be Put Directly onto the Semantic Web? Information Technology and Libraries, 28 (2), 55-80.

Additional Resources

Adams, K. (2002). The semantic web: Differentiating between taxonomies and ontologies. Online, 26 (4), 20-23.

Berners-Lee, T. (2009, May). Linked Data [video file]. Presentation at TED2009: "The Great Unveiling" Conference, Long Beach, California, USA. Retrieved October 4, 2009 from http://www.ted.com/index.php/talks/tim_berners_lee_on_the_next_web.html

Bradley, F. (2009). Discovering linked data. Library Journal, 134 (7), 48-50.

Chudnov, D. (2009). The geography of linked data and ready reference. Computers in Libraries, 29 (4), 26-28.

Coombs, K. (2009). Microformats: Context inline. Library Journal, 134 (7), 64.

Coyle, K. (2009). Making connections. Library Journal, 134 (7), 44-47.

DOAP: Description of a Project. (n.d.). Retrieved October 17, 2009 from http://trac.usefulinc.com/doap

Dublin Core Metadata Initiative. (2009). DCMI metadata terms. Retrieved October 10, 2009 from http://www.dublincore.org/documents/dcmi-terms/

Dumbill, E. (n.d.). DOAP. Retrieved October 10, 2009 from http://trac.usefulinc.com/doap

EAD: Encoded Archival Description. (2009). Retrieved October 10, 2009 from http://www.loc.gov/ead/

Embedded RDF. (2009). Wikipedia. Retrieved October 12, 2009 from http://en.wikipedia.org/wiki/Embedded_RDF

Feigenbaum, L. (2009, May). The 2009 semantic web landscape [slideshow]. Presentation at the PRISM Forum SIG Meeting, Luzern, Switzerland. Retrieved October 4, 2009 from http://www.slideshare.net/LeeFeigenbaum/semantic-web-landscape-2009

Fichter, D. & Wisniewski, J. (2008). Microformats and the search for meaning. Online, 32 (4), 55-57.

The Friend of a Friend (FOAF) project. (n.d.). Retrieved October 10, 2009 from http://www.foaf-project.org/

Gradmann, S. (2005). Rdfs:frbr - Towards an implementation model for library catalogs using semantic web technology. Cataloging & Classification Quarterly, 39 (3), 63-75.

Herman, I. (2009a, June). Introduction to the Semantic Web (tutorial) [slideshow]. Presentation at the 2009 Semantic Technology Conference, San Jose, CA, USA. Retrieved October 4, 2009 from http://www.w3.org/2009/Talks/0615-SanJose-tutorial-IH/Slides.pdf

Herman, I. (2009b, June). What is new in W3C land? [slideshow]. Presentation at the 2009 Semantic Technology Conference, San Jose, CA, USA. Retrieved October 4, 2009 from http://www.w3.org/2009/Talks/0615-SanJose-talk-IH/Slides.pdf

Hodge, G. (2000). Systems of knowledge organization for digital libraries: Beyond traditional authority files. Council on Library and Information Resources. Retrieved October 4, 2009 from http://www.clir.org/pubs/reports/pub91/contents.html

Kolbitsch, J. & Krottmaier, H. (2006). The Use of HTML-Encoded Dublin Core in Academic and Educational Settings. Retrieved October 17, 2009 from http://www.kolbitsch.org/research/papers/2006-Dublin_Core_Analysis.pdf

Krishnamurthy, M. (2006). Library portals: Ontological representation of knowledge on the web. Information Studies, 12 (2), 75-84.

Miller, R. (2008). Content delivery rides the semantic web. EContent, 31 (8), 26-30.

Msporny. (2008). RDFa basics [Video file]. Retrieved October 4, 2009 from http://www.youtube.com/watch?v=ldl0m-5zLz4&NR=1

OAIster... find the pearls. (2009). Retrieved October 17, 2009 from http://www.oaister.org/

RDFa. (2009). Wikipedia. Retrieved October 10, 2009 from http://en.wikipedia.org/wiki/Rdfa

Rogers, G. P. (2007). Roles for semantic technologies and tools in libraries. Cataloging & Classification Quarterly, 43 (3), 105-125.

Smith, G. (2009). Web 3.0: 'Vague but exciting'. AdweekMedia, 50 (24), 19.

Social network service. (2009). Wikipedia. Retrieved October 17, 2009 from http://en.wikipedia.org/wiki/Social_network_service

Sutton, S. (2008). Metadata quality, utility and the semantic web: The case of learning resources and achievement standards. Cataloging & Classification Quarterly, 46 (1), 81-107.

TEI: Text Encoding Initiative. (2009). Retrieved October 10, 2009 from http://www.tei-c.org/index.xml

Wiki. (2009). Wikipedia. Retrieved October 17, 2009 from http://en.wikipedia.org/wiki/Wiki

World Wide Web Consortium. (2005). Quick guide to publishing a thesaurus on the semantic web. Retrieved October 4, 2009 from http://www.w3.org/TR/2005/WD-swbp-thesaurus-pubguide-20050517/

World Wide Web Consortium. (2009a). Protocol for web description resources (POWDER) working group. Retrieved October 4, 2009 from http://www.w3.org/2007/powder/

World Wide Web Consortium. (2009b). Semantic web: W3C semantic web activity. Retrieved October 4, 2009 from http://www.w3.org/2001/sw/

World Wide Web Consortium. (2009c). Semantic web: W3C semantic web activity publications. Retrieved October 4, 2009 from http://www.w3.org/2001/sw/Specs.html
