PNLA Quarterly home

Metadata in the Music File World

MARIBEL ALVAREZ

 

Maribel Alvarez works for the Los Angeles County Libraries and is a Master's student in the School of Library and Information Science at San Jose State University. She can be reached at: nerdylibrary@yahoo.com

Introduction

Music has been recorded in many different formats. Much has changed since the first musical sound was recorded, and especially during the last fifty years. We have come a long way from the vinyl record to the modern iPod (see figure 1 below).

Figure 1. Music evolution (°?!?oji°, 2008)

The creation of new musical formats has made it more difficult to represent music for searching and retrieval. This article surveys recent research, activity, and issues in the field of music metadata.

Research, Activity, and Issues

The Music of Social Change (MOSC) project uses the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to link the many metadata standards used across museums, libraries, and archives. The MOSC project allows access to subject-based virtual collections; one can browse by subjects such as the "civil rights struggle" (Roel, 2005). MOSC is an attempt to bridge the differences among organizations such as museums, archives, and libraries.

The greatest challenge of the MOSC is to find a way to uniformly represent all the records by means of OAI-PMH. One problem is the minimal detail included in library cataloging records, compared with the greater detail that is the practice in museum cataloging. Too much information can be a problem, but lack of detail is far worse when attempting to aggregate records using OAI-PMH and create a common interface. The OAI-PMH can harvest metadata that uses any form of Extensible Markup Language (XML). This includes the use of schemes such as: "Dublin Core, Encoded Archival Description (EAD), the eprints schema, RSLP collection description schema, UDDI/WSDL, MARC21, and the branding schema" (Roel, 2005). Dublin Core is the minimum standard required. MOSC is basic enough for any collaborator with minimal cataloging experience to understand, add, and use OAI-PMH records (see figure 2 below).

Figure 2 (Roel, 2005)
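To make the harvesting step concrete, the following is a minimal sketch in Python of how a harvester might read Dublin Core fields out of an OAI-PMH ListRecords response and then browse by subject. The XML namespaces are those of OAI-PMH 2.0 and Dublin Core; the sample record, its identifier, and the function names are invented for illustration and are not taken from the MOSC project.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH 2.0 responses that carry unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical response: the identifier and field values are made up.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:mosc-0001</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example freedom song recording</dc:title>
          <dc:subject>civil rights struggle</dc:subject>
          <dc:type>Sound</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def parse_records(xml_text):
    """Return one dict per harvested record: identifier plus Dublin Core fields."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        fields = {}
        # Every child of <oai_dc:dc> is a Dublin Core element; elements may repeat.
        for element in record.findall("oai:metadata/oai_dc:dc/*", NS):
            name = element.tag.split("}", 1)[1]  # drop the "{namespace}" prefix
            fields.setdefault(name, []).append((element.text or "").strip())
        records.append({"identifier": identifier, "fields": fields})
    return records


def by_subject(records, subject):
    """Select the records whose dc:subject values include the given subject."""
    return [r for r in records if subject in r["fields"].get("subject", [])]


records = parse_records(SAMPLE_RESPONSE)
matches = by_subject(records, "civil rights struggle")
```

Because every contributor supplies at least Dublin Core, a single parser of this kind could serve records from a museum, an archive, or a library, which is the point of using it as the minimum standard.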

MOSC was funded by the Institute of Museum and Library Services (IMLS). The MetaScholar Initiative of Emory University Libraries, the Center for the Study of Southern Culture, the Atlanta History Center, and the Georgia Music Hall of Fame all took part in the project, whose goal was to allow the user to have access to resources on music and musicians linked to social change.

A well-known and user-friendly database in use today is Playlist.com, formerly known as Projectplaylist.com. Playlist.com is used to share and listen to music through social networks. Playlist.com "is an information location tool similar to Google® and Yahoo!® but devoted entirely to the world of music" (Playlist.com, 2009). Rather than hosting music files, it uses "web crawlers" to bring together links to music that is already available on the Web (Lynch, 1997). Playlist.com looks for websites that contain music files and provides a URL for each. The hyperlinks provided by Playlist.com can then be added to a user "playlist" for publishing on a social networking webpage.
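The link-gathering idea can be sketched in a few lines of Python. This is only an illustration of how a crawler might pick out links to music files from a fetched page, not a description of how Playlist.com actually works; the class name, the sample page, the example URLs, and the choice of file extensions are all assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed set of audio file extensions a crawler might look for.
MUSIC_EXTENSIONS = (".mp3", ".wma")


class MusicLinkParser(HTMLParser):
    """Collect absolute URLs of <a href> links that point at music files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the address of the page being read.
        absolute = urljoin(self.base_url, href)
        # Test only the path, so query strings do not hide the extension.
        if urlparse(absolute).path.lower().endswith(MUSIC_EXTENSIONS):
            self.links.append(absolute)


def find_music_links(html_text, base_url):
    parser = MusicLinkParser(base_url)
    parser.feed(html_text)
    return parser.links


# Hypothetical page content; no network access is needed for this sketch.
SAMPLE_PAGE = """
<html><body>
  <a href="/audio/song-one.mp3">Song one</a>
  <a href="about.html">About</a>
  <a href="http://media.example.org/song-two.WMA?dl=1">Song two</a>
</body></html>
"""

links = find_music_links(SAMPLE_PAGE, "http://www.example.com/music/index.html")
```

The sketch stores only URLs, never the audio itself, which mirrors the distinction between linking and hosting.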

Playlist.com is improving its interface for ease of use while adhering to copyright regulations. In today's environment the latter has become a challenge. Playlist.com states that "our mission ... is to organize this rapidly growing abundance of legal music on the web for the benefit of the worldwide music community - artists, songwriters, music distributors, and listeners alike" (Playlist.com, 2009). Playlist.com works closely with record companies to expand the amount of music available in its database. Sony BMG has partnered with Playlist.com, making all its music available on Playlist.com.

Representation of music files carries with it concerns about the legality of sharing them. In the wake of the Napster case, databases like Playlist.com are careful in their treatment of the links they provide. Playlist.com was nevertheless recently accused of copyright violation. As a result, the social networking site Myspace.com removed all playlists created in Playlist.com from user profiles and sent an email to users explaining the issue. Playlist.com states that it only provides links and is not responsible for illegal hosting of music files (Playlist.com, 2009).

This raises the question of who is responsible for hosted files: the host that provides the free link, or the user who uploads the files to the service? Free file hosts such as FileFactory.com allow unlimited file hosting. Anyone can upload a file, including music files in mp3 and wma format, regardless of legal ownership of the file (FileFactory, 2009).

This battle over sharing and storing digital music is not new, and its challenges change as new technology emerges. Ogbuji (2002) states that digital music has been controversial since it began in the mid-1980s, adding that, as a result of the growing amount of digital music, the Compact Disc Database (CDDB) emerged in the early 1990s. The purpose of this database was to match the attributes of a CD to the corresponding record in the database. Later, Gracenote restricted CDDB and, as a result, MusicBrainz.org, a free open access database, emerged (Ogbuji, 2002).

MusicBrainz is a metadatabase containing track information, and also an "open music encyclopedia" (Ogbuji, 2002). When a computer recognizes a compact disc, many music players obtain its track information by searching MusicBrainz.org or a similar metadatabase. MusicBrainz uses Resource Description Framework (RDF) (Herman, Swick, & Brickley, 2004), which allows for a single Uniform Resource Identifier (URI) per track. MusicBrainz.org also applies Dublin Core (DC) metadata to some tracks. Figure 3 shows an example from Ogbuji (2002).

Figure 3 (Ogbuji, 2002)
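As a rough illustration of what one URI per track means in RDF, the Python sketch below reads a small RDF/XML description and lists its statements as subject, predicate, object triples. Only the RDF and Dublin Core namespaces are standard; the track URI and field values are hypothetical, and the sample does not reproduce the actual MusicBrainz vocabulary.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NS = {"rdf": RDF_NS, "dc": "http://purl.org/dc/elements/1.1/"}

# Hypothetical description of one track, identified by a single URI.
SAMPLE_RDF = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/track/0001">
    <dc:title>Example Track</dc:title>
    <dc:creator>Example Artist</dc:creator>
  </rdf:Description>
</rdf:RDF>
""".strip()


def triples(rdf_xml):
    """Return (subject URI, predicate URI, literal value) for each property."""
    root = ET.fromstring(rdf_xml)
    result = []
    for description in root.findall("rdf:Description", NS):
        # rdf:about carries the URI that identifies the resource being described.
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes tags as "{namespace}local"; joining the two
            # parts gives the full predicate URI.
            namespace, local = prop.tag[1:].split("}", 1)
            result.append((subject, namespace + local, (prop.text or "").strip()))
    return result


track_triples = triples(SAMPLE_RDF)
```

Every statement about the track hangs off the same subject URI, so descriptions from different sources can be merged as long as they agree on that identifier.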

The following record (figure 4) was retrieved in a recent MusicBrainz.org search for "Artist: The Beatles" and "track: let it be" (MusicBrainz, 2008). Ogbuji (2002) states that an RDF schema was used, as figure 3 shows.

 

MusicBrainz.org's current metadata schema appears to be more user-friendly than the RDF schema (figure 3). The current version, shown in figure 4, is an XML web service. MusicBrainz.org uses traditional schema classes such as artist, release, track, and label, and assigns a set of attributes to each class. In addition, MusicBrainz.org uses "Advanced Relationships Documentation", which includes miscellaneous relationships between artists, releases, and tracks (Murdos, 2009).

As in the RDF-based web service, MusicBrainz.org continues to assign a unique ID to each artist, release, track, and label. The type, status, and language of a release can be found under the release class, and the playtime attribute can be found in the track class. Label name, sort name, code, country, and founding and dissolving dates can all be found in the label class (OTWAON23-1279656530, 2008).
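The four classes and the attributes listed above can be pictured as a simple data model. The Python sketch below is an illustration under stated assumptions rather than the MusicBrainz schema itself: the field names, the title and name fields, the nesting of tracks inside releases, the unit of the playtime value, and every ID except the track ID (taken from the MusicBrainz URL in the references) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Track:
    id: str                             # unique ID
    title: str                          # assumed field
    playtime_ms: Optional[int] = None   # playtime; milliseconds is an assumption


@dataclass
class Release:
    id: str
    title: str                          # assumed field
    release_type: Optional[str] = None
    status: Optional[str] = None
    language: Optional[str] = None
    tracks: List[Track] = field(default_factory=list)


@dataclass
class Artist:
    id: str
    name: str                           # assumed field
    releases: List[Release] = field(default_factory=list)


@dataclass
class Label:
    id: str
    name: str
    sort_name: Optional[str] = None
    code: Optional[str] = None
    country: Optional[str] = None
    founded: Optional[str] = None       # founding date
    dissolved: Optional[str] = None     # dissolving date


def total_playtime_ms(release):
    """Sum the playtime of all tracks; unknown playtimes count as zero."""
    return sum(track.playtime_ms or 0 for track in release.tracks)


# Illustrative values only; the playtime and the release and artist IDs are made up.
let_it_be = Track(
    id="3aa2166f-99ca-403a-a995-0b17181ba65e",
    title="Let It Be",
    playtime_ms=240000,
)
untimed = Track(id="hypothetical-track-id", title="Untimed example track")
release = Release(
    id="hypothetical-release-id",
    title="Example release",
    release_type="Album",
    status="Official",
    language="eng",
    tracks=[let_it_be, untimed],
)
artist = Artist(id="hypothetical-artist-id", name="The Beatles", releases=[release])
```

Keeping a unique ID on every class is what lets a relationship between an artist, a release, and a track be expressed as a link between records rather than as repeated text.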

A number of other projects are working on the unification of music metadata. Markup languages used specifically for music metadata projects include HyTime (Hypermedia/Time-based Structuring Language), SMDL (Standard Music Description Language), NIFF (Notational Interchange File Format), MNML (Music Notation Markup Language), and SMF Lyric Meta Event Definition (Childress, 2000). These projects attempt to identify key elements in music, such as the frequency, pitch, timing, and duration of the music file, in its corresponding record (Steyn, 2000). This relationship can be seen in figure 5.

Miller (2007) discusses the problems of encoding music using systems that are based on the attributes of texts. He argues that multimedia is not text and proposes that search engines "look" at videos and "listen" to music in order to retrieve them. Current routines for retrieving multimedia files consist of reading tags and any text, metadata, and files surrounding the target. Some companies use speech recognition technologies while others use waveforms in audio files to identify the media file (Miller, 2007). Companies such as TVEyes and Nexidia are taking a phonetic approach.
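If "reading tags" is taken to include the descriptive tags embedded in audio files, the text-based approach can be sketched as follows. The example assumes the ID3v1 tag format, which the article does not name: a fixed 128-byte block at the end of an MP3 file that begins with the marker "TAG". The function name and the sample bytes are invented for illustration.

```python
import struct

# ID3v1 layout: "TAG" marker, title, artist, album, year, comment, genre byte.
ID3V1_LAYOUT = struct.Struct("<3s30s30s30s4s30sB")  # 128 bytes in total


def read_id3v1(data):
    """Return the ID3v1 fields found at the end of the given bytes, or None."""
    if len(data) < ID3V1_LAYOUT.size:
        return None
    marker, title, artist, album, year, comment, genre = ID3V1_LAYOUT.unpack(
        data[-ID3V1_LAYOUT.size:]
    )
    if marker != b"TAG":
        return None

    def clean(raw):
        # Fields are padded with null bytes; keep only the text before them.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": clean(title),
        "artist": clean(artist),
        "album": clean(album),
        "year": clean(year),
        "comment": clean(comment),
        "genre": genre,
    }


# Stand-in for a real file: 64 placeholder bytes followed by a tag block.
SAMPLE_FILE = b"\x00" * 64 + ID3V1_LAYOUT.pack(
    b"TAG", b"Let It Be", b"The Beatles", b"", b"", b"", 255
)

tag = read_id3v1(SAMPLE_FILE)
```

A search built this way can only find what someone typed into the tag, which is the limitation Miller points to: nothing in the routine examines the sound itself.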

Another project is ezSEO from EveryZing, Inc. This service converts multimedia into text; the company states that "EveryZing's patented technology wraps every piece of audio and video from your site in a rich layer of metadata, including a full text output of the spoken word track" (EveryZing, Inc., 2009). Although it is now possible to make computers "see" videos and "listen" to music, these capabilities are not part of major search engines. Implementation of these innovations is limited to commercial use at this point.

Conclusion

Digital music presents challenges in information retrieval and metadata. The continuous format changes of the last few decades are likely to continue. The future may bring full digitization, fully developed music metadata schemes, new and improved markup languages, fully functional speech recognition search engines, and many metadata advances not yet imagined. Discoverability, searchability, and the relationship of multimedia and text files are areas for future research.

References

°?!?oji°. (2008, May 16). Music Evolution. Retrieved March 1, 2009, from Flickr: http://www.flickr.com/photos/25239886@N03/2497230844/

Childress, E. (2000, February 25). Metadata 101. Retrieved March 2, 2009, from Music Library Association: www.musiclibraryassoc.org/BCC/MLA-MetadataEC.ppt

EveryZing, Inc. (2009). ezSEO. Retrieved March 2, 2009, from EveryZing: http://www.everyzing.com/solutions/video-seo

FileFactory. (2009, March 1). Home. Retrieved March 2, 2009, from FileFactory: http://www.filefactory.com/

Herman, I., Swick, R., & Brickley, D. (2004). Resource Description Framework (RDF). Retrieved March 2, 2009, from World Wide Web Consortium: http://www.w3.org/RDF/

Lynch, C. (1997, March). Searching the Internet. Scientific American, pp. 52-56.

Miller, R. (2007, June). Multimedia search matures...but not without growing pains. EContent, pp. 32-37.

Murdos. (2009, January 28). Advanced Relationships. Retrieved March 2, 2009, from MusicBrainz: http://wiki.musicbrainz.org/AdvancedRelationships

MusicBrainz. (2008, November 25). Let It Be. Retrieved March 2, 2009, from MusicBrainz.org: http://musicbrainz.org/track/3aa2166f-99ca-403a-a995-0b17181ba65e.html

Ogbuji, U. (2002, December 1). Thinking XML: Manage metadata with MusicBrainz. Retrieved March 1, 2009, from IBM.com: http://www.ibm.com/developerworks/xml/library/x-think14.html

OTWAON23-1279656530. (2008, December 29). XMLWebService. Retrieved March 2, 2009, from MusicBrainz Wiki: http://wiki.musicbrainz.org/XMLWebService

Playlist.com. (2009, February). About Playlist.com. Retrieved March 1, 2009, from Playlist.com: http://www.playlist.com/about

Roel, E. (2005). MOSC Project: Using the OAI-PMH to bridge metadata cultural differences across museums, archives, and libraries. Information Technology and Libraries, 22-24.

Steyn, J. (2000). Music Markup Language. Retrieved March 2, 2009, from Music Markup Language: http://www.musicmarkup.info/
