Using XQuery for Metadata Transformations

The Vanderbilt Television News Archive, at the Jean and Alexander Heard Libraries, includes over 80,000 news programs from 1968 to the present, described in 1.1 million news story records. The existing relational database is bulky, messy, and unsustainable for long-term preservation. The goal of this project was to migrate the records, first written in multiple home-grown metadata schemas, into a single metadata schema: PBCore, an audiovisual metadata standard originating with public broadcasters. Our team used the XQuery processor BaseX and developed a script that parsed the records into groupings of episodic records. We chose the XQuery programming language, despite our unfamiliarity with the technology, to support student and faculty needs for XQuery. While the XQuery language was more difficult for us than Python, JavaScript, or an XSLT transform, we feel this project improved our team’s ability to provide instructional assistance with a language used in digital humanities and other disciplines. By migrating to a standard metadata schema in XML, we are already reaping the rewards: we are sharing sample XML files with vendors to demo and test new content management systems.

Presenter(s): Sarah Swanz, Librarian for Digital Media & Publishing, Vanderbilt University; Jim Duran, Director of the Vanderbilt Television News Archive, Vanderbilt University; Nathan Jones, Manager Digital Imaging Laboratory, Archivist, Vanderbilt University


3:20 PM