Finding aid aggregation

From LSTA Wiki

Revision as of 16:20, 29 August 2008

Overview of Aggregation Process

EAD Central Index Ingestion

EAD files for all partners will be hosted as part of the Mountain West Digital Library (MWDL) system, and all of them will be discoverable in the MWDL central index. The workflow for this process is illustrated in the accompanying diagram and discussed below. LSTA partners follow the workflow on the right side of the diagram; NEH-grant-funded partners in the Western Waters Digital Library follow the workflow on the left.

For more information about the process described below, please contact Sandra McIntyre (sandra.mcintyre@utah.edu), Nathan Pugh (nathan.pugh@utah.edu), or Debbie Rakhsha (debbie.rakhsha@utah.edu) at the University of Utah Marriott Library. Additional information for the development team is available in a password-protected workspace on the Marriott Library's SharePoint site.

Uploading Your EAD Files to Your Institution's Repository

EAD files are to be uploaded to the CONTENTdm digital asset management system server that hosts your collections as part of the Mountain West Digital Library network. This involves two steps: (1) extracting the values of certain EAD elements and mapping them to CONTENTdm fields; and (2) using CONTENTdm's Acquisition Station to upload the EAD file and metadata to the Mountain West Digital Library hub server. Both steps can be run on multiple EAD files at a time for batch processing.

1. Extracting the EAD Elements

An extraction script has been created to automate the first step. The 35 EAD elements chosen for extraction, the local CONTENTdm fields they correspond to, and the Dublin Core fields they correspond to are given in the EAD-CONTENTdm-Dublin Core Elements Assignments (Mapping Table).

The extraction script was written in VBScript by Nathan Pugh at the University of Utah, based on a VBScript created by Terry Reese at Oregon State University. When the script is double-clicked, it acts on every file in its own folder that has the extension ".xml". It goes through each file automatically, extracts the values of certain EAD elements, and saves them as CONTENTdm fields in a tab-delimited text file. This tab-delimited file can then be used to upload the metadata to the CONTENTdm collection of your EAD files.
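The VBScript itself is not reproduced on this page. As a rough illustration of the same idea, the following Python sketch walks every ".xml" file in a folder, pulls a few EAD elements, and writes one tab-delimited row per file. The four field names and element paths in FIELD_MAP, the function names, and the output file name are assumptions made for this example; the project's actual 35-element mapping is in the Mapping Table.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical mapping: CONTENTdm field name -> path of EAD element names.
# Illustrative only; the real mapping table covers 35 elements.
FIELD_MAP = {
    "Title": ["archdesc", "did", "unittitle"],
    "Date": ["archdesc", "did", "unitdate"],
    "Creator": ["archdesc", "did", "origination"],
    "Identifier": ["eadheader", "eadid"],
}


def local_name(tag):
    """Strip any XML namespace, so '{urn:x}did' becomes 'did'."""
    return tag.rsplit("}", 1)[-1]


def find_by_path(element, path):
    """Follow local element names downward; return the match or None."""
    for name in path:
        element = next((c for c in element if local_name(c.tag) == name), None)
        if element is None:
            return None
    return element


def extract_fields(xml_path):
    """Return {field: text} for one EAD file; missing elements give ''."""
    root = ET.parse(xml_path).getroot()
    row = {}
    for field, path in FIELD_MAP.items():
        el = find_by_path(root, path)
        # Join nested text and collapse runs of whitespace.
        row[field] = " ".join("".join(el.itertext()).split()) if el is not None else ""
    return row


def extract_folder(folder, out_name="ead_metadata.txt"):
    """Write one tab-delimited row per .xml file in folder; return the output path."""
    folder = Path(folder)
    out_path = folder / out_name
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=["File", *FIELD_MAP], delimiter="\t")
        writer.writeheader()
        for xml_path in sorted(folder.glob("*.xml")):
            writer.writerow({"File": xml_path.name, **extract_fields(xml_path)})
    return out_path
```

Calling extract_folder(".") from the folder that holds the EAD files mirrors the double-click behavior described above. Matching on local names lets the sketch handle EAD files with or without an XML namespace.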


2. Uploading EAD Files to CONTENTdm Hub

The second step uses CONTENTdm's Acquisition Station to upload each EAD file and its metadata to the Mountain West Digital Library hub server.

Display of Individual EAD Files in CONTENTdm

The display of EAD files will be within CONTENTdm's item viewer. As with all CONTENTdm collections, the CONTENTdm item viewer displays a header and footer of the partner's choice, typically with the partner's logo and other branding related to the EAD collection.

Nathan Pugh has modified the CONTENTdm item viewer to bypass the usual display of metadata and go directly to the display of the EAD file itself. The display uses an XSL transformation (XSLT), in which an XSL stylesheet (template) transforms the XML in the EAD file into XHTML for viewing in a browser. For the initial demonstrations, we used one of the stylesheet combinations given by the EAD 2002 Cookbook site. For production, Nathan has created a new stylesheet for the specific needs of the partners in this project. This stylesheet transforms the elements recommended by the Stylesheet Subcommittee convened by Dan Davis. A separate default stylesheet will be released to transform the container lists. Partners may wish to modify the default styling of the container list to reflect their own organization of the collection.
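The project stylesheet is not reproduced on this page. The sketch below is not XSLT; it uses Python's standard library to illustrate the same kind of mapping from EAD elements to XHTML that a stylesheet performs. The choice of elements (unittitle, unitdate, abstract), the page layout, and the function names are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any XML namespace, so '{urn:x}did' becomes 'did'."""
    return tag.rsplit("}", 1)[-1]


def text_of(element):
    """Return the element's nested text with whitespace collapsed."""
    return " ".join("".join(element.itertext()).split())


def first(root, name):
    """Return the first element in document order with this local name, or None."""
    return next((e for e in root.iter() if local_name(e.tag) == name), None)


def ead_to_xhtml(ead_xml):
    """Build a minimal XHTML page from an EAD document given as a string."""
    root = ET.fromstring(ead_xml)
    html = ET.Element("html", xmlns="http://www.w3.org/1999/xhtml")
    head = ET.SubElement(html, "head")
    body = ET.SubElement(html, "body")

    # Use the first unittitle as both the page title and the main heading.
    title = first(root, "unittitle")
    title_text = text_of(title) if title is not None else "Untitled finding aid"
    ET.SubElement(head, "title").text = title_text
    ET.SubElement(body, "h1").text = title_text

    # Emit one labeled paragraph per element that is present.
    for name, label in (("unitdate", "Dates"), ("abstract", "Abstract")):
        el = first(root, name)
        if el is not None:
            p = ET.SubElement(body, "p")
            strong = ET.SubElement(p, "strong")
            strong.text = label + ": "
            strong.tail = text_of(el)
    return ET.tostring(html, encoding="unicode")
```

A real XSLT stylesheet expresses the same element-to-markup rules declaratively as templates, which is what lets partners restyle the container list without touching the viewer code.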

Searching and Browsing

Institutional Search and Browse: Each CONTENTdm-based partner will be able to use standard CONTENTdm features to search and browse its own EAD files. In addition, partners may create special search and browse pages using CONTENTdm's Custom Query functions.

Central Search and Browse: The metadata from all uploaded EAD files will be harvested periodically and aggregated into the Mountain West Digital Library at http://mwdl.org. Search and browse pages within MWDL's interface will allow users to discover finding aids from all partners, or from any selected subset of the partners. Sandra McIntyre and Nathan Pugh will be creating interface mockups for both searching and browsing for consideration by the LSTA partners.