TYPE OF PROPOSAL: Session
TITLE: Progress of the Supporting Digital Scholarship Project
KEYWORDS: digital library, collections, electronic publishing

TITLE: Overview of Supporting Digital Scholarship Project
AUTHOR: Thornton Staples
AFFILIATION: University of Virginia
E-MAIL: tls@virginia.edu

In 1992 the University of Virginia established the Institute for Advanced Technology in the Humanities to provide support and computing resources for long-term humanities research proposed by faculty at the University of Virginia and elsewhere. In 1998 a Digital Library Research and Development Group was created within the university library to conduct long-range planning of digital library architectures, systems, and procedures. These two groups received funding in 2000 from the Andrew W. Mellon Foundation for the Supporting Digital Scholarship Project (SDS) to address issues of digital resources in scholarly research, focusing particularly on the analysis, reprocessing, and creation of digital primary resources. During this session the panelists will give an overview of the Supporting Digital Scholarship Project and will discuss progress after one year of work in three areas: collecting existing digital resources created with proprietary software, dynamic architectures for digital scholarship, and collecting existing HTML web sites of scholarly research for long-term use and preservation.

TITLE: The Salisbury Project: Collecting an Existing Digital Resource
KEYWORDS: XML, collection, preservation
AUTHOR: Kirk Hastings
AFFILIATION: University of Virginia - Institute for Advanced Technology in the Humanities
E-MAIL: kvh2n@virginia.edu

The Salisbury Project is a large online archive of images designed to supplement existing scholarship published on the cathedral and town of Salisbury. It consists of a highly structured, navigable hierarchy of over 500 images, associated descriptions, teaching materials, bibliographies, and maps.
Marked up in SGML using the Encoded Archival Description DTD, the archive has made every effort to employ the standards available and has reached a certain level of platform and display independence. However, in order to present the materials to the user in the dynamic and interactive fashion the archive requires, certain concessions had to be made. The lack of appropriate standards and supporting software at the time of its creation made it necessary to rely on proprietary programs and formats for its look and feel. This obviously presents problems for the collection and preservation of such a complex object in the long term. The challenge for SDS was developing an approach that made it possible to recreate the Salisbury Archive using methods and tools that could be reused for subsequent projects. Therefore in this, the first of our collecting efforts, we decided to take a two-step approach: re-encoding the document in the GDMS (General Descriptive Modeling Scheme) XML DTD and then using the newly developed XSLT standard to emulate the dynamic presentation software used by the original project. The combination of XSLT stylesheets with a transformation engine makes it possible to model complex dynamic behavior, but unfortunately it does not provide the contextual searching essential to markup-based projects. Fortunately, another member of the IATH staff was working on just this problem.

TITLE: Planning Obsolescence: Dynamic Architectures for Digital Scholarship
KEYWORDS: search engines, design methodology, digital scholarship
AUTHOR: Stephen J. Ramsay
AFFILIATION: University of Virginia, Institute for Advanced Technology in the Humanities
E-MAIL: sjr3a@virginia.edu

Digital scholars have discovered the virtues of pastiche; scholarly projects of the sort developed at the Institute for Advanced Technology in the Humanities consist not merely of individually authored textual data and associated media, but of large supporting archives, fragments of other projects, and links to outside materials. Given this shift in the paradigm of authorship and scholarly production, one might assume that consumers of scholarly materials will likewise come to view the task of research itself as bricolage--the act of assembling and recombining digital media into ever more specific forms. This paper seeks to theorize information retrieval in light of this fundamental shift, and to present some preliminary research undertaken as part of the Supporting Digital Scholarship project at the University of Virginia. The Granby Suite, an XML-based search and retrieval system for full-text documents developed by the Virginia Center for Digital History and the Institute for Advanced Technology in the Humanities, employs an extremely modular architecture with strict separation between tiers devoted to search, document assembly, and rendering. In this paper, I argue that such design principles are not merely a matter of good software engineering, but are in fact a consequence of the new paradigms of digital scholarship. The digital library of the future should allow users to re-present materials from diverse media spread across a number of independent projects. Such re-presentations will require that the underlying mechanics of the search and delivery system allow for constant remodification.
As such, the Granby Suite demonstrates that the technical exigencies of implementation and the functional requirements of humanities scholars must give rise to design methodologies flexible enough to allow for constant and sometimes sudden changes in the very architecture upon which the system depends.

TITLE: Collecting Existing HTML Web Sites
KEYWORDS: HTML, Collections, Archive
AUTHOR: Robert Cordaro
AFFILIATION: University of Virginia, Digital Library Research and Development Group
E-MAIL: cordaro@virginia.edu

An area of future concern for universities and academic libraries will be the long-term support of existing research stored and presented as HTML web sites. There has been a recent proliferation of HTML as the final format of scholarly research projects and theses. The long-term viability of such resources is in question if they remain free-standing islands of information, particularly if the originating researcher is no longer actively maintaining the site. Changes in web server and browser software, problems with server hardware, and policy changes in institutions may cause the material to become inaccessible. One alternative is to "collect" the HTML web pages and move them into a library environment, possibly transforming the storage or access format. Moving them into a centrally supported digital repository or archive, and possibly transforming the HTML into some other format, should extend the useful life of scholarly work that might otherwise be lost. We will discuss the pros and cons of such a process and the problems that arise in the effort. We will look at issues involved in deciding whether a site is suitable for collection, methods of limiting the scope of the site, problems involved in moving the contents and developing software to transfer and possibly transform the format, issues in trying to preserve the presentation look and feel, and possible format options for storage.
By way of example we will discuss the successes and failures encountered while developing a software tool as part of the Supporting Digital Scholarship project.
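
As a rough illustration of what "limiting the scope of the site" might involve, the following is a minimal Python sketch; it is not the SDS tool, whose design is not described here. It assumes that a site's boundary can be approximated by a URL prefix, and the names `LinkCollector` and `in_scope_links`, as well as any URLs passed to them, are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def in_scope_links(page_url, html_text, site_root):
    """Return (in_scope, out_of_scope) lists of absolute URLs for one page.

    Assumption for this sketch: a URL belongs to the site when it starts
    with site_root. A real collection policy would likely need more rules.
    """
    parser = LinkCollector()
    parser.feed(html_text)
    in_scope, out_of_scope = [], []
    for raw in parser.links:
        # Resolve relative links against the page and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(page_url, raw))
        target = in_scope if absolute.startswith(site_root) else out_of_scope
        if absolute not in target:
            target.append(absolute)
    return in_scope, out_of_scope
```

Under these assumptions, a collecting tool could call `in_scope_links` on each fetched page, queue the in-scope URLs for copying into the repository, and record the out-of-scope URLs as external references rather than following them.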