TYPE OF PROPOSAL: Session
TITLE: Progress of the Supporting Digital Scholarship Project
KEYWORDS: digital library, collections, electronic publishing

TITLE: Overview of Supporting Digital Scholarship Project
AUTHOR: Thornton Staples
AFFILIATION: University of Virginia
E-MAIL: tls@virginia.edu

In 1992 the University of Virginia established the Institute for Advanced 
Technology in the Humanities to provide support and computing resources for 
long-term humanities research proposed by faculty at the University of 
Virginia and elsewhere.  In 1998 a Digital Library Research and Development 
Group was created within the university library to develop long-range 
planning of digital library architectures, systems, and procedures.  These 
two groups received funding in 2000 from the Andrew W. Mellon Foundation 
for the Supporting Digital Scholarship Project (SDS) to address issues of 
digital resources in scholarly research, particularly focusing on analysis, 
reprocessing, and the creation of digital primary resources.

During this session the panelists will give an overview of the Supporting 
Digital Scholarship Project and will discuss progress after one year of 
work in the areas of collecting existing digital resources created with 
proprietary software, dynamic architectures for digital scholarship, and 
collecting existing HTML web sites of scholarly research for long-term use 
and preservation.


TITLE: The Salisbury Project: Collecting an Existing Digital Resource
KEYWORDS: XML, collection, preservation
AUTHOR: Kirk Hastings
AFFILIATION: University of Virginia, Institute for Advanced Technology in 
the Humanities
E-MAIL: kvh2n@virginia.edu

The Salisbury Project is a large online archive of images designed to 
supplement existing scholarship published on the cathedral and town of 
Salisbury. It consists of a highly structured, navigable hierarchy of over 
500 images, associated descriptions, teaching materials, bibliographies, 
and maps. Marked up in SGML using the Encoded Archival Description DTD, 
the archive has made every effort to employ the standards available and has 
reached a considerable degree of platform and display independence. 
However, in order to present the materials to the user in the dynamic and 
interactive fashion they require, certain concessions had to be made. The 
lack of appropriate standards and supporting software at the time of its 
creation made it necessary to rely on proprietary programs and formats for 
its look and feel. This presents problems for the collection and 
preservation of such a complex object in the long term. The challenge for 
SDS was developing an approach that made it possible to recreate the 
Salisbury Archive using methods and tools that could be reused for 
subsequent projects. Therefore, in this, the first of our collecting 
efforts, we decided to take a two-step approach: re-encoding the document 
in the GDMS (General Descriptive Modeling Scheme) XML DTD and then using 
the newly developed XSLT standard to emulate the dynamic presentation 
software used by the original project. The combination of XSLT stylesheets 
with a transformation engine makes it possible to model complex dynamic 
behavior, but unfortunately it does not provide the contextual searching 
essential to markup-based projects. Fortunately, another member of the IATH 
staff was working on just this problem.


TITLE: Planning Obsolescence: Dynamic Architectures for Digital Scholarship
KEYWORDS: search engines, design methodology, digital scholarship
AUTHOR: Stephen J. Ramsay
AFFILIATION: University of Virginia, Institute for Advanced Technology in 
the Humanities
E-MAIL: sjr3a@virginia.edu

Digital scholars have discovered the virtues of pastiche; scholarly 
projects of the sort developed at the Institute for Advanced Technology in 
the Humanities consist not merely of individually authored textual data and 
associated media, but large supporting archives, fragments of other 
projects, and links to outside materials.  Given this shift in the paradigm 
of authorship and scholarly production, one might assume that consumers of 
scholarly materials will likewise come to view the task of research itself 
as bricolage--the act of assembling and recombining digital media into ever 
more specific forms.  This paper seeks to theorize information retrieval in 
light of this fundamental shift, and to present some preliminary research 
undertaken as part of the Supporting Digital Scholarship Project at the 
University of Virginia.

The Granby Suite, an XML-based search and retrieval system for full-text 
documents developed by the Virginia Center for Digital History and the 
Institute for Advanced Technology in the Humanities, employs an extremely 
modular architecture with strict separation between tiers devoted to 
search, document assembly, and rendering.  In this paper, I argue that such 
design principles are not merely a matter of good software engineering, but 
are in fact a consequence of the new paradigms of digital scholarship.  The 
digital library of the future should allow users to re-present materials 
from diverse media spread across a number of independent projects.  Such 
re-presentations will require that the underlying mechanics of the search 
and delivery system allow for constant remodification.  As such, the Granby 
Suite demonstrates that the technical exigencies of implementation and the 
functional requirements of humanities scholars must give rise to design 
methodologies flexible enough to allow for constant and sometimes sudden 
changes in the very architecture upon which the system depends.
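
As a rough illustration of the tier separation described above, the 
following Python sketch keeps search, document assembly, and rendering in 
independent functions so that any one tier can be replaced without touching 
the others. It is not the Granby Suite's code: the corpus, the Hit record, 
and every function name here are hypothetical, chosen only to show the 
design principle.

```python
from dataclasses import dataclass
from html import escape
from typing import Callable, Dict, List

# Hypothetical corpus: document id -> ordered text fragments.
CORPUS: Dict[str, List[str]] = {
    "doc1": ["The cathedral at Salisbury", "stands near the river."],
    "doc2": ["Letters from the valley", "mention the cathedral twice."],
}


@dataclass
class Hit:
    """A pointer to a matching fragment; carries no display information."""
    doc_id: str
    fragment_index: int


def search(query: str) -> List[Hit]:
    """Search tier: return pointers to matches, never formatted text."""
    needle = query.lower()
    return [
        Hit(doc_id, index)
        for doc_id, fragments in sorted(CORPUS.items())
        for index, fragment in enumerate(fragments)
        if needle in fragment.lower()
    ]


def assemble(hits: List[Hit]) -> List[Dict[str, str]]:
    """Assembly tier: resolve pointers into presentation-neutral records."""
    return [
        {"source": hit.doc_id, "text": CORPUS[hit.doc_id][hit.fragment_index]}
        for hit in hits
    ]


def render_plain(records: List[Dict[str, str]]) -> str:
    """Rendering tier, variant 1: plain text, one record per line."""
    return "\n".join(f"[{r['source']}] {r['text']}" for r in records)


def render_html(records: List[Dict[str, str]]) -> str:
    """Rendering tier, variant 2: an HTML list built from the same records."""
    items = "".join(
        f"<li><cite>{escape(r['source'])}</cite>: {escape(r['text'])}</li>"
        for r in records
    )
    return f"<ul>{items}</ul>"


def pipeline(query: str, renderer: Callable[[List[Dict[str, str]]], str]) -> str:
    """Compose the tiers; the renderer is chosen by the caller."""
    return renderer(assemble(search(query)))
```

Calling pipeline("cathedral", render_plain) and pipeline("cathedral", 
render_html) re-presents the same search results in two forms. Because 
each tier sees only the output of the one before it, a new renderer or a 
different search back end can be added without changing the other tiers.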


TITLE: Collecting Existing HTML Web Sites
KEYWORDS: HTML, Collections, Archive
AUTHOR: Robert Cordaro
AFFILIATION: University of Virginia, Digital Library Research and 
Development Group
E-MAIL: cordaro@virginia.edu

An area of future concern for universities and academic libraries will be 
the long-term support of existing research stored and presented as HTML web 
sites.  HTML has recently proliferated as the final format of scholarly 
research projects and theses.  The long-term viability of such resources is 
in question if they remain free-standing islands of information, 
particularly if the originating researcher is no longer actively 
maintaining the site. Changes in web server and browser software, problems 
with server hardware, and policy changes in institutions may cause the 
material to become inaccessible.

One alternative is to "collect" the HTML web pages and move them into a 
library environment, that is, into a centrally supported digital repository 
or archive, possibly transforming the HTML into some other storage or 
access format.  Doing so should extend the useful life of scholarly work 
that might otherwise be lost.  We will discuss the pros and cons of such a 
process and the problems that arise in the effort. We will look at issues 
involved in deciding whether a site is suitable for collection, methods of 
limiting the scope of the site, problems involved in moving the contents 
and developing software to transfer and possibly transform the format, 
issues in trying to preserve the look and feel of the presentation, and 
possible format options for storage.  By way of example we will discuss the 
successes and failures encountered while developing a software tool as part 
of the Supporting Digital Scholarship Project.
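
To make the problem of limiting the scope of a site concrete, the following 
Python sketch separates the links on a page into those that fall under the 
site's root URL, and so would be collected, and those that point elsewhere. 
It is only an illustration under stated assumptions, not the tool developed 
for the project: the rule that scope means same host and same path prefix, 
and all names below, are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Tuple
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Gather href values from <a> tags and src values from <img> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not value:
                continue
            if (tag == "a" and name == "href") or (tag == "img" and name == "src"):
                self.links.append(value)


def in_scope(url: str, root: str) -> bool:
    """Assumed scope rule: same host as the root and a path under the root's path."""
    candidate, base = urlparse(url), urlparse(root)
    return (
        candidate.scheme in ("http", "https")
        and candidate.netloc == base.netloc
        and candidate.path.startswith(base.path)
    )


def classify_links(page_url: str, html_text: str, root: str) -> Tuple[List[str], List[str]]:
    """Return (links to collect, links left outside the collection)."""
    parser = LinkExtractor()
    parser.feed(html_text)
    collect: List[str] = []
    external: List[str] = []
    for href in parser.links:
        # Resolve relative links against the page and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        target = collect if in_scope(absolute, root) else external
        if absolute not in target:
            target.append(absolute)
    return collect, external
```

A collecting tool would follow only the first list when moving content into 
the repository. The second list identifies links that must be left pointing 
outside the collection or otherwise handled by policy.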