
 

34   University of Virginia, Special Collections (Alderman Memorial Library)

 

Edward Gaynor, Associate Director of the Special Collections department at the University of Virginia, was interviewed by HATII on September 22, 2000. With a particular interest in American history and literature, the department at the Alderman Memorial Library had the task of digitizing material including more than 12 million manuscripts. The material represents primary and secondary historical sources, which can subsequently be used to support teaching and research. Material is evaluated for fragility before it is included in the project.

 

34.1    Organizational Digitization Program and Policy

Priorities for digitization were based on the strengths of special collections; for example, the ability to secure grants is based on the appeal of certain collections, such as African American photographs. These priorities were not formalized in a strategic policy statement. The first objective of digitization is to serve user needs and, secondly, to increase access to collections; it is not a preservation strategy. The project has been successful in achieving these objectives.

The main obstacle to developing and building the digital deliverables was the absence of an overall plan: independent centers developed separately, one for each format. Efforts to integrate all formats into one digital delivery system are underway.

The primary selection criteria are the provision of user services and enhanced access. Other considerations are research significance and teaching and learning potential. However, Special Collections does not evaluate the importance of the user’s need for the material. As far as increased access is concerned, the material is either heavily used or else it is an important collection that is little used. These criteria are fairly constant although Special Collections is now somewhat less constrained about what it delivers.

The project has co-operated significantly within the library but not much outside the university, with the exception of libraries in North Carolina. One lesson from this experience is that high, time-dependent expectations of other departments may not be met.

The project began in 1994 and is ongoing. It has no anticipated end date.

The first purpose of creating the digital deliverables is as a teaching and learning resource and, secondly, to provide wider access. Experimentation was also an early purpose but this was never explicit. Revenue generation is a minor factor, mainly to cut out frivolous requests (there is no charge for UVA users; external users have to pay).

The project produced an explicit statement of intent covering its significance.

The nature and source of materials digitized are:

There is a huge range of materials right up to large maps; the only material that is relatively standard is nineteenth century manuscripts. The digital deliverables are neither a sample nor the entire collection. Each collection is evaluated but usually it comprises the material that the user has requested. The project would like to re-purpose material but since it lacks the staff to do so itself, the material is provided for others to re-purpose.

The standards, guidelines or tools used for describing content are:

The standards, guidelines or tools used for controlling data values are:

The project did not consult guidelines for digitizing particular document types.

The standards, guidelines or tools used for representing structure are:

In relation to the use of standards in general and how to use them in practice, the project believes in interoperability as far as possible, using standards where feasible, but not worrying about detailed implementation. Implementation of library-wide digital standards will change this approach.

The intended audience for the digital deliverables and their priority are:

The project has not undertaken any evaluation of the target audience, and the materials could be used by those outwith the target audience. The project originally looked at the needs of those with disabilities and the W3C’s guidelines on web accessibility.

There are no restrictions on materials placed on the web. Other material is restricted because of limitations on the originals, such as copyright and restrictions to UVA users only. The profile of users is as anticipated but online exhibits have attracted a higher proportion of K-12 users.

 

34.2    Project Management and Planning

Informal internal and external advice was taken on project management. The project started as quite separate from the rest of the library and UVA, but over the last year or two staff have tried to communicate developments widely and to expand internal projects. There are no formal project management procedures in place but some are developing library-wide. One project management failing has been the lack of a single area in charge of an entire project.

The quality assurance procedure is random spot-checking. This developed from the first project, where quality assurance was omitted and two-thirds proved incorrect.

Neither feasibility nor pilot studies have been carried out. Several time studies have been conducted. These started with the EAF (Early American Fiction) project and are now a standard part of projects. The method is to lay out the work, break it down into its smallest parts and extrapolate from these. It is the quantifiable part of this process that has proved the hardest to measure, for example, network transfer rate.

Planning and scheduling tools are becoming more elaborate: calendar schedules are tied to a tracking database to generate quality assurance and backups. It was not difficult to delegate, since people were hired specifically for projects. Non-project staff include Michelle Kraft, who is a digitizer and manages the student digitizers. Both job descriptions and performance indicators are state requirements.

Digitization is carried out in-house for cost reasons. Equipment was bought in for digitization, some for each project, and Special Collections was the first to receive equipment. The choice of digitization process is sometimes equipment-based; they have some expertise but lack the facilities for large plan digitization.

The following technologies are used:

Guidelines are established for data capture procedures and benchmarks in use are gray scales, color and reproduction charts.

 

34.3    Human Resources and Training

Personnel employed on the project include the following.

Special Collections Staff:

Grant Project Staff:

Most staff have a humanities background, one has a library background. Staff were hired for their skills or subject background, and areas where training was needed were:

The project director, specialist technical staff and library or cataloging staff all engaged in training. This was organized in-house (with their own consultants), through external courses, independent study and learning on the job, and has met the requirements of the project.

 

34.4    Project Life-Cycle Processes and Procedures

The project is aware of its copyright position regarding the digital deliverables, but does not own the copyright in the originals. This is retained by donors and in most cases resides with the IPR holder (e.g. literature) and digitization has been carried out under the legal provision for libraries. The copyright status of the deliverables is stated on the website but not on each item. This practice has been effective.

Users can download material for individual use. They can view and download TEI, DTD and XML marked-up text; and download thumbnails, low quality and highest quality images, and associated documents. There are no electronic management systems such as watermarking in use to control copyright.

There is always a full conservation survey carried out on each item prior to digitization and the project will not digitize materials which are fragile or may be damaged by the process. No material is modified, degraded or compromised to carry out digitization. There was no official risk assessment during preparation or digitization but new cool fluorescent and UV filtered lights have been installed to replace those which were halogen and non-UV filtered. Other steps to minimize risk are the use of bookrests and cradles. Curatorial staff demonstrate how to handle material during digitization and monitor the process. There are no official restrictions on the originals after digitization, but users are first directed towards the surrogate. In some cases, use of the originals has increased.

Card or online catalogs, printed guides and encoded guides are in place before digitization, and basic descriptive or Dublin Core elements are used in digitization. The project does not have to locate any core reference material. Material is rejected for digitization because of skewing and fragility. Sometimes the project uses intermediaries, but the material does not just exist in this form. Forms of intermediaries used are slides, 35mm or 4x5 transparencies, photographic prints and microfilm.

The project does not catalog every surrogate, but when it does it uses the same method as the original. Standards used for this are:

FileMaker Pro is used for tracking; cataloging at the item level is not merged. Tools for controlling data values are:

Metadata details recorded are:

Some metadata records for the digital object are included in the main catalog and some are not. The catalog and the objects are linked by the 856 field.
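As an illustration of linking through the 856 field, the sketch below builds a display-form MARC 21 856 (Electronic Location and Access) field. It assumes first indicator 4 (HTTP access) and, by default, second indicator 1 (version of the resource), with the address in subfield $u and an optional public note in $z. The interview does not say which indicators or subfields the project used, and the URL and the helper name `format_856` are hypothetical.

```python
def format_856(url, public_note=None, relationship="1"):
    """Return a human-readable MARC 856 field pointing at a digital object.

    relationship is the second indicator: " " (none given), "0" (resource),
    "1" (version of resource), "2" (related resource), "8" (no display constant).
    """
    if relationship not in {" ", "0", "1", "2", "8"}:
        raise ValueError("unknown second indicator: %r" % relationship)
    subfields = [("u", url)]
    if public_note:
        subfields.append(("z", public_note))
    body = " ".join(f"${code} {value}" for code, value in subfields)
    # First indicator 4 means the access method is HTTP.
    return f"856 4{relationship} {body}"


# Hypothetical address; not a real project URL.
field = format_856("http://example.org/item/123", "Digital surrogate")
```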

 

34.5    Format, Resolution and Compression of Digitized Materials

The format for retroconverted text is:

None of the text contained non-Latin script.

OCR is used on rare occasions (OmniPage Pro). The accuracy level depends on the text and no special treatment is carried out beforehand. Their advice is not to OCR any material pre-dating 1850.

All manuscript material is keyed-in and their advice here is to use operators who are familiar with the subject area and can read copperplate script.

Images are captured, preserved and delivered in TIFF and JPEG; preserved and delivered in GIF; and delivered as PDF.

Capture and preservation resolution is 600 dpi, delivery resolution is 72-300 dpi, and bit-depth for all is 24-bit color. JPEG compression is used for delivery, firstly to improve access, and secondly to decrease storage requirements. The project retains the uncompressed originals. Color balance, gamma, re-sizing and compressing are carried out using Photoshop. The dynamic range of the equipment is checked.
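A short worked calculation shows why lower-resolution, compressed delivery copies help with access and storage. It computes the uncompressed pixel data size of a scan at the resolutions and bit-depth stated above. The 8.5 x 11 inch page size is an assumption chosen for illustration, and file-format overhead is ignored.

```python
def uncompressed_bytes(width_in, height_in, dpi, bit_depth=24):
    """Size in bytes of raw pixel data for a scan of the given dimensions."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bit_depth // 8


# Assumed page size of 8.5 x 11 inches; resolutions are those cited in the text.
capture_600 = uncompressed_bytes(8.5, 11, 600)   # 5100 x 6600 pixels
delivery_300 = uncompressed_bytes(8.5, 11, 300)  # 2550 x 3300 pixels
delivery_72 = uncompressed_bytes(8.5, 11, 72)    # 612 x 792 pixels
```

Under these assumptions a 600 dpi capture is roughly 101 MB before compression, against about 25 MB at 300 dpi and about 1.5 MB at 72 dpi, before any JPEG compression is applied to the delivery copy.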

Quality assurance procedures are random spot checks and on rare occasions total checks. Such quality assurance procedures have an impact on workflow and planning for it in advance is recommended.
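Random spot-checking of this kind can be sketched as drawing a random sample of files from each batch for manual inspection, with a fraction of 1.0 corresponding to a total check. The function name, file names and the 10% sampling fraction are hypothetical; the interview does not state what proportion the project checks.

```python
import random


def pick_spot_check(file_names, fraction, seed=None):
    """Return a sorted random sample of files to inspect by hand."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be greater than 0 and at most 1")
    files = list(file_names)
    if not files:
        return []
    rng = random.Random(seed)  # a seed makes the selection repeatable
    sample_size = max(1, round(len(files) * fraction))
    return sorted(rng.sample(files, sample_size))


# Hypothetical batch of 100 scans; check about 10% of them.
batch = [f"img_{i:04d}.tif" for i in range(100)]
to_check = pick_spot_check(batch, fraction=0.1, seed=1)
```

Recording the seed alongside the batch makes it possible to show later which files were inspected, which helps when planning the workflow impact in advance.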

Users have open access to the catalog and about 75% of the materials. Text is searched and browsed as open text; images are searched through FileMaker, which also holds their metadata. The level of usage on the last collection was 200 hits a day or 10,000 a month, measured by automatic data capture.

Users do not pay to access the deliverables.

Potential users are informed by:

The most effective of these are email and direct mail.

 

34.6    Evaluation, Funding and Long-term Sustainability

No front-end evaluation was undertaken. Formative evaluation used paper questionnaires, email and user observation, and no significant changes were made as a result of this. Summative evaluation uses paper and online questionnaires, email and user observation. The purpose of this was to find out whether surrogates online were an adequate substitute for originals or whether they increased original use. These results were not disseminated.

The project has cost $1,750,000 to $1,800,000, 90% of which was from grants (Federal Agencies, NEH, IMLS, Mellon). The project feels it should have cost $2 million. If it had had the extra money, the project would have made more rapid progress. So far the quality has been good, but it has taken a long time. In the long term, standards have saved money, but they have possibly cost money in the short term.

Funding organizations monitor the project through reports (one quarterly, one annual, Mellon semi-annual plus site visits).

Documents required by funders are:

New material and metadata are constantly added. Metadata are enhanced yearly, as is the user interface. It is not clear when file formats are changed.

The project is committed to keeping the materials viable and available indefinitely, but has not determined how. Loss of the deliverables would be a matter of concern.




abp~04/02