On September 22, 2000, HATII interviewed Judith Thomas, Assistant Director of the Robertson Media Center at the University of Virginia. Since its inception in 1991, the Center has sought to provide individual and group access to a broad range of materials at the University. The collection includes more than 17 000 videos and audio recordings, images and artifacts in various formats. While the primary target audience is the University community, for teaching and learning purposes, the digitization process inevitably gives those outside the university wider access to the materials. Throughout the past year, the Center has made several changes to its practices, most significantly the increased centralization of digital activities across all units of the library. The recent creation of the Digital Library Production Service, which acts as a gatekeeper of the digital collections and is the locus of all production activity, is also intended to promote co-operation and adherence to standards.
The Robertson Media Center’s digitization strategy was initiated and is driven by curricular requirements, such as the demands of the Art and Archaeology departments in the early 1990s to develop a response to the use of proprietary tools for images. A further impetus was given by the move to the web via non-searchable study guides. Audio material was developed for out-of-class study, foreign languages and special collections’ presentations. Video material was developed for media-centric curricula.
A collection survey was undertaken including faculty and libraries, but this was a very project-orientated exercise and cannot be scaled for the whole of the population. The Center’s activities began in an entrepreneurial way; however, it has recently changed this approach. The Center has been making advances in the areas of collection development, project management and digital practices, with a view to increasing the centralization of digital activities, with collaboration across all units. With these incentives in mind, the Center developed a system for managing the central repository of digital resources. This work is overseen by the Digital Library Research Group, led by Thornton Staples.
Priorities for digitization come through separate processes in a project. Batch processing is used wherever possible, followed by post-digitization processing, cursory description and metadata creation.
The objective of the Center’s digitization policy is to fulfil faculty and students’ curricular needs. The Center’s collection efforts were initially solely in response to curricular needs, but the newly instigated collection development strategy for digital images allows for a systematic building of the collection.
The overall obstacle to planning the development of digital deliverables was the lack of institutional support for tools that the Center has identified as critical. The obstacle to building digital deliverables was that collection building for non-text material was difficult, as there were no institutional solutions or the solutions were elusive.
The primary criteria for selecting and prioritizing material for digitizing are: teaching and learning potential (including historical and cultural value), and enhanced access. In the case of material digitized for special collections, conservation is a priority. Secondary criteria are research significance and preservation. The quality of the original and IPR are further considerations.
The selection criteria have not changed over time, but prioritization criteria have done so. There has been an increased interest in preservation, especially of motion media because of technological advances.
Co-operation has involved archives, libraries, museums and government agencies. This co-operation has been mostly local, at the regional level on the question of how material could be shared, at the national level with MESL and at the international level with UNESCO.
One lesson learned was that the lack of compatible systems at the regional level hinders the sharing of resources.
The Center began in 1991/92 and the current status of the program is ongoing with no anticipated end date.
The primary purposes for which the digital deliverables are created are as a teaching and learning resource and to provide wider access. Secondary purposes are research and preservation.
The project produced an explicit statement of intent that covered its rationale, scope, significance and primary audience for fundraising, and this statement has gone through many iterations on the web.
The type of source material digitized includes:
The nature and format of the material digitized includes:
Whether the digital deliverables represent a sample or an entire body of material depends on the collection; for example, if it fits other selection or prioritization criteria for the whole of the collection. It is surprisingly seldom that the digital deliverables are intended to be re-purposed, for example as an educational CD-ROM.
Standards, guidelines and tools used for representing content are:
Standards, guidelines and tools used for describing content are:
Standards, guidelines and tools used for controlling data values are:
The guidelines which the Center consulted for digitizing particular document types came from Getty, MESL, Harvard, Michigan and Oxford (Stuart Lee). The Center looked at these to corroborate its own experience and modified them for local requirements.
The MARC standard was rejected because it did not work for digital images, but the Center maps its records to MARC.
Standards, guidelines and tools used for representing structure are:
In terms of navigating between ideal standards and realistic use, the Center’s recommendation is to interpret standards locally and then make sure the result can be mapped for interoperability.
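The idea of interpreting locally while keeping records mappable can be illustrated with a small crosswalk. This is only a sketch under stated assumptions: the local field names (title, creator, date, file_url, capture_dpi) and the sample record are hypothetical, and the Center's actual schema and mapping are not described in the interview. The MARC 21 tags used (100, 245, 260, 856) are standard bibliographic fields.

```python
# Hypothetical local field names mapped to MARC 21 (tag, subfield) pairs.
# The local names are assumptions for illustration, not the Center's schema.
CROSSWALK = {
    "creator": ("100", "a"),   # main entry, personal name
    "title": ("245", "a"),     # title statement
    "date": ("260", "c"),      # date of publication
    "file_url": ("856", "u"),  # electronic location and access
}


def to_marc_fields(local_record):
    """Split a local record into mapped MARC fields and unmapped local fields."""
    mapped = []
    unmapped = []
    for name, value in local_record.items():
        if name in CROSSWALK:
            tag, subfield = CROSSWALK[name]
            mapped.append((tag, subfield, value))
        else:
            # Local-only detail stays in the local record; listing it makes
            # gaps in the crosswalk visible.
            unmapped.append(name)
    mapped.sort()  # order by MARC tag
    return mapped, unmapped


record = {"title": "Rotunda, south elevation", "creator": "Unknown", "capture_dpi": 600}
fields, leftovers = to_marc_fields(record)
```

Reporting the unmapped fields separately is one way to check that a locally interpreted record still maps cleanly before it is shared.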
The first priority target audience is four-year college and graduate school students.
An evaluation of the target audience was made through faculty meetings, library content surveys and identified needs.
The 30% of the digital deliverables that are openly accessible can be used by groups other than the target audience, such as other sections of higher education, and there is great interest from K-12. The profile of users has been as anticipated. The project has not taken account of the needs of those with disabilities or the W3C’s web accessibility guidelines.
There are restrictions on the use of the digital deliverables because of the original rights and copyright holders, and these restrictions are clearly stated.
The Center follows homegrown project management guidelines.
The Robertson Media Center is related to the structure of UVA as a library department, but collaborates with computing. The digitization program has led to the broadening of political boundaries and tighter collaboration across centers. The work of the Digital Library Research Group and the creation of the Digital Library Production Service have encouraged these developments.
Quality assurance procedures consist of benchmarking for different types of procedures, such as references and questions for traditional library matters, but there are few for digital projects, since so many of these are open and fluid. Concept testing is in place and smaller projects are storyboarded.
Pilot studies for scheduling, training needs, technical feasibility, user needs, workflow analysis and piloting and technology are used to assist project management and planning. Significant changes were made to the program as a result of these pilot studies in the areas of technology, training and workflow.
The electronic tools used for scheduling are flow charts and diagrams. Clearly defined areas of technical expertise determine who does what work. Public service and project support is shared. Job descriptions and performance indicators are used.
Most digitization is outsourced for reasons of quantity, cost effectiveness and terms of grants, which make it economical to outsource. Digitization equipment was bought in and the decision as to which digitization process to adopt is based on media type and the requirement for making an archival and/or endurable copy.
The following technologies are used:
Guidelines for data capture procedures are flexible, and have been increasingly adhered to since the development of a more controlled environment. Benchmarking is limited to minimal color calibration.
The Center employs one full-time director (also as metadata specialist) and five full-time technical specialists (who also carry out digitization work). Each member of staff also fulfils a technical support and development role (approximately 20% of their time). In addition between one and five graduate students are employed for 5-10 hours a week. The project also works closely with staff in the school of education. Most staff have a humanities background and were employed from other areas.
Advice on the technical aspects of digitization was available externally and in-house. Areas where training was or is needed were identified as:
The Center also looks at the training needs of the end users; for example, media experts teach graduate students. All staff were or are engaged in training in-house, with internal consultants and through learning on the job. The training has met the needs of the project.
The Center is aware of the copyright position of the digital deliverables in some cases. It does not own the copyright in the original materials (which is retained by the original donors or holders). The copyright status is declared next to the acquisition device simply as “refer to copyright law”. Material in copyright is digitized under the legal provision for libraries and the Center offers copyright advice to anyone who seeks it.
Depending on the copyright position, users are able to download thumbnails, lower and higher quality images and associated documents. Most audio material is available for listening only in less than 30-second samples, full length compressed, high fidelity sound and associated documents. For digital moving image most material is available for viewing only as samples, lower quality clips, highest quality clips and associated documents.
IP access restrictions are in place.
The Center does not have a conservation procedure for the original material.
A rough assessment of the originals is made by the Center, before being referred to experts. The Center seldom undertakes conservation activities on original materials except for audio. No material is modified, destroyed, degraded or compromised to carry out digitization. Only small subsets of fragile originals are identified as being at risk during preparation for digitization.
As playback is destructive for audio and video, risk is assessed through feeding the materials. To minimize risk, fragile material moves back to special collections, and vulnerable materials are then made stable for digitization. Conservation staff sometimes prepare material before digitization but do not monitor it during the process. Some access restrictions are placed on materials after digitization.
Cataloging and referencing systems in place before digitization include BPAC and hand lists, and as much information is taken from these as possible. The Center usually has access to all the relevant catalog materials but has to locate some core reference material. Some material is altered from the original to complete the digitization process and some material is rejected because of blurred pages, or for reasons of general quality or fragility. In many instances the Center uses reproductions or intermediaries and in these cases the material only existed in intermediary form.
The form of reproductions/intermediaries used includes the following.
For images:
For audio:
For moving images:
The digital surrogates are often cataloged differently from the originals.
Standards or guidelines used for cataloging the digital deliverables are:
Tools used for controlling data values are:
Metadata details recorded are:
Metadata records are created by the digitizer, by an archivist or information professional, or by a digitizer who is also an archivist or information professional. The metadata records are included in the main catalog and in a separate catalog, which is in electronic form on an intranet and available on the internet. The records for the digital deliverable and the original digitized materials share some information. The catalog and object are linked by the filename reference.
For textual material the formats used are:
Some texts contain non-Latin scripts.
The OCR software used is OmniPage, and 80% accuracy was achieved with no special treatment carried out beforehand. The aim of using OCR was to capture machine-readable text for user-based needs. Keying-in is not used as a method for textual materials.
For images the capture formats used are GIF, TIFF, Photo-CD and PDF. The preservation format is TIFF and the delivery formats are JPEG, PDF and MrSID. Capture and preservation resolution is 2000-3000 pixels on one side, with screen resolution used for delivery. The bit-depth for illustrations, rare books and manuscripts is 24-bit color. In the case of microfilm or delicate material, the preferred quality setting is 1-bit 600 DPI TIFF files, with 8-bit 400 DPI TIFF files used on occasion for materials such as photographs or handwritten documents on microfilm. LZW compression is used at capture and JPEG for delivery. The aim of compression is to improve access, enhance usability and decrease storage requirements. The program retains the uncompressed scans. The program processes JPEG images using Photoshop and DeBabelizer. MrSID images have their own processing engine. The average file size at capture and preservation is 25 MB, and under 1 MB for delivery. The dynamic range of the equipment is not checked.
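The relationship between the pixel dimensions, bit-depth and file sizes quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the 3000 x 2800 pixel frame and the 8.5 x 11 inch page are assumed values chosen for the example, not figures reported by the Center, and file headers and LZW compression are ignored.

```python
def uncompressed_size_bytes(width_px, height_px, bits_per_pixel):
    """Raw raster size in bytes, ignoring file headers and compression."""
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


def pixels_for(inches, dpi):
    """Pixel count along one side of an original scanned at the given DPI."""
    return round(inches * dpi)


# 24-bit color capture, 3000 pixels on the long side (assumed 3000 x 2800 frame).
color_scan = uncompressed_size_bytes(3000, 2800, 24)

# 1-bit capture at 600 DPI of an assumed 8.5 x 11 inch page.
bitonal_scan = uncompressed_size_bytes(pixels_for(8.5, 600), pixels_for(11, 600), 1)

print(color_scan / 1_000_000, "MB")    # 25.2 MB
print(bitonal_scan / 1_000_000, "MB")  # 4.2075 MB
```

Under these assumptions, a 24-bit image at the upper end of the stated resolution range comes out at roughly the 25 MB average reported for capture and preservation files.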
Recommendations from this area would be to acquire familiarity with issues of color fidelity and capture at high resolutions.
Methods used for the digitization of sound are analog play, mixing, digital capture, and manipulation and compressing.
Capture and preservation formats are AIF, MIDI, Au and QuickTime.
Delivery formats are AIF, Real Audio, MIDI, Ra, Au, MPEG and QuickTime.
Compression is used and the aim is to improve access, enhance usability and decrease storage requirements. Post-editing is carried out.
For moving images the capture and preservation formats are QT and MOV. Delivery formats are MPEG-1, QT, Real Video and MOV. Moving image resolution for capture, preservation and delivery is NTSC 30 fps. Capture and preservation compression is DV, and for delivery MPEG and QT. The aim of compression is to improve access, enhance usability and decrease storage requirements. Post-capture processing includes adding data, fades, and altering frame rate and size using Media Cleaner, Final Cut Pro and iMovie.
Quality control for the digital deliverables includes spot checks, a random set of checks and checks on a stratified random sample. Metadata quality control procedures involve checking by the project leader. Quality control procedures have influenced the Center’s assessment of time and manpower.
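A check on a stratified random sample can be sketched as follows. This is a minimal illustration, not the Center's procedure: grouping by media type, the 10% sampling fraction and the item tuples are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(items, stratum_of, fraction, seed=None):
    """Draw a random sample from each stratum, at least one item per stratum."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for item in items:
        groups[stratum_of(item)].append(item)

    sample = []
    for key in sorted(groups):
        members = groups[key]
        # Sample the requested fraction, but never fewer than one item
        # and never more than the stratum holds.
        k = min(len(members), max(1, round(len(members) * fraction)))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical batch: (media_type, item_number) pairs.
batch = (
    [("image", i) for i in range(100)]
    + [("audio", i) for i in range(20)]
    + [("video", i) for i in range(5)]
)
to_check = stratified_sample(batch, stratum_of=lambda item: item[0], fraction=0.1, seed=1)
```

Sampling within each stratum ensures that small groups, such as a handful of video files in a batch dominated by images, are still represented in the check.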
Users do not have to pay to use the digital deliverables. Access is a combination of: open access to the catalog alone or to the catalog and materials; access restricted to in-house users; and restricted access. For web delivery, users are able to browse at the collection level and view individual items. Metadata information is delivered with the digital object (retrieved from a database). The program has no special hardware or software needs. Users are not able to manipulate any of the digital deliverables except for MrSID images.
The level of use of the digital deliverables is primarily course support at the moment. Usage is monitored by automatic data capture using Netwatcher and Webstats.
Potential users of the digital deliverables are informed by website announcements, press releases, articles in print media, conferences, meetings, email shots, conventional mail shots and registering with web search engines. It is not known which of these has proved the most effective.
The Center has not yet undertaken any evaluations.
It is not known how much the program has cost so far. The main sources of funding have been grants and library budget allocations. The Center believes that the use of standards has been costly in the short-term but has produced savings in the long-term. Funding organizations monitor the Center through periodic reports and assessment of deliverables.
New materials and metadata are added with high frequency.
A preservation strategy is under development. Quality control procedures are in place for life-cycle management according to media type. The Center intends to keep the digital deliverables available indefinitely.
The Center does not have an exit strategy. Loss of the digital deliverable would be a matter of concern.