
 

2       The Bancroft Library

 

HATII interviewed Merrilee Proffitt, Director of Digital Archives Development at The Bancroft Library, University of California, on January 25, 2001. As a result of the California Heritage Digital Image Access Project, images are selected from the Manuscript and Pictorial Collections of The Bancroft Library, digitized, and held in the Online Archive of California (OAC). The OAC began as a pilot project with the aim of developing a UC-wide database of archival finding aid data. The Bancroft began digitizing its thirty-nine volumes of historical records to enhance research, teaching and learning opportunities, while also opening up the resources for greater public access.

 

2.1       Organizational Digitization Program and Policy

The Bancroft Library, like other collections in the Online Archive of California, provides access to its rich digital material through the OAC central server via the finding aids produced at the Bancroft.

The collections at Bancroft contain material in special collections such as manuscripts and rare books. Among its components are the Bancroft Collection of Western Americana and Latin Americana, the Rare Book Collection, the History of Science and Technology Collection, the University Archives, the Free Speech Movement Project, the Bancroft Library Pictorial Collection, the Mark Twain Papers and Project, and the Regional Oral History Office. There is no formal collection survey for digitization, which tends to take place in an opportunistic way, i.e. in response to funding opportunities or to faculty alliances on campus, which the Bancroft tries to cultivate to produce digital resources that are directly related to UC Berkeley research. A curators’ management group meets quarterly and discusses various issues, including which digital projects could be pursued based on the opportunities available. The catalog is examined for projects, some of which may appear feasible on paper but prove difficult in practice, e.g. those involving fragile objects or copyright issues. The research is often focused on regional and local themes.

They will often include funding for collection surveys in a grant, specifically to ensure that all relevant material is identified. The finding aids associated with the collection, whether paper or electronic, are essential at all stages of any project.

There is a thematic approach to the selection of material for digitization, based on the research projects identified. The material will be related to the subject area. However, the nature of special collection material is such that it can often pose physical problems in digitizing, so the selection process can be very subjective. While Bancroft does not have formal selection criteria for digitization, it recognizes that such a document would be most useful – although not a restrictive document, rather one that allowed flexibility with guidelines. The management group functions as a “living” document on selection criteria.

The main influences are research significance and teaching and learning potential. They are not driven by labor or infrastructure cost reduction, or by space rationalization. With selection a response to research, the criteria change over time and from project to project.

As with sub-projects in the OAC, the Bancroft has co-operated with various projects across the cultural heritage field, as well as with corporations such as biotech companies. Collaboration is considered to be worthwhile if partners are chosen carefully, roles and commitments are set out early, and expectations of members are reasonable. Face-to-face meetings are vital and should happen early and often. It is essential to remember that the goal is to produce and to collaborate.

The Bancroft started digitization in 1993 when it converted the card catalog to MARC on EAD. The principal purposes behind the whole digitization project are research, teaching and learning, and public access, as would be expected from such an institution. Materials digitized include handwritten documents, typescript documents, original artworks and photographs; however, the nature of these materials varies, reflecting the diverse nature of the collection as a whole. They rarely digitize all of one sub-collection; more typically they digitize a sample, e.g. one leaf from each manuscript in a collection. They did fully digitize their papyrus collection, but this was not typical.

The finding aids must be usable by the OAC for access on the OAC central server, so the Bancroft uses standards where possible, including CIMI, TEI, JPEG, SGML and XML. Other standards for describing content are MARC, EAD, TEI Header and CDWA. They have an “adopt and adapt” approach, using standards where possible and adapting where practically necessary; for example, the papyrus project had to create a thesaurus of ancient Egyptian place names. Interestingly, they are not enthusiastic about Dublin Core, but this could be a librarian’s perspective, in which Dublin Core is not as flexible as MARC and/or EAD.

 

2.2       Project Management and Planning

Project management varies from project to project, but tends to be in-house and responds to the needs of individual projects. The management group is key to the digitization project management, meeting twice a week to discuss all Library activities and quarterly reports on the digital issues. Within projects, email is used to track progress and goals. The Bancroft is keen to establish more formal project management procedures for digital projects and recognizes the need for training on project management. At the moment, such training is geared towards business rather than cultural and heritage activities.

They do not carry out pilot studies or feasibility studies prior to starting projects but often base benchmarks on previous projects. Text conversion for EAD and TEI is outsourced, as is microfilm, while imaging of the collections material is carried out in-house. There is an excellent imaging unit in Bancroft, which unfortunately could not be seen during this visit, due to illness. In this unit, run by Dan Johnston, fragile and delicate materials can be imaged with great care. Flatbed scanners handle oversized materials and there is a Phase One PowerPhase camera. This unit uses Mac workstations and has its own intranet for managing digital objects.

 

2.3       Human Resources and Training

Staffing on each project varies, with archivists and curators involved at different levels. For example, more curators were involved on the papyrus project due to the nature of the material, and students are often employed where there is a large amount of material to process. In the Etext unit there are three programmers and three text encoding specialists. Training on digitization is provided in-house by Dan Johnston and his team. Staff members are also offered training on using the databases for cataloging.

Bancroft feels that there is a huge need for the community to offer more specific training on all issues associated with digitization, in short intensive courses within members’ price range. While there are a few excellent courses, they would like to see more discrete training courses.

 

2.4       Project Life-cycle Processes and Procedures

The Bancroft owns the copyright of most of the materials in the collections and employs fair use criteria when disseminating the resources. There is a bold copyright statement at the point of access:

“Copyright has not been assigned to The Bancroft Library. All requests for permission to publish or quote from manuscripts must be submitted in writing to the Head of Public Services. Permission for publication is given on behalf of The Bancroft Library as the owner of the physical items and is not intended to include or imply permission of the copyright holder, which must also be obtained by the reader.”

While much of the material is old and can be digitized with no rights problems, there are some sensitive areas that Bancroft responds to as carefully as possible. So far this strategy has been successful and Bancroft continues to monitor this area.

The materials are conserved by the curators in the Library and are routed through the curators during the digitization process. During the papyrus project, the fragile papyri were held in plastic sheeting to ensure no damage occurred. Any necessary conservation is carried out on a project, if material comes to light because of the digitization process. This tends to be opportunistic, where material may be digitized because of its fragile nature, or conservation may be an off-shoot of the project. The image lab takes due care and concern when dealing with fragile materials and curators are present when necessary. Bancroft is discussing access to the original materials and may possibly limit access, but at the moment directs users to the digital version.

Occasionally material is rejected before digitizing; it may sound promising in the catalog but in practice is not sufficiently rich in content or presents physical problems.

The original materials are cataloged by the Library. The digital objects are referenced through this catalog by MARC 856. The digital objects are delivered through the catalog and Bancroft is looking at search tools for the TEI and EAD to be attached to the catalog for more precise searching on the digital metadata. The TEI header holds data about the original and the surrogate and refers to the digital object. At the moment the catalog only allows viewing of the digital object. The finding aids on the OAC central server web page allow searching of the metadata attached to each object. There is a local database for all projects that uses the same format – Access. This database references the catalog. The SGML finding aids are produced from this database for access in the OAC central server.
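As an illustration of the MARC 856 link mentioned above, the following is a minimal sketch of how an 856 (Electronic Location and Access) field might be assembled. It is not the Bancroft's cataloging workflow; the function name `build_856`, the mnemonic text layout and the example URL are assumptions made for illustration. The indicator and subfield meanings follow the MARC 21 bibliographic format.

```python
def build_856(url: str, note: str = "", version_of_resource: bool = True) -> str:
    """Return a MARC 856 field in a simple human-readable layout.

    Hypothetical helper for illustration only; real catalogs store the
    field in a MARC record rather than as a plain string.
    """
    if not url:
        raise ValueError("an 856 field needs a URI in subfield $u")
    # First indicator 4 = access method is HTTP.
    # Second indicator 1 = link to a version of the described resource
    # (for example a digital surrogate); 0 = link to the resource itself.
    indicators = "4" + ("1" if version_of_resource else "0")
    subfields = [("u", url)]           # $u = Uniform Resource Identifier
    if note:
        subfields.append(("z", note))  # $z = public note shown to the reader
    body = "".join(f"${code}{value}" for code, value in subfields)
    return f"856 {indicators} {body}"


# The URL below is a placeholder, not a real Bancroft address.
field = build_856("http://example.org/digital/item-0001", "Digital surrogate")
print(field)  # 856 41 $uhttp://example.org/digital/item-0001$zDigital surrogate
```

Using second indicator 1 reflects the situation described here, where the catalog record describes the original and the link leads to its digital version.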

Where possible, the Bancroft uses controlled vocabularies and thesauri, e.g. name authority files. The metadata records a little information about the original object, the digital object, the process, technical details, staffing details and administrative information. The metadata is recorded by the digitizers in the image lab. The database outputs the finding aids for web access.
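The step of producing a finding aid from the project database can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the Bancroft's actual export: an in-memory SQLite database stands in for the Access database, XML stands in for the SGML output, and the table name `items`, its columns and the sample rows are hypothetical. The element names (`archdesc`, `did`, `unittitle`, `dsc`, `c01`, `dao`) are taken from EAD, but the fragment is deliberately incomplete and is not validated against the EAD DTD.

```python
import sqlite3
import xml.etree.ElementTree as ET


def finding_aid_from_db(conn: sqlite3.Connection, collection_title: str) -> str:
    """Build a small EAD-like XML fragment from rows in a local database."""
    archdesc = ET.Element("archdesc", level="collection")
    did = ET.SubElement(archdesc, "did")
    ET.SubElement(did, "unittitle").text = collection_title
    dsc = ET.SubElement(archdesc, "dsc")

    # One <c01> component per item row, in a stable order.
    rows = conn.execute("SELECT title, image_url FROM items ORDER BY id")
    for title, image_url in rows:
        item = ET.SubElement(dsc, "c01", level="item")
        item_did = ET.SubElement(item, "did")
        ET.SubElement(item_did, "unittitle").text = title
        # <dao> (digital archival object) points at the digitized image.
        ET.SubElement(item_did, "dao", href=image_url)
    return ET.tostring(archdesc, encoding="unicode")


# Hypothetical sample data standing in for the project database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, image_url TEXT)"
)
conn.executemany(
    "INSERT INTO items (title, image_url) VALUES (?, ?)",
    [
        ("Sample letter", "http://example.org/images/0001.jpg"),
        ("Sample photograph", "http://example.org/images/0002.jpg"),
    ],
)
xml_text = finding_aid_from_db(conn, "Sample collection")
print(xml_text)
```

Keeping the descriptive data in one database and generating the finding aid from it means the same records can feed both the local catalog references and the files sent to a central server.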

 

2.5       Access to Digitized Materials

The Bancroft Library mounts some of its projects online through the Californian Digital Archive – the OAC and CDL. These materials are accessed through the SGML finding aids which are produced from the database. The minimum specifications are available and Bancroft adheres to these, thereby enabling its materials to be accessible online.

They are currently researching MOA2, an XML DTD for structural and administrative data. This helps navigate through complex objects – looking at the master and derivative images. The rights information can also be attached by links to the catalog and the finding aids. Output would be as MOA2 objects with the finding aid passing onto MOA2 DTD. This structural metadata allows generic handling of complex hierarchies. The DTD would be in common exchange formats and the Library at UC Berkeley is producing a variety of tools to support the capture of administrative and structural metadata during the creation of digitized archival materials.

“The cornerstone of the MoA II effort is an XML DTD that defines the digital object’s elements and encoding. The project has also developed a relational database that allows a Library to capture the metadata, a program that reads the database and automatically creates the XML encoded digital objects, a repository manager that provides distributed network access to the objects (via RMI) and a viewer that displays MoA II objects from the repository.”[2]

MOA2 is based on administrative data for images; the aim is to record more of the derivatives and the capture process. There is a current concern over the databases as they stand at the moment and problems with migration. MOA2 will therefore address access, preservation and exchange issues.
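The idea of structural metadata that lets software handle complex objects generically, including their master and derivative images, can be illustrated with a small sketch. It does not use the MoA II DTD or its element names; the classes `Division` and `ImageFile`, the `walk` function and the file paths are all hypothetical and serve only to show the principle of traversing a hierarchy without knowing its shape in advance.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class ImageFile:
    role: str  # "master" or "derivative"
    path: str


@dataclass
class Division:
    """One node of the object's hierarchy, e.g. a volume or a page."""
    label: str
    files: List[ImageFile] = field(default_factory=list)
    children: List["Division"] = field(default_factory=list)


def walk(div: Division, depth: int = 0) -> Iterator[Tuple[int, str, List[str]]]:
    """Yield (depth, label, derivative paths) for every division, top-down."""
    derivatives = [f.path for f in div.files if f.role == "derivative"]
    yield depth, div.label, derivatives
    for child in div.children:
        yield from walk(child, depth + 1)


# Hypothetical two-page object with a master and a derivative per page.
volume = Division(
    "Sample volume",
    children=[
        Division("Page 1", files=[
            ImageFile("master", "masters/p001.tif"),
            ImageFile("derivative", "access/p001.jpg"),
        ]),
        Division("Page 2", files=[
            ImageFile("master", "masters/p002.tif"),
            ImageFile("derivative", "access/p002.jpg"),
        ]),
    ],
)

for depth, label, derivatives in walk(volume):
    print("  " * depth + label, derivatives)
```

Because the traversal depends only on the generic parent and child relationship, the same code would work for a single photograph, a multi-page manuscript or a deeper hierarchy.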

 

2.6       Evaluation, Funding and Long-term Sustainability

Bancroft does not carry out any formal evaluation, but has consulted curators and faculty staff at the outset of projects about user needs. There is a perceived lack of standard tools for this process and access to typical instruments would make the process more viable.

Grants from funding bodies often require evaluation and dissemination strategies which motivate these procedures. However, it is also felt that the evaluation process may not always be fruitful and ad hoc discussions with colleagues can be as effective in the long term.

As part of the OAC, Bancroft recognizes that updating and adding to the archives is a necessary and unavoidable process. On older projects, the metadata will have to be updated and new materials added. The OAC is looking at changing the user interface, is examining off-the-shelf solutions, and may produce a new interface for access. The intellectual effort in MOA2 will help this process. It is through MOA2 that Bancroft is redirecting its long-term preservation and archiving strategy so that it no longer relies on the databases. The preservation strategy is based on migration of data; emulation and hardware/software retention are not valid elements. The aim of this strategy is to ensure that the digital objects remain usable for as long as possible, as their loss would matter greatly.

 

2.7       Comments and Conclusions

The Bancroft is a university Library that participates in many digitization projects across the campus of UC and across the state. It has an in-house image lab with excellent staff who manage the digitization of a variety of materials. The Bancroft has its own guidelines for internal projects, but is also a major part of larger programs, such as the OAC, and outputs finding aids in EAD for use in the OAC online server.

The Bancroft, in common with many projects and programs in Berkeley, is aware of the need for good, easy-to-use, cross-community tools. It is with this in mind that they are researching the XML DTD for MOA2. The re-addressing of these larger issues is a timely and strategically thoughtful method to ensure the longevity and usability of the digital objects that Bancroft produces.

They are motivated by research from faculty to create their own digitization projects, but their material is also available through cross-disciplinary, cross-community projects such as JARDA. In being an individual project as well as an integral part of the larger university and state program, Bancroft has ensured that it can serve both the immediate community and the growing numbers of users of digital material, while also ensuring that its collections are accessible and will remain so in the long term.


[2] http://sunsite.berkeley.edu/moa2/




abp~04/02