
 

4       The Online Archive of California

 

HATII interviewed Robin Chandler, Director of the Online Archive of California, on January 26, 2001. A fundamental facet of the California Digital Library, the OAC aims to make the vast array of primary sources and artifacts stored in institutions throughout California accessible to a wide audience. Robin Chandler has championed the advancement of the OAC as an OAC working group member and as the Chair of the UC Archivists Council.

 

4.1       Organizational Digitization Program and Policy

“ A core component of the California Digital Library (CDL), the Online Archive of California (OAC) is a digital information resource that facilitates and provides access to materials such as manuscripts, photographs, and works of art held in libraries, museums, archives, and other institutions across California. The OAC is a California-wide digital archives that integrates in a single searchable database of finding aids to primary sources and digital facsimiles of selected content from archival collections dispersed throughout the state. Critical issues range from ensuring that policies, procedures, and mechanisms are in place for contributing digital finding aids to the OAC database to coordinating the deployment of select digital content and the development of documentation, tools, and training needed to advance the growth and management of the OAC. The California Digital Library (CDL) is a “co-library” with a focus on digital materials and services. A collaborative effort of the ten University of California (UC) campuses, organizationally housed at the UC Office of the President, it is responsible for the design, creation, and implementation of systems that support the shared collections of the University of California. Several CDL projects, including the OAC, focus on collaboration with other California Universities and organizations to create and extend access to digital material to UC partners and to the public at large.”[3]

As part of the OAC, there are a variety of CDL-sponsored virtual archive sub-projects that in turn have project partners, such as the Museum Online Archive of California (MOAC). The MOAC project, comprising eight museums in California, is testing the application of Encoded Archival Description (EAD) to describe and provide access to museum collections, testing specifications for item-level description of museum artifacts, and integrating digital facsimiles of museum collections with digital archival materials as part of a union database. While hosting the collections of other institutions, the OAC creates virtual archives, such as MOAC and the Japanese American Relocation Digital Archives (JARDA). The material in JARDA comes from archives, libraries and museums across the state. Currently, the OAC does not conduct a collection survey as such, because it does not have an inherent collection but provides the platform for digital resources across the state. For example, individual OAC partners are creating virtual archives for the OAC, such as the Free Speech Movement, the Cased Photograph collection and the San Francisco Earthquake and Fire. However, one of the charges of the OAC Steering Committee, the database’s administrative group, is to advise on the development of an overall collection policy for the OAC.

The priorities for selecting material vary from project to project. Research and public demand are major drivers: the JARDA project, for example, was established because the OAC partner institutions recognized the high public demand for reference access to material on Japanese internment. Political awareness also plays a part in selection, to ensure that the collection is representative, that it meets the need for the specific resources, and that it can be used by all possible users, including education at the K–12 level. The curators from the eight participating institutions formulated the collection development criteria used to select materials for the JARDA project. They determined that it was important to have balanced perspectives, that is, to provide the government’s viewpoint (the camp administrators and the context of evacuation from homes) and the views of the internees, as well as to document life in all of the camps.

The OAC database has provided the first opportunity for geographically dispersed finding aids, encoded in EAD, to be available at a single online source in California. The next steps for OAC are to improve the database interface by working with user communities (faculty, students, the K-12 audience) and to improve access by identifying what functionality is required, what content is required, and how the interface can promote ease of use.

The OAC is a collaborative venture; co-operation is an integral part of all projects, with materials from dispersed collections and diverse institutions. This collaboration is worthwhile and effective but can often take more time than may at first be thought. Partners contributing finding aids and digital images to the OAC must adhere to CDL standards, such as the Digital Imaging Standards and the OAC Best Practices for New Finding Aids. Effective tools and services can be built using standardized data.

While the nature of the source material varies considerably from project to project, most forms are digitized. The OAC is currently reviewing available standards for presenting streaming audio and video data files through the database.

Standards are important to the OAC as a means of ensuring access to, and the longevity of, its diverse materials. To describe content, it uses MARC, EAD, TEI Header and CDWA. During the next year, it is anticipated that a planning group will be established to develop guidelines for incorporating non-EAD descriptive metadata standards into the OAC. Standards have been used where possible; where they are not used, it is because they do not yet exist. Standards are also used for data control values: the AAT and Library of Congress Subject Headings. The OAC neither rejects standards nor adapts them. The union catalog allows information sharing: the data sets differ, but the tags are generic. This keeps tagging at the most basic level and accommodates the diversity of the collections.

 

4.2       Project Management and Planning

There are formal project management structures in place within the OAC, and projects are attached to these. A Steering Committee serves as the principal advisory group of the OAC, responsible for evaluating issues of administration, intellectual property concerns, funding resources, and content development. The OAC manager has direct management of the various projects, such as JARDA, MOAC and California Cultures. The OAC manager is in contact with forty institutions and communicates with the OAC Working Groups advisory panel, which directly oversees several subcommittees: Architecture & Structured Text, Metadata Standards and Digital Objects. This management structure allows flexibility and communication from project to project and from the strategic top down. The OAC manager fulfils a pivotal role as the point of contact for all parties.

The individual projects have various project management structures, since each is led by an individual Project Manager. Each will have an advisory board that assists where necessary. The OAC manager liaises with editorial boards and the project managers on each site and campus, so the levels of communication are established and transparent to all. Targets are set and a timeline established. Feedback loops enable communication to flow freely and, where necessary, staff are redeployed to reach a goal or target. The OAC Manager is considering using project management software for all tasks; Gantt charts are currently used, and the present project management structure has experience in this practice.

The OAC is a core component of the CDL and is firmly established in the management structure. As a part of the larger body of the CDL, the OAC has its objectives clearly established. The ability to interact with the other components of the CDL is essential to its success. The CDL web page describes the organizational structure:

“The CDL effort is directed by the University Librarian and Executive Director. There are separate staff responsibilities for Shared Content, Business Development, Education & Strategic Innovation, Scholarly Communication Initiatives, Digital Library Services, and Digital Library Technologies. There are several cross-campus advisory bodies to the CDL. The CDL maintains liaisons to all the UC campuses as well as to its other partners such as the State Library, the California State University system, and other organizations throughout California”[4]

The OAC is placed within the Shared Content section of the CDL. There are broad areas within this structure that allow for collaboration and sharing of skills and expertise.

Pilot projects and feasibility studies are not generally carried out within the OAC. Individual projects may conduct them independently, but this is neither a guideline nor advised.

Currently, the OAC carries out its digitization in house, using a variety of technologies depending on the project. However, it is now considering outsourcing this process because of concerns about quality control that have emerged from in-house digitization. The OAC is in the process of interviewing vendors with attention to quality of product, considerations for handling rare and fragile materials, and security for materials. Longevity and access are the main drivers of the CDL and OAC, and therefore the digital objects must be of the best possible quality. The funding is mainly public money, and thus there is a high level of accountability and a need for the digital materials to be available to the public in excellent quality for as long as possible. The use of outside vendors implies that this accountability can be shifted from in-house staff to these vendors. Not all in-house staff have the same level of skill and there are limitations to the equipment used, which raises some issues of quality control. A contract with vendors enables the quality control levels to be established at a set cost that will not shift if skill levels differ.

The strategy is to contain funding by establishing a contract that is creative and collaborative. All aspects of the process must be covered by the contract, with quality control checks and benchmarks clearly established along with minimum targets. Post-processing must be approached in the same manner.

Outsourcing is a viable option and shifts the quality control and workflow to the vendor while ensuring that the cost will not rise as a result of re-imaging where poor quality imaging has been discovered.

 

4.3       Project Life-Cycle Processes and Procedures

The copyright is owned by the institutions contributing the digital images, which is clearly established where the original archival materials are publicly available. OAC is developing a model for indicating rights with the digital objects, similar to the Bancroft Library statement [see 2501AG09]. Different projects and archives have their own statements, but the strategy is to create a model for stating IPR on all archives and projects that covers as much ground as possible. The diverse nature of the materials, in terms of both content and geography, means that the copyright will vary.

The California Digital Library Technology Architecture and Standards Workgroup (TASW) is responsible for recommending architectural guidelines and standards for the University of California shared digital collections to the University Librarians and other appropriate University bodies for review. These guidelines should be consistent with industry and other University standards, provide a framework that facilitates the creation of integrated systems and provide the flexibility to foster innovations in scholarly communication. As such, TASW has developed standards for architecture, digital imaging, best practices for image capture and digital image repository models. The digital image repository model calls for storage of masters and derivative images at both CDL and partner institutions. The partners maintain control of their master image, and the CDL master will serve as a records management copy. The OAC Operations Group, the technical group composed of computer programmers and the OAC Manager, is charged with developing and implementing the storage system for managing digital images housed at CDL. CDL oversees the accessibility, management and storage of EAD and MOA2 metadata. Derivative images are primarily stored by OAC, and users navigating EAD metadata primarily access images stored at local and not CDL servers.

The OAC has EAD finding aids for forty separate institutions, the majority of which are not part of the University of California (UC). The OAC Working Group Subcommittee on Metadata Standards is currently conducting a survey to determine how many of the 5,800 finding aids in the OAC also have a MARC record. This survey is the first step towards developing a plan to create MARC records for each of the finding aids in the OAC. Most of the UC campus library finding aids do have MARC records that are available through the Melvyl union catalog (the online library catalog for the UC system). The MARC records contain a URL that links to the finding aid in the OAC database.
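The link from a MARC record to an online finding aid is conventionally carried in the 856 field (Electronic Location and Access), with the URL in subfield $u. The following is a minimal sketch of that idea, not a description of how the OAC or Melvyl build their records: the class name, the helper function, the plain-text field layout and the example.org URL are all hypothetical, and a real system would use a MARC library rather than string handling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Marc856:
    """A MARC 856 field reduced to two subfields (illustrative only)."""
    url: str                         # subfield $u: URI of the online resource
    materials: str = "Finding aid"   # subfield $3: what the link points to

    def to_text(self) -> str:
        # Indicators "4" and "2": access method HTTP, link to a related resource.
        return f"856 42 $3 {self.materials} $u {self.url}"


def finding_aid_url(record_fields: List[str]) -> Optional[str]:
    """Return the $u value of the first 856 field in a record, if any."""
    for field_text in record_fields:
        if field_text.startswith("856") and "$u " in field_text:
            after_u = field_text.split("$u ", 1)[1]
            # Stop at the next subfield marker, if one follows.
            return after_u.split(" $", 1)[0].strip()
    return None


# Hypothetical record: a title field plus a link to a finding aid.
link = Marc856("http://example.org/findaid/hypothetical-collection")
record = ["245 10 $a Hypothetical family papers", link.to_text()]
print(finding_aid_url(record))
```

A survey such as the one described above amounts to checking, for each finding aid, whether some catalog record carries an 856 link pointing back to it.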

The OAC co-ordinates EAD creation for the finding aids for the digital objects. The OAC Operations Group is continually working towards the further development of procedures for delivering new and maintaining existing finding aids and is currently discussing the formulation of a quality control tool for OAC EAD based on the “best practices” and metadata standards for processing new finding aids in EAD. The OAC also identifies or develops further tools to facilitate EAD mark-up of new finding aids for the digital objects.

As each project creates the metadata for each digital object, e.g. EAD, TEI Header and MOA2 (see Bancroft report, 2501AG09), the OAC extracts information for the finding aids for online access. The in-house tools chosen by the OAC to catalog its own digital archives are an Access database and FileMaker Pro. All forms of metadata are recorded: the original object, the digital object, the digitization process, technical details, staffing details and administrative information. The digitizer records the administrative and structural details, while an archivist records the descriptive metadata. The metadata catalog and the main catalog are linked by the MARC 856 field.
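The division of labor described above, with an archivist supplying descriptive metadata and a digitizer supplying administrative and structural metadata, can be sketched as a simple record type. This is an illustrative sketch only: the class, the field names and the role mapping are hypothetical and do not reproduce the OAC's Access or FileMaker Pro schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Which role fills in which category, following the description in the text.
# The category names are illustrative labels, not an OAC schema.
ROLE_BY_CATEGORY = {
    "descriptive": "archivist",
    "administrative": "digitizer",
    "structural": "digitizer",
}


@dataclass
class DigitalObjectRecord:
    """One catalog entry for a digitized item (hypothetical structure)."""
    identifier: str
    descriptive: Dict[str, str] = field(default_factory=dict)
    administrative: Dict[str, str] = field(default_factory=dict)
    structural: Dict[str, str] = field(default_factory=dict)

    def missing_categories(self) -> List[str]:
        """List the metadata categories that are still empty."""
        return [name for name in ROLE_BY_CATEGORY if not getattr(self, name)]

    def who_to_ask(self) -> List[str]:
        """List the roles responsible for the empty categories."""
        return sorted({ROLE_BY_CATEGORY[name] for name in self.missing_categories()})


# The archivist has described the item; the digitizer has not yet added details.
record = DigitalObjectRecord(
    identifier="obj-001",
    descriptive={"title": "Hypothetical photograph"},
)
print(record.missing_categories())
print(record.who_to_ask())
```

Keeping the categories separate makes it straightforward to see which member of staff still needs to complete a record before it is exported.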

 

4.4       Quality Control

OAC has reached a point in its development where it is reviewing the digital imaging process and recognizing the need to formalize quality control. While the CDL TASW has developed Best Practices for Digital Image Capture, there are no formal quality control procedures in place. This is also a driver of the outsourcing ventures, as the vendors have algorithms for correcting skew and understand issues such as color space, dynamic range and gamma correction, which until now were dealt with uniformly by the institutions. This may have resulted in images that are not of the highest possible quality. It reflects the need for photographers to be involved in the digitization process, as they have the skills and knowledge required to understand and create the best images, whereas cultural heritage professionals with some knowledge of IT are not always the most experienced or knowledgeable people to carry out the physical digitization. The level of quality control required will depend on the project, the volume of material and the practitioners involved.
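Of the imaging issues named above, gamma correction is the easiest to illustrate. It is commonly expressed as a power-law mapping of each channel value, out = 255 × (in / 255)^(1/γ). The sketch below applies that general formula to a single 8-bit value; it is not a description of any vendor's or institution's workflow, and the function name and the sample gamma of 2.2 are assumptions chosen for illustration.

```python
def gamma_correct(value: int, gamma: float) -> int:
    """Map one 8-bit channel value through a power-law (gamma) curve."""
    if not 0 <= value <= 255:
        raise ValueError("value must be in the range 0-255")
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    normalized = value / 255.0                 # scale to the range 0.0-1.0
    corrected = normalized ** (1.0 / gamma)    # apply the power-law curve
    return round(255.0 * corrected)            # scale back to 8 bits


# With gamma above 1, mid-tones are lifted while black and white stay fixed.
for sample in (0, 64, 128, 255):
    print(sample, gamma_correct(sample, 2.2))
```

Applying one fixed curve to every image is simple, but it ignores differences between source materials and capture devices, which is one reason such settings benefit from specialist judgment.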

 

4.5       Access to Digitized Materials and Evaluation

Currently, access to the digital objects is provided by drilling down through the hierarchy of EAD finding aids available in the OAC. The OAC is now exploring the creation of a database composed solely of digital objects, so that the user could access images either by browsing and searching finding aid data or directly through an image database. This is a dynamic process; scholars rely on navigating the finding aids to learn about collections and review digital objects. But the complex finding aids may not be suited to the many users across the community who are unfamiliar with primary resources. Collaborating with CDL’s eScholarship program, the OAC is exploring means to enhance access to finding aids and digital objects available through JARDA. A group of faculty, K-12 educators, digital librarians and archivists has been identified and is meeting to discuss interface development, the interpretive tools required, collection development strategies and metadata standards that will improve access to the OAC and its virtual archives resources.
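The "drilling down" described above follows the nested component structure of an EAD finding aid, in which digital objects are attached to components deep in the hierarchy. The following sketch walks a small, made-up XML finding aid and lists each digital object together with the path of titles leading to it, which is roughly what a direct image database would need to extract. The sample document, titles and URL are hypothetical, only a handful of EAD elements are handled, and the code assumes an XML encoding with no namespace rather than the SGML files the OAC held at the time.

```python
import xml.etree.ElementTree as ET

# A tiny, invented finding aid using a few common EAD elements.
SAMPLE_EAD = """
<ead>
  <archdesc level="collection">
    <did><unittitle>Hypothetical Family Papers</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Photographs</unittitle></did>
        <c02 level="item">
          <did>
            <unittitle>Portrait, undated</unittitle>
            <dao href="http://example.org/images/portrait.jpg"/>
          </did>
        </c02>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def is_container(tag):
    """True for <dsc>, <c>, and numbered components such as <c01>."""
    return tag in ("dsc", "c") or (tag.startswith("c") and tag[1:].isdigit())


def digital_objects(ead_xml):
    """Return (title path, href) pairs for every <dao> inside a <did>."""
    results = []

    def walk(node, path):
        did = node.find("did")
        title = ""
        if did is not None:
            title = did.findtext("unittitle", default="").strip()
        here = path + [title] if title else path
        if did is not None:
            for dao in did.findall("dao"):
                results.append((here, dao.get("href")))
        for child in node:
            if is_container(child.tag):
                walk(child, here)

    walk(ET.fromstring(ead_xml).find("archdesc"), [])
    return results


for path, href in digital_objects(SAMPLE_EAD):
    print(" > ".join(path), "->", href)
```

Keeping the title path alongside each image preserves the archival context that scholars rely on, while still allowing the images to be searched directly.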

At the same time, as XML becomes more widely adopted and web browsers begin to support it, the CDL is developing a plan to convert the SGML EAD finding aids to XML and is concurrently identifying new software to succeed Dynaweb. The OAC would like to improve upon the efficiency of the Dynaweb software and take advantage of XML tool development targeted for browser deployment.


[3] http://www.cdlib.org/about/faq/units.html#collections

[4] http://www.cdlib.org/about/faq/#organization




abp~04/02