HATII interviewed Lee Ellen Friedland, Senior Digital Conservation Specialist at the Preservation Reformatting Division of the Library of Congress, on September 21, 2000. The particular aim of the Division’s Preservation Digital Reformatting Program is to preserve fragile or unstable materials while also enabling them to be accessed without further endangering their longevity. The Program therefore began the process of creating facsimiles, whether in the form of microfilm, paper copies or digital reproductions.
The Preservation Reformatting Division’s digitization strategy developed from a need to incorporate digital technology into the preservation reformatting program. The program tapped into the annual collection surveys undertaken by divisions throughout the Library, which nominate material for preservation. As collection surveys are carried out across the Library of Congress, the whole institution is aware of this process and staff are involved. The curatorial and collection management staff in each Division carry out each survey. A lesson learnt in this process was to issue guidelines for selection criteria.
The collection surveys are used to establish priorities for digitizing holdings. The selection criteria establish which material is most appropriate for analog versus digital preservation. The division makes these decisions, and a workplan is then devised based on these criteria. These priorities are formalized in a strategic policy statement, which was based on that for preservation microfilm and photocopying. The Senior Digital Conversion Specialist and ultimately the Division Chief, who has to manage the budget, approve these decisions.
The objectives that this policy attempts to achieve are, firstly, the most appropriate preservation strategy for the collections and, secondly, the use of digital material in the most appropriate way. In these respects the project has been successful. The recommendation from this experience is to have sound selection criteria (for the Library of Congress environment) and to ensure they are explicit but flexible. See the policy documents on the web: http://lcweb.loc.gov/preserv/prd/presdig/presselection.html
Overall obstacles to planning and building the digital deliverables are finite resources, the under-supply of certain technical support and a slow bureaucracy for acquiring tools.
The guiding criteria for selecting materials for digitization were primarily conservation, historical and cultural value, preservation and security, followed by improved functionality and enhanced access. The project did not feel that any further prioritization applied among these criteria, nor did the criteria change over time.
Developing the digitization program took place in co-operation with other parts of the Library of Congress. Based on this experience the program recommends that work has to be collaborative, and that it is essential to have good communication and clearly articulated expectations of what other parties will do.
The project is ongoing, with no anticipated end date.
The first and main purpose in creating the digital deliverables is preservation; the second is to be able to serve the digital surrogate regardless of the level of distribution (e.g. reading room only, campus-wide, web, etc.); and the third is experimentation.
The program did produce an informative statement of intent that was explicit about its rationale, long-term sustainability and level of faithfulness to the original.
The type of source material digitized includes:
The materials digitized include any paper-based or photographic material. Everything recommended by the preservation strategy is digitized (i.e. not only a “sample”, or “entire body of material”). The program does not intend to re-purpose the digital deliverables.
The following standards, guidelines or tools are used for representing content:
The following standards, guidelines or tools are used for describing content:
The program looked to existing guidelines for digitizing particular document types when planning its digitization strategy, and built on the American Memory and Making of America (MOA) models.
The following standards, guidelines or tools are used for representing structure:
In relation to standards in general, the program would recommend using standards to serve goals (e.g. TEI and MARC are used officially in LC).
The program did not undertake an evaluation of the target audience, but other users could use the digital deliverables. The main limitation to the use of these deliverables is intellectual property rights (IPR) and this is clearly stated.
Advice on managing the program was available in-house (via experience of senior program staff and division). The management of the program is fully integrated into the preservation services, and support is provided to all library collections. The digitization program led to the addition of another layer to the divisional level. The formal management procedures in place are those of the whole divisional level program. The managerial quality assurance procedures in place are the tracking of work by the Project Manager and Division Chief.
Both feasibility and pilot studies were undertaken on technical feasibility, workflow analysis, workflow piloting and technology forecasting to provide information to assist project management and planning. These studies led to a refining of processes in the project. In addition, periodic, limited benchmarking studies were undertaken to determine the viability of scheduling tasks. The parameters taken into account in these benchmarking exercises were the technical results of conversion and quality procedures. The tools used to aid scheduling and planning include spreadsheets and databases. Who undertakes what work is determined by the outsourcing of conversion, which leaves internal staff to manage the program. Both job descriptions and performance indicators are used, the latter for contract deliverables.
Digitization is outsourced as it is not feasible to provide these services in-house. The vendor uses flatbed scanners, film scanners, high-end professional digital cameras and book cradles. Guidelines for data capture procedures are formally established in the contracting instruments.
The people working on the project have a variety of backgrounds: library, graphic arts, photography, humanities research and project management. Staff members were not redeployed from other areas, but some microfilm technicians assist in the image quality review process. Advice on the technical aspects of digitization was predominantly available in-house.
The training needs of the project team were assessed in the applied environment of the project and were identified as project management, application of technical standards and the preparation and handling of digitizing equipment. Training was organized in-house, using the organization’s own consultants, by attendance at external courses, by independent study and by learning on the job. This organization of training has met the needs of the project.
The project is aware of the copyright position of the digital deliverables, but the LC does not own the copyright to all the original materials. Every variant of copyright ownership is encountered, the predominant form being copyright held by the original donor. Where material that is in copyright is digitized, this is carried out under the legal provisions for libraries. The project declares the copyright status of the digital deliverables in a collection level statement, in accompanying documentation at the item level, and/or in bibliographic and administrative metadata or text headers. The copying rights for users vary for every set of materials. No definitive statement can be made on the format of the digital deliverables that users are able to view and download, as the delivery end is the least established and is still in development. At this time no electronic management systems, such as watermarking, are in use.
The digital preservation program is fully integrated into the LC’s preservation and conservation work and preservation analysis is undertaken on material as part of each division’s collection survey. Any conservation work on the original materials, such as re-housing, is carried out as part of this. Whether any original material is modified or degraded to carry out digitization depends on the collection materials themselves. A risk assessment in preparing materials for digitization is carried out by conservators and an appropriate plan is created with their advice. This may involve specifying special equipment such as book cradles, special ventilation or workspace requirements. If any preparation of materials is required before digitization, preservation staff carry this out. Once the material has been digitized, access to the originals is sometimes restricted and this is determined by the curatorial and custodial authorities for that collection.
The cataloging system in place prior to digitization is the LC online catalog that incorporates a wide variety of access aids and to which the project has full access. All of the available information from these records is used in the digitization process. This is supplemented by locating further core reference material for the digital deliverables.
The materials sometimes have to be altered from their original format for the digitization process to take place, but material is only rejected if there is irreparable damage to the original. Where only reproductions or intermediaries are available the project digitizes from these. The reproductions or intermediaries used take the form of photocopies, photographic prints and microfilm.
The original material is cataloged in the LC catalog, which the collection division facilitates and co-ordinates. The digital deliverables are cataloged using MARC and USMARC standards.
The format for retroconverted text-based deliverables is SGML markup (TEI in Libraries Guidelines Level 1 for archival versions) and PDF for output only. None of these texts contained non-Latin scripts. The OCR software used to convert digital images is Prime Recognition.
TIFF file format is used for capturing, preserving and delivering images. GIF and PDF are also used for delivering images.
The project retains original scans in uncompressed form although not all master files are uncompressed. The project’s vendor checks the dynamic range of the equipment and undertakes processing on images for color balance, gamma corrections, etc.
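As an illustration of how one might confirm that a retained scan really is uncompressed, the sketch below reads the Compression tag from a TIFF file's first image file directory. It is not the program's or its vendor's actual procedure, which the interview does not describe; the function names are hypothetical. It relies only on the published TIFF 6.0 layout (tag 259, where a value of 1 means no compression) and handles classic TIFF, not BigTIFF.

```python
import struct

COMPRESSION_TAG = 259  # TIFF 6.0 tag number for Compression
UNCOMPRESSED = 1       # a value of 1 means no compression


def tiff_compression(data: bytes) -> int:
    """Return the Compression value from the first IFD of a TIFF byte string."""
    order = data[:2]
    if order == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif order == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file: bad byte-order mark")

    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")

    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(entry_count):
        # Each IFD entry is 12 bytes: tag, field type, count, value/offset.
        entry_offset = ifd_offset + 2 + 12 * i
        tag, _field_type, _count = struct.unpack_from(
            endian + "HHI", data, entry_offset
        )
        if tag == COMPRESSION_TAG:
            # A single SHORT is stored left-justified in the 4-byte value field.
            (value,) = struct.unpack_from(endian + "H", data, entry_offset + 8)
            return value
    return UNCOMPRESSED  # the tag defaults to 1 when it is absent


def is_uncompressed(data: bytes) -> bool:
    """True if the first image in the TIFF is stored without compression."""
    return tiff_compression(data) == UNCOMPRESSED
```

A file read with `open(path, "rb").read()` can be passed straight to `is_uncompressed`; only the header and first directory are inspected, so multi-page files would need the remaining directories walked as well.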
Both total checks and statistical sampling are in place as quality control procedures, while metadata recording and delivery are reviewed (tools are also being developed). The effect of these quality control procedures on workflow and policy decisions varies project by project.
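To make the idea of statistical sampling concrete, here is a minimal sketch of how a batch of delivered images might be sampled for manual review. The interview does not state the sampling rate or acceptance threshold the program uses, so the 10% fraction, the 2% tolerance and both function names are assumptions chosen purely for illustration.

```python
import random


def draw_qc_sample(image_ids, fraction=0.1, seed=None):
    """Return a random subset of image_ids to inspect by hand.

    fraction is an assumed sampling rate, not a figure from the program.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ids = list(image_ids)
    if not ids:
        return []
    # Always inspect at least one image, however small the batch.
    sample_size = max(1, round(len(ids) * fraction))
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return sorted(rng.sample(ids, sample_size))


def batch_passes(defects_found, sample_size, max_defect_rate=0.02):
    """Accept the batch if the observed defect rate is within tolerance.

    max_defect_rate is an assumed threshold for illustration only.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return defects_found / sample_size <= max_defect_rate
```

A total check corresponds to `fraction=1.0`, where every image in the batch is reviewed; a batch that fails the sampled check could then be escalated to a total check or returned to the vendor.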
Users do not have to pay for the use of the digital deliverables. Access is provided through an open access catalog, open access to the catalog plus the materials, and restricted on-site access.
The usage of the digital materials is monitored by automatic data capture.
Potential users of the digital deliverables are informed about their availability through website announcements, press releases, articles in print media, conferences, meetings and electronic mail shots. The project has not yet decided which of these media is the most effective.
No front-end, formative or summative evaluations have been undertaken.
The project receives its funding annually from the LC as a percentage of an appropriated budget. In the long term the project feels that standards have saved money.
The digital deliverables will be available indefinitely and the project does not need to rely on self-generating funds to sustain the resource. The project does not have an exit strategy.