II. Project Planning

 

Introduction

Planning is the first and arguably the most important step in any digitization project. Lest this sound like a platitude, it is worth noting that far too many projects are undertaken without adequate thought to the activities involved, the staff required, or the technical exigencies of the work. The need for good planning may be self-evident, but in practice it is often difficult to anticipate all the areas in which forethought is essential. Good planning for any project—even for managers who have successfully completed previous projects—requires a large number of decisions on questions such as the following:

This kind of planning is one of the most intellectually challenging of the project tasks, and may well be time-consuming. There may also be pressure to hurry this step, from a desire to show visible progress or in response to institutional pressure. But an investment in this kind of planning will be amply repaid over the life of the project: in the quality of the products, in smooth workflow, in staff morale, and not least in the total project cost. The goal of this section is to sketch out the parts of the planning process and indicate the important decisions—assessing the resources needed to complete the project, the staffing and equipment required, the choice and role of metadata, and the overall project management—and how to go about making them effectively. The checklist below gives a brief inventory of the resources required to undertake a digitization project. Not all projects will require all the resources listed, but this list will show the range of needs you should anticipate.

Technology develops and changes so quickly that decisions like those listed above may seem almost impossible to make with any confidence. Information on the array of standards, specifications, equipment, skills, and techniques not only presents a daunting learning curve, but also a welter of detail that can be very difficult to track. For the project planner, however, it is not these details that really inform good decision-making. It is much less important to know what sampling rate a particular piece of equipment offers than to understand how sampling works and how it can affect the quality of digital conversion. These underlying principles apply more broadly and change more slowly. Most importantly, though, they represent the level at which good planning takes place; with this knowledge, the planner has the tools to bring together an expert group of staff and consultants and create an effective framework within which they can work. This Guide contains detailed, up-to-date information on best practices in a number of technical areas, but the Guide's greatest and most enduring value for the project planner is its presentation of the more fundamental issues and how they interrelate.

The Guide's introductory section has already addressed the first question on the list above: What work needs to be done? By emphasizing the identification of audience and of your own institutional location and goals, the introduction contextualizes this decision and reminds us to ask "Who needs this work? Who will benefit?" The further ramifications of this question are explored in Section III on selecting materials, which discusses how to assess your collections and set priorities for digitization, and in Section XII on user evaluation, which provides guidance on how to assess the needs of your audience and how this information can shape your digitization strategy. This is also the stage at which you should get the facts and make your decisions concerning rights management, without which you cannot proceed with digitization: you need to establish the intellectual property status of the materials you wish to digitize, and you also need to decide on your own strategy for managing the intellectual property you are about to create. Both of these issues are explored in depth in Section IV. And although the project's final product may seem impossibly remote at this stage, you need to consider how the results will be distributed: not only what technologies you will use, but also how you will control access and ensure that you reach your intended audience. Section X covers these issues in detail.

The question of how the work will be done—the specifications, standards, and procedures you need to establish—has many facets which are addressed at various points in the Guide. Foremost among these is the question of standards: by using standards-based approaches wherever possible, you increase the longevity, portability, and interoperability of your data. You need to be aware of the standards that apply to the kinds of digitization you are undertaking, and these are described in detail in the sections on digitizing text, images, and audio-visual materials. Given the complexity and breadth of most standards, though, you also need to be aware of the best practices that apply to your community. For instance, both documentary historians and linguistic researchers use the XML-based Text Encoding Initiative Guidelines to encode textual data, but each group uses the standard in different ways that serve their particular needs. While you are considering the specifications for your data, you should also think carefully about how to capture and represent the metadata you will need to administer your digital materials and enable them to be used effectively. The Guide includes an appendix on metadata which describes the various types and their uses. The relevant sections of the Guide also provide pointers to specific information on best practices for particular digitization communities.

The question of "how" also involves decisions about equipment. For the project planner, these questions are most usefully addressed not at the level of specific brands and models, but by thinking about the functionality you require and the tradeoffs you are willing to make (for instance, whether keeping costs low is more important to the project's overall success than achieving the highest possible capture standard). The sections on images and audio-visual materials discuss how to approach these decisions; more specific information on particular kinds of equipment can be found in the appendix on equipment. Finally, you need to establish an effective workflow for your project. At the highest level, this includes project management strategies, which are discussed later in this section, and quality assurance methods (discussed in Section VIII). But in addition you need to consider how you will store, manage, and track your digital objects, which is addressed in detail in Section XIII on digital asset management.

Staffing issues—who should do the work—are closely related to the points just mentioned, since your decisions about methods and procedures may be difficult to separate from the staff resources you actually have available. Few projects have the luxury of hiring all new staff to an arbitrary standard of skill and experience. Further on in this section we discuss human resources: how to construct job descriptions and identify skilled staff, and how to set up a management and advisory framework that allows your staff the autonomy to do their jobs effectively. In Section IX, Working With Others, we consider a range of collaborative and cooperative relationships that may expand your staffing options, including project consultants, vendor outsourcing, collaboration with other institutions, and internal cooperation.

Once you have worked through the issues sketched above, you will be in a position to assess the practical scope of the project: how long the work will take, and how much it will cost. Of all the questions addressed here, these may be the most vulnerable to change over time, as techniques and equipment improve and grow cheaper, and as quality expectations rise. Some guidance on cost estimation is offered later in this section, and also in the sections on specific digitization areas (Sections V, VI, and VII). You should make sure in researching costs to take into account all of the startup and infrastructural costs the project will incur (costs for initial planning, choosing data specifications, building or choosing tracking and documentation systems, training staff, and so forth) as well as the incremental cost of digitizing the materials themselves. This is also an opportunity to consider the scope of your investment and whether this infrastructure can be reused or adapted for further digitization projects once this project is completed.

Finding the funds to undertake the project is the final step, at least logically; a successful funding request will almost always require a thorough consideration of the issues just described. Even if you are fortunate enough to have funding already committed, going through this process will ensure that you spend your resources prudently and receive value for your investment. Funding sources and strategies are discussed later in this section, and also in Section XI on sustainability.

The checklist box below gives a condensed list of the resources you may need to undertake a digitization project. Although not all projects will need all of the resources listed, it gives a sense of the range and options.

 

Checklist Box:

Resources that you will need for a digitization project:

Personnel:
    advisors
    project management staff
    rights specialists
    researchers
    editors
    authors
    digitizers
    catalogers
    technical support/development
    legal advisors

Software:
    operating systems
    applications:
        --> image manipulation
        --> metadata authoring
        --> database
        --> indexing and search engine
        --> web server
    utilities
    server systems
    network clients
    specialist applications/developments

Storage devices:
    local hard drives
    network storage servers
    optical devices (e.g. CD writers)
    magnetic devices (e.g. tape drives)
    controlled storage environment

Network infrastructure:
    cables
    routers
    switches
    network cards
    ports

Consumables:
    stationery
    utilities
    printer cartridges
    lamps (for capture devices/special lighting)
    storage and backup media

Project management:
    preparing bids
    recruitment
    publicity and dissemination
    creation of deliverable product specifications
    design of workflow
    supervision of staff
    quality assurance

 

Resources within your institution

If you are working within an institution that has other digitization projects under way, an examination of the resources already available within your institution is a good starting point. Staff will know if their department or unit has capture devices available or workers with experience of digitization or cataloging. This is an easy first step towards building a resource inventory, although knowing that you have one flatbed scanner, a digital camera and suitable equipment for digitizing audio, as well as people who know how to use that equipment, is not on its own sufficient. A thorough identification of internal resources involves checking that:

Clearly, assessing the adequacy of these resources is predicated on other decisions, such as your workflow requirements; indeed, many of the planning areas discussed in this section are closely interdependent. It should also be apparent why the Guide's introductory section stressed early on that you need to define what you want to do and the audience or audiences you intend to reach (see Section I). A clear statement of objectives (preferably in a formal document that can be shared with staff), combined with the resource inventory, will enable you to assess the suitability of your local resources.

You will make this document an even more effective planning tool by adding information about equipment specification (e.g. computer processor speed, RAM, hard disk capacity) and the results of tests for suitability. Before you can conclude that you have suitable resources you must test them to make certain that they will meet the requirements of the project. The Example Box below, "Resource Inventory and Test", shows what a resource inventory and test for scanners might look like.

 

Example Box:

Resource Inventory and Test:

PCs and Scanners | Functional Requirements | Suitability Test Result
1 Pentium 3, 600 MHz, 128 MB RAM | Must handle processing and manipulation of image files up to 50 MB | Needs more RAM
1 Pentium 4, 1 GHz, 384 MB RAM | | Okay
1 Agfa Arcus | | Okay
1 Agfa DuoScan 1200 | | Transparency tray inadequate

Overall Conclusion: Upgrade one PC and replace one scanner
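An inventory like the one above can also be kept as structured data, so that the overall conclusion is derived from the individual test results rather than maintained by hand. The following is a minimal sketch, assuming a hypothetical `InventoryItem` record and the convention that a test result of "Okay" means the item is suitable; the class name, field names, and helper function are illustrative and not part of the Guide.

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    """One row of the resource inventory (hypothetical structure)."""
    name: str
    kind: str          # e.g. "PC" or "scanner"
    test_result: str   # free-text outcome of the suitability test

    @property
    def suitable(self) -> bool:
        # Assumption: only the literal result "Okay" counts as suitable.
        return self.test_result.strip().lower() == "okay"


# Rows taken from the example box above.
inventory = [
    InventoryItem("Pentium 3, 600 MHz, 128 MB RAM", "PC", "Needs more RAM"),
    InventoryItem("Pentium 4, 1 GHz, 384 MB RAM", "PC", "Okay"),
    InventoryItem("Agfa Arcus", "scanner", "Okay"),
    InventoryItem("Agfa DuoScan 1200", "scanner", "Transparency tray inadequate"),
]


def items_needing_action(items):
    """Return the items whose suitability test did not pass."""
    return [item for item in items if not item.suitable]


for item in items_needing_action(inventory):
    print(f"{item.kind}: {item.name} -> {item.test_result}")
```

Run against the example rows, the sketch flags one PC and one scanner, which matches the overall conclusion in the box.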

 

Most large institutions in the cultural heritage sector will have resources that may be useful to the project but would not necessarily need to be borrowed for the entire life of the project. There may be physical equipment, such as devices for digital capture, analog capture equipment (e.g. record, tape, CD and video players that can be used when converting from analog to digital), network storage devices, or handling equipment and controlled storage for analog material.

Human resources may be even more useful: expertise in digitization, text encoding, networks or web delivery can often be found in-house. Even those institutions yet to carry out any significant digitization will have cognate areas of expertise. These skilled individuals can be difficult to find, so tell your colleagues that you are planning a digitization project and have them consider which skills might be of value to you. For example, the skills, techniques and processes required by digital photography are identical in many areas to those required by analog photography, and the same applies to image processing. Similarly, the standards and methods for creating metadata have their roots in the creation of bibliographic records, library catalogs or finding aids and museum collection management systems. In addition to this, it is important to consider the project team and project management process here. Projects should establish a set of procedures for project management from the very start, identifying goals and time scales as well as tasks and outcomes tied to the availability of specific staff and equipment.

It is much easier to identify potential facilities and expertise within the framework of an institutional digitization policy or corporate technology plan—follow the more detailed questions for your own resources as described above. If such a policy has not already been adopted, it will probably be beyond the scope of an individual project to initiate one. Nevertheless, informal inquiries can still be made relatively easily. Remember that apparently unrelated departments or projects may be useful. For example, a great deal of high-end digital imaging takes place in dental, medical, biological and life science departments. The Internal Resource Identification Question Box illustrates some of the common areas of expertise to be found within an institution.

 

Question Box:

Internal Resource Identification:


Institution Type
Resource Academic Library Museum/Gallery
Imaging Medical Imaging / Media Services / Photographic Services / Library Special Collections / Photographic Dept Imaging / Publications Dept
Metadata Library Cataloging
Finding Aids
Collection Management
Finding Aids
Text Encoding Literature / Language / Computing Science Depts. / Information Management / Library Cataloging / Information Management
Finding Aids
Electronic Texts
Finding Aids / Information Management

 

External Resources

Identifying resources outside your immediate department, unit or institution can be a more difficult process. Success depends upon what type of institution you are, your strengths and limitations, the accessibility of the resources you are seeking, and whether there is scope for collaboration. Guidance from and access to the experience of others are likely to be readily available. The Link Box points you to national organizations that provide information to support digitization projects. Outsourcing can be another way to fill gaps in the resources available locally, by contracting with a vendor, hiring a consultant, or establishing a cooperative relationship with another institution. These options are discussed in greater detail in Section IX, Working with Others.

 

Link Box:

Links to National Organizations Offering Guidance

CLIR: Council on Library and Information Resources: "The projects and activities of CLIR are aimed at ensuring that information resources needed by scholars, students, and the general public are available for future generations." http://www.clir.org/

DLIB Forum: "The D-Lib Forum supports the community of researchers and developers working to create and apply the technologies leading to the global digital library." http://www.dlib.org/

LOC: Library of Congress: "The Library's mission is to make its resources available and useful to the Congress and the American people and to sustain and preserve a universal collection of knowledge and creativity for future generations." http://www.loc.gov/

NINCH: National Initiative for a Networked Cultural Heritage: "A coalition of arts, humanities and social science organizations created to assure leadership from the cultural community in the evolution of the digital environment." http://www.ninch.org/

RLG: Research Libraries Group: "The Research Libraries Group, Inc., is a not-for-profit membership corporation of universities, archives, historical societies, museums, and other institutions devoted to improving access to information that supports research and learning." http://www.rlg.org/rlg.html

PADI: "The National Library of Australia's Preserving Access to Digital Information initiative aims to provide mechanisms that will help to ensure that information in digital form is managed with appropriate consideration for preservation and future access." http://www.nla.gov.au/padi/

AHDS: Arts and Humanities Data Service: "Create and preserve digital collections in all areas of the arts and humanities." http://ahds.ac.uk/

HEDS: Higher Education Digitization Service: "The Service provides advice, consultancy and a complete production service for digitization and digital library development." http://heds.herts.ac.uk/

TASI: Technical Advisory Service for Images: "Advise and support the academic community on the digital creation, storage and delivery of image-related information." http://www.tasi.ac.uk/

 

Resource challenges

There are a number of challenges both in assessing and securing the resources required for the project. Projects that take place in large institutions frequently benefit from a significant amount of non-project-related investment. Such hidden benefits include local area networks, high bandwidth Internet connections, large capacity network-based storage devices, web servers, and technical expertise associated with maintaining and developing these facilities. This infrastructure provides the framework for the specific resources and skills a project needs, and without it many projects simply would never get off the ground. Although institutions are now trying to quantify this input, its actual value is difficult to establish, with the result that projects in well-resourced institutions are able to scale up more quickly but often under-represent the real costs that lie behind their activities.

Equally, less well-resourced institutions and initiatives face an increasing challenge in matching the developments in presentation and delivery of digital resources that larger projects can provide. Frequently, the solution is for small and medium size institutions to develop collaborative projects. The Colorado Digitization Project (http://coloradodigital.coalliance.org/) provides a flagship example of how equipment, staff and expertise can be shared between large and small projects alike, enabling the digitization and delivery of resources that would not otherwise be possible.

Another challenge for digitization projects, large and small, lies in the area of human resources. Content creation is a burgeoning field and although many Internet businesses have failed, those companies such as Getty Images, Corbis, The Wall Street Journal and Reed Elsevier, which have adopted prudent content creation and marketing strategies, are showing steady growth. The finance, commerce, media and entertainment industries all recognize the value and benefits of digital assets, and this places a premium on skilled personnel. Furthermore, the development of staff with digitization skills related specifically to the humanities and cultural field has not kept pace with the growth in the number of digitization projects. Many projects report difficulties in recruiting and retaining staff. Few public sector projects can match the remuneration levels offered by the private sector, but there are strategies you can adopt that enhance your chances of meeting the human resources challenge. These are outlined in the Human Resources Question Box.

 

Question Box:

Human Resources:

 

Funding

Some project staff will be preoccupied with securing adequate financial resources to start, develop and sustain a project throughout its lifecycle. An accurate picture of the financial costs will help you to identify the financial pressure points and to estimate more accurately the overall costs of running the project. The sections below on skills, equipment, and project management will provide points to help you develop accurate project budgets. An accurate profile of project costs helps to minimize the financial unpredictability of the project and improves the probability that it will attract funding. Funding agencies remain attracted by the opportunities for funding initiatives in the heritage sector. The Link Box provides pointers to some major US funders.

 

Link Box:

Potential Funders of Digitization Projects:

 

From the projects surveyed it is evident that most potential funders, particularly in the public sector, require applicants to provide a robust and auditable cost model. How this should be presented may vary from one funder to another, but it can be extremely useful to break down equipment and salary costs on a per unit or work package basis. Not only does it help the potential funders to make comparisons of unit costs between projects within and across heritage sectors, but it also forces you to look at the process and scheduling of work in detail. The accuracy of these figures will be greatly improved by conducting a pilot study or by adopting a cost model from a previous project, even if it needs to be revised in light of the experience of the earlier project.

All the projects surveyed obtained their financial backing from a combination of institutional budgets, public grants, private donation or corporate sponsorship. None of the projects reported serious under-funding, although some found that the distribution of funds created an uneven cash flow, resulting in medium term planning problems. Similarly, none of the projects reported serious concerns about sustainability, even where the source of future funds was unclear. The general absence of plans for self-generating funds or of exit strategies reflects this confident view that income would continue to materialize in the future. A number of projects have recognized that failing to adopt long-term financial planning is less than prudent. We recommend that projects set aside time and support for securing further external funds and for exploring the potential for self-generating income. Projects should also develop an exit strategy that will secure the maintenance and accessibility of the digital material. These issues are discussed in more detail in Section XI on Sustainability.

 

Cost models

Determining the cost of digital content creation on a per unit basis is extremely problematic. Not only are there no comprehensive cost models available that cover all resource types, but any such model would be hard to apply across the variety of institution types, financial arrangements, prevailing market conditions, nature and volume of material, and resolutions required. Furthermore, the cost bases for creating, storing and delivering digital resources can be quite different, and trying to establish a single cost per unit can disguise these differences or ignore them altogether. In spite of these problems it is possible to establish some bases for per unit cost.

At the simplest level a project can take the total funding required and divide it by the total number of units that it intends to digitize. For example, total project funding of $300,000 divided by 40,000 units equals $7.50 per unit. However, such a figure can be extremely misleading. Firstly, there will be variation in per unit cost according to the type of material digitized. The creation of OCR text pages will differ from reflective color still images, which will be different again from 16mm moving images or 78 rpm records. Even within material of the same broad type there will be variation. Black-and-white negatives are likely to be more expensive to scan than black-and-white prints, since tone reproduction needs to be set image-by-image in the former case, while the same settings can be applied to a group of photographic prints. Even if a project is dealing with material of a uniform medium and size, variations can occur that affect unit costs. A collection of bound, legal-size books may have volumes that cannot be opened beyond a certain degree for conservation reasons. This may require a different capture technique, for example capturing pages from above rather than inverted. Some volumes may have details that demand a higher capture resolution than the rest of the collection, while others may require curatorial intervention to prepare them for digitization. The extent to which projects need to take account of such details will vary, but at the very least different material types should be distinguished, as well as same-type materials that require different capture techniques.
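The arithmetic above, and the reason a single average can mislead, can be sketched in a few lines of Python. The overall totals ($300,000 and 40,000 units) come from the example in the text; the split by material type is invented purely for illustration and does not reflect costs reported by any surveyed project. The function name `cost_per_unit` is likewise hypothetical.

```python
def cost_per_unit(total_cost: float, units: int) -> float:
    """Return the average cost of digitizing one unit."""
    if units <= 0:
        raise ValueError("units must be a positive number")
    return total_cost / units


# Figures from the text: total funding divided by total units.
overall = cost_per_unit(300_000, 40_000)

# HYPOTHETICAL breakdown of the same totals by material type,
# given as (cost, number of units). These figures are illustrative only.
breakdown = {
    "OCR text pages": (60_000, 30_000),
    "reflective color still images": (90_000, 9_000),
    "78 rpm records": (150_000, 1_000),
}

per_type = {
    material: cost_per_unit(cost, units)
    for material, (cost, units) in breakdown.items()
}

print(f"overall: ${overall:.2f} per unit")
for material, unit_cost in per_type.items():
    print(f"{material}: ${unit_cost:.2f} per unit")
```

Under these assumed figures the overall average stays at $7.50, while the per-type figures differ widely, which is why material types are better costed separately.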

The cost items that go to make up a per unit calculation also require consideration. Should pre-digitization conservation work, handling time, programmers and management staff be included in addition to capture equipment and staff? In practice, projects need both figures: one limited to capture and one covering the whole project. This is best achieved by first calculating the costs directly related to capture on a per unit basis, which makes it easier to compare the cost-effectiveness of different techniques. Non-capture-related items can then be added to provide a total project cost, and a second per unit calculation can be carried out if required. The list box below provides an indication of how these different factors can be differentiated. It is common practice to calculate costs for audio-visual material on a per minute basis.
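The two-stage calculation described above might be sketched as follows. This is a minimal illustration, not a cost model: the function name, the cost categories chosen, and every dollar figure are assumptions made for the example, and the unit could equally be a minute of audio-visual material.

```python
def two_stage_unit_costs(capture_costs, non_capture_costs, units):
    """Return (capture cost per unit, total project cost per unit).

    capture_costs and non_capture_costs map a cost item to its amount.
    """
    if units <= 0:
        raise ValueError("units must be a positive number")
    capture_total = sum(capture_costs.values())
    project_total = capture_total + sum(non_capture_costs.values())
    return capture_total / units, project_total / units


# HYPOTHETICAL figures for a single media type with uniform capture settings.
capture_costs = {
    "capture equipment": 40_000,
    "capture staff": 60_000,
}
non_capture_costs = {
    "pre-digitization conservation": 50_000,
    "handling time": 30_000,
    "programmers": 70_000,
    "management staff": 50_000,
}

capture_unit, project_unit = two_stage_unit_costs(
    capture_costs, non_capture_costs, units=40_000
)
print(f"capture only: ${capture_unit:.2f} per unit")
print(f"whole project: ${project_unit:.2f} per unit")
```

Keeping the two figures separate lets the capture-only figure be compared across techniques, while the whole-project figure is the one that belongs in a funding request.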

 

List Box:

Capture Cost Factors:

(per unit for a single media type with uniform capture techniques and settings). It is important to note that digitization capture is actually the least costly part of the whole process.

Non-Capture Cost Factors:

 

Some sites with detailed information on costing are listed below.

 

Key Sites with resources on costings:

 

Human Resources

A project’s long-term success depends on the accurate assessment of the required human resources, and producing a map of available and unavailable skills is a valuable starting point. Institutions vary in their areas of expertise and different types of project require different skills. Nevertheless, from the projects that we surveyed it has proved possible to develop a basic template of the people and skills required in realizing a digitization project. The requirements can be scaled according to the size of the project envisaged.

 

Job descriptions, performance indicators, training

Comprehensive job descriptions are indispensable, regardless of the project or institution. While job descriptions are not always required by the host institution, employment law often demands them. Funders are increasingly expressing an interest in viewing job descriptions as part of the application process as this provides them with a richer overview of the project. It is worthwhile developing an outline of job descriptions before the project reaches the recruitment stage. This is useful to determine the delegation of work, how jobs interrelate, which posts can be tailored to existing skills and which can be identified for external recruitment or outsourcing. A useful process for developing accurate job descriptions is to set out a list of all the tasks required for a post and then rank them from highest to lowest priority or into essential, desirable and non-essential categories. Next, compile a corresponding list linking these tasks to the skills required, including any particular knowledge or qualification. Alongside this, compose a description of the experience or background required for these skills. Finally, review the original tasks and their priority to ensure that a realistic and coherent job description is produced. A resource which has been developed by the Association for Computers and the Humanities is a database of jobs in this field—it may be consulted by projects for guidance in drafting job descriptions, and can also be used to publicize new jobs to a focused audience of candidates. See http://www.ach.org/jobs/ for more information.

 

Example Box:

Sample Job Description

Job title: Digital Library Research Assistant

The Digital Library Research Assistant will play an integral role in the university's digital library projects, the goal of which is to bring a wide range of source materials to as large an audience as possible. The DLRA has responsibility for overseeing initial scanning and data capture, creating and reviewing metadata, and performing quality assurance checks. With other project members, collaborates on project publications and research.

Job requirements: Bachelor's degree and one to three years' experience; basic computational skills, and expertise in at least one area of the humanities. Advanced degree and three to five years' experience preferred. Familiarity with relevant encoding and metadata standards, including SGML/XML, METS and Dublin Core, is highly desirable. Must be a self-directed team worker with strong motivation and the ability to take initiative. Needs good communication skills (oral and written) and willingness to work collaboratively.

 

The use of performance indicators appears to be on the increase. They can have a positive impact, not least by providing a way of formally identifying training requirements. While most projects assess training needs on the job as an informal exercise, formal methods encourage appropriate training solutions to be planned and resourced in advance.

There is a close interplay between performance indicators, job descriptions and training assessments. The job description is very useful in developing meaningful performance indicators. Indeed, a useful starting point for performance review is to evaluate current tasks against those set out in the job description, highlighting whether the original job description was unrealistic, whether workloads need to be re-evaluated in the light of practical experience, or whether a skills shortfall needs to be addressed. The aim of addressing training requirements is to ensure that future tasks can be achieved and that the project will not encounter a skill shortage.

 

Managing the skills base

It is vital to ensure that a project is able to draw on the right balance of skills. The challenge is to determine the skills of individuals and how they can most effectively contribute to the project. The key to successful delivery of projects is management. The diagram below incorporates elements from all of the projects surveyed, from the smallest to the largest, and illustrates the general structure that may be used to manage the project's skills base.

[Diagram: general structure for managing the project's skills base]

The steering group functions as an executive board and includes all constituents who are directly involved in the project, even if not employed by it, such as curators, archivists, subject specialists and education officers. In practice it is common for the steering group to be an existing committee within an institution.

The advisory committee is a broader-based group, providing general advice on the project's focus and direction. Members usually include the steering group, with additional appointments from external organizations bringing particular areas of expertise, such as evaluation, to the initiative. There may be more than one advisory committee, or the advisory committee may be broken down into sub-committees, each of which supplies more focused technical, academic or editorial decision-making support. This is the case with the Perseus Project at Tufts University, which has separate Technical and Academic Advisory Boards as well as a Steering Group to provide general project management. (Read Interview 28.2 for details on this arrangement.)

It is essential to have a single project manager who is employed by the project, with responsibility for its daily management. In most cases the project manager provides the necessary project management experience, supplemented by internal or external advice. An institution needs to assign both accountability and authority to the project manager position, so that the process is not bogged down by myriad interactions with the advisory group or groups to deal with daily operations. In content creation projects it is unusual to employ external consultants to handle project management.

 

What skills are required?

There are four main areas that will require staff with identifiable skills. These skill areas may be provided within a single project, dispersed across a collaborative project, or outsourced.

In smaller projects staff may carry out tasks in more than one area: for example, the digitizer may also undertake technical development, or the project manager may take on metadata creation. In larger projects, such as SHOAH or the Genealogical Society of Utah, the duties of staff are so extensive that this is not feasible.

Project managers will have to decide whether to hire new staff with the required skills or to re-deploy existing staff from other areas of the institution. We found that many projects prefer the former, with two notable exceptions. First, there is a discernable trend for photographers to be employed for high-end digitization work. Projects have found that better-quality images are produced through training a photographer in digitization rather than trying to equip a digitizer with photographic skills. The second exception is the tendency to re-deploy or train existing cataloging staff in metadata creation. This is a logical progression for staff who will already have considerable experience in creating bibliographic records, collection management records, finding aids or catalogs, frequently in an electronic form such as MARC.

Another decision concerns background skills. With the exception of some technical posts, we noted a clear preference for staff with arts, humanities, library, museum or gallery backgrounds, or at least some experience or interest in the subject area of the collection. There may sometimes be advantages in not having such a specialization. For keyed-in text transcription, staff without subject knowledge are more likely to enter exactly what is on the page rather than interpret the contents and enter what they think is in the text. On the other hand, subject knowledge can be exceptionally useful in gauging what areas of the content should be focused upon, deciphering difficult materials, or recognizing how areas of the content should be marked up.

When you are trying to find staff with appropriate skills, remember that some projects have benefited from using student labor and volunteers. The ability to draw on student labor represents a significant benefit for university-based projects. Projects such as those based at the University of Virginia Library have been able to build large and diverse digital collections because they are able to draw upon a pool of skilled, motivated and affordable labor. Projects that recruit student labor have invested considerably in training, adopted flexible working practices and tailored the work around the students' educational commitments. This approach has the added benefit of equipping students with the skill set required for future work, adding to the pool of available staff.

Volunteers often provide a similar pool of skills, and projects such as the Genealogical Society of Utah have made effective use of this resource. They have found it both necessary and beneficial to invest in appropriate training for the volunteers. Such training should be factored into the project resource plans. In large-scale initiatives, volunteer management and training may become a significant part of the project itself.

The Link Box below provides links to sites that support skills development in digital representation.

 

Link Box:

An increasing number of organizations are offering training in digitization, which generally proves cheaper and far more useful than commercial training courses:

 

Equipment

Because our digitization capabilities are so strongly tied to, and limited by, the developing equipment technology, it is tempting to feel that the available technology should motivate our digitization strategies. On the contrary, it is vital to base equipment requirements on the characteristics of the collection or collections to be digitized and on project needs, and not the other way around.

Although there are significant cost savings associated with outsourcing work to "offshore" production bureaus in Asia, the Far East, Mexico, etc., in cases where unique materials or special collections materials are to be digitized it is important that digitization should take place as close to the original as possible. Hence many projects will need to confront the complex questions of equipment specification and selection. A detailed discussion of matching material properties to hardware and capture settings can be found in Section VII on audio-visual materials. There is also further information on equipment choices in the appendix on equipment. For the moment we will focus on the basic differences in equipment and the technologies employed, so that the correct type of equipment can be procured for a project. Selecting the most appropriate equipment can be time-consuming, but projects should not be deterred by the plethora of manufacturers and their competing claims. For example, the SCAN project (Scottish Archive Network) was initially unable to find a commercially available digital camera that exactly matched their requirements. Instead, they sourced a camera custom-made to their exact specification. This level of exactitude may be out of reach (and unnecessary) for most projects, but it is worth remembering that one need not be entirely constrained by what is commercially available.

 

Principles of digital data capture

Although there is a variety of capture devices for different applications, whether you are digitizing images, text, audio, video or 3D objects, the operating principles are the same. All digital capture devices take a sample of the analog source material to create a digital surrogate. Sampling is characterized by two elements: the sample rate and the sample depth. The sample rate describes how frequently readings are taken of the analog material. For example, in a digital image this would be the resolution: the number of samples per linear inch, expressed as pixels per inch (ppi) or dots per inch (dpi). An image captured at 600 ppi would have had 360,000 samples (600 x 600) recorded per square inch. Similarly, for audio-visual materials the sample rate is the frequency per unit of time at which the source material is sampled. The sample depth is the amount of information recorded at each sampling point. For example, a sample depth of 24 bits would capture 8 bits for each of the three color channels (red, green and blue) at every sample point. For a more detailed explanation of sampling, see the appendix on digital data capture and Section VII on Audio-Visual Materials.
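The arithmetic behind sample rate and sample depth can be illustrated with a short sketch. The function names and the 8 x 10 inch page are assumptions chosen for illustration, and the size estimate ignores file headers and compression.

```python
def samples_per_square_inch(ppi: int) -> int:
    """Number of sample points in one square inch at a given resolution."""
    return ppi * ppi


def uncompressed_image_bytes(width_in: float, height_in: float,
                             ppi: int, bit_depth: int) -> int:
    """Approximate uncompressed size of a scan, ignoring file headers."""
    pixels_wide = round(width_in * ppi)
    pixels_high = round(height_in * ppi)
    total_bits = pixels_wide * pixels_high * bit_depth
    return total_bits // 8


# 600 ppi gives 600 x 600 samples in each square inch.
print(samples_per_square_inch(600))  # 360000

# Hypothetical 8 x 10 inch original at 600 ppi and 24-bit depth.
print(uncompressed_image_bytes(8, 10, 600, 24))  # 86400000
```

Because the sample count grows with the square of the resolution, doubling the ppi roughly quadruples storage requirements, which is worth bearing in mind when estimating storage costs.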

 

Selecting equipment

The medium, format, size, and fragility of the original material are among the primary factors affecting equipment choice. For text documents, flatbed scanners are suitable for single leaf, regular sized documents, provided the material does not go beyond the scanner's maximum imaging area (usually up to approximately US Letter size), or is put at risk by "sandwiching" it in the scanner. Large format flatbed scanners and sheet-feed scanners can handle single leaf, oversized documents. However, sheet-feed scanners put material at greater risk than flatbed scanners as the originals are pulled through a set of rollers. Drum scanners, whose imaging area is usually from 8" x 10" to 20" x 25", and digital cameras can also be used for oversize material, but they are an expensive option compared to flatbed scanners.

Bound pages that cannot be disbound, and pages in bindings that cannot open fully to 180 degrees, require flatbed scanners with a right angle, prism, or overhead capture array. Digital cameras, with appropriate easels, book rests and weights, are a versatile option for bound material. Camera beds or mounts, lighting, lenses, and filters all add to the cost and complication but make digital cameras more versatile tools for capturing manuscripts, bound volumes, original works of art, prints, out-size material and artifacts.

To achieve the highest quality scans of transparent media (e.g. 35mm slides and negatives, 6x4 and large format transparencies and microfilm), specialist equipment such as slide and film scanners, microfilm scanners or drum scanners should be used. Some flatbed scanners, with a dual light source, can handle transparent media, but they often lack the dynamic range of transparency scanners, and you will not achieve as high a quality image as you would with a dedicated film or slide scanner. Dedicated scanners have an inherently higher resolution, appropriate for the small size of the original, hold the transparencies more closely and securely, and frequently have negative color compensation to correct color casts for different types of film.

Audio and moving image materials present their own problems for digital capture. Not only is there a variety of source formats, including wax cylinders, 33, 45 and 78 rpm records, 8-track and cassette tapes, two-inch and VHS video in PAL and NTSC formats, but it is often very difficult to obtain access to analog devices for playback, and linking them to digitizing equipment is also difficult.

 

Definition Box:

Audio-Visual Facilities:

 

Link Box:

There are a number of audio and video digitization projects that are just getting started:

RAI: http://www.rai.it/portale

BRAVA: Broadcast Restoration of Archives through Video Analysis http://www.ina.fr/recherche/projets/encours/brava/

COLLATE: Collaboratory for Annotation, Indexing and Retrieval of Digitized Historical Archive Material http://www.collate.de/index.htm

PRESTO: Preservation Technology for European Broadcast Archives http://presto.joanneum.ac.at/index.asp

AMICITIA: Asset Management Integration of Cultural Heritage In The Interchange between Archives http://www.amicitia-project.de/ami_home.html

 

The 3D representation of objects, from coins to buildings, is at the forefront of current digitization developments. At present the technology can be divided into two broad categories. The first, and simplest, is to create a moving image of an object. This is achieved by moving a digital camera around the object, or rotating the object in front of a fixed camera, while taking a series of still images. These images are then compiled to create a moving image of the object. The most common format for this is QuickTime VR. This is a reliable technology that requires a digital camera and mount or turntable. However, it does not provide a true 3D representation of the object: only two planes are captured and displayed, so the 3D object is still represented using two spatial planes. The viewer cannot manipulate the object, and the views provided are fixed and pre-determined.

Creating a true 3D representation of an object requires that the dimensions and features of the object be modeled. That is, the three dimensions of the object are represented in the computer as a set of coordinates. Attached to this "frame" are the textures of the object to provide the surface details. At present most 3D imaging technology remains in the sphere of industry. The technologies used to capture coordinates, render the model, and interact with the 3D representation (such as haptic feedback systems that allow one to "touch" the object, or 3D printing to create facsimiles) are often quite costly and require far more processing power than the average desktop computer (in 2002) provides. As such, 3D modeling devices remain application-specific, for example body imaging, prototyping or CAD/CAM applications. However, it was not long ago that digital imaging was the sole preserve of medical applications. During the next ten years we should see increasingly cost-effective and user-friendly devices that will bring 3D modeling into the mainstream.

 

Definition Box:

Virtual Reality:

Virtual reality can be described as an interactive, self directed, multi-sensory, computer generated experience which gives the user an illusion of participating in a three dimensional environment, even if a synthetic one. For cultural and heritage institutions, this may mean using virtual reality to create virtual representations of three dimensional objects in their collections or to create representations of environments, such as an Egyptian tomb, an ancient Persian palace, a historic Greek theatre or an ancient landscape. These three-dimensional objects could range from coins, vases, and sculptures to representations of whole rooms of collections.

 

Metadata

Metadata is an indispensable part of any responsible digitization program, and considerable attention has been paid to the definition of high-quality metadata standards for various purposes. (The appendix on metadata provides more detail on different types of metadata, and on specific metadata schemes and their uses.) The availability of accurate metadata is as important as the digital surrogates themselves for accessibility, usability and effective asset management. In many instances institutions will already have substantial metadata about the analog object (for instance, catalog records) much of which can be applied to the digital object. The project will be able to reduce its metadata creation costs by building on existing metadata. When selecting material for digitization you may wish to give priority to material for which partial metadata already exists.

It is crucial to determine the status of the existing metadata when you are assessing resource requirements. In an ideal world the existing catalog or finding aid would be complete and up to date. However, many libraries, archives and museums have a backlog of cataloging work, and part of a collection selected for digitization could fall into this category. Therefore, it may be necessary to devote time to locating missing information for your metadata records. You must then decide whether to seek information just for those fields required for the metadata, or to update the original catalog record in its entirety. Digitization provides an economical opportunity for institutions to expand their metadata, so consider the possibility of seeking extra funds or devoting more resources to this activity. Some of the new elements required for the metadata record of the digital object can be generated automatically: for instance, automatic metadata creation is a feature of much high-end digital camera software and of some OCR systems. Alternatively, a project may need to develop its own system, which can greatly improve the efficiency and accuracy of technical metadata creation. There is a general dearth of metadata tools, which poses a problem for the efficient creation and management of metadata for many projects. There is therefore likely to be a significant element of manual work, whether this lies in adding digital objects to existing electronic catalogs, creating records for web-based delivery such as Dublin Core, or implementing encoded metadata schemes such as EAD. Creating a metadata record will usually take as long as creating the digital surrogate, and if detailed encoding schemes such as Encoded Archival Description or Text Encoding Initiative are used, this process can be considerably longer.
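As a rough illustration of building on existing metadata, the sketch below merges values taken from an existing catalog record with automatically generated technical values and serializes them as simple Dublin Core elements. The record contents, the `build_dc_record` function and the `record` wrapper element are hypothetical; only the Dublin Core element names and namespace come from the standard itself.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dc_record(fields: dict) -> str:
    """Serialize a flat dict of Dublin Core element names to an XML string."""
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # hypothetical wrapper, not part of Dublin Core
    for name, value in fields.items():
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(record, encoding="unicode")


# Values carried over from an existing catalog record (invented example).
catalog_fields = {"title": "Letter, 12 March 1871", "creator": "Unknown"}

# Values a capture system might generate automatically (invented example).
technical_fields = {"format": "image/tiff", "identifier": "item-0001.tif"}

xml_record = build_dc_record({**catalog_fields, **technical_fields})
print(xml_record)
```

A real project would map its own catalog fields to the chosen scheme and would normally validate the output against that scheme; this sketch shows only the general shape of the task.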

 

METADATA RESOURCES:

GENERAL METADATA RESOURCES

  1. Canadian Heritage Information Network Standards Page: http://www.chin.gc.ca/English/Standards/metadata_intro.html
  2. J. Paul Getty Trust, Introduction to Metadata: http://www.getty.edu/research/institute/standards/intrometadata/
  3. Extensible Markup Language: http://www.w3.org/XML/
  4. International Federation of Library Associations and institutions. Digital Libraries: Metadata Resources: http://www.ifla.org/II/metadata.htm
  5. Text Encoding Initiative: http://www.tei-c.org
  6. Metadata Encoding and Transmission Standard (METS): http://www.loc.gov/standards/mets/

METADATA MENTIONED ELSEWHERE IN THE GUIDE

  1. Section III: Selecting Materials: Metadata & Interoperability. The Dublin Core metadata initiative http://dublincore.org/
  2. Section IV: Rights Management: Technologies for Copyright Management and Protection.
  3. Section V: Digitization and Encoding of Text - Text markup schema. Text Encoding Initiative (TEI): http://www.tei-c.org
  4. Section VI: Images
  5. Section VII: Audio and Video Capture and Management
  6. Section VIII: Quality Control and Assurance: Importance of Quality Control and Assurance of Metadata
  7. Section X: Distribution:
    Metadata Harvesting
  8. Section XIII: Digital Asset Management: "Metadata definition and management"
  9. Section XIV: Preservation:
    Institutional Approaches

 

Project Management

Many different approaches to managing projects are possible. While we found little evidence of the conscious adoption of a project management model, such as PRINCE 2 (http://www.kay-uk.com/prince/princepm.htm), most projects implemented many of the key features of successful project management. As understanding of digitization becomes more commonplace it may not be necessary to "hot house" prototype projects in the manner that many early projects experienced. However, it should also be recognized that integrating existing projects into host institutions often adds a layer of bureaucracy.

The Genealogical Society of Utah provides a good example of a comprehensive project management model. Each imaging project undertaken follows six stages:

  1. Negotiation and project administration
  2. Capture, convert, acquire
  3. Image and metadata processing
  4. Storage and preservation
  5. Indexing and cataloging
  6. Access and distribution

All projects will need to consider these six areas in setting up their own project management systems.

You do not necessarily need to adopt all the activities of a project management methodology; rather you need to scale the method to the needs of your project. The whole process should be determined by the project's objectives and rationale for creating the digital deliverable. Each process should be defined, together with the specific objectives to be achieved and activities to be carried out. The various roles and responsibilities should be detailed (defining job descriptions and breaking finances down aid in this — see above) and adapted to the size and complexity of the project. This should enable the efficient control of resources and facilitate regular progress monitoring. Regular reviews should be used to ensure that the project's objectives, which may change during the project lifecycle, are being met. Whatever project management method is adopted, it should provide a common framework and delineate milestones for all elements of the project.

In summary, your project management methodology should make possible:

Other key features are the need for one project manager to have ultimate responsibility and for the project advisory group to provide management quality control and assurance. In distributed projects, site managers are recommended in addition to an overall project manager. Most projects have relied on internal project management expertise, supplemented by external advice. Although many projects started as relatively autonomous, there is a clear trend for project management structures and the project organization to be integrated into the host institution's structure. This may be a natural progression for projects as they mature, but new projects may consider whether they should adopt it immediately.

Work flow and costings

While few of the projects interviewed carried out benchmarking tests, most had conducted pilot studies. These were undertaken for a variety of reasons:

When considering technical forecasting or prototyping, particularly in relation to costs, remember that there may be no corresponding benefit, and if there is a benefit it will vary for different types of content. Few projects in the humanities and cultural sector charge users for the digital deliverables. As such the cost/benefit may simply be realized by the ability of the project to amortize the depreciation on the equipment. A new high-resolution camera may pay dividends for fine textual or line art material, but not so for color images. Similarly, a device that enables the digitization of material that previously could not be captured, such as a 3D modeler, may not make financial sense if a project has to build in a profit or depreciation margin. However, if the device makes an important collection more widely available, the public access benefit may outweigh the financial costs.

Where any form of pilot study is undertaken it is important to build this into the project design and development cycle. For example, the University of Virginia Library's Special Collections department delineates its project work as intricately as possible before extrapolating its workflow and costings. This has given the project reliable data to forecast costs, but there are some areas where measurement has proved inaccurate, such as network transfer rates. The UVA Special Collections department also has a scheduling calendar tied to a tracking database to generate quality control and assurance checks and back-ups. In this respect it is typical of the projects surveyed, all of which use flowcharts, spreadsheets or Gantt charts to plan and monitor their workflow and costs.

If you are considering using a cost model (see above), it is important to include all the relevant costs, not just the obvious items such as equipment and staff time. You will also need to decide on what basis to evaluate — for example, costs per unit to be digitized or costs per hour. The table below provides a checklist of the factors that should be built into a cost model.

Finally, one further area to be aware of as you develop your cost estimates is digital asset management. In digitizing an image collection, for instance, you may well be generating a number of different kinds of digital objects (archival masters, delivery masters, thumbnails and other deliverables), which in turn will require storage, tracking, documentation, and upkeep. This process may require a significant commitment of resources and will need to be planned carefully. Section XIII covers digital asset management in detail.

 

Cost Model Factors

Category     Factors
Equipment    Purchase
             Maintenance
             Repair
Software     Purchase
             Upgrades
Staff        Salary (including benefits and insurance)
             Training
             Recruitment
             Travel & subsistence
Utilities    Heat
             Light
             Water
             Phone
             Postage
Building     Rates
             Maintenance
             Upgrading/expansion
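
The checklist above can be turned into a simple cost model that reports costs per unit digitized or per hour worked. This is a minimal sketch: every figure, the unit count and the hour count are invented for illustration, and the function names are hypothetical.

```python
# Hypothetical annual costs, grouped by the checklist categories.
COSTS = {
    "Equipment": {"Purchase": 12000, "Maintenance": 1500, "Repair": 500},
    "Software": {"Purchase": 2000, "Upgrades": 500},
    "Staff": {"Salary": 60000, "Training": 3000,
              "Recruitment": 1000, "Travel & subsistence": 1500},
    "Utilities": {"Heat": 600, "Light": 400, "Water": 200,
                  "Phone": 500, "Postage": 300},
    "Building": {"Rates": 3000, "Maintenance": 1000,
                 "Upgrading/expansion": 2000},
}


def category_totals(costs: dict) -> dict:
    """Sum the individual factors within each category."""
    return {category: sum(items.values()) for category, items in costs.items()}


def total_cost(costs: dict) -> int:
    """Sum every factor across all categories."""
    return sum(category_totals(costs).values())


def cost_per(costs: dict, quantity: float) -> float:
    """Total cost divided by a basis such as units digitized or hours worked."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return total_cost(costs) / quantity


print(category_totals(COSTS)["Staff"])  # 65500
print(total_cost(COSTS))                # 90000
print(cost_per(COSTS, 30000))           # 3.0 per unit, assuming 30,000 images
print(cost_per(COSTS, 3000))            # 30.0 per hour, assuming 3,000 hours
```

Keeping the factors grouped by category makes it easy to see which areas dominate the total and to check that less obvious items, such as utilities and building costs, have not been left out.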

 

Analysis Box:

Costs of Digitization Programs

There is little information available about costs and this is an area where future work is necessary. In a rare exception, Steven Puglia analyzed the costs of digitization programs, in particular from the Library of Congress Ameritech competition and the National Archives and Records Administration's Electronic Access Report. The costs discussed are mostly projected and estimated costs, a problem discussed in the conclusion, which suggests that further studies are necessary. After an initial discussion of the general costs of projects (it appears that on average, a third of the costs incurred by projects is the digital conversion, slightly less than a third is metadata creation, and slightly more than a third is made up of administrative and quality assurance tasks), the emphasis turns towards long-term maintenance costs. The author suggests that these are often not taken into account along with the project costs.

Three types of maintenance of digital objects are considered, each with mounting costs in relation to the initial costs per image:

In conclusion, it is suggested that digital imaging may not be the best approach for long-term retention of information. Institutions can only justify retention if the images are used. Analog retention is the best way of holding materials in the long-term. In addition, it would be instructive to use figures from final project costs and also examine costs per person and production per person.

For the full report see: Steven Puglia, 'The Costs of Digital Imaging Projects', RLG DigiNews, October 15 1999, Vol. 3 No. 5. http://www.rlg.org/preserv/diginews/diginews3-5.html

 

Conclusion

At the start of any project, project planning feels like a way to exert control, eliminate risk, and guarantee a successful outcome. Certainly without good planning, the likelihood of failure and inefficiency is much greater. But you can be a better project planner by recognizing that the goal is not to eliminate risk but to prepare for it—not to control every variable but to create a project framework within which your team's response to the unforeseen will be resourceful and effective. In the technology domain, change and unpredictability are facts of life, and often represent opportunities rather than disasters for a well-planned project. Your planning goal should be to create a flexible, adaptable system whose staff and procedures can accommodate change. Your aim as a project leader should be to distinguish between what is essential—the central project objectives, the strategic components that will ensure long-term viability—and what is merely instrumental detail.

 


 





abp~03/03