TYPE OF PROPOSAL: Paper
TITLE: OpenText.org: An Experiment in Internet-based Collaborative 
Humanities Scholarship
KEYWORDS: Greek New Testament, Open Source Development, Internet Groupware

AUTHOR: Matthew Brook O'Donnell
AFFILIATION: University of Surrey Roehampton, London and OpenText.org
E-MAIL: m.odonnell@roehampton.ac.uk

AUTHOR: Stanley E. Porter
AFFILIATION: University of Surrey Roehampton, London and OpenText.org
E-MAIL: s.porter@roehampton.ac.uk

AUTHOR: Jeffrey T. Reed
AFFILIATION: OpenText.org
E-MAIL: reed@wtp.net

CONTACT ADDRESS: Centre for Advanced Theological Research
                 University of Surrey Roehampton
                 80 Roehampton Lane
                 London, SW15 5SL
                 U.K.
FAX NUMBER:      (+44) 208 392 3491
PHONE NUMBER:    (+44) 208 392 3000 ext. 4162



The OpenText.org initiative seeks to harness the collaborative effort 
and ideas of scholars of Hellenistic Greek, particularly the Greek of 
the New Testament, through the medium of the Internet. Modelled upon 
the Open Source Software (OSS) movement, it seeks to actively nurture 
the involvement of humanities scholars in the process of corpus 
building and annotation, as well as the analysis of texts and the 
development of tools for this analysis.

New Testament scholars have begun to realize the importance and 
potential of computer resources both for traditional exegetical 
analysis and for newer interpretative models such as discourse 
analysis. Currently there are a number of grammatically and lexically 
annotated texts available and accompanying software for 
concordance-based search and retrieval. These texts and tools have 
been developed by a small and devoted number of practitioners over 
the past 25 years. These developers have tended to follow a closed 
method of development, following what Eric Raymond (1999) has 
described as the 'cathedral style' of project building. This has 
limited the degree of participation for interested New Testament 
scholars, who have tended to fill the role of product consumers. The 
OSS movement adopts a different viewpoint on the role of users, 
seeking to facilitate their role as co-developers. This goal is 
achieved through open access to regular updates of source code and 
the use of Internet collaboration tools such as a mailing list, 
newsgroup or bulletin board (Udell 1999; Preece 2000). The contention 
of OSS enthusiasts, such as Raymond, is that software of quality 
equal or superior to that of commercial closed source products can 
result from such a process.

The OpenText.org project contends that the OSS model of development 
can be adapted to the realm of textual annotation and analysis. These 
are tasks that require detailed and time-consuming analysis, yet hold 
long-term benefits for the whole scholarly community. Biblical 
scholars have tended to work independently in their study of texts, 
carrying out a great deal of linguistic and literary analysis 
summarized in their publications but not accessible to other scholars 
for future work. The Text Encoding Initiative has demonstrated the 
position of textual encoding as a valuable academic discipline in and 
of itself, rather than just a preparatory exercise (Sperberg-McQueen 
1991; DeRose et al. 1990; Renear, Mylonas and Durand 1996). One of 
the goals of OpenText.org is to develop a series of specification 
documents to act as guidelines for the linguistic and literary 
annotation of Hellenistic Greek texts. These specifications are 
developed following the editorial process of the World Wide Web 
Consortium. These documents define XML schemas that scholars can use 
to mark up a particular text or section of text. Scholars are then 
encouraged to contribute the resulting document(s) back into the 
data repository, making them available for use and adaptation by 
other scholars. We are also exploring the possibilities of on-line 
annotation, allowing the logging, co-ordination and editorial review 
of the work carried out by users. The eventual goal of this 
arrangement is the full annotation (with linguistic, literary, text 
critical and contextual information) of a large corpus of Hellenistic 
Greek texts (O'Donnell 1999 and 2000). In addition, OpenText.org 
draws upon the insights of corpus linguistics and functional 
discourse analysis (Reed 1997; Porter and Reed 1999; Porter and 
O'Donnell 2000) to provide a theoretical basis and systematic model 
for the annotation and analysis of texts in a corpus.
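
As a rough illustration of the kind of word-level annotation such a 
schema might permit, and of how a contributed document could then be 
queried by other scholars, the following Python sketch may be useful. 
The element name <w>, the attribute names (lemma, pos, case, tense) 
and the helper functions forms_with and pos_counts are assumptions 
made for this example only; they are not taken from any OpenText.org 
specification. The Greek is transliterated to keep the example plain 
ASCII.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical annotated clause (transliterated Greek). The element and
# attribute names are illustrative assumptions, not an actual schema.
SAMPLE = """
<clause id="c1">
  <w lemma="en" pos="preposition">en</w>
  <w lemma="arche" pos="noun" case="dative">arche</w>
  <w lemma="eimi" pos="verb" tense="imperfect">en</w>
  <w lemma="ho" pos="article" case="nominative">ho</w>
  <w lemma="logos" pos="noun" case="nominative">logos</w>
</clause>
"""


def forms_with(xml_text, **features):
    """Return the surface forms of <w> elements matching every given feature."""
    root = ET.fromstring(xml_text.strip())
    return [
        w.text
        for w in root.iter("w")  # document order
        if all(w.get(name) == value for name, value in features.items())
    ]


def pos_counts(xml_text):
    """Count how often each part-of-speech value occurs in the annotation."""
    root = ET.fromstring(xml_text.strip())
    return Counter(w.get("pos") for w in root.iter("w"))


if __name__ == "__main__":
    print(forms_with(SAMPLE, pos="noun"))
    print(forms_with(SAMPLE, pos="noun", case="nominative"))
    print(pos_counts(SAMPLE))
```

Because the annotation is held in attributes rather than in free 
prose, an analysis recorded by one scholar can be searched, counted 
and extended by another without being redone.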

This paper will provide an overview of OpenText.org and an outline of 
the principles behind the project. It will also describe and 
demonstrate the progress of the project in harnessing the 
collaborative potential of the virtual scholarly community during the 
first nine months. In addition, it will analyse some of the key issues 
faced in the project, such as the difficulty of overcoming the 
individualistic practices of many humanities scholars, fears 
concerning the loss of intellectual property, the use of XML 
encoding schemes and their adoption by non-technical 
scholars, and the copyright of ancient texts and editions. 
Aside from the sociological problems of building an on-line 
collaborative community, two key problems have become clear. The 
first concerns the legal and copyright issues surrounding both 
printed and electronic editions. The OSS movement makes use of a 
number of software licences (GPL, Apache, BSD) to protect the free 
distribution and reuse of the software it produces. OpenText.org is 
involved in producing new editions of Hellenistic texts, particularly 
the New Testament according to Codex Sinaiticus. There is some 
disagreement as to how 'open source' licences can be applied to 
machine-readable texts. The second difficulty relates to the high 
entry level set for participation in the project. It requires at 
least three elements: (1) a reasonable facility in the Hellenistic 
Greek language, (2) an acceptance and understanding of linguistics 
(for the linguistic analysis and annotation of texts) and (3) comfort 
with XML encoding. The first of these cannot easily be removed. The 
development of encoding standards and specification documents 
addresses the second. The use of XML editors and web-applications 
with a clear user interface can partially address the third issue.

This paper supports the view that the OSS process, when properly 
understood (including the different types of participants and the 
roles they fulfil) and adapted, is highly applicable to 
computer-based humanities projects.



References
----------


DeRose, S.J., D.G. Durand, E. Mylonas, and A.H. Renear, 'What is 
Text, Really?', Journal of Computing in Higher Education 1.2 (1990): 
3-26.

O'Donnell, M.B., 'The Use of Annotated Corpora for New Testament 
Discourse Analysis: A Survey of Current Practice and Future 
Prospects', in Porter and Reed (eds.) 1999: 71-116.

O'Donnell, M.B., 'Designing and Compiling a Register-Balanced Corpus 
of Hellenistic Greek for the Purpose of Linguistic Description and 
Investigation', in S.E. Porter (ed.), Diglossia and Other Topics in 
New Testament Linguistics (JSNTSup, 193; Sheffield: Sheffield 
Academic Press, 2000), pp. 255-97.

Porter, S.E. and M.B. O'Donnell, 'Semantics and Patterns of 
Argumentation in the Book of Romans: Definitions, Proposals, Data and 
Experiments', in S.E. Porter (ed.), Diglossia and Other Topics in New 
Testament Linguistics (JSNTSup, 193; Sheffield: Sheffield Academic 
Press, 2000), pp. 154-204.

Porter, S.E. and J.T. Reed (eds.), Discourse Analysis and the New 
Testament: Approaches and Results (JSNTSup, 170; Sheffield: Sheffield 
Academic Press, 1999).

Preece, J., Online Communities: Designing Usability, Supporting 
Sociability (New York: John Wiley, 2000).

Raymond, E.S., The Cathedral and the Bazaar: Musings on Linux and Open 
Source by an Accidental Revolutionary (Cambridge, MA: O'Reilly & 
Associates, 1999).

Reed, J.T., A Discourse Analysis of Philippians: Method and Rhetoric 
in the Debate over Literary Integrity (JSNTSup, 136; Sheffield: 
Sheffield Academic Press, 1997).

Renear, A., E. Mylonas, and D. Durand, 'Refining our Notion of What 
Text Really Is: The Problem of Overlapping Hierarchies', in S. Hockey 
and N. Ide (eds.), Research in Humanities Computing 4: Selected 
Papers from the ALLC/ACH Conference, Christ Church, Oxford, April 
1992 (Oxford: Clarendon Press, 1996): 263-80.

Sperberg-McQueen, C.M., 'Text in the Electronic Age: Textual Study 
and Text Encoding, with Examples from Medieval Texts', Literary and 
Linguistic Computing, 6 (1991), pp. 34-46.

Udell, J., Practical Internet Groupware (Cambridge, MA: O'Reilly & 
Associates, 1999).