Computational Principles of Sentence Construction
This course gives a student extensive hands-on practice in using the New York University ACF computer system (IBM PC, Mac, and ULTRIX/UNIX VAX) to implement Noam Chomsky's and Zellig Harris' ideas about morphology, syntax, and semantics in symbolic processing languages like Prolog, Lisp, Snobol, and so on. We focus on Prolog since it is so easy to learn and Lisp because there exists an extensive on-line library of available programs in natural language processing. Students will run existing parsers available on the WWW and write their own grammars and parsers in Prolog. The equipment in the ACF Innovation Lab will be demonstrated.
By the end of the class you will be able to:
This course description assumes:
If you are an advanced student, you should contact Prof. Dougherty. The Course Project page is aimed at beginners to computational linguistics and natural language processing. I have an abundance of more advanced projects that could be used to satisfy the course requirements. This term (Spring, 1997) I am particularly interested in developing CGI scripts to enable net browsers to use the NYU parsers. I would like the users to input a standard orthographic sentence, such as "Sean, Tess, and Tracy left", and have the parser return the possible phrase markers for this sentence in a graphic format, perhaps jpg or gif. The parsers might return information about ambiguity, constituent structure, logical form, and so on.
We will implement the ideas of Noam Chomsky and Zellig Harris in Prolog, and to a lesser extent, Lisp. Experience has shown that Prolog is easier to learn than Lisp for just about everyone. Once you have learned Prolog, all the information carries over into Lisp. The teaching strategy is to move the students from the familiar to the unfamiliar.
Our ideas about lexical structure derive from Noam Chomsky, Maurice Gross, and Zellig Harris. We are not concerned with implementing any ideas about morphology, syntax, semantics, or logical form except those underlying Chomsky's 'Minimalist Program' and 'The Logical Structure of Linguistic Theory'.
We only discuss Prolog as a tool to represent recent linguistic theory. This is not a general course in Prolog or Lisp. Students will learn to program in these languages in order to encode lexicons and principles of sentence construction for English, French, and German grammars. We have no interest in discussing classical problems like the Towers of Hanoi, the Queens problem, and so on.
The main readings for the course will come from one book and two manuscripts. The basic organizational idea underlying these books is:
If you already know a lot about Chomsky's theories, Prolog, Lisp, and parsing, you can download the materials and use them in your teaching, research, and so on. If you are somewhat in the dark concerning recent linguistic theory, parsing strategies, and so on, then these books are the key to awakening your intelligence to the dawn of computational linguistics, cognitive psychology, and symbolic processing languages. Some of our pages use complicated graphics and displays; see the HTML Gesellschaft.
In short, these books are the manuals and workbooks for using the extensive array of free software on the World Wide Web concerning computational linguistics. These free programs are first class items comparable to interpreters and compilers that cost hundreds of dollars. We are giving you a Jaguar for free and selling you driving lessons. If you already know how to drive, simply take the car and recommend us to your friends. The Jaguar is free; the books are for those who need driving lessons.
The single best source of outside reading on the computational material is this:
Gazdar, Gerald and Chris Mellish. 1989. Natural Language Processing in Prolog: An Introduction to Computational Linguistics. New York: Addison Wesley Inc.
Gazdar and Mellish (1989) differs from Dougherty (1994) in many respects. In brief,