Workbook Images

Beginner's Workbook in
Computational Linguistics

Prof. Ray C. Dougherty
New York University | Linguistics Department


If you have any comments about these workbook pages, please let us know. Should we place more of these pages online? Do you benefit from these pages? If you are a student at NYU, you may help in developing these pages; see the HTML Geselschaft. Your ideas and comments will lead to modifications and improvements.

Preface: A Tutorial for Beginners.

and SOFTWARE are FREE. Free Software?

1. Introduction to Natural Language Computing.

1.1. Chomsky's Universal Generative Grammar.
1.2. Levels of Linguistic Structure.
1.3. Linguistic Structure: Phrase Markers.
1.4. Methods in Generative Grammar.
1.5. Generative Grammar, Parsing, and Derivations.
1.6. Derivational Processes versus Representational Structures.

2. Basic Linguistic and Computational Mechanisms.

2.1. A Simple Prolog Tree Parser.
2.2. A Sentence Parser that Sends Trees to the Disk.
2.3. Writing Parser Results to the Disk.
2.4. A Parser that Produces .bmp Graphic Trees.
2.5. Storing, Sorting, and Manipulating Trees.

3. Ambiguity of Meaning and Structure.

3.1. Noun Phrases: Prepositional Phrase Attachment Ambiguities.
3.2. Verb Phrases: Complement Structures.
3.3. The Lexicon and Lexical Entries.
3.4. Cooccurrence and Selection Restrictions.

4. Embedded Sentences and Coordinations.

4.1. Verbs with Simple Sentence Complements.
4.2. Verbs with Complex Sentence Complements.
4.3. Selection Restrictions in Verb Complements.
4.5. The Subject of Infinitive Complements.
4.6. Empty Categories as Infinitive Subjects.

5. Agreement Phenomena.

5.1. Verbs that Require Plural Subjects or Objects.
5.2. Subject-Verb Agreement in English.
5.3. Person, Number, and Gender in German Noun Phrases.
5.4. Subject-Verb Agreement in French.

6. Symbolic Processing Languages: Lisp and Prolog.

6.1. Parsing at Two Levels: Phone Codes.
6.2. Relating the Plural to the Singular.

Introductory Textbook in Computational Linguistics

The book Natural Language Computing: An English Generative Grammar In Prolog, by Ray C. Dougherty, is an introduction to the basic ideas of linguistics and an introduction to the symbolic processing language Prolog. It provides detailed information on how to use Prolog to encode the data structures one finds in the morphology, syntax, and semantics of human languages.

The book includes an IBM-compatible 3.5-inch disk containing Prolog interpreters for the IBM PC and Macintosh, plus all of the programs in the book. If you insert the disk into drive A: or B: and type A:INSTALL (or B:INSTALL), it automatically loads all of the Prolog materials onto your computer so that you can run the programs discussed on these pages. The book is an introduction for people unfamiliar with linguistics, grammar, and symbolic processing languages.

If you feel confident in using decompression utilities, setting up your own directory structures, and installing the programs, then all of the materials on the disk are also available on this site for download via a modem.

A Comment on Software: Free and Otherwise.

Zero Bucks

All of the software on these pages may be downloaded and used by anyone. Some of these software projects represent a lot of work by one or more people, perhaps working over years. Why would anyone give away their work?

Is the free software, particularly the Lisp and Prolog compilers and interpreters, as good as the software you pay hard-earned cash for? Usually it is excellent but has zero frills. Most commercial packages have editors, graphics, debuggers, and so on that greatly decrease the frustrations of programmers. The edit-Prolog-edit loop is much simpler in professional products. Commercial products usually make more efficient use of memory, crash less often, and offer a variety of printing options.

Medium Bucks

I have used an evaluation copy of LPA Prolog (by Artificial Intelligence International) on the IBM PC and was very impressed with its power, speed, convenience, editing capacities, and graphics abilities. I have not tried any other commercial products for the IBM or Apple. If you are looking for an excellent Prolog, you would be happy with LPA Prolog.

Big Bucks

At New York University, each student has an account on the Academic Computing Facilities UNIX or ULTRIX machines. Normally students do their work in Quintus. Many, however, like to work on laptops or at home. They use the IBM and Macintosh software. Advanced undergraduates and graduate students working with Prof. Dougherty can use the exotic equipment at the Innovation Center.

Which is Best?

As Mae West said: "I've been rich, and I've been poor. Believe me, rich is better." Undoubtedly the best option is to work on a university-maintained UNIX machine running Quintus. There are lots of consultants around to help. But you can do a lot with less.

A Request

If you find any of the materials here useful, please include a pointer to the site on your pages or in your work.

Some of the figures are scaled to fill a 1200 x 650 screen.

If you enlarge the browser window, you will be able to read even the small print.

These figures are used in courses at NYU with a computer connected to the WWW and an overhead projector using an overhead/computer attachment. One need not carry overheads; they are on the WWW for use anytime, anywhere.