Here is the First Dot

Prof. Ray C. Dougherty
New York University | Linguistics Department

If you have any comments about these pages, please let us know. Should we place more of these pages online? Do you benefit from these pages? If you are a student at NYU, you may help in developing these pages; see the HTML Gesellschaft. Your ideas and comments will lead to modifications and improvements.

The New York University Linguistics Department offers both undergraduate and graduate courses.

The book, Natural Language Computing, offers a 'follow-the-dots' approach to communicating the basic ideas involved in expressing Chomsky's linguistic concepts in computer languages like Prolog. But some people cannot seem to see the dots. Others have trouble finding the first dot. This page is the first dot. For those with graphics-oriented browsers, we have pages containing animations showing parsers analyzing sentences.

1. Where am I? You are here.

You have stumbled onto a site maintained by Prof. Dougherty, Linguistics Department, of New York University. The bulk of the material is intended to show students in NYU degree programs how to use a computer to process human language data. The students range from Ph.D. candidates writing dissertations on obscure ideas in the theory of Noam Chomsky to freshmen in an introductory Linguistics course who are studying to fulfill a distributive requirement.

There are lots of people working in computational linguistics around the world. The following figure provides a sketchy overview of some of the various other computational and linguistic sites related to ours.

[Figure: A map of where you are.]

2. Where is here?

This site differs from others in that almost all research and courses described on the site follow directly or indirectly from the ideas of Noam Chomsky, a Professor of Linguistics at MIT. The basic question is: why can't we talk to a computer in English and get an answer in English, Spanish, or some other language?

3. What is here?

Most of the materials here are used in courses or in research or are placed here by students at NYU. Professor Dougherty oversees the pages, but in general they are constructed and maintained by students in the linguistics and computer science departments. Some pages are produced by students in courses or working on dissertations.

The Beginner's Workbook in Computational Linguistics is aimed at anyone, not specifically people at NYU. A large number of people who have no connection with NYU have downloaded our free software only to ask: "What do I do now?" "What can I do with this stuff?" We receive about 200 e-mail letters a month from high school students, college students, business people, the curious, people developing "talking games", and members of the leisure class.

To aid these people in their plight, we have added the workbook pages. These contain lots of pictures, diagrams, simple examples, how-to information, and so on. High level research it ain't. But if you are sitting at home alone scratching your head and wondering what to do next, this will get you moving.

4. What does the software do?

In a nutshell, all of the Prolog programs at this site have one goal: to produce a labeled bracketing or tree structure (phrase marker) for a given sentence. Basically, using the software is a five-step process:

Here is a simple example using the sentence: The woman sees in the house.

Motivated by the thrill of success and the anguish of defeat, linguists change the dictionary in the program (the lexicon) and the principles of combination (the definitions of noun phrase, verb phrase, and so on) in order to find the correct grammar that will properly analyze any input sentence. Most of our programs work pretty well, but there is lots of room for students to modify and improve the programs.
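
For readers who want to see the general idea in miniature, here is a rough sketch of how a lexicon plus principles of combination can yield a labeled bracketing. It is written in Python rather than Prolog, and it is not one of the programs from this site. The lexicon entries, the rules, the lowercase category labels, and the function names (parse, parse_sequence, bracket, analyze) are all assumptions made up for this illustration.

```python
# Hypothetical toy lexicon (the "dictionary"): word -> category.
LEXICON = {
    "the": "det",
    "woman": "n",
    "house": "n",
    "sees": "v",
    "in": "p",
}

# Hypothetical principles of combination: phrase -> alternative expansions.
GRAMMAR = {
    "s": [["np", "vp"]],
    "np": [["det", "n"]],
    "vp": [["v", "np"], ["v", "pp"], ["v"]],
    "pp": [["p", "np"]],
}


def parse(category, words, start):
    """Yield (tree, next_position) for each way category can begin at start."""
    if category in GRAMMAR:
        # Phrasal category: try each expansion in turn (backtracking search).
        for expansion in GRAMMAR[category]:
            for children, end in parse_sequence(expansion, words, start):
                yield (category, children), end
    elif start < len(words) and LEXICON.get(words[start]) == category:
        # Lexical category: the next word must carry this label in the lexicon.
        yield (category, words[start]), start + 1


def parse_sequence(categories, words, start):
    """Yield (list_of_trees, next_position) for a sequence of categories."""
    if not categories:
        yield [], start
        return
    for tree, middle in parse(categories[0], words, start):
        for trees, end in parse_sequence(categories[1:], words, middle):
            yield [tree] + trees, end


def bracket(tree):
    """Render a tree as a labeled bracketing string."""
    label, body = tree
    if isinstance(body, str):
        return f"[{label} {body}]"
    return f"[{label} " + " ".join(bracket(child) for child in body) + "]"


def analyze(sentence):
    """Return the first bracketing that covers the whole sentence, else None."""
    words = sentence.lower().rstrip(".").split()
    for tree, end in parse("s", words, 0):
        if end == len(words):
            return bracket(tree)
    return None


print(analyze("The woman sees in the house."))
# prints: [s [np [det the] [n woman]] [vp [v sees] [pp [p in] [np [det the] [n house]]]]]
```

Changing an entry in LEXICON or a rule in GRAMMAR and rerunning analyze is a small-scale version of the cycle described above: adjust the dictionary or the definitions, then check whether the sentence still receives the analysis you expect. A sentence the toy grammar cannot analyze simply returns None.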