Computational Principles of Sentence Construction

If you are an advanced student, you should contact Prof. Dougherty. The Course Project page is aimed at beginners in computational linguistics and natural language processing. I have an abundance of more advanced projects that could be used to satisfy the course requirements. This term (Spring 1997) I am particularly interested in developing CGI scripts that enable web browsers to use the NYU parsers. I would like users to input a sentence in standard orthography, such as "Sean, Tess, and Tracy left", and have the parser return the possible phrase markers for that sentence in a graphic format, perhaps JPEG or GIF. The parsers might also return information about ambiguity, constituent structure, logical form, and so on.
For this project, you have been given a set of sentences in a file on the NYU web server. You are required to run these sentences through the parsers discussed in class, write the results to a file, and print the file. You are expected to turn in the printout that results from parsing the sentences.
You are expected to write a short Prolog program in which the quantifiers (each, all, etc.), the coordinating conjunctions (and, or, nor), and some distributional adverbs (simultaneously, together, etc.) are presented as facts (simple or complex) in a Prolog database. You should also define some relations that indicate selection and cooccurrence restrictions between the quantifiers, coordinations, and adverbs.
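As a rough illustration of what such a database might look like, the sketch below lists the words as simple facts and adds two relations for selection and cooccurrence. The predicate names `quantifier/1`, `coordinator/1`, `adverb/1`, `selects/2`, and `cooccurs/2` are hypothetical choices made for this example, and only the pairings attested in the project descriptions on this page are encoded. Any pairing not listed is treated as unlicensed.

```prolog
% Simple facts: one clause per lexical item.
quantifier(each).
quantifier(all).

coordinator(and).
coordinator(or).
coordinator(nor).

adverb(independently).
adverb(simultaneously).
adverb(together).

% selects(Quantifier, Coordinator): the quantifier may combine with a
% noun phrase built on this coordinator. (Hypothetical relation name.)
selects(each, and).
selects(all,  and).

% cooccurs(Quantifier, Adverb): the quantifier and the adverb may appear
% in the same clause. Only the attested pair is listed here.
cooccurs(each, independently).

% well_formed_pair(Q, C): both words are in the lexicon and Q selects C.
well_formed_pair(Q, C) :-
    quantifier(Q),
    coordinator(C),
    selects(Q, C).
```

With these facts loaded, the query `?- well_formed_pair(all, and).` succeeds and `?- well_formed_pair(all, or).` fails, because no `selects(all, or)` fact exists.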
You will write a short Prolog grammar (database/lexicon) and principles of combination that define the strings of quantifiers, coordinations, and adverbs that can cooccur. It will recognize (each, and, independently), as in "John and Mary each will leave independently", but not *(each, and, simultaneously), as in "*John and Mary each will leave simultaneously". It will recognize (all, and), but not *(all, or). I am mainly concerned that you understand how to use the unification properties of Prolog to enforce cooccurrence and selection restrictions. If you feel ambitious, you may write a program to parse sentences and phrases incorporating these quantifiers, coordinations, and adverbs.
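One way unification can do this work is to give each word a feature and let a shared variable carry that feature across the words in a string. The following is a minimal sketch under stated assumptions: the feature labels (`conjunctive`, `disjunctive`, `negative`, `distributive`, `collective`) and the predicates `quant/3`, `coord/2`, `adv/2`, and `licensed/1` are invented for illustration and are not taken from the class parsers. Leaving the reading of `all` unspecified is also an assumption.

```prolog
% quant(Word, CoordType, Reading): the quantifier selects a coordinator
% of type CoordType and contributes the given Reading.
quant(each, conjunctive, distributive).
quant(all,  conjunctive, _).      % reading left open (assumption)

% coord(Word, CoordType). The type labels are illustrative.
coord(and, conjunctive).
coord(or,  disjunctive).
coord(nor, negative).

% adv(Word, Reading). The reading labels are illustrative.
adv(independently,  distributive).
adv(simultaneously, collective).
adv(together,       collective).

% licensed(Words): the string of words is a permitted combination.
% The variables Type and Reading are shared between the lexical lookups,
% so unification alone rules out mismatched combinations.
licensed([Q, C]) :-
    quant(Q, Type, _),
    coord(C, Type).
licensed([Q, C, A]) :-
    quant(Q, Type, Reading),
    coord(C, Type),
    adv(A, Reading).
```

Here `?- licensed([each, and, independently]).` succeeds, while `?- licensed([each, and, simultaneously]).` fails because `distributive` does not unify with `collective`. Likewise `?- licensed([all, and]).` succeeds and `?- licensed([all, or]).` fails.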
You will produce a Prolog program that parses coordinations of noun phrases and plural noun phrase constructions that may incorporate quantifiers, coordinations, plurals, and adverbs. Your parser will recognize (john,and,mary,each,independently) but not *(john,or,mary,each,simultaneously). If you wish, you may include a pretty printer to indicate the scope of the quantifiers and adverbs. Such pretty printers have been discussed in class and are readily available on the web, but you may write your own if you wish.
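A parser of this kind could be sketched with a definite clause grammar (DCG), as below. This is only an illustration, not the parser discussed in class: the tiny lexicon, the feature labels, the nonterminal names (`plural_np`, `coord_np`, `pn`, `conj`, `floated_quant`, `dist_adverb`), and the shape of the parse term are all assumptions made for the example. In particular, nesting the adverb outside the quantifier in the parse term is just one possible way to display scope.

```prolog
% Lexicon. Feature labels are illustrative assumptions.
proper_noun(john).
proper_noun(mary).

coord(and, conjunctive).
coord(or,  disjunctive).

quant(each, conjunctive, distributive).

adv(independently,  distributive).
adv(simultaneously, collective).

% Grammar: a coordinated noun phrase, then a quantifier, then an adverb.
% Type and Reading are shared variables, so the quantifier must select
% the coordinator type and the adverb must match the quantifier reading.
plural_np(adv(A, quant(Q, NP))) -->
    coord_np(Type, NP),
    floated_quant(Type, Reading, Q),
    dist_adverb(Reading, A).

coord_np(Type, coord(C, X, Y)) -->
    pn(X), conj(Type, C), pn(Y).

% Preterminals: consume one word and check it against the lexicon.
pn(W)                           --> [W], { proper_noun(W) }.
conj(Type, W)                   --> [W], { coord(W, Type) }.
floated_quant(Type, Reading, W) --> [W], { quant(W, Type, Reading) }.
dist_adverb(Reading, W)         --> [W], { adv(W, Reading) }.

% parse(Words, Tree): Tree is a parse term for the whole word list.
parse(Words, Tree) :-
    phrase(plural_np(Tree), Words).
```

The query `?- parse([john,and,mary,each,independently], T).` binds `T` to `adv(independently, quant(each, coord(and, john, mary)))`, and `?- parse([john,or,mary,each,simultaneously], T).` fails.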
You will write a Prolog program (lexicon and principles of combination) that parses embedded noun phrase coordinations such as: (mary,and,sue,or,jane), which might have a logical form like ((mary and sue) or jane) or (mary and (sue or jane)). The input to your parser will be strings of nouns, the coordinations (and, or, nor), and quantifiers (each, all, etc.), such as: (either,tess,or,tracy,and,sean), (neither,tess,and,tracy,nor,sean), and so on. Your parser will assign each of these the correct logical form(s), indicating ambiguities where relevant. It will fail to link strings like this to any logical form: *(and,or,tess,tracy,neither,sean). Use the pretty printers to output a readable logical form for the input sentences.
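The ambiguity in such strings can be found by trying every coordinator as the top-level split point. The sketch below does this for a small illustrative lexicon. All predicate names (`noun/1`, `free_coord/1`, `correlative/2`, `split/4`, `lf/2`, `readings/2`, `show/1`) are hypothetical, the treatment of `either` and `neither` as openers that require `or` and `nor` is an assumption, and the quantifiers `each` and `all` are left out to keep the example short. The small `show/1` printer stands in for the pretty printers discussed in class.

```prolog
% Lexicon (illustrative).
noun(mary).  noun(sue).   noun(jane).
noun(tess).  noun(tracy). noun(sean).

% Coordinators that may appear on their own.
free_coord(and).
free_coord(or).

% correlative(Opener, Coordinator): the opener requires this coordinator.
correlative(either,  or).
correlative(neither, nor).

% split(Words, Left, Mid, Right): Words is Left, then Mid, then Right.
split([Mid|Right], [], Mid, Right).
split([W|Ws], [W|Left], Mid, Right) :-
    split(Ws, Left, Mid, Right).

% lf(Words, LF): LF is one logical form for the word list Words.
% Each recursive call works on a strictly shorter list.
lf([N], N) :-
    noun(N).
lf(Words, LF) :-
    split(Words, Left, C, Right),
    free_coord(C),
    lf(Left, L),
    lf(Right, R),
    LF =.. [C, L, R].
lf([Opener|Rest], LF) :-
    correlative(Opener, C),
    split(Rest, Left, C, Right),
    lf(Left, L),
    lf(Right, R),
    LF =.. [C, L, R].

% readings(Words, LFs): all logical forms; [] means the string is rejected.
readings(Words, LFs) :-
    findall(LF, lf(Words, LF), LFs).

% show(LF): print a fully bracketed, infix version of a logical form.
show(LF) :-
    atom(LF),
    write(LF).
show(LF) :-
    compound(LF),
    LF =.. [Op, L, R],
    write('('), show(L), write(' '), write(Op), write(' '),
    show(R), write(')').
```

For example, `?- readings([mary,and,sue,or,jane], LFs).` yields the two forms `and(mary, or(sue, jane))` and `or(and(mary, sue), jane)`, and `?- show(or(and(mary,sue),jane)).` prints `((mary and sue) or jane)`. The string `[neither,tess,and,tracy,nor,sean]` receives the single form `nor(and(tess, tracy), sean)`, and `[and,or,tess,tracy,neither,sean]` receives no form at all.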