Introduction to Programming for Linguists
G61.1830-01 31508

SYLLABUS
LECTURE SCHEDULE


Prof. Ray C. Dougherty

This course description is available in postscript and in zipped postscript.

HELP || LING || COURSES || Basics | Overview | Syllabus | Project | Students


Reading List:
     Chomsky, Noam. 1966. Language and Mind. New York:
Harcourt Brace
     Dougherty, Ray C. 1994. Natural Language Computing: An
English Generative Grammar in  
          Prolog. Hillsdale, N.J.: Lawrence Erlbaum Inc.
     Gazdar, Gerald and Mellish, Chris. 1989. Natural Language
Processing in Prolog: An Introduction   
           to  Computational Linguistics New York: Addison
Wesley Inc.

All supplemental readings, and all Internet materials, are on reserve in Bobst Library and in the Linguistics Department Library, 719 Broadway, Fifth Floor. Weekly readings are assigned in Dougherty and Gazdar & Mellish.

Check the Academic Computing Facility for lab hours

Each student will complete the course project, which will be given out at the first class meeting and also is described in some detail on the page:

The class will be held in a computerized classroom. Many of the lectures will include computer screens that will be projected onto an overhead screen so students can see programs being executed.


Introduction to Minimalist Grammar and Logical Programming
  1. Introduction to Programming Syntax and Semantics in Prolog. Overview of the course, materials, the world wide web, HTML, Prolog, linguistics, and levels in a minimalist framework. We will discuss the project and why it is relevant to explicate concepts in the mininimalist framework.
         
              (a)  This
    course 
              (b)  The NYU Academic
    Computing Facility 
              (c ) Dougherty, Preface, Introduction: What is
    Computational Linguistics? How to Use this  
                    Book, pp. I-xlvi.
              (d)  Dougherty, Ch. 1, Natural Intelligence
    Linguistics and Prolog, pp. 1-23
    
  2. How to Read and Write in Prolog: Representing Linguistic Notations and Processes. Introduction to Prolog as a notation for representing linguistic data, processes, and constraints. Description and illustration of various Prolog tools. We discuss (and illustrate with the overhead projection system) how to run Prolog on the Unix, IBM PC, and Macintosh systems.
          
               (a)  http://www.nyu.ed
    u/pages/linguistics/ling.html
              (b)  Dougherty, Ch2., How to Read and Write in
    Prolog, pp. 24-51
              (c)    Dougherty, Ch3., How to Load, Run, and Edit a
    Prolog Program, pp. 52-82.
              (d)  Gazdar and Mellish, Introduction, pp. 1-20
    
  3. Morphology: A Lexicon and Constraints on Combination. We will discuss Prolog programs that relate singulars and plurals, present and past tenses of verbs, form compound nouns like American history teacher, and complex words like undecomposability.
          
              (a) 
    http://www.nyu.edu/pages/linguistics/anlcbk.html
              (b)  
    http://www.nyu.edu/pages/linguistics/parsers.html
              (c)    Dougherty, 7.1. The Levels of Human Language
    Structure, pp. 161-179
              (d)  Dougherty, 7.2. Morphological Parsers, pp.
    180-194.
    

    Part 1 of Project: (PF,LF) Pairs as Primary Data
    Turn in your paper at the fourth class meeting

  4. Merge: X-Bar, Phrase Structure, and Prolog Constraints. We discuss simple context free phrase structure grammars, context sensitive phrase structure grammars, X-bar grammars, lexical structure, and the constraints on representations and processes. We present X-bar type grammars in Prolog that define structures as projections of lexical items.
              (a)  htt
    p://www.nyu.edu/pages/linguistics/marc/parsshk.html
              (b)  Dougherty, 7.3. Recursion: Affixes on the
    Affixes, pp. 195-215
              (c)    Dougherty, 7.4. Regular and Irregular
    Morphology, pp. 216-228
              (d)  Dougherty, 7.5. The Minimalist Framework,
    pp. 229-238
    
  5. Merge: Generating The Structure of Simple Sentence and Phrases. Chomsky offers a simple mechanism, called merge, that defines a structural relation between two nodes and defines a third node that is a label for the two related nodes. We show how to interpret Chomsky's ideas about constituency and node relations in terms of Prolog operations.
              (a) 
    http://www.nyu.edu/pages/linguistics/courses/g611830/german.html
    (b) http://w ww.nyu.edu/pages/linguistics/parsers.html (c) Dougherty, 8.1. Syntax: Representations and Parsers, pp. 239-253 (d) Dougherty, 8.2. Rule Governed Creativity: Derivations, pp. 254-265 (e) Gazdar and Mellish, 5.1. A Simple Parsing Problem, pp. 143-144 (f) Gazdar and Mellish, 5.2. Bottom-up Parsing, pp. 144-151 (g) Gazdar and Mellish, 5.3. Top-down Parsing, pp. 152-156
    Make an Appointment with the Professor to Discuss the Project
    Office Hours or By Appointment: Mon & Wed 1:00-3:00

  6. Features, Feature Structures, and Verb Selection Constraints. Up until now we have discussed only constituent and relational structure among nodes. Now we start to express inter-constituent selections and cooccurrence restrictions. We discuss verbs that are intransitive, that are transitive and require an NP complement, a sentence complement, oranother type of complement. Examples of coordinate restrictions will be discussed: both Mary and Sue, *both Mary or Sue.
     
             (a) 
    http://www.nyu.edu/pages/linguistics/ling.html
              (b)  Dougherty, 8.3. Parsers Assign Structure to an
    Ordered String, pp. 266-275
              (c)    Dougherty, 8.4. Top Down and Bottom Up
    Parsing, pp. 276-302
              (d)  Gazdar and Mellish, 5.5. Comparing
    Strategies, pp. 165-166
              (e)  Gazdar and Mellish, 5.6. Breadth-first and
    Depth-first Search, pp. 166-168
              (f)  Gazdar and Mellish, 5.7. Storing Intermediate
    Results, pp. 168-169
              (g)  Gazdar and Mellish, 5.8. Ambiguity, pp.
    169-173
              (h)  Gazdar and Mellish, 5.9. Determinism and
    Lookahead, pp. 174-179
    
  7. Prepositional Phrases: In a house on a hill near a river on public land in Jersey. The time adverbial phrases that form the substance of your term project will include prepositional phrases such as: at six o'clock in the evening on the third of July in 1966. How can a Prolog program recognize or generate embedded or coordinated or appositive prepositional phrases?
    (a) Lecture 7 
              (b)  Look up 'Recursion' in Dougherty and read
    the references.
              (c)    Look up 'Recursion' in Gazdar and Mellish
    and read the references.
    

    Part 2 of Project: Possible Notations at the Level of PF
    Turn in your paper at the eighth class meeting

  8. The Calendar and Time Expressions: Semantics, Pragmatics, Syntax. We will place a calendar on the computer as a database: calendar([1,1,january,monday,1944]). calendar([32,1,february,wednesday,1944). And so on. We will see how to relate the information in such a database to phrases like: on the thirty second day of nineteen forty four, Wednesday, February the first.
              (a) 
    http://www.nyu.edu/pages/linguistics/courses/g611830/lecture8.htm
    l
              (b)  Dougherty, Ch 4. Tables of Data as Prolog Facts
    and Relations, pp. 83-110
              (c)    Peirce, C.S. How to Make Our Ideas Clear,
    on reserve in the library.
    

    At this point, you have learned (or at least been exposed to) enough Prolog tools to complete the course project. Dougherty, Natural Language Computing, does not discuss how one might optimally represent features and complex node types in Prolog. Essentially, all feature structures (and node labels) are treated as list structures. For those who want to see alternatives, we will start to examine some of the techniques and procedures presented in Gazdar and Mellish, Natural Language Processing in Prolog. You should regard G&M as a cookbook of techniques, sort of like an encyclopedia of Prolog processes and data structures, that can more or less match the structures one finds in human languages.

  9. Compositional Semantics: Nodes as Features of Nodes Below Them. The data and the month information, which are combined into a single item at one level may be spread over two prepositional phrases in the orthographic/phonetic form. On the thirteenth day of August in the year of nineteen sixty six contains the 13 in the phrase on the thirteenth day, 8 in the phrase of August, and 1966 in the phrase in the year of nineteen sixty six. These three numbers can be spread across the PF/OF in different orders: in nineteen sixty six on August the thirteenth, in sixty six on August thirteenth, and so on.
              (a) 
    http://www.nyu.edu/pages/linguistics/courses/g611830/lecture9.htm
    l
              (b)  Dougherty, 8.5. Horizontal Appends: Complement
    Structures, pp. 303-318
              (c)    Dougherty, 8.6, Vertical Appends: Selection
    Restrictions, pp. 319-327
              (d)  Gazdar and Mellish, 4.1. Grammar as Knowledge
    Representation, pp. 100-103
              (e)  Gazdar and Mellish, 4.2. Words, Rules and
    structures, pp. 104-109
              (f)  Gazdar and Mellish, 4.3. Representing Simple
    Grammars in Prolog, pp. 110-114  
    
  10. Feature Percolation: Passing Features from Node to Node. In order to obtain a Numeric Form such as [_,_,8,13,1966] for a variety of orthographic/phonetic forms, the parser must be able to accept information in a variety of different linear and hierarchical orders. The parser must be able to input information and move it from node to node as it "composes" the Numerical Form for the time expression.
              (a) 
    http://www.nyu.edu/pages/linguistics/courses/g611830/lecture10.ht
    ml
              (b)  Dougherty, Ch. 5. How Prolog Backtracks in
    Searches, pp. 111-140
              (c)    Gazdar and Mellish, 4.4. Subcategorization
    and the Use of Features, pp. 115-126
              (d)  Gazdar and Mellish, 4.5. Definite Clause
    Grammars, pp.127-131
              (e)  Gazdar and Mellish, 4.6. Classes of Grammars
    and Languages, pp. 132-142
     

    Part 3 of Project: Definig the Pairing for the Calendar and PF Notations
    Turn in your paper at the fourth class meeting

  11. The Clock and Time Expressions: Notations Containing Operators. The clock time expressions are more complicated than those relating to the calendar because of the inherent ambiguity in a twelve hour clock and the idiosyncratic ways in which the minutes are related to the hours in time expressions. *three quarters past six, *forty minutes till seven, and so on. Ten minutes till seven will be represented as [6,50,_,_,_], a representation that contains neither a 7 nor 10.
              (a) 
    http://www.nyu.edu/pages/linguistics/courses/g611830/lecture11.ht
    ml 
              (b)  Gazdar and Mellish, 8.1. Semantics:
    Compositionality, pp. 280-281
              (c)    Gazdar and Mellish, 8.2. Meaning as
    Reference, pp. 283-287
              (d)  Gazdar and Mellish, 8.3. Translation to a
    Meaning Representation Language, pp. 288-292
              (e)  Gazdar and Mellish, 8.4. A Database Query
    Language, pp. 290-293
    
  12. Calculations on Features Percolated Among Nodes. Some of the feature percolation mechanisms must perform calculations on the lower features and pass the computed features up the tree, or maybe sometimes down the tree, depending on your assumptions about constituent structure.
              (a) 
    http://www.nyu.edu/pages/linguistics/courses/g611830/lecture12.ht
    ml
              (b)  Gazdar and Mellish, 8.5. Computational
    Semantics as Feature Instantiation, pp. 293-294
              (c)    Gazdar and Mellish, 8.6. Transitive Verbs and
    Quantification, pp. 295-301
              (d)  Gazdar and Mellish, 8.7. Ambiguity,
    Preferences, and Timing, pp. 301-302
              (e)  Gazdar and Mellish, 8.8. Building Semantic
    Checking in the Grammar, pp. 303-308
    

    Part 4 of Project: Defining the Pairing for the Clock and PF Notations
    Turn in your paper at the thirteenth class meeting

  13. Presentation of a Rude Time Parser for Clock and Calendar. We will present and discuss a rather primitive attempt to solve the project. The main function of this class is to show the format in which student projects should be submitted in their final form. All projects must include documentation, a discussion of the problem, a bibliography, and so on in addition to the computer program and scriptfile samples of executions. The most elegant formulation of our project would be in terms of a feature structure grammar.
              (a) 
    http://www.nyu.edu/pages/linguistics/courses/g611830/lecture13.ht
    ml
              (b)  Gazdar and Mellish, 7.1. Feature-Theoretic
    Syntax , pp. 218-220
              (c)    Gazdar and Mellish, 7.2. Feature Structures
    as Graphs, pp. 221-227
              (d)  Gazdar and Mellish, 7.3. Feature Structures in
    Prolog, pp. 228-229
              (e)  Gazdar and Mellish, 7.4. Subsumption and
    Unification, pp. 230-237
              (f)  Gazdar and Mellish, 7.5. The Status of
    Rules, pp. 238-239
    
  14. Encoding Your Project as a Set of HTML Pages on the World Wide Web. It is possible to present your project as a set of HTML pages and place it on the NYU Linguistics Department site. You will see the problems involved and learn how to do this. We will discuss some of the grammars (HPSG, LFG, and so on) implemented on the WWW in HTML. Some of them exhibit the type of programming discussed by Gazdar and Mellish.
              (a) 
    http://www.nyu.edu/pages/linguistics/courses/g611830/lecture14.ht
    ml
              (b)  Gazdar and Mellish, 7.6.  Implementing PATR in
    Prolog, pp. 240-247
              (c)    Gazdar and Mellish, 7.7. Chart Parsing with
    Feature-Based Grammars, pp. 248-255
              (d)  Gazdar and Mellish, 7.8. Representation of
    Lexical Knowledge, pp.256-269
              (e)  Gazdar and Mellish, 7.9. Implementing a Lexicon
    in Prolog, pp. 270-272
              (f)  Gazdar and Mellish, 7.10. DAG's Versus
    Terms, pp. 273-275
    
    


Part 5 of Project: Turn in the Final Polog Program



Intro | Part 1| Part 2 | Part 3 | Part 4 |Part 5