Ray C. Dougherty



The New York University Linguistics Department offers both undergraduate and graduate courses.

What Am I Doing Spring, Summer, Fall 1997?

Office Hours: Mon and Wed, 1:00-3:00

If you call 212-998-7958 during these hours, you are almost certain to reach me. Outside these hours you will get me or my voicemail account. The fax machine 212-###.#### is always on.

Courses I offer Fall 1996-Spring 1997

Director of Undergraduate Study

If you are interested in a Linguistics Major, Joint Major, or Minor, please contact me.

I will be collecting the abstracts for undergraduates in the Social Sciences who wish to submit their research to be considered for the Spring 1997 Student Research Conference, sponsored by the FAS Dean's Office.

WWW, HTML, CGI, and Computer Graphics

If you are interested in the possibilities for using the computational resources of NYU for linguistic research or for developing pedagogical materials, please let me know your interests. Visit the HTML Gesellschaft mit Stammtisch.

My Background

Educationally speaking, I have a B.S. in Engineering Science (Dartmouth College), an M.S. in Electrical Engineering and Instrumentation (Dartmouth College and M.I.T. Instrumentation Lab), and a Ph.D. in Linguistics (M.I.T.). Too many years ago, I was a student at the Universität München. Several years ago, I was a Fulbright Research Professor at the Universität Salzburg. For several years I have collaborated with Prof. Maurice Gross, Université Paris VII.

My favorite authors on the subject of linguistics, symbol processing, scientific methodology, and innate ideas are, in alphabetical order: St. Augustine, Noam Chomsky, Benjamin Franklin, Otto Jespersen, Richard Kayne, Helen Keller, Konrad Lorenz, C.S. Peirce, Plato, Claude Shannon, and Norbert Wiener.

My interest in computational linguistics is mainly in the linguistics. My interest in computation is mainly in the hardware, circuit design, and interfacing between levels. Just as early automobiles were designed so that the driver sat outside the body of the car, perched where the rain and sleet could hit, so too today's (certainly yesterday's) computers are designed to make their use inconvenient. Why should anyone have to learn a computer language, punch keys, and develop carpal tunnel syndrome in order to interact with a device that is by its very nature a symbolic processing linguistic device?

Perhaps owing to my early days in electronics and instrumentation, I am 90 percent interested in computer hardware and 10 percent interested in software. This might also follow from my belief that computer languages are simply a necessary evil following from our inability to design a computer that can learn and understand English, French, and so on.

If we believe that time is the substance of which lives are made, then we must budget our time prudently. If we allocate a hundred hours of this precious resource to the study of language we should decide how much we will spend studying French, German, Fortran, Prolog, and so on. Would you rather study French and German and perhaps travel to Paris or Berlin to utilize the resource, or would you rather study Fortran, C, Prolog, and Lisp and trundle into the windowless basement of the computer center to punch keys? Given today's primitive computers and our primitive understanding of human symbol processing, we must learn Prolog, Lisp, and so on. But one day, perhaps not far off, computers will utilize human languages.

My Leading Ideas, as Peirce Would Say

While many people seek a linguistic theory or model that will be highly efficient and nonredundant and will describe all the facts in a single coherent framework, I doubt this is the correct path. My idea is that the human brain has accumulated cognitive abilities over generations of evolution, somewhat as the hermit crab has accumulated an assortment of items on its shell.

What we call human language is a compiled program running on a computer whose operation we do not understand. We have lost the compiler in those distant ancestors we have evolved beyond, and we have lost the uncompiled code which must have been stammered among our ancestors during their show and tell sessions around their campfires.

It appears that a human language is a highly coded symbolic system that has the property of reproduction. Somewhat like a computer virus, an I-language in one human mind/brain can replicate a copy of itself in another 'uninfected' mind/brain. Following ideas of Claude Shannon and particularly Norbert Wiener, an I-language is a self-replicating cybernetic automaton. An I-language might best be thought of as an intellectual commensal (maybe a parasite) whose main function is to format the mind/brain and whose secondary function is to provide a communication system for humans to natter and for itself to reproduce. Just as parasites and commensals borrow energy from their hosts, an I-language borrows information, or perhaps more accurately, channel capacity. Its invariance content, that is, the properties that define 'same language,' consists of the patterns of redundancy (paraphrase and cooccurrence) among the utterances. Insofar as this is true, the invariance content of an I-language, i.e., those properties that must be replicated in the new grammar formed in a new host, might best be studied in a computational framework. In Chomsky's terminology, the invariance content of the commensal I-language is universal grammar.

Once we solve the problem posed by universal grammar, and we figure out the correct configuration of memories, problem solving skills, and pattern matching abilities so we can photocopy the correct configuration onto a set of silicon chips, we can simply set our universal grammar computer down in front of a souvlaki Imbiss at Times Square and let it learn all of the languages in the world by listening to the three card monte dealers, the taxi drivers, the push cart fast food salespeople, and those eating souvlaki sandwiches.

Current Research and Teaching

My research is focused on the language acquisition problem in a Chomskian linguistic framework, as described below. This research is reflected in the following areas covered in my courses. My work focuses on questions like these:

C.S. Peirce on Pragmatism and Explanation

What are the similarities and differences between the ideas of Noam Chomsky and Charles Sanders Peirce concerning innate ideas, theories of knowledge and belief, and the possibilities for explanation and understanding? What is the role of abduction and pragmatics in linguistic theory and methods? What is the role of mathematical, logical, and algebraic formalisms in developing linguistics as a science? What is the difference, if any, between formal analytical theories (constraint based formalisms) and analogic models (I-language as a switch or code)? What is meant by a 'toy grammar'?

Generative Grammar
(Derivational Processes and Notational Representations)

How can the basic concepts in generative grammar, in particular principles and parameters and minimalism, be represented in constraint based logical frameworks (perhaps feature structure and unification systems)? How can Prolog and Lisp be exploited to design parsers to yield theoretically interesting conceptual models of the processes and structures of human language?

Methods and the Strength of Arguments

What are the sources of explanation in Linguistics? What is the difference between internal justification and external justification? Do reductionist concepts play a role in linguistics? What sort of a logical or cybernetic model could unify the data from child language acquisition (for instance, repair sentences) with facts about adult grammars (for instance, redundancies, exceptions, analogical failures, and so on)? Are there any residues in adult language from the stages of child language acquisition?

Robots, Automata, and Intelligence

What is the cultural history of computers, robots, and artificial intelligence? How do robots relate to the mind/body problem? Is there a mind/body problem?

Why were so many robots and automata built between 1500 and 1800? What is the relation between human intelligence, animal intelligence, and machine intelligence? To what extent is our understanding of innate ideas related to the studies of the Wisdom philosophers, the neoplatonists, and the apocrypha?

Computers in Linguistic Research and Teaching

I would like the NYU Linguistics Department WWW site to provide extensive tutorial information (all free and downloadable) on aspects of computational linguistics, in particular, formalizations of Chomsky's theories in logical constraint based computational languages like Prolog and list processing languages like Lisp. I encourage student projects to be presented in HTML. I encourage anyone interested in CGI, HTML, animation, graphics, or my pages to visit the HTML Gesellschaft mit Stammtisch.

The Main Problem I Try to Solve

Observational Fact

A child, whose immature cognitive and neurological systems are growing, can acquire a language effortlessly on the basis of exposure to primary data containing performance errors. On the other hand, an adult, whose cognitive and neurological systems are complete, can acquire a language only with difficulty and often requires special instruction.


A person learning a language is in fact solving a problem. The data to which the person is exposed define the problem. The person's cognitive apparatus (modeled by memory storage capacities, problem solving skills, and pattern recognition abilities) defines the problem solving device. The I-language (grammar) acquired constitutes the solution to the problem.

The Leading Question in My Research

What kind of a problem in symbol processing can be solved by a computational device that is under construction while the problem is being solved, but cannot be solved by the computational system when it is fully constructed?

My Answer

In terms of digital signal processing, a multiplex problem is one that must either be solved in stages or be solved by two or more computational systems working simultaneously. The book, Natural Language Computing, contains examples of parallel processors that parse morphological and syntactic structures.

One obvious multiplex problem is binocular vision. Animals with a pair of forward facing eyes, like humans, have 'real time' binocular vision because they have two computational systems (eye structures) working simultaneously. Animals without forward facing eyes, for example some birds, like the pigeon, have 'delayed time' binocular vision because they solve the problem in stages: first they scan the world with one eye, then they jerk their head to another position and scan it again. The information from the first scan is combined with the information from the following scan to produce a binocular 'depth' image.

For their solution, some problems require two different, distinct, views of the 'same thing.' Binocular vision requires two different, almost identical, views of the 'same thing.' A child, at any stage, might be considered as 'selectively structure blind.' And the structure blindness at one stage yields a view of language slightly different than the view at another stage. The adult grammar contains 'residues' of the stages of child language in that the adult stage defines structures that can meet the structural demands of the child's stages.

I assume that the structures and processes necessary and sufficient to define 'possible human language' stem from the fact that any human language must be learnable by a child. If we assume that a child goes through definable stages of acquisition, where each stage is definable by a specific range of computational capacities that define possible structures (non-embedded coordination, embedded coordination, and subordination), then my analysis assumes that the adult grammar must define a clause/sentence architecture that permits every adult sentence to be parsed by a child at any stage. The child and the adult see the 'same sentence' in the sense that a black and white TV sees the 'same picture' as a color set, or a monophonic FM receiver hears the 'same sounds' as a stereo receiver. The adult grammar (actually a combination of the grammars from each stage of acquisition) defines a 'multiplex signal system'. The developing child is in fact a 'demultiplexer,' and demultiplexes the adult signal structure by passing through stages.
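
The receiver analogy can be made concrete with a small sketch. In FM stereo broadcasting, the signal carries a sum channel (left plus right) that a monophonic receiver plays directly, and a difference channel (left minus right) that a stereo receiver uses to recover both sides. The Python below illustrates only that one idea under simplifying assumptions: samples are plain numbers, and the function names (multiplex, mono_receiver, stereo_receiver) are invented here. It makes no claim about how a grammar is actually encoded.

```python
def multiplex(left, right):
    """Encode two channels as (sum, difference) pairs, one pair per sample."""
    return [(l + r, l - r) for l, r in zip(left, right)]


def mono_receiver(signal):
    """A simpler receiver: uses only the sum channel and ignores the rest."""
    return [s for s, _ in signal]


def stereo_receiver(signal):
    """A fuller receiver: combines sum and difference to recover both channels."""
    left = [(s + d) / 2 for s, d in signal]
    right = [(s - d) / 2 for s, d in signal]
    return left, right


# Hypothetical sample values, chosen only for illustration.
left = [1.0, 2.0, 3.0]
right = [0.5, 0.0, -1.0]
signal = multiplex(left, right)

mono = mono_receiver(signal)
recovered_left, recovered_right = stereo_receiver(signal)
```

Both receivers are handed the same signal; they differ only in how much of it they can use, which is the sense of 'same sounds' in the analogy.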

How I Try to Solve the Problem

The answer to the problem posed above (why children under construction learn languages more easily than fully built adults) lies in the examination of problems that can be solved by parallel computational processes operating on a shared memory space. We specifically would like a machine optimized to play dominoes, assemble jigsaw puzzles, or work crossword puzzles. It is not clear that a parallel processor optimized to play chess would be the correct design for solving natural language problems. Since such computers are scarce and expensive, and usually designed to solve partial differential equations and not to solve linguistic problems or play games like dominoes, research using such devices is restricted.

Lacking hardware, the next best thing is to try to formulate universal grammar as a logical constraint based system in which there is no order among the constraints. Researchwise this is an excellent idea, somewhat like charging straight up the steepest side of the mountain in order to get to the top the fastest. Pedagogically, it horrifies some students with its formalisms and blunts the enthusiasm of others via the dormitive powers of its notational tedium. Constraint based formalisms of universal grammar are not for beginners, the faint of heart, or the easily bored. This is best left for dedicated students in somewhat advanced classes.
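
What 'no order among the constraints' amounts to can be shown with a deliberately tiny sketch in Python. A candidate word string is accepted only if every constraint holds, so the set of survivors is the same whichever constraint is checked first. The two constraints, the three-word vocabulary, and all the names below are invented for illustration; they are toy stand-ins, not a formalization of universal grammar.

```python
from itertools import permutations

# Hypothetical toy lexicon, invented for this example.
DETERMINERS = {"the"}
NOUNS = {"dog"}
VERBS = {"barks"}


def determiner_precedes_noun(words):
    """Every determiner must be immediately followed by a noun."""
    for i, word in enumerate(words):
        if word in DETERMINERS:
            if i + 1 >= len(words) or words[i + 1] not in NOUNS:
                return False
    return True


def verb_not_initial(words):
    """The first word must not be a verb."""
    return words[0] not in VERBS


def survivors(candidates, constraints):
    """Keep the candidates that satisfy all constraints, in whatever order given."""
    return {c for c in candidates if all(check(c) for check in constraints)}


candidates = set(permutations(["the", "dog", "barks"]))
constraints = [determiner_precedes_noun, verb_not_initial]

# Apply the constraints in every possible order and collect each result.
results = [survivors(candidates, order) for order in permutations(constraints)]
```

Each entry of results holds the single ordering 'the dog barks'. The constraints act as independent filters on one shared set of candidates, with no constraint depending on another having run first.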

In most teaching, I use linguistics examples cast into Prolog. Prolog, short for PROgramming in LOGic, is a computer language that is easy to learn and use. A big plus is the fact that it costs nothing. There are excellent versions of Prolog for the IBM PC and the Macintosh that are freeware or shareware, quite error free, easy to install, and as functional as expensive versions.
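
To give a sense of the kind of linguistics example meant here, the sketch below recognizes sentences of a tiny phrase-structure grammar. It is written in Python rather than Prolog only so that it runs without a Prolog installation; a Prolog version would state the same rules more directly. The grammar, the lexicon, and the function names are invented for illustration and do not come from any course materials.

```python
# Hypothetical toy lexicon: each word has exactly one category.
LEXICON = {
    "the": "Det", "a": "Det",
    "child": "N", "language": "N",
    "learns": "V", "sleeps": "V",
}

# Hypothetical toy rules: S -> NP VP, NP -> Det N, VP -> V NP | V
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
}


def parse(symbol, words, start):
    """Return every position where `symbol` can end if it begins at `start`."""
    if symbol not in RULES:  # a lexical category such as Det, N, or V
        if start < len(words) and LEXICON.get(words[start]) == symbol:
            return {start + 1}
        return set()
    ends = set()
    for expansion in RULES[symbol]:  # try each alternative, as backtracking would
        positions = {start}
        for part in expansion:
            positions = {e for p in positions for e in parse(part, words, p)}
        ends |= positions
    return ends


def is_sentence(text):
    """True if the whole word string can be analyzed as an S."""
    words = text.split()
    return len(words) in parse("S", words, 0)
```

For instance, is_sentence("the child learns a language") and is_sentence("the child sleeps") both hold, while is_sentence("child the learns") does not. The grammar has no left-recursive rule, so the recursion always terminates.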

Who and Me Solve the Problem and How and Where

Advanced students and me discuss the type of hardware implementations required to build a universal grammar type of computer that could learn coordinate, subordinate, wh-, passive, etc. on the basis of the data available to a child.

Less advanced students and me discuss the problems involved in formulating concepts of universal grammar, minimalism, principles and parameters, and so on in terms of logical constraint based formalisms.

Beginners and everyone else and me work on developing Prolog implementations of universal grammar. Prolog, its procedural pollutions and the cut (!) notwithstanding, is an excellent introduction to logical constraint based systems.

Tuesday noon finds me lunching on tortellini at the NYU Violet Cafe discussing the convolutions of HTML and the WWW with the HTML Gesellschaft mit Stammtisch.

How Can I Find Out More?

You can examine my ideas about multiplex problems and cross-point switches in this book mainly aimed at people interested in formal symbolic processing systems:

You can see recent linguistic ideas (mainly ideas basic to Noam Chomsky's view of universal grammar, minimalism, principles and parameters, and so on) applied to linguistic problems in these books:

Courses Offered by Prof. Dougherty

Some courses are on-line in HTML format.
Some courses have a PostScript description (pos) that you can download and print.
Some courses have a zipped PostScript file (zip) that you can expand and print.

Undergraduate Courses

Graduate Courses

Selected List of Publications and Works in Progress

Who are You?

If you are interested in any of the above topics, please let me know. There are many opportunities for students at NYU at the undergraduate, graduate, and post-graduate levels in computational linguistics. For information about the Linguistics Department and application information, contact linguistics@nyu.edu.

For further information about any of the topics mentioned on my pages, or information about the research opportunities at NYU, please drop me a note at doughert@acf4.nyu.edu.