NYU Linguistics

Agent-based Modeling and Microparametric Variation: Modeling the Evolution of Vowel Harmony
K. David Harrison, Mark Dras, Berk Kapicioglu, and Eric Aaron

Affiliation: Institute for Research in Cognitive Science
University of Pennsylvania
Mail address (for all authors):
3401 Walnut St, Suite 400A
Institute for Research in Cognitive Science
University of Pennsylvania
Philadelphia PA 19104-6228
Email addresses:
kdh2@linc.cis.upenn.edu
madras@linc.cis.upenn.edu
bkapacio@linc.cis.upenn.edu
eaaron@linc.cis.upenn.edu

A central tenet of the study of microparametric variation (MPV) is that closely related languages will reveal which observable parameters of languages are correlated; in particular, this correlation can lead to the discovery of more abstract parameters underlying the behavior of some of the more surface parameters (Kayne, 2000). The analysis of MPV has also been a motivation for work in phonology (e.g. Repetti, 2000). Kayne notes that the study of MPV is a way of approximating diachronic experimentation on a language, in which the linguist (if it were possible) would alter a parameter of a language and observe what followed. Here we propose software agent-based simulation of such an experiment as a complementary methodology for examining the feasibility of abstracting parameters through analysis of related languages. Our work suggests that changes in quite disparate features of a language may be related, so that it may be odd to hold a single abstract parameter responsible for them.

Modeling diachrony in a mathematical framework is useful for investigating hypotheses about parametric variation, both the process they imply and their consequences, particularly for changes that might occur over multiple generations in a community. However, there is only a small body of work on building mathematically-based models to evaluate language variation. One type (Clark and Roberts, 1993; Briscoe, 1999) models the coevolution of a language acquisition device and the syntax of a language: the "language agent" has parameters which change according to agent interactions and the resulting "fitness" of the agents. The other type does not presuppose teleological agents that have a particular fitness goal; changes arise from community interaction. Early work in modeling language change in this way (Kroch, 1989) imposed particular structure on the data, specifically the S-shaped trajectory of historical change; later work is interested in how the S-shape observed in data can emerge from simple parameter interaction. Of these, macro models that model the behavior of the whole speech community through mathematical equations (Niyogi and Berwick, 1997) have fundamental problems because of the complexity of interactions (Briscoe, 2000). Software agent-based simulations are one feasible way to model the sort of complex system that arises in a speech community. However, the models that have been built so far (Satterfield, 2000; Zuraw, 2000) are of domains where empirical data are sparse.

All of these (bar Zuraw's) are models of syntax. However, the actual S-shape of historical data with respect to syntactic change is not uncontroversial (Aitchison, 2001). In addition, the data are problematic in terms of both quantity and ambiguity. Instead, we look at phonology: our domain is the evolution of vowel harmony in Turkic languages, and the aim is to find a model that fits the S-shaped historical data, for which there is an extensive written record over a long time period. The various languages that are descended from Old Turkish differ with respect to their vowel harmony, either retaining or losing their root or affix backness harmony. We look for a model which posits one or more factors that can plausibly explain the differentiation, such that the change predicted by the model matches the historical data. Restricting our model to language-internal factors, we show that factors such as asymmetry of production or recognition errors could lead to the decline in harmony seen from Old Turkish to Uzbek. More interestingly, an apparently unrelated grammaticalization could also lead to a decline in harmony. As a specific instance, both Turkish and Uzbek have a third person possessive suffix -i, which was grammaticalized from an independent lexical third person pronoun (Poppe, 1965). Early in its status as an affix it did not undergo harmony (Menges, 1968); later in Turkish it developed dual forms, while Uzbek kept only the original palatal. Later Uzbek has also generally lost harmony in roots and affixes (e.g. the reduction of plurals to the single form -lar). Our software agents, using probabilistic (Bayesian) decision processes, model a reanalysis of harmony based on the Uzbek variant of the grammaticalized construction. Over time in a community of agents this initial grammaticalization becomes self-reinforcing, so that other grammaticalizations are similarly reanalyzed, leading to a decay of the harmony system. This suggests that not all correlated parameter variations need to be modeled by abstract parameters.
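The error-asymmetry scenario can be illustrated with a toy simulation. The sketch below is not the authors' model: the function name, the error rates, the population size, and the Beta(1, 1) prior are all assumptions chosen for illustration. Each learner hears a sample of affix tokens from the previous generation, misperceives harmonic tokens as disharmonic more often than the reverse, and adopts the posterior mean of a simple Bayesian estimate as its own harmony rate.

```python
import random


def simulate(generations=40, n_agents=100, n_tokens=50,
             p_lose_harmony=0.10, p_gain_harmony=0.01,
             initial_harmony=0.95, seed=0):
    """Toy community model; every parameter value is a hypothetical choice.

    Each agent is a single number: the probability that it produces the
    harmonic variant of an affix. Returns the community mean per generation.
    """
    rng = random.Random(seed)
    population = [initial_harmony] * n_agents
    history = [sum(population) / n_agents]

    for _generation in range(generations):
        learners = []
        for _learner in range(n_agents):
            harmonic_heard = 0
            for _token in range(n_tokens):
                speaker = rng.choice(population)
                produced_harmonic = rng.random() < speaker
                # Asymmetric recognition error: harmony is lost in
                # perception more often than it is spuriously gained.
                if produced_harmonic:
                    perceived_harmonic = rng.random() >= p_lose_harmony
                else:
                    perceived_harmonic = rng.random() < p_gain_harmony
                if perceived_harmonic:
                    harmonic_heard += 1
            # Bayesian step: posterior mean under a uniform Beta(1, 1) prior.
            learners.append((1 + harmonic_heard) / (2 + n_tokens))
        population = learners
        history.append(sum(population) / n_agents)
    return history


history = simulate()
print("initial mean harmony rate:", round(history[0], 3))
print("final mean harmony rate:", round(history[-1], 3))
```

Under these assumed settings the community mean drifts toward a mostly disharmonic equilibrium. The sketch does not attempt to reproduce the S-shaped trajectory or the grammaticalization-driven reanalysis described above; it shows only how a perceptual asymmetry compounds across generations of learners.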


References:

Aitchison, J. (2001) Language Change: Progress or Decay? Cambridge University Press. Cambridge, UK.

Briscoe, E. (1999) Grammatical Acquisition and Linguistic Selection. In Linguistic evolution through language acquisition: formal and computational models, (ed.) Briscoe, E. Cambridge University Press. Cambridge, UK.

Briscoe, E. (2000) Macro and micro models of linguistic evolution. Proceedings of the 3rd International Conference on Language and Evolution. Paris, France.

Clark, R. and I. Roberts (1993) A Computational Model of Language Learnability and Language Change. Linguistic Inquiry, 24(2): 299-345.

Kayne, R. (2000) Parameters and Universals. Oxford University Press. Oxford, UK.

Kroch, A. (1989) Reflexes of grammar in patterns of language change. Language Variation and Change, 1: 199-244.

Menges, K. (1968) The Turkic Languages and Peoples. Otto Harrassowitz, Wiesbaden, Germany.

Niyogi, P. and R. Berwick (1997) Evolutionary Consequences of Language Learning. Linguistics and Philosophy, 20: 697-719.

Poppe, N. (1965) Introduction to Altaic Linguistics. Otto Harrassowitz, Wiesbaden, Germany.

Repetti, L. (2000) Phonological Theory and the Dialects of Italy. John Benjamins. Amsterdam, The Netherlands.

Satterfield, T. (2000) The Socio-Genetic Solution: A New Look at Language Genesis through SWARM Modelling. Ms., Univ. of Michigan, Ann Arbor.

Zuraw, K. (2000) The Extension of Semiproductive Morphophonemic Processes to Novel Forms: An Optimality-Theoretic Model and Simulation. Paper presented at WCCFL XX, Univ. of Southern California.

Last updated September 18, 2001