Coarticulation & Models

What is coarticulation?

source: http://oldlowlight.co.uk/events/easter-egg-trail-at-the-old-low-light/

Speech production is usually described as a sequence of discrete articulatory gestures. Articulation is the movement of one articulator toward another, and each articulatory movement aims at an exact target when a speech segment is produced. In the flow of speech, however, segments do not stand alone. As Hockett (1955) suggested, speech production is like carrying a row of Easter eggs along a conveyor belt and smashing them into pieces. The Easter eggs stand for the seemingly discrete speech segments, while the smashed pieces represent the final output of speech production, which is what the concept of "coarticulation" captures.

Building on this image, coarticulation refers to the contextual influence of surrounding segments, which produces overlapping movements in the production of a phonetic unit. For instance, in the English word "play," the normally voiced alveolar lateral /l/ becomes voiceless under the influence of the preceding voiceless labial stop /p/. Coarticulation is classified by the direction of its effect into two types: right-to-left (regressive or anticipatory coarticulation) and left-to-right (progressive or retentive coarticulation). In the former, a following segment imposes its phonological features on the preceding segment, as in the nasalization of /i/ in "bean" under the influence of the following nasal. Another example is the influence of vowel backness on a preceding consonant: in /ki/, the front vowel /i/ pulls the velar stop forward, so that the constriction is made close to the hard palate rather than at the velum. In the latter, a preceding segment exerts its phonological features on the following segment, as in the devoicing of /l/ in "play."

Coarticulatory Models

With the basic idea of coarticulation in place, let's proceed to a more advanced question: how have researchers modeled coarticulation in modern theories of speech production? The following paragraphs briefly introduce two coarticulatory models: (1) a feature-based model proposed by Henke (1966), and (2) a window model proposed by Keating (1990). This section focuses largely on the latter.

Feature-Based Model

This model takes the phonological features of speech segments as inputs and examines how those features change in the motor-implementation output. Its primary contribution is the idea that more than one level lies between segmental input and speech output, with features serving as the intermediate level. The model also allows researchers to track coarticulation over a long span; that is, it shows how coarticulation affects the speech input across several segments, not only the single neighboring segment. In addition, the model is parsimonious in its assumptions about input and output (for example, a given sound can be reduced to a few features without listing other, redundant ones). However, there are pitfalls as well, and one of the biggest is the compatibility criterion. This criterion allows a feature to be projected earlier than its parent segment, as long as the articulatory gesture associated with the feature does not conflict with the segment currently being produced. This is problematic because some data show that anticipatory coarticulation may occur even when the feature values (defined along physiological dimensions) conflict. It is even more problematic when a segment is produced with overlapping articulatory gestures.
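The look-ahead idea behind a feature-based model can be sketched in a few lines of Python. This is only a toy illustration, not Henke's actual implementation: the function name `look_ahead_spread`, the feature specifications, and the reading of "compatible" as "the current segment is unspecified for this feature" are all assumptions made for the example.

```python
from typing import Optional

# Each segment is (label, feature dict). None marks a feature that the segment
# leaves unspecified. These toy specifications are assumptions for illustration.
Segment = tuple[str, dict[str, Optional[bool]]]


def look_ahead_spread(segments: list[Segment], feature: str) -> list[Optional[bool]]:
    """Fill unspecified values of `feature` with the next specified value to the right."""
    values = [feats.get(feature) for _, feats in segments]
    upcoming = None  # nearest specified value seen so far, scanning right to left
    for i in range(len(values) - 1, -1, -1):
        if values[i] is None:
            # Assumed compatibility check: an unspecified segment cannot conflict,
            # so the upcoming value is projected onto it early.
            values[i] = upcoming
        else:
            # A specified segment keeps its own value, which becomes the value
            # projected onto any unspecified segments before it.
            upcoming = values[i]
    return values


# Hypothetical inputs: anticipation across several segments, and a CVN sequence.
stru = [("s", {"round": None}), ("t", {"round": None}),
        ("r", {"round": None}), ("u", {"round": True})]
ten = [("t", {"nasal": False}), ("ɛ", {"nasal": None}), ("n", {"nasal": True})]

print(look_ahead_spread(stru, "round"))  # [True, True, True, True]
print(look_ahead_spread(ten, "nasal"))   # [False, True, True]
```

Note that the output is categorical: the vowel in the second example is either nasal or not, with nothing in between. The sketch also cannot produce anticipation onto a segment whose own value conflicts, which is the kind of case the compatibility criterion has trouble with.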

Window Model

Keating (1990) proposed a new model to account for coarticulation from both phonological and phonetic perspectives. She held that coarticulation, like other phonetic descriptions or linguistic items, follows phonological and phonetic rules. Keating divided coarticulation into these two layers and aimed for a model that could adequately explain coarticulatory gestures at both levels. In this account, the phonological representation of coarticulation is feature-based and categorical, as in feature spreading (which, according to Keating, is not adequate to capture the temporal dimension of particular segmental movements), while the phonetic representation accounts for the spatial and temporal aspects of speech production as well as the gradient nature of speech phenomena. Coarticulation, however, involves overlapping gestures, articulator transitions, and gestural assimilations across segmental boundaries, which cannot easily be captured by features. For example, the intermediate degrees of nasality between vowels and nasal consonants in English are hard to represent if we only have a binary feature model that shows [+nasal] or [-nasal]. Thus, Keating proposed the window model to account for the spatially continuous representations of coarticulation.

Unlike the traditional target-based model proposed by MacNeilage in the 1970s, which converted abstract features to invariant targets (or physical points), this proposal replaces the fixed target of a given phonemic articulation with a range of values, bounded by the maximum and minimum spatial values that the segment takes across the contexts in which it occurs. The range of acceptable values forms a “window,” which describes the spatial variability of a given speech unit and varies from context to context. Window width also differs across speech segments. If the space available to the active and passive articulators is relatively limited, the window for the segment is narrow; if not, the window is wide. In this regard, windows are defined over allophones rather than invariant phonemes.
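The contrast between a fixed target and a window can be made concrete with a small data structure. The sketch below is an illustration under stated assumptions, not part of Keating's proposal: the class name `Window`, the normalized 0 to 1 articulatory scale, and the numeric bounds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """A range of acceptable values on one articulatory dimension."""
    lo: float  # minimum spatial value observed across contexts
    hi: float  # maximum spatial value observed across contexts

    def __post_init__(self) -> None:
        if self.lo > self.hi:
            raise ValueError("lo must not exceed hi")

    @property
    def width(self) -> float:
        return self.hi - self.lo

    def allows(self, value: float) -> bool:
        """True if `value` is an acceptable realization of the segment."""
        return self.lo <= value <= self.hi


# Hypothetical bounds on a normalized 0..1 scale.
narrow = Window(0.8, 1.0)  # tightly constrained segment
wide = Window(0.1, 0.9)    # segment that tolerates a lot of contextual variation

print(narrow.allows(0.5), wide.allows(0.5))  # False True
```

A fixed target would correspond to the degenerate case `lo == hi`. Widening the range is what lets the same segment surface with different values in different contexts.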

To show how the window model is implemented, Keating provided several concrete examples that combine narrow and wide windows and trace the path from one window to the next. Evidence from English velum and jaw positions illustrates how the model works. Taking velum position as the dimension to be measured, the window for English vowels is much wider than the windows for consonants, with nasal consonants low and oral consonants high. The value of nasality in a CVN sequence (such as the word “ten”) is drawn as a smooth curve from one window to the next. English vowels do not contrast in nasality and thus used to be labeled [-nasal]. However, the interpolated curve from one window to another shows the velum lowering through the vowel, which indicates that the vowel is nasalized throughout, with varying degrees of velum lowering. Another example concerns jaw position in English vowels and consonants, a phonetic gesture that cannot simply be assigned to a single feature but can be captured within the window model. In the sequence “stræ,” the jaw position window is narrowest for /s/ and widest for /æ/. The successive change of jaw position from /s/ to /æ/ is described in the window model, whereas traditional models do not fully explain it (they rely only on a combination of feature spreading and target shifts, and do not address the detailed movement). This is the evidence Keating offered to demonstrate how her window model can explain successive movement within segments as well as across them.
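The CVN example can be sketched as an interpolation through three windows. The code below is a rough illustration only: Keating describes smooth curves, whereas this sketch uses straight-line interpolation, treats the midpoints of narrow windows as anchor points, and uses made-up window bounds on a 0 to 1 velum-height scale (1 = fully raised). The names `velum_contour`, `_interpolate`, and the `narrow` threshold are hypothetical.

```python
def _interpolate(anchors, t):
    """Piecewise-linear value at time t, held flat outside the anchor span."""
    if t <= anchors[0][0]:
        return anchors[0][1]
    if t >= anchors[-1][0]:
        return anchors[-1][1]
    for (t0, v0), (t1, v1) in zip(anchors, anchors[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def velum_contour(windows, samples_per_segment=4, narrow=0.3):
    """Sample a path that passes through every (lo, hi) window in sequence."""
    # Assumption: only narrow windows contribute anchor points, placed at the
    # temporal center of the segment and the midpoint of the window.
    anchors = [(i + 0.5, (lo + hi) / 2)
               for i, (lo, hi) in enumerate(windows) if hi - lo <= narrow]
    if not anchors:
        raise ValueError("need at least one narrow window to anchor the path")
    contour = []
    for i, (lo, hi) in enumerate(windows):
        for k in range(samples_per_segment):
            t = i + (k + 0.5) / samples_per_segment
            value = _interpolate(anchors, t)
            contour.append(min(max(value, lo), hi))  # stay inside the window
    return contour


# Hypothetical windows for "ten": oral stop (high), vowel (wide), nasal (low).
ten_windows = [(0.8, 1.0), (0.1, 0.9), (0.0, 0.2)]
contour = velum_contour(ten_windows)
vowel_part = contour[4:8]  # the four samples that fall inside the vowel
print([round(v, 2) for v in vowel_part])
```

Under these assumptions the samples inside the vowel fall steadily from the oral consonant's range toward the nasal's range, so the vowel is partially nasalized throughout rather than being categorically [-nasal] or [+nasal].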

One thing that Keating did not address in much detail is the shape of the trajectory between windows. The variability of possible paths is still not fully understood, which is a major weakness of the model. However, she makes a serious effort to relate the model to the underlying phonemic features of speech. The width of a window carries information about the features and contexts of a given segment. For example, she states that features that are phonologically unspecified result in a wide window and allow many possible coarticulatory gestures. She also sheds light on the concept of coarticulatory resistance, which concerns a given segment’s sensitivity to its environment. If resistance is low, the segment is highly variable, and vice versa. Keating claims that this property is also preserved in the window model and is represented through window width. Overall, the model offers future researchers an innovative way to think about coarticulation in relation to phonological representations.
