Norris et al. (2003)
Introduction
Infants at about 6 months of age can still discriminate phonetic contrasts in non-native languages, but they lose this ability by 10 months of age in favor of better discrimination of native-language phonetic contrasts. This is important for minimal pair discrimination. Phonetic categories also seem to be modulated by lexical information, which may account for adjustment to regional dialect and pronunciation differences.
Does lexical information feed back to the pre-lexical selection of phonemes? This study tests that question. Feedback for learning is treated as distinct from "on-line" feedback.
Experiment 1
For training, three groups of listeners heard recordings of a fluent Dutch speaker. Group 1 heard 20 words ending with an ambiguous sound on the /f/-/s/ continuum, followed by 20 words with unambiguous /s/ sounds. Group 2 heard 20 words ending with an ambiguous sound on the /f/-/s/ continuum, followed by 20 words with unambiguous /f/ sounds. The third group was a control and heard ambiguous /f/-/s/ continuum endings on all words. The groups were then given a lexical decision task and a phoneme categorization task on 3 lists of words similar to the training words, with some filler. Results: Unsurprisingly, subjects were faster to label words with unambiguous endings as legal words than words with ambiguous endings. Subjects in the first group, who heard ambiguous /f/ sounds and unambiguous /s/ sounds, were more likely to categorize an ambiguous sound as /f/, and the opposite was true of the second group. This supports the hypothesis of feedback from lexical context cues. However, the authors note that the result could instead reflect selective adaptation to a fricative along the /f/-/s/ continuum, or "a contrast between the ambiguous phoneme and the unambiguous endpoint" (p.15).
Experiment 2
Kraljic et al. (2006)
Introduction
Listeners attempt to make auditory perception more consistent through a process called normalization, which compensates for fluctuations in the acoustic signal. They do this by maintaining a few broadly defined phonemic categories that can absorb variation in speech. Recent research has shown that people may adjust to speakers by narrowing these phonetic categories based on context and speaker variation, which the text refers to as perceptual learning or perceptual recalibration. The article cites Norris et al. (2003), in which listeners adjusted their perceptual categories to fit the acoustic signal in an ambiguous context, but goes on to suggest that perceptual learning may happen at the speaker, word, or even phoneme level. The following experiment tests the extent of listeners' perceptual learning when they are presented with a speaker who produces an odd phoneme, in this case /d/.
Experiment 1
The experiment was similar to Norris et al. (2003): listeners heard a series of words containing an ambiguous /d/-/t/ sound, randomly inserted in a list of unambiguous words (in place of an unambiguous /d/ or /t/ sound, as in croco?ile [crocodile] and to?al [total]). The hypothesis was that if perceptual learning occurred, listeners who heard the ambiguous sound in /d/ words during the lexical decision task would later be more likely to classify it as /d/, due to lexical knowledge and adaptation to the speaker's accent. Listeners also categorized sounds on a /b/-/p/ continuum. If listeners shift their perceptual categories in a general way, the shift should also appear on other continua; if the shift is specific, it should not. Results: Listeners performed very well on the lexical decision task. As in the Norris et al. (2003) study, listeners exposed to ambiguous /d/-/t/ phonemes in words that should contain /d/ categorized ambiguous phonemes as /d/ (or as /b/ on the other continuum) more often than the group exposed to ambiguous /d/-/t/ phonemes in place of /t/. Control groups did not show this effect. EDIT: This study was run with multiple speakers, which is significant because listeners carried their learning over between voices.
Discussion
This study provides evidence that listeners adjust perceptual categories on a per-speaker and per-phoneme basis, even for stop consonants, and that the adjustment generalizes, as shown by the effect on the /b/-/p/ continuum. In addition, listeners are able to apply perceptual learning to other speakers, which is useful for coping with inconsistencies in speech such as accents.
Clayards et al. (2008)
Introduction
Speech perception is difficult because the auditory signal is not only noisy but also highly variable across contexts and speakers, and even across minimal pairs from the same speaker. To resolve ambiguities, listeners draw on probability distributions when making decisions about ambiguous words. This paper examines the "ideal observer" model, the theory that listeners use all probabilistic cues for a word in the auditory environment to compensate for perceptual noise. Here, probability distribution information is defined as the number of times a listener has heard a phoneme as a member of a certain category. Question: How do we go from the category likelihood graph in fig. 1 to the probability response graph? Under the ideal observer model, listeners make a prediction based on probability information about category boundary overlap as well as the likely amount of overlap given the categories. Evidence has shown that listeners are, to a certain extent, aware of within-category differences, which supports the ideal observer model.
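One possible way to picture the step from category likelihoods to response probabilities is Bayes' rule applied at each point on the cue continuum. The sketch below is only an illustration under stated assumptions: each category is modeled as a Gaussian over VOT, and the means (0 and 50 ms), the standard deviation (10 ms), and the equal priors are made-up values, not figures taken from the paper. The function names are hypothetical.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Likelihood of cue value x under a Gaussian category."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_p(vot_ms, mean_b=0.0, mean_p=50.0, sd=10.0, prior_p=0.5):
    """P(/p/ | VOT) for two Gaussian categories, via Bayes' rule.

    All default parameter values are illustrative assumptions.
    """
    weighted_p = gaussian_pdf(vot_ms, mean_p, sd) * prior_p
    weighted_b = gaussian_pdf(vot_ms, mean_b, sd) * (1.0 - prior_p)
    return weighted_p / (weighted_p + weighted_b)

# Sweep the continuum: the two bell-shaped likelihood curves turn into
# one S-shaped response curve that crosses 0.5 where the curves intersect.
for vot in range(0, 51, 10):
    print(f"VOT {vot:2d} ms -> P(/p/) = {posterior_p(vot):.3f}")
```

Under these assumptions, widening the category distributions (a larger sd) makes the response curve shallower, so the shape of the whole distribution, not just the boundary, affects the predicted responses.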
Experiment
To determine whether listeners are sensitive to an entire probability distribution for a phonetic cue, the researchers spread phonetic tokens out along a VOT scale and tested subjects on category judgment. Subjects were given a 4-picture forced-choice task with two candidate pictures based on the phoneme continuum (e.g. a peach and a beach) and two irrelevant pictures, while wearing an eye tracker that recorded gaze time and decision time for the pictures. As predicted, the subjects' probability curves matched the hypothesized curves, and the participants' uncertainty (as measured by gaze time at incorrect objects) also matched what would be expected under the ideal observer model. However, the researchers note that in actual speech perception, as opposed to the confines of this experiment, the auditory signal would carry more information, which the listener may also rely on. Question: Clarify the idea of posterior probability.
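As a tentative note on the posterior probability question: the posterior is the probability of a category after the cue has been heard, obtained by weighting each category's likelihood by its prior and normalizing. The symbols below (c for a category, x for the observed cue value) and the numbers in the worked line are illustrative assumptions, not values from the paper.

```latex
% Posterior for category c given cue value x (Bayes' rule)
P(c \mid x) = \frac{p(x \mid c)\, P(c)}{\sum_{c'} p(x \mid c')\, P(c')}

% Worked example with made-up numbers: equal priors of 0.5,
% likelihoods p(x | /p/) = 0.03 and p(x | /b/) = 0.01
P(\text{/p/} \mid x) = \frac{0.03 \times 0.5}{0.03 \times 0.5 + 0.01 \times 0.5}
                     = \frac{0.015}{0.020} = 0.75
```

On this reading, a listener hearing that token would be expected to choose the /p/ picture about 75% of the time and to show some residual consideration of the /b/ picture.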