Norris et al. (2003)

Introduction


Infants at ~6 months of age can still discriminate phonetic contrasts in non-native languages, but lose this ability by 10 months of age in favor of better discrimination of native-language phonetic contrasts; this is important for minimal pair discrimination. Phonetic category judgments also seem to be modulated by lexical information, perhaps accounting for adaptation to regional dialect/pronunciation changes.


Does lexical information feed back to pre-lexical processing of phonemes? This study tests that question. Feedback systems for learning are distinct from "on-line" feedback systems.


Experiment 1


Subjects were presented with recordings of a fluent Dutch speaker for training, in 3 groups. Group 1 heard 20 words ending in an ambiguous sound from the /f/-/s/ continuum followed by 20 words with unambiguous /s/ sounds. Group 2 heard 20 words ending in the ambiguous /f/-/s/ sound followed by 20 words with unambiguous /f/ sounds. The 3rd group was a control and heard ambiguous /f/-/s/ continuum endings for all words. Groups were then given a lexical decision and phoneme categorization task on 3 lists of words similar to the training words, with some fillers. Results: Subjects were faster to label words with unambiguous endings as legal words than words with ambiguities, unsurprisingly. Subjects who heard ambiguous sounds in place of /f/ plus unambiguous /s/ sounds were more likely to categorize an ambiguous sound as /f/, and the opposite was true of the second group. This supports the hypothesis of feedback from lexical context cues. However, the authors note that this result could have been the result of selective adaptation to a fricative along the /f/-/s/ continuum, or of "a contrast between the ambiguous phoneme and the unambiguous endpoint." (p. 15)


Experiment 2


UPDATE THIS POSTING

Kraljic et al. (2006)

Introduction


Listeners attempt to make auditory perception more consistent through a process called normalization, which compensates for fluctuations in the acoustic signal. They do this by maintaining a few broadly defined phonemic categories in order to account for variation in speech. Recently, research has shown that people may adjust to speakers by narrowing these phonetic categories based on context and speaker variation, which the text refers to as perceptual learning or perceptual recalibration. The article references Norris et al. (2003), in which listeners adjusted their perceptual categories to accommodate the acoustic signal in an ambiguous context, but goes on to suggest that perceptual learning may happen at a speaker-, word-, or even phoneme-specific level. The following experiment was conducted to test the extent of the perceptual learning listeners show when presented with a speaker who produces an odd phoneme, in this case /d/.


Experiment 1


The experiment run was similar to Norris et al. (2003): listeners heard a series of words with an ambiguous /d/-/t/ sound randomly inserted into a list of unambiguous words, in place of an unambiguous /d/ or /t/ (e.g., croco?ile and to?al [total]). The hypothesis was that if perceptual learning occurred, listeners would be more likely to classify these sounds as /d/ in a later categorization task, due to lexical knowledge and adaptation to the speaker's accent. Listeners were also required to place /b/-/p/ sounds on a continuum, with the hypothesis that if listeners shift their perceptual categories in a general way, we would observe the shift on other continua, but if the shift is more specific, we would not see this effect. Results: Listeners performed very well on the lexical decision task, and as in the Norris et al. (2003) study, listeners tested with ambiguous /d/-/t/ phonemes in words that should have a /d/ categorized ambiguous phonemes as a /d/ (or, on the other continuum, as a /b/) more than the group exposed to ambiguous /d/-/t/ phonemes in place of a /t/. Control groups did not show this effect. EDIT: This study was run with multiple speakers, which is significant because listeners transferred their learning between voices.


Discussion


This study gives evidence for the theory that listeners adjust perceptual categories for stop consonants in a domain-general rather than a strictly speaker- or phoneme-specific way, as evidenced by the shift appearing on the /b/-/p/ continuum as well. In addition, listeners are able to apply perceptual learning to other speakers, which is useful for understanding inconsistencies in speech like accents.

Clayards et al. (2008)

Introduction


Speech perception is difficult because the auditory signal is not only noisy but highly variable across contexts and speakers, even for minimal pairs from the same speaker. To resolve ambiguities, listeners make use of probability distributions in making decisions about ambiguous words. This paper examines the "ideal observer" model, i.e., the theory that listeners make use of all probabilistic cues to a word in the auditory environment in order to compensate for perceptual noise. Here, probability distribution information is defined by how often a listener has heard a phoneme as a member of a certain category. Question: How do we go from the category likelihood graph in fig. 1 to the probability response graph? (The posterior formula in the Experiment section below provides the link: the response probability for a category is its likelihood normalized by the summed likelihoods of all categories.) Listeners under the ideal observer model make a prediction based on probability information about where the categories overlap as well as how much overlap is likely given the categories. Evidence has shown that, to a certain extent, listeners are aware of within-category differences, which supports the ideal observer model.


Experiment


In order to determine whether listeners are sensitive to an entire probability distribution for a phonetic cue, the researchers spread phonetic tokens out along a VOT scale and tested subjects on category judgments. Subjects were given a 4-picture forced-choice task with two possible pictures based on the phoneme continuum (e.g. a peach and a beach) and two irrelevant pictures, while wearing an eye tracker to record gaze time and decision time for the pictures. As predicted, the results showed that the subjects' probability curves matched the hypothesized curves, and the participants' uncertainty (as measured by gaze time at incorrect objects) also matched what would be expected under the ideal observer model. However, the researchers note that in actual speech perception, as opposed to this experiment's confines, the auditory signal would carry more information, which the listener may also rely on. Question: Clarify the idea of posterior probability. P(b|x) = P(x|b)P(b) / [P(x|b)P(b) + P(x|p)P(p)], where P = probability, b and p are the categories /b/ and /p/, and x is the observed VOT.
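
To make the posterior idea concrete, here is a minimal Python sketch of the ideal-observer categorization, assuming Gaussian VOT likelihoods for /b/ and /p/; the means, standard deviation, and prior are round illustrative numbers, not the paper's fitted values.

{{{#!python
import math

def gaussian_pdf(x, mean, sd):
    # Likelihood P(x | category) under a Gaussian VOT distribution.
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def posterior_b(vot, mean_b=0.0, mean_p=50.0, sd=10.0, prior_b=0.5):
    # Bayes' rule: P(b|x) = P(x|b)P(b) / [P(x|b)P(b) + P(x|p)P(p)].
    num = gaussian_pdf(vot, mean_b, sd) * prior_b
    den = num + gaussian_pdf(vot, mean_p, sd) * (1 - prior_b)
    return num / den

# Sweeping x (VOT) traces out the S-shaped response curve; a larger sd
# (more within-category variability) flattens the curve, i.e. more
# uncertainty near the category boundary.
for vot in range(0, 51, 10):
    print(f"VOT {vot:2d} ms -> P(/b/) = {posterior_b(vot):.3f}")
}}}

The shape of the response curve thus falls straight out of the category likelihoods plus Bayes' rule, which is the answer to the fig. 1 question above.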

Battaglia et al. (2004)

Introduction


The article begins by explaining the phenomenon of adaptation to visual blur: although there is some amount of blur in most retinal images due to differences in eye focus and object depth, those objects typically seem in focus to our perceptual systems due to accommodation by the lens and correction by the brain. Although our retinas "see" blur, instead of assuming that objects are inherently blurry, the experimenters hypothesize that the brain "knows" that those objects could be brought into focus by changing the lens shape, and corrects for the blur based on the object's distance from the eyes. The experimenters test this hypothesis by attempting to establish a relationship between adaptation to blur and visual depth (113).
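
To make the depth-blur relationship concrete, here is a small Python sketch using the standard thin-lens blur-circle formula; the focal length and pupil diameter are round numbers loosely approximating a human eye, my own illustrative values rather than anything taken from the paper.

{{{#!python
def blur_circle(object_dist, focus_dist, focal_length=0.017, aperture=0.004):
    # Diameter (m) of the retinal blur circle for an object at
    # object_dist (m) when the eye is focused at focus_dist (m),
    # from thin-lens geometry: c = A * f * |d_o - d_f| / (d_o * (d_f - f)).
    return (aperture * focal_length * abs(object_dist - focus_dist)
            / (object_dist * (focus_dist - focal_length)))

print(blur_circle(2.0, 0.5))  # far object, near focus -> noticeable blur
print(blur_circle(0.5, 0.5))  # object at the focused distance -> zero blur
}}}

The point is that blur is lawfully tied to the depth difference between an object and the focused distance, which is the contingency the experiment below manipulates.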


Experiment


Subjects placed their heads on a chinrest to keep still while fixating on a diamond-shaped point between two images depicting a grassy texture, each filtered to a different amount of blur. The experiment was conducted in 4 stages. Stage 1, the baseline training, consisted of subjects fixating on the center point while the surfaces moved closer to and then farther away from the subject in tandem. Stage 2 consisted of a pretest of the subjects' blur-matching performance in a near and a far condition (115): they were presented with two images, one near and one far, and asked whether one looked more blurry than the other, with the blur adjusted in a "staircase" method according to the subject's response ("less blurry" results in an increase in blur, and vice versa). This continued until there were 8 reversals in the "staircase" setup, with conditions similar to Stage 1 in between trials. Stage 3 consisted of adaptation training, the same as Stage 1 except that as the images moved back and forth, their blur factor was adjusted based on simulated distance from the subject, with subjects falling into two groups: the first group's images grew blurry as the images moved away from them ("Far" group) and the second's grew blurry as they moved toward the subject ("Near" group). Stage 4 was similar to Stage 2, except that the scene changed as in Stage 3 in between trials. Comparing the results of Stages 2 and 4 shows that participants experienced a strong adaptation effect, as seen in the graphs on p. 116: their perception of blur was affected by surface depth rather than by the actual blur factor. The discussion attributes this phenomenon to the fact that the visual scene always includes novel blur on objects, as our distance from those objects is constantly changing, hence the need for adaptation. The discussion goes on to highlight two possible accounts of the underlying neural networks, 2D and 3D, discussed in more detail on page 117.
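
For reference, here is a minimal Python sketch of the kind of 1-up/1-down staircase used in Stage 2 ("less blurry" increases blur, otherwise blur decreases, stopping after 8 reversals); the step size, starting blur, and toy observer are invented for illustration.

{{{#!python
def blur_staircase(says_less_blurry, start=1.0, step=0.1, max_reversals=8):
    # says_less_blurry(blur) -> True if the subject judges the comparison
    # image less blurry than the reference at this blur level.
    blur, reversals, last_dir = start, 0, None
    history = [blur]
    while reversals < max_reversals:
        direction = +1 if says_less_blurry(blur) else -1  # up on "less blurry"
        if last_dir is not None and direction != last_dir:
            reversals += 1                                # direction change = reversal
        last_dir = direction
        blur = max(0.0, blur + direction * step)
        history.append(blur)
    return history  # the match point is estimated from the reversal levels

# Toy observer whose subjective match sits at a blur of 0.6:
trace = blur_staircase(lambda b: b < 0.6)
print(trace[-5:])  # oscillates around 0.6, the point of subjective equality
}}}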

Kleinschmidt & Jaeger (2011)

Introduction


Phonemic category recalibration (discussed in previous articles) is contrasted with selective phonemic adaptation, in which repeated exposure to a prototypical phoneme such as /b/ narrows the category boundaries for that phoneme. Some basic differences include the level of processing (low-level processing for phonemic adaptation vs. higher-level learning for recalibration), the opposite effects on categories, and the differences in cumulative exposure (recalibration grows weaker with exposure, adaptation grows stronger). Question: What perceptual processing does the adaptation reflect? Is it related to the pruning that naturally occurs in early language development? The article suggests that instead of two distinct processes, these effects can be thought of as one ongoing part of perception referred to as incremental belief updating. A full description of the model is on page 2, but in brief, the category's distribution is either widened or narrowed depending on the proximity of each stimulus to the middle of the bell curve.
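
To make incremental belief updating concrete, here is a minimal Python sketch for a single Gaussian phonetic category whose believed mean and variance are re-estimated after every token; this is a simplification of the model on page 2 (which uses proper conjugate priors), and all numbers are illustrative.

{{{#!python
class GaussianCategoryBelief:
    # Running (Welford-style) estimate of a category's mean and variance.
    # pseudo_count acts like prior confidence: how many "virtual" past
    # tokens the starting belief is worth, so early tokens move the
    # belief more than later ones.
    def __init__(self, mean, var, pseudo_count=10):
        self.mean, self.n = mean, pseudo_count
        self.m2 = var * pseudo_count

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n          # shift toward the new token
        self.m2 += delta * (x - self.mean)   # widen or sharpen the category

    @property
    def var(self):
        return self.m2 / self.n

# Repeated exposure to atypical tokens first widens the category and
# drags its mean toward them -- a belief-updating view of recalibration:
belief = GaussianCategoryBelief(mean=0.0, var=100.0)  # e.g. VOT in ms
for token in (25, 27, 26, 28):
    belief.update(token)
print(round(belief.mean, 1), round(belief.var, 1))
}}}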


UPDATE THIS POSTING

Jacobs (2002)

Overview


The visual system incorporates a great number of cues to properties such as depth. However, these cues can be more or less reliable in different contexts. What criteria we use when deciding which cues are most reliable, and how we use that information once we have it, are important questions in visual perception and cognition. One theory is that each cue yields its own depth estimate, and these estimates are then weighted and averaged together. See pages 2-3 of the article for mathematical models of this process. In addition, the ambiguity of a cue is hypothesized to be related to its estimated reliability, with the weights updated over time; a model of this kind is the Kalman filter. But do humans act as ideal observers in this regard? It is possible that we constantly update our weighting of cues based on factors like how variable a cue's depth estimates have been in the past, as well as how much the cue's estimates differ from those of other depth cues. In addition, observers take other sensory information, like haptics, into account when evaluating which cues are most reliable. The two overarching hypotheses here are the Kalman filter hypothesis and the cue correlation hypothesis, in which cues are correlated with each other and the more similar their outputs, the more reliable they are judged to be.
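
As a concrete version of the weighting-and-averaging model sketched on pages 2-3 of the article, here is a minimal Python example assuming each cue's weight is proportional to its inverse variance (its reliability); the depth estimates and variances are invented.

{{{#!python
def combine_cues(estimates, variances):
    # Linear cue combination: weight each cue's depth estimate by its
    # reliability (1/variance), normalizing the weights to sum to one.
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    weights = [w / total for w in weights]
    combined = sum(w * e for w, e in zip(weights, estimates))
    return combined, weights

# A sharp stereo cue vs. a noisier texture cue:
depth, weights = combine_cues(estimates=[2.0, 2.6], variances=[0.1, 0.4])
print(depth, weights)  # 2.12, weights [0.8, 0.2]: the reliable cue dominates
}}}

Under the Kalman filter and cue correlation hypotheses, the variances themselves would be updated over time, from how scattered each cue's past estimates were and how much they disagreed with the other cues.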

Fine & Jaeger (2011)

Introduction


In order to resolve some of the ambiguities inherent in the speech signal and in syntax, we rely on cues in the syntactic structure of sentences, such as the placement and use of verb and noun phrases. Cues vary by context in their validity, which is a function of their availability in the environment as well as their reliability as an indicator of the intended structure. The tested hypothesis has two parts: that our weighting of and reliance on cues continue to be updated throughout life, and that our adaptation to cues changes as a function of their validity.


Experiment


Participants were sorted into two groups: a "high reliability" group, in which subjects read sentences where all verbs occurred with sentence complements, and a "low reliability" group, in which verbs occurred 50% of the time with direct objects and 50% of the time with sentence complements. This reading occurred over 3 non-consecutive days.
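
One way to picture the manipulation: if readers track verb statistics by something like incremental belief updating, the two groups should end exposure with very different expectations. Here is a minimal Python sketch using a beta-binomial estimate of P(sentence complement | verb); the prior counts are invented, and this is my illustration of the idea rather than the paper's analysis.

{{{#!python
def sc_probability(exposures, prior_sc=2.0, prior_do=2.0):
    # Beta-binomial estimate of P(sentence complement | verb).
    # Each exposure is True for a sentence complement (SC),
    # False for a direct object (DO).
    sc, do = prior_sc, prior_do
    for is_sc in exposures:
        if is_sc:
            sc += 1
        else:
            do += 1
    return sc / (sc + do)

print(sc_probability([True] * 20))         # "high reliability": climbs toward 1.0
print(sc_probability([True, False] * 10))  # "low reliability": stays near 0.5
}}}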

Bharucha (1987)

Introduction


Music cognition involves "a myriad" of context-dependent phenomena related to or caused by musical structure. A central question for music cognition theorists is what psychological processes underlie these phenomena. A structured musical environment, combined with our brain's natural tendency to encode stimuli as schematic representations, leads us to perceive relationships between tones (major, minor, pitch chroma, relative pitch, etc.). The activation of these representations may lead to certain patterns in context, such as an absent event that is cohesive with a context being erroneously judged to be present, or events inconsistent with a context being correctly judged as absent, to such an extent that mental representations can be activated in anticipation of an event. This effect can be brought on, for instance, by a dominant chord, which implies the tonic. These expectations may or may not be conscious. Note: These expectancies should not be confused with veridical expectancies, which are created based on past experience or knowledge of what comes next. Connectionist networks that propose simultaneous top-down and bottom-up processing explain such phenomena well.
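
To make the connectionist account concrete, here is a minimal Python sketch of spreading activation between tone and chord units, far simpler than the networks the article describes (only two chords, invented weights, and priming arising purely through shared tones).

{{{#!python
# Pitch classes: C=0, C#=1, ..., B=11; each chord links to its 3 tones.
CHORDS = {
    "C major": {0, 4, 7},   # C E G
    "G major": {7, 11, 2},  # G B D
}

def spread(sounded, passes=2, top_down=0.3):
    # Bottom-up: chord units sum activation from their component tones.
    # Top-down: active chords feed activation back to their tones, so a
    # tone consistent with the context is primed even if never sounded.
    tones = {t: (1.0 if t in sounded else 0.0) for t in range(12)}
    chords = {c: 0.0 for c in CHORDS}
    for _ in range(passes):
        for c, members in CHORDS.items():               # bottom-up
            chords[c] = sum(tones[t] for t in members) / 3
        for c, members in CHORDS.items():               # top-down
            for t in members:
                tones[t] += top_down * chords[c]
    return chords, tones

chords, tones = spread({7, 11, 2})  # hear a G major chord
print(chords)    # G major strongly active; C major partially active via shared G
print(tones[0])  # C was never sounded, yet receives top-down activation (priming)
}}}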


Experiment


Experimenters ran a priming task for chords with the intent of measuring the processing time of played chords based on their preceding chord. The hypothesis was that chords closely "related" to each other in terms of tonal composition would exhibit more of a priming effect on processing time. They ran this task for major/minor chords and Indian rāgas, and the results confirmed the hypothesis, providing evidence for the type of connectionist models described in the article.
