Greetings All,
I continue to learn a lot from this site (and someday hope to have something to 
give back). For now, however, I will briefly comment on posts by Umashankar 
Mantravadi, Augustine Leudar, and Entienne.

Entienne wrote the following [abridged]: **The argument essentially says that 
for something to appear real it has to fit people's *pre-conception* of what is 
real, rather than fit what actually is real. In other words, throw out 
veridicality (coincidence with reality); instead, try to satisfy people's 
belief of reality. This is another argument for questioning the extent to which 
physical modeling has the capacity to create illusions of reality in sound...**

Entienne made me consider further something of great importance re my CI 
research. Briefly, we really don’t know what auditory perception is like for 
hearing-impaired listeners (remembering that there’s a lot more to 
sensorineural hearing loss than threshold elevation). For example, does the 
Haas effect work for them? Why is noise-source segregation so difficult? Does 
breaking apart an auditory scene create greater dysfunction, or can they put 
the pieces back together to give the illusion of a unified sound source (as 
with the cello example)? How does multi-band compression sound to them? We 
would certainly like to know how altering a physical stimulus improves their 
belief of reality (and thus their ability to communicate or enjoy music). But 
how do we measure the perception of cochlear implant and hearing aid users 
other than by providing *physically accurate, real-world* stimuli? Side note: 
Thanks for the reference to H. Wallach (1940).

Re Augustine’s post: Thanks for suggesting Gary Kendall’s paper. While it 
doesn’t provide a *complete* explanation (who can?), it is a good read. I 
proposed a somewhat similar study while a grad student, but the stimuli would 
have included speech, dynamical sounds (such as breaking glass or a bouncing 
ball), and unfamiliar sounds. The constituent components of the unfamiliar 
sounds would be spatially separated but have identical start times. We could 
then ask whether it’s familiarity (as with a cello), arrival times, or other 
variables that unify the separate sounds into a common source.

Umashankar Mantravadi wrote the following: *As a location sound mixer, I 
exploited the visual reinforcement of sound in many situations. If you are 
recording half a dozen people speaking, and the camera focuses on one - 
provided the sound is in synch - the person in picture will sound louder, 
nearer the mic, than the others. It is a surprisingly strong effect, and one 
side benefit is you can check for synch very quickly using it.*

Many thanks for sharing this experience. I am currently creating AV stimuli 
(using a PowerPoint presentation as the metronome/teleprompter). While there is 
nothing novel about incorporating video, I am unaware of any investigations 
that place cochlear implant patients in a surround of uncorrelated background 
noise combined with a video of the talker(s). One could also study the effects 
of simulated cochlear implant hearing (using normal-hearing subjects) with 
visual cues in a *natural* environment.
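In case it is useful to anyone, here is a minimal sketch of how one might 
generate the uncorrelated background-noise channels for such a setup. It 
assumes a Python/NumPy workflow; the sample rate, duration, and the choice of 
eight loudspeakers are just placeholders, not my actual configuration:

```python
import numpy as np
import soundfile as sf  # assumes the python-soundfile package is available

fs = 48000          # sample rate in Hz (placeholder)
duration = 30.0     # seconds of background noise (placeholder)
n_speakers = 8      # number of surround loudspeakers (placeholder)

n_samples = int(fs * duration)

# Independent Gaussian noise per loudspeaker, so the channels are uncorrelated
rng = np.random.default_rng(seed=1)
noise = rng.standard_normal((n_samples, n_speakers))

# Normalize to avoid clipping when written to file
noise /= np.max(np.abs(noise))

# One multichannel WAV; each column feeds one loudspeaker
sf.write("uncorrelated_background.wav", noise.astype(np.float32), fs)
```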

It has been known for some time that lipreading is useful for comprehending 
speech presented in a background of noise. For example, Sumby & Pollack (1954) 
showed that visual cues could aid speech comprehension to the same degree as a 
15 dB improvement in SNR. Sounds with acoustic features that are easily masked 
by white noise (for example, the voiceless consonants /k/ and /p/) are easy to 
discriminate visually.
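As a side note on the SNR arithmetic: when building stimuli like these, the 
noise is usually scaled so that the speech-to-noise power ratio hits a target 
value in dB. A minimal sketch of that scaling follows; the signal names and the 
example SNR values are purely illustrative:

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that 10*log10(P_speech / P_noise) equals `snr_db`,
    then return the speech-plus-noise mixture."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Solve P_speech / (gain^2 * P_noise) = 10^(snr_db / 10) for the noise gain
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise

# Illustration only: a 15 dB benefit is like moving from -10 dB to +5 dB SNR
speech = np.random.default_rng(0).standard_normal(48000)  # stand-in for speech
noise = np.random.default_rng(1).standard_normal(48000)
hard_condition = mix_at_snr(speech, noise, snr_db=-10.0)
easy_condition = mix_at_snr(speech, noise, snr_db=+5.0)
```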

There is a plethora of literature on the benefits of lipreading, and it is 
entirely possible that visual cues affect more than just speech comprehension. 
A study of stress reduction when a listener is aided by lipreading could be 
interesting: visual cues, regardless of any speech-comprehension advantage, 
might reduce listener stress in a difficult listening environment. Capturing 
subtleties, such as talker voice level as a function of background noise level, 
could also make video and audio stimuli more realistic. Although normal-hearing 
listeners might not be sensitive to these subtleties when sufficient 
information is otherwise available, hearing-impaired individuals might make use 
of visual cues in ways that have not yet been explored. Systematic reduction or 
elimination of the available information can then be accomplished, provided the 
stimulus contains all of the *essential* information in the environment.

I am exploring Ambisonics (along with video) as a method of capturing that 
essential information. At worst, I have a great-sounding system for listening 
to others’ musical recordings. Thanks to everyone for the crowd-sound 
recordings, the music, the software, and for sharing your wisdom.
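For anyone who wants to experiment alongside the capture route, a minimal 
sketch of panning a mono source into first-order B-format is below. It uses the 
standard W/X/Y/Z encoding equations (FuMa weighting of W by 1/sqrt(2)); the 
source signal and azimuth are placeholders, not part of my actual stimuli:

```python
import numpy as np

def encode_first_order(signal, azimuth_deg, elevation_deg=0.0):
    """Encode a mono signal into first-order B-format (W, X, Y, Z)."""
    az = np.deg2rad(azimuth_deg)
    el = np.deg2rad(elevation_deg)
    w = signal / np.sqrt(2.0)              # omnidirectional component
    x = signal * np.cos(az) * np.cos(el)   # front-back
    y = signal * np.sin(az) * np.cos(el)   # left-right
    z = signal * np.sin(el)                # up-down
    return np.stack([w, x, y, z], axis=-1)

# Placeholder mono source encoded 30 degrees to the left
mono = np.random.default_rng(0).standard_normal(48000)
bformat = encode_first_order(mono, azimuth_deg=30.0)
```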
Best regards,
Eric