Greetings to All,

I've been working on listening samples to help explain my "ideas" regarding hearing aid and cochlear implant research to others. For starters, I'm using IRs obtained with a Soundfield mic to auralize dry speech. Unfortunately, more questions than sounds surround my weary head.
I discovered an artifact that will need to be addressed, and the answer may be obvious to the experts out there. I have uploaded files so that everyone can hear the artifacts. The files can be downloaded from www.elcaudio.com/demos/

The dry recording I had initially planned as a demo was made in a semi-anechoic room with a Rode NT1-A mic. No big deal here. I took a 6-word sample of the longer word list and cut out the time between words. Zero-crossing detection was used to eliminate pops as I deleted sections of silence between words. The resulting file is titled janice_sample_condensed.wav.

Next, the monaural wav file (speech) was "auralized" using the four B-format, 96 kHz, 24-bit IR files obtained via a Soundfield Ambisonic mic. The four IRs (w, x, y, and z) were applied to the monaural dry recording (janice_sample_condensed). Finally, I used a popular VST to convert the resulting four B-format files to a stereo/binaural file (KEMAR or similar HRTF). The stereo file is titled janice_60x00y.wav. The 60x00y comes from the position of the mic relative to the loudspeaker.

Now for the weird stuff: When you listen to the janice_60x00y.wav file under headphones (it's a binaural recording), it's fairly clear that the talker is to the right of the listener. This would be expected based on the mic/speaker orientation. The first word is the easiest to localize, and one could argue the precedence/Haas effect helps localize the first sound in the reverberant room. As the sentence progresses, the localization is more blurred (at least to me).

So, to investigate whether other words could be well localized by starting at each word's onset, I moved the wav file editor's cursor to begin playback at around 4 seconds. What I noticed was a distinct impulsive/gunshot sound, and it isn't remotely subtle. This "burst" has nothing to do with pops from cutting at non-zero-crossing points, or with the abrupt start/stop of a waveform that has no fade-in/out.
The impulsive sound occurs at any number of locations, but is particularly noticeable around 3.8 seconds. But when you listen to the wav file from start to finish, no such sound exists. I also trimmed off the wav file's first four seconds and applied a 50 ms fade-in. The impulse is still clearly audible. Yet it goes completely unnoticed when listening to the full-length file from its beginning.

Because the four IR files are 2 s in duration, I thought there might be a "ripple" that occurs every two seconds. So, to test this, I created a 600 ms noise burst from ANSI speech-weighted noise (600 ms is approximately the time taken to say "Tom"). I added a 10 s tail of silence to the noise burst. Next I applied the IR files using the same settings (e.g. 100 percent wet) as I did with the dry-speech recording. There are no "ripples" of impulse noise in the silent region. I then cropped off a small initial portion of the noise burst and applied a fade-in. The impulsive sound is very evident, but doesn't occur when listening to the file from its beginning (i.e., the original, full-length file).

The speech noise files are:

speech_noise_600ms.wav ("dry" noise)
speech_noise_hrtf_1.wav (same processing as dry speech stereo)
speech_noise_hrtf_cropped.wav (fade-in added to the trimmed file)

Artifacts such as this make me question a lot of what's going on research-wise. I don't know how hearing-impaired persons hear or deal with echo suppression and artifacts, so these "ghosts" could present a very real problem. Although we might not hear the artifact in one condition (i.e., playing from the beginning), there's still something going on behind the scenes. This kind of reminded me of "Ghosties and Ghoulies" found on the Harvard Tapes psychoacoustic demos (briefly, this demo shows how the brain suppresses echoes: when the hammer blow is played backwards, the decay is quite audible; when played forward, it's a brief sound).

Please listen to the files yourself. Your insight is most welcome.
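For anyone who would like to repeat the noise-burst check outside an audio editor, here is a minimal sketch in Python/NumPy. It is not the processing chain used for the uploaded files. The assumptions: plain white noise stands in for the ANSI speech-weighted noise, the four IRs are synthetic decaying noise rather than measured Soundfield responses, the B-format-to-binaural (HRTF) step is left out entirely, and all function names are hypothetical. It only shows the convolution, the check of the silent tail, and the crop with a 50 ms fade-in.

```python
import numpy as np


def fft_convolve(x, h):
    """Full linear convolution of two 1-D arrays via zero-padded FFTs."""
    n = len(x) + len(h) - 1
    return np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(h, n), n)


def make_burst(fs, burst_s=0.6, tail_s=10.0, seed=0):
    """White-noise burst followed by silence.

    Assumption: white noise is a stand-in for ANSI speech-weighted noise.
    """
    rng = np.random.default_rng(seed)
    burst = rng.standard_normal(int(round(burst_s * fs)))
    return np.concatenate([burst, np.zeros(int(round(tail_s * fs)))])


def synthetic_irs(fs, dur_s=2.0, rt60_s=0.8, seed=1):
    """Hypothetical placeholder IRs (4 channels), not measured data.

    Each channel is noise with an exponential envelope that reaches
    -60 dB at t = rt60_s.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(int(round(dur_s * fs))) / fs
    envelope = 10.0 ** (-3.0 * t / rt60_s)
    return rng.standard_normal((4, t.size)) * envelope


def apply_bformat(dry, irs):
    """Convolve a mono signal with four IRs (w, x, y, z); returns shape (4, N)."""
    return np.stack([fft_convolve(dry, h) for h in irs])


def crop_with_fade(x, fs, start_s, fade_s=0.05):
    """Drop everything before start_s, then apply a raised-cosine fade-in."""
    y = np.array(x[..., int(round(start_s * fs)):], dtype=float)
    n = min(int(round(fade_s * fs)), y.shape[-1])
    y[..., :n] *= 0.5 * (1.0 - np.cos(np.pi * np.arange(n) / n))
    return y


# Reduced sample rate to keep the demo quick; the files in this post are 96 kHz.
fs = 16000
dry = make_burst(fs)
irs = synthetic_irs(fs)
wet = apply_bformat(dry, irs)

# With linear convolution, nothing can appear later than burst length + IR length.
last_wet_sample = int(round(0.6 * fs)) + irs.shape[1] - 1
tail_peak = np.max(np.abs(wet[:, last_wet_sample:]))
print("peak in silent tail relative to overall peak:", tail_peak / np.max(np.abs(wet)))

# Start playback partway into the burst, with a 50 ms fade-in.
cropped = crop_with_fade(wet, fs, start_s=0.1)
```

In this sketch the region after the burst plus the IR length is silent apart from floating-point rounding, which agrees with the absence of "ripples" in the silent tail. It says nothing about what the VST does internally, and it cannot reproduce the perceptual difference between playing from the beginning and playing from the crop point.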
Back to work (and a lot of coffee).

Best always,
Eric
_______________________________________________
Sursound mailing list
Sursound@music.vt.edu
https://mail.music.vt.edu/mailman/listinfo/sursound