[Sursound] Ambisonics recording of LOUD night club or venue?

2015-06-05 Thread Eric Carmichel
Greetings Everyone,
I haven't posted in ages--a move to Silicon Valley over a year ago has occupied my time.
Does anybody have a Soundfield recording of a loud nightclub or live music venue? I mean really LOUD electronic dance or rock music. I understand this isn't something you'd normally take a high-end mic to, but I need an accurate representation of the atmosphere. I have live recordings taken from feeds, but these aren't representative of what the "sound" is really like. A binaural or monaural recording (with quality mics) would help, too, but marginal-quality recordings made with a smartphone won't work (otherwise I'd go to YouTube and find tons of $%#@). I checked the uploaded recordings linked to the ambisonic net site: very cool stuff, but not what I need for a particular study.
Best regards,
Eric


Re: [Sursound] Sursound Digest, Vol 83, Issue 2

2015-06-06 Thread Eric Carmichel
Thanks, Sampo and Andy,
I didn't think to check Core Sound's site. (Side note: coincidentally, I just ordered a Jecklin disk from them.) I wonder if my head is shaped like Len M's -- the sampled recordings could be a real binaural treat! Both the deafeningly loud club and the motorcycles ought to work. Thanks again for the suggestions.
Sampo, I agree. I often work with MEMS mics. Their SNR is getting better, but the overall response is so affected by the net acoustic path (air-mass loading) that there's always a peak in the response. The low-frequency response generally rolls off below 100 Hz without electrical filtering, though this can be set by the manufacturer (an f3 of 5 Hz is entirely possible, but low-frequency energy then saturates the system). Regardless of uniform response or SNR, the sound quality of microacoustic mics is inferior to that of good studio mics.
Best regards,
Eric
  From: "sursound-requ...@music.vt.edu" 
 To: sursound@music.vt.edu 
 Sent: Saturday, June 6, 2015 9:00 AM
 Subject: Sursound Digest, Vol 83, Issue 2
   

Today's Topics:

  1. Ambisonics recording of LOUD night club or venue? (Eric Carmichel)
  2. Re: Ambisonics recording of LOUD night club or venue?
      (Sampo Syreeni)
  3. Re: Ambisonics recording of LOUD night club or venue?
      (Andy Furniss)


--

Message: 2
Date: Sat, 6 Jun 2015 04:01:26 +0300 (EEST)
From: Sampo Syreeni
To: Eric Carmichel, Surround Sound discussion group
Subject: Re: [Sursound] Ambisonics recording of LOUD night club or venue?

On 2015-06-06, Eric Carmichel wrote:

> Does anybody have a Soundfield recording of a loud nightclub or live 
> music venue? I mean really LOUD electronic dance or rock music. I 
> understand this isn't something you'd normally take a high-end mic to, 
> but I need an accurate representation of the atmosphere.

I don't have one. But from what I understand, SoundFields, especially of 
the classical kind, can take one hell of a beating without even 
distorting too much. I seem to remember you can take a Mark V into an 
environment close to 130dB(A), while still being so-and-so on the safe 
side.

With the newer ones, your mileage may vary. But not too much even there. 
All of the miniaturized mics are of course shit, as they always were.
-- 
Sampo Syreeni, aka decoy - de...@iki.fi, http://decoy.iki.fi/front
+358-40-3255353, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2


[Sursound] Greetings (and a newcomer's first submission)

2011-11-27 Thread Eric Carmichel
Greetings to all:
I am an Audio Production Technology student in the USA. I am also a hearing 
scientist with a few modest credentials under my belt. I am, however, new to 
Ambisonics.
I have been an audio enthusiast for most of my life. My current research 
interests include cochlear implants and the possibility of creating 
“real-world” virtual listening environments for studying hearing aid and 
cochlear implant efficacy in a variety of noisy and moderately quiet 
environments.
I see the potential of Ambisonics as a tool for my own research and for hearing scientists in general; consequently, I created a new website: cochlearconcepts.com. Ambisonics isn’t nearly as well known in the US as it is 
in Europe--at least this is my perception. I have communicated with Bengt-Inge 
Dalenbäck (CATT-Acoustic) and Dr. Pauli Minnaar (Oticon)--both of them kindly 
directed me to useful resources.
My first project is to record restaurants and coffee houses using a Core Sound 
TetraMic and then play these back through an 8-speaker, horizontal-only 
circular configuration (r = 1.4 m). I could just as easily stagger the speakers 
so that they provide both vertical and horizontal information (would this be an 
efficient use of eight speakers?). I plan to decode the raw (A-format) files 
via VVMic software (along with the calibration files provided with my 
TetraMic), and then process the B-format files offline using MATLAB. My 
preferred DAW for the Windows OS is Nuendo.
My new website (www.cochlearconcepts.com) shows a portion of my research 
arsenal. My loudspeaker inventory consists of ten carefully matched KRK passive 
monitors (they are the older KRK speakers using Focal woofers). I have an 
Edirol/Roland R-4 Pro four-track digital recorder for live recording, and a 
MOTU 896HD interface for playback. Because I am new to Ambisonics, any 
critique, leads to resources, and “moral support” are welcome.
Many thanks: I appreciate your taking the time to read this.
Kind regards,
Eric


[Sursound] Great responses to my post--thanks!

2011-11-30 Thread Eric Carmichel
Hello Everyone,
Since my first post (Greetings from a newcomer...), I have received many kind 
and informative emails. Because there may be confusion regarding my cochlear 
implant (CI) research, here’s some additional background: Part of the impetus 
for my work stems from an attempt to show (objectively) that two cochlear 
implants provide more benefit than one. From a normal-listener perspective, 
this seems almost obvious: Two ears help us localize a sound source, and this, 
in turn, helps us segregate a signal from noise. CI users have a lot of 
difficulty listening in noise: Even a +5 dB SNR makes speech comprehension 
difficult for them. Research to date hasn’t shown significant improvement in 
word or sentence comprehension ability in noise with binaural implantation. 
Individuals with two CIs say that there’s a marked improvement in their sense 
of “space” (and sense of well-being) over a single implant, but it has been 
difficult to quantify this improvement.
 Consequently, insurance companies (at least in the US) won’t pay for two 
implants. The old-school method of testing in noise largely ignores surround 
sound or “real-world” scenarios, so I am attempting to improve the way we test 
hearing-impaired listeners in noise. Methods of measuring speech comprehension 
typically include the use of multi-talker babble or speech-weighted noise in 
one speaker and the speech (or target) signal in another speaker: This 
arrangement hardly replicates real-world scenarios.
In case readers are unfamiliar with CI listening, it’s probably a lot like 
listening through a 6-channel noise-band vocoder. Examples of implant 
simulations can be found on www.hei.org/research/shannon/simulations.html. 
Although not included in simulations, the background noise would also be akin 
to listening through a vocoder, so there's probably a lot of energetic masking 
(versus informational masking) going on when using a limited number of channels 
(channels equating to the number of electrodes along the implanted electrode 
array). If the noise and signal are spatially separated, and if there's still a 
sense of "direction" at opportune moments, then two CIs should help in noise. 
Incidentally, typical CIs have 22 electrodes, but only so many electrodes are 
"active" at any given time; otherwise, there would be a lot of current smearing 
among the electrodes. I think a 6-channel vocoder is the most reasonable 
approximation when simulating CI listening (research supports this).
The post that said that Ambisonics resorts to some “psychoacoustic trickery” 
was very well taken, and addresses one of my preliminary concerns regarding 
first-order Ambisonics. But what I hope to do at the outset is to use a variety 
of representative background noises from recordings (ranging from quiet coffee 
cafes to loud restaurants) to investigate speech comprehension in surround 
noise using single, binaural and hybrid CI patients. I could also vocode the A- 
or B-formatted signals as well as the speech signals for simulated CI 
listening with normal-hearing listeners. To be clear: Initial tests will use 
Ambisonic recordings only to provide “real-world” background noise, not to 
provide the target or speech signal. The speech signal will be recorded on an 
independent (monaural) track and reproduced through its own loudspeaker. 
Auralizing the speech signal may or may not add much in the way of realism 
because the intensity of reflected or
 reverberant sound from a nearby talker (typically well within 1 m) would be 
quite small. But the background noise should be realistic in level, and its 
wave field created by an Ambisonic arrangement (even first-order) should 
hopefully be more realistic than the old-school method using a single 
loudspeaker. Sadly, a lot of hearing aid and CI studies are done with only two 
loudspeakers, one for speech and one for noise: I just don't think this reveals 
more than the effects of energetic or informational masking (depending on the 
noise) using two, albeit spatially separated, monaural signals.
I have read a number of articles on first- and higher-order ambisonics, and I 
realize that I have a lot to learn. Certainly, the "best" setup for my research 
would be a way of creating a sound field at the listening position that's 
equivalent to a real-world situation, but this isn’t easy to achieve in many 
research environments. For example, binaural recordings and headphone playback 
might give "accurate" pressures at the ears, but headphones are certainly out 
of the question when it comes to CIs and most HA devices. Actually, I've never 
experienced a sense of “open space” when listening to binaural recordings or 
simulations from HRTF IRs (including the often-cited IRs made by Gardner et al 
at MIT during the 1990s). I own ER-3A insert phones, Sennheiser HDA 200 
audiometric headphones, and my work-horse AKG K240 studio 'phones--but I've yet 
to hear a binaural recording that replicates live sound--pr

[Sursound] HRTFs, recordings, headphones, and more

2011-12-05 Thread Eric Carmichel
Greetings again to all,
My second post (“Great responses to my post--thanks!”) elicited some noteworthy 
responses, particularly regarding my comment that most binaural recordings that 
I’ve listened to don’t give a sense of “open space.” Naturally, we all have a 
unique HRTF, and recordings or IRs made with an acoustical test fixture (e.g. 
KEMAR) probably won’t match our own HRTF.
Recordings made with KEMAR (Knowles Electronic Manikin for Acoustic Research) 
have the microphones deeply seated in this fixture. Such recordings will have a 
“naturally occurring” resonant peak around 3 kHz because of the KEMAR’s pseudo 
ear canal (which, for KEMAR, is just a straight tube, with or without Zwislocki 
couplers). A naturally occurring resonant peak exists in open-ear listening 
situations, and this adds to the sense of openness. The style of headphones we 
use may destroy the ear canal’s natural resonant peak, particularly if the 
headphones are of the insert type. If the recording includes a peak, then 
insert phones may not be a problem. Otherwise, we may have to use a peaking 
filter to re-create an open-ear type of response. Of course, not all headphones 
seal off the canal. So how do these headphones affect listening? My 
off-the-cuff answer follows:
I’d estimate that the earcup volume of circumaural headphones is around 6 cm³. 
But because headphones include active drivers, computing the combined resonance 
of the ear canal with the earcup’s volume may not be so simple: There’s an 
issue of “equivalent volume” when dealing with active elements (for example, 
consider the equivalent volume of a B&K acoustic calibrator). The point to all 
of this is that HRTF, pinna transfer functions, open-ear frequency response, 
etc. are dependent not only on the individual, but on the headphones used for 
playback.
I made one recording using in-the-ear microphones and it was eerily realistic 
in one way: I was slowly moving on a squeaky floor while making the recording, 
and when I played the recording I found myself looking at my feet because it 
made me feel as though something was moving at my feet. This is the result of a 
full-body transfer function, and was the most out-of-the-head sensation I've 
experienced with headphones. The rest of the recording wasn’t this impressive.
I have listened to Hector’s recording using AKG K240 studio phones (semi-open). 
(Thanks to Hector for making his recording available.) The sounds and child 
coming from the extreme left gave the sense of a distant source--this is good. 
But I believe I experienced what others discovered: none of the sounds appeared 
to come from behind or in front of me; it was as though the child was running 
through my head. This may not be the case with all headphones. I have a pair of 
ER-3A insert phones that will probably yield a different effect. I’m currently 
using my ER-3A’s for an otoacoustic emission (OAE) study, but will report back 
once I have a chance to listen to the recording via insert-type phones and my 
Sennheiser HDA-200 headphones.
Again, many thanks to all for sharing thoughts, recordings, references, and 
wisdom.
Sincerely,
Eric


[Sursound] Back to Ambisonics

2011-12-06 Thread Eric Carmichel
Hello Sampo,
Many thanks for your thorough and interesting reply to my post. Even if 
binaural listening with head tracking could be perfected, headphones are still 
out of the question when it comes to presenting stimuli to hearing aid and 
cochlear implant users. A need for presenting sound in three dimensions (or, at 
the very least, surround sound on the ear-level plane) was the whole reason I 
became seriously interested in Ambisonics.
It is interesting that you mentioned ultrasonic frequencies. I recall, from 
years back, one manufacturer's attempt to market an ultrasonic device that 
would compete with cochlear implants. (If I remember correctly, it was a 
German-based company called Hearing Innovations.) The ultrasonic signal was a 
60-kHz carrier presented to the user via bone conduction. One might surmise 
that inertia alone would mechanically filter out the 60-kHz signal, 
thus leaving only the signal used to modulate the carrier to stimulate the 
inner ear. But, according to the manufacturer, individuals with profound 
sensorineural hearing loss could "hear" with the device, thus precluding the 
idea that it was merely a fancy bone-conduction hearing aid. I don't know what 
physiological processes would have been involved or if further research is 
being done.
Although it's pretty well established that most neurons can't go through the 
whole action potential / depolarization cycle within a 1 ms time-frame, the 
motile, outer hair cells can vibrate at frequencies on the order of 8 kHz (or 
higher?) when electrically stimulated (at least in vitro). Although OHCs don't 
innervate afferent fibers, they certainly affect hearing, and add to the 
complexity (and mystery) of the peripheral auditory system. I worked on a 
project with Lin Bian, M.D. and Ph.D. to show how the cochlear partition moves, 
cycle-by-cycle, and it definitely isn't linear. This has a bit to do with the 
"rectification" process. That particular study didn't make it into the pages of 
JASA, but Lin's subsequent studies did.
Your paragraph "Now that I've read some basics of cochlear implant tech, I 
don't see how such considerations are taken into account. Thus, Eric, since you 
seem to be worried about the effects of real life background noise on CIs, 
maybe you could go double the mile by trying out a CI analysis algorithm which 
hybridizes your typical Shannon-esque noise-band vocoder with a selective 
application of pure, rectified, time-domain information, straight from the 
sampler" was well-taken in that I'm not trying to copy what other researchers 
and CI manufacturers are doing. The following is taken directly from my website:
"The overall design results in a virtual 'electrical transmission line' along 
the implanted membrane. The active array doesn't suffer from current smearing 
when a large number of electrodes are simultaneously energized... The high 
channel count isn't merely an attempt to achieve better frequency 
discrimination based on the place theory of frequency coding; instead, our 
implant design and multiplexing strategy follows many established principles of 
psychoacoustics -- some of which are ignored in other CI designs. The 
high-density electrode will improve dynamic range and speech quality (as well 
as frequency discrimination) when one CI is used, and nearly-normal interaural 
time difference (ITD) coding when two implants are used. Additionally, there 
are theoretical benefits to our high channel density design when severe nerve 
ganglion damage is present... Because our design works on different principles 
than other implants, it is important to note that our
 acoustic simulations are considerably different from noise-band vocoder 
demonstrations used to simulate CI listening." (This paragraph, by the way, is 
about my CI design.)
In brief, a lot of things haven't been taken into account when it comes to 
auditory neural prostheses, and I'm not sure why they haven't. What I do know 
is that I have a heck of a lot to learn, and I keep an open mind. What most 
people may not realize is that hearing science is more of a hobby than vocation 
for me. I'm excited about Ambisonics because it's new information for me 
(although I realize it's not exactly new technology). I'm enjoying the plethora 
of recordings that are available, and my interest in Ambisonics extends well 
beyond hearing research. In fact, I'm giving a demo to an Audio Production and 
Technology class this Thursday. Should be fun!
Kind regards,
Eric


[Sursound] Patents, Serendipity, and Questions

2011-12-21 Thread Eric Carmichel
Greetings to All,
The patent 'finds' are interesting and, as usual, give credence to the 
expression "there's nothing new..." Of course, if one looks closely at the 
cited patents, you'll see that the microphones and associated circuitry mix 
down to single-channel outputs (or provide a meter reading of energy density of 
sound waves). Gerzon, Craven, and colleagues didn't steal anything from their 
predecessors: They just took ideas a step or two further. (I can say the same 
with my sole patent.) It is amazing that after studying audiology and hearing 
science that I continue to find much valuable information in the older books by 
Harry F. Olson and Leo L. Beranek (among others), or vintage articles from 
Proceedings of the IRE, etc. Although audio has been a life-long hobby, 
Ambisonics is relatively new to me. As with many things in life, one idea leads 
to another, and pretty soon you become immersed in a plethora of literature and 
potentially useful ideas. My earliest
 introduction to Ambisonics was the articles that appeared in The Audio Amateur 
(circa 1970s), but I was quite young and too much into conventional stereo to 
take quadraphony seriously (but, hey, I was smart enough to keep all of my 
hardbound issues of TAA for future reference). A little over one year ago, I 
had communicated with Bengt-Inge Dalenbäck of CATT-Acoustic. I was trying to 
build a surround system for testing cochlear implant patients in controlled 
but real-world scenarios, and auralization was one way of improving current 
test protocols. Bengt-Inge was very helpful, and he is the person who 
re-introduced me to Ambisonics and the sursound list.
At present, a surround system known as R-Space is being used to study cochlear 
implant (CI) efficacy in noise. Larry's system offers advantages to research 
scientists and audiologists, but its original scope was mainly limited to 
hearing aid studies and being small enough to fit in an audiometric test booth 
(this puts the loudspeakers a mere 2 feet from the listener). If you'd like to 
see an actual R-Space install, I took a snapshot of one and it appears in a 
PowerPoint available thru my website (cochlearconcepts.com). Use of the R-Space 
is a step in the right direction, but I sincerely believe an Ambisonic system 
(or a high-order ambisonic system) provides greater flexibility and larger 
listener sweetspot. Oticon has a fancy surround setup that used HOA, but this 
is more elaborate than what I need or can afford (I'm an independent researcher 
who does this as a hobby). After reading a long list of articles, including 
most of Michael Gerzon's articles on Ambisonics and psychoacoustics, I have a 
couple of questions:
1. Is there any preferred method of calibrating speakers used in an Ambisonic 
setup? Options at hand include swept sine, pink noise, MLS, and IR measures. 
There are articles in the AES literature (and elsewhere) that compare these 
methods, but does anyone have a particular preference for calibrating speakers 
when it comes to Ambisonics? My current setup is a circular array of eight 
speakers (r = 1.4 m) in an average-sized living room. I have an Earthworks 
calibration microphone at the listening position (ear level). Side note: I have 
considered adding gobos between speakers (thus enclosing the space) in addition 
to acoustic absorbers and diffusers throughout the room.
2. Has anyone compared or noted differences between the Virtual Visual 
Microphone (VVM) software and offline processing using MATLAB? My speaker setup 
lends itself to the code outlined in the article Using Matlab/Simulink as an 
implementation tool for Multi-Channel Surround Sound by P. Schillebeeckx, I. 
Paterson-Stephens, and B. Wiggins. If I were to use VVM to do the same 
(starting with B-formatted files), any thoughts as to how the mic directivity 
should be set (cardioid being 1, subcardioid being approx. 0.7) when using eight 
or more loudspeakers?
3. I have seen discussion and articles regarding Ambisonics and shelving 
filters. Any recommendations as to "best" filter settings based on 
speaker-to-listener radius? For example, the aforementioned R-Space has a 
radius = 2 ft. (0.61 m). If I were to use a recording made from a Soundfield 
mic with the R-Space, what sort of filtering would be required for such a tiny, 
8-speaker arrangement? Would this system even lend itself to Ambisonics? The 
arrangement I have at home has a radius of 1.4 meters and is what I'll be using 
for my research. I anticipate adding more speakers to make it more of a 
periphonic system. To date, my background noise recordings were made using a 
TetraMic. The speech stimuli are "dry" (semi-anechoic room) recordings that can 
be auralized to match the background room noise reverb characteristics. The 
speech stimuli, as is probably obvious, are used to measure speech 
comprehension ability. I try to keep signal-to-noise ratios realistic, which is 
a weakness of many studies. To date, studies ha

[Sursound] Thanks for links, insights, etc.

2011-12-22 Thread Eric Carmichel
Greetings:
Hello Michael C. and Fons A.,
Thank you for your detailed and informative responses to my questions.
Fortunately, the speakers I have chosen are well-matched and have good response 
characteristics. I matched them some time ago; however, each speaker underwent 
testing at an identical location, not at their respective positions in my 
listening room. Because I am interested in three-dimensional Ambisonics, four 
of the eight speakers in the (current) octagonal array will have to be close to 
floor level: This is the only way to get moderately wide vertical separation 
without putting the listener in a high chair. I recently observed that speaker 
response (independent of room characteristics) changes because the floor 
imparts an effect (I believe more than just the proximity effect). Fortunately, 
large amounts of EQ aren’t needed, and I’m mostly interested in smoothing the 
response in the 100 Hz to 10 kHz range.
I’m a minimalist when it comes to audio. I was never one to use graphic EQs (or 
modern-day VSTs to achieve the same). I began building amplifiers while in 
grade school, and a 10 watt, class-A amp designed by J. Linsley Hood and 
described in Wireless World (1969-ish?) was a favorite of mine for many years. 
Later I built a class-A, push-pull VT amp with 300Bs and an interstage 
transformer. This was for my Lowthers. I never got into the single-ended stuff 
because it seemed easy to mitigate transformer core saturation issues with 
class-A push-pull designs that operate along the same portion of the transfer 
characteristic as SE biasing. My point is this: I don’t like too many things in the circuit path, 
and I only use EQ when absolutely necessary. However, measurements serve to 
“validate” my research findings, particularly when they’re slated for 
publication or under scrutiny. If I use EQ, I try to use filter types that 
yield the best transient characteristics and
 minimal phase anomalies. I downloaded, as per your suggestions, the PowerPoint 
/ PDF by J. Nettingsmeier. Looks like really good information. I will give it a 
thorough reading after Christmas. Thanks for recommending.
RE MATLAB: Some of the cochlear implant (CI) simulations I do are simple phase 
vocoder scripts written in MATLAB. While in graduate school, my doc committee 
consisted of respected researchers (does W. Yost, M. Dorman, or S. Bacon ring a 
bell with anybody?) who were huge proponents of MATLAB. The general attitude 
was “if you can’t do it in MATLAB, it isn’t worth looking at; furthermore, if 
it requires hardware, we don’t even want to look at it.” Kind-of strange 
attitudes in my book, but I’ve always been more of a hardware person, whether 
it’s digital or analog. I continue to do off-line wav processing in MATLAB 
because I can show the underlying math as well as the statistical outcome. More 
recently, I’ve been using Visual FORTRAN for projects.
RE Linux: I’m mostly a PC (Windows) user, but I’m not one to argue about the 
superiority of one OS over another. I have a BIG investment in software, and I 
don’t want to buy two versions of everything. It’s bad enough keeping up with 
the latest Adobe media suite or incarnation of Windows. I’ve mostly stayed with 
PCs so that I get best support for my National Instruments DAQ hardware or 
other (legacy) devices. Because I have several computers, setting one up with 
Linux is no problem at all. I used to run Red Hat Linux on one machine, and I 
really did believe in the superiority of Macs when Windows 98 repeatedly 
crashed. Nowadays I’ll use what works best or is accessible. So that I can 
experiment with Ambdec, I’ll load Linux on a dedicated hard drive. My audio 
hardware consists mostly of MOTU FireWire interfaces, but I also have an Avid 
PC extension chassis that has four identical PCI SoundBlaster cards on it. I’m 
sure I can find ASIO
 drivers for Linux that will work with my MOTU gear. The SoundBlaster cards are 
generic enough to work with about any OS (maybe even OS2 Warp).
I’ve been duly warned of the consequences of using more than six loudspeakers 
in a horizontal-only, first-order Ambisonic configuration. Thanks, Fons, for 
the very clear explanation. I do, however, want a flexible system because I’d 
like to move towards a 3-D setup (or higher-order Ambisonics via recordings 
made with an mh acoustics Eigenmike). Additionally, I have plans for an 
experiment that compares energetic versus informational masking of vocoded 
speech in the sound field, and I’ll be using two quasi-independent 4-channel 
systems for this. When it comes to music enjoyment, I’ll stick with your 
recommendation of six loudspeakers. Again, many thanks to all for the help!
Sincerely,
Eric

[Sursound] Journal articles

2012-01-09 Thread Eric Carmichel
Happy New Year to All,
I have to agree with Dave M. that the Acta Acustica united with Acustica 
articles can be fiendishly expensive (Dave's words). As a student, I'm keeping 
my fingers crossed that I can get these through inter-library loan. In some 
instances, articles of interest can be found as a doctoral thesis / 
dissertation and, consequently, available through a university at no charge. As 
an example, I have a pdf copy of Sylvain Favrot's PhD thesis titled "A 
loudspeaker-based room auralization system for auditory research." This thesis 
(with some modifications) appeared as an article in Acta Acustica united with 
Acustica. Similarly, a rather expensive book regarding transaural stereo 
techniques by William Gardner can be found (in dissertation form) on MIT's 
website. I certainly respect international copyright law, so I don't distribute 
info I've obtained unless the publisher has given permission. It would be nice, 
of course, to find affordable ways of accessing
 information, particularly when the information isn't proprietary or being used 
for commercial (profit) ventures. Should anyone know legal ways of obtaining 
the Acta Acustica articles, or the information contained in them, I would be 
most grateful for the help.
Kind regards,
Eric


[Sursound] Linux help and difficult listening (uploaded wav files)

2012-01-11 Thread Eric Carmichel
Hello Jörn,
Thanks for writing. Your response isn’t too late, and it may have saved me from 
potential grief. From what I’ve read, FFADO 2.0 will work with the MOTU 
Traveler (first generation) and the 896HD. These are the very two MOTU devices 
that I own.
Some years back (prior to learning of Ambisonics), I used Linux because there 
was a lot of great freeware available for Linux users. I used (and still have) 
a boxed version of Red Hat Linux 8.0. In the interest of multi-media, I 
recently downloaded Ubuntu Studio 11.1 for x86 (downloaded as an iso file, and 
then burned to DVD). Naturally, there is debate as to the “best” Linux for 
media use.
My interest in Linux at this time is because you (and others) had recommended 
or suggested AmbDec. From what I’ve read thus far, I look forward to trying 
AmbDec. Chances are, however, I’ll create the requisite audio files using one 
platform (i.e. Linux) from B-formatted files, and then play them back using a 
PC-based DAW. The impetus for the PC-based DAW is because I’m using hardware 
that I designed and built for automating psychoacoustic experiments. In a 
nutshell, the hardware (photo uploaded) is akin to a voltage-operated 
surface-controller that works in real time because it gets feedback (based on a 
listener’s response via a response box) before a subsequent stimulus is 
presented. My hardware controller works well with Nuendo 4.3 and Audition 2, so 
I’ll probably stick with these (PC) DAWs. If a dummy driver is needed to create 
audio files using the AmbDec software, then I imagine JACK or FFADO will work. 
I have never used JACK; is it
 similar to ReWire?
For my background noise, I have recorded several “representative” 
establishments/restaurants with noise levels hovering around 60-65 dBA, 70 dBA, 
and 75-80 dBA. I used a TetraMic connected to a Roland R-4 Pro recorder to make 
the 4-channel (raw, or A-format) recordings. I then used VVMic and the cal (IR) 
files that came with the TetraMic to obtain the B-format files. A separate 
audio recorder was used to provide phantom power to my Earthworks 
(omnidirectional) calibration mic and record SPLs. An acoustic calibrator 
provided the cal tone and reference level for playing back the background 
(restaurant) noise at actual levels. To me, recording a single venue and then 
arbitrarily adjusting the playback level to achieve a particular SNR is not 
representative of real-world scenarios. Amplifying a quiet coffee house isn’t 
representative of a “louder” establishment. I don’t think many would disagree 
with this idea, but most researchers use one source
 for background noise regardless of the background noise level or desired SNR.
Although Ambisonics may not be the ideal way of presenting background noise, it 
has to be a heck of a lot more realistic than methods previously used to test 
speech comprehension in noise, which are then reported in peer-reviewed 
literature. To give you an idea of the stimuli being used to evaluate cochlear 
implants, I have uploaded a few sentences (stimuli) and the respective 
background noise used by cochlear implant researchers. The first file is a 
stereo stimulus file and can be downloaded from

www.elcaudio.com/examples/ci_stim_stereo.wav

Without independently adjusting either the left or right gain, the SNR is 0 dB 
(silence between sentences was removed to obtain the signal level in dB). If 
you listen to this first file under headphones, it’s easy to ignore the noise 
and concentrate on the signal. Things are a little more blurred when listening 
through loudspeakers. A cochlear implant user doesn’t have the luxury of 
headphone listening or spatial signal segregation (assume a single implant). 
For your enjoyment, I also ran the stereo signal through a cochlear implant 
simulator that generates monaural files. (Note: “stim” in the file names refers 
to stimulus, whereas “sim” refers to simulation.) A monaural simulation of the 
above stereo file is here:

www.elcaudio.com/examples/ci_sim_mono.wav

and a stereo simulation file (noise in one channel and signal in opposite 
channel, which isn’t spatially realistic) is available here:

www.elcaudio.com/examples/ci_sim_L_R.wav

Imagine listening to your music (or a conversation) with this much distortion! 
Mostly, ask whether you believe these wav files, stereo or mono, are 
representative of real-world listening. This should shed some light on why I 
wish to improve the methods we use to test and evaluate hearing impaired 
listeners.
Thanks, as always, for your help and insight.
Kind regards,
Eric


[Sursound] Motivation for authors

2012-01-11 Thread Eric Carmichel
Hello Fons,
Your query ("what motivates authors to make their work available in this way") 
made me think of my own situation. Perhaps publishing in peer-reviewed journals 
is analogous to receiving Merit Badges in Scouts: In some instances, it’s how 
one gets rated, noticed, or makes it to the next level. It seems (at least in 
the U.S.) that professors are pressured to publish in professional journals. As 
this applies to me, I was told (as a Master’s student) that I’d need at least a 
few peer-reviewed articles under my belt in order to get into a doctoral 
program of study. There’s a catch, of course, because it’s difficult to do 
research in hearing science without university affiliation. At present, I’m 
pursuing research while, at the same time, on the lookout for a doctoral 
advisor. Doing good deeds and being committed to purposeful work is great, but 
I suppose I'm still deficient when it comes to those "Merit Badges."
I have written a couple of noteworthy articles regarding hearing, but only one 
appeared in a "peer-reviewed" journal. An earlier article was intended for a 
much broader (albeit layman) readership, and it reached people who could truly 
benefit from the information contained within the article. Specifically, the 
article was about hearing protection and muzzle blasts, and it appeared in 
Outdoor Life magazine. Submitting the same article to, for example, Audiology, 
might have earned Brownie points needed for admission to grad school, but 
submitting an article regarding hearing protection to hearing scientists / 
audiologists is simply preaching to the choir. I was happy that the article 
found favor with a large readership even though it didn't appear in a 
"professional" journal. A second article regarding binaural electronic hearing 
protectors found its way to Noise & Health (which IS peer-reviewed), and I was 
grateful that they accepted it for publication. I
 had previously submitted the article to JASA, and had received a very kind 
rejection letter. Some magazines will accept or reject articles because of 
reader interest or current research trends. The Journal of the Audio 
Engineering Society is known for publishing articles on Ambisonics, but maybe 
they rejected a series of related articles, and Acta Acustica united with 
Acustica picked them up (?). Once copyrighted, I imagine that the publisher has 
exclusive rights to the manuscript, even in derivative form. But how they can 
justify high prices certainly eludes me. Downloading single articles from JASA 
is kind-of pricey, too. Subscription to AES’s library is reasonable, and you 
wonder why others aren’t the same. Furthermore, the AES offers anthologies that 
include hard-to-find articles.
I wish I could simply upload research and schematic diagrams to my website 
and make them available for good will to all researchers. But unless something 
gets published in a professional journal, it may be (mis)construed as 
“amateurish” or “unimportant” to those in academia. How unfortunate this is! 
Please know that I am grateful to all of you who have freely shared your 
insights, expertise, and wisdom, whether you’re an audio professional with 
years of experience or a hobbyist with personal opinions on music and 
Ambisonics.
Sincerely,
Eric


[Sursound] When realtime isn't the right time

2012-01-12 Thread Eric Carmichel
Hello Michael,
I agree with you 100 percent about forgetting realtime and creating the wav files that can be played on my existing setup. Even in the past, when I've used MATLAB to alter signals (for example, a MATLAB-implemented phase vocoder for implant simulations), I do the processing off-line, save the file(s), and then line them up in a DAW for reproduction in a quasi-random order. What you suggest is what I've intended to do all along (although not necessarily with Linux/AmbDec until it was suggested). Maybe I've been using 'real-time' in an incorrect way. Please allow me to explain...
Because some of my experiments involve adaptive test procedures, something has to 'give' during the test (during = the real-time part). The wav files, however, do NOT undergo processing once they've been lined up to play in a particular sequence via a multi-track device or DAW. An example of an adaptive procedure would be trying to 'force' a speech comprehension score of 50 percent correct, and altering the SNR to achieve this score. The background noise level, signal level, or both would have to change in 'real time' (that is, change while the test is in progress) to zero in on the SNR that yields a 50 percent speech score, subject by subject. Background noise is continuous and on its own tracks, while the speech stimuli are lined up sequentially on a separate track or tracks. The background noise plays through x number of speakers at a time (let x = 6 for horizontal-only Ambisonic surround) while the speech plays through a dedicated speaker or speakers, but always one at a time. If I want the speech signal to be maintained at, for example, 75 dB SPL and the listener isn't having difficulty with the speech material at +5 dB SNR, then the background level (all six channels) has to be elevated as the test moves forward. All I really need to do is automate the levels, not process the wav files from B-format to speaker feeds. To change the background noise level (presented through 6 speakers) means that all six channels have to change in unison and precisely by the same amount, perhaps in 2 dB steps. This is pretty straightforward with a discrete, 6-channel preamp. With a DAW, as you know, controlling six channels simultaneously requires sending them to a buss with a minimum of six channels (a 7.1 surround buss works nicely--just don't panpot anything and keep the channels totally discrete). The surround buss's master fader can adjust all channels simultaneously, and I do this in 'real time' based on listener responses. Actually, responses are recorded, and my software/hardware combo does the fader adjustment automatically.

Things get slightly more complicated when the research involves electro-acoustic stimulation (EAS), which is a hybrid form of cochlear implantation. Again, I don't need to do anything with the Ambisonic processing, but 'real time' filtering comes into play. Briefly, the listener hears an acoustic signal at or below a certain f0--say 250 Hz--while the speech in the range of 250 Hz to 8 kHz is presented electrically via his/her implant or via a simulation. I have programmable, hardware digital filters (up to 8th order) for my filtering needs, so I don't have to do wav file processing prior to presenting the stimuli, nor do I have to run all of the multitrack channels through a VST filter. As with the aforementioned adaptive test protocol, the test subject's responses are electronically recorded, responses are fed back (after an algorithm does the decision making), and signals are adjusted accordingly to achieve a certain outcome.
Ok, maybe I didn't explain this all too well, but at least it should help 
explain my definition of real-time, at least as far as my personal setup goes.
As always, many thanks to all for the feedback, suggestions, and 
questions. The questions make me think harder, and perhaps my bifurcated 
ganglion-of-a-brain will grow at some point!
Cheers!
Eric C.


[Sursound] Rearward, march! (RE binaural listening rearward illusions)

2012-01-13 Thread Eric Carmichel
Hi Dave,
I just wanted to add my two bits regarding binaural listening and the rearward 
illusion you experience. Having investigated the effects of binaural electronic 
hearing protectors on localization, I do recall two sources of information (in 
addition to my own) where listeners experienced a rearward illusion of sound 
sources. The studies had to do with hearing protection devices (HPDs), but 
aspects of the studies apply to binaural listening in general. Of course, 
retaining head and pinna cues is what we desire with binaural recordings, but 
one man’s HRTF is another man’s, well...? In one of the (HPD) studies, pinna 
cues were absent because of occlusion, and this was believed to account for a 
rearward illusion. The references are

Russell G, Noble WG. Localization response certainty in normal and disrupted 
listening conditions: Towards a new theory of localization. J Aud Res 1976; 16: 
143-50

Oldfield SR, Parker SP. Acuity of sound localization: A topography of auditory 
space: II, Pinna cues absent. Perception 1984; 13: 601-17

For Russell and Noble, it was believed that loss of canal resonance accounted 
for a rearward illusion (this was for listeners wearing earplugs). Under 
earphones, things are different. For example:

In my study*, it was easy for subjects to discern left-from-right sound source 
location but discrimination between left rear and left front (or right rear and 
right front) was difficult. Front-back reversals accounted for the largest 
percentage of errors. Most errors made for the HPD conditions occurred at 120 
degrees and 240 degrees (rear plane) and sounds coming from these locations 
were often judged as coming from 60 and 300 degrees (front plane), 
respectively. One listener, however, made localization errors opposite from 
other listeners. For this listener, regardless of condition, more ipsilateral 
errors were made to sounds coming from 0 degrees than for sounds coming from 
180 degrees. Localization under HPDs for this listener was also unique: Stimuli 
presented at 60 and 300 degrees were often judged to originate from 120 and 240 
degrees, respectively, which was opposite from the other listeners.

Why a frontal or rearward proclivity for any particular listener is a good 
question. But it does appear that it is consistent for a given person. For me, 
binaural recordings almost always seem to be in the head (despite everyone’s 
best efforts), but sounds will appear to be outside of my head if they’re to 
the extreme left or right and include the requisite cues (beyond ILDs). Results 
from my HPD study suggested that binaural electronic HPDs retain the ILD cue 
needed for lateralization (I carefully matched the gain between earcups). 
However, pinna-head cues needed to make accurate front/back judgments are not 
retained. According to Oldfield and Parker, such errors would be anticipated 
despite stereo sound provided by the HPDs because the ITD of sound at the 
tympanic membrane does not uniquely specify a location in space, only the 
left/right component.

Incidentally, manufacturers’ statements for their respective binaural 
electronic HPDs included

‘True ‘stereo’ for directional sound detection’

‘Stereo sound so much like your own hearing that you retain your natural sense 
of sound direction’

‘…provides you with 360 degrees awareness of sound direction with the clearest 
sound amplification available’

Hmmm... Check out the following and see what at least one study revealed.

*Noise & Health, October-December 2007, Volume 9. I think it cost a bit to 
download; however, I won’t comment here on the cost of journal articles. If 
you’d like to see a PowerPoint regarding this study, you can download it from

www.elcaudio.com/hearing/hpd_localization.pps   [26.37 MB]

I presented this study (and the PP) at a colloquium: Attendees included William 
(Bill) Yost and other noteworthy hearing scientists. Question: What if the same 
study was repeated only using an Ambisonic surround system? I wonder whether 
the same localization errors would occur. This, to some extent, might validate 
the usefulness of Ambisonics in hearing research.

Another PP, for those interested in signal processing, otoacoustic emissions 
and hearing physiology (not too much psychoacoustics), can be downloaded from

www.elcaudio.com/hearing/oae_study.pps   [5.62 MB]
(This study was kindly rejected by JASA, but it’s still in progress.)

Kind regards,
Eric C.


[Sursound] AAA (Ardour, Arduino, & Ambisonics)

2012-01-13 Thread Eric Carmichel
Hi Fons,
Thanks for the info regarding Ardour. Although I’m not an application developer 
(nor aspire to be), I have used Python. Not too long back, I purchased some 
sensors from Phidget to make a response box. I also built a few gadgets based 
on the Arduino microcontroller, and Python code simplified a few of the 
interfacing tasks.
I’m a proponent of ergonomic response boxes and generally design and build my 
own response boxes in lieu of off-the-shelf interface devices. If the control 
layout (generally push-button switches) isn’t intuitive to the user, then I 
would question whether response time could be valid, at least not without a lot 
of user training. (Measuring response time can be useful in many experiments). 
All switches/keys should be equally accessible, and there shouldn’t be any 
ambiguity as to what each switch represents. Using a standard keyboard is 
generally a compromise.
Sometimes making an interface device ‘talk’ isn’t the only issue. For example, 
it’s difficult to route wires through a sound test booth if it isn’t 
pre-equipped with a patch bay/panel. One of my response boxes sends its signal 
along a single-conductor shielded cable (terminated with a BNC connector for 
ease of use). This response box used a pre-programmed microchip from a Velleman 
electronics kit: The design allowed me to send 15 discrete ON/OFF channels 
along the single-conductor cable which, in turn, was considerably easier to 
route than a multi-pin connector or multi-conductor cable would have allowed. 
Adding a patch panel or multi-conductor connector to the heavy steel walls of 
an audiometric test booth isn’t easy: I’ve had to do this (for others) in the 
past.
In other instances, a subject’s safety has to be ensured in order to obtain IRB 
approval for a study. Fiber optic communication comes in handy when grounding 
or electrical isolation is a concern. The downside of fiber optics is that a 
battery-operated response box (or preamplifier when electrodes are used) is 
needed, but this is just a minor inconvenience. But with the aforementioned 
single-conductor setup, DC power (along with the multiplexed signal) is sent 
along the wire, and one need not worry about battery life.
Regardless of user-interface / hardware, talking with the computer is the next 
step. Having open source software (and Arduino hardware) has certainly made 
life easier for the experimenter. Once I get my Linux rig together, I’ll look 
into the possibilities offered. I make no claims as to being software or 
computer savvy, but I generally find a creative solution (or an adept person) 
to get things rolling. I’ll let you know how things progress with my Ambisonic 
setup as well as future hearing experiment(s).
Kind regards,
Eric C.
PS—Maybe I should have titled this AAAA (Adriaensen, Ardour, Arduino, Ambisonics)?


[Sursound] TetraMic help (re cal files)

2012-03-17 Thread Eric Carmichel
Greetings Everyone,
First, thanks to all who read or responded to my previous posts--everyone's help 
and insight was appreciated.
Second, as I continue my cochlear implant research and create virtual listening 
environments for 'real-world' testing of hearing-impaired listeners, I have made 
recordings of various places using my Core Sound TetraMic. Now I 
have recordings using the TetraMic along with a TASCAM DR-680 and an Earthworks 
calibration mic (for analyzing levels and spectral content), but I've 
encountered a problem using the VVMic software. Len M. (of Core Sound) has been 
good to respond, but the problem is vexing. I've tried contacting David M. of 
VVMic fame, but no luck thus far getting a reply from him. The software works 
fine with A- (raw) or B-format files as long as I don't use the calibration 
files supplied with my TetraMic (these are the *.IIR files). I am hoping 
someone will look at the numbers below and compare them to their TetraMic cal 
files. I realize the files are unique to a given microphone, but if numbers are 
missing or far from what appears normal, this
 would suggest a problem with the cal files and not the software. Not all of 
the columns or rows are filled in, but that's the way they appear in VVMic for 
TetraMic. I could manually enter these numbers in the EQ settings, but I'm not 
sure whether that would fix the problem because the numbers may be outside of 
normal ranges. When using the cal files, the processing is quite slow, and the 
resulting wav files, when viewed in a basic DAW (time-amplitude window), show a 
wild, exponential-looking function that shoots off of the window. Note: The 
wave file doesn't appear like a clipped signal.
Many thanks for any help here (or maybe somebody knows how to get in touch with 
David McGriffy?).
Best regards,
Eric
Cal values shown below (and I hope the formatting stays intact when viewed in 
email):

          LFU     RFD     LBD     RBU
Gain W    1.6     1.6     0.0     1.8
Gain X    0.0     2.5     2.0     2.8
Gain Y    0.0     2.5     2.0     2.8
Gain Z    0.0     2.5     2.0     2.8
EQ F      4500    1       1       1
EQ W      1.5     15.9    15.9    15.9
EQ G      4.5     4.0     2.0     5.0
W        -5.0
X         0.0
Y         0.5
Z         0.0


[Sursound] TetraMic update (and many thanks!!)

2012-03-19 Thread Eric Carmichel
Greetings Everyone,
First, many thanks to all for the help regarding my recent post (VVMic & 
TetraMic help). Special thanks to David McGriffy (VVMic), Len Moskowitz (Core 
Sound / TetraMic), Richard L., Aaron H., Eric B., Jascha N., Paul H., Bill de 
G. and anyone I may have accidentally missed.
I received tremendous support from David McGriffy and Len Moskowitz.
The issue I had has been resolved. It wasn't a simple matter (i.e., not just a 
misplaced file or folder). Unexpected things can happen depending on one's 
software configuration, OS, potential conflicts with other software or hardware 
drivers, etc.
David McGriffy finalized an update to VVMic. Cal files were updated as well. 
And I learned a lot, too (how to better use the software and that people on 
this list are great).
Should anyone else run into the problems that I encountered, be sure to update 
to the most recent version of VVMic. It doesn't require uninstalling older 
versions, finds cal files automatically, and is rock solid.
Again, thanks to all who read and responded to my prior post re the TetraMic 
and VVMic cal issues.
Sincerely,
Eric C.


[Sursound] Dissertation thoughts and another opinion

2012-03-31 Thread Eric Carmichel
Hi Cara,
I enjoyed reading your post and the many responses that followed. I assume 
you’re aware of the book “Michael Gerzon: Beyond Psychoacoustics.” I believe 
you’d find it to be worthwhile reading for your dissertation.
As an American, I’ll confess we like things big, loud, and gimmicky. Promoting 
Ambisonics here isn't so easy, even at a presumed "progressive" university. MP3 
files played through earbud-type headphones are favored by many young adults. 
I’m guessing this is universal.
I first heard of Ambisonics in the 1970s through articles that appeared in The 
Audio Amateur and Wireless World. At that time I wasn’t ready for anything 
beyond stereo, so I didn’t pay close attention to the emerging quad 
technologies. Admittedly, I was a teenager and was building my first 
“Williamson” vacuum tube amplifier back then. It was only recently that I 
“discovered” the magic and science of Ambisonics.
Things oftentimes happen serendipitously: While pursuing a PhD in Hearing 
Science, I wanted to create virtual listening environments from recorded, 
real-world scenarios to be used for testing cochlear implant (CI) patients. 
Current test protocols for assessing listening ability in noise seemed quite 
limiting. While questioning what was being used to assess CI patients, I read 
about auralization. Bengt-Inge Dalenbäck, PhD was most helpful here. From there 
I jumped onto Ambisonics, and just recently started making music recordings 
using an Ambisonic microphone. People on this sursound list have been very 
helpful, and the persons you hear from are well-respected in this field (not 
speaking for myself, though).
In addition to hearing science, I’m also studying Audio Production Technology 
at a music-oriented school (this is distinctly different from the university I 
attend). I use Pro Tools regularly, but it doesn’t have the surround plug-in. I 
also have Steinberg’s Nuendo on my computer, and this allows me to use the 
popular Ambisonic VST plug-ins. By the way, for those who may not have tried 
it, I’ve had good success with the Harpex software for creating HRTF 
simulations from B-format files. I also have access to awesome studio gear 
(Neumann U47 mics, an SSL console, tubed compressors, and the like), but no one 
I work with addresses anything beyond stereo unless you get into sound for 
video (but sound for video is pretty much effects-oriented).
Back to Ambisonics: One topic you may wish to explore as a “new” topic is using 
Ambisonics in hearing research; specifically, understanding hearing pathologies 
(compared to normal-hearing psychoacoustics). I’ve tried to promote Ambisonics 
as a research tool and for music recording (two distinct audiences). I have a 
few links on my website (cochlearconcepts) that you can probably find 
elsewhere, but there’s also a PowerPoint somewhere on my site that briefly 
touches on the need for real-world testing (the paper focused on Ecological 
Psychology because it was also intended for a grad psych class). My hearing 
research led me to Ambisonics, and Ambisonics led me back to my love for music 
production technologies. I, too, have a heck of a lot to learn, but it has been 
a worthwhile journey.
Very best of luck with your school and project!
Sincerely,
Eric


[Sursound] ORTF, Blumlein, and HRTF files for download

2012-04-05 Thread Eric Carmichel
Greetings All,
I'm glad the topic of Blumlein, ORTF, etc. came up. I've been doing a lot of 
music recording (in contrast to my usual cochlear implant research). Included 
in my arsenal is an SSL console, Neumann U149 mics, an AEA R44 ribbon mic (plus 
several Royer ribbon mics), superb musicians, and more.

Because of a recent article ("The Science and Art of Ambisonics") that appeared 
in the April 2012 issue of Recording magazine, I have received questions from 
students regarding my TetraMic and the Harpex software downloads (free player 
and trial version of their VST). I use the VST (along with Nuendo) because it 
provides the HRTF conversions. The HRTF reference that best matches my head 
seems to be number 1051.

The Blumlein and ORTF settings give very different results when playing back a 
walk-around-the-mic dialog. To make the sample recording, my girlfriend 
(Janice) and I read from a script and we rotated positions while the other was 
reading. To be clearer: We initially mapped out eight positions (0, 45, 90, 
135, 180, 225, 270, and 315 degrees) with 0 being straight ahead. The rotation 
was counterclockwise, and a TetraMic was used along with a TASCAM DR-680. An 
Earthworks measurement mic was also used to measure levels (a cal file was 
created using an acoustical calibrator). The A-format files were converted to 
B-format using VVMic and the IIR files supplied with my TetraMic.

Next I used the Harpex VST (in Steinberg's Nuendo DAW) to create stereo and 
binaural files. One file was created using the stereo/Blumlein setting, another 
was stereo/ORTF, and a third was binaural/1051 ("1051" works best for my head).

The walk-around gives quite different spatial impressions using the different 
mic simulations. You can download the files (dumbed down to mp3) and see 
photos of my mic arrays by going to cochlearconcepts.com/music_page/

(Note: I don't have a link from the cochlearconcept front page, so you have to 
manually enter www.cochlearconcepts.com/music_page/ in your browser.)

With the Blumlein setting and listening under headphones, voices that 
originated from the back left appear to come from the right. Similarly, voices 
from the back right appear to the left. For sounds originating from the front, 
everything is natural and isn't too different sounding from the HRTF setting. 
But in a "surround" of sound (to include naturally-occurring reverberation), 
sounds from the rear are "off" (laterally crossed).
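
To put numbers on that reversal, here is a tiny sketch (my own illustration 
using idealized polar equations, not anything from the Harpex plug-in) of an 
ideal coincident Blumlein pair's gains at the eight talker positions used in 
the script:

    # Illustrative only: an ideal Blumlein pair is two figure-8s at +/-45 deg.
    # Azimuth is counterclockwise from straight ahead, as in the walk-around.
    import math

    for az in range(0, 360, 45):
        left = math.cos(math.radians(az - 45))    # figure-8 aimed front-left
        right = math.cos(math.radians(az + 45))   # figure-8 aimed front-right
        print(f"{az:3d} deg  L = {left:+.2f}  R = {right:+.2f}")

A talker at 135 degrees (back left) lands entirely in the right channel, with 
inverted polarity, and one at 225 degrees entirely in the left, which is 
exactly the crossed behaviour heard in the recording.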

The ORTF gives what one might expect: Strong signals, left and right, for 
sounds originating from the front L and R, respectively, while sounds from 
behind are mostly rejected. Placement of sounds is correct (through headphones 
it's more lateralization than localization, but no contralateral errors) 
despite levels being weaker for rearward sounds. I'm actually a proponent of 
ORTF for music recording, especially when isolating other musicians playing 
simultaneously in a session.

The HRTF setting works quite nicely. The sense of placement and level is what 
we'd expect: Sounds from the right rear appear to originate from this location 
when listening under my Sennheiser HDA 200 headphones. AKG 214 phones work ok, 
too, but not as well as the Sennheisers. Note: All of the files come from the 
same B-format file. L-R errors aren't because the B-format files were in the 
wrong order (W thru Z).

Anyway, listening to the surround recording of a live source might shed light 
on what the software is doing. I welcome you to take a listen (link provided 
above).

For those who may be curious, the photos on the same link are briefly described 
below:
a_001: Janice and my array of mics while video taping a live music performance. 
The group being recorded, Turning Point, used to open for the popular 
jazz-fusion band SpyroGyra in L.V. I haven't processed all the files (yet).
a_002: TetraMic setup for making recordings that I uploaded (ORTF, HRTF, 
Blumlein)
a_003: Neumann U149 tube condenser mic and sax
a_004: Another Neumann U149 being used to mic upright bass
a_005: AEA R44 awaiting yours truly for some truly awful vocals
a_006: Pair of vintage Neumann KM 84s for drum overheads (mics in X-Y 
configuration)
a_007: Student seated at an SSL 4000E/G console (console previously belonged 
to Turner Broadcasting)
b_001: Turning Point (jazz band)
b_001: AKG mics in "mini" ORTF config to close-mic a piano (lid almost closed 
in order to isolate piano)

Happy listening,
Eric
Eric L. Carmichel


[Sursound] RE ORTF, Blumlein, and HRTF files for download

2012-04-06 Thread Eric Carmichel
Hi Daniel,
Thanks for writing. I wasn’t considering the L-R reversal a mic anomaly or 
issue with any of the Ambisonic software (Harpex, VVMic, etc.). I use Blumlein, 
ORTF, and mid-side mic techniques a lot in my commercial (music) recordings. 
Clearly, the lobes of the figure-8 polar patterns cross sides when using the 
Blumlein configuration. This is of little consequence when capturing an 
orchestra forward of the mikes or for capturing the natural ambience in a music 
hall. But when it comes to making “HRTF” or binaural recordings from the 
popular stereo miking techniques (apart from an acoustical test fixture such as 
KEMAR), I just wanted to demonstrate differences among the techniques with a 
simple demonstration. If accurate placement, or perceived sound-source location 
of sounds emanating from the rear plane is important to a person (or, in my 
case, hearing research), then an Ambisonic mic (or native miking) is well worth 
considering. Sounds originating
 from any direction can be reproduced fairly accurately without loss of 
directional cues.
As you know, it is often desirable in live and studio sound to use 
uni-directional mics to reject unwanted (rearward) sounds, and virtual mics 
using B-format files can do this quite nicely, too. While it is known that the 
off-axis coloration for many “classic” cardioid mics lends to their 
characteristic sound, I observed that the characteristic sound for off-axis 
sources when using virtual mics may be a bit different from what we’re used to 
hearing. When people ask which mic technique is best, it really depends on what 
one is trying to capture--I guess that was part of the message I hoped to 
convey to readers. I’m really enjoying experimenting with Ambisonics, as this 
is a new adventure for me. I like to share my experiences, and certainly hope 
others can benefit from them as well.
Kind regards,
Eric




 From: Daniel Courville 
To: Eric Carmichel ; Sursound  
Sent: Friday, April 6, 2012 9:44 AM
Subject: Re: [Sursound] ORTF, Blumlein, and HRTF files for download
 
On 12-04-05 14:28, Eric Carmichel wrote:

>With the Blumlein setting and listening under headphones, voices that
>originated from the back left appear to come from the right. Similarly,
>voices from the back right appear to the left. For sounds originating
>from the front, everything is natural and isn't too different sounding
>from the HRTF setting. But in a "surround" of sound (to include
>naturally-occuring reverberation), sounds from the rear are "off"
>(laterally crossed). (...) L-R errors aren't because the B-format files
>were in the wrong order (W thru Z).

I'm not sure if you're saying that you consider this left/right reversal
an anomaly, but it's the expected behaviour when using a Blumlein setup:
the rear lobe of the microphone pointing left in front is pointing right
in the back. Conversely, the rear lobe of the microphone pointing right in
front is pointing left in the back.

- Daniel


[Sursound] Thanks Haig (and a note of pitch perception)

2012-04-07 Thread Eric Carmichel
Hello Haig,
Thank you very much for the note and for the link (below). Gathering from my 
research and from what I’ve read over the years, pitch discrimination is 
difficult for cochlear implant (CI) recipients, thus making music enjoyment... 
well... not so enjoyable. As the video states, we take our ability to discern 
pitch for granted. I hope others will view the video/doc. In case other readers 
didn’t see your post, I’ve provided the link to the concert you recorded:
 
http://www.abc.net.au/arts/stories/s3051873.htm
 
A friend of mine (Louise Loiselle) received a grant from the NIH as well as 
funding from Med-El (Austria) to research localization ability for bilateral 
(or bi-modal) CI patients. Her doctoral committee includes world-renowned 
hearing scientists Bill Yost (who's well-known in psychoacoustic circles) and 
Michael Dorman (who heads one of the world’s leading CI labs). I’m guessing 
Michael is aware of the CI music you recorded, but I’ll forward the link to 
Louise, Bill, and Michael. Others I know who will be interested in Robin Fox's 
composition include Drs. Chris Brown and Sid Bacon. It is because of my 
interest in creating virtual listening environments for studying CI efficacy 
in noise that I stumbled upon Ambisonics (and I've since added Ambisonics to 
my music-recording arsenal).

 
One question I had asked myself not too long back was where to “insert” a CI 
simulator when using normal-hearing listeners. When listening in a surround of 
sound, vocoding* the signal going to the individual speakers (8 feeds in my 
octagonal setup) doesn’t make much sense. However, Ambisonic recordings could 
once again help because I can rotate virtual mics in 3D environments using the 
B-formatted files created from live recordings. A virtual (monaural) mic can 
represent a CI mic (akin to a hearing aid mic), and the signal picked off the 
mic can be routed through a CI simulator/vocoder. This signal, in turn, 
ultimately goes to my (calibrated) ER-3A insert phones, L or R. This may seem 
trivial, but two microphones, properly spaced (similar to ORTF placement), 
allow me to simulate bilateral CI listening in a 3D environment. Again, the 
bilateral (versus binaural) signal is presented to the subject via ER-3A insert 
phones. The two channels (L & R) can be individually processed, which would be 
the case for bilateral CI users. Bimodal (electric and acoustic) modelling is 
also possible. I use CI Sim software that was developed at the University of 
Granada, and another CI simulator developed by Dr. Qian-Jie Fu. Maybe some 
other readers have a better way to do this, or have presented CI-simulated 
sounds acoustically through a loudspeaker array?
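
For readers who haven't tried it, the virtual-mic idea used here boils down to 
a one-line weighting of the B-format components. The sketch below is my own 
simplification (horizontal-only steering, FuMa-scaled W, made-up function 
name), not code from Harpex or VVMic:

    # Illustrative only: first-order virtual microphone steered from horizontal
    # B-format. pattern = 0.0 gives a figure-8, 0.5 a cardioid, 1.0 an omni.
    import math

    def virtual_mic(w, x, y, azimuth_deg, pattern=0.5):
        az = math.radians(azimuth_deg)
        return (pattern * math.sqrt(2.0) * w
                + (1.0 - pattern) * (math.cos(az) * x + math.sin(az) * y))

Two such virtual mics steered left and right approximate the directivity 
(though not the physical spacing) of bilateral hearing-aid or CI microphones, 
and each feed can then be vocoded independently before it reaches the insert 
phones.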


Anyway, I’m rambling now... Always a lot of ideas and thoughts (to include a 
couple of worthwhile ones). Many thanks again for the link!
Best regards,
Eric

*CI simulators are, for the most part, specialized tone or noise vocoders. 
Envelope extraction can vary (e.g., half- or full-wave rectification with 
appropriate time constants or via a Hilbert transform), and the number of 
output channels varies depending on the number of virtual electrodes being 
simulated. A large (> 12) electrode count doesn't significantly improve speech 
understanding, and narrowing each channel's bandwidth may not improve frequency 
discrimination (narrowing the bandwidth works for normal-hearing listeners, but 
realistic simulations provide broad- or narrow-band noise, not pure tones, on 
the output channels). These are just a few of many variables.
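
To make the footnote concrete, here is a bare-bones noise-vocoder sketch of the 
kind of processing described above. It is not the Granada software or Dr. Fu's 
simulator; the channel edges, filter orders and 50 Hz envelope cutoff are 
arbitrary choices for the example:

    # Illustrative only: band-split, extract each band's envelope, then re-impose
    # the envelope on band-limited noise and sum ("noise vocoder" CI simulation).
    import numpy as np
    from scipy.signal import butter, sosfilt, hilbert

    def noise_vocode(x, fs, edges=(100, 300, 700, 1500, 3000, 6000)):
        rng = np.random.default_rng(0)
        out = np.zeros(len(x))
        env_sos = butter(2, 50, btype="lowpass", fs=fs, output="sos")
        for lo, hi in zip(edges[:-1], edges[1:]):
            band_sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
            band = sosfilt(band_sos, x)
            env = sosfilt(env_sos, np.abs(hilbert(band)))     # smoothed envelope
            carrier = sosfilt(band_sos, rng.standard_normal(len(x)))
            out += env * carrier
        return out

Swapping the noise carriers for sine tones gives the tone-vocoder variant, and 
the number of entries in the band-edge list sets the simulated electrode count.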


[Sursound] Thanks, and a request for PhD mentoring

2012-05-09 Thread Eric Carmichel
Hi Aaron,
Many thanks for sending the link to JASA Express Letters. Readers of this 
mailing list may question why an article pertaining to cochlear implant (CI) 
patients is relevant to Ambisonics. I’ll get to this a few paragraphs down.

It is no surprise that interaural timing differences (ITDs) would be lost with 
CI users. All of the 22 or so electrodes along an implanted array cannot be 
energized simultaneously; this would result in current smearing. To avoid this 
problem, methods of “channel-picking” and interleaving (multiplexing) are used. 
The incoming signal is analyzed, and six (typical) of the frequency bands with 
the greatest or most relevant energy levels are sent to their respective 
electrodes. This processing takes time; a lot more time, in fact, than the time 
it takes sound to travel from one ear to the opposite ear.
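
A rough back-of-the-envelope comparison (my own numbers, not from the article): 
with an effective interaural path of roughly 0.22 m and a speed of sound of 
about 343 m/s, the largest naturally occurring ITD is on the order of 0.22/343, 
i.e. roughly 0.6 to 0.7 ms. Frame-based analysis and interleaved stimulation 
typically operate over windows of a few milliseconds or more, so the 
sub-millisecond fine timing that carries ITDs is smeared before any electrode 
fires.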

Skewed timing issues may impart other deleterious effects. As an example, 
normal-hearing listeners don’t perceive each and every reflection in a 
reverberant environment as a sound apart from its original source. If we 
perceived each reflection as a distinct sound, listening in reverberant 
environments would be quite difficult. The brain has its way of putting 
information together. The Haas effect depends on timing cues as well. Hearing 
loss creates problems that go well beyond mere threshold elevation (loss of 
frequency discrimination being a problem, too!).

Many of the articles on localization ability with CI users probably 
underestimate how difficult it is for CI users to localize sound because 
stimuli are generally presented in the frontal plane (or hemisphere) only. From 
what I learned (via personal communication) at a conference on implantable 
auditory prostheses (CIAP), things really “fall apart” when stimuli are 
presented behind the listener. Data collected on rearward plane localization 
ability might result in little meaningful information.

A friend and former colleague of mine is doing her doctoral dissertation on 
localization ability using binaural EAS patients (localization ability being 
just a part of the study). EAS stands for electro-acoustical stimulation: These 
CI users have residual low-frequency hearing that may be augmented with a 
hearing aid, or they may have near-normal thresholds at the extreme low end of 
the audible spectrum. Low-frequency hearing is where ITDs are effective. 
Interaural level differences (ILDs) are minimally effective or do not exist at 
low frequencies because low-frequency sounds diffract around the head. (By the 
way, Aaron, I know you already understand this: I’m providing this info here 
for other readers.) One of several variables that can affect localization 
ability using ILDs is compression of the signal. Compression is routinely used 
in hearing aids and is almost mandatory for CI processing. A robust signal 
presented to one side of a CI user’s head
 may get compressed, while the attenuated signal (reduced via head shadow) on 
the listener's opposite side doesn’t get compressed. Consequently, the signal 
could appear equally “intense” (or perceptively “loud”) at both ears.
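
A toy example of that effect (the levels and the simple static compressor below 
are my own inventions, not taken from any particular processor):

    # Illustrative only: independent compression at each ear shrinks the ILD.
    def compress(level_db, threshold=60.0, ratio=3.0):
        if level_db <= threshold:
            return level_db
        return threshold + (level_db - threshold) / ratio

    near_ear, far_ear = 65.0, 55.0                  # ~10 dB head shadow assumed
    print(compress(near_ear) - compress(far_ear))   # ILD drops to about 6.7 dB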

The article on CI patients’ localization using ILDs by J. M. Aronoff, D. J. 
Freed, L. M. Fisher, I. Pal, and S. D. Soli (link provided below) makes mention 
of a CI’s (electrical) pulse-width as a possible factor when it comes to ITD 
distortions. I found this interesting because my CI design uses a constant 
pulse width and real-time processing. Briefly, both pulse width and pulse 
amplitude are varied in typical CI designs. The amplitude (voltage) is 
restricted based on safety, and the width’s range is limited by its usefulness. 
The nerve action potential (AP) is different: It’s all or nothing, and mostly 
the same amplitude and duration. Firing rate, place, and the number of nerves 
being innervated dictate loudness and pitch. My CI design uses pulse-width 
modulation (all in real-time) and doesn’t require interleaving to work: For 
these reasons, it may provide improved localization ability.

Re Ambisonics (finally): I’m looking for mentorship; specifically, I’d love to 
hook up with someone who’s into music, hearing science and, of course, 
Ambisonics. I’ll be the first to admit that where I began my doctoral studies 
was a poor match-up. The department was hardware-phobic. If a project required 
more than MATLAB and a keyboard or mouse as the response box, it was viewed 
with a dubious eye. Furthermore, one outside person with some “clout” 
proclaimed that Ambisonic recordings with a Soundfield mic sounded “muddy” to 
him. More than one professor unfamiliar with Ambisonics took his word for this 
(re muddy Ambisonics) without taking a listen, but failed to acknowledge that 
the person purporting "muddiness" had a severe hearing loss (wore hearing aids) 
and was marketing a wholly different “surround” system aimed at audiologists and 
hearing researchers. Pretty much everything I offered was frowned on (or, more 
accurately

[Sursound] Catching the same fly twice (and a curious question)

2012-05-30 Thread Eric Carmichel
Greetings All,
I was intrigued by the post titled 'catching flies' because distance-to 
information is an area of interest to me. As a few folks out there know, my 
interest in Ambisonics (aside from music) is its application to hearing 
research. It is important for safety reasons that a hearing aid (HA) or 
cochlear implant (CI) user be able to determine a source's distance.

Side note: It's interesting that a mic would be compared to the ear. No one 
should expect a microphone alone to do what the ear or auditory system does. A 
quality mic can accurately convert pressure variations to analogous voltage or 
current variations. That's about it. A laboratory grade mic and audio-analysis 
hardware or software can readily measure changes in relative phase, intensity, 
and frequency, and do this over a very wide dynamic range. But converting 
pressure variations (or particle velocity) to voltages is just the beginning of 
a chain of events that ultimately results in a listener’s perception of pitch, 
location, loudness, etc. If the goal is to reproduce a real-world sound field 
around the listener's head, then we need to add the following to the chain: 
Loudspeakers, signal processors, room acoustics, etc. Of course the mic is 
hugely important, and is at the heart of Ambisonics.

Now back to distance approximation:
I’m not sure how many readers are familiar with the book Ecological 
Psychoacoustics (edited by John Neuhoff). For those of you who are interested 
in loudness constancy, loudness of dynamically changing sounds, etc. this book 
addresses aspects of psychoacoustics that aren’t found in the best books on 
psychoacoustics (e.g. An Introduction to the Psychology of Hearing by Brian C. 
J. Moore). One of my mentors and an all-around great guy, William (Bill) Yost, 
wrote, 'The chapters in Ecological Psychoacoustics suggest many reasons why 
combining the rigor of psychoacoustics with the relevance of ecological 
perception could improve significantly the understanding of auditory perception 
in the world of real sound sources. Ecological Psychoacoustics provides many 
examples of how understanding and using information about the constraints of 
real-world sound sources may aid in discovering how the nervous system parses 
an auditory scene.'

Although I don’t subscribe to a single 'school' of psychology, I do buy into 
James Gibson's idea that man (and animals) and their environments are 
inseparable (this is at the heart of Ecological Psychology). Here is where I 
find 'fault' or room for improvement with a lot of controlled laboratory 
experiments: The person (subject) is isolated from his/her environment, thus 
limiting the external validity of many experiments. As an example, there are 
ways of judging a sound source's distance that could be difficult to replicate 
using conventional playback systems in the laboratory. It has been hypothesized 
that we are sensitive to the curvature (or flatness) of a wavefront, and that 
this shape provides cues as to distance. But when performing controlled tests 
of this hypothesis, free-field (anechoic) environments are limited in physical 
dimensions, so near-field / curved-wavefront conditions are difficult to avoid. 
Outside of the laboratory, reflections from
 surfaces are probable cues to distance. In a cafeteria (for example), the 
signal-to-reverb ratio grows as a talker approaches us, thus giving a viable 
cue as to the talker's distance. Naturally, intensity increases as well, but 
intensity alone isn't a great cue without a reference. A distant noise source 
could be equally loud but at the same time reverberant, thus compelling the 
listener to believe the noise source is at a distance. How well HA and CI 
recipients judge distance (and therefore safely avoiding disaster) is one of 
many questions I'm interested in. Again, I'm building a playback system 
designed to answer some of these questions. But if Ambisonics involves too much 
psychoacoustic 'trickery' (as some on the sursound list like to say), then it 
would not be the best recording/playback method for performing the 
aforementioned experiments. But to date, re-creating the sound field as it 
originally existed at the listener's head via Ambisonics
 (while letting the ear and brain do the rest) seems to be one of the best 
research tools at my disposal. (Note: HRTF via headphones isn't a solution 
because headphones physically interfere with behind-the-ear HAs and CIs).
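
A toy calculation of the cafeteria cue mentioned above (my own sketch of a 
simple diffuse-field model, with the critical distance set arbitrarily to 2 m):

    # Illustrative only: the reverberant level is roughly constant in a diffuse
    # field while the direct sound falls 6 dB per doubling of distance, so the
    # direct-to-reverberant (D/R) ratio tracks how far away the talker is.
    import math

    critical_distance = 2.0                  # metres (D/R = 0 dB here), assumed
    for r in (8.0, 4.0, 2.0, 1.0, 0.5):
        dr_db = 20.0 * math.log10(critical_distance / r)
        print(f"talker at {r:4.1f} m  ->  D/R = {dr_db:+5.1f} dB")

Each halving of the talker's distance buys about 6 dB of direct-to-reverberant 
ratio even though the reverberant 'floor' barely changes, which is why the 
signal-to-reverb ratio is such a usable distance cue.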

So how good is Ambisonics in reproducing the original auditory 'scene'? If the 
reconstructed wavefield is close to the original, then what happens when you 
record the Ambisonics system itself? Will the playback of this recording yield 
the same spatial information as the first recording did through an appropriate 
first- or n-order system? Or will the recording of the playback capture the 
so-called 'trickery,' thus making the recording-of-a-recording useless. Anybody 
tried this? I think I’ll give it a go using a four spea

[Sursound] Doppler ILLUSION (vs. shift) and more

2012-05-31 Thread Eric Carmichel
Greetings All,
Many thanks for the insightful responses to my recent post (Catching the same 
fly twice). I was pleased to read that most folks on the mailing list are a lot 
more qualified to discuss Ecological Psychology than I was (and thanks to E. 
Deleflie for the kind and insightful note).

Dr. Peter Lennox brought up a point that I had considered adding to my post. 
The point has to do with what spatial perception is/does. I very much agree 
that discerning or sensing a CHANGE in distance is more important than static, 
directional localization. This, too, can be intertwined with Gibson’s (and 
others) papers on point-of-impact, although Ecological Psychology deals more 
with vision than hearing. I will, however, unabashedly reveal that I received a 
D on a high-school physics project because I wrote that the Doppler shift was 
mostly bogus when it comes to discerning motion (towards or away from the 
observer) because the Doppler shift yields a CONSTANT pitch shift until the 
point where it passes the observer. Years later, a paper was published by 
Michael McBeath (pronounced McBeth, as he told me) et al that proved that my 
earlier and ‘naive’ notion was correct.
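
For what it's worth, the physics supports the constant-shift point (the numbers 
below are mine, not McBeath's): for a source moving straight toward a 
stationary observer the received frequency is f' = f * c / (c - v), which does 
not change during the approach. For a train at 25 m/s with c of about 343 m/s 
that is 343/318, a fixed upward shift of roughly 8 percent, and the 
corresponding downward shift to 343/368 (about 7 percent down) appears only at 
the pass-by.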

McBeath wrote: ‘Despite this fact of physics [re Doppler shift], most people 
tend to hear a pitch rise as the train approaches and a pitch fall as the train 
departs... We tested this phenomenon in the laboratory by presenting listeners 
with simulated Doppler shifted tones and asking them to track with a joystick 
the changes in pitch that they heard... Since the physical measurement of 
frequency and the perceptual experience of frequency (i.e. pitch) were going in 
opposite directions, we called this phenomenon the Doppler illusion. The 
pattern of pitch change that is heard (rising then falling) resembles the 
pattern of LOUDNESS change that occurs. We argue then that the Doppler illusion 
is due to the change in loudness that occurs as the train approaches. More 
generally, dynamic changes in loudness can influence perceived pitch in a 
previously undocumented way.’ [Source = 
http://www.acoustics.org/press/133rd/4pppa3.html]

A lot of my research questions aren’t what the normal-hearing listener 
perceives, but what the hearing-impaired listener perceives. Sadly, we can't 
put ourselves in these shoes, even with simulations. It is for this reason that 
I wish to create a PHYSICAL replica of the acoustic environment, not merely an 
illusion that is 'real' to normal-hearing persons. Dr. Cynthia Compton-Conley 
(Gallaudet University) did her dissertation on an 8-speaker surround system 
that was (originally) intended to be useful for audiologists. She demonstrated 
that hearing aid users made the same number and TYPE of mistakes in the 
surround environment as they did in similar, real-world environments. This, 
then, ‘proved’ that the surround system accurately replicated the real world. I 
know the system’s designer, and had recommended incorporating additional 
stimuli to extend the usefulness of the system. The designer frowned on 
Ambisonics, stating that recordings made with a
 Soundfield mic sounded muddy (Egad!! Did I mention he had hearing loss and 
wore hearing aids?). The sole recording for the research-oriented surround 
system was made with 8 Sennheiser gradient mics, each equally spaced in a 
horizontal array. Photos of this recording system show a KEMAR centered in the 
mics, but I learned that the photo was all for show (KEMAR really was being 
used as a dummy--to impress other dummies??). The system is novel and useful 
because it fits in a standard audiometric test booth, but I believe an 
Ambisonics system would be a lot more flexible (not to mention it's easier to 
make field recordings with a Soundfield mic).

Saying that HA users make the same mistakes in an artificial environment 
doesn’t really 'prove' that this environment provides the same sensation to 
hearing impaired people as a natural environment would, but the results 
certainly lend credence to the system, at least to the extent that one can 
demonstrate that directional mics are beneficial to hearing aid users. So, akin 
to gamers who are fooled by illusions, or normal hearing people who make 
mistakes when presented with distorted stimuli (to simulate hearing loss), my 
quest for the elusive Holy Grail of audio reality is to build a system that 
provides the same PHYSICAL pressures, wavefront curvatures and directions, 
phase angles, blah blah that existed at the recording site. Hopefully, too, the 
‘sweet spot’ would be large enough to permit normal head movement--the use of a 
bite brace just isn’t natural (the aforementioned system has a radius of 2 
feet, so head movements are frowned upon). From
 what I've read, acoustic holography, wave field reconstruction, or high-order 
Ambisonics are my best bets towards achieving this lofty goal.


If the system is REAL enough (again, referring to physical replication, not 
perceptual illusion), then itera

Re: [Sursound] Doppler ILLUSION (vs. shift) and more

2012-06-01 Thread Eric Carmichel
Greetings,
I very much agree with Bo-Erik that what I proposed would be a difficult, if 
not nearly impossible goal to achieve. Loudspeakers are certainly one of the 
weak links in the system--and the link most open to subjective impressions. Dr. 
Bengt-Inge Dalenbäck (CATT-Acoustic) had suggested using the Tannoy System 600 
near-field monitors (a great choice for my budget). I have a few of these, and 
saving for more. At present, I’m using older, passive KRK monitors with Focal 
drivers; I’ve matched the dozen or so I own for frequency response. The KRKs 
are fairly compact, yet each speaker has enough of a low end response that it 
can provide the requisite stimulus without need of a sub or second loudspeaker. 
Note: Some stimuli are designed to originate from a single location, hence the 
need for speakers that are independently full-range.
It’s probably a lot more reasonable to state that what I wish to achieve is a 
system that is “close enough” to a real world (acoustic) environment so that my 
listening experiments have external validity. Experiments that purport 
significant improvement in hearing aid or cochlear implant performance won’t 
mean much if (for example) noise is coming from one loudspeaker and a single 
talker (target stimulus) is emanating from a second speaker on the opposite 
side of the listener. This, to me, just isn’t “real world.” I’ve investigated 
several surround systems and I’m very pleased with results I get from my 
Ambisonics set up and recordings made with my TetraMic (in addition to more 
musical recordings made by many of you on the sursound list). Of course, my 
subjective impression of my personal system would hardly pass scientific 
scrutiny without measurable claims. Fortunately, nobody would expect a perfect 
system, but I should have a
 measurably “realistic” system if I’m going to denounce or applaud a new CI or 
HA processing strategy.
When it comes to sound source localization, I received a very kind email from 
Dr. William Yost the other day. Bill is undoubtedly one of the great names in 
psychoacoustics. The note below (from Bill) may be of interest to those 
interested in spatial hearing, ecological psychology (previous posts), and 
hearing in general: 
“I have been struck with how little we actually know about free-field sound 
source localization. My [Air Force] grant, which just started, deals with 
gaining more information about localizing more than one sound source, esp. when 
the sources produce sound at the same time. This is a topic with almost no 
literature... I am very interested in how the auditory system deals with 
situations in which the source moves and the listener is stationary as opposed 
to when the listener moves and the source is stationary. Both can produce 
nearly identical changes in cues like ITDs and ILDs, but it would not be good 
if the source was perceived as moving when the listener moves. There is a long 
and rich literature on this problem in vision, with several known neural 
circuits that cancel the retinal image changes based on vestibular or 
proprioceptive cues when the observer moves. There is very little work on this 
topic in the hearing literature.” [end abridged email]
As always, I am grateful for the help and insights that I receive on this 
mailing list and other sources. Fortunately, I’m also spending a lot of time 
making live (studio) recordings behind an SSL 4000 console, so my audio 
endeavors continue to be more play than work. Life is good.
Best always,
Eric


[Sursound] The Sound of Vision (Mirage-sonics?)

2012-06-02 Thread Eric Carmichel
Greetings All,
I continue to learn a lot from this site (and someday hope to have something to 
give back). For now, however, I will briefly comment on posts by Umashankar 
Mantravadi, Augustine Leudar, and Etienne.

Etienne wrote the following [abridged]: **The argument essentially says that 
for something to appear real it has to fit people's *pre-conception* of what is 
real, rather than fit what actually is real. In other words, throw out 
veridicality (coincidence with reality); instead, try to satisfy people's 
belief of reality. This is another argument for questioning the extent to which 
physical modeling has the capacity to create illusions of reality in sound...**

Etienne made me consider further something of great importance re my CI 
research. Briefly, we really don’t know what auditory perception is like for 
hearing-impaired listeners (remembering that there’s a lot more to 
sensorineural hearing loss than threshold elevation). For example, does the 
Haas effect work for them? Why is noise-source segregation so difficult? Does 
breaking apart an auditory scene create greater dysfunction, or can they put 
the pieces back together to give the illusion of a unified sound source (as 
with the cello example)? How does multi-band compression sound for them, etc? 
We would most certainly like to know how altering a physical stimulus improves 
their belief of reality (thus improving their ability to communicate or enjoy 
music)? But how do we measure the perception of cochlear implant and hearing 
aid users other than providing *physically accurate, real-world* stimuli? Side 
note: Thanks for the reference to H. Wallach
 (1940).

Re Augustine’s post: Thanks for suggesting Gary Kendall’s paper. While it 
doesn’t provide a *complete* explanation (who can?), it is a good read. I 
proposed a somewhat similar study while a grad student, but the stimuli would 
have included speech, dynamical sounds (such as breaking glass or a bouncing 
ball), and unfamiliar sounds. The constituent components of the unfamiliar 
sounds would be spatially separated but have identical start times. We could 
then ask whether it’s familiarity (as with a cello), arrival times, or other 
variables that unify the separate sounds into a common source.

Umashankar Mantravadi wrote the following: *As a location sound mixer, I 
exploited the visual reinforcement of sound in many situations. If you are 
recording half a dozen people speaking, and the camera focuses on one - 
provided the sound is in synch - the person in picture will sound louder, 
nearer the mic, than the others. It is a surprisingly strong effect, and one 
side benefit is you can check for synch very quickly using it.*

Many thanks for sharing this experience. I am currently creating AV stimuli 
(using a PowerPoint presentation as the metronome/teleprompter). While there is 
nothing new or novel about incorporating video, I am unaware of any 
investigations using cochlear implant patients in a surround of uncorrelated 
background noise combined with a video of the talker(s). One could also study 
the effects of simulated cochlear implant hearing (using normal-hearing 
subjects) with visual cues in a *natural* environment.

It has been known for some time that lipreading is useful for comprehending 
speech presented in a background of noise. For example, Sumby & Pollack (1954) 
showed that visual cues could aid speech comprehension to the same degree as a 
15 dB improvement in SNR. Sounds with acoustic features that are easily masked 
by white noise (for example, the voiceless consonants /k/ and /p/) are easy to 
discriminate visually.

There is a plethora of literature surrounding the benefits of lipreading. It is 
entirely possible that visual cues can affect more than just speech 
comprehension. A study showing the reduction of stress when a listener is aided 
by lipreading could be interesting: It is possible that visual cues, regardless 
of speech comprehension advantages, could reduce listener stress in a difficult 
listening environment. Capturing subtleties, such as talker voice level as a 
function of background noise level, could make video and audio stimuli more 
realistic. Although we might not be sensitive to these subtleties when 
sufficient information is available to us (us = normal-hearing listeners), 
hearing-impaired individuals might make use of visual cues in ways that have 
not been explored. Systematic reduction or elimination of available information 
can be accomplished when the stimulus contains all of the *essential* 
information in the environment.

I am exploring Ambisonics (along with video) as a method of capturing the 
essential information. At worst, I have a great-sounding system for listening 
to others’ musical recordings. Thanks to everyone for the recordings of crowd 
sounds, music, software, and for sharing your wisdom.
Best regards,
Eric


Re: [Sursound] The Sound of Vision

2012-06-03 Thread Eric Carmichel
Hi Robert,
Thanks for the note. I remember a communication we had sometime back.
I agree that we cannot know what reality is like for another person. 
Fortunately, inferential statistics, or just a plain ol' consensus, helps here. 
For example, most normal hearing listeners get the sense (or illusion) of 
phantom images when listening to a decent stereo system. There's agreement to 
this despite not knowing what each individual's perception is like. But the 
same *illusion* may not apply to the hearing impaired listener. But this isn't 
to say that hearing impaired individuals, or even those with a profound 
unilateral hearing loss, can't localize/spatialize sound. So, in this somewhat 
elementary example, we could surmise that stereo recordings would be a poor way 
of bringing *physical reality* to the laboratory (unless it's stereo imaging 
techniques we wish to evaluate).

Whether a person is studying auditory processing disorders, hearing loss, 
implantable prostheses, vision, etc. it would be most ideal to have a 
controlled, real-world environment. If the physical variable is both 
*real-world* (for external validity) and repeatable / portable across 
laboratories (latter being easy to do with recordings and modern electronics), 
then the perceptual consequence of changes made to a single variable (e.g. a 
change in a CI processor's envelope detector) can be determined with a certain 
degree of confidence. Naturally there will be outliers, a range, and all the 
stuff you know about much better than I do, but I believe a *physically real* 
periphonic system will yield much more meaningful results than a two speaker 
system in a tiny audiometric test booth. In the early days of CI testing this 
may not have been the case: Simply getting a decent speech understanding score 
was an accomplishment! But as processors and hearing aids
 advance, I believe the test protocols will have to advance too. Just my 
thoughts here.
Many thanks again for writing.
Kind regards,
Eric



____
 From: Robert Greene 
To: Eric Carmichel ; Surround Sound discussion group 
 
Sent: Sunday, June 3, 2012 7:01 PM
Subject: Re: [Sursound] The Sound of Vision (Mirage-sonics?)
 

Could I point out that in fact one does not
know what auditory reality is like for other
people whether or not they are hearing impaired?
One supposes it is similar. And structurally
it is similar--people tend to hear sound in the
same locations under given circumstances.
But literal sensation is entirely unknowable--
do you see the same color when you look at
green grass that I do? This is essentially unknowable.
One supposes so for convenience. But there is
no way to know because pure sensation cannot be
communicated.
This is something that will not change. More
and more evidence can be adduced to the effect
that the brain processes are similar. But there
cannot be proof that the experience is the same--
this is unverifiable from the scientific viewpoint
and always will be (in anything like our present 
scientific world anyway). Thought and sensation
transfer in the literal sense is not around.
Of course there is always that movie ("Strange Days").
Maybe someday. But right now, no one can know
what anyone else experiences except in some structural sense.
Robert


[Sursound] Blumlein, HOA, Pinnae xfer function, etc.

2012-06-04 Thread Eric Carmichel
Hello Dave,
Thanks for the note (re the sound of vision). In addition to my hearing 
research, I’ve been doing a fair amount of music recording. Here in the USA, 
spaced pair miking is quite popular, as is isolating instruments and then 
positioning the instruments via panning in the final mix. I’ve gone the 
direction of ORTF, Blumlein, mid-side, etc. to satisfy my own interest (and 
create my own opinions). I absolutely love the sound I get using the Blumlein 
Stereo technique with ribbon (generally Royer) mics. It really comes across as 
natural and, in the frontal plane, rivals Ambisonic recordings I've made using 
a borrowed Soundfield mic. The only issue I have with the Blumlein stereo 
technique, at least when it comes to recording stimuli for my research, is the 
reversal of sound sources picked up from the rear. If I was recording a jazz 
quartet in front of me (as an example), and not concerned about the direction 
of reflections in the rear plane, this would not
 be a problem. But if I desire to immerse listeners in a restaurant of 3D 
sound, the Blumlein technique isn’t the best. My solution to date: Ambisonics.

I’d use binaural recordings and matching HRTFs to subjects, but the problem 
here is that of fitting headphones over hearing aids or CIs. I guess what I’m 
ultimately after is the equivalent *HRTF* in the soundfield, and one that 
allows free head movement. The sweetspot doesn’t have to be large, as I don’t 
anticipate subjects to be bouncing around (parent consent isn't the only reason 
for not using children). I’m definitely against putting people’s head in a 
restraint (bit bar or whatever) because head movements are a part of natural 
listening.

When I refer to stereo as an illusion, I’m usually referring to the arrival 
time and SPL of two identical sounds emanating from separate (L + R) 
loudspeakers as being equal. The resulting acoustic wave or waves at the 
listener's head aren’t the same as the physical wave originating from a speaker 
located directly ahead of the listener. I’m fairly sure it’s mostly the brain’s 
integration of info, not physical waveform superposition, that gives the 
illusion that the sound is coming from in between the L & R speakers. In one of 
my studies that ultimately got published in Noise & Health, the stimulus sounds 
came from individual speakers--I didn't have to worry about wave field 
reconstruction. In that study, pinnae cues certainly turned out to be valuable 
for locating complex sounds: When ILDs and ITDs were maintained but the pinnae 
were obscured, localization responses weren’t ambiguous (judged by response 
time) but were nearly always in error (the errors were front-back judgments, 
never lateralization errors).

With regard to your suggesting an anechoic environment with a standardized HOA 
speaker array and direct sounds coming from individual speakers: That’s pretty 
much what I’m shooting for! Oticon appears to be using a HOA system with (I 
believe) 29 speakers. But because I want to use field recordings of restaurants 
or similar venues, I’d need an HOA mic. Anybody out there used the Eigenmike, or 
know where I could get a sample recording made with one? Other suggestions??
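
As a rule of thumb (my own note, not anything from Oticon): full-sphere 
reproduction of order N calls for at least (N+1)^2 loudspeakers and the same 
number of Ambisonic channels, so roughly 29 speakers is consistent with third- 
to fourth-order material (16 to 25 channels), and the 32-capsule Eigenmike is 
usually decoded to about fourth order.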

I guess the good news is that my research isn’t costing taxpayers money or 
hurting anyone. It’s probably odd that I made hearing research something of a 
hobby: All of the work I do is self-funded and not-for-profit. I always have 
unanswered questions (and the list grows). But once I get things in place and 
do a pilot study, I’ll invite a few prominent researchers over to scrutinize my 
setup. I gave a colloquium some years back, and it mostly ended up with 
professors (and department heads) arguing with each other. But despite that 
ordeal, a few ideas came to fruition, and I ended up getting a research award 
from the American Academy of Audiology. That was a good spring board for what I 
wish to accomplish next.

I really appreciate that people put their time and thought into this list, and 
are patient enough to help someone like me.
Thanks again,
Eric


[Sursound] Viola d'amore, Ambisonia, & bittorrent files

2012-06-04 Thread Eric Carmichel
Hello Everyone,
For the first time since I began exploring Ambisonics (and purchasing a 
TetraMic), I see Ambisonia is up and running. What a great site! Many thanks to 
Marc L, Dave M, Oliver L, Etienne, and all the people who made contributions. I 
have two quick questions:

First, does anybody have Ambisonic recordings of a viola d’amore (e.g. a 
concerto for viola d’amore, harpsichord and strings)? For those who may not be 
familiar, the viola d’amore is a bowed instrument with fourteen strings 
(typical)*--seven playing strings and seven additional resonating (or 
sympathetic) strings that go through the bridge and between the fingerboard and 
neck of the instrument. In Germany, violas d’amore without sympathetic strings 
existed for a short time during the early 18th century (thus a 7-string 
viola??). Origins of the viola d’amore are obscure, but it is likely that it 
evolved from instruments coming from the Mid East, where instruments with 
resonating strings were common. Vivaldi used the tunings of D major, d minor, A 
major, a minor and F major in his eight concertos for the viola d’amore.

*Violas d’amore exist with different combinations of playing strings (4, 5, 6, 
and 7) and sympathetic strings (from 4 up to 14).

My second question is that of bittorrent downloads. I’ve generally avoided these 
because of... well... uncertainty as to the security of my computer and the 
authenticity of downloads. I feel confident that anything coming from Ambisonia 
(and its users) is safe, but does anyone have an opinion regarding the 
requisite software for downloading/converting bittorrent files? Specifically, 
should I purchase software that includes built-in virus protection? I'm 
guessing that, like web browsers, not all software packages are equal.

Thanks to all for your help and insights!
Eric


[Sursound] Bittorrent responses--thanks!

2012-06-05 Thread Eric Carmichel
Hello Marc, Paul, Aaron and everyone,

Thanks for the informative responses to my bittorrent inquiry.
Paul, I had previously downloaded a number of files from your ambisonic.info 
site, as there is reference to it from the Core-Sound site (I'm fairly certain 
this is how I found it months back). I particularly like the VoiCE 
recordings--many thanks for sharing your recordings and hosting the recordings 
made by John L and Aaron H. My father is a WWII airplane buff, and I downloaded 
a number of John's recordings for my dad to enjoy (of course, I like them too).

I'm piecing together a dedicated "downloads" computer. I can install basic 
anti-virus software as a safeguard or, perhaps better yet, run Linux - Ubuntu 
on the machine. Even with Linux, it's probably prudent to scan files I wish to 
transfer to a computer running Windows, but this is painless work.
I'm not terribly paranoid of viruses (and most of what anyone would find on my 
computers would bore 'em to death), but everyone's friendly advice made me feel 
more at ease regarding bittorrent downloads.
Thanks again,
Eric


[Sursound] Red is blue & sideways is straight ahead

2012-06-06 Thread Eric Carmichel
Hello All,
First, many thanks for taking time to read this. This may be one of my better 
attempts at communicating what I’m trying to do.
I very much appreciate and respect all the input regarding human perception (re 
prior posts / the sound of vision).

Professor Robert Greene wrote *...But right now, no one can know what anyone 
else experiences except in some structural sense.* I fully agree, but we 
(experimenters, psychologists) would have to provide the same physical stimulus 
for participants to agree on what *red* is. This means that light reflecting 
off of the *red* object contains the electromagnetic wavelengths requisite for 
stimulating the retinal cones (and rods too?) and eliciting a perception of the 
colour red (or the light itself could be *red* by physical definition). Same 
goes for audio stimuli.

I believe it would be interesting to study how the hearing impaired *hear* 
reverberation. Have you listened to the Scottish prayer example that is often 
used in classroom demonstrations? This so-called “ghoulies and ghosties” 
demonstration (found on the “Harvard tapes”) has become somewhat of a classic. 
The recording is of a hammer striking a brick followed by an old Scottish 
prayer. The reader is Dr. Stanford Fidell. Playing the recording backwards 
focuses our attention on the echoes.

Practically no one reports hearing echoes in small (although reverberant) 
spaces when a transient sound is initiated. The echoes are not *heard* although 
the reflected sound may arrive as much as 30 to 50 ms later. The Scottish 
prayer demonstration is designed to make the point that these echoes do exist 
and are appreciable in size. Our hearing mechanism somehow manages to suppress 
the late-arriving reflections, and they go unnoticed (at least for the majority 
of us).

There is reason to believe that hearing-impaired persons have greater 
difficulty suppressing reverberation (a central processing issue, not 
necessarily peripheral organ dysfunction??). Hearing and consciously perceiving 
these echoes could, then, impart a deleterious effect on word recognition 
ability. But without providing the same physical stimulus to the hearing 
impaired listener, can we determine the magnitude of effect? If the recording 
of the hammer (transient) is perceived as being the same regardless whether it 
is played in reverse or not, we can make inferences regarding echo suppression. 
But if the recording used for one population (normal-hearing listeners) is not 
identical to the recording used to study a different population (e.g. 
hearing-impaired listeners), what initial inferences can we make about the 
latter’s perception under reverberant conditions? A recording / playback system 
that includes echoes coming from multiple directions could
 provide additional insight (and real-world validity).

All I’ve been saying is that the one variable that can be controlled is the 
physical stimulus. Stimuli that represent real-world scenarios have more 
external validity than tightly controlled sounds made up of monaural buzzes, 
clicks or tones. Similarly, it’s relatively easy to build and program a robot 
that can navigate in a virtual world built around well-defined colors, blocks 
and shapes; understanding how we navigate in the real (complex) world requires 
more complex stimuli (e.g. Rodney Brooks’ robots successfully navigate over 
difficult terrain without a priori info about the environment). We will never 
know what these robots are *thinking* (some don’t even run on code), but we can 
still measure their performance and then find ways to improve on the design. I 
wish to improve hearing aid and cochlear implant design; consequently, I need 
physical stimuli that represent the world outside of the laboratory. This has 
been my impetus for exploring
 Ambisonics. Naturally, I'm greatly enjoying the musical / artistic aspects of 
Ambisonics as well.

Kind regards,
Eric


[Sursound] KEMAR parts for the TetraMic

2012-08-13 Thread Eric Carmichel
Greetings to All:

Several months ago I had suggested making a recording of an Ambisonics 
recording using an Ambisonic mic centered in an array of six or eight 
loudspeakers. I was curious as to whether the second-generation recording would 
retain the original recording’s (perceived) directional cues. This project, as 
well as several of my projects-in-work required that my TetraMic be rotated on 
its center, vertical axis. There are a number of ways to achieve this, to 
include a U-shaped fixture that could be attached to the Audix shock mount 
(used to hold the TetraMic), allow room for the mic’s cable, and then be 
affixed to a turntable.

Just by dumb luck, I found an easy solution to centering the TetraMic in my 
arsenal of spare parts. It turns out that the Brüel & Kjær clamp used to hold 
1/2-inch mics in KEMAR (or similar acoustical test fixtures) fits nicely around 
the base of a TetraMic’s brass handle. The B&K clamp is easily bolted to a 
frame that makes centering and leveling the mic over a turntable or tripod ball 
head simple. The fixture I built also holds the four TetraMic XLR 
adapter/preamps. You can see photos of my handiwork by going to

elcaudio.com/research/page_001.htm

Side note: There is no link to the above photos from the homepage: I’m way 
behind updating both of my sites (= cochlearconcepts and elcaudio).

The test room shown in the photos is a semi-anechoic room with treatment on all 
walls, the floor, and ceiling. My thanks go out to William (Bill) Yost, PhD for 
allowing me to use the room. Briefly, I wanted video and dry voice recordings 
of speech stimuli for an upcoming study. I made video recordings (from two 
angles) of the talkers to complement the speech stimuli.

While I had access to the semi-anechoic room, I made a few recordings using my 
TetraMic so that I could initiate the aforementioned recording-of-a-recording 
project. Instead of using an array of loudspeakers or a “natural” environment 
to create the first-generation recording, I used a single loudspeaker in a 
fixed location and rotated the TetraMic on its vertical axis. The Bilara ball 
head seen in the photos has tick marks that allow accurate rotation in 
15-degree increments. The distance from the loudspeaker to the TetraMic was 2 
m. For each angle, the same sequence of tone bursts was presented and recorded. 
There are a lot of test signals that could have been used, but I chose to use 
1/6-octave pure tones ranging from 50 Hz to 15 kHz. Tones were generated using 
a popular acoustic analysis application (Arta). It was interesting to note that 
I could hear clicks at the ultrasonic frequencies (which for me is anything 
above 14 kHz).

Time-domain analysis of the Arta-generated tones revealed that the 1/6-octave 
sine wave bursts don’t have rise or fall-time envelopes, and that the tones can 
end abruptly anywhere in their cycle. I ended up applying rise and fall times 
so that I could use the tone bursts in future listening experiments. Localizing 
a click is a lot different from localizing a pure tone because the pinna 
transfer function may be more dominant than ILDs or ITDs when localizing 
complex (e.g. click or transient) sounds. The click is probably masked at 
audible frequencies, but the subsonic, transient information could still 
provide a localization clue (could be interesting to find out).
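For anyone who wants to roll their own bursts, the envelope fix is only a few 
lines. A minimal sketch in Python/NumPy (the 48 kHz rate, 200 ms length, and 
5 ms raised-cosine ramps are placeholder assumptions, not what Arta generates):

import numpy as np

fs = 48000        # sample rate (assumed)
dur = 0.2         # burst length in seconds (assumed)
ramp = 0.005      # raised-cosine rise/fall time in seconds (assumed)

def tone_burst(freq):
    # Pure-tone burst with raised-cosine onset and offset ramps, so the
    # burst no longer starts or ends abruptly mid-cycle.
    t = np.arange(int(fs * dur)) / fs
    x = np.sin(2 * np.pi * freq * t)
    n = int(fs * ramp)
    w = 0.5 * (1 - np.cos(np.pi * np.arange(n) / n))  # half-cosine, 0 to 1
    x[:n] *= w          # fade in
    x[-n:] *= w[::-1]   # fade out
    return x

# 1/6-octave series from 50 Hz up to 15 kHz
freqs = []
f = 50.0
while f <= 15000.0:
    freqs.append(f)
    f *= 2 ** (1 / 6)
bursts = [tone_burst(f) for f in freqs]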

Well, I have a lot of recordings, both audio and video, to sort through. I’ll 
make another post once this is completed. I’ll also make the recordings 
available at a later date.
Best,
Eric


[Sursound] Mobile sound, calibration, planar waves

2012-10-02 Thread Eric Carmichel
Greetings All,
Regarding mobile recording rigs, there seems to be a belief that any recording 
device or preamp that uses rotary pulse encoders in lieu of potentiometers is 
intrinsically dead accurate across channels. For this reason, a lot of folks 
avoid using recording devices with conventional pots for their Ambisonic 
recorder. I personally prefer NOT to bring my laptop with outboard hardware 
(e.g., MOTU Traveler) into the field unless absolutely necessary. I really like 
the sound and quality of my Edirol Roland R-4 Pro. Very low noise preamps 
(purportedly better than the R-4 without "Pro" suffix) and it's super easy to 
use. But it has those horrible pots we abhor. Actually, the concentric level 
adjustments on the R-4 have a conventional pot on the inside and what appears 
to be an encoder on the outside. With the pots fully CW, the gain adjustments 
are uniform across channels. Gain is displayed digitally on the display when 
using the outer gain knobs. I confirmed
 accuracy across channels with a balanced output calibrator I built.
Briefly, my "calibrator" is nothing more than a THAT Corp. balanced line 
driver, tone generator, and low-noise attenuator using Vishay resistors. 
Low-noise circuitry is needed because we're dealing with mic-level (mV) 
signals. The balanced line driver precludes the need for a pseudo-balanced 
(really single-ended) signal feeding the R-4 Pro's balanced (XLR) inputs. All 
in the name of low noise... even with short cable leads. In the end, this 
calibrator could be of benefit to those who may not know whether their 
digitally-controlled attenuators are truly equal across all channels. I've seen 
researchers assume the voltage or attenuation at a device's port is exactly 
what their MATLAB (or whatever) code says it should be; they didn't consider 
the effects of buffer resistors, loading, etc. These effects are generally not 
accounted for in the software, on a computer screen, or other digital displays. 
Hardware calibration isn't a bad thing at all!
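As a trivial example of what the calibrator catches: the level actually 
delivered into a load differs from the software-assumed (open-circuit) level by 
a simple voltage divider. A sketch (Python; the 150 ohm source and 10 kohm load 
are made-up illustrative values, not measurements of any particular device):

import math

def delivered_level_db(r_source, r_load):
    # dB difference between the loaded output and the open-circuit
    # (software-assumed) output, from the simple voltage divider.
    return 20 * math.log10(r_load / (r_source + r_load))

print(round(delivered_level_db(150.0, 10e3), 2))  # about -0.13 dB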
RE plane waves: I saw the recent discussions regarding waves and wavefield 
reconstruction. I'm not one of those persons who solves Legendre polynomials in 
my sleep, so I avoided putting in my 2 cents worth. But I do have a 
question/comment. Some years ago, I recall seeing articles on psychological 
warfare and an acoustical "laser" that had commercial application, too. I can't 
find those references (or my folder with the related JASA article, etc.), but 
now I question what I read. Anyway, it seems that a sound arriving at the ear 
from a coherent sound source would have to be "planar" but of fixed, or finite, 
area. Waves emanating from a source or many sources (such as a water fall) 
approximate plane waves as they become distant from the source. But the "area" 
as well as distance of plane waves would be quite large (infinite if we assume 
the wave is shaped like the surface of a sphere). Because of differences in how 
planar waves can be generated, wouldn't
there be differences in how these waves diffract around and interact with the 
head, pinnae, etc.? In other words, "head shadow" would vary even when the wave 
generating sources' direction and angle remained the same. In some ways, this 
would be akin to comparing shadows from conventional light to laser light 
(assuming no reflecting surfaces to create diffuse light). Just some 
thoughts... and probably has some bearing on localization experiments performed 
in near field versus far field listening environments (??).

Thanks to everyone here for your help, ideas, and suggestions that have 
propelled me along in my research endeavors. My latest explorations have 
involved deconvolution and swept-sine sources using my TetraMic. (I use a KRK 
9000 monitor as my source along with a battery-operated "high-end" amplifier.) 
Still trying to create the Holy Grail of "real-world" stimuli for hearing 
science and cochlear implant research.

Eric


[Sursound] Hybrid Hi-Fi (HyFi?), IRs, etc.

2012-10-03 Thread Eric Carmichel
Greetings to All,
I have been reviewing the literature on Auralization in attempts to create 
viable stimuli for research. Everybody here has been great. I do have another 
question/comment regarding loudspeaker placement.
In nearly all Ambisonic setups, the listener's head lies on a line connecting 
two or more speakers. This includes the 4-speaker cube arrangement. I've noted 
that having two speakers immediately to the left and right of the head creates 
an image that's similar to headphone listening; in other words, it's akin to 
lateralization versus localization effects. Is there any reason not to use an 
odd number of speakers arranged in such a way that no two speakers form an 
imaginary line passing through the listener's head? I am considering building a 
hybrid system based on Ambisonics and Ambiophonics, and was considering a 
pentagonal loudspeaker arrangement. The "Ambiophonic" component would be using 
dividers (gobos or flats, as they're called) between speakers so as to reduce 
early reflections in an otherwise "standard" living room space. From what I've 
read about Ambiophonics, it's an extension of transaural stereo techniques 
(e.g. William Gardner's doctoral
 thesis) with the addition of a partition. It seems that the advantages 
provided by the partition (or partitions in my case) would apply to Ambisonics. 
Please bear in mind that I am designing a system for single-listener research, 
so the obvious disadvantages of dividers (i.e. space hogs) isn't an issue. Has 
anyone had experience using dividers?
I've also been creating research stimuli using avatars (for lipreading), AT&T 
Natural Voice text-to-speech (ATT Labs makes high res voices) software for 
creating sentences, and IRs recorded with a SoundField mic. Daniel Courville's 
website and Bruce Wiggins WigWare are fantastic resources for any of us 
attempting sound design via Ambisonics. I also have a licensed (meaning 
paid-for) version of Harpex, and this is highly recommended for those who can 
afford it. One of my favorite post production DAWs is Sony Sound Forge 10. I'm 
often having to convert numbers of channels (e.g., four B-format channels to 8 
processed channels), and this is very easy to do with Sound Forge. I also use 
digidesign Pro Tools and Steinberg Nuendo, but neither of these is as easy to 
use as Sound Forge.
For the home brew crowd out there, I'll probably upload my plans for a 
multi-channel preamp based on Burr Brown chips. The impetus for building such a 
device (versus buying a ready-made surround sound controller/preamp) is that I 
can use software to control the gain on the Burr Brown chips (a rotary 
controlled encoder is used for conventional volume control). I'm devising 
experiments where the signal-to-noise ratio has to vary depending on a 
subject's response (e.g., two "misses" in a row means increase the SNR). The 
software controller does this automatically, and a MIDI track on a DAW can be 
used to track the changes. Just passing this along for other researchers...
Disclaimer: Suggestions, questions, and ideas presented herein are in no way a 
reflection of my cat, who is far wiser than yours truly.
Eric


[Sursound] Hybrid Hi-Fi (HyFi?), IRs, etc.

2012-10-03 Thread Eric Carmichel
Hi Michael,
Thanks for the note. At least I got as far as 2*pi radians / 5 = 105 degrees. 
Oops, I meant 72--the 105 degrees (Fahrenheit) is today's high temperature for 
Phoenix (and I'm more than ready for summer here to be over!!).
Seriously, the math for speaker feeds from B-formatted material isn't at all 
daunting, though I can't say the same for A- to B-format conversions. I just 
wasn't sure whether even numbers and symmetry have been used as a matter of 
convenience, or whether each speaker requires a directly-opposing "complement" 
speaker for best wave field reconstruction. I have six speakers, but one is to 
be used independently of the others and may not be arranged in accordance with 
the other five speakers. The remaining five (should I go this route) will have 
equal spacing among them. Some of the VST plug-ins provide B-format-to-5.1 
surround conversion, and a 5.1 layout (minus the 0.1) could work for me, too. 
My arrangement doesn't require equal loudspeaker spacing. Speaking of 5.1...
I believe the Waves IRs for surround have been converted from B-formatted IRs 
to a proprietary format (?) for 5.1 application; i.e., I don't think they have 
B-formatted wav files that users can apply to speaker arrangements of their 
choice. I know a lot of the IRs on Acoustics.net (Waves library of IRs) were 
recorded with a Soundfield mic along with Prof. Angelo Farina's expertise, but 
the Waves library and IR-360 surround convolution software is geared more for 
home and theatre surround systems. Am I correct in saying this? I just 
downloaded a trial version of Waves IR-360 to see what their VSTs offer. Of 
course, the Ambisonic (B-format) IRs provided free-of-charge on OpenAirLib.net 
and Space-Net.org provide a lot of material for auralization, and I'm grateful 
to everyone who has provided recorded material as well as food for thought.
Kind regards,
Eric



____
 From: Michael Chapman 
To: Eric Carmichel ; Surround Sound discussion group 
 
Sent: Wednesday, October 3, 2012 12:24 PM
Subject: Re: [Sursound] Hybrid Hi-Fi (HyFi?), IRs, etc.
 
> Greetings to All,
> I have been reviewing the literature on Auralization in attempts to create
> viable stimuli for research. Everybody here has been great. I do have
> another question/comment regarding loudspeaker placement.
> In nearly all Ambisonic setups, the listener's head lies on a line
> connecting two or more speakers. This includes the 4-speaker cube
> arrangement. I've noted that having two speakers immediately to the left
> and right of the head creates an image that's similar to headphone
> listening; in other words, it's akin to lateralization versus localization
> effects. Is there any reason not to use an odd number of speakers arranged
> in such a way that no two speakers form an imaginary line passing through
> the listener's head?

You mean you want two speakers to form a real line through ...  ... ;-)>

But, seriously, I seem to remember matrices for pentagons (?Richard
Furse's site).

No reason why you shouldn't sit down and work out equations for non-even
numbers.

In practice as the minimum speaker requirement (pantophony) for 1st, 2nd,
3rd, 4th-order
is 4, 6, 8, 10,  I don't think non-even has been used much ...
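For illustration, a first-order horizontal decode for a regular pentagon takes 
only a few lines. A sketch (Python/NumPy), assuming FuMa-style B-format (W 
carrying the usual -3 dB) and a plain pseudo-inverse ("basic") decode, with no 
shelf filters or near-field compensation:

import numpy as np

N = 5                                 # pentagon
az = 2 * np.pi * np.arange(N) / N     # speaker azimuths, first one dead ahead

# Row i is what a source in speaker i's direction contributes to (W, X, Y).
enc = np.column_stack([np.full(N, 1 / np.sqrt(2)), np.cos(az), np.sin(az)])

# Pseudo-inverse of the re-encoding matrix: the feeds re-encode exactly to the
# original (W, X, Y), which for a regular array is the usual "basic" decode.
dec = np.linalg.pinv(enc.T)           # one row of (W, X, Y) gains per speaker

def speaker_feeds(W, X, Y):
    return dec @ np.array([W, X, Y])

The same lines work unchanged for any regular layout, odd or even, by changing N.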

> I am considering building a hybrid system based on
> Ambisonics and Ambiophonics, and was considering a pentagonal loudspeaker
> arrangement. The "Ambiophonic" component would be using dividers (gobos or
> flats, as they're called) between speakers so as to reduce early
> reflections in an otherwise "standard" living room space. From what I've
> read about Ambiophonics, it's an extension of transaural stereo techniques
> (e.g. William Gardner's doctoral
>  thesis) with the addition of a partition. It seems that the advantages
> provided by the partition (or partitions in my case) would apply to
> Ambisonics. Please bear in mind that I am designing a system for
> single-listener research, so the obvious disadvantages of dividers (i.e.
> space hogs) isn't an issue. Has anyone had experience using dividers?
> I've also been creating research stimuli using avatars (for lipreading),
> AT&T Natural Voice text-to-speech (ATT Labs makes high res voices)
> software for creating sentences, and IRs recorded with a SoundField mic.
> Daniel Courville's website and Bruce Wiggins WigWare are fantastic
> resources for any of us attempting sound design via Ambisonics. I also
> have a licensed (meaning paid-for) version of Harpex, and this is highly
> recommended for those who can afford it. One of my favorite post
> production DAWs is Sony Sound Forge 10. I'm often having to convert
> numbers of channels (e.g., four B-format channels to 8 processed
> channe

[Sursound] Take a Load off Intel (and put the Load on IC)

2012-10-05 Thread Eric Carmichel
Ok, the subject title is a take on the Robbie Robertson/The Band/Dylan song The 
Weight (Take a load off Annie... and you can put the load right on me). So what 
does this have to do with sursound? Answer: Native processing (Intel) versus 
dedicated hardware control (via a collection of BB PGA2311 ICs).
One of my early concerns regarding research and Ambisonics had to do with 
simultaneous control of 8 or more channels. I'm guessing (and it's really a 
guess) that many Ambisonic Aficionados (AA?) use either a surround 
preamp/receiver or the Master fader of a DAW--so long as the fader can provide 
control for a 4-channel (or greater) buss. I don't like using a mouse-n-DAW 
because this requires being at the computer. Surround controllers, on the other 
hand, are generally limited in their number of channels or become expensive. 
One solution to my 'dilemma' was to use a DAW surface controller. The simplest 
implementation of this idea was an attempt to use a MIDI volume controller to 
remotely control the Master fader. A kit available from midikits.net23.net 
provided an easy to build and flexible solution. This is a hardware device with 
a USB interface that serves to control the (software) Master fader. Another 
solution, for anyone dealing with large channel
 counts, is to use programmable gain amplifiers. This is probably what the 
majority of modern surround receivers use. But by building my own preamp, I 
achieved a large channel count by using serially-connected Burr Brown PGA2311 
ICs. A single rotary pulse encoder controls all channels, but now I have the 
added benefit of software control. The software control has advantages when 
automating data collection and stimuli presentation. Attempts of mixing DAQ 
applications and DAWs via ReWire weren't so successful. The hybrid solution 
works well. This brings me to my recent post regarding hyfi...
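For anyone tempted to build something similar, the mapping from a desired gain 
to the PGA2311's 8-bit code is simple. A sketch (Python; the 
Gain(dB) = 31.5 - 0.5 x (255 - N) mapping, the right-then-left byte order, and 
the last-chip-first chain order are my reading of the datasheet and worth 
double-checking before trusting):

def pga2311_code(gain_db):
    # Map a desired gain in dB to the PGA2311's 8-bit code; None = mute.
    if gain_db is None:
        return 0
    n = round(255 - (31.5 - gain_db) / 0.5)
    if not 1 <= n <= 255:
        raise ValueError("gain outside the -95.5 to +31.5 dB range")
    return n

def chain_word(gains_db):
    # Bytes to clock into a daisy-chain of PGA2311s: 16 bits per chip
    # (right-channel byte then left-channel byte), last chip in the chain first.
    out = bytearray()
    for right_db, left_db in reversed(gains_db):
        out += bytes([pga2311_code(right_db), pga2311_code(left_db)])
    return bytes(out)

# e.g. four chips (eight channels), everything at -20 dB:
word = chain_word([(-20, -20)] * 4)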
Thanks to all who wrote. The info on Richard Furse's site helped immensely. 
Regarding my 6th (or roaming speaker): This channel stands alone for a few 
reasons that I didn't explain but will comment on here: First, my current study 
involves SNRs in reverberant environments. The primary noise source is talkers 
and room reflections... specifically, talkers at a distance. The signal is 
speech from a nearby talker. This represents a scenario found in restaurants, 
and a listening condition that is difficult for cochlear implant users. In the 
absence of other noise or talkers, the SPL of direct sound coming from the 
talker (signal) is considerably greater than the resulting echoes that follow. 
This may not be the case for extremely reverberant spaces such as the Hamilton 
Mausoleum, but it does apply to a typical classroom or restaurant. In fact, the 
signal-to-reverb ratio of the talker's voice gives a clue as to his/her 
distance. This ratio changes depending on
 the noise source's distance from the listener. I could use a minimal wet/dry 
mix to create 'some' reverberation for the nearby talker, but it isn't really 
necessary. Another consideration (the real story) is that the speech signal 
emanating from the lone speaker is created on the fly. I use a fixed number of 
words to make data collection simple, but software randomly mixes the word 
order (grammatically correct sentences aren't required). This way, I use a 
handheld response box containing, say, 8 words written on push-buttons, and the 
subject simply pushes the buttons in the order the words are heard. (Keyboards 
or word recognition software to collect responses becomes unwieldy and 
unreliable). When the listener makes x consecutive mistakes, the SNR is 
automatically improved to make listening easier (or decreased to make it more 
difficult in the case of consecutive correct responses). The noise is surround 
noise via an Ambisonic setup and auralization and/or live recordings of 
restaurant noise. Although reverberant noise is generally
diffuse, localization cues and "glimpsing" aid the listener in segregating and 
understanding the signal. At least that's the idea.
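In case it helps anyone automating the same kind of test, the tracking rule 
itself is tiny. A sketch (Python; the 2 dB step and the two-in-a-row rule are 
placeholders, not the parameters I actually use):

def next_snr(snr_db, history, step_db=2.0, run=2):
    # history is a list of booleans, True = correct response.
    # 'run' consecutive misses raise the SNR (easier);
    # 'run' consecutive hits lower it (harder); otherwise leave it alone.
    if len(history) >= run:
        tail = history[-run:]
        if not any(tail):
            return snr_db + step_db
        if all(tail):
            return snr_db - step_db
    return snr_db

# after each trial: snr = next_snr(snr, responses)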
I'm fortunate that, for the time being, I have access to a controlled listening 
environment. You can see photos of my gear and the room by going to
elcaudio.com/research/page_001.htm
elcaudio.com/research/page_002.htm
elcaudio.com/research/page_003.htm
The room isn't my own, and my thanks go to Dr. William (Bill) Yost for letting 
me use his research space.
I have an array of speakers at home, but data collected from a living room 
hardly qualifies as "controlled" or scientific. This was why I was wondering 
whether an Ambiophonic-Ambisonic hybrid system might be possible. Ideally, I'd 
like to construct a system that is portable. Gobos and flats may work, 
particularly if they are constructed of materials that provide absorption 
across the speech spectrum of frequencies. Low-frequency absorption via a gobo 
would be a more daunting task, though the right combo of mass and compliance 
could yield a low-Q absorber. Just ideas...

[Sursound] Take a Load off Intel (and put the Load on Wii)

2012-10-06 Thread Eric Carmichel
Hi Augustine,
Many thanks for your suggestion. I had considered MaxMSP, but I'm not a 
seasoned user of it. The MIDI project I referenced is actually Arduino-based, 
and I've built a number of controllers using the Arduino boards. The down side 
of using a (RF) remote controller for my work is that some of the sound rooms 
employ steel walls--virtually Faraday shields--thus precluding RF reception. 
One workaround is to use a PIC chip to send multiple control signals along a 
single coax cable. My response boxes work this way: I don't have to use 
multi-conductor cables or multi-pin connectors to go through an audiometric 
test booth; yet, I can transmit and receive multiple signals (basic 
multiplexing via a microcontroller). I should really look more into MaxMSP--I 
have this as well as FlowStone, LabVIEW, and a number of related applications. 
For years, I've used NI LabVIEW because I'm comfortable with it. But I'm not 
aware of ASIO drivers that allow using most audio
 interfaces. Same goes for MATLAB--in my experience, you can only use two 
channels of any given sound card regardless of the card's channel count. I had 
built a hardware controller that integrates with most DAWs--and it interfaces 
with a MOTU 828 (or any audio device). It also uses a chip from a Microsoft 
mouse (easily recognized human interface using Mac or PC) to control Pause, 
Play, etc. functions. Discrete signals, not human operation, signal the mouse's 
functions. In the end, I had an automated stimuli presentation and data 
collection system (data was response time as well as perceived direction--the 
localization study appeared in Noise & Health). Furthermore, it allowed for a 
forced-choice test method. All of this could have been implemented with 
software, but I chose the tools that were available to me at the time. As my 
programming skills improve (along with my understanding of Ambisonics), I find 
new methods and tools needed to perform tasks. I
 very much appreciate your suggestions and will most likely make good use of 
them.
Kind regards,
Eric



____
 From: Augustine Leudar 
To: Eric Carmichel ; Surround Sound discussion group 
 
Sent: Saturday, October 6, 2012 2:43 AM
Subject: Re: [Sursound] Take a Load off Intel (and put the Load on IC)
 

I am not sure about the second part of your post. But with regards to the 
first half, did you consider using MaxMSP (or perhaps SuperCollider)? You could 
control as many channels as you want with a computer and control them with 
almost anything, from a Wii controller via Bluetooth to more or less anything 
with an Arduino.



On 6 October 2012 01:37, Eric Carmichel  wrote:

Ok, the subject title is a take on the Robbie Robertson/The Band/Dylan song The 
Weight (Take a load off Annie... and you can put the load right on me). So what 
does this have to do with sursound? Answer: Native processing (Intel) versus 
dedicated hardware control (via a collection of BB PGA2311 ICs).
>One of my early concerns regarding research and Ambisonics had to do with 
>simultaneous control of 8 or more channels. I'm guessing (and it's really a 
>guess) that many Ambisonic Aficionados (AA?) use either a surround 
>preamp/receiver or the Master fader of a DAW--so long as the fader can provide 
>control for a 4-channel (or greater) buss. I don't like using a mouse-n-DAW 
>because this requires being at the computer. Surround controllers, on the 
>other hand, are generally limited in their number of channels or become 
>expensive. One solution to my 'dilemma' was to use a DAW surface controller. 
>The simplest implementation of this idea was an attempt to use a MIDI volume 
>controller to remotely control the Master fader. A kit available from 
>midikits.net23.net provided an easy to build and flexible solution. This is a 
>hardware device with a USB interface that serves to control the (software) 
>Master fader. Another solution, for anyone dealing with large channel
> counts, is to use programmable gain amplifiers. This is probably what the 
>majority of modern surround receivers use. But by building my own preamp, I 
>achieved a large channel count by using serially-connected Burr Brown PGA2311 
>ICs. A single rotary pulse encoder controls all channels, but now I have the 
>added benefit of software control. The software control has advantages when 
>automating data collection and stimuli presentation. Attempts of mixing DAQ 
>applications and DAWs via ReWire weren't so successful. The hybrid solution 
>works well. This brings me to my recent post regarding hyfi... (refer to 
>original post)
>


Re: [Sursound] Soundfield mic for sale

2012-10-11 Thread Eric Carmichel
Greetings:
I've noticed an STS 250 mic that has been on eBay (at least the US site) for 
more than a week. In fact, I had written the seller (who I do NOT know ) with 
some info pertaining to the STS, and the seller has since added that info to 
the listing. Specifically, the paragraph that reads,
"...acoustics.salford.ac.uk/studentarea/studios/st250_userguide.pdf for the 
user guide. I have the Harpex VST (fully licensed) and player as well as VVMic 
(the latter being useful for A- to B-format conversion when a hardware decoder 
isn't part of the mic). Svein provides great support for his product (Harpex 
dot net)... always very helpful... and has commented on some of my work 
regarding Ambisonics (cochlearconcepts dot com).  Potential buyers may also 
want to know that the four B-channel outputs come from the PAIR of XLR 
connectors when in B mode (of course, the four channels aren't balanced 
outputs, but this is rarely a problem when recording the B-formatted outputs). 
The user manual clearly explains the pin-out needed for B-format output."

So in truth, I know nothing of the mic's condition or the seller's knowledge of 
the STS 250 being sold. But it does look like a very nice mic.

If I manage to find an "affordable" Soundfield or similar high-end Ambisonic 
mic, I'll be selling my TetraMic and matching preamps. In truth, the TetraMic 
is a great mic for the money. My research is entirely self-funded, so I 
understand what you mean when it comes to budgeting. Very good luck in your 
search for a Soundfield mic.
Best,


Eric
www.cochlearconcepts.com


[Sursound] E's Sursound Saga, Part I--Why what I do wrong works

2012-10-14 Thread Eric Carmichel
Hello Sampo,
I always appreciate your suggestions, insight, and (occasional) provocative 
comments. Here’s a tiny bit on info that may shed light on why I do things in 
‘squirrelly’ fashion (and my Ambisonic recording of a chattering squirrel ain’t 
nothin’ compared to this diatribe).
My doc studies began at what was considered to be the number one cochlear 
implant (CI) research lab in the world. (Trust me: I wasn’t the one doing the 
bragging or lab ranking.) Upon my arrival, I was surprised to observe the 
age-old listening condition of speech in one speaker and noise in another 
speaker was still the standard way of studying speech in noise. A simple 
two-speaker arrangement was adequate for earlier CI studies, and is probably 
adequate for studies involving unilateral implantation. But for studies 
involving bilateral implantation (localization, anyone?) and electro acoustic 
hearing (hybrid devices), it seemed that a more ‘realistic’ listening 
environment should be standardized across laboratories.

The surround system chosen by several research facilities is a system known as 
R-Space (see revitronix.com for info). The R-Space system was designed to fit 
in a standard audiometric test booth. They certainly managed this, and the 
R-Space has its merits. But the radius of its 8-speaker circular array is a 
mere 2 feet (0.61 m), so even slight head movements could change the relative 
sound levels at a listener’s ears. The R-Space’s background noise stimulus was 
recorded using eight equally-spaced Sennheiser gradient (or shotgun) mics. At 
least one photo of the recording setup shows a KEMAR centered in the 8-mic 
array (all mics on a horizontal plane), but this photo was for show. The KEMAR 
was neither needed nor used during the recording session.
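To put a rough number on the head-movement issue (a back-of-the-envelope 
calculation, treating each speaker as a point source; the 5 cm offset is only 
an example):

import math

def level_change_db(radius_m, offset_m):
    # Inverse-square level change at the near ear when the head moves
    # offset_m toward one speaker of the array.
    return 20 * math.log10(radius_m / (radius_m - offset_m))

print(round(level_change_db(0.61, 0.05), 2))  # about +0.74 dB at the near ear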
I’m not knocking the R-Space: It was designed for a specific application, and 
it uniquely fits in a tight space. Its main limitation (aside from the weenie 
speaker radius) is the number of available recordings. At the time the system 
was installed at one facility, there was only the 8 (discrete) channel 
recording of Lou Malnati’s Pizzeria. It was my belief that more diverse stimuli 
could be generated, and that a system less sensitive to head movement would be 
of value. (Note to proponents of binaural recordings, head tracking, and HRTFs: 
Headphones are out of the question because they don’t fit over hearing aids and 
cochlear implant processors.)

One study utilizing the R-Space provided ‘scientific proof’ that the background 
noise, as played through the R-Space, could be used to demonstrate real-world 
differences between a hearing aid’s omni and directional mic settings (or 
something like this--Dr. Compton-Conley’s doctoral dissertation can be found on 
the Revitronix website). The R-Space has since been used in other studies, but 
I believe the original noise stimulus has been ‘bastardized’ in such a way as 
to make the external validity of some studies questionable. For example, 
studies have shown such-and-such speech comprehension scores using a +15 dB 
SNR. What you have to read between the lines is that the background noise was a 
recording of a pizzeria, but the noise was being presented at 60 dBA (a rather 
quiet pizzeria for Chicago!). In a different study, the background noise was 
presented at its recommended SPL (= 70 dBA) but the speech stimuli were 
presented at 85 dBA (an
 unrealistically loud talker!). My goal was to find a ‘better’ way to present 
the speech and noise, and to create a larger library of purposeful 
background-noise scenarios. Noise environments would include a quiet coffee 
house and a noisy airport terminal. This is where my journey into auralization 
and, subsequently, Ambisonics, began. Believe me, I make no bones about being a 
novice at auralization and Ambisonics.

The R-Space is more than a set of JBL speakers and an 8-channel recording: It 
includes a MOTU FireWire interface, a Mac computer with external Glyph hard 
drive, a compact 8-channel power amp (QSC, I recall), and MOTU’s Digital 
Performer DAW (sort-of overkill if it’s sole purpose was to play 8 tracks of 
pre-recorded audio!). The 8 channels came pre-assigned to their respective 
tracks, and the session wasn’t really meant to modified (perhaps the reason 
R-Space used SF2 files in lieu of wav files). The university I attended had the 
idea that a few lines of code, or a program written in JAVA or Python (because 
it’s free), could be used in conjunction with Digital Performer. The idea here 
was to present monaural speech stimuli in the same way and with the same 
interface they were accustomed to using. But commercial DAW software is ‘bullet 
proof’ (or idiot proof) and for good reason--to prevent novices and hackers 
from crashing computers (or
 keeping them from attempting to reverse-engineer proprietary software?). 
Furthermore, MIDI-based software doesn't seem to communicate well with other 
MIDI applications (too easy to create positive feedback...)

Re: [Sursound] E's Sursound Saga, Part I--Why what I do wrong works

2012-10-15 Thread Eric Carmichel
Hello Conor,
Many thanks for writing. Your email gets at the heart of the second installment 
of my Sursound Saga (Part II of the diatribe is in work). Yes, I had definitely 
considered a couple of surround mic options. I am relatively new to Ambisonics 
and auralization and, not too long ago, I had more than a few naive notions. I 
had considered the Holophone, spaced miking, the Eigenmic, a Soundfield mic, 
and even the possibility of affixing a circular array of Pressure Zone Mics 
(Crown PZM) on a cylindrical support column located inside a nearby open-court 
restaurant. I did a lot of reading on beamforming--I believe my first source of 
info on this topic was the Handbook of Signal Processing in Acoustics 
(Havelock, Kuwano, and Vorlander, Editors; Springer 2008). [Great book, but 
retails for a lot of $$.].
By the way, you had kindly sent me some product info regarding the RealSpace 
audio camera. You had sent the pdf brochure to me more than a month ago, and I 
had meant to contact you. I was quite curious as to the hardware needed (or 
included) to manage a 64-channel mic array (I'll write to your personal address 
with more questions). Recordings could certainly be adapted to the existing 
R-Space, but I am particularly interested in constructing a more 'open' 
loudspeaker arrangement so that listeners aren't sandwiched between speakers in 
an already-crowded audiometric test booth. One person on my doc committee is 
William (Bill) Yost--I'm going to pass your info along to him. His research 
facility is being renovated. Right now, Bill's awaiting the installation of a 
rotating chair to be used for balance studies (USAF funded research). The 
installers will also be adding a semi-hemispherical array of speakers (roughly 
akin to the setup used at Wright Patterson
 AFB, but only half a dome will be used). I'm not exactly sure what types of 
acoustical stimuli will be presented, but the study does involve a surround of 
sound.

For my work, I have explored a plethora of surround IRs needed to generate my 
own stimuli from dry recordings. My budget (as well as limited brain power) 
dictates what tools I choose and ultimately use. I'll write more in my upcoming 
installment about how I've generated a surround of cocktail party 
(multi-talker) speech for use as background noise. By adding 'natural' reverb 
(e.g. Waves IRs recorded via a Soundfield mic), I can do a bit better than my 
original notion of creating reverb via ray tracing techniques in MATLAB. I'm 
also making live recordings, both for fun and research purposes. My current 
arsenal of recording gear includes a TetraMic (and matching preamps), a TASCAM 
D-680 recorder, a Roland R-4 Pro recorder, and a MOTU 896HD audio interface 
(mostly used for playback).
Anyway, I'd certainly like to learn more about your products.
Thanks again for the info.
Kind regards,
Eric



____
 From: Conor Mulvey 
To: Eric Carmichel  
Sent: Monday, October 15, 2012 8:24 AM
Subject: Re: [Sursound] E's Sursound Saga, Part I--Why what I do wrong works
 

Eric,

Interesting "diatribe." I am by no means an acoustician but I wondered if you 
had considered using a spherical microphone array with beamforming to record 
speech-test background noise for the R-Space system?  It could certainly 
provide you with a quiet coffee house, a noisy airport terminal and other 
content quickly.

In the interest of full disclosure, I do work for VisiSonics and we are 
proponents of head tracking and HRTFs but we do also have a spherical 
microphone array that we build, use and sell.  Using beamforming, we have 
recorded environments for speaker arrays. 

Just a thought,
Conor


[Sursound] Quietest show on Earth

2012-10-25 Thread Eric Carmichel
Hello All,
Just saw the sursound post regarding Orfield Labs' anechoic room and the "30 
dB" conversation level.
I met Steve Orfield at his Tucson (AZ) home not too long ago--actually brought 
up the topic of Ambisonics with him, too. I'll ask Steve where the 30 dB came 
from--and what units of dB. Maybe dBD at 2 m--or converted from units of noids 
(measure of annoyance, oui?).
Anyway, there was a good article on Orfield Labs in a Twin Cities (Minnesota) 
news journal some time back. There used to be a recording facility where 
Orfield Labs is located, and it was home to a few major pop/rock hits (I 
believe "Funkytown" by Lipps Inc. being one of 'em).
Best to everyone,
Eric C.


[Sursound] Vestibular response, HRTF database, and more

2012-11-03 Thread Eric Carmichel
Greetings,
Mostly through serendipity, I have had the pleasure and privilege of great 
teachers. I studied recording arts under Andy Seagle (andyseagle.com) who 
recorded Paul McCartney, Hall & Oates, and numerous others. My doc committee 
included Bill Yost, who is widely known among the spatial hearing folks. And, 
of course, I've learned a lot about Ambisonics from people on this list as well 
as a plethora of technical articles.

I recently sent an email to Bill with the following question/scenario. I 
thought others might wish to give this thought, too, as it gets into HRTFs.

There have been a lot of studies regarding localization in the transverse 
(horizontal) plane. We also know from experiments how well (or poorly) we can 
localize sound in the frontal and sagittal planes. By simply tilting someone 
back 90 degrees, his/her ears shift to another plane. This is different from 
shifting the loudspeaker arrangement to another plane because the semicircular 
canals are now in a different orientation. If a circular speaker array was 
set up in the coronal plane and the person was lying down, then his/her ears 
would be oriented in such a way that the speakers now circle the head in the 
same fashion as they would in the horizontal plane when the person is seated or 
standing. It's a "static" vestibular change, and gravity acting on the 
semicircular canals (and body) lets us know which way is up. But do we have the 
same ability to localize when the body is positioned in different orientations, 
even when the sources "follow" the
 orientation (as is the case in the above example)? How about localization in 
low-g environments (e.g. space docking)? The question came to me while camping. 
I seem able to pinpoint sounds quite well in the (normal) horizontal plane 
despite a skewed HRTF while lying down (and somewhat above ground).

On another (but related) topic, I have downloaded the HRTF data from the Listen 
Project, and have been sorting the participants' morphological features. I have 
this in an Excel spreadsheet, and am converting this to an Access database. 
Using the data, one can pick an "appropriate" HRTF starting with gross 
anatomical features (such as headsize) and whittle it down to minute features 
(such as concha depth or angle). I find HRTF discussions interesting, but still 
argue that headphones and whole-body transfer functions make a difference, too. 
Insert phones destroy canal resonance, whereas an earcup with active drivers 
may have a large "equivalent" volume, thus minimizing external meatus/earcup 
interaction (a mix and match of resonances). Because of this, there can be no 
ideal HRTF, even when it matches the listener.
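A toy sketch of the "whittle it down" selection (Python; the CSV layout and the 
column names here are invented for illustration and are not the Listen 
Project's actual field names):

import csv

def pick_hrtf(subjects_csv, target, weights):
    # Return the subject whose weighted morphological distance to the
    # target listener is smallest; weight gross features (head size)
    # more heavily than fine ones (concha depth).
    best_id, best_d = None, float("inf")
    with open(subjects_csv, newline="") as f:
        for row in csv.DictReader(f):
            d = sum(w * (float(row[k]) - target[k]) ** 2
                    for k, w in weights.items())
            if d < best_d:
                best_id, best_d = row["subject_id"], d
    return best_id

# hypothetical usage:
# pick_hrtf("listen_morphology.csv",
#           {"head_width_cm": 15.2, "concha_depth_cm": 1.1},
#           {"head_width_cm": 1.0, "concha_depth_cm": 0.25})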

While listening to HRTF demos, the notion of auditory streaming and auditory 
scenes came to mind. Some sounds were externalized, but other sounds of varying 
frequencies, while emanating from the same sound source, appeared in my head. 
The end result was that the externalized sounds provided a convincing (or at 
least fun) illusion, but problems do persist. A stringent evaluation of HRTF / 
binaural listening via headphones would require breaking the sounds into bands 
and seeing if a sound's constituent components remain outside of the head. When 
doing so, a brick-wall filter wouldn't be necessary, but a filter that 
maintains phase coherency would be recommended. The demo I refer to was that of 
a helicopter flying overhead. Though I haven't done this (yet), it would be 
interesting to use FFT filtering to isolate the turbine whine (a high-pitched 
sound) from the chopper's blades. The high-pitched sound appeared to be in my 
head, whereas the helicopter as a
 whole seemed externalized. Again, an individualized HRTF and different phones 
may yield different results. Side note: Be careful using FFT filtering--it can 
yield some peculiar artifacts.
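If anyone wants to try the band-splitting informally, a zero-phase split is a 
one-liner per band. A sketch (Python/SciPy; the 5 kHz split point and 
4th-order Butterworth are arbitrary choices):

import numpy as np
from scipy.signal import butter, filtfilt

def split_bands(x, fs, f_split=5000.0, order=4):
    # Forward-backward (zero-phase) filtering keeps the two bands
    # time-aligned, so neither band picks up a phase-shift artifact.
    b_lo, a_lo = butter(order, f_split / (fs / 2), btype="low")
    b_hi, a_hi = butter(order, f_split / (fs / 2), btype="high")
    return filtfilt(b_lo, a_lo, x), filtfilt(b_hi, a_hi, x)

# low, high = split_bands(helicopter_binaural_left, fs)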

I am hoping to use headtracking in conjunction with VVMic to model different 
hearing aid and cochlear implant mics in space. This offers the advantage of 
presenting real-world listening environments via live recordings to 
study/demonstrate differences in mic polar patterns (at least first-order 
patterns) and processing without the need for a surround loudspeaker system. In 
fact, it's ideal for CI simulations because an actual CI user never gets a 
pressure at the eardrum that then travels along the basilar membrane, 
ultimately converted to nerve impulses. With VVMic and HRTF data, I should be 
able to provide simulations of mics located on a listener's head and then 
direct the output to one or both ears. This does not represent spatial 
listening, but it does represent electric (CI) hearing in space. Putting a 
normal-hearing listener in a surround sound environment with 
mock processors and real mics doesn't work because you can't 
isolate the outside (surround) sound from the intended simulation, even 
with EAR foam plugs and audiometric insert phones. 
VVMic and live recordings via Ambisonics is a solution.

Re: [Sursound] Vestibular response, HRTF database, and more

2012-11-04 Thread Eric Carmichel
Howdy Michael,
When it comes to death masks, their HRIRs, and localization in the spiritual 
plane, I won't go there. There's enough in the physical realm to keep me 
confused. For example, I have a pair of socks with L embroidered on one and R 
on the other. I have no idea how the sock manufacturer could tell which was 
which. But so long as I have R and L tattooed on my feet, I can keep things 
aligned. Shoes, on the other hand, are always trial and error.
Seriously (kind-of), there seems to be a dearth of info on auditory-vestibular 
interaction, particularly the dynamic effects. We "know" that a fixed item is 
stationary when we move our head, and we know that a moving object is in motion 
when we stand still (or lie down in the case of a good tracing). Lots of 
studies of vestibular-vision interaction, and there are certainly ways to trick 
the mind into believing what ain't true is believable.
Re static anatomical plane inversions: I don't believe that a vestibular cue 
will turn an azimuth interaural cue into a vertical HRTF cue (or vice 
versa). But if the vestibular interaction is at the level where all cues are 
used to determine the total 3D location of the source, maybe vestibular input 
can correct for where the head is situated (?).
As usual, I ask a lot of naive questions. Perhaps I'm like the prisoners in 
Plato's Cave Allegory who were only allowed to see (or hear) things from a 
limited perspective. Release from the cave may be too much for yours truly to 
comprehend. But as long as I turn toward the shadows (or my feet), everything 
remains real.
Best regards,
E




____
 From: Michael Chapman 
To: Eric Carmichel ; Surround Sound discussion group 
 
Sent: Saturday, November 3, 2012 9:52 PM
Subject: Re: [Sursound] Vestibular response, HRTF database, and more
 

Eric,

A bit wide of your topic ... if not indeed off topic.

If you lie a young healthy person (i.e. 'normal' skin elasticity)
on their back and take a copy of their face (a mask).

If you place this on your desk (paperweight-like) it may draw
comments, but not about it being unnatural.

Now hang it on the wall. It won't look right and people are
likely to say so. Those who have seen 'death masks' in
museums might even ask if it is one.

(You can extend this ... with strange results ... to parts
of the body that are 'normally' clothed ... but that is
another matter.)

So a trivial example of an audience's automatic (and
unconscious) compensation for orientation.

Think you now have to do the experiments you've
outlined ;-)>

Michael



[Sursound] Which order (but not exactly high order)?

2012-11-05 Thread Eric Carmichel
Greetings,
I would like to model microphone pickup patterns in conjunction with HRTFs and 
Ambisonic recordings that I've made. To give a specific example, I would like 
to model a miniature supercardioid mic, pointed forward, that is located proximal 
(or superior) to the pinna. This would be akin to a directional mic on a 
hearing aid or CI processor. The HRTF can be approximate, as the mic is likely 
to be placed slightly above the pinna and close to the head, not right at the 
opening of the ear canal. Some mics, however, are located in the concha, so the 
IRs from the Listen Project would approximate the mic placement, but not the 
mic's polar pattern.
I have recordings of cafeterias and public spaces that I made using a TetraMic. 
VVMic allows me to create first-order mic patterns that can be rotated in 
space. This alone is useful, but does not include the acoustic shadow that 
would be created by a hearing aid wearer's head. I have the Harpex VST, too. 
Harpex includes the HRIRs from the Listen Project (Svein, please correct me if 
I'm wrong on this), thus making binaural simulations a snap. But to get an HRTF 
that includes a specific mic pick-up pattern is a little trickier.
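For reference, the virtual-mic extraction itself is only a couple of lines once 
the material is in B-format. A sketch (Python/NumPy), assuming FuMa-style 
channels with W carrying the usual -3 dB, and the common pattern parameter p 
(1 = omni, 0.5 = cardioid, 0 = figure-of-eight):

import numpy as np

def virtual_mic(W, X, Y, Z, az, el, p=0.5):
    # First-order virtual mic steered to azimuth az / elevation el (radians).
    ux = np.cos(az) * np.cos(el)
    uy = np.sin(az) * np.cos(el)
    uz = np.sin(el)
    # sqrt(2) undoes the -3 dB that FuMa-style W carries relative to X, Y, Z.
    return p * np.sqrt(2) * W + (1 - p) * (ux * X + uy * Y + uz * Z)

What the sketch cannot do, of course, is the part I'm asking about: folding a 
head-related transfer function into the same pickup.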
I had initially used VVMic to create the mic pattern I wanted, and then aimed 
it to the direction I wanted. Input was B-formatted wav files. The resulting 
output is a single channel, or N identical channels if I want to create N 
tracks. I created 4 tracks and used these as pseudo B-formatted material in 
Harpex.
The other "order" would be to create a stereo (binaural) output via Harpex from 
the original (authentic) B-formatted material. Then one of the two channels, L 
or R, could be made into four identical tracks that can be fed to VVMic to get 
the intended polar response. The four tracks, of course, are not B-format.
A bit of head scratching tells me neither method outlined above is correct. At 
least the binaural output from Harpex should be equivalent to an 
omnidirectional mic placed at an ear's concha (ITE hearing aids), and that 
could be used for simulations of electric listening. But I'd really like to 
model hybrid devices that combine both electric (cochlear implant) and acoustic 
(hearing aid) stimulation. It seems that using Ambisonic recordings without the 
need for loudspeakers would be an elegant way to simulate CI listening in 3-D 
environments, but using normal-hearing listeners.

Regarding my recent post (vestibular-auditory interactions and HRTFs): Thanks 
to Peter L. for making my clumsy wording clearer and to Dave M. for making the 
idea more direct and to the point. I have to be careful when referencing the 
anatomical horizontal plane versus the horizontal plane that lies perpendicular 
to gravity. Although a bit off topic of Ambisonics, the post did directly 
relate to spatial hearing. Because it's easy to do virtual mic rotations with 
Ambisonic material, Ambisonics could be a useful tool for studying 
vestibular-auditory interactions.
Thanks to everyone,
Eric


[Sursound] Going 'round and 'round

2012-11-06 Thread Eric Carmichel
Hi All,
Since the vestibulocochlear nerve is responsible for transmitting sound and 
equilibrium (balance) information from the inner ear to the brain, it seems 
that balance dysfunction (or abnormal situations, such as space travel) could 
affect localization.
Dr. Michael Cevette, Ph.D. (Mayo Clinic), Dr. Jan Stepanek M.D. (Mayo Clinic) 
and the Aerospace Medicine & 
Vestibular Research Laboratory (AMVRL) team have investigated vestibular 
illusions underlying spatial disorientation in the aerospace 
environment. From what I gathered in a lecture co-presented by Dr. Cevette and 
Dr. Stepanek, "galvanic vestibular stimulation" (GVS) can be used to induce 
disorientation (the purpose of being to train astronauts and pilots). 
This, combined with studies of spatial hearing, might shed some light on some 
purposeful questions. I'll shoot Dr. Cevette a note and see 
whether he can provide any insight for those who might be interested.
Experiments are in the works for studying the effects of 
vestibular-auditory-localization interactions. Bill Yost (ASU) is awaiting 
installation of a rotating chair and a loudspeaker array. To my knowledge, the 
loudspeaker array will consist of 40+ speakers in a semi-hemispherical 
orientation. I'll ask Bill if he has considered using GVS. 

I recall seeing one reference (trying to dig it up) where the experimenters 
used a chair rotating at a constant rotational velocity. This, then, would 
result in a static but abnormal change in vestibular balance. Not sure if or 
how the sound (or sounds) followed the chair's rotation. Once Bill gets the 
rotating chair and loudspeakers in place (the big setback here in the USA is 
dealing with building codes), I am hoping to use the array for studies 
involving Ambisonics. Vestibular disorders aren't my interest, but spatial 
hearing is.
Eric


[Sursound] Generating interest

2012-11-07 Thread Eric Carmichel
For Sursound subscribers, the ideas of virtual microphones and binaural 
recordings via Ambisonic recordings are old news. But there are a lot of hearing 
scientists, sound designers, recording engineers, and surround-sound 
enthusiasts who are not familiar with Ambisonics and what it offers.
One way of using Ambisonics for research requires no loudspeakers at all (which 
is a plus when dealing with budget and limited space). I just sent the email 
(below) to colleagues who are not familiar with Ambisonics, VVMic, etc. with 
the hope that they find the idea purposeful. People on this list may have 
suggestions as how to improve on my idea. Thoughtful ideas and constructive 
criticisms always welcome.
Best,
Eric

Greetings to All,
I hope you will take a moment to review this email, as I believe the idea 
outlined herein has a lot of research potential. I have also provided links to 
sample recordings that I made yesterday (November 7).
Imagine that we had the technology to create a "perfect" acoustical replication 
of restaurant noise (3-D, of course), and we wish to use this acoustical 
replica to simulate cochlear implant (CI) or EAS listening in a 3-D listening 
environment. One thought might be to put a mock processor or HA body on a 
listener's head, and vary the mic settings (omni, directional, etc.). The mic's 
output would next be routed through a vocoder or other simulator, and the 
resulting electric signal would be routed to an insert phone placed in the 
normal-hearing listener's ear. Now the listener is free to move his/her head in 
our controlled 3-D listening environment while, at the same time, make 
adjustments to the mic's pickup pattern (as well as processor settings).
Two (of several) obvious flaws with this idea are: 1) We don't have a perfect 
replica of a 3-D listening environment, and 2) It would be impossible to block 
out the acoustical stimulus, even with the use of foam plugs and insert phones.
We could correct for the second problem by simply recording what comes through 
the HA mic or mock CI processor's mic, but then we'd have to make separate 
recordings for every mic pattern as well as head position. We could have used a 
KEMAR from the very start, but then we couldn't have captured the same live 
"event" each time we switched to another microphone type or head position.
I believe the best solution can be obtained from mathematical extraction from 
Ambisonics-recorded surround sound wav files. When done correctly, all 
directional information is contained within an Ambisonics recording. All that 
is needed, then, is a way to create virtual microphones that can be steered in 
any direction (in the digital, not acoustical, domain). This is relatively easy 
to do via VVMic, and first-order pickup mic patterns are also a snap to create. 
What one has to be careful of is that mic polar responses are 3-D, too (noting 
that most polar plots are shown in 2-D). With the right software (which I'm 
working on), we can add head tracking. Head tracking is already popular when 
used with binaural recordings. But you have to look closely at what's being 
proposed here: I am creating ELECTRIC (2-D) listening as it might be perceived 
in a 3-D environment. The L or R channel of a binaural recording will roughly 
give the equivalent of an omni mic
 placed at or in the ear canal, but this hardly qualifies as a realistic 
simulation of CI mic placement or type. What I am proposing will provide a lot 
more flexibility and CI realism than an omni mic placed at the concha. When 
more than one mic or mic pattern is used, as would be the case for EAS, we can 
just as easily create a variety of mics, mic orientations, and mic polar 
responses from a single master recording. No speakers are needed, so there's no 
speaker distortion, combing effects, or other anomalies. Here's an example of 
what can be done:

I made a recording in a cafeteria yesterday. The unprocessed recording can be 
heard by going to

www.elcaudio.com/research/unprocessed.wav

Using appropriate software (VVMic, Harpex, and other VSTs) to steer a virtual 
cardioid mic away from me (the talker) yields the wav file below 
(directional_mic_01.wav). Note that this isn't merely deleting a channel or 
channels, nor is it panning L or R. I can steer the mic up, down, sideways, 
front, back, or any direction in 3-D space that I wish.

www.elcaudio.com/research/directional_mic_01.wav

Steering the virtual (cardioid) mic towards me when using the same 
master-recording segment (for direct comparison to the above) yields the 
following wav file:

www.elcaudio.com/research/directional_mic_02.wav

The next step is adding the mic's response and directional characteristics to 
an HRIR so as to include head shadow.  At present, I can turn the master 
recording into a binaural recording using over 50 HRIRs (via Harpex), or I can 
create an unlimited number of microphones and mic directions along with 
1st-order pickup patterns (e.g. VVMic), but not both.
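For the head-tracking part, the tracker's job reduces to rotating the B-format 
field into head coordinates before the virtual mic is extracted. A sketch of 
the yaw-only case (Python/NumPy; positive yaw taken as the head turning left, 
counterclockwise seen from above -- sign conventions vary):

import numpy as np

def to_head_coordinates(W, X, Y, Z, yaw):
    # Rotate horizontal-plane B-format by -yaw so that a forward-pointing
    # virtual mic then follows the tracked head. W and Z are unaffected.
    Xh = np.cos(yaw) * X + np.sin(yaw) * Y
    Yh = -np.sin(yaw) * X + np.cos(yaw) * Y
    return W, Xh, Yh, Z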

Re: [Sursound] Which order (but not exactly HOA)

2012-11-08 Thread Eric Carmichel
Hello Fons,
Thanks for reading my post and for commenting on it.
Years back, I picked up a book titled Acoustical Factors Affecting Hearing Aid 
Performance (Studebaker & Hochberg, Editors). Within the book was a chapter 
titled Transducers and Couplers (or something like this) authored by Mead 
Killion. Though I don't have this in front of me now, the chapter suggested 
that designers of miniature hearing aid mics have mostly overcome the inherent 
problems of mic self-noise, uniform response, etc.
But I view certain claims with a dubious eye. The size of a tiny electret mic 
does impose physical restrictions and a good reason to raise an eyebrow. I'd 
guess that the "directional" characteristics, as purported by the 
manufacturers, do not include head shadow or the effects of head baffling (more 
z-effects than shadow-related attenuation), though both effects must be 
considered when determining the device's net response. As you know, most 
electret mics are intrinsically omnidirectional, though Panasonic provides cardioid
patterns in their line of electret mics. For the most part, a mic's -- or a hearing
aid's -- directionality is obtained by using forward and rearward facing mics 
combined with basic signal cancellation. I can't imagine that there could be 
venting behind the diaphragm of a tiny mic that would then change its response 
(particularly at low- to mid-frequencies). Separate mic elements provide 
opportunities for directional manipulation, but I
 suspect the directional characteristics are highly frequency dependent -- just 
as the superposition of waves can be.
I do know from experience that minute changes to the protective plastic
components surrounding a mic make noticeable differences for HA and CI users. For
example, one manufacturer added what was intended to be a windscreen (really a 
hard plastic "windshield" -- not foam to break up turbulence). The size of the 
shield was minuscule compared to the wavelengths of speech-frequency sounds,
but its size was comparable to (or larger than) the mic diaphragm diameter. Users
of the device complained of the change / addition, and measurable differences 
in speech discrimination scores supported their perception (even indoors and 
away from wind noise). The shield more-or-less formed a cavity with openings on 
the sides. One side was proximal to and parallel with the wearer's head. The 
end message I took from all of this was that small changes can make big 
differences, even to those with poor hearing.
I'm not a mic designer (obviously), but I am interested in the combined effects 
of processing, bandwidth, and polar patterns in real-world listening. Even 
today, I recorded the ambient sounds of another popular cafe using my TetraMic.
Once I have converted my raw wav files (A-format) to B-format (again, I use a 
TetraMic, hence no direct recording in B-format), I can do a number of 
manipulations. All of this is very cool. If all the channels were originally 
received via a left-pointing cardioid mic (as an example), then all the channels
would be identical -- minus the phase anomalies that result from the four mic 
elements being located in slightly different places. From what I see in the 
equations, four identical A-formatted channels converted to B-format would 
simply yield a W channel with all information, and the X, Y, and Z outputs 
would be null or zero. Applying a HRIR to this would give a L and R output with 
both ears being the same in phase, response, etc. (albeit shaped by the head). 
Any attempt to rotate the "head" would just be aligning the head with the mic, 
not rotating the head and mic in space. I'm not a mathematician, so I could be 
way off here.
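
(A quick numerical sketch of that thought experiment, using the textbook tetrahedral A-to-B conversion; the capsule naming, ordering, and scaling are assumptions here, and a real TetraMic's per-capsule calibration filters are omitted.)

```python
import numpy as np

def a_to_b(flu, frd, bld, bru):
    """Idealized A-format (FLU, FRD, BLD, BRU capsules) to B-format conversion."""
    w = 0.5 * (flu + frd + bld + bru)
    x = 0.5 * (flu + frd - bld - bru)
    y = 0.5 * (flu - frd + bld - bru)
    z = 0.5 * (flu - frd - bld + bru)
    return w, x, y, z

sig = np.random.randn(48000)             # the same signal on all four capsules
w, x, y, z = a_to_b(sig, sig, sig, sig)
print(np.max(np.abs(x)), np.max(np.abs(y)), np.max(np.abs(z)))  # all zero: no directional info
```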

I can certainly create a HRIR using monaural material and then apply the IR of 
a cardiod mic to simulate the polar response, but this would simply give the 
equivalent of the mic (with head) pointed to a single, fixed source. If I 
wanted to study the effects of a stationary talker in a reverberant room, this 
would actually be useful. But trying to simulate the effects of ongoing, moving 
or multi-timbral sounds in 3-D space is not easy to solve.
Again, I truly appreciate your expertise and insights. I'll also try to dig up 
data on actual HA mics.
Best,
Eric


Re: [Sursound] Which order (but not exactly high order)?

2012-11-10 Thread Eric Carmichel
Hello Eric,
Many thanks for the link to Knowles' miniature mics, as that is more current 
than the HA mic articles (e.g. Killion et al) I have on file. Clearly, such 
mics aren't intended to compete with Neumann, Schoeps, etc. studio mics, but 
their performance is quite amazing for such tiny mic elements! And their 
minuscule size might make them entirely suitable for multi-element "point
source" mics, too.
Boosting the low end, as you pointed out, certainly has a deleterious effect on 
noise performance. Although F0 (the voice fundamental) isn't needed for intelligible
speech--at least for normal-hearing persons--it does find an important purpose 
in EAS (electric-acoustic stimulation) devices. Briefly, persons with profound 
hearing loss (but with residual low-frequency hearing) have poor speech 
comprehension using only their low-frequency hearing. Cochlear implant 
(electric) hearing yields modest speech scores for many of these persons. But 
when the low-frequency (acoustic) hearing is added to electric stimulation, the 
combined results are amazingly good--much better than what researchers might 
have anticipated. One of the earlier problems was preserving low-frequency 
hearing while implanting an electrode array. Nowadays, shallow electrodes are 
implanted in order to preserve low-frequency hearing (note: a hearing aid is 
still needed to boost the low frequencies).

Questions, of course, remain, and it will be useful to study the effects of EAS 
in reverberant conditions and a variety of noise types. Whether the mic used to 
boost the low frequencies (conventional amplification) should be omni or 
directional may not be clear, and same goes for the mic or mics used to process 
the mid and high frequencies.
As you had also pointed out, there's a head-mic interaction, and this must be 
considered. After all, we're looking at a system, and a holistic approach is my 
objective. Although not every possible variable can be accounted for, it 
doesn't hurt to try a few representative scenarios that a CI recipient would 
encounter. The ability to hear and discern warning signals (and their 
direction) or moving vehicles can be studied. Restaurants, Houses of Worship, 
and classrooms are typical listening environments, and Ambisonic recordings 
certainly provide opportunity to re-create acoustic events over and over again 
in a controlled, repeatable manner. Even without loudspeakers, I believe it is 
feasible to study mic directionality when using Ambisonic recordings. Combining 
mic polar characteristics with HRIRs is the next step.
Thanks for reading my posts, and for taking time to share your wisdom.
Best,
Eric


[Sursound] Quietest place on Earth revisited

2012-11-24 Thread Eric Carmichel
Greetings All

Back in October (Sursound Digest, Vol 51, Issue 24, to be exact) there was a 
post regarding places to visit, and Orfield Labs, the "quietest place on Earth" 
was showcased. It was then pointed out that the BBC article (link below) said, 
"an average conversation runs at about 30 decibels."

I asked Steve Orfield, who owns/operates Orfield Labs, where this figure might 
have come from. Steve politely replied with, "We aren't responsible for the 
levels. We always reference 65.5 dBA, from the old articuatlion index standards 
and ANSI."

So, I suppose the "30 decibel" level came from an unreliable source -- seems to 
happen a lot since the advent of the Internet (that we all know is an alien 
conspiracy).

For those who can get it (yes, the aliens censor the Internet), here's a video 
that aired on NBC's TODAY, November 23, 2012:

http://todaynews.today.com/_news/2012/11/23/15385024-the-sound-of-silence-sailor-finds-peace-in-worlds-quietest-room?lite

The link to the "30 decibel" reference here:

http://www.bbc.com/travel/blog/20121022-the-quietest-place-on-earth

Best to All,
ELC


Re: [Sursound] Quietest place on Earth revisited

2012-11-24 Thread Eric Carmichel
I'm sure there are lots of honest (as well as intentional) mistakes in the 
news. So...


-20 dBm (0.01 mW) vs. 1/10 mW vs. 1/10 W -- just minor differences.

Obviously, there are bigger (as well as more subtle) blunders in the news than 
30 decibel speech. Perhaps speech levels were measured by the reporter at a 
distance... perhaps the 
length of a soccer field? Or the "decibel" reference wasn't the standard 20 uPa 
-- it didn't state 30 dBA or 30 dB SPL.

Nowadays, I do hear and see more errors regarding "dBm" because the impedance
assumption (600 ohms, historically, when converting dBm to voltage) is rarely stated. This often creates
confusion when teaching the difference among line-level, mic-level, dBm, dBv, 
dBu, dBV, etc. I think I'll go into journalism... few will notice my mistakes.
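
(A small sketch of the bookkeeping being taught here; the reference values are the standard ones -- 1 mW for dBm, 0.7746 V for dBu, 1 V for dBV -- and are not taken from the original post.)

```python
import math

def dbm_to_watts(dbm):
    return 1e-3 * 10 ** (dbm / 10.0)               # dBm is power referenced to 1 mW

def dbm_to_volts(dbm, z_ohms=600.0):
    return math.sqrt(dbm_to_watts(dbm) * z_ohms)   # only meaningful with a stated impedance

def volts_to_dbu(v):
    return 20.0 * math.log10(v / 0.7745967)        # 0.7746 V = sqrt(1 mW * 600 ohm)

def volts_to_dbv(v):
    return 20.0 * math.log10(v / 1.0)

print(dbm_to_watts(-20))              # 1e-05 W = 0.01 mW, not 0.1 mW and certainly not 0.1 W
print(round(dbm_to_volts(0), 4))      # ~0.7746 V across 600 ohms
```
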
Best,

ELC


____
 From: David Pickett 
To: Eric Carmichel ; Surround Sound discussion group 
 
Sent: Saturday, November 24, 2012 12:33 PM
Subject: Re: [Sursound] Quietest place on Earth revisited
 
At 08:46 24-11-12, Eric Carmichel wrote:

>Back in October (Sursound Digest, Vol 51, Issue 24, to be exact) there
>was a post regarding places to visit, and Orfield Labs, the "quietest
>place on Earth" was showcased. It was then pointed out that the BBC
>article (link below) said, "an average conversation runs at about 30 decibels."
>
>I asked Steve Orfield, who owns/operates Orfield Labs, where this
>figure might have come from. Steve politely replied with, "We aren't
>responsible for the levels. We always reference 65.5 dBA, from the old
>articulation index standards and ANSI."
>
>So, I suppose the "30 decibel" level came from an unreliable source --
>seems to happen a lot since the advent of the Internet (that we all
>know is an alien conspiracy).

The BBC does not always check facts.

http://news.bbc.co.uk/2/hi/programmes/fast_track/9766517.stm

See this item on whether electronic gadgets can actually affect the safe 
operation of a plane during take off or landing, where a scientist making the 
measurements mistakenly tells a journalist that -20dBm = 1/10 mW without being 
challenged.  Later the journalist quotes this as 1/10 W.

I wrote to point this out, and received a "thank you for your interest" reply...

David


[Sursound] Ghost in Machine

2012-12-14 Thread Eric Carmichel
Greetings to All,

I've been working on listening samples to help explain my "ideas" regarding 
hearing aid and cochlear implant research to others. For starters, I'm using 
IRs obtained with a Soundfield mic to auralize dry speech. Unfortunately, more 
questions than sounds surround my weary head.

I discovered an artifact that will need to be addressed, and the answer may be 
obvious to the experts out there. I have uploaded files so that everyone can 
hear the artifacts. The files can be downloaded from www.elcaudio.com/demos/


The dry recording I had initially planned as a demo was made in a semi-anechoic
room with a Rode NT1-A mic. No big deal here. I took a 6-word sample of the longer
word list and cut out the time between words; zero-crossing detection was used to
eliminate pops as I deleted the sections of silence. The resulting
file is labeled janice_sample_condensed.wav.

Next, the monaural wav file (speech) was "auralized" using the four 
B-formatted, 96 kHz, 24-bit IR files obtained via a Soundfield Ambisonic mic. 
The four IRs (w, x, y, and z) were applied to the monaural dry recording 
(janice_sample_condensed). Finally, I used a popular VST to convert the 
resulting four B-formatted files to a stereo/binaural file (KEMAR or similar 
HRTF). The stereo file is titled janice_60x00y.wav. The 60x00y comes from the 
position of the mic relative to the loudspeaker.
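
(For reference, the auralization step itself can be sketched in a few lines. This is a minimal illustration, not the Sound Forge / Acoustic Mirror workflow actually used; the IR file names are hypothetical, and the soundfile and scipy packages are assumed.)

```python
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

dry, fs = sf.read("janice_sample_condensed.wav")   # mono dry speech
b_chans = []
for ch in ("w", "x", "y", "z"):
    ir, fs_ir = sf.read(f"room_ir_{ch}.wav")       # hypothetical B-format IR file names
    assert fs_ir == fs
    b_chans.append(fftconvolve(dry, ir))           # one convolution per B-format channel
sf.write("auralized_bformat.wav", np.stack(b_chans, axis=1), fs)
# A B-format-to-binaural decoder (e.g. a KEMAR HRTF VST) would then produce the stereo file.
```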

Now for the weird stuff: When you listen to the janice_60x00y.wav file under 
headphones (it's a binaural recording), it's fairly clear that the talker is to 
the right of the listener. This would be expected based on the mic/speaker 
orientation. The first word is the easiest to localize, and one could argue the 
precedence/Haas effect helps localize the first sound in the reverberant room. 
As the sentence progresses, the localization is more blurred (at least to me). 
So, to investigate whether other words could be well localized by starting at 
each word's onset, I moved the wav file editor's cursor to begin at around 4 
seconds. What I noticed was a distinct impulsive/gunshot sound--it isn't 
remotely subtle. This "burst" has nothing to do with non-zero crossing point 
pops or the abrupt start/stop of a waveform without fading in/out of it. This 
occurs at any number of locations, but is particularly noticeable around 3.8 
seconds. But when you listen to the
 wav file from start to finish, no such sound exists. I also trimmed off the 
wav file's first four seconds and provided 50 ms fade-in. The impulse is still 
clearly audible. But yet, it goes completely unnoticed when listening to the 
full-length file from its beginning.

Because the four IR files are 2 s duration, I thought there might be a "ripple" 
that occurs every two seconds. So, to test this, I created a 600 ms noise burst 
from ANSI speech-weighted noise (600 ms is approximately the time taken to say 
Tom). I added a 10 s tail of silence to the noise burst. Next I proceeded to 
apply the IR files using the same settings (e.g. 100 percent wet) as I did with 
the dry-speech recording. There are no "ripples" of impulse noise in the silent 
region. I then cropped off a small initial portion of the noise burst and 
applied a fade in. The impulsive sound is very evident, but doesn't occur when 
listening to the file from its beginning (i.e., the original, full-length 
file). The speech noise files are speech_noise_600ms.wav ("dry" noise); 
speech_noise_hrtf_1.wav (same processing as dry speech stereo); and 
speech_noise_hrtf_cropped.wav (fade-in added to the trimmed file).


Artifacts such as this make me question a lot of what's going on 
research-wise. I don't know how hearing-impaired persons hear or deal with echo 
suppression and artifacts, so these "ghosts" could present a very real problem. 
Although we might not hear the artifact in one condition 
(i.e., playing from beginning), there's still something going on behind 
the scenes.

This kind-of reminded me of "Ghosties and Ghoulies" found on the Harvard Tapes 
psychoacoustic demos (briefly, this demo shows how the brain suppresses echoes:
when the hammer blow is played backwards, the decay is quite audible; when
played forward, it's a brief sound).

Please listen to the files yourself. Your insight is most welcome.

Back to work (and a lot of coffee).
Best always,
Eric


Re: [Sursound] Ghost in Machine

2012-12-15 Thread Eric Carmichel
Hi John,
Thanks for writing. No, this isn't at all like starting playback from the middle
of a word or a piece of music. Certainly, starting at a point where the waveform
has an abrupt onset would produce a pop or click. Have you listened to the file? I
deleted the first 4 s, added a 50 ms fade-in, and the impulse sound is still 
there. But if you begin the wav file from the beginning, there is no artifact.
The impulse-like sound (more like a gunshot--actually the sound of the IR itself) is
quite loud, though there's no noticeable change in the waveform's amplitude.
That's why I use "loud" in lieu of intense--it's perceptual.
If you take the normal (dry) speech, or natural speech recorded in the same room
where the IR was recorded, no such artifact exists. You might get a small click
or pop at the middle of the waveform--this, again, is normal and equivalent to
playing, say, a cosine wave from its beginning (a big click because of the abrupt
rise time). Please listen to the file if you can download it. Use any generic wave
editor (I use Audition because it has a large display and is easy to use), move
the cursor to various parts of the file, and listen. The impulse is there--almost
everywhere--but only in the
processed recording.
Again, many thanks for writing.
Kind regards and Happy Holidays,
Eric




____
 From: John Abram 
To: Eric Carmichel ; Surround Sound discussion group 
 
Sent: Saturday, December 15, 2012 6:50 AM
Subject: Re: [Sursound] Ghost in Machine
 
Perhaps I'm misunderstanding, but this sounds completely normal to me.
The artifacts are simply side effects of starting playback of recorded
speech from the middle of a word. Is this situation going to present
itself to a person using a hearing aid? I mean does the device itself
act as a noise gate?

-- 
with best wishes, John


[Sursound] Ghost in Machine--quick Addendum

2012-12-15 Thread Eric Carmichel
Hi John,
Again, thanks for writing. Questions and comments always make me think harder 
because I often realize that I didn't state my question/problem accurately.
You have a good point regarding gating. This is often evident to hearing aid 
users if settings are too abrupt (expansion seems to work better than gating 
for minimizing some noise).
In my case, I use recorded speech and noise stimuli in research. Hearing-loss 
and cochlear implant simulators are often used so that I can use normal-hearing 
listeners as research participants. The stimuli may sound natural to 
normal-hearing listeners. There's often the problem of conditioned 
listening/hearing (sound design for movies depends on this) versus critical 
listening. We "expect" things to sound a certain way. In the case of my 
auralized (better stated as processed) recordings, the artifacts aren't 
heard--at least not to the normal ear. But if something is peculiar about the
recording (such as is the case of mp3 files--this relies on psychoacoustics, 
too), then we can't say it replicates "real-world" listening even if it sounds 
good or is very hi-fi. Actual recordings with a Soundfield mic don't present 
the curious artifact. Creating the physical reconstruction of a wave field at 
the listener's head is ideal--and why I got started on Ambisonics.
 My IR-processed recordings sound ok--so long as they're played from the 
beginning of the file. But the artifact clearly indicates there's something 
very unnatural about the stimuli. Although it can be ignored by normal-hearing 
persons, I have no idea how the hearing-impaired (to include central auditory 
processing, not just sensorineural loss) might perceive the wav files--even 
when played from the start.
Anyway, everyone's input is always welcome. I hope my previous note and this 
post help clarify my question/concern. I'm still learning--and this means 
learning to formulate questions in understandable ways. I'm very appreciative 
of people's time and expertise.
Thanks and Happy Holidays,
Eric




 From: John Abram 
To: Eric Carmichel ; Surround Sound discussion group 
 
Sent: Saturday, December 15, 2012 6:50 AM
Subject: Re: [Sursound] Ghost in Machine
 
Perhaps I'm misunderstanding, but this sounds completely normal to me.
The artifacts are simply side effects of starting playback of recorded
speech from the middle of a word. Is this situation going to present
itself to a person using a hearing aid? I mean does the device itself
act as a noise gate?

-- 
with best wishes, John


Re: [Sursound] Ghost in Machine

2012-12-15 Thread Eric Carmichel
Hello Fons,
Thanks for writing.
I have recordings made in highly reverberant spaces--and no such artifact exists
in those recordings. Yes, a reverb tail can be heard, but not a loud, distinct, 
"gunshot" sound anywhere in the recordings.
I will upload the 2 s IR for you (I can't go to ftp site from current coffee 
cafe at this moment). However, the same effect occurs with IRs downloaded from 
the Open Air Library--I tried them to see whether my recordings were to blame. 
My IRs were obtained using swept sine measurement and deconvolution, as per 
protocol outlined by Angelo Farina (by the way, did anyone ask if the Farina 
piezo-pinna transducer is related?).
What might be the problem is the software used to apply the IRs to dry 
recordings. In my uploaded example, Sony Sound Forge 10d and its built-in 
Acoustic Mirror was used. Perhaps I should be using Altiverb, Waves IR3, 
YouVerb (I made this up), etc. I'm no expert on IRs--just getting started, aside
from applying Waves' and Trillium Lane's reverbs to music.
Many thanks for help. Please listen to sample--it's more than a reverb 
tail--and pretty odd.
Best always,
Eric




____
 From: Fons Adriaensen 
To: Eric Carmichel ; Surround Sound discussion group 
 
Sent: Saturday, December 15, 2012 9:51 AM
Subject: Re: [Sursound] Ghost in Machine
 
On Sat, Dec 15, 2012 at 08:20:49AM -0800, Eric Carmichel wrote:

> No, this isn't at all like playing speech from middle of a word
> or music. Certainly beginning a recording from a waveform that
> would have abrupt onset would result in a pop or click.

To me it sounds as the normal reverb tail. Which you don't notice 
when the sound that caused it is included, as it sounds natural
in that case.

There may be another issue, but to determine this I'd need the
2 seconds B-format room IR you used.

Ciao,

-- 
FA

A world of exhaustive, reliable metadata would be an utopia.
It's also a pipe-dream, founded on self-delusion, nerd hubris
and hysterically inflated market opportunities. (Cory Doctorow)


[Sursound] Plate Reverb rocks

2012-12-15 Thread Eric Carmichel
Hello John, Fons, and all who read my post regarding IRs and alleged artifacts.
Because my observation was both new (to me) and curious, I did a bit of 
exploration. If nothing else, it would be important NOT to chop up speech 
intended for stimuli after applying reverberation. The same could be said for 
speech recorded in a reverberant environment.
John and Fons were (of course) correct in stating that what I hear is the tail 
of a sound's decay. But in some instances, it's far more pronounced than I 
would have imagined. If an echo's tail bleeds into a subsequent word, the echo 
is quite pronounced when one starts from the word's onset. It is particularly 
noticeable when the sound that created the echo is a broadband sound because it 
will then sound like the impulse itself. I suppose that's why it's so 
pronounced. But it really sounds loud, and it is not something that is heard when
the wav file starts before any echoes are present.
There were differences in the onset sound when comparing natural and 
IR-produced reverberation. With naturally occurring reverb, a strong "T" sound
(a lingua-alveolar stop) will excite room modes and create an audible echo when
the wav file is started mid-word (meaning slightly beyond the initial production of the
T). But it does sound like a "T" sound and not like the "IR" shot that I was
hearing.
When using speech-weighted noise (600 ms duration, 100 ms rise/fall time) plus 
a reverb IR, the effect of echoes is quite pronounced when starting playback 
anywhere in the wav file. Because it's a broadband sound, it does sound like 
the IR (or a "gunshot"). It is like a ghost in the recording.
I next created a pure-tone burst (730 Hz--random selection of
frequency--100 ms rise/fall time) and applied the same IR used in other 
samples. Regardless where I started the playback, the result is a pure tone 
(with echo). There is a noticeable pop if one doesn't start at a zero crossing, 
but this would be expected. A short rise-time would fix this type of click/pop, 
but doesn't "fix" processed speech that is started midway in wav file.
Just to convince myself that my software doesn't create artifacts, I used an IR 
of a different type: This time, a stereo HRTF wav file. It sounds quite good, 
and no peculiar sounds or artifacts are present when the file is started midway in
the sample. All is well.
And to investigate other forms of reverb, I took a 1970s recording that used 
more than a moderate amount of plate reverb. For those of you who remember Neil 
Young's After the Goldrush performed by Prelude, that was my sample of choice. 
This was akin to the natural reverb in that clearly-pronounced stops/phonemes 
can be heard bleeding into subsequent phrases when you begin playback at a phrase boundary.
One likely reason I was hearing so much "gunshot" noise in my original samples 
is because there was other noise in the recording. The presence of echoes and 
tails created by the broadband noise gives the "gunshot" sound. None of the 
artifacts sounded very speech-like, but I assume this is not a fault of the IRs 
or processing; instead, I assume the underlying noise in the recording is being 
mathematically operated on when using IRs. Noise simply accentuates the effect. 
Noise, including mic self-noise, that is not present in the real-world
environment will still be operated on by the IR, and echoes of any noise in the
wav file become distinctly audible when playback starts from any
arbitrary point other than the beginning of the file.
Lessons learned: I was wrong, but not entirely so. It is clear that recorded 
speech can't be chopped up and then presented as speech stimuli. The words or 
sentences to be auralized have to be processed as a whole, and then presented to
the listener. Even with fade-ins, the effect of lingering echoes is extremely
pronounced when the IR comes from a highly reverberant space. It's less
noticeable in moderately reverberant spaces, but not subtle. Clearly, arbitrary 
starting points aren't arbitrary when it comes to creating stimuli.
Thanks again for help, and for setting me straight. One way to learn is to 
experiment and listen carefully. Others already knew what I had discovered for 
myself, but I think I have a good grasp of what's going on. Listen and learn.
Happy Holidays,
Eric





 From: John Abram 
To: Eric Carmichel ; Surround Sound discussion group 
 
Sent: Saturday, December 15, 2012 6:50 AM
Subject: Re: [Sursound] Ghost in Machine
 
Perhaps I'm misunderstanding, but this sounds completely normal to me.
The artifacts are simply side effects of starting playback of recorded
speech from the middle of a word. Is this situation going to present
itself to a person using a hearing aid? I mean does the device itself
act as a noise gate?

-- 
with best wishes, John

Re: [Sursound] Plate Reverb rocks

2012-12-16 Thread Eric Carmichel
Hello Fons,
Thanks again for your help. I'm good at electronics and hearing science, but 
Ambisonics and IRs are new to me. They were never mentioned while I was in
grad school (hearing science), so this has been a self-teaching experience.
I've received a great deal of help from you and others on the sursound list. As
I've mentioned in my past ramblings, stimuli we use in assessing hearing aid or 
cochlear implant efficacy are far from real world. Simple (but rarely spoken) 
speech phrases with monaural white noise* maskers provided a convenient 
yardstick for lab experiments. If a proposed experiment required more than 
MATLAB and a laptop's built-in sound card, it was frowned upon. I'm an "old 
school" hardware junkie, but also use MATLAB, Visual Fortran, C, etc. And I try 
to integrate this background with current issues in hearing disorders. *Note: A 
lot of people were using white noise (occasionally pink noise) as masking 
noise. I had suggested the change to speech
 noise, and did the final mastering and calibration of a now-popular test CD. 
Some of the original tracks (various sources) were meant to be presented at 
such-and-such dB hearing level (HL); whereas others were to be calibrated in 
dBA or dBC. Talk about confusion for the audiologists! I adjusted all tracks to 
match a cal sound that was to be measured in dBC. Whether one used speech 
babble or weighted noise tracks, there was no confusion regarding the 
presentation levels or SNRs. Anyway...

The IRs have been uploaded for you. These particular files were downloaded from 
a site that no longer exists, so I can't provide the link. The now-missing site
was a bit like OpenAirLib.net, but included a classroom, "Great Hall" and 
"Octagon". The original files were already split into W, X, Y, and Z files 
(i.e., not interleaved). I converted to 16 bit and 48 kHz to match my existing 
dry speech (which I recorded). Sony Sound Forge Pro's Acoustic Mirror doesn't 
require that the bit or sample rates of the IR and wav files match, but I had 
already made conversions. I'm still processing my own sweeps, which are 
probably even noisier. My tests were made using a KRK 9000 (I found this to 
have low distortion across frequencies and uniform response) and a TetraMic. If 
I remember correctly, the classroom files were made using a Soundfield mic 
(ST250, ST450, or similar) and a high-end Genelec monitor. You can find the 
16-bit, 48 kHz files in the ir folder:
 www.elcaudio.com/demos/ir/

When using quality IRs from Trillium Lane or Waves, it's easy to change wet/dry 
ratio to taste. But this is subjective. When using IRs for auralization, I 
assume 100 percent wet is the norm. In the end, shouldn't the location of the 
talker (or source) be commensurate with the mic-to-speaker orientation of a 
given Ambisonic IR? That much isn't too bad, but all rooms appear far more 
reverberant than what their respective RT60s suggest. And recordings of speech 
made in reverberant rooms (actual recordings, not processed) sound nowhere as 
reverberant as processed recordings made with IRs. This is why I ask about the 
wet/dry ratio. But if empirical or subjective impressions are used, I can't 
vouch for any scientific validity of the stimuli -- I simply chose it because 
it "sounded right" (this ain't gonna fly).

Many thanks for your time.
Best wishes,
Eric





 From: Fons Adriaensen 
To: Eric Carmichel  
Sent: Sunday, December 16, 2012 4:13 AM
Subject: Re: [Sursound] Plate Reverb rocks
 
(going off-list)

On Sat, Dec 15, 2012 at 03:40:32PM -0800, Eric Carmichel wrote:

> One likely reason I was hearing so much "gunshot" noise in my
> original samples is because there was other noise in the recording.

Yes. The original speech recording contains quite some VLF noise.
It could be a good idea to remove that *before* editing, even more
so if you do hard edits at zero-crossing points instead of short
crossfades. When VLF noise is present the zero-crossings are not
those of the waveform of interest and you could end up splicing
two different 'DC' levels together.

I also suspect there is considerable VLF noise in the reverb IR,
it's a typical problem with IRs made using tetrahedral mics. Also
this should be removed before performing the convolution.
(that's one of the reasons I asked to see the IR)

> Clearly, arbitrary starting points aren't arbitrary when it comes
> to creating stimuli. 

Or editing e.g. classical music recordings (which tend to have 
quite some reverb on them). Here the problem is usually the 
inverse one: you can't e.g. cut at the start of a take (which
is not the start of the piece) as you don't have the reverb
from the previous notes.

I'm still puzzled by the 'car passing by' sound after the
reverberated noise burst. It co

Re: [Sursound] Plate Reverb rocks

2012-12-17 Thread Eric Carmichel
Hello Pierre,
That's good info, particularly for mixing and mastering music.

The problem I have with generating stimuli for research is that it can't be
"mixed for taste" -- that is, compression, reverb, and EQ shouldn't be used to 
make the recording sound better. In an attempt to help my fellow students who 
are also doing hearing research, I outlined several things that should be 
considered when producing "real world" stimuli. In the outline, I had stated 
that nearly every Grammy-winning pop recording used compression and "verb", 
whether hardware-based or VST/RTAS (I continue to use a Teletronix LA-2A for 
vocalists). But we can't use compressed speech or environmental sounds if we 
wish to replicate environments... unless it's the effects of compression we 
wish to study.

Recording speech in a restaurant is trickier than one might imagine. In a noisy 
environment, the speech alone has a wide dynamic range. The average rms value 
for speech (65 dBA) and its dynamic range based on rms values are meaningless. I
don't actually know what the range is when we compare the softest phoneme to 
the loudest voiced sound or to a raised voice. Naturally, we raise our voices 
above the background noise (the Lombard effect), and one paper by Tufts and 
Frank (2003) showed that a talker’s voice level, on average, increases 5 dB for 
every 10 dB increase in background noise level. Reference: Tufts, J. B., and 
Frank, T. (2003). Speech production in noise with and without hearing 
protection. J. Acoust. Soc. Am., 114(2), 1069-1080.
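
(A rough sketch of that relationship; the quiet-condition baseline values below are assumptions for illustration, not numbers from Tufts and Frank.)

```python
def expected_voice_level(noise_dba, base_voice_dba=65.0, base_noise_dba=45.0):
    """Lombard-effect estimate: roughly +5 dB of voice per +10 dB of background
    noise above a quiet baseline (baseline values here are illustrative assumptions)."""
    return base_voice_dba + 0.5 * max(0.0, noise_dba - base_noise_dba)

print(expected_voice_level(75.0))   # ~80 dBA in a loud restaurant
```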

When we combine speech with a cacophony of background noise, managing the 
recording without compression or clipping becomes a challenge. Naturally, 
compression at the first stage of amplification would help a great deal here, 
and may go unnoticed (perceptually) when using a curvilinear compression with 
low compression ratio. But if I then wish to use this as "real-world" stimuli 
to study the effects of hearing aid compression on (for example) localization 
or speech intelligibility in noise, I can't say the hearing aid is doing the 
work--compression had already been applied.

So, modifications of any kind or psychoacoustic anomalies that aren't present
in real-world scenarios taint the research stimuli. I can't use my own ears to 
"hear" what someone with a disorder might hear. But if I know with reasonable 
certainty that the stimuli PHYSICALLY reflects real-world conditions, then the 
problems a hearing-impaired person faces in everyday listening should be 
replicated in a controlled, laboratory condition.

Thanks again for passing along your two cents--it really is valuable to know 
what goes on in the "regular" world of music recording. I have Altiverb, but
really haven't used it much. Now I'm interested in exploring it further.
Kind regards,
Eric C.





 From: Pierre Alexandre Tremblay 
To: Eric Carmichel ; Surround Sound discussion group 
 
Cc: Fons Adriaensen  
Sent: Monday, December 17, 2012 3:20 AM
Subject: Re: [Sursound] Plate Reverb rocks
 
You know, one thing not to forget is that most pop music mixer will pre- and 
post-process reverb sends... filtering and compression are very common before 
the reverb, and would smooth the transient triggering the reverb... the 
opposite can also be true...

I use a chorus before or after the reverb too to help the static-ness of IRs... 
now Altiverb has built it in...

my 2 cents


[Sursound] IR files

2012-12-19 Thread Eric Carmichel
Hello Toni,
You're correct about Waves--no x, y, z files--just surround on their 
Acoustics.net website and IR bundles (I have the Waves Mercury bundle).
www.openairorg.net is the only remaining Ambisonic IR library that I can find. 
There has been updated info to the site since I first "discovered" it months 
ago.
Although a second website now seems to have vanished, there was a site that 
hosted the Ambisonic IRs for a classroom, Great Hall, and Octagon. They even 
included detailed floor plans for the three spaces, showing the various mic 
locations (00x00y to whatever in 1 meter increments). I had downloaded the zip 
files while they existed; however, I hesitate to upload these to my own site 
because the files may have been copyrighted (e.g., leaked Waves raw files). If 
anybody else can shed light on the site, I don't mind hosting the zip files. I 
believe each space had about 300 MB of wav files and info available in zipped 
form. As with OpenAir, the IRs were 24 bit res, 96 kHz sample rate.
I've been recording my own IRs, but so far it's hard to get noise out of 
recordings. I believe any anomalies or noise in the IRs and/or the dry
recordings manifest themselves in audible and deleterious ways after 
processing. Fons A. had pointed out that dc offsets, to include unnatural 
waveforms resulting from the joining/splicing of files could create unwanted 
artifacts, too. I always use zero-crossing fades when editing material for my 
work, but it might also be an advantage to remove very low frequency 
(infrasonic) content.
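
(A minimal sketch of that infrasonic clean-up step, assuming the soundfile and scipy packages; the 20 Hz cutoff and 4th-order choice are illustrative, not a recommendation from Fons.)

```python
import soundfile as sf
from scipy.signal import butter, sosfiltfilt

def remove_vlf(path_in, path_out, cutoff_hz=20.0, order=4):
    """Zero-phase high-pass to strip DC and infrasonic content from an IR
    (or a dry recording) before editing or convolution."""
    x, fs = sf.read(path_in)
    sos = butter(order, cutoff_hz, btype="highpass", fs=fs, output="sos")
    sf.write(path_out, sosfiltfilt(sos, x, axis=0), fs)

# e.g. remove_vlf("ir_w.wav", "ir_w_hp.wav")   # hypothetical file names
```
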
Kind regards,
Eric C.






Hi all,

Apologies if this has already been asked in this list, but:

** does anybody know of one (or more) library of reverbs in B-format? I'm
interested in all possibilities: either as separate downloadable IRs, or as
convolution plug-ins, etc.

I'm not interested in the decoding of the B-format return to whatever
loudspeaker system; only in being able to play myself with the B-format
return.

Thanks a lot!
Toni

PS: I guess that Waves actually has one such library, but they only provide
the decoded 5.1 returns, thus not exposing the B-format intermediate feeds.
So this doesn't count.


[Sursound] NOT The Barber of Seville

2013-02-01 Thread Eric Carmichel
Howdy All (Authentic Western Greeting):
I have to add my two bits on "How not to advertise binaural." I've heard the 
barbershop demo as well as a plethora of the more ubiquitous helicopter demos. 
My delicate and politically correct way of putting it is this: They all suck, 
and in more ways than one.
I got into Ambisonics because people doing peer-reviewed studies on hearing 
were using equally-crappy stimuli to study, in a laboratory environment, 
various processing strategies for cochlear implant and hearing aid devices. At 
least the sentences (e.g. IEEE 5-word sentences) weren't providing cognitive 
cues. I've been a proponent of providing the physical reconstruction of waves,
and letting the listener use subtle cues such as head movements to determine
sound-source direction, distance, or clarity. Physical realism shouldn't
include psychoacoustic cues such as ILDs or ITDs until you stick a head
(plus brain) into the acoustical environment; i.e. let the brain, person, or
processor act on the physical space. Don't create stimuli (at least for
legitimate research) that includes such cues unless you're studying the effects of
a particular parameter. Nothing's going to be perfect in the lab, but wave 
field synthesis, Ambisonics, etc. provide a way of
 capturing a dynamic environment (compared to stand-alone monaural sources 
arbitrarily panned around the room). I've used HRTFs and IRs as a way of 
(poorly) demonstrating the potential of Ambisonics, but fear that I've done an 
injustice in the process.
I use EAR insert earphones (which totally obliterate ear canal resonance--which 
is good if KEMAR already accentuated them!), Sennheiser HDA 200 audiometric 
phones (great for estimating SPL based on voltage level for the average 
listener), AKG studio phones, and Sony DJ phones. All give a different 
presentation that goes beyond tonal characteristics or timbre. But back to 
those binaural and transaural demos...
When it comes to binaural demos, parts of sounds remain in-the-head, whereas 
other components of the auditory scene move about as they should. In the end, 
they amuse and captivate first-time listeners, but that's about it. I heard one 
transaural demo that wasn't at all bad, but the sweet spot was so small that 
shifting position in a chair threw the illusion off. For me, sound in space 
from a surround of loudspeakers rules. This isn't portable, so giving a demo of 
Ambisonics requires that the listener has access to the requisite equipment. 
Hard to find those listeners...
Best to all,
ELC
Eric Carmichel
Cochlear Concepts
I take full responsibility for my opinions, and have eaten my words more than 
once.


[Sursound] Barber Poles and Zeroes (life in the S-domain?)

2013-02-03 Thread Eric Carmichel
Greetings everyone,
I enjoyed the recent posts regarding binaural demos and headphones versus 
loudspeaker arrays.
I am writing this post off the cuff, so I can’t cite peer-reviewed articles
immediately. But I believe what follows is accurate.
Audiograms for individuals with profound hearing loss oftentimes suggest 
measurable, low-frequency hearing. Naturally, there is reason to question 
whether the sensation is, in fact, hearing or a vibrotactile response. It is 
easy to demonstrate our sensitivity to low-frequency vibration by lightly 
touching the cone of a loudspeaker. Not too many years back, the standard 
transducer for measuring hearing was the Telephonics earphone. Some profoundly
deaf persons would indicate a ‘tickling’ sensation at the low (below 500 Hz) 
frequencies. Probably not a surprise when the presentation level was 90 dB HL 
(which equates to an even greater SPL at the low and high frequencies).
Of greater concern to audiologists are unilateral or asymmetric hearing
losses. Above a certain presentation level, bone conduction makes the sound 
clearly audible in the ‘better’ ear, which may not be the ear being tested. 
This ‘crosstalk’ is greatly minimized by use of insert phones (predominately 
EAR phones), but narrowband masking noise is used to ‘mask’ the test signal 
from the non-test ear. Of course, the masking noise itself could be audible in 
the opposite ear if the masking noise level (in HL or SPL) is greater than the 
inter-aural attenuation (dB) of bone and tissue.
Aside from masking and audiometric protocols, the idea of sound transmitted via 
bone conduction is important when it comes to normal-hearing listeners. We 
probably don’t give a lot of consideration to it because the predominant
‘sensation’ is that of hearing, so we assume that sound only reaches the inner 
ear via the outer ear. What we ‘feel’ at loud concerts is identifiable as 
low-frequency vibration, but this certainly adds to the ‘sound’ (I wonder 
whether we could induce motion sickness if the vibration wasn’t in time with 
the airborne sound).
I don’t have knowledge of persons who can discern pitch through touch (skin), 
but any type of vibrotactile (or even visual) response that is rhythmic could 
be discerned as a pattern that can be associated with music or speech. For 
persons who were deaf at birth (congenital deafness; not necessarily genetic 
causes), there is certainly reason to believe they have developed heightened 
sensitivity to such vibration, and, in extraordinary persons, pitch could be 
also be discernible. By the way, percussionists definitely tune their 
instruments to match other instruments in the orchestra/group.
RE profound hearing loss at birth: Whether the brain creates synapses to 
accommodate hearing via tactile input at a young age (i.e. when the brain is 
plastic enough to do amazing things) could be studied using EEG mapping or fMRI.
The sense of hearing is generally associated with the temporal lobe (and 
further reduced to specific areas, or gyri and sulci). I do not know if these 
areas would ‘light up’ in response to other sensory input (such as applying 
vibration to the feet) for those persons. Please be aware that the temporal 
lobe represents higher level cortical function (pattern recognition and 
language), not necessarily sound-source direction. The latter may occur at the 
mid-brain (inferior or superior colliculus).
Scientists have a good grasp on how we hear (meaning the physiological aspects 
of the peripheral auditory system), but I fear we know little about how we
‘listen’. Isolating simple sounds allows us to study and understand important 
aspects of hearing, but humans (and other critters) as well as our environments 
are complex. I’ll leave discussions of man-environment interactions to 
Ecological Psychologists or their opponents.
Regarding dentally-implanted hearing aids, my scant knowledge of this is that 
they are bone-conduction devices. Some children (and adults) may have stenosis 
or atresia that precludes them from hearing despite a normal inner ear system. 
This type of hearing loss is a conductive hearing loss. By providing sound 
vibration to boney surfaces (such as the mandibular process), the inner ear 
receives stimulation as though the footplate of the stapes were acting on the 
oval window (middle and inner ear components). Sound travels faster in a solid 
medium than it does for air, so the usual directional cues are probably lost or 
would take time to ‘learn’ (the other aspect of localization).
You can take a tuning fork and, while it’s vibrating, touch it to your 
forehead, teeth, or prominent (hard) area just behind your ears. It is evident 
from a simple demonstration that the increase in level is not merely a result 
of sound reaching the outer ear. There’s a limit to high frequency response via 
bone conduction because of inertia. It takes more energy to accelerate a large 
mass. The light weight of the eardrum and the bones of the middle ear is what lets them follow the higher frequencies.

[Sursound] Of Scorpions and Phase

2013-02-05 Thread Eric Carmichel
Greetings to All,
First, thanks to Dave and Haigel for your insights and responses to my recent 
post. I know little about insects or arachnids, and had no idea scorpions have 
a keen sense of sound-source direction. This suggests the equivalent of an 
interaural time delay, but as it would apply to vibrotactile sensation at
scorpions’ feet or pectines. This is an educated guess because I can’t imagine
much difference in motion intensity in the short distance between feet, 
particularly if the vibration was produced by a distant source.
Speaking of interaural time delays, I have a concern or suggestion as to how 
educators teach the concepts of phase and time delay. In the speech and hearing 
sciences (as well as recording sciences), it is often taught that time delay 
and phase delay are intimately related (sometimes to the point of making phase 
and time seem synonymous). When talking about complex signals (which most of
the world consists of), this creates confusion. Wouldn’t it be better to teach, 
at least with sound in mind, that time-delay is akin to distance, not phase? 
Moving a loudspeaker back, say 10 meters, is equivalent to a 30 ms (approx.) 
time delay that is independent of frequency. All frequencies are phase delayed, 
but the amount of phase shift is different for every frequency for a given time 
delay (perhaps the source of confusion among students). But here’s something a 
little more interesting:
If phase delay between ears were responsible for our ability to discern 
sound-source direction, then shifting the phase (apart from time delay) should 
provide a directional cue. Actually, I haven’t tried presenting a 1100 Hz tone 
to both ears with one side phase reversed and both ears receiving the onset of 
tone at the same time (i.e. appropriate phase shift for frequency/head 
dimension, but no time delay). In air, the 1100 Hz tone is “shifted” 180 
degrees over 6 inches, which is approximately the distance between ears. 
Conversely, I can time delay the tone so that it is presented to one ear 450 
usec ahead of the opposite ear, but the initial phase of the respective 
presentations are same for both ears. This is time delay without phase delay. 
The perception or sense of direction to phase delay versus (solely) time delay 
is not the same. I guess I bring this up because I’ve overheard discussions 
where someone is attempting to explain ITDs cues by
 saying they’re the same as a sound’s phase delay between the ears. 
Furthermore, I continue to see online blogs where students are asking for a 
clear (not textbook-written) explanation of the difference between phase and 
time delay.
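
(The arithmetic behind that distinction is worth spelling out: one fixed time delay produces a different phase shift at every frequency, phi = 360 * f * delta-t degrees. A trivial sketch:)

```python
itd = 450e-6                              # one fixed interaural time delay, in seconds
for f in (200.0, 1100.0, 4000.0):         # a few example frequencies, in Hz
    print(f, round(360.0 * f * itd, 1))   # 32.4, 178.2 and 648.0 degrees respectively
```
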
Admittedly, none of the above addresses surround sound or Ambisonics (or does 
it??), but I do wish to ask if anybody has listened to broadband sounds that 
have been shifted 90 degrees (relative to the opposite ear). Such a phase shift 
is no trivial task for broadband sounds. It is, of course, trivial to take a 
pure tone and delay it 90 degrees using a wave editor, time delay, or an 
all-pass filter tuned such that x frequency will be shifted pi/2 radians. I 
suppose a fixed, broadband shift could be accomplished via a Hilbert 
transformation or careful design of two sixth-order filters combined to provide 
the requisite phase shift over a wide frequency range (the latter would work in 
real time, though this isn’t requisite for a demo). I don’t know whether 
arbitrary phase shifts (apart from simple polarity reversal) result in image 
blurring or whatever, particularly under earphones where wave superposition in 
the sound field doesn’t occur. Note that
 this is hugely different from time delay because all frequencies are being 
shifted 90 degrees regardless whether there is or isn’t a time delay. It’s also 
different from all-pass filters where phase shift is also frequency dependent 
(and usually 90 degrees at one and only one frequency). Has anybody out there 
built or listened to a circuit with a fixed, 90-degree phase shift across the 
audible or speech frequencies? Again, the complex reference signal would be 
presented to one ear, and the processed signal presented to the opposite ear. 
Could be interesting, or could be totally uneventful. At least it would 
demonstrate the differences between time delay and phase shift.
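
(If anyone wants to audition this, a Hilbert-transform approach is probably the easiest offline route; a minimal sketch, assuming scipy, and ignoring edge effects at the buffer boundaries:)

```python
import numpy as np
from scipy.signal import hilbert

def shift_90_degrees(x):
    """Return x with every frequency component lagged by a quarter cycle (90 degrees).
    np.imag(hilbert(x)) is the Hilbert transform of x, i.e. a constant -90 degree
    phase shift across the band, independent of any time delay."""
    return np.imag(hilbert(x))

# One ear gets the reference signal x, the other gets shift_90_degrees(x).
```
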
Thanks for reading.
Best to All,
Eric
Eric L. Carmichel
Cochlear Concepts


[Sursound] Magnets, unicycles, and Ambisonics

2013-02-05 Thread Eric Carmichel
Greetings to All,
I was looking at the web link that David M. provided regarding inertial 
transducers, and wondered whether placing one of these devices on my living 
room wall would approximate the elusive infinite-baffle loudspeaker (I’m 
recalling articles written by the great Harry Olson). But considering that I 
live in a duplex and that I’m already the neighborhood’s designated mad 
scientist, I probably should avoid such experimentation.
As far as inertial devices and the hearing sciences go (I’ll get to the topic 
of surround sound, too), there was a bone-conduction device that was intended 
to compete with cochlear implants. My understanding was that an ultrasonic 
carrier frequency (around 60 kHz and transmitted via bone conduction) was 
modulated with the speech signal. I don’t know whether the speech was presented 
as pulses (such as neural pulses) or as analog sound. For normal-hearing users, 
this would be akin to wearing a conventional bone-conduction (or inertial)
transducer. The ultrasonic carrier would be filtered out via inertia (as well 
as undetected by the inner ear), leaving only the lower-frequency speech signal 
to be heard. But for persons with non-functional cochleas, some other mechanism 
would have to be at work. I believe the company making the device was 
German-based and went by the name Hearing Innovations. They set up shop in 
Tucson, AZ (and perhaps other cities), but
 seemed to quickly disappear. Not sure what the story was, or whether the 
device worked. On to the topic of binaural listening...
From a theoretical perspective, headphone listening could be quite real because 
real-world listening is ultimately a function of the one-dimensional pressure 
changes impinging on the eardrums. Recent posts suggested that headphone 
listening can’t provide the same stimuli or experience as real-world listening, 
particularly for low-frequency sounds. One very important aspect of 
low-frequency (acoustic) hearing is how much it can contribute to speech 
understanding when combined with electrical hearing (meaning implanted 
electrodes, or cochlear implants). I’ve written on this topic in past posts, 
but I will repeat that the combination of acoustic and electrical hearing 
results in an improvement in speech understanding that far exceeds the sum of 
the individual modes' individual contributions to speech scores. When it comes 
to cochlear implants, the question is how much benefit does this combination of 
modes (acoustic plus electric stimulation) provide in
 noisy or reverberant environments? People with cochlear implants have great 
difficulty hearing in noise, so improvements in this area are of great interest.
I have proposed studies using a surround of noise to investigate the efficacy 
of EAS listening in real-world environments. To date, studies have shown 
improvement in speech understanding using EAS or EAS simulations (namely 
normal-hearing listeners donning earphones) combined with speech babble and 
artificially-generated reverberation (stereo or monaural). One could argue that 
providing natural, multi-directional reverberation (via a surround of speakers) 
in the sound field would be an easier listening task than listening under 
earphones. In a surround of uncorrelated background noise, we could use our 
ability to localize sounds (assuming this can be done with implantable 
prostheses) to segregate the speech from noisy or reverberant background 
sounds. But I am decidedly against the idea that the research outcomes or 
processing strategies optimized for headphone listening are valid for the 
majority of real-world listening situations, even when the
 headphone listening task is the more difficult of the two.
Here’s an analogy. One of the few physical feats I can perform is riding a 
unicycle. I also used to set up obstacle courses for unicycle riding that were 
quite rugged. For most, riding a unicycle over rugged terrain is arguably more 
difficult than learning to ride a bicycle. However, I am not remotely adept at 
performing stunts on a bicycle despite demonstrating an ability to stay upright 
on a single wheel. Both activities involve balance, wheels, and pedaling, but 
the ability to ride a unicycle does not translate to bicycle handling skills. 
What I experience while listening under headphones may require more 
concentration than real-world listening (it depends on the task), but that 
doesn’t mean that my ability to concentrate on words or sentences in noise 
while listening under earphones has real-world application or translation. 
Listening under headphones with pink noise as the background noise may prove to 
be more difficult than listening in a
 surround of speakers with speech noise. But finding methods or processing 
strategies that improve the speech scores for a difficult task doesn’t 
necessarily translate to improving one’s ability to do well while attempting 
the conceivably simpler, real-world tasks. If I wish to study improvements in

[Sursound] Spheres, Argon, and IRs

2013-02-07 Thread Eric Carmichel
Greetings All,
The idea of recording an IR in an unusual environment* is interesting. When I 
first read the original post, I wondered how closely a simulation would match a 
recorded response.
Although I'm far from an expert on room acoustics, I have used simulation 
software (e.g. Sabine, Odeon, CATT-Acoustic, COMSOL). In addition to the
absorption coefficient of the metal, the sphere's radius, and the listener's 
position, one might also wish to add the speed of sound in argon. Pressure and 
temperature will affect speed of sound (and resulting wavelengths and 
Eigenmodes) in gas. Is the gaseous environment a mixture of oxygen, nitrogen, 
and argon that permits one to enter without problem? Would a balloon filled 
with air pop in the same manner as in free space if the compressed air has no
place to escape (i.e., you're adding to the net pressure of an enclosed volume, 
not a place where free expansion can take place). As usual, I'm writing 
off-the-cuff and my notions may be wrong, but a simulation of the acoustical 
environment could be interesting when compared to a live recording or IRs made 
with a Soundfield mic.

*There is a new experiment about to start at the Gran Sasso laboratory (Abruzzo 
region, central Italy) to detect dark matter (featured on BBC News24 today). 
The business end appears to be a  metal sphere, loosely comparable to a 
bathroom in size, which will soon be filled with argon. 
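
As a rough back-of-the-envelope check on the speed-of-sound point above (a 
sketch only; the gas constants are standard, but the sphere radius below is 
pure guesswork on my part): the ideal-gas speed of sound is 
c = sqrt(gamma*R*T/M), which for argon (gamma = 5/3, M = 0.0399 kg/mol) at 
20 C comes out near 319 m/s versus roughly 343 m/s for air, so every mode of 
the cavity would sit about 7 percent lower in frequency than an air-filled 
simulation would predict. In Python:

import math

R = 8.314  # J/(mol*K), universal gas constant

def speed_of_sound(gamma, molar_mass_kg, temp_c=20.0):
    # Ideal-gas speed of sound: c = sqrt(gamma * R * T / M)
    return math.sqrt(gamma * R * (temp_c + 273.15) / molar_mass_kg)

c_air = speed_of_sound(1.40, 0.0290)           # dry air, approximate
c_argon = speed_of_sound(5.0 / 3.0, 0.039948)  # monatomic argon

# Crude mode-spacing scale c/(2a) for a cavity of radius a -- not an exact
# eigenfrequency, just to show how everything shifts with c. Radius is a guess.
a = 1.5  # metres, hypothetical "bathroom-sized" sphere
print("air:   c = %.0f m/s, mode scale ~ %.0f Hz" % (c_air, c_air / (2 * a)))
print("argon: c = %.0f m/s, mode scale ~ %.0f Hz" % (c_argon, c_argon / (2 * a)))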


[Sursound] Jason and the Argon-nots

2013-02-09 Thread Eric Carmichel
Greetings to All,
Just a few thoughts regarding recent posts and the argon-filled sphere.
Martin, I definitely boo-booed by suggesting the recording would be made in an 
all-argon atmosphere. But comparing the *sound* one might experience between 
the two conditions (air vs argon) might have been interesting.
I learned to scuba while in the military--this was during the Gulf War/Desert 
Storm. Most of what I learned regarding Boyle's law, Charles's law, and partial 
pressures came from two classroom lectures (this was prior to embarking on a 
college degree). Anyway, I don't believe breathing noble gases for a brief 
period and under normal atmospheric conditions would result in much more than 
momentary oxygen deprivation. The use of gas mixtures under immense pressure 
(deep sea diving) is a wholly different thing. Divers learn to ascend at rates 
that permit off-gassing and pressure equalization at the physiological 
level--it all relates to gases under pressure and dissolving in liquid (in this 
case, blood). Far worse at normal pressures are gases (CO in particular) that 
bind to hemoglobin even more readily than oxygen does. A lot of people don't 
realize that CO is one of the deadliest gases there is, even in dilute 
concentrations. Helium and argon don't chemically attach to much of anything 
(hence their categorization as noble gases). At the other extreme, climbers 
(K2, Everest, etc.) need time to 
acclimate to low atmospheric pressure (this goes far beyond oxygen needs). 
Without proper adjustment, extreme elevation changes can result in severe 
sickness or worse. Anybody up for making IRs in the Himalayas?

If someone wished to do hyper-real sound design for sci-fi movies, he/she would 
have to consider how we might sound and hear in alien atmospheres. But I guess 
Captain Kirk wouldn't have appeared too manly had he started talking like 
Mickey Mouse while in a rarified environment. Actually, the ideas of echoes and 
sound on distant planets might be of value... one of these years.
Lastly, I was happy to read Richard Lee's comments regarding the TetraMic and 
using it as a sound intensity probe. Never thought of using the TetraMic for 
this. Thanks, Richard, for the insights (and the many contributions you've 
made). I used to have a pair of matched B&K 1/4-inch mics that were for an 
intensity probe. The B&K mics were matched for phase as well as frequency 
response. Do the IRs included with the TetraMic compensate for phase changes as 
well as frequency response variations? Judging by the math (only a part of 
which I understand), I imagine they do.
Best,
Eric C.


Re: [Sursound] Jason and the Argon-nots

2013-02-09 Thread Eric Carmichel
Hi Michael,
I thought it was just the xenon strobes that had a numbing effect (in 
conjunction with appropriate music, etc.). Just read a bit about xenon after 
your post--xenon appears to activate certain potassium channels--similar to 
nitrous oxide and cyclopropane. Learn something every day. Thanks for writing.
Best,
Eric C.  





 From: Michael Chapman 
To: Eric Carmichel ; Surround Sound discussion group 
 
Sent: Saturday, February 9, 2013 1:01 PM
Subject: Re: [Sursound] Jason and the Argon-nots
 
> Greetings to All,
> Just a few thoughts regarding recent posts and the argon-filled sphere.
> Martin, I definitely boo-booed by suggesting the recording would be made
> in an all-argon atmosphere. But comparing the *sound* one might experience
> between the two conditions (air vs argon) might have been interesting.
> I learned to scuba while in the military--this was during the Gulf
> War/Desert Storm. Most of what I learned regarding Boyle's law, Charles's
> law, and partial pressures came from two classroom lectures (this was
> prior to embarking on a college degree). Anyway, I don't believe breathing
> noble gases for a brief period and under normal atmospheric conditions
> would result in much more than momentary oxygen deprivation.

Xenon is anaesthetic ... because ...
...   well, because it is.

Michael


[Sursound] Bi-Amping the B-format

2013-02-20 Thread Eric Carmichel
Greetings to All:

I am building a surround system for the playback of live recordings and video. 
Naturally, as with all of us, there are economic as well as technical 
constraints. The purpose of this post is to suggest (and receive feedback on) a 
system that uses multiple subwoofers in order to obtain the directional 
components of very low frequency sounds. One could justifiably argue that we 
don’t have a keen sense of direction for the low lows (which is why subwoofers 
can be placed in inconspicuous places), but that is not the issue at hand. In fact, 
I’ve always been a proponent of center-channel subs since my introduction to 
hi-fi decades ago. Today my low end needs direction to accurately recreate 
industrial sounds.

The system comprises 16 channels. The advantage of 16 (or fewer) channels 
is that I don’t have to use a dedicated Word Clock to sync my MOTU FireWire 
interfaces. I’m fond of these interfaces, and two of them can sync-up via the 
FireWire link without complication. I intend to use two hexagonal arrays of 
small-sized loudspeakers (a bit larger than the 3-inch coned Genelecs, but not 
much more so). One array will be near floor level, whilst the second array is 
proximal to the ceiling. According to the literature (Malham, Rumsey, and 
others come to mind, but I’m shooting from the hip), diametrically opposed 
pairs may be preferred when the listener is centered in the array. Further, it 
is purported that six speakers provide immunity against drawing signals toward 
a single speaker. Eight speakers is probably overkill and doesn’t leave me the 
four channels needed for a square array of subs. Any thoughts as to whether the 
two hexagonal arrays
 providing horizontal and height information should be offset or vertically 
aligned?

Regarding the need for subs: With ‘normal’ music content, twelve speakers 
working in concert would provide more than adequate low-frequency energy. But 
I’m going to be using live recordings where a particular low-frequency sound 
could be coming from an extreme R, L, front or back direction. In this 
scenario, I’d rather have the subs handle the load but I still need to preserve 
‘direction’ as stated above. Because four speakers can provide adequate 
surround sound, my intent is to frequency-divide the B-formatted signal and 
send the highs and lows to their respective feeds via ‘conventional’ Ambisonic 
decoding. To be clearer, I will digitally filter the B-format signal so that 
each of its four components (W, X, Y and Z) are divided into a high and 
low-frequency signal component. The low-frequency components will be decoded 
and sent to the square (and likely horizontal) array of four subs. The highs 
will be decoded based on the position of
 the 12 ‘full-range’ speakers. I use full-range loosely here because the added 
bass channels aren’t for enhancement, but to alleviate the 12 speakers from 
their low-end duty.
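
In case a sketch helps picture the offline step, here is roughly what I have in 
mind (Python/scipy here rather than MATLAB, zero-phase Butterworth sections 
standing in for whatever crossover I finally settle on, a 120 Hz split point 
as a pure placeholder, and made-up file names):

import soundfile as sf                         # any 4-channel WAV I/O would do
from scipy.signal import butter, sosfiltfilt

FC = 120.0  # placeholder crossover frequency in Hz -- still to be determined

def split_bformat(infile, lowfile, highfile, fc=FC, order=4):
    # Split a 4-channel B-format file (W, X, Y, Z) into matching low/high files.
    # Forward-backward (zero-phase) filtering keeps the two bands time-aligned,
    # and Butterworth low/high pairs are power-complementary, so the bands sum
    # back to an essentially flat response.
    b, fs = sf.read(infile)                    # b has shape (samples, 4)
    sos_lo = butter(order, fc, btype="low", fs=fs, output="sos")
    sos_hi = butter(order, fc, btype="high", fs=fs, output="sos")
    sf.write(lowfile, sosfiltfilt(sos_lo, b, axis=0), fs)   # -> decode to subs
    sf.write(highfile, sosfiltfilt(sos_hi, b, axis=0), fs)  # -> decode to the 12 speakers

# split_bformat("scene_bformat.wav", "scene_low.wav", "scene_high.wav")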

I haven’t determined the best crossover frequency, and this may be determined 
in part by a combination of the speakers used and the stimuli to be presented. 
I wish to use the lowest possible frequency, but not to the point of driving 
the small speakers to distortion. I’m guessing a digital (crossover) filter 
that is both maximally flat and phase coherent is best, though slight dips 
caused by frequency response anomalies are easy to EQ out. I use EQ judiciously 
because it is generally just a marginal cure for a loudspeaker's deficiencies. 
Upping the response at some frequency extreme merely adds to distortion that is 
‘measured’ (in SPL) as a boost at the deficient frequency (or third-octave band 
or whatever). Only a spectrum analyzer or critical listening reveals where the 
real boost is occurring.

Although my proposed strategy focuses on subwoofers and low frequencies, it may 
find purpose at higher frequencies. For example, I read about Oticon’s 
carefully placed 39-speaker array in the Hearing Journal (circa 2010). I recall 
that the speakers were positioned within 1 cm of their ideal position. But what 
about the ‘acoustical centers’ of loudspeakers? How would this be determined? 
Justification of such extreme placement would require knowledge of phase 
characteristics and an exacting acoustic center. However, by applying frequency 
division to the B-formatted signal, each speaker within a bi- or tri-amplified 
enclosure could receive its own, unique decoded signal based on its absolute 
position within an array. Of course, this means a s- load of channels and amps, 
but perfectionists and the technically inclined may find this appealing.

Anyway, I’m curious as to what others may have attempted with regards to 
bi-amping the B-format signal (or whether it’s a remotely good idea), use of 
multiple subwoofers, and whether the hexagonal arrays providing height 
information should be offset or vertically aligned. As always, I appreciate the 
help and insights of the experts and experienced Ambisonics aficionados.

Best,
Eric C

[Sursound] A Thanks... And Another Post

2013-02-21 Thread Eric Carmichel
Greetings to All,

First, many thanks to Jörn, Bo-Erik, and Michael for your responses to my 
recent post. Your responses gave me food for thought, and I’d like to add a few 
comments regarding audio interfaces, psychoacoustics, and pseudo second-order 
miking (I’m confused in this latter area).

Bo-Erik, I fully appreciate your input. The selection of a good sub is 
something I need to give careful consideration to. I currently use a single sub 
that uses a servo-controlled 12-inch driver in a sealed enclosure. The sealed 
enclosure is preferred because airflow noise through ports is not an issue. 
It’s also a forward facing configuration, so the *directional* characteristics 
of the sub (if direction actually exists) are predictable; or, at the very least, I 
can point the driver in a known direction. Unfortunately, this particular sub 
is no
 longer manufactured, but I believe there are a number of decent 10-inch subs 
on the market (and a plethora of junk speakers).

Regarding localization, home theatre sound and the 80-Hz xover point:

I’ll confess ignorance when it comes to knowledge of a separate or unique 
physiological mechanism used to localize (or omni-ize) ultra-low 
frequencies. Within the context of room reflections, music listening, 
home theatre, and the like, I’m fully aware that frequencies below 80 Hz
are near impossible to localize. Add to the overall auditory scene the 
constituent frequencies that provide unambiguous sound-source information, and 
the need for a surround of subs really goes out the window. Some 
spout a slightly higher cut-off frequency, but I gather that 80 
Hz is the accepted standard for home theatre. I have seen literature on 
5.1 and 7.1 referring to subs and low-frequency enhancement for the sole 
purpose of 
effects--the surround speakers are still operated full-range. [Although I 
haven't tried this, I imagine we can accurately lateralize a sub-80 Hz tone 
under earphones. If so, then a lot of our (in)ability to localize low 
frequencies in the sound field is mostly a consequence of physical variables 
such as long wavelengths, head diffraction, room reflections, etc., and not a 
unique mechanism or deficiency of the brain, mid-brain, or peripheral sensory 
organ.] Regardless of accepted protocol, I do have reason for using multiple 
subwoofers, and this reason purposely ignores psychoacoustics.

Although this seems obvious, I’ll argue that sounds in nature do not morph to 
accommodate human perception; i.e., psychoacoustics is a product of a head 
(with ears and brain) in a physical space. For this reason, I do not want to 
present stimuli with “built-in” psychoacoustic enhancements or manipulation. 
Ideally, I want an “untainted” physical space for observing (yes, observing, 
not necessarily measuring) the perception of special-needs populations. 
Populations to be considered include the
 elderly, children who are neurotypical (NT) or autistic and, of course, those 
lucky normal listeners. There's reason to believe that certain populations may 
have compromised auditory processing ability (in addition to other sensory input) 
that ultimately results in aberrant or atypical behavior. Because we do not 
know how such persons perceive and react to sound, the best I can do is provide 
physical realism. I wish to present accurate sound-source direction regardless 
of frequency. Psychoacousticians may or may not be in agreement as to how the 
*average* listener perceives sound, but I'm not interested in average 
listeners. Admittedly, some assumptions and subjective impressions always come 
into
 play 
when choosing and using audio equipment. For example, I’ve yet to hear 
any two brands (or models within a brand) of loudspeakers that sound 
anything alike, so we have to accept that none of this is
 going to be perfect.

When it comes time to construct a sound system for assessing sound quality (or 
simply for musical enjoyment), I will most certainly use a single sub as you 
suggested. But for my proposed system, the subwoofer's crossover frequency and 
filter order does become something of a choice based on speaker performance. 
Because I will be filtering/processing the four B-formatted wave files before 
decoding (none of the processing will be done in real-time), I have a lot of 
choices for filter types--and perhaps the addition of group delay. I have 
numerous MATLAB Toolboxes for processing wav files in addition to the Advanced 
Signal Processing and Digital Filter Design Toolkits in LabVIEW. Thankfully, 
I’m no longer limited to the *bouncy* 8th-order elliptic filters I used to 
construct. The problem nowadays is that there are way too many choices that are 
relatively easy to implement. Responses to my last post provide clearer 
direction--thanks.

Jörn brought into the discussion not only filter type and slope, but the choice 
of audio interfaces. I have first-generation MOTU 896HD units. Although there 
are two FireWire ports on each unit's backside, MOTU states

[Sursound] Submersed in Subs (and ideas)

2013-02-22 Thread Eric Carmichel
Greetings to Everyone,
I received some great input and ideas related to my last two post (bi-amping 
the B-format and subwoofers). Everyone’s input was greatly appreciated.
It wasn’t all too long ago that I first learned of Ambisonics, and then this 
group. At the very start, there was a comment stating that a certain amount of 
“psychoacoustic trickery” is intrinsic to the Ambisonic decoding. I suppose 
this is where Ambisonics differs from wavefield synthesis or acoustic 
holography. Creating a realistic physical space wouldn’t require integration of 
the duplex theory of hearing in the decoding; it would (somehow) simply 
re-create the waves as they existed naturally, with or without knowledge of 
ILDs or ITDs. When we add the 700 Hz ITD-to-ILD switchover as a part of the 
decoding, we’re assuming the listener will be the “average” human. This is fair 
enough because Ambisonics and other surround topologies were designed with 
people in mind. But if we were to design a system to evaluate a “listening 
machine” (or an animal we know little of), we need something that provides 
real-world directionality, intensity,
 and doesn’t care whether humans can or can’t determine the direction of a 40 
Hz sound.

But because the transition frequency is 700 Hz for conventional Ambisonic 
decoders and 400 Hz for domestic decoders (thanks, Martin, for this info), it 
seems plausible to make this the electronic crossover point as well. The 
woofers, if properly placed, can handle frequencies that have unambiguous 
direction as well as the difficult-to-localize low lows. This, then, would be 
akin to a two- or three-way speaker system, only better: The woofer would 
receive a signal that is position-specific. So now I’m thinking what Fons and 
Martin wrote is a plausible solution to my space/channel count/directional lows 
concerns. Of course, this will require a speaker with a smooth 40 Hz (or lower) 
to 400 Hz response that can deliver a fair amount of distortion-free energy if 
the low-frequency sound emanates from an extreme L, R, front, or back 
direction. Somewhat more clearly: If one speaker has to do the work of six 
working in concert, it can still manage the signal. I
 don’t plan on blasting anyone out of the room, but low frequencies originating 
from well-defined directions do exist in the stimuli.

I won’t have to fly the speakers making up the centre (horizontal) array; only 
the speakers suspended from the ceiling have to be chosen based on their weight 
and size. I’m doing this in a modest-sized room, not an auditorium. Having said 
all of this, upping my channel count from 16 to 18 would require a minimal 
change, and therefore I could provide three horizontal arrays of 6 loudspeakers 
each. The upper and lower rings/arrays will produce frequencies above 400 Hz 
only, while the center (horizontal-speaking) array provides the 400 Hz and 
below energy. (Kind-of like six really stretched out two-way speaker columns, 
only with Ambisonic decoding for the mids and lows).

Other suggestions that I’ll try include adding a Focusrite (or similar) D-A to 
the system and Reaper software. I’m not sure, though, if Reaper works with MIDI 
automation, and I use a MIDI-based system of my design to collect data and 
automate faders. I’ll find out soon enough. There are certainly a lot of ideas 
to consider. If I didn’t already have a great field recorder, I’d certainly get 
the MOTU Swiss Army knife... I mean Traveler... in addition to the MOTU 896HD I 
have now. I already have more than 16 channels worth of analog outs, but not 
all hardware devices at my immediate disposal share the same software driver. I 
believe adding a second MOTU and a D-A (Focusrite or Behringer) is a great 
option.

On a wholly different note: Next week I’ll be making an Ambisonic recording in 
the Superstition Mountain Range, located in Arizona. In the mornings, there’s a 
low-pitched steam whistle that blows from some type of facility west of the 
mountain range. There’s quite a delay (at least 5 s) before an echo is heard 
from one trailhead, but the large cliff face on the mountain and the 
obstacle-free desert terrain makes for interesting illusion: You’d really 
believe there’s a second steam whistle up in the mountains. There’s very little 
“spectral” modification of the whistle's sound as the sound travels across the 
distance, and both the whistle and resulting echo are loud enough for a good 
SNR. Curious as to how a recording of this will sound on my home rig. I’ll 
share the wav files as soon as they’re processed.

As always, many thanks to everyone for your time and insights.
Best,
Eric C.


[Sursound] Saga of the Subs

2013-02-24 Thread Eric Carmichel
Greetings to All,
As always, many thanks for everyone’s time. This post references replies 
received from Peter and Jörn: Thank you both for your expertise, sharing 
experiences, and taking time to write.

[from Dr. Lennox] “For mobile listeners, and indeed, off-centre listeners, the 
amplitude gradient at Lf was readily discernible, so that kind of cue, though 
not normally part of Duplex theory for directional hearing, seemed relevant to 
the experience.”

ELC: I recently made recordings in a semi-anechoic room using a third-octave 
stepped-frequency test signal. I presented the tones at different angles.* The 
lowest test frequency was 50 Hz. At this frequency, at least in the 
semi-anechoic room, sound-source location was unambiguous. I wasn’t testing my 
ability to localize; this was just a casual observation that seemed to go 
against the grain. Once I process the recordings, I’ll have a chance to hear 
and see how well defined sound-source location is as a function of frequency 
(all I have now are the A-format files recorded to the TASCAM D680 via my 
TetraMic). It'll be curious to hear whether the 50 Hz tone appears to come from 
a specific direction in my horizontal array (which consists of 8 full-range 
speakers identical to the speaker I used in the test room). *Notes regarding 
test setup: I did not use a turntable for rotation (the speaker was fixed); 
instead,
 I have a ball head mounted on a tripod. A separate jig allows me to 
center the mic vertically over the ball head. I rotated the mic in 
fifteen degree steps, 0 to 360 degrees. The stepped test tone was presented and 
recorded at all 24 positions.
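
(For anyone curious what the A-to-B step looks like in the abstract, the 
textbook first-order conversion for the tetrahedral capsule order FLU, FRD, 
BLD, BRU is below. This is only a sketch: it ignores the per-capsule 
calibration filters supplied with each individual TetraMic, which matter a 
great deal in practice, and the 0.5 scaling is just one of several conventions.)

import numpy as np

def a_to_b(a):
    # Idealized A-format -> B-format matrix; 'a' has shape (samples, 4) with
    # capsules ordered FLU (front-left-up), FRD (front-right-down),
    # BLD (back-left-down), BRU (back-right-up). Returns W, X, Y, Z.
    flu, frd, bld, bru = a[:, 0], a[:, 1], a[:, 2], a[:, 3]
    w = 0.5 * (flu + frd + bld + bru)
    x = 0.5 * (flu + frd - bld - bru)   # front positive
    y = 0.5 * (flu - frd + bld - bru)   # left positive
    z = 0.5 * (flu - frd - bld + bru)   # up positive
    return np.stack([w, x, y, z], axis=1)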

[from Jörn] “...the inability to localise bass sounds is a very persistant 
urban myth. In rooms, ok, but anybody who has been near an open-air rocknroll 
stage during subwoofer calibration will have no trouble localising the sound :)”

ELC: Visual cues may play part in this ability, too, but I fully agree with 
you: The sound-source direction is unambiguous in certain situations. This is 
why I wish to include multiple subs in my forthcoming experiments. Another 
common myth concerns how large a role the Doppler Effect plays in our perception of an object 
moving towards or away from us. A car moving towards us will have a constant, 
albeit upwardly shifted, frequency. There is no frequency change until the 
moment the car passes. Note that I selectively use the word frequency, not 
pitch. Perceived pitch DOES change, but is a function of increasing intensity, 
not a constantly-changing frequency. Pitch, measured in mels, is affected by 
SPL. Until the moment a moving object passes the observer, it’s mostly changes 
in intensity and quality (timbre) that alert us to movement. I’ve used the 
Doppler Effect functions in various VSTs (or RTAS for Pro Tools), but mostly to 
provide the effect of an object
 moving on a curved path. I bring this up because of a subsequent comment 
(below) regarding rendering.

[from Jörn] “...Interesting research project! If you can, please share the 
results.”

ELC: I’ll be glad to. Not every listener is human, hence the need to remove 
assumptions based on human hearing. Listening “machines” are involved (more 
later), and arrays of mics along with beam-forming techniques can determine 
both time and level differences over a wide range of frequencies. Furthermore, 
the mic arrays are considerably more directional at the low frequencies than, 
say, the typical (first-order) cardioid mic. Without a barrier (Jecklin disc or 
head/torso), no ILD. Mics close together = skewed or non-existent ITD. Just 
want the waves in centre of array to be as they would exist in real life... in 
the absence of a human head.

[from Jörn] For the actual experiment, I can see how you would use pre-rendered 
files for extra robustness, repeatability and foolproof documentation, but 
while finding the setup, I'd really recommend to use real-time filtering. 
Insert usual ‘linux/jack/ardour/fons' stuff’ plug here.”

ELC: Yes, thanks for suggestion. Because I won’t be only person running the 
experiment, I will be making the files playable on the most generic of DAWs. 
It’s a bit like submitting audio stems to a video producer: I don’t always want 
the receiving party to have control of all the effects, just the overall level 
or balance. Secondly, I may use filter functions that aren’t available in my 
VST collection. Admittedly, I can’t at this time imagine needing a filter that 
requires advanced signal processing (e.g. MATLAB or LabVIEW) offline.

[from Jörn] “...The thing with dual stacked rings is: sources on the equator 
have very low rE already, being ‘vertical phantom sources’. Nothing against 
rings as such, but make sure you have a good part of your speakers at ear 
level. It sure is a psychoacoustic decision, but if it really bothers you, you 
should go for an even distribution on the sphere anyways.”

ELC: The general consensus is to use an even distribution on a sphere. The 
original rea

Re: [Sursound] Saga of the Subs

2013-02-24 Thread Eric Carmichel

Hello Robert,

Perception of distance is a complex interaction, and psychoacoustical 
experiments generally limit the number of variables studied (out of necessity) 
in static or laboratory settings. Not that I disagree with you, but there's 
more to the "Doppler Illusion" (as it has been called) than meets the eye... 
er, the ear.
For two related papers, please refer to the following links:

http://www.public.asu.edu/~mmcbeath/mcbeath.research/Doppler/Doppler.html

http://psychology.clas.asu.edu/files/1996_JEP-HPP%28DopplerIllusion%29.pdf

I did something similar while an undergrad, but didn't have enough confidence 
in my own measurements to submit the findings for publication. Most of what I 
realized is that a measurable change in pitch seemed too small for anyone but a 
trained musician to perceive, yet everyone could judge the object as 
approaching and then receding. The spectral nature of the moving vehicle 
changed, but mostly due to the relative proximity of buildings and obstacles 
and to diffraction. The fundamental pitch of, say, a siren, when isolated, 
didn't provide the dominant cue. I guess it's akin to the duplex theory of 
localization: we use the cues that are available; there's no instant switch 
from one mode to the next, particularly when complex sounds interact with the 
head and pinnae, and head movements also help resolve ambiguities. Vision and 
experience naturally add to our perception of distance and motion.

Thanks for corrections, but please also consider viewing the above two links--I 
need to re-read them myself.
Best,
Eric


____
 From: Robert Greene 
To: Eric Carmichel ; Surround Sound discussion group 
 
Sent: Sunday, February 24, 2013 5:46 PM
Subject: Re: [Sursound] Saga of the Subs
 

This is wrong about the Doppler effect and perception of distance.
It would be correct if the object moving and emitting sound
as it moved were coming straight towards you and going through
you and then moving away. But a police car with a siren say
is not aiming straight at you(or at least you better hope not!)
It is going by on a straight line that has a closest point to you
but that closest point is not you! So the amount of pitch shift
in fact changes continuously, being a max shift up when the
car is far away, diminishing gradually until the car is as close
to you as it will get (at which point the car is not changing distance
from you at all in the instantaneous sense), and then, gradually,
as the object moves away with increasing speed relative to you,
the pitch falls to a minimum.
This is not dependent on mel shifting with level--it is
literally the case on the frequency level. (The mel shift
would be the same whether the car were approaching or departing--
just reversed in time. This as I recall is not what is observed--
the situation does not have time reversal symmetry)
Robert
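
To put that in symbols (the numbers below are made up, just to show the shape 
of the curve): with the source on a straight line at speed v, closest-approach 
distance d, and t = 0 at the closest point, the observed frequency is 
f0*c/(c + dr/dt) with dr/dt = v^2*t / sqrt(d^2 + v^2*t^2) -- negative (shift 
up) while approaching, zero at the closest point, positive (shift down) while 
receding. A small sketch:

import numpy as np

C = 343.0  # m/s, speed of sound (approximate)

def observed_freq(f0, v, d, t):
    # Straight-line pass-by: t = 0 at closest approach, distance d (m), speed v (m/s).
    drdt = (v * v * t) / np.sqrt(d * d + (v * t) ** 2)   # radial velocity
    return f0 * C / (C + drdt)

# Hypothetical numbers: 500 Hz siren, 20 m/s, passing 10 m away.
for t in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print("t = %+5.1f s  ->  %6.1f Hz" % (t, observed_freq(500.0, 20.0, 10.0, t)))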


[Sursound] Point-to-Contact... and Ambisonics

2013-02-26 Thread Eric Carmichel
Greetings All,

I’ve much enjoyed the recent post, particularly Dr. Peter L’s comments 
regarding “auditory looming” and the like. Not too long back I took a 
psychology class (yep, still a student) on perception. But unlike the 
perception classes I had in the past, there was very little emphasis on the 
peripheral organs, brain, or anatomy. Instead, the emphasis was on Ecological 
Psychology. This was a new topic for me, and gave me a new way of considering 
how we see (or hear) in the world.

I was already hell-bent on creating real-world stimuli for research, and 
Ecological Psychology (EP) provided further motivation to test man in a 
“natural” environment. (One of EP’s premises is that man and his environment 
are inseparable pairs). Much of the work in EP has been geared toward vision, 
but proponents of EP (notably Gaver) have written on the topic of hearing as it 
applies to EP, animals and their environments. I recently happened upon a 
EP-related paper authored by R. L. Jenison titled On Acoustic Information for 
Motion.

What occurred to me as being potentially important for hearing aid (HA) users 
is point-to-contact as it applies to audio. Assessing one’s ability to read 
5-word sentences in a background of pink noise probably won’t tell us how a HA 
user “feels” while standing on a street corner. How a person feels about 
his/her HA, or, more importantly, how the HA makes the person feel (nervous, 
confident, energized, afraid, etc.) will likely determine a person’s motivation 
for continued HA use.

Rendering is one way of creating convincing illusions (particularly alongside 
video), but the subtle cues that are absent in all but the most complex of 
models may be insufficient to tell how one feels about a particular device. It 
is for this reason I wished to record a few real-world scenarios that are free 
of assumptions or built-in (exaggerated) cues. I have chosen live recordings 
via Ambisonics. The subtle cues that are easily ignored by the normal-hearing 
listener may be important to those who have sensory impairments. I don’t need 
stimuli that are “cluttered” with sounds coming from all directions, but I do 
wish to include naturally occurring (3D) reverberation and motion. 
Signal-to-reverberation ratio provides cues as to a source’s distance and 
whether it is receding or moving towards us. Other cues are available as well 
(Doppler shift or Doppler illusion, level changes, rate of level change (as it 
might affect pitch perception),
 diffraction patterns, etc). Not everybody, however, may be able to use the 
“obvious” cues and may, instead, rely more heavily on the more subtle cues. Add 
to this compression and (possibly) frequency transposition of hearing devices, 
and many available cues become distorted or lost.

I suppose what I’ve hoped to avoid in my research design is putting 
psychoacoustics on top of psychoacoustics; that is, avoid using stimuli that 
was designed based on assumptions as to how we hear. I want to observe or 
measure behaviors and feelings in natural space, and how HA processing can 
affect these behaviors and feelings.

I wrote a paper (Ecological Considerations for Cochlear Implant Research) for 
the aforementioned psychology class, and the topic of surround sound is 
included (this was roughly the time I began reading about Auralization and 
Ambisonics). Anyone interested in reading and scrutinizing the paper can find 
it here:

http://www.cochlearconcepts.com/eric_articles/ecological_considerations.pdf

An accompanying PowerPoint to the paper can be found here:

http://www.cochlearconcepts.com/powerpoints/psy591_ecarmich.pps

I’ve come a long way with my ideas since the time I wrote the paper. As always, 
I appreciate the help, corrections, and feedback from all Sursound Digest 
contributors... your insights are always welcome.
Best,
Eric C.


[Sursound] Loud Whispers and Quiet Shouts

2013-02-27 Thread Eric Carmichel
For those who do not have access to Spatial Hearing: The Psychophysics of Human 
Sound Localization, Revised Ed. By Jens Blauert, I have provided a few 
sentences from this book. Another book that is recommended is Binaural and 
Spatial Hearing in Real and Virtual Environments by Gilkey and Anderson 
(Chapter 13 of this book, written by D. H. Mershon, addresses distance 
perception). From Blauert:

“Familiarity of the experimental subject with the signal plays an important 
role in the localization between the distance of the sound source and that of 
the auditory event. For familiar signals such as human speech at its normal 
loudness, the distance of the auditory event corresponds quite well to that of 
the sound source. Discrepancies arise, however, even for unusual types of 
speech at their normal loudness. As an example, figure 2.7 [see note below*] 
shows localization in the range of distance from 0.9 to 9 m with a human 
speaker whispering, speaking normally, and calling out loudly (Gardner 1969).”

In a subsequent chapter, Blauert writes:

“The closer a person approaches a sound source in an enclosed space, the 
stronger the component of the primary field in comparison with that of the 
diffuse field (figure 3.48). The difference between the levels of the primary 
and reflected sound furnishes information to the auditory system about the 
distance of the sound source. The auditory system takes this information into 
consideration in forming the distance of the auditory event. This relationship 
has been described many times [references go back as far as von Hornbostel, 
1926]...”
In the next paragraph, Blauert writes:
“It must be pointed out that meager statements about spatial hearing in 
enclosed spaces up to this point are only valid as general rules. Departures 
from these rules and additional effects can occur in connection with rooms of 
specific shapes, with particular sound sources, and with specific types of 
signals.”

When it comes to my personal interests, I have considered distance and 
loudness effects as well as how to present them. I have created 
real-world stimuli that are to be presented at “normal” 
levels. I have included subtleties, such as talker voice level as a 
function of background noise level, to make the audio (and video) 
stimuli more realistic. For example, it shouldn’t take a lot to convince
 anyone that we tend to raise the level of our own voice in a noisy 
environment--this phenomenon is known as the Lombard effect (see, for 
example, Lane & Tranel, 1971). Tufts & Frank (2003) showed that 
talkers’ voice levels increase, on average, 5 dB for every 10 dB 
increase in background noise level. I am aware of studies that used restaurant 
noise presented at low levels (60 dBA) to maintain a favorable SNR. Conversely, 
some researchers used the same surround of restaurant noise (recorded with 8 
Sennheiser mics) at its actual level, but elevated “normal” speech to 85 dBA to 
maintain a favorable SNR. What I mean by favorable is on the order of + 15 dB 
SNR. I should have asked, “Does the restaurant noise sound far away, or does it 
sound like a “quiet” pizzeria?” Research participants' thoughts on this topic 
might have been interesting. Anyway, I'd prefer to use a recording of a quiet 
environment for the instances I need a +10 dB (for example) SNR in lieu of a 
moderately loud restaurant simply turned down in presentation level. I also 
recorded "loud" speech--features of which certainly differ from whispers 
presented at a loud level.
So how did I plan to record a variety of background noises for research? 
Ambisonics miking, of course!
Best,
Eric C.

Gardner, M. B. (1969): Distance Estimation of 0 Degree or Apparent 0 Degree 
Oriented Speech Signals in Anechoic Space. J. Acoust. Soc. Am., pp. 47-53.

*If anybody would like for me to photocopy the figure (or entire page), I will 
be glad to do so and upload the image to my website.


[Sursound] Ambisonic-specific loudspeaker system

2013-03-02 Thread Eric Carmichel
Hi Steve,

Thanks for writing and sharing your interests and insights (Sursound Digest Vol 
56, Issue 2). It is always interesting to get input from people with different 
backgrounds, and it sounds as though you have an extensive background in the 
recording arts and music production.

I know what you mean about the “obligatory” NS10; in fact, one appears on the 
*Products* page of my personal site (cochlearconcepts.com) along with a 
4000-series SSL console, and a classic combination of music production gear 
(e.g. tubed UA and Manley compressors). [The ugly dude in photo is yours truly.]

Like many audio enthusiasts, I got into hi-fi as a youngster. I was an avid 
speaker builder, and authored a couple of articles in The Audio Amateur and 
Glass Audio magazine. As a person with some loudspeaker design experience, I 
took note of your comment, “This would also slot nicely into the change of 
directional hearing mechanisms, while at the same time not having a crossover 
in the centre of the most important mid range. It would also reduce the 
interference caused by placing drivers close together.”

It’s interesting that many loudspeaker designs (to include highly regarded 
powered monitors) have a crossover point right smack in the middle of the ear’s 
most sensitive frequency range. The trade-offs for higher or lower xover 
frequencies are 1) the larger-diameter mid-woofer becomes directional at higher 
frequencies (all relative to cone diameter); 2) typical dome tweeters can’t 
manage the power or long cone excursions needed to reproduce low frequencies; 
and 3) adding a mid-range driver adds to the complexity of the design as well 
as physical size of loudspeaker system. I’m not saying anything you don’t 
already know here. But what could be interesting is designing an 
Ambisonic-specific loudspeaker system to offset the crossover frequency.

I’m personally biased towards at least one single-coned, full-range driver: The 
Lowther (ok, it has a whizzer cone, but still no crossover network). But these 
guys are expensive, and the one pair I had were mounted in massive, 
acoustic-labyrinth enclosures. But on a more realistic notion, there are many 
good speakers that have smooth on- and off-axis responses from roughly 40 Hz up 
to 700 Hz. And I’m confident there are more than a few single-coned 4- or 
5-inch drivers that can accurately reproduce 700 Hz to well beyond the upper 
mids. To complete the high-end, a tweeter (or array of tweeters) could be added.

An analog (passive or active) crossover  at 700 Hz wouldn’t be needed, as the 
B-format signal could be split into specific bands (high and low) ahead of the 
decoding. [This, by the way, is where I intended to do offline processing.] 
Such an arrangement would require two Ambisonic decoders (upper band and lower 
band decoders with their own, unique speaker sends). Fortunately, this isn’t a 
big processing load. I’d prefer to do this than split the highs and lows after 
decoding with a single decoder. If the crossover frequency was chosen to be 700 
Hz (or 400 Hz), then a greater number of place-specific speakers could be used 
for the highs (as you suggested) than for the lows. The mids and highs could 
easily be grouped together (for economy). The loudspeakers handling 400 Hz and 
above could be two-way, but with a higher-than-usual crossover frequency 
(meaning conventional crossover but at, say, 5 kHz).
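
To make the two-decoder idea concrete, here is the kind of bare-bones 
arrangement I have in mind (Python; a plain first-order projection decode with 
no shelf filters, near-field compensation, or rE weighting, FuMa-style W 
scaling assumed, and the speaker directions below are invented). The 
low-passed B-format would be decoded against the bass-capable speaker 
positions, the high-passed B-format against the full set:

import numpy as np

def decode_first_order(b, speaker_dirs_deg):
    # b: shape (samples, 4) holding W, X, Y, Z (FuMa-style, W at -3 dB).
    # speaker_dirs_deg: list of (azimuth, elevation) pairs in degrees.
    # Returns one feed per speaker; purely the basic projection decode.
    w, x, y, z = b[:, 0], b[:, 1], b[:, 2], b[:, 3]
    feeds = []
    for az, el in speaker_dirs_deg:
        az_r, el_r = np.radians(az), np.radians(el)
        gx = np.cos(az_r) * np.cos(el_r)
        gy = np.sin(az_r) * np.cos(el_r)
        gz = np.sin(el_r)
        feeds.append(0.5 * (np.sqrt(2.0) * w + gx * x + gy * y + gz * z))
    return np.stack(feeds, axis=1)

# Invented layouts: a horizontal square for the low band, and for the high band
# two hexagonal rings at +/-30 degrees elevation.
low_ring = [(45, 0), (135, 0), (225, 0), (315, 0)]
high_rings = [(az, el) for el in (-30, 30) for az in range(0, 360, 60)]
# low_feeds  = decode_first_order(b_low,  low_ring)
# high_feeds = decode_first_order(b_high, high_rings)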

Such a system would have atypical crossover frequencies, but the advantage 
would be keeping the crossover point out of the ear’s sensitive range as well 
as the range where phase anomalies (introduced by active or passive filtering) 
could be most audible. The directional characteristics of the drivers are a 
function of baffling and cone diameter, so cone diameter and enclosure size 
would be purposely small for the speakers managing 700 Hz and up.

If such a system provided advantages, I suppose it would be ill-suited for any 
of the popular surround formats. The workaround would be to stack the smaller 
(one- or two-way) speakers atop the bass/mid-bass speakers and bi-amp them. 
Time alignment and additional subs optional.

Many thanks for taking time to read.
Best,
Eric C


[Sursound] KEMAR, Neumann, Zwislocki

2013-03-29 Thread Eric Carmichel
Hello Guilherme,
I have some insight regarding your question re KEMAR and the Neumann acoustical 
test fixtures/heads.
Briefly, KEMAR was designed with hearing science in mind. The torso was 
designed to approximate "average" human size (I think we have, on average, 
grown since the introduction of KEMAR). Additionally, the material from which 
KEMAR is fabricated has an absorption coefficient to match that of humans 
(clothed or not clothed??--will have to refer to Knowles Electronics for this 
info). KEMAR is generally equipped with two interchangeable ear sizes: Large 
and small. If you look at impulse responses obtained with a KEMAR (e.g., the 
widely used IRs that came from a MIT lab study by Gardner et al), you'll 
probably see in the info section which of the two ears was used. Internal to 
KEMAR, there are microphone clamps for 1/4- or 1/2-inch mics (two different 
clamps for each mic size). A pig-tail adapter allows two Bruel & Kjaer mics (L 
+ R) to fit within KEMAR's limited head space (getting into the mind of KEMAR 
is a tight fit?).
When making a recording using internal mics (not the same as mics proximal to 
the ears' conchas), the resonant peak created by KEMAR's ear canals will have to 
be considered. The recordings with peaks work well with deep-seated earphones, 
such as EAR phones, that otherwise destroy the ear's natural canal resonance. 
Note: Earphones worn OVER the ears modify the natural resonance, but don't 
destroy it. One could argue that the (approximate) 6cc volume of circumaural 
headphones over the ears' 2cc volume will certainly change things a bit. 
However, the active drivers of headphones may result in a larger "equivalent" 
earcup volume that imposes less of a change than one might predict. (Analogy 
here: The B&K acoustical calibrator has a large equivalent volume despite a 
small physical volume--this large virtual volume minimizes error caused by mic 
placement in the calibrator.) Just be aware that mic placement, either in KEMAR 
or proximal to concha, will affect
 recordings at the very important mid frequencies.
Another thing about KEMAR is that it is designed to accommodate a Zwislocki 
coupler. Maybe it's more accurate to state that the Zwislocki coupler was 
designed to fit inside of KEMAR. Anyway, the Zwislocki coupler mimics middle 
ear function. Briefly, it is mathematically equivalent in compliance, mass, 
etc. to the middle ear (tympanic membrane, ossicles, ligaments, etc.).
If you wish to learn more about KEMAR recordings, I recommend a search for 
articles authored by Zwislocki, Mead Killion (of Etymotic Research), and 
others. One article by Killion is titled "Zwislocki was Right." A Google search 
for Jozef Zwislocki will reveal some very interesting information regarding 
human hearing.

Now for  the Neumann head: I believe this was designed primarily for 
high-fidelity, binaural recordings. I have listened to recordings made with the 
Neumann head (and IRs obtained via the Neumann head), but can't state whether 
these are significantly different from KEMAR recordings. Like many things, one 
has to consider the overall system: Recording and playback. I've heard KEMAR 
recordings that suck, and others that were fabulous. Why the difference between 
recordings? Not sure, as I didn't have all the details. Maybe getting 
recordings from the same venue and source would help, but I don't know of any 
direct A-B material for comparison.
I hope this info helps some. As usual, I'm writing off the cuff without any 
reference material, so please pardon any inaccuracies. At least the people 
mentioned above (Killion, Zwislocki) will reveal accurate and detailed info.
Best regards,
Eric C.


[Sursound] Anthropometrics, Loudspeakers, & Vision

2013-03-30 Thread Eric Carmichel
Greetings All,
Binaural recordings, HRTFs, and headphone listening are popular topics among 
many of us. Regarding ear shape and head size, I'll have to read more on what 
is of greatest importance for accurate localization. Here are a couple of 
resources that I just downloaded:

On the improvement of localization accuracy with non-individualized HRTF-based 
sounds
http://webs.psi.uminho.pt/lvp/publications/Mendonca_et_al_20120_JAES.pdf

HRTF Personalization Using Anthropometric Measurements
http://pdf.aminer.org/000/349/698/virtual_audio_system_customization_using_visual_matching_of_ear_parameters.pdf

Head width will certainly affect ITDs, but to what extent does this alter our 
sense of sound-source direction? Relatively gross errors seem to have minimal 
effect on lateralization, but this is different from localization in 3D. Pinna 
size and shape alter the spectral nature (namely phase and amplitude) of 
higher-frequency, broadband sounds, but we can "re-learn" localization ability 
with a new or different set of ears. Head movement is certainly a way of 
resolving ambiguities, and 
head-tracking systems do an admirable job taking this into account. Regardless 
of the accuracy of any HRTF, nothing seems to compare to listening in a 
surround of loudspeakers. But personal listening privacy under headphones / 
earbuds is here to stay. Fortunately, localization accuracy isn't all that 
important for MUSIC enjoyment, and accurate HRTFs aren't necessary for a large 
number of applications. I'll confess that after listening to good quality 
binaural recordings, conventional headphone listening is, well, rather lacking. 
Conversely, high-fidelity monaural recordings played through a quality speaker 
(or pair) never seem to grow dull. As for the same recordings under headphones: 
I'd prefer not to say.
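
On the head-width question, the old spherical-head (Woodworth) approximation 
at least puts numbers on it: ITD is roughly (a/c)*(sin(theta) + theta) for 
head radius a and azimuth theta, so growing the radius from 7 to 9 cm scales 
the maximum ITD from about 525 to about 675 microseconds. A quick sketch 
(radii picked arbitrarily):

import numpy as np

C = 343.0  # m/s

def woodworth_itd(head_radius_m, azimuth_deg):
    # Spherical-head (Woodworth) approximation: ITD = (a/c) * (sin(theta) + theta)
    theta = np.radians(azimuth_deg)
    return (head_radius_m / C) * (np.sin(theta) + theta)

for a in (0.07, 0.0875, 0.09):   # example head radii in metres
    print("radius %.3f m -> max ITD ~ %.0f microseconds" % (a, woodworth_itd(a, 90.0) * 1e6))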


Side note: Thanks, Aaron, for the links to the Neumann articles. Curious to see 
whether any encoding is needed to optimize loudspeaker listening for recordings 
made with the Neumann dummy heads (have articles on 
laptop--I'll find out soon enough).

And now an update on an old topic... Vestibular-vision-auditory interactions. I 
recently received a very kind communication from William (Bill) Yost. As I 
understood from his email, there is a very strong vestibular-vision-auditory 
interaction that allows one to accurately locate a sound source when a person 
is moving. With the eyes closed and subject under constant rotational velocity 
(i.e., vestibular system in equilibrium or, equivalently, turned off), 
stationary sounds sources appear to move and moving sound sources appear to be 
stationary. I won't say more at this time because I don't want to pass along 
info without having permission to do so. I'll have a chance to visit Bill's lab 
in the very near future, and ask when the results of the study are slated for 
publication.

Best,
Eric C.


Re: [Sursound] KEMAR, Neumann, Zwislocki (Justin Bennett)

2013-04-01 Thread Eric Carmichel
Greetings,
Some of the first binaural recordings I made were with the Core Sound mics 
attached to my glasses frame (mics very close to conchas, but not in the ears). 
I also made recordings of traffic sounds using the same mics and with a 
KEMAR--the mics, however, were in the KEMAR which then added a resonant peak. A 
nice plus about the Core Sound mics (and this isn't meant to advertise 
anything) is that they fit nicely in KEMAR's 1/4-inch mic clamps (B&K) as well 
as acoustical calibrators designed to accommodate 1/4-inch mics. In fact, I 
continue to use the mics in my own acoustical test fixture that has two ears on 
one side and one ear on opposite side. Impressions of my ears were made to 
create/mold the ears and canal used on the test fixture. The fixture includes a 
strain gauge so that headband force can be measured. Purpose of three-eared 
fixture: To test the efficacy of hearing protection devices against blast 
noises. I can directly compare an occluded and an open ear with the same 
explosive noise. Anyway... placing the Core Sound (or similar) mics 
on a person gives rise to thoughts of a full-body transfer function and 
unconscious head movements. When listening to recordings made with mics 
attached to my glasses frames, the sense of sounds below is quite remarkable. 
This includes water splashing as vehicles passed by, my own footsteps, and a 
ball bouncing. None of these sounds are as real (place-wise) with recordings 
made via KEMAR or with a second "generic" dummy head. The "full-body" 
recordings may not work for all, and I don't know how they sound when played 
through speakers. But I will certainly state that they are amazingly real 
sounding recordings. Of course, one bias may be that I had a visual of the 
auditory scene, and it was my own minute head movements that may have 
contributed to the place-realism. Always fun to experiment... and learn from 
one's own experiences.
Best,
Eric C.


Re: [Sursound] Anthropometrics, Loudspeakers, & Vision

2013-04-02 Thread Eric Carmichel
Hello Etienne and all Sursound Readers,

Many thanks for your response, insight, and “food for thought”. You brought up 
interesting points which, in turn, prompted me to dig deeper into Ecological 
Psychology (referring to the Gibsonian school).

There’s certainly something to be said for choosing the “right” information 
versus ability to detect or pick up additional information. In fact, this could 
get to the heart of some of my initial thoughts regarding hearing research (and 
my initial interest in Ambisonics). As you may know from prior diatribes and 
posts, I have interests in cochlear implant research and spatial hearing. It 
probably comes as zero surprise that an array of 22 electrodes used to 
stimulate the auditory nerve provides, at best, impoverished input to the brain 
(especially when compared to the
 input provided by the approx. 3500 inner hair cells of normal-hearing 
listeners).

When electric hearing is combined with acoustic hearing (hearing aid or not), 
we might surmise that the low-frequency (acoustic) energy simply adds to the 
amount of information received. For normal-hearing and impaired listeners 
alike, the low-frequency energy by itself provides very little usable 
information. For example, low-pass filtered speech (f3 = 200 Hz, high-order 
filter) is quite difficult to understand. In fact, f0 for women is above 200 
Hz, so little speech information resides at the very low lows.
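
(The demo is easy to make for yourself; a minimal sketch, assuming a mono 
speech file is on hand, with an 8th-order Butterworth standing in for the 
"high-order" filter and made-up file names:)

import soundfile as sf
from scipy.signal import butter, sosfiltfilt

def lowpass_speech(infile, outfile, f3=200.0, order=8):
    # Low-pass a speech recording at f3 Hz to audition how little
    # intelligibility survives in the lowest octaves alone.
    x, fs = sf.read(infile)
    sos = butter(order, f3, btype="low", fs=fs, output="sos")
    sf.write(outfile, sosfiltfilt(sos, x, axis=0), fs)

# lowpass_speech("sentence.wav", "sentence_lp200.wav")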

Electric (cochlear implant) hearing alone provides reasonably good speech 
comprehension scores when speech stimuli are presented in a quiet environment 
(+20 dB or better SNR). Scores obtained from 5-word sentences could range from 
50 – 90 percent correct (I don’t have an exact reference at hand, but I believe 
this is a good estimate). When electric
 hearing is augmented with the below-200 Hz acoustic stimulus, speech scores 
improve by a big jump. Furthermore, speech comprehension ability in reverberant 
environments improves. One might be inclined to conclude that when the sensory 
input is impoverished, any additional input is welcomed and quickly used to 
fill in any missing information gaps or resolve ambiguities. But the 
synergistic combination of electric and acoustic hearing suggests, at least to 
me, something beyond “additional” information is at work.

Research regarding electric-acoustic stimulation (EAS) has led to exciting 
results and interesting discussions, but the background noise and reverberation 
used in many studies are often of the artificially-generated (pink or white 
noise maskers) and one-dimensional or mono reverb nature. Sursound Digest 
readers probably recall the discussion I initiated regarding multi-channel 
subwoofers and identifying sound-source direction (at least
 in free-field) for very low frequency sounds. My interest and concern for 
presenting accurate low-frequency and realistic sound source direction wasn’t 
about measuring localization ability for very low-frequency sounds: My interest 
was to build a periphonic system for evaluating REAL-WORLD low-frequency 
sounds’ contribution to or detraction from EAS listening. Needless to say, 
real-world sounds don’t come from a single subwoofer or direction. Whether we 
can determine direction isn’t the important part, but the subtle (and perhaps 
subconscious) aspects of real-world listening do matter.

My take or concern over “realism” versus artificially produced stimuli isn’t 
one of difficulty; in fact, I’d state that many artificial and monaural noises 
(dichotic or diotic presentation) mixed with speech present more difficult 
listening conditions than what we encounter in the real world. The problem is 
one of learning what is “real” and useful. As an analogy, being able to ride a 
unicycle (arguably difficult) doesn’t guarantee one’s success or ability to 
ride a bicycle. It may be easier to ride a bike, but there are also more ways 
to fall or crash at high speed, so the need to maneuver a bike and learn the 
rules of the road is more important for safety. Learning, success, or 
attending to a difficult listening task in the laboratory doesn’t guarantee 
user success in the cacophony of real-world stimuli.

To date, I’m not aware of studies where real-world stimuli and scenarios have 
been used to study the efficacy of EAS listening. It is entirely possible that 
the addition of low-frequency energy, whether a part of speech signal or not, 
helps users choose the “right” information. In the real world, there is a lot 
of multi-sensory noise. For
 normal-hearing listeners, segregating an auditory signal from noise is 
accomplished, in part, by perceived spatial separation. This is equally 
important for those involved with the recording arts: Spatially separating a 
pair of shakers from a hi-hat is most often accomplished via panning. With 
speech, we also face informational masking as well as energy masking. So, 
adding more “speech” information (aside from level to improve SNR) isn’t as 
important a

[Sursound] A quick Post Scriptum (yep, I goofed)

2013-04-02 Thread Eric Carmichel
Referring to my last post (moments ago) the following paragraph is in error.


Another area I would be interested in investigating is time-to-contact as it 
applies to hearing (Gibson was mostly involved with vision), and how binaural 
implantation might improve a listeners sense of safety in “three dimensional” 
space where there are multiple, moving, sound sources. Such studies under 
headphones are very realistic. As you wrote, “One of the characteristics in 
Gibson’s ecological approach that has been adopted by the VR field is the idea 
that perceptions are confirmed as true through ‘successful action in the 
environment’. Tilting one’s head can be considered action in the environment, 
and if the spatiality of the sounds heard correlate then that action can be 
considered successful. So head movements help to confirm that what is being 
perceived is correct.” I very much agree with what you wrote. Adding to this, 
avoiding collision is certainly a more successful action than identifying 
location to the nearest nth of a
 degree.

What I meant to say was, "Such studies under headphones are NOT very realistic."

A surround of speakers still rules.
Eric C.


[Sursound] Surround formats and lossy compression

2013-04-05 Thread Eric Carmichel
Greetings to All:

When it comes to surround sound coding/decoding, I never make a peep because 
I'm ignorant on the topic. However, a friend who heads the Dept. of Audiology 
at a children's hospital had asked a question regarding MP3s. Although the MP3 
format may be nothing more than a distant relative to surround formats, the 
thought of using "lossy" file types in research studies utilizing 
surround-sound stimuli does concern me. I answered my friend's question (re 
MP3s) as best I could, and the answer is shown below (I copied and pasted it verbatim--sorry for its length). Some of the concerns outlined below may 
or may not apply to surround sound (?).

Has anyone experienced odd artifacts while doing hybrid mixing (sounds from 
monaural sources added to actual, or live, Ambisonic recordings) and where 
sound files stored in lossy formats were converted to wav files? Re surround 
sound for research: Are there file formats that should be avoided as far as 
psychoacoustic research goes? Are all lossless formats more-or-less equal in 
terms of 'purity'?

Thanks in advance for any insights.
Eric C.


---original email and response re MP3s and audiology follow---

Hi Eric –
I hope you’re doing well. I’d like to pick your brain, if you don’t mind. What 
do you think about the use of MP3 or MP4 recordings for speech audiometry? I’m 
thinking of possible pitfalls in the compression and the bandwidth of the 
signal compared to, say, FLAC or standard wav files. Of course, audiologists 
used vinyl LPs and tape recordings for decades without any worry. Thanks,
Bob

Hi Bob,

You ask a good question and one that should be examined from more than a 
“fidelity” point of view. But before I dive into this, please allow me to make 
my first disclaimer: I’m writing this off the cuff, so I won’t give any 
references to peer-reviewed studies (but then, who needs peer review when the 
answer comes from Eric Carmichel?). Second disclaimer: I assume you already 
know a lot of what I wrote below--if I explain something that is either 
“obvious” or well known, it’s only to help me communicate my thoughts.

Researchers [ref?] have shown that the majority of listeners cannot tell the 
difference between a 44.1 kHz (or kS/s), 16-bit wav file and an MP3 derived 
from the same wav file. I don’t know what program material was used in the 
studies, but let’s assume music. If we can’t tell the difference between music 
MP3s and CDs, then “surely” we can’t hear a difference between speech MP3s and 
CDs. This might be one argument in favor of using MP3s for speech audiometry.

I believe many MP3s use a 32 kS/s sampling rate, which isn’t by itself much of 
a size reduction from 44.1 kS/s files. The compression scheme used to create 
MP3s is (or was) proprietary and largely based on psychoacoustical principles. 
Sounds that can’t be heard because of energy masking are “removed” at the time 
they would otherwise be masked. MP3s, unlike FLAC (free lossless audio codec), 
use a “lossy” compression scheme--what is lost isn’t brought back--it just 
doesn’t contribute (perceptually) to the sound. I’d guess that both forward and 
backward masking are taken into account as well. The usual bit rate (“bandwidth” 
isn’t quite the right word) for MP3s is 128 kilobits/s, though some files 
use a variable bit rate*. An MP3 encoded at 128 kbps is considered 
“radio” quality, while higher rates (192–320 kbps) are probably indistinguishable 
from CD-quality wav files, at least in terms of fidelity.

[Side notes: CD-quality refers to a 44.1 kHz sampling rate and 16-bit resolution. 
Sixteen bits is, well, a mix-n-match of 16 zeroes and ones, yielding 2^16 unique 
combinations (0 to 65535 represented digitally); MP3 decoders typically deliver 
16-bit output as well. A byte is 8 bits--basic computer nomenclature that goes 
back to caveman days and ASCII standards--so 16 bits (lower case b) is the same 
as two bytes (upper case B). If the sampling rate is 32,000 samples per second 
and our resolution is 2 bytes, then we’re “streaming” 32,000 * 2 = 64,000 bytes 
per second, or 64 KBps, per channel. Unlike kilohertz (kHz), the K is capitalized when 
referencing kilobytes (KB) or kilobits (Kb). Because we’re (generally) dealing 
with two interleaved channels (L + R), the rate doubles: 32 kHz * 2 channels * 
16 bits / 8 bits/byte = 128 KBps of uncompressed PCM, or about 1,024 kilobits/s. 
Note that the “128” in a 128 kbps MP3 is kilobits per second (16 KBps), not 
kilobytes, so a fixed-rate 128 kbps MP3 is roughly an 8:1 reduction from that 
uncompressed rate--the matching numbers are a coincidence.]
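To make the arithmetic easy to check (and to re-run with other rates), here is the same calculation as a few lines of Python; the numbers are the ones used above, and nothing here is specific to any particular encoder:

# Uncompressed PCM data rate versus a fixed-rate MP3 (values from the side note above).
sample_rate = 32_000   # samples per second
bit_depth = 16         # bits per sample
channels = 2           # interleaved L + R

pcm_bytes_per_sec = sample_rate * (bit_depth // 8) * channels      # 128,000 B/s
pcm_kbit_per_sec = pcm_bytes_per_sec * 8 / 1000                    # 1,024 kbit/s

mp3_kbit_per_sec = 128                                             # a common MP3 bit rate
mp3_bytes_per_sec = mp3_kbit_per_sec * 1000 // 8                   # 16,000 B/s

print(pcm_bytes_per_sec, pcm_kbit_per_sec, mp3_bytes_per_sec)
print("reduction:", pcm_bytes_per_sec / mp3_bytes_per_sec, ": 1")  # 8.0 : 1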

MP3s, like FLAC or wav files, are NOT limited to a fixed sample rate or bit 
depth. If frequency response were our only concern, the Nyquist theorem (the 
Nyquist frequency is also known as the “foldover” frequency) says that the 
highest reproducible frequency without aliasing is half the sample rate. So bit 
depth (= resolution) and upper frequency limit, by themselves, are NOT reasons 
to avoid MP3s. So why be against MP3s? Read on...
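A quick numerical illustration of that foldover point, assuming only NumPy: a 20 kHz tone sampled at 32 kS/s produces exactly the same sample values as a 12 kHz tone of opposite sign, which is all aliasing is.

import numpy as np

fs = 32_000                       # sample rate; Nyquist frequency is fs/2 = 16 kHz
t = np.arange(64) / fs            # a couple of milliseconds of sample instants

tone_20k = np.sin(2 * np.pi * 20_000 * t)   # above Nyquist
tone_12k = np.sin(2 * np.pi * 12_000 * t)   # its alias: |20,000 - 32,000| = 12,000 Hz

print(np.allclose(tone_20k, -tone_12k))     # True -- indistinguishable once sampled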

When it comes to perception, we really don’t know what the hearing impaired, 
autistic (neurotypical), or “not

Re: [Sursound] Surround formats and lossy compression

2013-04-05 Thread Eric Carmichel
Hi Eric B.,
Thanks for your detailed and informative reply.
While drafting a recent Sursound post regarding KEMAR, I did a Google search to 
make sure I was spelling Zwislocki correctly. One reference that appeared at 
the top had something to do with inner ear simulation software--I need to go 
back and check it out. Anyway, I don't know whether the "front end" of 
available inner-ear simulation software allows the user to study neural coding 
with an arbitrary source or audio file. Analyzing complex sounds at the neural 
level (particularly innervation of inner hair cells) would require a lot of 
data-logging channels or replaying the stimulus over and over while 
systematically moving the 'logger' to the many virtual receptors. If such a 
simulation exists, one might be able to measure differences (at neural level) 
between a wav file and its mp3 counterpart. Considering that neural firing 
appears to be a lot more complicated than basilar membrane motion alone (which 
is primarily mechanical except for motile outer hair
 cell contributions to membrane elasticity), we might expect to use statistical 
measures to decide what significant differences, if any, exist. This still 
wouldn't provide conclusive evidence when it comes to perception. I realize 
there has been a long-standing debate over audio file types and rates, but my 
guess is that the subjects used in studies consisted of normal-hearing young 
people with no familial history of hearing loss. Normal hearing generally means 
thresholds of 10 dB HL or better at the audiometric test frequencies (the highest 
test frequency being 8 kHz). Such screening measures are generally employed to 
ensure a 'normal' (and homogeneous) population. Perhaps people claiming to 
possess 'golden ears' have participated in mp3 vs wav studies, too. But this 
kind-of avoids the subtle issue of outliers or people who have (for example) 
auditory processing disorders or other abnormalities that are independent of 
hearing thresholds. We have a reasonable
 grasp on how mammals hear (the physiological aspect), but we don't know a 
whole lot about how we listen. Of course, just as with hybrid mixing, the way 
to avoid potential pitfalls or danger in a research or clinical environment is 
to avoid the lossy file types altogether.
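On the "measure differences between a wav file and its mp3 counterpart" idea: long before anything at the neural level, a crude signal-level comparison is easy to script. The sketch below assumes the MP3 has already been decoded back to a wav with an external tool, that the two files are time-aligned and share a sample rate and channel layout, and that NumPy and the soundfile package are available; the file names are placeholders. It says nothing about audibility--only about what the encoder changed.

import numpy as np
import soundfile as sf

orig, fs = sf.read("original.wav")            # placeholder file names
lossy, fs2 = sf.read("decoded_from_mp3.wav")
assert fs == fs2, "compare at a common sample rate"

n = min(len(orig), len(lossy))                # decoders often pad or trim slightly
residual = orig[:n] - lossy[:n]

rms = lambda x: np.sqrt(np.mean(np.square(x)))
print("residual level:",
      20 * np.log10((rms(residual) + 1e-12) / (rms(orig[:n]) + 1e-12)),
      "dB re original")

# A rough per-band view of where the differences sit.
spec_o = np.abs(np.fft.rfft(orig[:n], axis=0))
spec_l = np.abs(np.fft.rfft(lossy[:n], axis=0))
freqs = np.fft.rfftfreq(n, d=1.0 / fs)
for lo, hi in [(0, 500), (500, 2000), (2000, 8000), (8000, fs / 2)]:
    band = (freqs >= lo) & (freqs < hi)
    diff_db = 20 * np.log10((spec_o[band].mean() + 1e-12) / (spec_l[band].mean() + 1e-12))
    print(f"{lo:5.0f}-{hi:5.0f} Hz: {diff_db:+.2f} dB")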
Best,
Eric Carmichel (also not to be confused with Eric Carmichael--my last name has 
an unusual spelling. Cheers!)






Re: [Sursound] Surround formats and lossy compression

2013-04-07 Thread Eric Carmichel
Thanks to everyone who responded to the posts re file formats.

Does "lossless" A = "lossless" B (commutative property?) in ALL instances. Does 
every lossless file type use all 16 bits (e.g.) for the net resolution, or are 
one or two bits used in hand-shaking protocols to insure transfer accuracy 
(number of actual bits used may have more to do with A-D converter, but that's 
another topic)? I suppose codecs or other software used could affect playback 
quality? Is converting FLAC to WAV to ALAC completely without error? And are 
popular surround formats merely interleaved wav files (or similar) or lossy? I 
sincerely don't know a lot of this stuff, just enough to question online 
definitions. The help/input was greatly appreciated.
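Part of the first question is easy to answer empirically. Here is a sketch of a bit-exactness round trip using the soundfile package (WAV and FLAC only; the ALAC leg is left out of this sketch). If the samples come back identical, nothing was lost and no bits were "spent" on anything else:

import numpy as np
import soundfile as sf

fs = 44_100
rng = np.random.default_rng(0)
x = rng.integers(-2**15, 2**15, size=fs, dtype=np.int16)   # 1 s of 16-bit noise

sf.write("roundtrip.wav", x, fs, subtype="PCM_16")
sf.write("roundtrip.flac", x, fs, subtype="PCM_16")

wav_back, _ = sf.read("roundtrip.wav", dtype="int16")
flac_back, _ = sf.read("roundtrip.flac", dtype="int16")

# "Lossless means lossless": every 16-bit word survives both containers intact.
print(np.array_equal(x, wav_back), np.array_equal(x, flac_back))   # True True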
RE upper or lower case K: I noticed Francis Rumsey used Kbyte for kilobyte in 
at least one of his books (ref: The Audio Workstation). I'm not arguing k is 
the only SI unit prefix, but more than one online computer dictionary uses Kb 
and KB (the b and B referring to bit and byte, respectively). The Internet, as 
we know, provides a plethora of misleading information from dubious sources 
(e.g., 12-year-old girls posing as expert auto mechanics in chat groups). The 
problem a newbie can run into is when two 'experts' disagree. Glad there's an 
accepted standard that's (mostly) impervious to change.
Best,
Eric C.


Eric Carmichel wrote:
...
> Are all lossless formats more-or-less equal in
> terms of 'purity'.

Eric B has already addressed this; lossless
means lossless.

...
> Unlike kilohertz (kHz), the K is capitalized when
> referencing kilobytes (KB) or kilobits (Kb).

In SI unit prefixes there is only a lowercase k.
A capital K, even if popular, is wrong.

Regards,
Martin


[Sursound] Meandering a bit (not a byte, but perhaps a nibble)

2013-04-09 Thread Eric Carmichel
Greetings All,

I have made more than one attempt at recording (Ambisonically, of course!) a 
steam whistle and its resulting echo in the Superstition Mountains of Arizona. 
Wind has been the foe, but it is interesting to hear first-hand how atmospheric 
conditions affect sound. In addition to wind noise masking the echo, sound 
appears to travel up-hill (attenuation) when traveling opposite to wind 
direction. Furthermore, isotherms (layers of cold and warm in the canyons) 
appear to change the way sound travels.

Although yesterday’s attempts at sound recording weren’t successful, I took time 
to visit the Boyce Thompson Arboretum (also in the Superstition Mountain 
Range). Despite high winds, the arboretum was somewhat isolated from an ongoing 
desert dust storm. The arboretum's many flowers and plants attract a lot of 
birds, so it’s a potentially great place to record bird sounds. (Side: Why am I 
the only person in the mountains with a mic? Normal people have cameras.)  
While I was enjoying the scent of roses and honeysuckle blossoms during my 
visit, the thought of electronically “recording” scents and odors came to mind 
(not exactly a new idea). After all, we have multiple methods of electronically 
recording images and sounds. It then made me think about sensation, perception, 
and how “reality” travels across/thru various medium. The amusing thought of an 
older Warner Bros/Bugs Bunny cartoon that referred to “smell-a-vision” also 
surfaced.

Although I don’t believe scents and odors would enhance movie-goers experiences 
(didn’t director John Waters already try this?), it does elicit thoughts of 
vials of elements and compounds being electronically mixed to produce odorants. 
Or, as with other implantable prostheses, what (and how) would be “recorded” to 
produce the sensations of olfaction and gestation via their respective cranial 
nerves? Sound travels on a medium (typically air for audition), as does light 
on an aether (ok, Michelson and Morley proved light doesn’t travel on such a 
medium). This could elicit discussion regarding the various schools of 
psychology and perception (Gestalt theorists, etc.) and how the 
sensation-evoking stimuli reach us (not to be confused with how they’re 
detected). Ecological psychology, for example, addresses vision and hearing, 
but these are sensations evoked by events that disturb or propagate through a 
medium. Touch, taste, and smell have no such
 medium, though many animals rely heavily on olfaction for survival and can 
determine the direction of a scent’s source (air current direction?). Certain 
schools of thought lean heavily on just a couple of sensations, not sensation 
as a whole. This is why I don't subscribe to any single school of thought 
regarding perception.

As I digress (and meander in my thoughts), the definitions of media and medium 
come to mind. Just one week ago, a Sursound reader/contributor, Mark, kindly 
asked whether I had heard of Marshall McLuhan. I have since downloaded a couple 
of books by (and about) McLuhan. As I understood (via Mark’s email), McLuhan 
received funding from IBM to launch a research project on various types and 
combinations of sensory inputs. Because of differences among scientists, 
McLuhan's research ran into problems. McLuhan is also the person who coined the 
phrase “the medium is the message.” Depending on our definition, we could say 
“the medium (e.g. air) carries the message.” I guess that’s being a bit too 
pedantic, but then touch carries a strong message without need for a medium or 
media. And regardless of the best audio-video recording gear in the world, I 
wouldn’t be able to capture or convey my experience at the Boyce Thompson 
Arboretum without the
 elusive smellavision.

As the title of this post indicates, I’m meandering. But the medium, message, 
and enjoyment of music and other sounds change in the presence of other 
stimuli. Surround sound also changes (and generally enhances) our listening 
experience, at least compared to mono or stereo. As humankind strives to move 
forward, I’m curious what the next “medium” may be, and how surround sound will 
be shaped by paradigm shifts. For now, I'm just meandering about the message...

Best,
Eric C.

PS--I understand that a nibble (capital N or lower-case n?) consists of 4 bits 
(or half a byte).


[Sursound] Gustation, not gestation

2013-04-09 Thread Eric Carmichel
Made typo in my last post: Certainly intended to write gustation (taste), not 
gestation. I need to pay more attention to auto-correct features (which can 
lead to auto-error) and update my spell-check dictionary.
Ciao,
Eric C.


[Sursound] Medium Meanderings

2013-04-10 Thread Eric Carmichel
Greetings,
Many thanks, Peter and Mark, for your highly-detailed and informative 
responses. There’s a lot to digest.
I have an interesting thought on new media and man’s proclivity toward photos, 
audio recordings, books, etc. I’m not speaking for everyone, but I’ll confess 
that I have a certain compulsion towards collecting objects and “events”. When 
it comes to collecting printed media, I used to buy a lot of books, but I am 
equally ok with downloading books in electronic form. When I really want to 
study a book, I generally obtain a hardcopy because I don’t care to read from a 
computer screen. In fact, I ordered two hardcopies of the McLuhan books you 
(Mark) mentioned in your email. But reading, in general, forces me to be in a 
physically uncomfortable position; consequently, I find audio books useful, and 
for more than one reason. In addition to the issue of physical comfort, I 
actually get more out of listening to an audio book than visually reading the 
pages: I actually absorb more from a book, to include text books, when I’m 
simultaneously
 performing a menial task. So, then, I welcome and embrace an age of audio 
books.
I have a fairly large vinyl/LP collection. As I part with my collection, I 
really don’t mind giving up the physical collection so long as I have 24-bit, 
48 kHz digital copies of the LPs being played through my moving-coil phono 
cartridge and high-end turntable. It’s as though I’m trying to record an “event” 
-- that event being the sound of the record heard through an analog system, as 
well as the musical performance itself.
I’m rarely without my camera (still images, not video), but I’ll bet I’d carry 
my surround microphone just as often as a camera if it were as portable and 
easy to set up (at least a mic and field recorder is just as easy to set up as 
a vintage, large-format camera). Although I rarely print or frame my photos, 
the stored “images” are just as much a part of my photo collection as the 
printed images. I'm grateful that I no longer have to deal with dark-room 
chemicals, though some would argue that a certain art is lost (I can virtually 
dodge and burn in Photoshop).
Perhaps the next “medium” will be our ability to store and retrieve events in 
our brains in much the same fashion as we store data electronically. The 
formation of neural synapses creates memories, but our ability to tap into these 
at will (or at appropriate times) seems lacking. Maybe all we need is a 
controllable “outer feedback loop” that re-routes events stored in white matter 
to cycle through the peripheral neurons and back to the cerebral cortex 
(Eureka! The iBrain). I suppose some sort of meta data would be needed to 
control sample rate for equivalent real-time experiences.
In a sense, different and newer media allows us to collect things we never did 
in the past. When it comes to computing, I know people who collect operating 
systems and software simply just to have it, not necessarily use it. Our 
survival (or ability to impress a potential mate) may/may not have been 
dependent on collecting objects. Regardless, I believe “collecting” is a common 
trait, and certainly not exclusive to humans. If I can collect a sound’s 
multi-faceted, multi-directional properties, well that’s a way cooler item to 
collect (and listen to) than its uni-dimensional counterpart. And, in a sense, 
I can retrieve more of the “event” I wished to collect. Collecting events isn’t 
a desire to re-live the past (maybe?); instead, perhaps it’s a way of “moving 
forward through the rear view mirror” (familiar title, Mark?) or, more 
importantly, simply enjoying life.
Ironically, one of my favorite posters is a B&W photo of E.B. White taken by 
photographer Jill Krementz. It shows White in his writing space, which is 
little more than an empty shed. There’s a lot to be said for simplicity, 
especially when our senses are constantly bombarded. Even a surround-sound 
recording of “quiet” results in a surround of audible mic self-noise. Move the 
air along, and the mic self-noise becomes the pleasant sensation of a breeze.
Best,
Eric C.


[Sursound] The Pigeon Blimp (or Bird on a Wire)

2013-04-11 Thread Eric Carmichel

Greetings,

In response to “I’m curious what the next ‘medium’ may be, and how surround 
sound will be shaped by paradigm shifts,” Dr. Lennox wrote:

“I’d bet that one day we’ll discern a difference between ‘surround sound’ and 
‘3-D sound’, where the latter contains a great deal more depth of field - 
distance information is probably at least as important in spatial hearing as is 
direction perception...” (By the way, Peter, I look forward to reading 
Violentyev, Shimojo, and Shams (2005), Touch-induced visual illusion. 
NeuroReport, 16, 1107-1110. Thanks for ref.)

I will now attempt to describe a “visual” (compared to “optical”) illusion and 
how it could apply to acoustical depth-of-field. Just a few days ago and while 
driving, I pointed to an object and said, “I wonder what type of blimp that 
is.” The object had (roughly) the shape of a blimp, to include a rudder. It was 
suspended in air, over the city, but there were no other foreground or background 
references to provide information as to its actual size. I was at a stop light, 
so motion didn’t provide a cue either. As I made a turn, I could see light 
reflecting from the fine wire that the “blimp” (actually a pigeon) was resting 
on. Light direction initially obscured or camouflaged the wire. Mildly 
embarrassing, but funny. Experience tells me small-ish objects can’t float, 
even when filled with helium (buoyancy force less than weight of fabric), so I 
assumed a large object. Blimps aren’t nearly as common as birds, but they’re 
not terribly uncommon
 in Phoenix.

So in this situation, I had allowed experience to be the noise. Unlike some 
illusions (such as the moon and the Ebbinghaus illusions), I had no relative 
size references -- just the sky as background -- until I could see the wire 
(its expected size also based on experience). There were no optical aberrations 
aside from an "invisible" wire. Simply put, my brain (or bifurcated ganglion) 
relied on past experiences to (wrongly) determine what the object was as well 
as its size. How, then, does this fit into acoustics?

Experience also gives us insight to a signal’s distance when there’s no 
relative noise. When there is noise, we expect nearby objects to have a 
better SNR than distant objects and a large signal-to-reverb ratio (an 
important type of SNR). Without noise, we might expect the sound to “agree” 
with the size and material/composition of the sound source. Furthermore, 
Ecological Psychologists (e.g. Gaver) might describe the sound by how it’s 
produced (rolling, tearing, breaking, liquid dropping), and we would likely 
expect the sound of a drop of water to be close to us. Loud speech and yelling 
have tonal and prosodic characteristics that set them apart from normal speech, 
so we use level relative to vocal effort to determine the distance of the talker. 
Because learning and experience are involved, acoustical depth-of-field – distance 
information depends, to some extent, on a person’s knowledge of the sound being 
heard, not just the physical aspects of the
 wave. Not that I’m saying anything you don’t already know, but I thought I’d 
share an amusing, true-to-life anecdote: The Pigeon Blimp.
Best,
Eric C.


[Sursound] Guns and Odor

2013-04-11 Thread Eric Carmichel
Re Smell-O-Vision: Subsequent to my post, I read that smell-o-vision actually 
existed. But the Bugs Bunny cartoon titled Old Grey Hare goes back to 1944 and 
predates (or predicts) an actual realization of smell-o-vision. Yes, Odorama 
much easier on atmosphere.
Re Raytheon: Hard to imagine this is new. Spatially separating sounds goes back 
to WWII and (as I understand) is why radio operators wore radio receiver on one 
side of the head in lieu of bilaterally. And as for threats, the azimuthal 
display on radar threat warning systems shows, in two dimensions, the distance to 
and quadrant of a threat. Azimuth is determined by triangulation, threat type by 
signature. I assumed there might have already been an audible (voice or tonal) 
warning that gave pilots and crew a sense of direction, too, but maybe that was 
too obvious (or unnecessary)?
It's certainly known that personal protection devices, particularly earmuffs, 
skew sense of direction. This is problem for law enforcement, military, and 
even for recreational users. It's a greater issue with full-coverage headgear. 
As my research showed (Effects of Binaural Electronic Hearing Protectors on 
Localization and Response Time to Sounds in the Horizontal Plane, Noise & 
Health, Vol. 9, No. 37, 2007, pp. 83-95), lateralization ok with electronic 
binaural protectors, but front-back judgments (including those off to side) can 
be unambiguous but wrong. Actually, that was my first attempt at surround sound 
(long before I learned of Ambisonics), and I'd be curious if the results would 
be similar using an Ambisonic system or wave field synthesis. I chose discrete 
locations for stimuli along with a continuous surround of noise. Had the 
locations for stimuli presentation been virtual locations (i.e., phantom image, 
no speaker), I wonder how listeners
 would have responded while donning binaural electronic hearing protectors. 
I've proposed using Ambisonics to create listening environments to test 
listening in noise (note the use of listening, not hearing), and proposals have 
been sent to Army researchers and elsewhere. Gamers have certainly used 3D 
audio to assist in their tactics. Now to add smell-o-vision to their arsenal


Re: [Sursound] Guns and Odor

2013-04-12 Thread Eric Carmichel
Hello Mark,
That's interesting about the child's voice/whisper. I worked in electronic 
warfare and communications while in Air Force (during the short-lived Gulf War 
era), but never heard anything aside from (female) voice warning systems. 
Perhaps child's voice too subliminal (depend, of course, on level), evoked 
emotions that could compromise mission, gender bias (a few female pilots), or 
not as effective for pilots who didn't have family of their own. But 
conceptually, it's a very interesting way of being heard through the noise.
I had worked on an infant screening device for Etymotic Research (IL) while a 
student at UA (Tucson). Briefly, the stimulus was to be a recording of the 
infant's own mother (voice chips used to record; compander chip needed to 
ensure uniform record level). Other noises and women's voices weren't 
effective, or infant habituated very quickly to loud sound (Mayo clinic once 
used what was referred to as "spook-a-baby" to elicit response to sound, but 
infant not likely to turn eyes or head after one "blast"). Mother's own voice 
provided a reliable and repeatable stimulus. At that time, brainstem evoked 
auditory response (BAER) was the screening and objective test measure for 
infant hearing in hospitals, but now otoacoustic emissions are widely used 
because the test is quick and inexpensive. The screening tool was designed to be used 
at home to detect loss during the important language-acquisition years (6 - 18 
mos or thereabout -- I'm not a speech-language person).
Best,
Eric C.





   2. Re: Guns and Odor (newme...@aol.com)

Eric:

When I first started experimenting with "localized" sound -- intended for  
an acoustic interface to smart phones (and also before I "met" Ambisonics) 
-- I  was working with a fellow named Bo Gehring, who might be recalled for 
his early  contributions to video-game sonics.

He had once worked on a project for the US Air Force that was trying to  
solve the problem of getting a pilot's attention to inform them that a  
heat-seeking missile was about to fly up their tailpipe.

The solution, which I don't know if it was ever implemented, involved  
having a child's voice "whisper" through the headphones (in what is a *very*  
noisy environment), "Daddy, pull up!"

Both the child, the whisper and the "daddy" were important,  
psychoacoustically speaking, as I recall.

Mark Stahlman
Brooklyn NY


[Sursound] In memory of ETD... and a separate topic

2013-04-13 Thread Eric Carmichel
Greetings Everyone,
This may be old news for Sursound readers and AES members, but I just learned 
today that Edward T. Dell, Jr. passed away in late February. I apologize if 
this is repetitive, but I don't recall any prior post regarding Mr. Dell's 
passing.
For those who may not recognize his name, Ed Dell published several 
audio-related magazines, to include The Audio Amateur (TAA), Speaker Builder, 
and Glass Audio. At least one article on Ambisonics appeared in TAA. There's a 
nice write-up on Mr. Dell that can be viewed here:

http://audioamateur.com/from-the-editors-desk/edward-t-dell-jr-in-memoriam/

If you are unable to go to the link, I copied and posted the first paragraph:

"Edward T. Dell, Jr.: In Memoriam
February 12, 1923–February 25, 2013
Edward Dell, founder and former publisher of Audio Amateur Inc., died Monday, 
February 25, 2013, at the age of 90.
Dell, a legendary audio guru, developed his taste for publishing and audio as a 
teenager. He became a veteran builder of audio hi-fi speakers and was a 
longtime full member of the Audio Engineering Society and the Boston chapter of 
the Acoustical Society of America. He published magazines and books on all 
areas of audio for more than 35 years."

I had the fortunate pleasure of having a lengthy conversation with Mr. Dell. 
This was a number of years ago. Our conversation covered several topics: One 
topic that intrigued Mr. Dell was women in audio (or the lack of women in 
audio). According to Ed, around 99 percent of his subscribers (TAA, Speaker 
Builder, and Glass Audio magazine) were male. My take was that women were more 
sensible than men, and weren't about to spend hundreds of dollars on a power 
cord that will *make the difference between night and day* (unless you're an 
unsophisticated listener--and who wants that!!). This brings me to a second 
(and possibly precarious) topic on gender "bias". Studies have shown that women 
have, in general, better hearing than men. Occupational and recreational causes 
of hearing loss were traditionally more common among men, but this may not be 
the case now. Thresholds aside, I believe there's at least one study showing that 
women have better frequency-discrimination ability than men. Studies that I've 
been a part of (or helped design) almost always ensured an equal number of men 
and women so as to remove any gender bias. Of course, a t-test could be used if 
there were reason to ask whether women had, for example, better localization 
ability than men.
But something I have given more thought to is how we *feel* about the system 
that reproduces the music or a sound, and how technology affects the way we 
*feel* about the music itself. I have no doubt that some audiophiles are as 
*emotional* about their equipment as the music. Similarly, our feelings about 
the technology used to present sound stimuli in research studies could have an 
impact on a study's outcome. Men's proclivity toward equipment and technology 
(to include Ambisonics) might make us "poor" research subjects because we're 
listening for things that the system is *supposed* to do. Preconceptions and 
emotional *noise* affect an otherwise unbiased response (or opinion) based 
solely on the stimuli. But you know what? For many applications, I believe 
there's something to be said for research that measures how something makes us 
*feel* (bring in the Likert scales?). Do we feel a certain way about a musical 
composition because it's recorded with a
 SoundField mic versus a Shure SM57? Is there something about a speaker's look 
that makes us *feel* a certain way, and other parameters that we believe we're 
measuring objectively are actually skewed? Are women as *awed* (awe is 
definitely a feeling) by surround effects as men, or are they more in-tune with 
dialog? Does the success of surround sound depend on how we feel about the 
equipment, or how it makes us feel (probably not inseparable), or the emotions 
that it evokes in men and women alike?

I was curious as to whether Mr. Dell ever came to a consensus as to why few 
women are interested in high-end audio. I looked up his name to 1) to say 
*hello* and 2) to thank him for the many years he devoted to audio. I was 
saddened to learn of his passing. But like Michael A. Gerzon, Edward T. Dell, 
Jr. will be remembered among the audio and hi-fi community as one of the audio 
gurus.

Best,
Eric


[Sursound] Re-re-inventing the wheel

2013-04-20 Thread Eric Carmichel
Just received an email which - seems someone else is reinventing the Soundfield 
again - see http://www.quaud.io/
This time it's based on mems microphones and is very small so it ends up using 
blind source separation in order to get good source-interference ratios. 
There's only one reference to Gerzon and Craven in their papers, the latest 
one, and it's only brief - and no mention at all of Ville Pulkki's work, which 
seems closely related. Interesting...
Dave

Hi Dave,
That is interesting... but then, too much info might preclude their getting a 
patent? I did notice that the mic in question uses omnidirectional capsules. 
I'll have to re-re-read the literature by Gerzon, Craven, et al. I recall that 
Gerzon's earliest ideas depended on figure-of-eight mics (akin to Blumlein 
Stereo), whereas all later incarnations use subcardiod mics. Whether this is of 
any consequence or not... I don't know. Does beam forming or delay techniques 
to create additional first-order patterns from the omnidirectional mics change 
up the design (and math) from arrays using intrinsically cardiod mic elements? 
Anyway, I certainly hope credit will go to where credit belongs. Long live 
Ambisonics!
Eric C.
PS--I look forward to listening to the YouTube samples from the 
aforementioned link. I hope it's not another helicopter. Or worse, a barber 
shop scenario.


Re: [Sursound] Re-re-inventing the wheel

2013-04-20 Thread Eric Carmichel

SNR wouldn't have been my initial concern because I have some wee-tiny 
electrets that have (purportedly) +10 dBA noise--pretty low for a small capsule.

When I think of the "classic" multi-polar mics such as the AKG 414, the multiple 
patterns are often derivatives of back-to-back diaphragms sharing a common 
stator (I think... not a mic expert here). In comparison, ribbon mics (Coles, 
Royer, vintage RCA, etc.) are figure-of-eight (or bi-directional) because of 
their pressure-gradient design. Cardioid condenser and dynamic mics have rear 
venting for delay/cancellation (delay using materials of varying density, not 
merely time-in-air delay), hence their directional characteristics.

So... having directional characteristics provides direction-dependent output 
levels for each of four mics. Spacing, of course, provides a time-difference 
component for computing direction. The ideal is no inter-capsule spacing (= zero 
time delay). Tightly spaced omnis are just that... omni... and wouldn't have 
discernible time or level differences unless there's *some* time difference or 
pressure difference. Sound intensity probes rely on a phase (and level) 
difference to determine the vector quantity of sound power (SPL alone being a 
scalar quantity).

So, based on the acoustical signal processing and beamforming described by, for 
example, Vorlander, I was curious whether the *new* surround mic used such 
processing to create four virtual subcardioids that would also serve as the 
equivalent *A-format* mics. Any single mic, or the average of all mics, would be 
the omni component.

For mics such as the AKG 414, the electrically and acoustically combined 
response yields one polar pattern. So I was really wondering how four omni mics 
could provide unique info for multiple directions. A highly-directional mic can 
be created using omnis and beam forming, but not a *series* of directions at a 
given instant. Now, scratching my head, there's no reason that multiplexing 
among the mics couldn't be used to create rapidly-changing patterns that are 
akin to interleaved quad channels. That is to say, only one direction is picked 
up at a time, but the derived direction changes swiftly enough that it appears 
to have four *directional* mics (is this a new idea... it just came off the top 
of my head... most of what I think up has been done).

I just found the technology interesting/curious, and wondered where it might 
deviate from the Soundfield mic to the point of being a unique design. One 
aspect of a patent is that the invention be unique.

Above post in response to: If one regards the subcardioid as made up of omni 
and figure of 
eight components, is it not the case that the ambisonic XYZ signals of 
the Soundfield Mic are derived solely from the figure of eight 
components?

Further, if this new mic relies on omni capsules, how
 will it not suffer from the signal to noise ratio problem of Blumlein's
 method of deriving stereo from two omnis?

David


Re: [Sursound] Re-re-inventing the wheel

2013-04-20 Thread Eric Carmichel
I truly appreciate your informative and highly detailed response. Your help with 
spherical harmonics (or Legendre polynomials?), and with mics lying on the 
surface of a sphere, clears up a lot.

But here's what I don't understand about the quaud (quaud.io) mic: they say the 
four omnidirectional mics lie on the corners of a tetrahedron--essentially the 
same arrangement as the Soundfield, but with omni mics positioned on the corners 
of the tetrahedron. Re mics on a sphere: in a corner-oriented tetrahedral 
arrangement, the mics would lie on a *virtual* sphere just as much as mics on a 
sphere could be lying on a virtual tetrahedron. But at some point the actual 
(physical) surface becomes a piece of the whole. This is clearly evident when 
the sphere is large enough to be a human head. So I'm not always clear as to 
whether it's the mics' virtual orientation in space, or the physical boundary of 
a spherical surface, that *shapes* the sound and creates the requisite time and 
pressure differentials.

Dave M's original post states that someone else is... again... re-inventing the 
Soundfield mic. I'm sure that I'm not the only person who is curious as to what 
makes the quaud (Trademark) mic *unique* and different from the Soundfield 
mic--particularly if the quaud mic is patented. Is, for example, more than one 
tetrahedral arrangement used to achieve *surround* spacing--which would then be 
a wholly different thing? I need to read further. Thanks again for the info. The 
following was cut-and-pasted from their website (I think Dave provided all of 
this in his post, too):

quaudio comprises four omnidirectional microphones located at or near to the 
corners of a regular tetrahedron. Since these capsules are omnidirectional they 
can be located at the opposite corners of a cube with no loss in generality. 
This arrangement is straightforward to achieve in a standard PCB assembly line 
by soldering two pairs of MEMS or electret capsules on opposite sides of the 
substrate. Alternatively it is possible to solder three capsules to one side and 
a single capsule to the other. In both cases the acoustic centre of each sensor 
should be separated laterally by a distance corresponding to the vertical 
separation between membranes on either side of the device.
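For what it's worth, the sum-and-difference idea behind such an arrangement is easy to sketch. With four omnis at alternate corners of a cube, the sum approximates an omni (W-like) component and the three signed sums approximate finite-difference pressure gradients along x, y and z (X-, Y-, Z-like components)--valid only while the capsule spacing is small compared with the wavelength, and ignoring the calibration and diffuse-field equalisation a real design would need (the raw differences rise at roughly 6 dB/octave). A toy Python version, with the corner geometry assumed rather than taken from the quaud papers:

import numpy as np

def omni_tetra_to_first_order(p1, p2, p3, p4):
    """p1..p4: omni capsule signals at cube corners (+,+,+), (+,-,-), (-,+,-), (-,-,+)."""
    w = (p1 + p2 + p3 + p4) / 4.0   # pressure (omni-like) estimate
    x = (p1 + p2 - p3 - p4) / 4.0   # finite-difference gradient along x
    y = (p1 - p2 + p3 - p4) / 4.0   # along y
    z = (p1 - p2 - p3 + p4) / 4.0   # along z
    return w, x, y, z

# e.g. with four equal-length NumPy arrays of capsule samples:
# w, x, y, z = omni_tetra_to_first_order(cap1, cap2, cap3, cap4)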


[Sursound] Overzealous Underthinking

2013-04-23 Thread Eric Carmichel
This post refers to Sursound Digest Vol 57 Issue 16

(from) Eric:

A highly-directional mic can be created using omnis and beam forming, but not a 
*series* of directions at a given instant.

(response from) Fons:

??? What would stop anyone from using whatever beamforming algorithm twice (or 
more times) in parallel, using the same mic signals as input?
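Fons's point is easy to demonstrate: the same four capsule signals can feed as many simultaneous "look" directions as you care to compute. A minimal delay-and-sum sketch, assuming NumPy, plane-wave geometry, a made-up 1 cm tetrahedral capsule layout, and crude integer-sample alignment (a real design would use fractional delays and frequency-dependent weighting):

import numpy as np

fs = 48_000
c = 343.0                                    # speed of sound, m/s
r = 0.01                                     # capsule distance from array centre, m

# Omnis at alternate corners of a cube (a regular tetrahedron) -- assumed geometry.
pos = (r / np.sqrt(3)) * np.array([[ 1,  1,  1],
                                   [ 1, -1, -1],
                                   [-1,  1, -1],
                                   [-1, -1,  1]], dtype=float)

def delay_and_sum(mics, direction):
    """Steer the four omni signals toward one unit-vector look direction."""
    delays = pos @ direction / c             # arrival-time lead of each capsule (s)
    out = np.zeros(mics.shape[1])
    for sig, d in zip(mics, delays):
        out += np.roll(sig, int(round(d * fs)))   # delay the early capsules so all line up
    return out / len(mics)

# One set of capsule signals, several virtual mics computed in parallel.
mics = np.random.default_rng(1).standard_normal((4, fs))    # stand-in capsule signals
front = delay_and_sum(mics, np.array([1.0, 0.0, 0.0]))
left  = delay_and_sum(mics, np.array([0.0, 1.0, 0.0]))
up    = delay_and_sum(mics, np.array([0.0, 0.0, 1.0]))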

New thoughts...

Hi Fons, It’s not uncommon for me to *underthink* things. I sort-of realized 
that electrical buffering would allow any of the mics to be used in parallel, 
even if their respective signals were mixed electrically in any possible 
combination (to include polarity inversion) or digitally offline. But, I have 
considered a mic technique that *might* benefit from multiplexing (or its 
signal processing equivalent).
Briefly, I’m a big fan of the Blumlein technique because it gives a wonderful 
front stage when played through a basic stereo setup. The inherent problem of 
this technique comes from source sounds that emanate from behind the mic 
arrangement (two figure-of-eights, of course). We can’t selectively choose 
front from back and then swap the rearward sounds’ L-R orientations. The sum 
and differences of the two bi-directional mics could be manipulated if we got a 
positive output from both the front and rear lobes simultaneously. This may 
sound trivial, but this can’t be done in parallel because we don’t have 
independent outputs for each of the *lobes*. In other words, getting a negative 
output from a compression to the front could be accomplished via polarity 
inversion, but this automatically leads to a positive output for a rarefaction 
to the rear. It *could* (?) be accomplished with the addition of a second pair 
of mics (starting to sound Ambisonic),
 but their differences (physical spacing and performance), when compared to the 
first pair, would create some error, though perhaps not by much. Two *virtual* 
mics could, in real-time, be created via multiplexing (same as separating odd- 
from even-numbered samples of a digitized signal?). This leads to a 
four-channel output from two figure-of-8 mics, which, for the time-being 
doesn’t get us anywhere. But if the R1, R2, L1, and L2 signals were 
*appropriately* mixed (e.g. adding R2 to –L1), maybe there’s a way to *get 
back* the rearward sounds’ proper L-R orientation. As I think about it, the 
mics would have to digitally swap L and R intermittently (one swap per sample), 
which won’t work because they have to be physically facing L or R as is 
required for the Blumlein technique.
Well, now that I’ve proven myself wrong (again) while jotting down ideas, I’m 
going to post this anyway so that others will steer clear from the foibles of 
poorly conceived ideas. Or, maybe I actually am onto something (unlikely). When 
I consider the elegant *simplicity* of Ambisonics, it really is a very cool 
topic: Four mics, and a lot of positive directions!
Eric C.


[Sursound] Sounds in Motion

2013-04-23 Thread Eric Carmichel
Greetings All,
As with many *coincidences* in the universe, the recent discussions regarding 
the quaudio mic were informative and timely. From what I gathered, there are 
certain mic techniques and algorithms that are better suited for the recording 
of STATIONARY sound sources than for MOVING sources. This, then, made me 
realize that recording moving sound sources is no trivial task; particularly 
when it comes to the reconstruction of an auditory event.
I have the idea that certain sounds can be separated from noise by not only 
their spectral characteristics and spatial location, but also by their 
perceived motion. I’m not referring to judging the distance of the moving 
object, or its direction of travel. An analogy to vision would be camouflage: 
Despite relatively poor vision, I often detect wee critters such as lizards, 
toads, and insects because of their motion. There’s no way I could otherwise 
see them against a backdrop of similarly colored terrain. The first step 
towards identifying the critter is realizing that it’s there to begin with.
I now have a reason for incorporating auditory *motion* in a research project. 
Maybe a few of you would like to join in or provide assistance. I’ll confess 
that I could use a project to get me a step closer to acceptance into a 
doctoral program. Furthermore, there’s a conference happening October 15-18 in 
Toronto: It’s the 8th Objective Measures Symposium on Auditory Implants. Their 
theme is ‘Unique people, unique measures, unique solutions’ reflecting a 
collective goal of providing the best hearing for persons needing auditory 
prostheses (= cochlear and brainstem implants). Below are a few of my ideas 
(egad!) and thoughts:
I know I’ve said this more than once, but I’m not too keen on presenting 5-word 
sentences presented in a background of pink noise as an *objective* measure of 
cochlear implant (CI) efficacy. This is may be objective in telling us how well 
a person performs while seated in a surround of pink noise and listening to 
nonsensical sentences, but so what? I’ve been hoping to present or propose a 
slightly *better* yardstick, even if there’s no past or standard reference to 
pit my data against. I had previously proposed adding video to complete the 
AzBio, IEEE, and SPIN sentences (currently used for speech audiometry), and I 
know of at least one doctoral student who has taken this to heart. What I now 
propose (with video) are sound-source identification tasks that can be 
*objectively* scored.
Simple sounds may not be readily identifiable by the hearing impaired. This 
isn’t new news, but how well stationary and mobile sounds can be identified by 
an implant wearer could be of value, particularly when designing implant 
algorithms. Imagine listening to several sounds through a 12-channel (max) 
vocoder. This roughly approximates CI listening. Your pitch discrimination 
ability is largely shot to hell, and dynamics would be compromised if 
compression were also added. Sounds emanating from various sources would be 
blurred, but hopefully your binaural sense of direction provides some 
signal-from-noise segregation. You still detect rhythm... at least for 
repetitive sounds.
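For anyone who wants to hear roughly what that degradation sounds like, a bare-bones noise vocoder is only a few lines of Python. This is a sketch, not a validated CI simulation--the channel count, the log-spaced band edges, and the 160 Hz envelope cutoff are arbitrary choices--and it assumes SciPy/NumPy and a mono float signal x at sample rate fs:

import numpy as np
from scipy.signal import butter, sosfilt, sosfiltfilt

def noise_vocode(x, fs, n_channels=12, f_lo=100.0, f_hi=7000.0, env_cut=160.0):
    """Band-split x, take each band's envelope, and re-impose it on band-limited noise."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)          # log-spaced band edges
    env_sos = butter(2, env_cut / (fs / 2), output="sos")     # envelope smoothing LPF
    noise = np.random.default_rng(0).standard_normal(len(x))
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band", output="sos")
        band = sosfilt(band_sos, x)
        env = np.maximum(sosfiltfilt(env_sos, np.abs(band)), 0.0)   # rectify + smooth
        out += env * sosfilt(band_sos, noise)                       # noise carrier per band
    return out / (np.max(np.abs(out)) + 1e-12)                      # rough normalisation

# y = noise_vocode(x, fs)   # then write y to a wav file and listen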
Given the above, we might go a step further (in the direction of Ecological 
Psychology) and ask whether we can tell a difference from tearing or breaking 
sounds, water drops from other temporal-patterned sounds, or rolling sounds 
from steady-state noises. Wind, although around us and a type of motion, is 
stationary relative to, say, a rolling object. When heard through a vocoder, 
they may be indistinguishable... unless the perceived motion of the rolling 
object provides useful information. Given a closed set of choices to choose 
from (and perhaps visual context), we could determine *objectively* how well we 
identify sounds presented in a background of other sounds. The latter is the 
*new* part: Can we segregate and then identify, sounds because of their motion, 
spectral make-up, etc. despite minimal or distorted information?
I would prefer to create stimuli from real-world sounds, though panning 
monaural sounds could be of some help. I like the *naturalness* of Ambisonic 
recordings, but now question how well they can be reproduced. I know that there 
are recordings of airplanes and helicopters (recorded by Paul Hodges, John 
Leonard, or Aaron Heller? I can't find the names/recordings online), so I have no 
doubt that Ambisonics is a viable method of recording moving sound sources. I 
am, however, concerned about the limitations, and how many sounds (to include 
sounds’ reflections) can be reproduced without raising doubt as to the 
*accuracy* of the playback.
I believe this is a do-able project that could provide meaningful information. 
Fine tuning implants to deal with an *outside* world could be different from 
the algorithms used to perfect speech understanding.
Best,
Eric C.

Re: [Sursound] what mics do you use?

2013-04-25 Thread Eric Carmichel
Hello Matthew,
I know you already received accurate and detailed responses to your question, 
but thought I'd add something.
The ears, by themselves, are essentially omnidirectional. They're akin to most 
pressure mics with no rearward venting. The head shapes the sound (ITDs, 
ILDs... the stuff you're aware of), as do the pinnae and body.
The 3daudio mic provides ITDs and pinnae transfer characteristics... I think 
(mere separation gives the time difference). Given the correct filter 
characteristics (i.e., the HRTF) and mixing output of mics to opposite ears, 
you could provide some semblance of a head. With further processing you could 
perhaps get the equivalent of transaural stereo that provides a 3D listening 
experience with only two loudspeakers. Now, you might ask, can I take this 
information, set up the appropriate equations, and work backwards to get 
B-format signals that can ultimately be processed for the desired number of 
channels? When setting up the linear equations (I suppose each wav file would 
be its own array or variable in MATLAB), you'd probably run into infinite 
solutions (or no solution) because the number of equations wouldn't match the 
number of unknowns. If you're after an inexpensive *binaural* mic, there are 
good ones you can wear in or at the ear (ask Len, he knows). But I take it 
you're looking for a true Ambisonic solution at low cost. 
I'm curious to build an Ambisonic mic using low-noise Panasonic or Knowles 
elements, but calibration is key. I have a great Ambisonic mic (TetraMic), but 
like to build things for the fun of it. Always good learning experience, too. 
Best wishes in your quest... maybe keep us up to date? Eric C.
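To make the equations-versus-unknowns point concrete, here is a toy Python version of the "work backwards from two ear signals to four B-format channels" problem--the 2-by-4 mixing matrix is invented purely for illustration (NumPy assumed):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 4))        # made-up 2-ear <- 4-channel mixing, per sample
b_true = rng.standard_normal(4)        # the "true" B-format sample we wish we had
ears = A @ b_true                      # all a binaural recording actually gives us

b_est, *_ = np.linalg.lstsq(A, ears, rcond=None)   # one of infinitely many answers
print(np.allclose(A @ b_est, ears))    # True: it explains the ear signals perfectly...
print(np.allclose(b_est, b_true))      # False: ...but it is not the original B-format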


[Sursound] eigenmike, VisiSonics, etc.

2013-06-15 Thread Eric Carmichel
I read the comments regarding mh Acoustics' eigenmike. Of interest was the 
topic of off-axis coloration (particularly FA's response and explanation). 
Off-axis coloration is often used to *advantage* -- at least when a certain 
characteristic sound is desired. When you consider how differently top vocal 
microphones sound (e.g. Sony 800G vs Neumann U47), sound recording engineers 
accept and embrace coloration. But as it applies to Ambisonics or 
HOA/multi-element mics, I can see where there are problems. Do same issues 
apply to VisiSonics' mic and/or software? The prior discussions made me wonder 
whether calibration for many *tetra* style mics is sufficient; that is, are 
polar responses of first-order subcardiods predictable from on-axis amplitude 
responses or IRs that are obtained on-axis? Would phase plots provide more 
useful info or prediction? (I think of the tightly calibrated BK/DPI mics used 
in sound power and intensity probes and how they're matched
 in phase as well as frequency response.) If IRs were taken off-axis (maybe 
they already are for the TetraMic?), what might be the *best* polar coordinates 
be for obtaining a response needed to minimize inherent or potential problems? 
Just questions... sorry I don't have answers... but at least this topic helps 
us consider ways to improve upon the hardware (mics) and software. Many thanks 
for everybody's time.
Best,
Eric C.


[Sursound] PotLuck Audio

2013-06-17 Thread Eric Carmichel
Greetings All,
I just registered for an upcoming audio conference (primarily recording) in the 
US. I thought I'd pass the URL along for those who may be interested. The list 
of sponsors is impressive--I'll be curious to see what surround gear is 
displayed or introduced.
Anybody on the list planning to attend? I'll look forward to meeting if you are.

Here's the link:

www.potluckconference.com (potluckconference dot com if URL gets scrubbed).

Best,
Eric C.


[Sursound] Testing left, center, right...

2013-06-21 Thread Eric Carmichel
Over the months, I’ve read a couple of posts asking whether it is possible to 
extract or synthesize surround channels from binaural stereo or non-Ambisonic 
surround formats. I am now attempting to do something that would appear to be 
simpler and more straightforward: that *something* being the conversion of 
conventional L-R stereo to L-C-R.
My reason for wanting to do this is because I have three loudspeakers that I’m 
very fond of: A pair of vintage KRK 7000s and a single KRK 9000 (not with the b 
suffix). I want to use the 9000 as the center channel because it has as the 
same Focal tweeter as the 7000s, but the 9000 has a much larger woofer (the 
7000s being somewhat deficient in this area). I thought it would be fun to 
experiment with a full-range center channel in lieu of using the 9000 (or any 
speaker) as a sub-only.

There’s a lot of info on the Internet about converting stereo to mid-side, but 
not too much about L-C-R. The simplest *fix* would be

C = (L+R)/2

which is akin to extracting the mid signal from a mid-side encoding. But this 
doesn’t separate what is only in the left channel, what is only in the right, 
and what is common to both L + R. If I have a sound going only to the L 
channel, using (L+R)/2 would give me half of that sound on the center channel. 
Needless to say, signal separation would suffer.

Going a step further, the left channel, L, contains the unique part of its 
content, l (lower case L), and half of what is common, or C/2. That is,

L = l+C/2

Likewise:

R = r+C/2

But to extract what is common in terms of R and L is algebraically impossible 
because we don’t know what l and r are (l and r don’t cancel out); I suppose if 
we did, then there wouldn’t be a problem to begin with. The problem doesn’t 
exist when you have stems to work with (e.g. cinema sound track production) in 
lieu of attempting to extract it from existing L and R channels.
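The separation loss is easy to put a number on. A minimal sketch (NumPy assumed): a tone that exists only in the left channel still shows up in the derived centre at half amplitude, i.e. only 6 dB down.

import numpy as np

fs = 48_000
t = np.arange(fs) / fs
L = np.sin(2 * np.pi * 440 * t)              # a source panned hard left
R = np.zeros_like(L)

C = (L + R) / 2.0                            # naive centre extraction
rms = lambda x: np.sqrt(np.mean(x**2))
print(20 * np.log10(rms(C) / rms(L)))        # about -6.02 dB: the left-only sound leaks into C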

Perhaps this simple exercise also explains, at least in part, why we can’t 
derive Ambisonic channels from non-Ambisonic sources (mathematically, there 
aren’t enough independent equations to give a unique solution).

Perhaps readers here have had success with a simple stereo to L-C-R converter? 
I have converted Ambisonic recordings to L-C-R: This is pretty easy with any 
number of plug-ins (I use Harpex for most of my Ambisonic processing).
Thanks for everyone’s time.
Best,
Eric C.


[Sursound] Ambisonic Aerobics (Ambirobics??)

2013-06-22 Thread Eric Carmichel
Hello Everyone,

First, many thanks for responses and insights to my recent post (Testing left, 
center, right...).

I look forward to trying Fon's suggestion--I'm always game for a listening 
experiment as long as high SPLs aren't involved.

Regarding Dave's post and the following link:

**This may be the answer to the lack of a feeling of moving...

http://www.kickstarter.com/projects/1944625487/omni-move-naturally-in-your-favorite-game?ref=category


Dave**

Maybe Ambisonics combined with Wii will start a trend in surround 
entertainment? I recently joined Kickstarter because of numerous art and 
science projects I have in mind. One design, near completion, is a unique 
self-calibrating audiometer for use in developing countries (such as the USA). 
Seriously, people in the UK and US can submit proposals to Kickstarter and have 
chance for *free* funding. For those unfamiliar, it's worth a look (by the way, 
it was a Sursound reader/contributor who suggested Kickstarter to me--thanks!).

Regarding Eero's comment, **I guess the other option to extract the center 
speaker information from two channel stereo is a Dolby Surround decoder, which 
you know much better than any of us, Eric.**

For those who are regular readers, you probably know there are a number of 
Erics on the list. I generally sign off as Eric C. so as not to embarrass the 
wiser and brighter Erics. One of the *other* Erics is an expert on Dolby. I 
don't know a whole heck of a lot about Dolby, but I did create a pseudo decoder 
for AC3 that used analog filters to get a 90 degree phase shift. For a single 
frequency, shifting 90 degrees is trivial. But for a complex tone, shifting the 
fundamental (for example) by 90 degrees won't result in ALL of the composite 
frequencies being shifted 90 degrees. The filter was akin to a *Hilbert 
transformer* in that its 90 degree phase shift is frequency-independent across 
the audio band (or at least 50 Hz - 10 kHz, based on Bode and phase plots). 
I've tried to explain this to those who are not mathematically inclined, but 
who have some familiarity with Fourier transforms (even if only on a 
non-mathematical basis). You see, the Hilbert transform is something of a 
*buzz word* in cochlear implant research, but not too many know its function 
aside from being an aid in extracting a speech signal's envelope. If you 
perform a Fourier expansion on a square wave (Wikipedia is a good source for 
the demo), you'll see coefficients such as 1/3, 1/5, etc. ahead of the 
sin(...) terms. Moving all of the sines *forward* or *backward* by 90 degrees 
is (I think) equivalent to a Hilbert transform and, consequently, a way of 
shifting a complex wave by 90 degrees. Delaying sin by 90 degrees gives -cos; 
advancing it gives +cos. This is obviously a bit more complex than a 180 degree 
inversion. Why 90 degrees? I dunno, it just seems cool to try and listen to 
(remembering, of course, that there also has to be a reference signal). It's 
also a part of decoding. Remember that I'm NOT the smart Eric, so I can't give 
better explanations.
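
For anyone who wants to hear (or plot) this, a minimal digital sketch using 
SciPy's FFT-based analytic signal -- not the analog filter described above -- 
looks like the following; the 100 Hz square wave is just an example signal:

    import numpy as np
    from scipy.signal import hilbert, square

    fs = 48000
    t = np.arange(fs) / fs
    x = square(2 * np.pi * 100 * t)    # 100 Hz square wave (many harmonics)

    analytic = hilbert(x)              # analytic signal: x + j * H{x}
    x_shifted = np.imag(analytic)      # every component shifted by 90 degrees

    # Sanity check on a single sine: its Hilbert transform is -cos,
    # i.e. each component is delayed by a quarter of its own period.
    s = np.sin(2 * np.pi * 100 * t)
    print(np.allclose(np.imag(hilbert(s)),
                      -np.cos(2 * np.pi * 100 * t), atol=1e-6))   # True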

Best to all,
Eric C.


[Sursound] Giving Precedence to Ambisonics

2013-06-26 Thread Eric Carmichel
Greetings All,
I have a friend who's an advocate of the Decca Tree mic arrangement. Many of 
his recordings (a lot of choir and guitar) sound quite nice, so I looked into 
aspects of the Decca Tree technique. For those who may not be familiar, the 
*traditional* Decca Tree arrangement is comprised of three spaced 
omnidirectional mics. A center microphone is spaced slightly forward. From what 
I've read thus far (Spatial Audio by Francis Rumsey, Focal 
Press; and selected articles in the AES Stereophonic Techniques Anthology), the 
slightly advanced time-of-arrival for the center mic stabilizes the central 
image due to the precedence effect. However, the third (center) mic can 
exacerbate the comb-filtering effects that arise with spaced pairs. So, to 
avoid these filtering effects, bring on a Soundfield / Ambisonic mic...??
As I understand it, Ambisonics already takes known psychoacoustical principles 
into consideration, which is why shelf filters are used to *optimize* ILDs and 
ITDs above and below 700 Hz, respectively. But as many readers may know, there 
are some nearly unpredictable ILD/ITD effects at approx. 1.7 kHz (for example, 
see Mills, 1972, Foundations of Modern Auditory Theory). Creating a virtual 
Decca Tree seems straightforward: moving the center channel, or a virtual mic, 
*forward* would require little more than offline processing. I wonder whether 
anybody has tried the following: slightly delay all channels except the signal 
(or feeds) that make up the forward-most (central) channel. Using an Ambisonic 
mic would eliminate combing effects. I realize a number of Ambisonic plug-ins 
have built-in crossed-cardioid, Blumlein, and spaced-omni functions, but I'm 
not sure I've seen any of them give *precedence* to the precedence effect or 
the Decca Tree arrangement.
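
A rough sketch of what I mean (Python/NumPy; the 1.5 ms figure and the three 
decoded feeds are just placeholders -- render them however you like, e.g. as 
three virtual cardioids from your decoder of choice):

    import numpy as np

    fs = 48000
    advance_ms = 1.5                    # hypothetical "virtual forward spacing"
    n = int(round(advance_ms * 1e-3 * fs))

    def delay(x, samples):
        """Delay x by a whole number of samples (zero-padded at the start)."""
        return np.concatenate([np.zeros(samples), x])[:len(x)]

    # left, center, right: feeds already decoded from the Ambisonic recording
    left, center, right = (np.random.randn(fs) for _ in range(3))

    # Give the center feed precedence by delaying everything else.
    left_d, right_d = delay(left, n), delay(right, n)
    # center is left untouched, so it effectively arrives ~1.5 ms early.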
Two-channel playback (both conventional and binaural) is here to stay for a 
while, so optimizing Ambisonics for stereo is desirable to me. In fact, one of 
my favorite recordings from the late 80s was made with the band (The Cowboy 
Junkies) circled around a Calrec Soundfield mic. I've never heard whether the 
Trinity Session recording was released in a surround format, or if the mic's 
hardware decoder converted straight to stereo from the get go. That particular 
recording made me aware of the Soundfield mic, though surround sound wasn't an 
interest for me at that time.
If anybody has attempted the Decca Tree using an Ambisonic mic (even with the 
addition of a separate, forward omni mic), I'd be interested in knowing what 
your experiences were.
Many thanks for your time.
Best,
Eric C. (the C continues to remind readers that this post submitted by the 
*off-the-cuff* Eric)


Re: [Sursound] Giving Precedence to Ambisonics

2013-06-26 Thread Eric Carmichel
Hi Aaron,
Many thanks for the link to Ron Streicher's article -- I passed the link along 
to my friend who is a big advocate of the Decca Tree.
I've listened to demonstrations of the precedence effect, and they always 
involved a single sound source (such as a talker) coming from two loudspeakers. 
The signal to one loudspeaker was delayed but slightly greater in level. The 
sound appears to come from the non-delayed speaker despite its lower SPL. I'm 
writing off the top of my head, but I believe the level difference can be 6 dB 
or greater (up to 11 dB?) and the sound will still appear to come from the 
non-delayed speaker. What makes the Decca Tree interesting, then, is that when 
the recording is mixed to two channels, there's a phantom (center) image that 
serves as the non-delayed source. In other words, both speakers (L and R) are 
the same distance from the listener, and the level is the same, too, to create 
the central image. But keeping the image stable (as it's touted) is 
accomplished by virtue of the L+R signal being slightly pushed ahead 
(time-wise) of the extreme L or R signals. This time offset is made possible by 
the slightly forward mic in the recording setup.
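
Generating such a demo file is easy enough; here's a sketch (Python with 
NumPy/SciPy) that delays one channel by 5 ms yet makes it 6 dB louder -- the 
exact numbers and the output filename are just examples, not from any 
reference:

    import numpy as np
    from scipy.io import wavfile

    fs = 44100
    rng = np.random.default_rng(0)
    n_burst = int(0.05 * fs)                         # 50 ms noise burst
    burst = rng.standard_normal(n_burst) * np.hanning(n_burst)

    delay_ms, boost_db = 5.0, 6.0                    # example values
    d = int(round(delay_ms * 1e-3 * fs))
    gain = 10 ** (boost_db / 20)

    pad = np.zeros(d)
    left = np.concatenate([burst, pad])              # leading (non-delayed) channel
    right = np.concatenate([pad, burst]) * gain      # delayed but louder channel

    stereo = np.stack([left, right], axis=1)
    stereo /= np.max(np.abs(stereo))                 # normalize to avoid clipping
    wavfile.write("precedence_demo.wav", fs, (stereo * 32767).astype(np.int16))

Played over two loudspeakers, the burst should still pull toward the leading 
channel despite its lower level.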
Now I'm curious to use two speakers to demonstrate the precedence effect, but 
using the L + R signal as the delayed signal (or vice versa) and seeing whether 
the sound still appears to originate from the non-delayed speaker or the 
phantom image. I know time differences are used in panning, but they're 
generally *weaker* than level differences. In comparison, the precedence effect 
isn't subtle, but it does wear off after the onset of a sound, and level 
becomes the dominant localization cue--at least that has been my experience. I 
haven't heard a single-source demo of the precedence effect that uses a phantom 
image as the delayed or non-delayed source--the sounds have always come from 
discrete speakers/locations.
Thanks again for help and link.
Best,
Eric C.



____
 From: Aaron Heller 
To: Eric Carmichel ; Surround Sound discussion group 
 
Sent: Wednesday, June 26, 2013 11:06 AM
Subject: Re: [Sursound] Giving Precedence to Ambisonics
 


Ron Streicher has written about using a Soundfield as the middle mic in a Decca 
tree

   http://www.wesdooley.com/pdf/Surround_Sound_Decca_Tree-urtext.pdf

and Tom Chen has a system he calls B+ Format, which augments first-order 
B-format from a Soundfield mic with a forward ORTF pair.   I've heard it on 
orchestral recordings at his studio in Stockton and it sharpens up the 
orchestra image nicely.

Aaron Heller (hel...@ai.sri.com)
Menlo Park, CA  US




[Sursound] Of stereo miking, Fourier analysis, and Ambisonics

2013-06-27 Thread Eric Carmichel
Many thanks to everyone for your responses and insights (re Giving Precedence 
to Ambisonics). I would like to comment on the following two responses:
1. from Jeff
**May I suggest “Demonstration of Stereo Microphone Techniques,” Performance 
Recordings #6 wherein 18 coincident, near-coincident and spaced omni (2 and 3 
mic) stereo techniques are compared via a line of loudspeakers mounted at equal 
intervals and spanning 10 1/2 feet left-to-right. Each loudspeaker was 2 inches 
in diameter and the center to center spacing was 9 inches. An electronically 
generated tick was switched to each loudspeaker in turn starting at the center 
and moving full right, full left and full right again before ending in the 
center. The pros and cons of each technique are unmistakable...
Jeff Silberman**
and
2. from Jörn
**hi jeff,
i think the test you're mentioning is not entirely fair, as much as i like 
coincident techniques.
such a setup tests for localisation only, and with wide-band transients it is 
quite clear that spaced techniques will lose, and their main advantage (better 
perceived spaciousness in stereo-only playback, and better LF response) is not 
even considered.
miking is a trade-off. testing individual aspects won't tell us much about 
actual musical use.
best,
jörn**

Eric C. responds

The array of 2-inch speakers is reminiscent of many psychoacoustical 
experiments I've participated in: more laboratory-like than musical. Note that 
the clicks run in a sequence rather than at random locations. Once we perceive 
a pattern (e.g. an L-to-R sequence), we begin to fill in the spaces based on 
patterns. At least that's my (intuitive) notion. Also, clicks are among the 
easiest sounds to localize. The broadband nature of the clicks provides 
multiple localization cues, including ILD, ITD, and (very importantly) pinna 
transfer cues. I have collected data from my personal lab to provide evidence 
of this latter claim, and I welcome everyone to review and scrutinize it. 
Listening tests were performed using 8 young, normal-hearing persons. A rather 
large (2.6 MB) Excel spreadsheet contains all the data. I designed this 
spreadsheet to provide descriptive statistics on the fly for any combination 
of listeners (e.g., group all female participants), stimuli, listening 
conditions (e.g., unoccluded), or azimuths/locations. You can download the 
Excel spreadsheet here (again, it's 2.6 MB):

www.cochlearconcepts.com/stats/hearing_data.xls 

A graphical representation of the results (using SPSS) is in the same folder, 
and you can see it here:

www.cochlearconcepts.com/stats/Figure_6_96dpi.jpg 

As can be seen (and heard!), the broadband stimuli are easy to localize when 
compared to tonal stimuli. When participants were wearing binaural stereo 
electronic earmuffs (net acoustic gain at the ear = 0 dB, carefully calibrated 
to match earcups), lateralization was accurate. But discerning front-back 
angles on the same side (L or R; e.g., 60 and 120 degrees) was nearly 
impossible. This 
demonstrates what happens when ILDs and ITDs are preserved, but pinna cues are 
lost. You can see spectral and time-domain analysis of the broadband stimuli 
here:

www.cochlearconcepts.com/stats/Figure_1_96dpi.jpg 

I have to agree with Jörn that the example miking demonstration isn't all that 
fair, and for another reason: How much low-frequency energy can a 2-inch 
speaker provide? Although the Fourier decomposition of a transient or *click* 
sound may suggest it's a broadband signal, I have reservations about Fast 
Fourier Transforms and clicks. My reservation is, in part, rooted in my own 
ignorance of math, but I’ll have to state that I don’t believe the ear works 
exactly as math would predict. Let me explain...
Fourier theory shows that an ideal impulse (Dirac delta function or Kronecker 
delta?) can be decomposed into sine waves, but these waves have to begin in the 
Paleozoic Era and end when travel to distant galaxies is a reality. There's 
something about the time domain aspect of *real-world* sounds that is missing. 
So how does the ear respond?
Imagine the inner ear is comprised of contiguous filters. Perhaps the ear's 
inner hair cells (IHCs) are akin to the reeds of a resonant reed frequency 
meter (which can have very fine frequency discrimination ability). Regardless, 
how would such filters respond to click sounds or similar transients? A click 
in itself doesn't last long, and the click's duration is certainly much shorter 
than even a fraction of a low-frequency sound's period. And yet, we rely on IRs 
and maximum length sequence (MLS) stimuli for response characteristics as well 
as hearing demonstrations. Back to 2-inch speakers...
At least the low mass and confined acoustic centers of tiny speakers make them 
ideal for transient bursts, but then what instrument aside from a woodblock 
does this represent? There always seem to be trade-offs when it comes to 
miking, and some of these have to do with mixing down to mono for non-stereo 
broadcast...
