On Tue, Jun 10, 2014 at 11:49:13AM +0100, Curtis Alcock wrote:
> Thanks for your replies, Fons.
> 
> > If your studio is to be used for hearing research you should 
> > probably ask yourself if you want this sort of processing - it
> > sort of 'interpretes' the spatial information in a way that
> > is not at all related to how our brains do it.
> 
> Do you think it depends on the type of research? The purpose of
> the room is not to simulate how the brain works, but to simulate
> an auditory scene – in a way that the brain can then use "as if
> it were real".

The result (of processing such as done by Harpex) will be a 'correct'
sound field, one that could exist as the result of having real sound
sources in the room. It does not present 'false information'. But the
question is whether such a sound field is representative of real daily
life ones. These do not have the limitation of at most two directions
per frequency band per time slice. Imagine a typical space, which will
produce many early reflections and some amount of reverb.
How much this matters in your field of research I really don't know.
 
> Leading on from this question, is it even (practically) possible to
> simulate a sound field that uses processing that is related to how
> our brains do it? If so, what type of processing should I be looking at?

The purpose of the system (as far as I can see it) is to create or
reproduce the sound field, leaving the interpretation to the listener.
So there should not be any processing that tries to mimic psycho-
acoustic processes, e.g. by deciding what is important (perceptible)
or not. 

You could compare this to lossy encoding (e.g. mp3). Such algorithms
will remove the things that we won't hear and reduce the information
rate that way. What they do is based on psycho-acoustic criteria - 
critical bands and masking. Which means you shouldn't use mp3 encoded
signals if you're doing psycho-acoustic research on critical bands
and masking. 

> My main purpose is to see how the combination of a person + hearing
> technology + sound scene integrates in order to "accurately" (i.e.
> "results in the studio are predictive of performance in real life".)
> assess the combined performance in a way that is repeatable across
> people and technology, then use that information to adjust the parameters
> on the technology. As a lot of this technology is now making decisions
> based on spatial information (I'm not sure about distance, but certainly
> direction), it is important to surround a person (and the technology) with
> sound that is "close enough" to real life.

TOA will produce a fairly accurate replica of real-life sound fields.

> Also, would there be enough spatial (even if it's only "interpreted")
> information in a TOA set up to convince the hearing technology to change
> its directional microphone polar plot to reduce the loudest noise source?

For normal TOA this will be the case. I can't really confirm whether
that is still true for 'upsampled' TOA. If the loudest noise source is
a discrete one (a single direction, or at most two) things will work.
Otherwise an upsampled TOA may not reproduce it correctly (even if it
may sound OK - that is an entirely different matter).
 
> Michael mentioned possibly using "virtual microphones" for a localization
> test, rather than discrete speakers. Do you think such a test would be
> repeatable across different people using TOA, or from what you understand
> about brain vs AMB, would it be too variable or open to interpretation?

I don't understand what is meant by 'virtual microphones instead of speakers'.
The outputs of an AMB decoder can be interpreted as coming from a 'virtual
microphone' having some polar pattern, but I don't see the relation.

> >> Is it possible to record directly in TOA?
> > 
> > Normally HOA is recorded by panning individual sources and adding
> > AMB encoded reverb or room acoustics.
> 
> Does panning individual sources mean moving the microphone (in which
> case, you would be losing the transient nature of sound)? Or does it
> mean recording from several spots simultaneously (e.g. triangulating)?

Neither. It's the same process as multitrack recording for stereo or 5.1.
You start with individual (mono) sources, each of them is sent through
an AMB panner and the outputs of those are summed on an AMB mixing bus.
The panner just distributes its input signal over all the channels of
the AMB bus in the right proportions that represent any particular 
direction (just as a stereo panner distributes the signal between L and R).
Each mono signal is also sent (with controllable gain and delay), to a
processor that adds room acoustics and reverb. The amplitude ratio of
direct sound and reverb, and their relative delay determine perceived
distance. 
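To make the panner step concrete, here is a minimal Python sketch. It is
first order only and assumes ACN channel order with SN3D normalisation
(both are my assumptions, not something from the discussion above); a
real TOA panner would compute all 16 third-order spherical harmonics in
exactly the same way.

```python
import math

def foa_pan_gains(azimuth_deg, elevation_deg):
    """First-order Ambisonic panning gains (ACN order, SN3D).

    Returns the per-channel gains [W, Y, Z, X] that place a mono
    source at the given direction on the AMB bus.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = 1.0                               # ACN 0: omni component
    y = math.sin(az) * math.cos(el)       # ACN 1
    z = math.sin(el)                      # ACN 2
    x = math.cos(az) * math.cos(el)       # ACN 3
    return [w, y, z, x]

def pan(mono_samples, azimuth_deg, elevation_deg):
    """Distribute a mono signal over the four AMB bus channels,
    i.e. multiply the input by each channel's gain."""
    gains = foa_pan_gains(azimuth_deg, elevation_deg)
    return [[g * s for s in mono_samples] for g in gains]
```

A source panned straight ahead (azimuth 0, elevation 0) ends up only in
W and X; one panned to the left (azimuth +90) only in W and Y - which is
the "right proportions for a particular direction" idea above, just as a
stereo panner splits a signal between L and R.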

> > You need more than 16 speakers for full 3D third order. Good layouts
> > (without preferred directions or gaps) have 20-25 speakers.
> 
> Does that mean constructing a dodecahedron with speakers positioned
> on the nodes?

Which would be an icosahedron. This is not a very practical layout.
A good one for full 3D 3rd order is 

* One speaker at the bottom (optional, requires a raised listening position)
* A ring of six at elevation -45 degrees
* A ring of eight at elevation zero
* A ring of six at elevation +45 degrees
* One speaker at the zenith (again optional)

which means 20, 21 or 22 speakers. See for example
<http://www.conservatoriorossini.it/conservatorio/strutture_servizi/space.aspx>
which uses this layout (with some compromises on the negative elevation
ring and without the bottom speaker). This is the best sounding AMB
room I know of (also because of the excellent acoustic treatment).
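For reference, the ring layout above can be written down as a list of
directions. The sketch below generates one possible set of
azimuth/elevation pairs; even spacing within each ring and the relative
rotation of the rings are my assumptions - real installations often
rotate adjacent rings against each other to avoid stacked speakers.

```python
def layout_positions():
    """Return (azimuth_deg, elevation_deg) pairs for the 22-speaker
    layout: nadir, ring of 6 at -45, ring of 8 at 0, ring of 6 at +45,
    zenith.  Drop the first and/or last entry for the 20/21 variants."""
    pos = [(0.0, -90.0)]                           # bottom speaker (optional)
    pos += [(i * 60.0, -45.0) for i in range(6)]   # lower ring
    pos += [(i * 45.0,   0.0) for i in range(8)]   # horizontal ring
    pos += [(i * 60.0,  45.0) for i in range(6)]   # upper ring
    pos.append((0.0, 90.0))                        # zenith (optional)
    return pos
```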

> Is there a minimum room area size?

Speakers should not be too close, something like 1.5m is the minimum
distance (and this will require a decoder with near-field compensation).

> > There are a number of algorithms that can produce
> > reasonable results, but none of them produce optimal decoders.
> 
> Does hand-optimised decoding produce better results?

It can, if done by someone skilled in the art and using the right
tools. Such a person would also be able to tweak the automated
methods so they will produce better results than when using their
default settings. But doing such things requires experience and a
good understanding of the underlying theory. The first and most
important thing is to choose a good speaker layout.

Ciao,

-- 
FA

A world of exhaustive, reliable metadata would be an utopia.
It's also a pipe-dream, founded on self-delusion, nerd hubris
and hysterically inflated market opportunities. (Cory Doctorow)

_______________________________________________
Sursound mailing list
Sursound@music.vt.edu
https://mail.music.vt.edu/mailman/listinfo/sursound