On 2017-07-05, Aaron Heller wrote:
1. You should use a first-order decoder to play first-order sources. That's not the same as playing a first-order file into the first-order inputs of a third-order decoder.
What Aaron said. The optimum decoders at different orders aren't comparable to each other. You'd like to think so, but it unfortunately really isn't the case. It isn't even the case that you can pseudo-invert UHJ into pantophonic B-format without needing a separate decoder matrix.
The worst thing is that the lower your limiting system order, the more you're relying on the precise psychoacoustics of an optimal decoder. Straying from it by naïvely feeding first order B-format into a second order decoder will be *much* worse, relatively speaking, than feeding second order into a third order one. Not to mention something which came from UHJ.
2. 1st-order periphonic (3D) ambisonics on a full 3D loudspeaker array gets the energy correct, and hence the sense of envelopment; localization is not that precise. The magnitude of the energy localization vector, rE, in this situation is only sqrt(3)/3, which Gerzon noted is "perilously close to being unsatisfactory." [1]
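A quick numerical check of that sqrt(3)/3 figure, as a minimal sketch. It assumes an idealized cube of eight loudspeakers, a single plane-wave source, and first-order speaker gains of the form 1 + k*cos(angle) with k = sqrt(3), which is the max-rE weighting for first order in 3D. The function names and the cube layout are illustrative only, not taken from any particular decoder.

```python
import math

def cube_speakers():
    """Unit vectors towards the eight corners of a cube (assumed layout)."""
    s = 1.0 / math.sqrt(3.0)
    return [(x * s, y * s, z * s)
            for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]

def energy_vector(source, speakers, k):
    """rE for first-order gains g_i = 1 + k * (u_i . source).

    rE is the sum of g_i^2 * u_i divided by the sum of g_i^2, so any
    common gain normalization cancels out.
    """
    total = 0.0
    acc = [0.0, 0.0, 0.0]
    for u in speakers:
        g = 1.0 + k * sum(a * b for a, b in zip(u, source))
        e = g * g
        total += e
        for i in range(3):
            acc[i] += e * u[i]
    return tuple(a / total for a in acc)

def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(a * a for a in v))

# An arbitrary source direction (azimuth 30 degrees, elevation 20 degrees).
az, el = math.radians(30.0), math.radians(20.0)
source = (math.cos(az) * math.cos(el),
          math.sin(az) * math.cos(el),
          math.sin(el))

r_e = energy_vector(source, cube_speakers(), k=math.sqrt(3.0))
print("|rE| =", norm(r_e), " sqrt(3)/3 =", math.sqrt(3.0) / 3.0)
```

On this symmetric layout rE points along the source direction, and its length does not depend on where the source is panned.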
Feeding something like pantophony into a periphonic rig or vice versa is also suspect. Such setups lead to confusion between cylindrical and spherical harmonics, which means that the average intensity falloff gets mangled as well. While it remains sensible in angle, in radius around the sweet spot it doesn't. You can't rectify that problem even theoretically if you mix 2D and 3D ambisonic setups, with their topologically differing basis functions.
3. The decoders in the AmbiX plugins are single-band rE_max decoders; a dual-band decoder will improve localization for central listeners a bit. Both AmbDec and the FAUST decoders produced by the ADT (the ".dsp" files) support 2-band decoding.
...and as I said above, at low orders we're relying more on the optimum psychoacoustic decode. A single band rE_max just won't do there.
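To put rough numbers on the single band versus dual band point, here is a minimal sketch. It assumes a regular, inversion-symmetric 3D array (a cube, say), for which the first-order velocity and energy vector magnitudes have simple closed forms in terms of a1, the gain of the first-order components relative to W. The names are hypothetical and the closed forms hold only under that symmetry assumption.

```python
import math

def velocity_vector_mag(a1):
    """|rV| for a regular 3D array: equal to the first-order weight a1."""
    return a1

def energy_vector_mag(a1):
    """|rE| for a regular 3D array, with speaker gains 1 + 3*a1*cos(angle)."""
    k = 3.0 * a1
    return 2.0 * k / (3.0 + k * k)

# Assumed per-band weights: basic (velocity) decode at low frequencies,
# max-rE decode at high frequencies.
bands = {
    "LF, basic (a1 = 1)": 1.0,
    "HF, max-rE (a1 = 1/sqrt(3))": 1.0 / math.sqrt(3.0),
}

for name, a1 in bands.items():
    print(f"{name}: |rV| = {velocity_vector_mag(a1):.3f}, "
          f"|rE| = {energy_vector_mag(a1):.3f}")
```

Under these assumptions a single-band max-rE decode leaves |rV| at about 0.577 at low frequencies, where a dual-band decode would restore it to 1 for a central listener.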
4. If you really want more precise localization, consider parametric decoding using Harpex or the Harpex-based upmixer plugin from Blue Ripple Sound.
And even before going with something like Harpex, which is essentially a try at dual source active decoding, at least put in one of the newer, numerically optimized passive decoders, such as (was it?) Bruce Wiggins's Tabu search derived framework. Then after doing that and Harpex, try out something like DirAC, from the newer active decoder family.
In my experience, it works very well with panned sources and acoustic recordings in dry environments (outdoors, dry hall). For recordings in very reverberant halls (like my recordings), the improvement is not that great.
Harpex does a limited number of direct sources rather well and stably. DirAC on the other hand does a higher number of sources, combined with ambience separation and spatial whitening. The two approaches appear to be complementary, but as of yet, I've never seen anybody implement them in the same active decoder. Nor have I seen anybody really take heed of the older passive decoder ideas in combination with any active decoder concept.
As for what someone said down this thread about the optimum number of speakers in an old discussion... That one started out with, was it, Furse's or Leese's "Giant Geese": the undeniable percept in wide area reproduction that sound sources just sound *way* too big, even if well localizable within the first order framework.
Correct me if I'm wrong, because I don't think anybody's put all of the pieces together in any one post, but... I believe that, especially after the NFC-HOA work and the many listening tests on sparse first order reproduction arrays of various cardinalities, it all boiled down to a couple of points.
First, optimum ambisonic playback with any rig isn't just dependent on angle, but rig diameter as well. That's first seen in how near-field compensation makes the transmission format depend on intended diameter. It was presaged by the original distance compensation circuitry of Plain Old Ambisonic of the Gerzon vein, which is precisely the first order rig dependent part of NFC-HOA, just placed on the decoder side. Where it then can't fully compensate for...
...secondly, spatial aliasing caused by the sparseness of the rig. It can do so at a single sweet point at the center, for a dual band decoder. But over the whole audible band, the sweet spot is so small in the first order case that the compensation necessarily falls short even at inter-ear distances. Suddenly we start to hear combing from the several speakers out there, on the rim...
...leading to third, some psychoacoustics which we didn't really expect. We were always assuming that more speakers leading to a denser rig would just automatically make for a better sound stage, because it comes closer to the ideal holophonic limit. But that's not really true when we work so far away from the limit proper as we do with even a dozen or two dozen speaker array; there we easily perceive multipathing, and the degradation which comes with it. As it also happens, it seems that there, lower order multipathing, to most realistic degrees, somehow gets compensated by our hearing.
We don't have a nice theory of how precisely that happens, but we do seem to have plenty of evidence in both anechoic and more realistic conditions that something like that must be happening. For instance, it's already more or less an established fact that a four speaker *most* basic first order POA system sounds better than a regular hexagon, and over a wider area; the difference isn't too subtle either: under blind listening conditions even I, with my pronounced hearing deficit, could *instantly* pick up on it.
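The first two points can be given rough numbers. This is a sketch under stated assumptions, not a description of any particular decoder: it takes the first-order distance compensation to be a simple first-order high-pass with time constant r/c, and it uses the commonly quoted rule of thumb that an order-N reproduction is accurate roughly while k*r <= N. The speed of sound, the head radius and all function names are assumed values chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value
HEAD_RADIUS_M = 0.0875  # assumed, roughly half an inter-ear distance

def nfc_corner_hz(rig_radius_m, c=SPEED_OF_SOUND):
    """Corner frequency of the first-order distance compensation high-pass."""
    return c / (2.0 * math.pi * rig_radius_m)

def nfc_gain(freq_hz, rig_radius_m, c=SPEED_OF_SOUND):
    """Magnitude of j*w*tau / (1 + j*w*tau), with tau = r / c."""
    w_tau = 2.0 * math.pi * freq_hz * rig_radius_m / c
    return w_tau / math.sqrt(1.0 + w_tau * w_tau)

def accurate_up_to_hz(order, region_radius_m, c=SPEED_OF_SOUND):
    """Rule of thumb k*r <= order, solved for frequency."""
    return order * c / (2.0 * math.pi * region_radius_m)

# The compensation depends on rig size: bigger rigs push the corner down.
for radius in (1.0, 2.0, 5.0):
    print(f"rig radius {radius} m: corner at {nfc_corner_hz(radius):.1f} Hz")

# The accurately reconstructed region shrinks with frequency; at head size
# the first-order limit falls well inside the audible band.
for order in (1, 2, 3):
    limit = accurate_up_to_hz(order, HEAD_RADIUS_M)
    print(f"order {order}: head-sized region up to about {limit:.0f} Hz")
```

The point of the second loop is only the scaling: each extra order raises the frequency limit for a head-sized region proportionally, and at first order that limit sits in the hundreds of hertz.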
That is then perhaps the best reason to go with higher order systems if we at all can: even if they can't approach the holophonic bound in any practicable way, they do isolate crosstalk so that it leads to less combing with a given number of speakers, so that multipathing doesn't lead to such prominent spectral lobing. And even if it does lead to time domain anomalies, they too will be closer to something our extant temporal (pre)masking machinery can handle.
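A toy model of the combing argument, as a sketch only. It assumes just two coherent arrivals at the listener, the second one delayed by a path length difference and scaled by a relative gain g that stands in for speaker crosstalk. The g values below are hypothetical, picked to show the trend, and are not measurements of any real decoder.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def comb_magnitude(freq_hz, path_diff_m, g=1.0, c=SPEED_OF_SOUND):
    """|1 + g * exp(-j*2*pi*f*tau)| for two coherent arrivals, tau = d / c."""
    tau = path_diff_m / c
    return abs(1.0 + g * cmath.exp(-2j * math.pi * freq_hz * tau))

def first_notch_hz(path_diff_m, c=SPEED_OF_SOUND):
    """Lowest notch frequency: half a wavelength of path difference."""
    return c / (2.0 * path_diff_m)

def ripple_db(g):
    """Peak-to-notch ripple in dB for relative crosstalk gain 0 <= g < 1."""
    return 20.0 * math.log10((1.0 + g) / (1.0 - g))

path_diff = 0.343  # metres, i.e. a 1 ms delay at the assumed speed of sound
print(f"first notch at {first_notch_hz(path_diff):.0f} Hz")

# Less crosstalk between speakers means shallower comb ripple.
for g in (0.9, 0.6, 0.3):
    print(f"crosstalk gain {g}: ripple {ripple_db(g):.1f} dB")
```

In this picture a higher order decode does not remove the second arrival, it just lowers g for a given number of speakers, which is what flattens the spectral lobing.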
Finally, once again, that's just my synthesis of a bunch of vague memories and my own thinking. Various people on-list more knowledgeable and more up to date might disagree. But in any case, these questions *have* been raised before, and on various occasions been discussed at length. Hopefully my ideas above can at least serve as pointers to what is already in the list archive. :)
-- Sampo Syreeni, aka decoy - de...@iki.fi, http://decoy.iki.fi/front +358-40-3255353, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2