On 2014-11-12, Adam Somers wrote:

> VR video (or as we call it, Cinematic VR) is in some ways the perfect use-case for ambisonics. This year we've created hundreds, if not thousands, of b-format recordings with accompanying 360º 3D video.

It really is. Because of the most basic, old-fashioned ambisonic principle: fully uncompromised isotropy in encoding. (Note, I'm not saying anything about decoding.) In that respect the technology fits *abominably* well with stereoscopy, and especially with looking around from a fixed viewpoint, on the optical side.
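To make that concrete, here's a minimal sketch in Python/numpy of first-order B-format encoding of a plane-wave source (FuMa-style, W at -3 dB); the function name is mine, the formulas are the textbook ones:

    import numpy as np

    def encode_bformat(signal, azimuth, elevation):
        # Classic first-order B-format panning of a plane-wave source.
        # W is omnidirectional pressure at -3 dB (FuMa convention);
        # X, Y, Z are the three orthogonal figure-of-eight components.
        # No direction is privileged anywhere: encoding is fully isotropic.
        signal = np.asarray(signal, float)
        w = signal / np.sqrt(2.0)
        x = signal * np.cos(azimuth) * np.cos(elevation)
        y = signal * np.sin(azimuth) * np.cos(elevation)
        z = signal * np.sin(elevation)
        return np.stack([w, x, y, z])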

Then what ambisonics might *not* be so good at is virtual environments where you move about. That's because of the centred, angularly parametrized framework, which pretty much only lends itself to a fixed "view"point into the acoustic field.
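The flip side is that a fixed viewpoint is exactly what makes head-tracked *rotation* nearly free: rotating the whole field is just a static matrix on the component signals, while translation has no such operator. A sketch (yaw only, first order, naming again mine):

    import numpy as np

    def rotate_yaw(b, angle):
        # Rotate the entire soundfield by `angle` radians about the
        # vertical axis: only X and Y mix, W and Z are invariant.
        # Equivalent to counter-rotating the listener's head, which is
        # all a fixed viewpoint ever asks for. No analogous matrix
        # exists for moving the viewpoint itself.
        w, x, y, z = b
        c, s = np.cos(angle), np.sin(angle)
        return np.stack([w, c * x - s * y, s * x + c * y, z])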

You can then make it work in synthetic and even recorded-recreated acoustic environments. But not by direct recording and playback. You have to do something extra in between. You have to somehow abstract away from your B-format recording, so that the auditory cues still match: reverberation falloff, the auditory parallax of close sources as you move, the mutual, directional correlation of the stuff you perceive as being part of the space and the envelopment.
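In a *synthetic* scene that extra something is trivial, because you still hold the per-source handles. A hypothetical sketch (my own names, crude 1/r law, free field, reusing encode_bformat from above):

    import numpy as np

    def reencode_after_move(signal, src_pos, listener_pos, ref_dist=1.0):
        # With a known point source you just recompute direction,
        # distance and gain after the listener moves, then encode again.
        # A B-format *recording* offers no such per-source handle: the
        # sources are already mixed down into four signals, which is
        # precisely why the abstraction step is needed.
        rel = np.asarray(src_pos, float) - np.asarray(listener_pos, float)
        dist = np.linalg.norm(rel)
        azimuth = np.arctan2(rel[1], rel[0])
        elevation = np.arcsin(rel[2] / dist)
        gain = ref_dist / max(dist, 1e-3)      # crude distance falloff
        return encode_bformat(gain * np.asarray(signal, float),
                              azimuth, elevation)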

Not even Ville's and Archontis's work can do anything of the sort yet. But still, given how nice their demos sound and how they also do that same kind of abstraction on the way, I'd say they are at the forefront of the stuff which could eventually become a Gaming/Movie/Department-store miracle-development.

> Still, I've yet to find a solution for b-to-binaural which is as convincing as some of the BRIR-based object-sound spatialization packages (e.g. DTS Headphone:X and VisiSonics RealSpace).

I believe I know where the problem is, or at least I believe I can participate meaningfully in a process which leads to a nigh-optimal solution.

And in this one, I do mean it, for real. I have some real ideas here, with my only problem being that I'm lazy, poor, already well underway into hard alcoholism...and short of hands who'd take my ideas seriously. The spherical surface harmonic kinds of ideas.

Just give me the usual starved for life and scholarship doctorand, even on-list, and I'll tell you how b-to-binaural is done. If not as a final solution, then as a bunch of processes and guidelines. A la Gerzon Himself. :D

> I think what's primarily lacking is externalization, which perhaps can be 'faked' with BRIRs.

Ville Pulkki's work with DirAC, and his and his workgroup's two demonstrations, have me convinced that even fourth-order ambisonics leaves too much artificial correlation in the soundfield, at the scale of a human head, to sound natural. That then also means that you can't just naïvely, linearly, statically matrix down from any extant order of (periphonic or otherwise) ambisonics to binaural, even with full head tracking, and expect it to sound as good as the best object-panning formats.
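For reference, that naive static chain is nothing more than a fixed decode to virtual loudspeakers plus per-speaker convolution. A sketch, assuming first order, a horizontal ring, and hypothetical hrirs data of shape (n_speakers, 2, taps); swapping the anechoic HRIRs for measured BRIRs is the usual trick for buying back some of the externalization mentioned above:

    import numpy as np
    from scipy.signal import fftconvolve

    def naive_binaural(b, speaker_azimuths, hrirs):
        # The linear, static matrix-down argued against above: decode
        # the B-format to a ring of virtual loudspeakers with a basic
        # sampling decoder, convolve each feed with that direction's
        # left/right impulse response, and sum the ear signals.
        w, x, y, _ = b                  # Z unused on a horizontal ring
        n = len(speaker_azimuths)
        out = None
        for az, hrir in zip(speaker_azimuths, hrirs):
            feed = (w * np.sqrt(2.0) + x * np.cos(az) + y * np.sin(az)) / n
            ears = np.stack([fftconvolve(feed, hrir[0]),
                             fftconvolve(feed, hrir[1])])
            out = ears if out is None else out + ears
        return out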

So you need some active decoding magic in between. The Gerzon-era problem with active decoding actually was that it was being used for the wrong purposes, too aggressively, and at such a low level of electronic sophistication that it just couldn't work. Gerzon never touched Mathematica or MATLAB either, so much of his analysis was that of an engineer, and not of an all-knowing AI-mathematician. Now we can do a bit better in all those regards. We even already have things like Bruce's Tabu search results, well entrenched in the open literature; something the old days could not have dared to dream about.

So, why *not* go with active decoding once again? It's no blasphemy if all of the original counterarguments have been answered, and we have psychoacoustic data (and in my case stupendously obvious anecdotes) to show we can just do better with active processing. I say there's no reason not to.

I.e. let's not talk about faking at all, anywhere. Let's only talk about capturing the simplest, most processable auditory data on the scene, and then about how to make the best of that captured data when replaying it. Via any means at all, aiming at transparency.

That's then why I'm once again such a fanboy of the Aalto people's work: even the ideal, synthetic 4th-order playback sounded like shit compared to the reference. After the newfangled DirAC processing, lo and behold, it came pretty close to the original. So if you think -- like I do -- of the outcomes, you too would have to beg for Eigenmikes, Aalto's software for them, and then home AV gear cognizant of decorrelation, of B-format-minded sound intensity analysis, and of that newfangled variety of infinite-order decoder we call Higher Order DirAC.
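For flavour, the core of that intensity analysis fits in a few lines. A minimal, broadband sketch of first-order DirAC-style parameter estimation (the real thing works per frequency band with temporal smoothing; physical constants are dropped and the names are mine):

    import numpy as np

    def dirac_parameters(b):
        # Estimate direction of arrival and diffuseness from first-order
        # B-format, DirAC style. With B-format sign conventions the
        # pseudo-intensity W*(X,Y,Z) points toward the source, and
        # diffuseness is one minus net intensity over energy density:
        # 0 for a single plane wave, approaching 1 in a diffuse field.
        w, x, y, z = b
        p = w * np.sqrt(2.0)               # pressure (undo -3 dB on W)
        intensity = np.stack([p * x, p * y, p * z]).mean(axis=1)
        energy = (0.5 * (p**2 + x**2 + y**2 + z**2)).mean()
        doa = intensity / (np.linalg.norm(intensity) + 1e-12)
        diffuseness = 1.0 - np.linalg.norm(intensity) / (energy + 1e-12)
        return doa, diffuseness

Feed a plane wave from encode_bformat above through it and you get the panning direction back with diffuseness near zero; the synthesis half of DirAC then re-pans the direct part and decorrelates the diffuse part.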

Let me coin a shorthand: as opposed to HOA, it's HODAC.
--
Sampo Syreeni, aka decoy - de...@iki.fi, http://decoy.iki.fi/front
+358-40-3255353, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2