Re: [Sursound] Two new approaches for the distribution of surround sound/3D audio

Stefan Schreiber Sun, 08 Sep 2013 18:15:32 -0700

Dear colleagues...

To continue the proposal to use < certain > forms of .AMB as a real-world format for the transport/storage of 3D audio (including music recordings), I would like to point out some further important issues involved.

A full .AMB decoder would have to be able to decode the nine different combinations of .AMB to different (standard?) loudspeaker configurations, and also to headphones. (The latter would be an important point in my "requirement list".) This means there will be plenty of combinations, and some great opportunities to mess things up if anybody wants to implement the 9*9 "or so" combinations ... :-)

It would be advantageous if we were able to limit .AMB to some < CE profile > with far fewer combinations!

(Covering just FOA won't be enough. We know that FOA has certain limitations and won't be good enough for all applications. Just think of the sweet spot issues.)

My impression is that you would have to use < at least > 3rd order to overcome many/most of the typical FOA problems.


Some advantages of TOA, compared to FOA:

- much larger sweet spot (not only support for individual listeners; IMO this is very important, as I would like to be able to demonstrate some wonderful recordings to "at least one" friend, even better to "some friends". If you don't have friends, don't bother... :-D )

- angular resolution significantly improved, compared to FOA (improvement by more than a factor of 2)


- improved performance at higher frequencies

- we know that FOA has certain problems presenting sound "from the sides", even if the playback rig includes loudspeakers at directly lateral positions.


http://www.acoustics.hut.fi/~ville/papers/pulkkiicmc2001.pdf

- Decoding TOA to (ITU) 5.1/7.1 will show much better results than FOA to 5.1/7.1. (Comments? I know that 5.1 is an underspecified irregular array from an Ambisonics TOA perspective, but you can decode this and the results will be better than in the FOA to 5.1 case...)



Altogether, a practical "CE" format based on Ambisonics and .AMB could be introduced in the following, simplified form:

I) FOA/UHJ (3-4 UHJ channels, the proposed form of FOA that is backward-compatible with stereo...)


3/4 channels

(Classical decoders and other decoders, supposed to improve on classical ones...)

II) 3rd order horizontal-only and 3h/1p, which you could combine to just 3h1p (1st order vertical)


7/8 channels, or just 8 channels

(Might still be offered in some "UHJ", stereo-compatible fashion; "UHJ" for 2nd/3rd order doesn't exist yet, but it can be done.)

III) 3h/3p, 16 channels (call this the .AMB master format? Anyway, this is the upper end of FuMa .AMB...)
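
The channel counts in this list follow from the usual counting rule for mixed-order sets: a full-sphere part of order p contributes (p+1)^2 components, and every horizontal order above p adds two more. The following is only a minimal sketch under that assumption; the function names are made up for illustration and are not taken from any .AMB tool. It also enumerates the nine h/p combinations (horizontal order 1 to 3, height order 0 up to the horizontal order).

```python
def amb_channel_count(h: int, p: int) -> int:
    """Channel count for horizontal order h and height order p (0 <= p <= h).

    Assumed rule: (p + 1)^2 full-sphere components plus two extra
    horizontal components for each horizontal order above p.
    """
    if not 0 <= p <= h:
        raise ValueError("need 0 <= p <= h")
    return (p + 1) ** 2 + 2 * (h - p)


def amb_combinations(max_order: int = 3):
    """All (h, p) pairs with 1 <= h <= max_order and 0 <= p <= h."""
    return [(h, p) for h in range(1, max_order + 1) for p in range(h + 1)]


if __name__ == "__main__":
    for h, p in amb_combinations():
        print(f"{h}h/{p}p: {amb_channel_count(h, p)} channels")
```

With this rule, 3h/0p gives 7 channels, 3h/1p gives 8, and 3h/3p gives 16, matching the numbers in the list.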


----------------------

2nd order Ambisonics possibly doesn't offer enough improvement over FOA, so it might be dropped from a CE format - for the sake of simplification.

Do you think that 3rd order Ambisonics would be strong enough for the distribution of real recordings? (This is the decisive point, because if the answer is "yes" you could convince some people. My personal impression is yes, as I know that TOA is successfully applied in some to many real-world installations, in live concerts etc. On the other hand, it would be nice to hear the feedback of people actually working with TOA.)

I am aware that there is no microphone for 2nd or 3rd order Ambisonics, apart maybe from some experimental designs. If you would like to use TOA as a master/distribution/storage format for music, there < should > be some < TOA/AMB microphone > available. But certainly somebody could design one? I believe there could already be some market for an < AMB microphone >. (The Eigenmike doesn't count here. I don't think this should be seen as a microphone designed for real-world music recordings. S/N problems, and many other issues... It has been designed by and for geeks, or should I apologize for this comment anyway? =-O )


Mixing of TOA is already possible, here and today.

Best,

Stefan

P.S.: Note also in this context that MPEG is on track to finalize its MPEG-H 3D Audio framework by the beginning of 2015. (The basic < CO decoder > has already been chosen, it seems.)

MPEG 3D audio is technically cinema surround with height (22.2 style), so there is some basic difference if compared to Ambisonics. (Which has less company support, but offers full-spherical 3D audio even in its classical 4-channel form. ;-) )

UHJ (surround/3D audio) as extension of stereo based files
(distribution via Internet, on discs and streaming, including
YouTube, Spotify etc.)
I like the potential of this idea very much; but it can only move
forward with the free availability of freely available encoders and
decoders for 2, 3 and 4-channel UHJ, in both standalone and plugin
formats.  Seeing as how mere 2-channel versions have signally failed to
become available at all, I wonder what chance there is.
I had hoped that somebody else would state the obvious, in the end I have to do this myself... :-)
While I would understand the above argument IF UHJ were an area of its own, my proposal actually implied that you would use (in the end) a B format decoder.
You would < additionally > need a UHJ channel extractor (which works on the AAC file/ .M4P/Ogg etc. < input > ), and secondly the UHJ to B format "translator". (The latter is just the application of some formulas which might not be trivial but are known and/or can be deduced. From an IT perspective, this is very little program code. You just have to apply known formulas. This step also doesn't depend much on the specific programming language which is used. Mathematics stays mathematics, and the "language" of mathematical formulas is older than programming languages - which explains why formulas look more or less the same in any programming language - well, if I/you exclude Forth and other exotics.... :-D )
I would call the two additional steps the < UHJ front end > for some B format decoder.
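
To illustrate how little code the "translator" step needs, here is a hedged sketch of the UHJ <-> B format formulas in Python. The coefficients are the commonly published UHJ encode/decode values as I remember them and should be checked against an authoritative source before use. The function names are made up, and the placement of the factor 1/2 between encoder and decoder is my own choice so that a round trip comes out consistent. The operator j is a +90 degree phase shift; the sketch works on single-frequency complex phasors, where j is simply multiplication by 1j. Real audio would need a wideband 90 degree phase shifter instead.

```python
import math

J = 1j  # +90 degree phase shift, valid for single-frequency phasors only


def b_to_uhj(w, x, y, z=0.0):
    """Encode B format (W, X, Y, Z) to UHJ (Left, Right, T, Q)."""
    s = 0.9396926 * w + 0.1855740 * x
    d = J * (-0.3420201 * w + 0.5098604 * x) + 0.6554516 * y
    t = J * (-0.1432 * w + 0.6512 * x) - 0.7071068 * y
    q = 0.9772 * z
    # Scaling assumption: Left/Right carry S +/- D, the decoder halves them.
    return s + d, s - d, t, q


def uhj_to_b(left, right, t=None, q=0.0):
    """Decode UHJ to B format; t=None selects the 2-channel formulas."""
    s = (left + right) / 2
    d = (left - right) / 2
    if t is None:
        # 2-channel UHJ: horizontal only, and only an approximation
        w = 0.982 * s + J * 0.164 * d
        x = 0.419 * s - J * 0.828 * d
        y = 0.763 * d + J * 0.385 * s
        return w, x, y, 0.0
    k = 0.828 * d + 0.768 * t
    w = 0.982 * s + J * 0.197 * k
    x = 0.419 * s - J * k
    y = 0.796 * d - 0.676 * t + J * 0.187 * s
    z = 1.023 * q
    return w, x, y, z


if __name__ == "__main__":
    az = math.radians(40.0)
    b_in = (1 / math.sqrt(2), math.cos(az), math.sin(az), 0.3)
    b_out = uhj_to_b(*b_to_uhj(*b_in))
    for name, a, b in zip("WXYZ", b_in, b_out):
        print(name, round(a, 4), complex(b))
```

With the T (and Q) channels present, the decoded W, X, Y (and Z) agree with the input to within rounding of the published coefficients, which is the sense in which 3/4-channel UHJ is just another way of carrying B format.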
I know that a lot more work would have to be done to publish B format programs/plugins/mobile apps etc., and to describe B decoder design. Specifically, I believe that B decoders nowadays < should > be able to support output via headphones and binaural techniques. Section III of my 1st posting suggests that head-tracking hardware is both available and cheap enough to be applied in real-world products, including future < surround capable HT headphones >. I mentioned the specific hardware used in the Oculus Rift VR headset, just to give an example of an existing HT chip. (There is plenty of other hardware around.)
It might help to set up some open group, which would promote the use and design of B format (HOA? Section II...) decoders: describing the theory behind them, offering (open sourced) program code, distributing free solutions etc. (Setting up a working "open" group requires some organisational skills, but it can be done.)
Again, the real problem seems to be the lack of available B format decoders. (My proposal is to transport "B format over stereo", in simple terms. If so, it is again obvious that you should see the use of UHJ extension channels just as a front end for B format, because this is the format which has to be decoded.)
I believe that "you" should promote the fact that B format is a real 3D audio format, using just 4 channels. This is obviously an intriguing fact. (Note that the spatial "3D" resolution of full FOA is actually the same as the spatial 2D resolution of XYW, because Ambisonics is isotropic.)
IMO, 2-channel UHJ is something from the past. Don't use this if you could distribute the real thing?! Which means B format, not a reduced form of B format. The use of 3/4 channel UHJ (maybe more channels for higher orders) was suggested to stay compatible with 2-channel audio/stereo files and streams. It has been shown that existing file/container formats would allow the transport of < UHJ <----> B format > over stereo, via at least two different extension techniques. (File extensions, extensions in current container formats)
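
To make the "just 4 channels" point concrete, here is a small sketch of how a single mono source is encoded into first-order B format. It assumes the traditional convention (W scaled by 1/sqrt(2), azimuth counter-clockwise from the front, elevation upwards); the function name is only illustrative.

```python
import math


def encode_foa(sample: float, az_deg: float, el_deg: float):
    """Encode one mono sample into traditional B format (W, X, Y, Z)."""
    az = math.radians(az_deg)
    el = math.radians(el_deg)
    w = sample / math.sqrt(2)                    # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)     # front-back
    y = sample * math.sin(az) * math.cos(el)     # left-right
    z = sample * math.sin(el)                    # up-down
    return w, x, y, z


if __name__ == "__main__":
    for az, el in [(0, 0), (90, 0), (30, 20), (0, 90)]:
        w, x, y, z = encode_foa(1.0, az, el)
        # The directional part always has the same length, whatever the direction.
        print(az, el, round(x * x + y * y + z * z, 6))
```

The directional components X, Y, Z form a vector of the same length for every source direction, which is one simple way to see that no direction is treated preferentially.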
Best regards,

Stefan

P.S.: MPEG Surround is also a decoder based design. (MPS encoder/decoder)
The same is valid for the future (MPEG) 3D audio codec, currently in development. I know that they take the topic "binaural output via headphones" very seriously, you just have to look into their CfP and similar documents...
P.S. 2:
Like everyone else I wish I had the time myself; but when factoring in
the need to learn about DSP programming and modern programming
languages, other commitments, and the slowing down of age...

Paul
No single person could do all the programming, at least not anymore. There are just too many different platforms around....
Nevertheless, B format decoders/apps will be written if Ambisonics is seen as a format which is worth implementing. (Or if there is enough music in this format around.) In this sense, I would look to the applications/aspects which are "beyond" what is offered by the 5.1 ITU layout. (IMO Ambisonics starts to shine if you factor in the inherent capability to record/encode < full-sphere > 3D audio. And because you really could not expect that available 3D audio loudspeaker layouts would look about the same everywhere, the Ambisonics decoder can be seen as a necessary interface to real-world loudspeaker configurations, or to headphones. 2nd advantage... More arguments?!)
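
As an illustration of the decoder acting as the interface to a given loudspeaker layout, here is a very simple horizontal decode of W, X, Y to a regular ring of loudspeakers. This is only a sketch: the gain law (sqrt(2)*W plus twice the projected X/Y, divided by the number of speakers) is one basic projection-style choice that I am assuming here, the function name is made up, and practical decoders add shelf filters, distance compensation and proper handling of irregular layouts such as 5.1.

```python
import math


def decode_foa_ring(w, x, y, n_speakers=4, first_az_deg=45.0):
    """Decode horizontal B format to a regular ring of n_speakers.

    Speakers are assumed equally spaced, starting at first_az_deg and
    going counter-clockwise. Returns one feed value per speaker.
    """
    feeds = []
    for i in range(n_speakers):
        az = math.radians(first_az_deg + 360.0 * i / n_speakers)
        projected = x * math.cos(az) + y * math.sin(az)
        feeds.append((math.sqrt(2) * w + 2 * projected) / n_speakers)
    return feeds


if __name__ == "__main__":
    # Unit source at 45 degrees, encoded with W = 1/sqrt(2)
    src = math.radians(45.0)
    feeds = decode_foa_ring(1 / math.sqrt(2), math.cos(src), math.sin(src))
    print([round(f, 3) for f in feeds], "sum =", round(sum(feeds), 3))
```

The same B format signal can be handed to a ring of 4, 6 or 8 speakers by changing only the decoder parameters, which is the point of seeing the decoder as the interface to the actual playback rig.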
_______________________________________________
Sursound mailing list
[email protected]
https://mail.music.vt.edu/mailman/listinfo/sursound

