Dear colleagues...

To continue the proposal to use < certain > forms of .AMB as a real-world format for the transport/storage of 3D audio (including music recordings), I would like to point out some further important issues.

A full .AMB decoder would have to be able to decode the nine different .AMB order combinations to different (standard?) loudspeaker configurations, and also to headphones. (The latter would be an important point on my "requirement list".) This means there will be plenty of combinations, and some great opportunities to mess things up if anybody wants to implement the 9*9 "or so" combinations ... :-)

It would be advantageous if we could limit .AMB to some < CE profile > with far fewer combinations!

(Covering just FOA won't be enough. We know that FOA has certain limitations and won't be good enough for all applications. Just think of the sweet spot issues.)

My impression is that you would have to use < at least > 3rd order to overcome many/most of the typical FOA problems.

Some advantages of TOA, compared to FOA:

- much larger sweet spot (not only support for individual listeners; IMO this is very important, as I would like to be able to demonstrate some wonderful recordings to "at least one" friend, even better to "some friends". If you don't have friends, don't bother... :-D )

- angular resolution significantly improved, compared to FOA (by more than a factor of 2)

- improved performance at higher frequencies

- we know that FOA has certain problems presenting sound "from the sides", even if the playback rig includes loudspeakers at direct lateral positions.

http://www.acoustics.hut.fi/~ville/papers/pulkkiicmc2001.pdf

- Decoding TOA to (ITU) 5.1/7.1 will show much better results than FOA to 5.1/7.1. (Comments? I know that 5.1 is an underspecified irregular array from an Ambisonics TOA perspective, but you can decode this and the results will be better than in the FOA to 5.1 case...)

- Improved behaviour at higher frequencies


Altogether, a practical "CE" format based on Ambisonics and .AMB could be introduced in the following, simplified form:

I) FOA/ UHJ (3-4 UHJ channels, the proposed backward-compatible form to stereo of FOA...)

3/4 channels

(Classical decoders and other decoders, supposed to improve on classical ones...)

II) 3rd order horizontal-only and 3h/1p, which you could combine to just 3h1p (1st order vertical)

7/8 channels,  or 8 channels

(Might still be offered in some "UHJ", stereo-compatible fashion; "UHJ" for 2nd/3rd order doesn't exist yet, but it can be done.)

III) 3h/3p, 16 channels (call this the .AMB master format? Anyway, this is the upper end of FuMa .AMB...)
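
As a small aside, the channel counts quoted for profiles I) to III) can be reproduced with a few lines of Python. This is only a sketch: it assumes the usual "#h/#p" mixed-order scheme (horizontal order h, periphonic order p), and the function name amb_channels is made up for this illustration.

```python
def amb_channels(h: int, p: int) -> int:
    """Channel count for a mixed-order '#h/#p' signal set.

    h = horizontal order, p = periphonic (height) order, with p <= h.
    Hypothetical helper for illustration only.
    """
    if not 0 <= p <= h:
        raise ValueError("expected 0 <= p <= h")
    # (p + 1)^2 full-sphere components up to order p, plus two extra
    # horizontal components for every order above p.
    return (p + 1) ** 2 + 2 * (h - p)


# The nine order combinations up to 3rd order (h = 1..3, p = 0..h).
combos = [(h, p) for h in range(1, 4) for p in range(0, h + 1)]

for h, p in combos:
    print(f"{h}h/{p}p: {amb_channels(h, p)} channels")
```

Under this assumption the nine combinations all have different channel counts, and the three proposed profiles would only need 3/4, 7/8 and 16 channels.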

----------------------

2nd order Ambisonics possibly doesn't offer enough improvement over FOA, so it might be dropped from a CE format, for the sake of simplification.


Do you think that 3rd order Ambisonics would be strong enough for the distribution of real recordings? (This is the decisive point, because if the answer is "yes" you could convince some people. My personal impression is yes, as I know that TOA is successfully applied in quite a few real-world installations, in live concerts etc. On the other hand, it would be nice to hear feedback from people actually working with TOA.)

I am aware that there is no microphone for 2nd or 3rd order Ambisonics, apart from maybe some experimental designs. If you would like to use TOA as a master/distribution/storage format for music, there < should > be some < TOA/AMB microphone > available. But certainly somebody could design one? I believe there could already be a market for an < AMB microphone >. (The Eigenmike doesn't count here. I don't think it should be seen as a microphone designed for real-world music recordings. S/N problems, and many other issues... It has been designed by and for geeks, or should I apologize for this comment anyway? =-O )

Mixing of TOA is already possible, here and today.

Best,

Stefan

P.S.: Note also in this context that MPEG is on track to finalize its MPEG-H 3D Audio framework by the beginning of 2015. (The basic < CO decoder > has already been chosen, it seems.)

MPEG 3D audio is technically cinema surround with height (22.2 style), so there is a basic difference compared to Ambisonics. (Which has less company support, but offers full-spherical 3D audio even in its classical 4-channel form. ;-) )





UHJ (surround/3D audio) as extension of stereo based files
(distribution via Internet, on discs and streaming, including
YouTube, Spotify etc.)


I like the potential of this idea very much; but it can only move
forward with the free availability of freely available encoders and
decoders for 2, 3 and 4-channel UHJ, in both standalone and plugin
formats.  Seeing as how mere 2-channel versions have signally failed to
become available at all, I wonder what chance there is.

I had hoped that somebody else would state the obvious, but in the end I have to do this myself... :-)

While I would understand the above argument IF UHJ were an area of its own, my proposal actually implied that you would use (in the end) a B format decoder.

You would < additionally > need a UHJ channel extractor (which works on the AAC file/ .M4P/Ogg etc. < input > ), and secondly the UHJ to B format "translator". (The latter is just the application of some formulas which might not be trivial but are known and/or can be deduced. From an IT perspective, this is very little program code. This step also doesn't depend much on the specific programming language used. Mathematics stays mathematics, and the "language" of mathematical formulas is older than programming languages, which explains why formulas look more or less the same in any programming language. Well, if I/you exclude Forth and other exotics.... :-D )

I would call the two additional steps the < UHJ front end > for some B format decoder.
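
To give an idea of how little code the "translator" step is, here is a rough Python sketch of a 3-channel UHJ round trip. Several things in it are assumptions rather than established fact: the coefficients are the commonly published UHJ encode/decode equations as I remember them, and should be checked against a primary reference before use; the function names are made up; and the scaling of the sum/difference step is chosen so that encode and decode are consistent with each other. To stay short, every signal is a single complex phasor (one frequency, steady state), so the 90 degree phase shift "j" is simply multiplication by 1j. A real front end would need a wideband 90 degree phase-shift filter instead.

```python
# Phasor-domain sketch: each signal is one complex amplitude at a single
# frequency, so the phase shifter "j" becomes multiplication by 1j.
# Coefficients quoted from memory; verify before real use.

def uhj_encode(W, X, Y, Z=0j):
    """B-format (W, X, Y, Z) -> UHJ (left, right, T, Q)."""
    S = 0.9396926 * W + 0.1855740 * X
    D = 1j * (-0.3420201 * W + 0.5098604 * X) + 0.6554516 * Y
    T = 1j * (-0.1432 * W + 0.6512 * X) - 0.7071068 * Y
    Q = 0.9772 * Z
    left = (S + D) / 2
    right = (S - D) / 2
    return left, right, T, Q


def uhj_decode(left, right, T, Q=0j):
    """UHJ (left, right, T, Q) -> B-format (W, X, Y, Z), 3/4-channel case."""
    # Assumed scaling: undo the /2 of the encoder above exactly.
    S = left + right
    D = left - right
    M = 0.828 * D + 0.768 * T
    W = 0.982 * S + 1j * 0.197 * M
    X = 0.419 * S - 1j * M
    Y = 0.796 * D - 0.676 * T + 1j * 0.187 * S
    Z = 1.023 * Q
    return W, X, Y, Z


b_in = (0.7071 + 0j, 0.5 + 0j, -0.3 + 0j, 0.2 + 0j)
b_out = uhj_decode(*uhj_encode(*b_in))
for name, a, b in zip("WXYZ", b_in, b_out):
    print(f"{name}: in={a:.4f} out={b:.4f}")
```

With the three or four UHJ channels the B format signals come back to within the rounding of the coefficients, which is the sense in which UHJ can act as a pure front end for a B format decoder.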

I know that a lot more work would have to be done to publish B format programs/plugins/mobile apps etc., and to describe B decoder design. Specifically, I believe that B decoders nowadays < should > be able to support output via headphones and binaural techniques. Section III of my 1st posting suggests that head-tracking hardware is both available and cheap enough to be applied in real-world products, including future < surround capable HT headphones >. I mentioned the specific hardware used in the Oculus Rift VR headset, just to give an example of an existing HT chip. (There is plenty of other hardware around.)

It might help to set up some open group, which would promote the use and design of B format (HOA? Section II...) decoders: describing the underlying theory, offering (open sourced) program code, distributing free solutions etc. (To set up a working "open" group requires some organisational skills, but it can be done.)

Again, the real problem seems to be the lack of available B format decoders. (My proposal is to transport "B format over stereo", in some simple description. If so, it is again obvious that you should see the use of UHJ extension channels just as a front end for B format, because this is the format which has to be decoded.)

I believe that "you" should promote the fact that B format is a real 3D audio format, using just 4 channels. This is obviously an intriguing fact. (Note that the spatial "3D" resolution of full FOA is actually the same as the spatial 2D resolution of XYW, because Ambisonics is isotropic.)
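
For readers who have never seen it written down, the 4-channel claim can be illustrated with a short Python sketch of first-order encoding. The assumptions: the classical convention with W attenuated by 1/sqrt(2), azimuth counted anticlockwise from the front, elevation upwards; the function name encode_foa is invented for this example.

```python
import math


def encode_foa(signal: float, azimuth_deg: float, elevation_deg: float):
    """Encode one mono sample into first-order B format (W, X, Y, Z).

    Assumes W scaled by 1/sqrt(2), azimuth anticlockwise from front,
    elevation upwards. Hypothetical helper for illustration only.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    W = signal / math.sqrt(2)
    X = signal * math.cos(az) * math.cos(el)
    Y = signal * math.sin(az) * math.cos(el)
    Z = signal * math.sin(el)
    return W, X, Y, Z


# The directional part (X, Y, Z) has the same length for every direction,
# which is one way to see that no direction is treated specially.
for az, el in [(0, 0), (90, 0), (45, 30), (200, -60), (0, 90)]:
    W, X, Y, Z = encode_foa(1.0, az, el)
    print(az, el, round(math.sqrt(X * X + Y * Y + Z * Z), 6))
```

Horizontal and elevated directions are handled by exactly the same formula, which is the isotropy mentioned above.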

IMO, 2-channel UHJ is something from the past. Why use it if you could distribute the real thing?! Which means B format, not a reduced form of B format. The use of 3/4 channel UHJ (maybe more channels for higher orders) was suggested to stay compatible with 2-channel audio/stereo files and streams. It has been shown that existing file/container formats would allow the transport of < UHJ<---->B format > over stereo, via at least two different extension techniques. (File extensions, extensions in current container formats)


Best regards,

Stefan

P.S.: MPEG Surround is also a decoder based design. (MPS encoder/decoder)
The same holds for the future (MPEG) 3D audio codec, currently in development. I know that they take the topic "binaural output via headphones" very seriously, you just have to look into their CfP and similar documents...


P.S. 2:

Like everyone else I wish I had the time myself; but when factoring in
the need to learn about DSP programming and modern programming
languages, other commitments, and the slowing down of age...

Paul

No single person could do all the programming stuff, at least not anymore. There are just too many different platforms around....

Nevertheless, B format decoders/apps will be written if Ambisonics is seen as a format which is worth implementing. (Or if there is enough music around in this format.) In this sense, I would look to the applications/aspects which are "beyond" what is offered by the 5.1 ITU layout. (IMO Ambisonics starts to shine if you factor in the inherent capability to record/encode < full-sphere > 3D audio. And because you really could not expect available 3D audio loudspeaker layouts to look about the same everywhere, the Ambisonics decoder can be seen as a necessary interface to real-world loudspeaker configurations, or to headphones. 2nd advantage... More arguments?!)
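
To make the "decoder as interface" argument a bit more concrete, here is a deliberately minimal Python sketch: the same horizontal B format signal decoded to regular rings of 4, 6 or 8 loudspeakers. It assumes W carries the 1/sqrt(2) scaling, speaker 0 sits at the front, and the speakers are equally spaced; it is a plain "basic" decode without shelf filters or near-field compensation, and it does not cover irregular layouts such as ITU 5.1. The function name decode_ring is made up.

```python
import math


def decode_ring(W: float, X: float, Y: float, n_speakers: int):
    """Basic decode of horizontal B format to a regular ring of speakers.

    Assumes W scaled by 1/sqrt(2) and speaker i at azimuth 360*i/n degrees
    (anticlockwise, speaker 0 at the front). Illustration only.
    """
    feeds = []
    for i in range(n_speakers):
        theta = 2 * math.pi * i / n_speakers
        feed = math.sqrt(2) * W + 2 * (X * math.cos(theta) + Y * math.sin(theta))
        feeds.append(feed / n_speakers)
    return feeds


# A unit source straight ahead: W = 1/sqrt(2), X = 1, Y = 0.
W, X, Y = 1 / math.sqrt(2), 1.0, 0.0
for n in (4, 6, 8):
    feeds = decode_ring(W, X, Y, n)
    print(n, [round(f, 3) for f in feeds], "sum =", round(sum(feeds), 6))
```

The recording stays the same; only the number of speakers passed to the decoder changes, and the feeds still add up to the original signal level.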
_______________________________________________
Sursound mailing list
[email protected]
https://mail.music.vt.edu/mailman/listinfo/sursound

