Dear colleagues...
To continue the proposal to use < certain > forms of .AMB as a
real-world format for the transport/storage of 3D audio (including music
recordings), I would like to point out some further important issues
involved.
A full .AMB decoder would have to be able to decode the nine different
combinations of .AMB to different (standard?) loudspeaker
configurations, and also to headphones. (The latter would be an
important point in my "requirement list".) This means there will be
plenty of combinations, and some great opportunities to mess things up
if anybody wants to implement the 9*9 "or so" combinations ... :-)
It would be advantageous if we could limit .AMB to some < CE
profile > with far fewer combinations!
(Covering just FOA won't be enough. We know that FOA has certain
limitations and won't be good enough for all applications. Just think of
the sweet spot issues.)
My impression is that you would have to use < at least > 3rd order to
overcome many/most of the typical FOA problems.
Some advantages of TOA, compared to FOA:
- much larger sweet spot (not only support for individual listeners; IMO
this is very important, as I would like to be able to demonstrate some
wonderful recordings to "at least one" friend, even better to "some
friends". If you don't have friends, don't bother... :-D )
- angular resolution significantly improved, compared to FOA
(improvement of more than factor 2)
- improved performance at higher frequencies
- we know that FOA has certain problems presenting sound "from the
sides", even if the playback rig includes loudspeakers at direct
lateral positions.
http://www.acoustics.hut.fi/~ville/papers/pulkkiicmc2001.pdf
- Decoding TOA to (ITU) 5.1/7.1 will show much better results than FOA
to 5.1/7.1. (Comments? I know that 5.1 is an underspecified irregular
array from an Ambisonics TOA perspective, but you can decode this and
the results will be better than in the FOA to 5.1 case...)
- Improved behaviour at higher frequencies
Altogether, a practical "CE" format based on Ambisonics and .AMB could
be introduced in the following, simplified form:
I) FOA/ UHJ (3-4 UHJ channels, the proposed backward-compatible form to
stereo of FOA...)
3/4 channels
(Classical decoders and other decoders, supposed to improve on classical
ones...)
II) 3rd order horizontal-only and 3h/1p, which you could combine to
just 3h1p (1st order vertical)
7/8 channels, or 8 channels
(Might still be offered in some "UHJ", stereo-compatible fashion; "UHJ"
for 2nd/3rd order doesn't exist yet, but it can be done.)
III) 3h/3p, 16 channels (call this the .AMB master format? Anyway, this
is the upper end of FuMa .AMB...)
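
The channel counts quoted for profiles I) to III) follow from the usual
mixed-order rule: a full-sphere part of order p needs (p+1)^2 channels,
and every additional horizontal-only order adds 2 more. A minimal Python
sketch of this arithmetic (the function name amb_channels is only
illustrative, not part of any .AMB tool):

```python
def amb_channels(h: int, p: int) -> int:
    """Channel count for horizontal order h and periphonic (height) order p."""
    if not 0 <= p <= h:
        raise ValueError("need 0 <= p <= h")
    # (p+1)^2 full-sphere channels, plus 2 per extra horizontal-only order
    return (p + 1) ** 2 + 2 * (h - p)


# Every h/p combination up to 3rd order
combos = [(h, p) for h in range(1, 4) for p in range(h + 1)]
for h, p in combos:
    print(f"{h}h{p}p: {amb_channels(h, p)} channels")
```

Enumerating all h/p pairs up to 3rd order gives the nine combinations
mentioned above; the three proposed profiles use only 1h0p/1h1p,
3h0p/3h1p and 3h3p.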
----------------------
2nd order Ambisonics possibly doesn't offer enough improvement over FOA,
so it might be dropped from a CE format, for the sake of simplification.
Do you think that 3rd order Ambisonics would be strong enough for the
distribution of real recordings?
(This is the decisive point, because if the answer is "yes" you could
convince some people.
My personal impression is yes, as I know that TOA is successfully
applied in some to many real-world installations, in live concerts etc.
On the other hand, it would be nice to hear the feedback of people
actually working with TOA.)
I am aware that there is no microphone for 2nd or 3rd order Ambisonics,
apart from maybe some experimental designs. If you would like to use TOA
as a master/distribution/storage format for music, there < should > be
some < TOA/AMB microphone > available.
But certainly somebody could design one? I believe there could already
be some market for an < AMB microphone >. (The eigenmike doesn't count
here. I don't think this should be seen as a microphone designed for
real-world music recordings. S/N problems, and many other issues... It
has been designed by and for geeks, or should I apologize for this comment
anyway? =-O )
Mixing of TOA is already possible, here and today.
Best,
Stefan
P.S.: Note also in this context that MPEG is on track to finalize
its MPEG-H 3D Audio framework by the beginning of 2015. (The basic < CO
decoder > is already chosen, it seems.)
MPEG 3D audio is technically cinema surround with height (22.2 style),
so there is a basic difference compared to Ambisonics. (Which has
less company support, but offers full-spherical 3D audio even in its
classical 4-channel form. ;-) )
UHJ (surround/3D audio) as extension of stereo based files
(distribution via Internet, on discs and streaming, including
YouTube, Spotify etc.)
I like the potential of this idea very much; but it can only move
forward with the free availability of encoders and decoders for 2, 3
and 4-channel UHJ, in both standalone and plugin
formats. Seeing as how mere 2-channel versions have signally failed to
become available at all, I wonder what chance there is.
I had hoped that somebody else would state the obvious; in the end I
have to do this myself... :-)
While I would understand the above argument IF UHJ were an area of
its own, my proposal actually implied that you would use (in the
end) a B format decoder.
You would < additionally > need a UHJ channel extractor (works on the
AAC file/ .M4P/Ogg etc. < input > ), and secondly the UHJ to B format
"translator". (The latter is just the application of some formulas
which might not be trivial but are known and/or can be deduced. From
an IT perspective, this is very little program code. You just have to
apply known formulas. This step also doesn't depend a lot on the
specific programming language which is used. Mathematics stays
mathematics, and the "language" of mathematical formulas is older than
programming languages - which explains why formulas look more or less
the same in any programming language - well, if I/you exclude Forth
and other exotics.... :-D )
I would call the two additional steps the < UHJ front end > for some
B format decoder.
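
To give an idea of how little code the "translator" step needs, here is
a minimal Python sketch. Assumptions, stated loudly: the coefficients
are the commonly published 3/4-channel UHJ encode/decode values as I
recall them and should be checked against a primary reference; the
function names are only illustrative; and the 90 degree phase shift "j"
is modelled by multiplying a single-frequency phasor (a complex number)
by 1j. A real front end would need a wideband 90 degree phase shifter
working on audio samples instead.

```python
import math

J = 1j  # +90 degree phase shift; valid for single-frequency phasors only


def b_to_uhj(W, X, Y, Z=0j):
    """Encode B-format phasors to 4-channel UHJ (Left, Right, T, Q)."""
    S = 0.9396926 * W + 0.1855740 * X
    D = J * (-0.3420201 * W + 0.5098604 * X) + 0.6554516 * Y
    T = J * (-0.1432 * W + 0.6512 * X) - 0.7071 * Y
    Q = 0.9772 * Z
    return (S + D) / 2, (S - D) / 2, T, Q


def uhj_to_b(left, right, T, Q=0j):
    """The "translator": 3/4-channel UHJ phasors back to B-format."""
    S = left + right   # exact inverse of the sum/difference step above
    D = left - right
    k = J * (0.828 * D + 0.768 * T)
    W = 0.982 * S + 0.197 * k
    X = 0.418 * S - k
    Y = 0.796 * D - 0.676 * T + J * 0.187 * S
    Z = 1.023 * Q
    return W, X, Y, Z


# Round trip for a source at 30 degrees azimuth with some height content
az = math.radians(30)
b_in = (1 / math.sqrt(2), math.cos(az), math.sin(az), 0.3)
b_out = uhj_to_b(*b_to_uhj(*b_in))
err = max(abs(o - i) for o, i in zip(b_out, b_in))
print(f"largest round-trip error: {err:.4f}")
```

The round trip is not bit-exact, because the published coefficients are
rounded to three or four digits. The 2-channel UHJ case uses a different
decode matrix and is not covered by this sketch.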
I know that a lot more work would have to be done to publish B
format programs/plugins/mobile apps etc., and to describe B decoder
design. Specifically, I believe that B decoders nowadays < should > be
able to support output via headphones and binaural techniques. Section
III of my 1st posting suggests that head-tracking hardware is both
available and cheap enough to be applied in real-world products,
including future < surround capable HT headphones >. I mentioned the
specific hardware used in the Oculus Rift VR headset, just to give
an example of an existing HT chip. (There is plenty of other
hardware around.)
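
On the signal side, head tracking for first-order B format is cheap:
before binaural decoding, the sound field is counter-rotated by the
measured head yaw, which only touches X and Y. A minimal Python sketch,
assuming the usual B format axes (X front, Y left, Z up) and a yaw angle
that is positive when the head turns to the left; the function name is
only illustrative, and pitch/roll are left out:

```python
import math


def compensate_head_yaw(W, X, Y, Z, yaw_rad):
    """Counter-rotate one first-order B-format sample by the head yaw.

    Assumption: yaw_rad > 0 means the head has turned to the left
    (counter-clockwise when seen from above).
    """
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    # W (omni) and Z (height) are unaffected by a rotation about the Z axis
    return W, X * c + Y * s, Y * c - X * s, Z


# A source hard left (X=0, Y=1); the listener turns 90 degrees to the left,
# so the source should now appear straight ahead (X close to 1, Y close to 0).
W2, X2, Y2, Z2 = compensate_head_yaw(0.707, 0.0, 1.0, 0.0, math.radians(90))
print(round(X2, 6), round(Y2, 6))
```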
It might help to set up some open group, which would promote the use
and design of B format (HOA? Section II...) decoders: describing the
theory behind them, offering (open sourced) program code, distributing
free solutions etc. (Setting up a working "open" group requires some
organisational skills, but it can be done.)
Again, the real problem seems to be the lack of available B format
decoders. (My proposal is to transport "B format over stereo", in some
simple description. If so, it is again obvious that you should see the
use of UHJ extension channels just as a front end for B format,
because this is the format which has to be decoded.)
I believe that "you" should promote the fact that B format is a real
3D audio format, using just 4 channels. This is obviously an
intriguing fact. (Note that the spatial "3D" resolution of full FOA is
actually the same as the spatial 2D resolution of XYW, because
Ambisonics is isotropic.)
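
To make the "4 channels, full sphere" point concrete, here is a minimal
Python sketch of classical first-order panning, assuming the traditional
B format convention (W attenuated by 1/sqrt(2), azimuth counted
counter-clockwise from the front, elevation upwards); the function name
is only illustrative:

```python
import math


def encode_foa(sample, azimuth_deg, elevation_deg):
    """Pan one mono sample to first-order B format (W, X, Y, Z)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    W = sample / math.sqrt(2)
    X = sample * math.cos(az) * math.cos(el)
    Y = sample * math.sin(az) * math.cos(el)
    Z = sample * math.sin(el)
    return W, X, Y, Z


# Three very different directions: front, rear-left and up, straight overhead
for az, el in [(0, 0), (135, 40), (0, 90)]:
    W, X, Y, Z = encode_foa(1.0, az, el)
    print(az, el, round(X * X + Y * Y + Z * Z, 6))
```

The sum X^2 + Y^2 + Z^2 comes out the same for every direction, which
is one simple way to see that no direction on the sphere is treated
preferentially.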
IMO, 2-channel UHJ is something from the past. Don't use this if you
could distribute the real thing?! Which means B format, not a reduced
form of B format. The use of 3/4 channel UHJ (maybe more channels for
higher orders) was suggested to stay compatible with 2-channel
audio/stereo files and streams. It has been shown that existing
file/container formats would allow the transport of < UHJ<---->B
format > over stereo, via at least two different extension techniques.
(File extensions, extensions in current container formats)
Best regards,
Stefan
P.S.: MPEG Surround is also a decoder based design. (MPS encoder/decoder)
The same is valid for the future (MPEG) 3D audio codec, currently in
development. I know that they take the topic "binaural output via
headphones" very seriously, you just have to look into their CfP and
similar documents...
P.S. 2:
Like everyone else I wish I had the time myself; but when factoring in
the need to learn about DSP programming and modern programming
languages, other commitments, and the slowing down of age...
Paul
No single person could do all the programming stuff, at least not
anymore. There are just too many different platforms around....
Nevertheless, B format decoders/apps will be written if Ambisonics is
seen as a format which is worth implementing. (Or if there is
enough music in this format around.)
In this sense, I would look to the applications/aspects which are
"beyond" what is offered by the 5.1 ITU layout. (IMO Ambisonics
starts to shine if you factor in the inherent capability to
record/encode < full-sphere > 3D audio. And because you really cannot
expect that available 3D audio loudspeaker layouts would look
about the same everywhere, the Ambisonics decoder can be seen as a
necessary interface to real-world loudspeaker configurations, or to
headphones. 2nd advantage... More arguments?!)
_______________________________________________
Sursound mailing list
[email protected]
https://mail.music.vt.edu/mailman/listinfo/sursound