[Sursound] Here comes 3D Audio...

Stefan Schreiber Sun, 07 Dec 2014 15:13:02 -0800

Dear audio experts...

I would like to inform you about the availability of an introductory article about MPEG-H 3D Audio, written in a competent and very readable style by some of the creators:


http://www2.iis.fraunhofer.de/mpeghaa/papers/AES137_MPEG-H_v14_final.pdf
(Source 1; presented in October 2014)

------------------------------------

MPEG 3DA is part of a larger group of standards and, as such, a companion to the most recent and powerful MPEG video compression standard (= HEVC):

ISO/IEC 23008 - High efficiency coding and media delivery in heterogeneous environments


http://en.wikipedia.org/wiki/MPEG-H

--------------------------------------

MPEG-H 3D Audio supports and integrates channel-based, object-based and sound field/HOA-based audio formats/technologies.


"Within MPEG-H 3D Audio,
flexible rendering to different speaker layouts is
implemented by a format converter that adapts the
content format to the actual real-world speaker setup
available on the playback side to provide an optimum
user experience under the given user conditions. For
well-defined formats, specific downmix metadata can
be set on the encoder to ensure downmix quality, e.g.
when playing back 9.1 content on a 5.1 or stereo
playback system."

You could say that 3DA is highly flexible both on the format input side (encoding basically any known < practically used > format) and at the rendering/output stage.
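To make the downmix idea in the quoted passage a bit more concrete, here is a minimal sketch of a static 5.1-to-stereo downmix. It is NOT the MPEG-H format converter: the channel order, the names and the -3 dB gains (the commonly cited ITU-R BS.775 style coefficients) are my own assumptions for illustration only.

```python
import math

# Assumed channel order for this sketch: L, R, C, LFE, Ls, Rs.
G = 1 / math.sqrt(2)  # roughly -3 dB

# Hypothetical static downmix matrix (BS.775-style gains, LFE discarded).
DOWNMIX_5_1_TO_STEREO = [
    # L    R    C   LFE  Ls   Rs
    [1.0, 0.0, G, 0.0, G, 0.0],  # left output
    [0.0, 1.0, G, 0.0, 0.0, G],  # right output
]


def downmix(frame, matrix=DOWNMIX_5_1_TO_STEREO):
    """Apply a static downmix matrix to one multichannel sample frame."""
    if any(len(row) != len(frame) for row in matrix):
        raise ValueError("matrix width must match the channel count")
    return [sum(g * x for g, x in zip(row, frame)) for row in matrix]


if __name__ == "__main__":
    # A centre-only signal ends up equally in both output channels.
    print(downmix([0.0, 0.0, 1.0, 0.0, 0.0, 0.0]))
```

The "downmix metadata" mentioned in the quote would, as far as I understand it, let the encoder side choose such gains per programme instead of hard-coding them as done here.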

Fig. 1 in the cited standard description gives very strong evidence that 3D audio is a worthwhile improvement compared to horizontal surround sound. 5.1 + 4H means 5.1 + 4 "Height" (speakers), BTW. (So I believe "5.1 + 4H" should refer to an Auro 3D configuration - which seems to be quite obvious.)

(I know of some studies which claim that 3D audio is not "worth it". Firstly, I don't believe that any of these studies has been set up in a very careful way; secondly, they seem to contradict quite simple observations. IF we are able to hear - at least to some significant degree - in three dimensions THEN any technology which claims to be perceptually < complete > has to reproduce sound in 3D. Testing what sounds "good" or "better" doesn't even matter if you think in this way. So maybe the first relevant test in any relevant scientific study referring to basic acoustical and perceptual questions could be whether you can hear and localize - at least "to some degree" - elevated and lowered sound sources. The answer is a resounding "Yes", as everybody knows...)


Now a quotation from another Fraunhofer (IIS) paper:

Thus, Silzle et al. undertook a study(7) to determine the relative overall perceived sound quality of several speaker configurations, to determine if a practical compromise was possible between the sound quality provided by a 22.2 system and today's 5.1 and 2.0 formats. As shown in Figure 4, the perceived quality improvement from 5.1 surround to 22.2 is greater than that from stereo to 5.1. However, ignoring the LFE channels, an upgrade from stereo to 5.1 requires 3 new speakers, while an upgrade from 5.1 to 22.2 requires 17 new speakers. Our tests using an active downmix method show that most of the perceived improvement of upgrading 5.1 to a 22.2 system can be obtained with four additional height speakers
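The speaker arithmetic in the quoted passage is easy to check. The small sketch below counts full-range speakers from layout strings and ignores the LFE channels, as the quote does; the function name and the way the layout strings are parsed are my own assumptions, not any standard notation parser.

```python
def main_speakers(layout):
    """Count full-range speakers in a layout such as '5.1', '22.2' or '5.1+4H'.

    The number after the dot (LFE channels) is ignored on purpose.
    """
    total = 0
    for part in layout.replace(" ", "").split("+"):
        if part.upper().endswith("H"):
            total += int(part[:-1])           # height layer, e.g. '4H'
        else:
            total += int(part.split(".")[0])  # bed layer, e.g. '5.1' -> 5
    return total


if __name__ == "__main__":
    print(main_speakers("5.1") - main_speakers("2.0"))     # 3
    print(main_speakers("22.2") - main_speakers("5.1"))    # 17
    print(main_speakers("5.1+4H") - main_speakers("5.1"))  # 4
```

So the compromise layout needs 4 additional speakers where the full 22.2 upgrade needs 17.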


Source:
http://www.iis.fraunhofer.de/content/dam/iis/en/dokumente/forschungsfelder/AMM/Conference-Paper/BleidtR_SMPTE2014_Object-Based_Audio.pdf

And on the more practical side:

Adding immersive audio not only involves adding new signals, but also extending the panning and mixing functions of the live console or post-production digital audio workstation to handle perhaps 5.1 + 4H or 7.1 + 4H or third or fourth-order HOA signals. There are several operational strategies for producing these signals:




Returning to Source 1 (Introductory Standard Description/3DA):

It is foreseeable that media consumption is moving
further towards mobile devices with headphones being
the primary way to play back audio.

Therefore, a
binaural rendering component was included in the
MPEG-H 3D audio decoder for dedicated rendering on
headphones with the aim of conveying the spatial
impression of immersive audio production also on
headphones.
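For readers who have not met binaural rendering before: one generic approach (the "virtual loudspeaker" technique) convolves each speaker feed with a pair of head-related impulse responses (HRIRs) and sums the results per ear. The sketch below shows only this generic idea; I am not claiming that the MPEG-H decoder works exactly like this, and the function names and the toy impulse responses are made up for illustration.

```python
def convolve(signal, kernel):
    """Plain time-domain convolution of two sample lists."""
    out = [0.0] * max(len(signal) + len(kernel) - 1, 0)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def binaural_render(speaker_feeds, hrirs):
    """Mix virtual loudspeakers down to two ear signals.

    speaker_feeds: {speaker name: list of samples}
    hrirs:         {speaker name: (left-ear IR, right-ear IR)}
    """
    if not speaker_feeds:
        return [], []
    n = max(
        len(feed) + max(len(hrirs[name][0]), len(hrirs[name][1])) - 1
        for name, feed in speaker_feeds.items()
    )
    left, right = [0.0] * n, [0.0] * n
    for name, feed in speaker_feeds.items():
        ir_left, ir_right = hrirs[name]
        for i, v in enumerate(convolve(feed, ir_left)):
            left[i] += v
        for i, v in enumerate(convolve(feed, ir_right)):
            right[i] += v
    return left, right


if __name__ == "__main__":
    # Toy data: a single impulse on a hypothetical left-front speaker.
    feeds = {"L": [1.0, 0.0]}
    toy_hrirs = {"L": ([1.0, 0.5], [0.0, 0.25])}  # invented values
    print(binaural_render(feeds, toy_hrirs))
```

A real renderer would use measured HRIR sets and fast (frequency-domain) convolution; the loops above are only meant to be readable.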

The standard's aims have all been met, if not exceeded, it seems to me (as an outsider who has followed the standardization process since at least 2011). So, congratulations!

It must be said that the development of such a standard has only been possible because of many years of basic and applied research in this area. The main contributors have invested time, personnel and money in this project over a significant period. It is simply encouraging to see that certain institutions and companies related to audio research and (audio) consumer electronics still have some long-term strategies and views. (Now compare this with the situation in the so-called music industry - but this is getting "off-topic", and I don't want to get too angry on a Sunday evening...)

Contrary to some < publicly funded > university research which I won't specify (or refer to by name) but which remains "unavailable for anybody" - in spite of many presented < papers > about it :-D , < this > standard will be licensable like any other MPEG standard - and it already seems to be applied as the (3D) audio standard for future (HD/UHD) TV standards, as currently defined by ATSC and EBU.


Best regards,

Stefan Schreiber

P.S.:

There are definitely some more 3D audio standardization efforts around...

http://www.aes.org/events/137/papers/?ID=4048

P1-7 ECMA-407: New Approaches to 3D Audio Content Data Rate Reduction with RVC-CAL

Inverse problems have only been known in spatial audio for a very short time; their only solution, called "inverse coding" in literature, is essentially based on time-level modeling. Inverse problems, however, unlike parametric coding, require only an initial transmission of spatial side information, and thus can achieve much lower bitrates than could be achieved with parametric coding.

The technology has been specified as the world's first international 3D audio standard ECMA-407 and may be further extended with static models in frequency domain.


Publicly available source:
http://ecma-international.org/publications/standards/Ecma-407.htm

(ECMA standards are freely available, which is good.)

A new way to perceptually eliminate redundant information makes use of invariant theory inside the encoder. Invariants with Gaussian processes were unknown until 2010 and have represented one major problem in non-applied mathematics for more than a century: David Hilbert's proof that these coefficient functions form a field then insinuated that their existence in random processes was very likely. As will be shown, when applied to spatial audio coding, invariants represent a numerically efficient and perceptually powerful algebraic tool.

Now, I am very probably too stupid to understand ALL O:-) of this, but hopefully this is some nice entertainment for all the lurking mathematicians on our list...

P.S. 2: And what about our Ambisonics and "open audio" application and standardization efforts? The question has to be asked, especially because most of these efforts seem to end in the dustbin - as unfinished projects, after people have lost interest. I hope I will be proved wrong...

We have had plenty of time to achieve some more "visible" results. (I am sorry to have to write this.)

Progress is being made outside the academic world. The new MPEG 3DA standard seems to be a milestone in audio standardization, being very general - but still open for future extensions.

