On Sun, Sep 11, 2022 at 07:21:50PM +0300, Sampo Syreeni wrote:

> If the directional sampling was statistically uniform over the whole
> sphere of directions, and in addition the sample of directions probed
> was to be in quadrature, it would be an easy exercise in discrete
> summation to gain the transform matrix we need.
Even in that case it isn't as simple as you seem to think. Any set of
measured HRIR will need some non-trivial preprocessing before it can
be used.

One reason is low-frequency errors. Accurate IR measurements below say
200 Hz are difficult (unless you have a very big and good anechoic
room). OTOH we know that HRIR in that frequency range are very low
order and can be synthesised quite easily (one way to do this is
sketched in the P.S. below).

Another reason is that you can't reduce a set of HRIR to low order
(the order of the content you want to render) without introducing
significant new errors. One way to reduce these is to reduce or even
fully remove ITD at mid and high frequencies, again depending on the
order the renderer is supposed to support. Getting the magnitudes (and
hence ILD) accurate requires much lower order than if you also want to
keep the delays (see the second sketch in the P.S.).

Compared to these and some other issues, not having a set on a regular
grid (e.g. a t-design or Lebedev grid) is the least of the problems
you will encounter.

There are other considerations. For best results you need head
tracking and a plausible room sound (even if the content already
includes its own).

> So the best framework I could think of, years past, was to try and
> interpolate the incoming directional point cloud from the KEMAR and
> other sets, to the whole sphere, and then integrate. Using a priori
> knowledge for the edge, singular cases, where a number of the
> empirical observations prove to be co-planar, and as such singular
> in inversion. I tried stuff such as the information-theoretic
> Kullback-Leibler divergence, and the Vapnik-Chervonenkis dimension,
> in order to pare down the stuff. The thing I settled on was a kind
> of mutual recursion between the directional mutual information of
> each empirical point gained/removed and the Mahalanobis distance to
> each spherical harmonic added/removed. It ought to have worked.

The practical solutions do not depend on such concepts and are much
more ad hoc. Some members of my team and I have worked on them for
the last three years. Most of the results are confidential, although
others (e.g. IEM) have arrived at some similar results and published
them.

Another question is whether, for high-quality binaural rendering,
starting from Ambisonic content is a good idea at all. The simple
fact is that if you want really good results you need very high
order, and

1. such content isn't available from direct recordings (we don't even
   have 10th-order microphones), so it has to be synthetic,

2. rendering it from an Ambisonic format would be very inefficient.
   For example, order 20 means (20 + 1)^2 = 441 spherical harmonic
   signals, so you'd need 441 convolutions if you assume L/R head
   symmetry, and twice that number if you don't.

Compare this to rendering from object-encoded content (i.e. mono
signals plus directional metadata). You need only two convolutions
per object. Starting from a sufficiently dense HRIR set, you can
easily generate a new set on a regular grid with a few thousand
points, and interpolate them (VBAP style) in real time (third sketch
below). This can give you the same resolution as e.g. order 40
Ambisonics at a fraction of the complexity.

Ciao,

-- 
FA
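
P.S. To make some of the above concrete, here are three rough
sketches in Python/numpy/scipy. They only illustrate the general
ideas: the function names, default values and array shapes are mine,
and none of this is the (confidential) code mentioned above. The
three sketches are meant as one small script, so the imports of the
first carry over.

First, the low end. Below a few hundred Hz an HRIR is essentially a
pure delay with near-unity magnitude, so one simple option (an
assumption on my part, not the only way) is a complementary crossover
between a suitably delayed unit impulse and the measurement:

  import numpy as np
  from scipy.signal import butter, sosfilt

  def fix_low_end(hrir, fs, fc=200.0, order=4):
      # Synthetic LF part: a unit impulse, delayed to roughly match
      # the bulk delay of the measurement (its energy centroid here).
      n = np.arange(len(hrir))
      delay = int(round(np.sum(n * hrir**2) / np.sum(hrir**2)))
      synth = np.zeros(len(hrir))
      synth[delay] = 1.0
      # Crossover around fc: synthetic below, measured above. The sum
      # of same-order Butterworth LP/HP is not exactly allpass; a real
      # implementation would use properly matched crossover filters.
      lp = butter(order, fc, btype='low', fs=fs, output='sos')
      hp = butter(order, fc, btype='high', fs=fs, output='sos')
      return sosfilt(lp, synth) + sosfilt(hp, hrir)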
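Second, dropping the delays while keeping the magnitudes (and hence
ILD) exactly: replace each HRIR by its minimum-phase version, here
computed with the standard real-cepstrum method. A frequency
independent per-direction delay can then be re-inserted at low
frequencies if the target order allows it:

  def minimum_phase(h, nfft=None):
      # Real-cepstrum (homomorphic) minimum-phase reconstruction:
      # same magnitude response as h, all excess phase removed.
      nfft = nfft or 4 * len(h)
      mag = np.abs(np.fft.fft(h, nfft)) + 1e-12   # avoid log(0)
      cep = np.fft.ifft(np.log(mag)).real         # real cepstrum
      w = np.zeros(nfft)                          # folding window
      w[0] = 1.0
      w[1:nfft // 2] = 2.0
      w[nfft // 2] = 1.0
      hm = np.fft.ifft(np.exp(np.fft.fft(w * cep))).real
      return hm[:len(h)]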
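Third, the VBAP-style interpolation of a dense HRIR grid. Triangulate
the grid once (the convex hull of the unit direction vectors), then
per object find the enclosing triangle, mix the three HRIR pairs with
the VBAP gains, and convolve. The linear triangle search is for
clarity only; real-time code would use a lookup table:

  from scipy.spatial import ConvexHull

  def vbap_hrir(direction, grid, hrirs, tri):
      # direction: unit 3-vector; grid: (N, 3) unit vectors;
      # hrirs: (N, 2, L) IR pairs; tri: (M, 3) vertex indices,
      # e.g. ConvexHull(grid).simplices.
      for t in tri:
          # VBAP gains: write 'direction' in the basis of the three
          # vertex vectors; the triangle encloses it iff all gains
          # are non-negative.
          g = np.linalg.solve(grid[t].T, direction)
          if np.all(g >= -1e-9):
              g /= g.sum()                   # simple normalisation
              return np.tensordot(g, hrirs[t], axes=1)   # (2, L)
      raise ValueError("direction not enclosed by any triangle")

The per-object rendering is then just two convolutions, whatever the
equivalent Ambisonic order:

  hl, hr = vbap_hrir(d, grid, hrirs, ConvexHull(grid).simplices)
  out_l = np.convolve(sig, hl)
  out_r = np.convolve(sig, hr)

with gains and IR pairs crossfaded whenever the head tracker or the
object moves.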