Politis Archontis wrote:
We start by setting up a large, dense 3D loudspeaker array in a fully anechoic chamber (usually between 25 and 35 speakers at a distance of ~2.5 m), so that there is no additional room effect at reproduction. Then we decide on the composition of the sound scene (e.g. band, speakers, environmental sources), the sources' directions of arrival, and the specifications of the surrounding room. We then generate room impulse responses (RIRs) using a physical room simulator for the specified room and source positions, ending up with one RIR for each loudspeaker and each source in the scene. Convolving these with our test signals and summing the results gives an auralization of the intended scene. This part uses no spatial sound method at all, not even panning: if a reflection falls between loudspeakers, it is quantized to the closest one. The final loudspeaker signals are what we consider the reference case (after listening to them and checking that they sound OK).
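In code, that convolve-and-sum step is essentially the following rough numpy/scipy sketch (the function name and array layout are illustrative assumptions, not our actual tools):

    import numpy as np
    from scipy.signal import fftconvolve

    def auralize_reference(signals, rirs):
        """signals: list of 1-D dry source signals.
        rirs[src][spk]: simulated RIR from source src to loudspeaker spk
        (all RIRs assumed equal length here, for brevity)."""
        n_spk = len(rirs[0])
        n_out = max(len(s) for s in signals) + len(rirs[0][0]) - 1
        feeds = np.zeros((n_spk, n_out))
        for src, sig in enumerate(signals):
            for spk in range(n_spk):
                # Plain convolution and summation; no panning is involved,
                # since the simulator already assigned every reflection
                # to its nearest loudspeaker inside the RIR itself.
                y = fftconvolve(sig, rirs[src][spk])
                feeds[spk, :len(y)] += y
        return feeds  # one reference signal per loudspeaker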
Am I the only one to notice that these "original scenes" look highly synthetic?
Maybe good for DirAC encoding/decoding, but a natural recording this is not...
BR Stefan

P.S. (quoting Richard Lee):
Some good examples of 'natural' soundfield recordings with loadsa stuff happening from all round are Paul Doombusch's Hampi, JH Roy's schoolyard & John Leonard's Aran music.
--------------------------------------------------------------------------
Then we generate our recordings from that reference, either by encoding directly to ambisonic signals (a rough sketch of this appears below, after the quoted message), by simulating a microphone array recording, or by putting a Soundfield or other microphone at the listening spot and re-recording the playback; which of these we use has depended on the study. Finally the recordings are processed and decoded back to the loudspeakers, usually to a subset of the full setup (e.g. horizontal, discrete surround, small 3D setup), or even to the full setup. That allows us to switch playback between the reference and the method under test.

The tests have usually been MUSHRA style, where the listeners are asked to judge the perceived distance between the reference and various randomized playback methods (including a hidden reference and a low-quality anchor, used to normalize the perceptual scale for each subject). The criteria are a combination of timbral distance/colouration, spatial distance, and artifacts, if any. I've left out various details from the above, but this is the general idea.

Some publications that have used this approach are:

Vilkamo, J., Lokki, T., & Pulkki, V. (2009). Directional Audio Coding: Virtual Microphone-Based Synthesis and Subjective Evaluation. Journal of the Audio Engineering Society, 57(9), 709-724.

Politis, A., Vilkamo, J., & Pulkki, V. (2015). Sector-Based Parametric Sound Field Reproduction in the Spherical Harmonic Domain. IEEE Journal of Selected Topics in Signal Processing, 9(5), 852-866.

Politis, A., Laitinen, M.-V., Ahonen, J., & Pulkki, V. (2015). Parametric Spatial Audio Processing of Spaced Microphone Array Recordings for Multichannel Reproduction. Journal of the Audio Engineering Society, 63(4), 216-227.

Vilkamo, J., & Pulkki, V. (2014). Adaptive Optimization of Interchannel Coherence. Journal of the Audio Engineering Society, 62(12), 861-869.

Getting the listening test samples and generating recordings or virtual recordings from the references would be a lot of work for the time being. What is easier, and I can definitely do, is process one or some of the recordings you mentioned for your speaker setup, and send you the results for listening. There is no reference in this case, but you can compare against your preferred decoding method. And it would be interesting for me to hear your feedback too.

Best regards,
Archontis

On 05 Jul 2016, at 09:32, Richard Lee <rica...@justnet.com.au> wrote:

Can you give us more detail about these tests and perhaps put some of these natural recordings on ambisonia.com? The type of soundfield microphone used, and particularly the accuracy of its calibration, makes a HUGE difference to the 'naturalness' of a soundfield recording. Some good examples of 'natural' soundfield recordings with loadsa stuff happening from all round are Paul Doombusch's Hampi, JH Roy's schoolyard & John Leonard's Aran music. Musical examples include John Leonard's Orfeo Trio, Paul Hodges' "It was a lover and his lass" and Aaron Heller's (AJH) "Pulcinella". The latter has individual soloists popping up in the soundfield, not pasted on, but in a very natural and delicious fashion, as Stravinsky intended.
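For the "encoding directly to ambisonic signals" option mentioned above, the core operation is just weighting each source signal by the spherical-harmonic gains for its direction. A rough first-order sketch (the ACN/SN3D conventions and all names here are illustrative assumptions, not our actual code):

    import numpy as np

    def encode_foa(sig, azi, ele):
        # First-order real spherical-harmonic gains, ACN order (W, Y, Z, X),
        # SN3D normalization; azi/ele in radians.
        gains = np.array([1.0,
                          np.sin(azi) * np.cos(ele),
                          np.sin(ele),
                          np.cos(azi) * np.cos(ele)])
        return gains[:, None] * sig  # shape (4, n_samples)

Summing encode_foa() over every source (and, in the simulated case, over every reflection with its own direction) yields the B-format "recording" of the reference scene.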
Also, in my experience, and that doesn't seem to be a very popular view in the ambisonic community yet, these parametric methods do not only upsample or sharpen the image compared to direct first-order decoding: they actually reproduce the natural recording in a way that is perceptually closer to how the original sounded, both spatially and in timbre. Or at least that's what our listening tests have shown in a number of cases and recordings. The directional sharpening is one effect, but the higher spatial decorrelation that they achieve (or lower inter-aural coherence) in reverberant recordings is equally important.
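That coherence claim is easy to check on any pair of reproduced signals with an off-the-shelf coherence estimate; a minimal sketch, with made-up noise standing in for measured ear or channel signals:

    import numpy as np
    from scipy.signal import coherence

    fs = 48000
    # Stand-ins for two measured ear (or loudspeaker) signals:
    left = np.random.randn(fs)
    right = np.random.randn(fs)

    # Magnitude-squared coherence per frequency band (Welch estimate).
    f, Cxy = coherence(left, right, fs=fs, nperseg=1024)
    print("mean coherence: %.2f" % Cxy.mean())
    # Near 1: both channels carry the same signal (narrow phantom image).
    # Near 0: decorrelated channels, as expected from diffuse reverberation.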