David McGriffy wrote:

On May 16, 2014, at 7:54 AM, Stefan Schreiber <st...@mail.telepac.pt> wrote:

Different people keep telling me that anechoic HRIR/HRTF sets don't sound very good (if used 
to deliver virtual surround via headphones); preferably you should use BRIRs 
(reverberant, or < non-anechoic >  O:-) ). This is not what film sound people would like to 
do, because the room acoustics and "soundscapes" in different movie scenes usually 
change a lot. The different acoustical environments would have to be captured via the SFM, or 
synthesized (DAW).

It might well be the case that "dry" HRIR sets don't work well for normal 
headphones.

In < your > case, things seem to work:



I have also read in multiple sources, particularly Bruce Wiggins, that adding 
room impulses helps with the perception.  Perhaps it’s easier to convince 
someone that they are sitting in a small room listening to a set of surround 
sound speakers when they are, in fact, sitting in a small room listening to a 
pair of headphones.  But add the VR visuals?  I think everyone expected the 
visuals to help lock in the audio and vice versa, but I do wonder whether the 
room impulses would still be better or not.

Mostly, I didn’t end up adding any room responses for reasons more like what Adam said, that given the different environments the system is attempting to convey, it’s not clear what room you would use.

I don't want to appear overly pedantic, but wasn't it me who wrote this? ;-)


However, I plan to make this same library available for other purposes, both 
open source and commercial, and I expect that adding room responses would be 
useful to many applications so I will probably add that feature.  Perhaps when 
I do, the kind folks at Jaunt will let us know how it works when combined with 
visuals matching the original environment.

Well, this was my original question: Which kind of HRIRs/HRTFs did JauntVR use?

Citing myself:

Now, I have some suggestions for further improvement, to our colleagues and also Jaunt VR/TetraMic:

If the reference quality for HT binaural systems is about this

http://smyth-research.com/technology.html,

you would still have to employ personalized HRTF (HRIR/BRIR) data sets in your decoder. (HRIR is anechoic. BRIR includes room acoustics.)




The decoder consists of four
virtual cardioids spaced at 90º offsets, and the HRIR at the appropriate
angle & ear is applied to each.  I'm essentially describing his code though
so he can chime in with additional details.

This is actually not quite the case... Nevertheless, this is a smart attempt at 
understanding how a soundfield mike might work, and how an SFM recording might be 
decoded to a binaural representation/headphones.   ;-)

Actually that’s just about exactly what my decoder is doing and how some of the other b-format to binaural systems I read about seem to work. There are details, of course. For example, the decoder can easily be changed to use more and different virtual speakers as long as it can find a matching HRIR to use; and it is possible to read any of the samples from the Listen database. But in my own listening tests, such as they are, I never found that adding more than four speakers helped.
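For readers following along, the virtual-speaker decode described above can be sketched in a few lines. To be clear, this is my own illustration, not David's VVAudio code: it assumes FuMa-normalized horizontal B-format (W carries a 1/sqrt(2) gain) and a dictionary of HRIR pairs keyed by azimuth in degrees; all function names are hypothetical.

```python
import numpy as np

def cardioid_feed(W, X, Y, az):
    """Signal of a virtual cardioid pointed at azimuth `az` (radians),
    derived from FuMa-normalized horizontal B-format."""
    return 0.5 * (np.sqrt(2.0) * W + np.cos(az) * X + np.sin(az) * Y)

def b_format_to_binaural(W, X, Y, hrirs):
    """Decode horizontal B-format to binaural via virtual cardioids.

    `hrirs` maps azimuth (degrees) -> (left_ir, right_ir); one entry
    per virtual speaker, e.g. four speakers at 45/135/225/315 degrees.
    """
    ir_len = len(next(iter(hrirs.values()))[0])
    out_len = len(W) + ir_len - 1
    left = np.zeros(out_len)
    right = np.zeros(out_len)
    for az_deg, (ir_l, ir_r) in hrirs.items():
        feed = cardioid_feed(W, X, Y, np.radians(az_deg))
        left += np.convolve(feed, ir_l)   # per-speaker HRIR convolution
        right += np.convolve(feed, ir_r)
    return left, right
```

As David notes, nothing restricts this to four speakers: any layout works as long as a matching HRIR pair can be found for each virtual-speaker direction.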

At least you would have to shift the virtual speaker positions and find or interpolate the HRIRs at the new positions, right? (We are speaking of a head-tracked decoder.)
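The usual alternative to moving the virtual speakers is to counter-rotate the B-format signal itself by the head yaw and leave the speakers (and their fixed HRIRs) in place; a first-order soundfield rotates with a simple matrix, so no HRIR interpolation is needed. A minimal horizontal-only sketch (the sign convention, positive yaw turning the head to the left, is my assumption):

```python
import numpy as np

def rotate_b_format_yaw(W, X, Y, yaw):
    """Counter-rotate a horizontal B-format signal by head yaw (radians),
    so the static virtual-speaker decode can stay unchanged.
    W is omnidirectional and unaffected; X/Y rotate as a 2-D vector."""
    c, s = np.cos(yaw), np.sin(yaw)
    Xr = c * X + s * Y
    Yr = -s * X + c * Y
    return W, Xr, Yr
```

With this convention, turning the head 90 degrees to the left makes a frontal source appear at the listener's right, as expected.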

Just seems to get muddy.  I really wanted to add height, in fact the preset is 
in the code, but I don’t think it sounds better, at least the way I’m doing it 
so far.

FOA seems to have some problems with height. One reason could be that height localization seems to rely on quite high frequencies. Just a hint which would have to be treated in depth...

However, my interpretation of your words is that JauntVR maybe doesn't employ a 3D audio decoder, even though 3D audio is claimed in the articles about the company. Are they doing 3D audio decoding or not, then?

(The question is even more justified if I point to the fact that you need more than 4 speakers to realize any 3D decoder for FOA... So?!)

My questions are around the details of a square decode. Cardioids, of course, are not max-rE or max-rV. I suspect a more optimized decode might be better in any situation. Easy to try, but I have no subjective answer yet myself. Then there are shelf filters and NFC. I think shelf filters, as a way to optimize rE/rV, will probably help in most cases too.

Why not simply call this a dual-band decoder?
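For reference, Gerzon's velocity and energy vectors can be computed directly from the decode gains, which makes it easy to see what a dual-band (shelf-filtered) decoder buys over plain cardioids. A small sketch (function names are mine), using g_i = 0.5*(1 + k*cos(az_i - theta)), so that k=1 gives cardioids and k=sqrt(2) gives the 2-D first-order max-rE weighting:

```python
import numpy as np

def decode_gains(theta, speaker_az, k=1.0):
    """Virtual-speaker gains for a plane wave from azimuth `theta`:
    g_i = 0.5 * (1 + k*cos(az_i - theta)).  k=1 is a cardioid decode;
    k=sqrt(2) is the 2-D first-order max-rE weighting."""
    return 0.5 * (1.0 + k * np.cos(speaker_az - theta))

def vector_magnitudes(gains, speaker_az):
    """Magnitudes of Gerzon's velocity (rV) and energy (rE) vectors."""
    u = np.stack([np.cos(speaker_az), np.sin(speaker_az)])  # unit vectors
    rV = (u * gains).sum(axis=1) / gains.sum()
    rE = (u * gains ** 2).sum(axis=1) / (gains ** 2).sum()
    return np.hypot(rV[0], rV[1]), np.hypot(rE[0], rE[1])

az = np.radians([45.0, 135.0, 225.0, 315.0])  # square layout
rv, re = vector_magnitudes(decode_gains(0.0, az), az)            # cardioids
_, re_max = vector_magnitudes(decode_gains(0.0, az, np.sqrt(2.0)), az)
```

For a square of cardioids this gives |rV| = 0.5 and |rE| = 2/3, while k = sqrt(2) pushes |rE| to the 2-D first-order maximum of 1/sqrt(2) at the cost of rV accuracy; a dual-band decoder simply applies the rV-friendly weighting below the shelf crossover and the max-rE weighting above it.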


Finally, thanks for the technical response. This is a technical forum, and we can't work with PR information - which might be misleading, as we have seen. (Speaking about the 3D audio part: I would be happy to hear whether the JauntVR people are doing horizontal-only or 3D decoding. Without this information we don't know what we are talking about!)

Best regards,

Stefan



NFC I think will depend on the situation. If you are using room responses, then NFC at the distance of the speakers in that room might help, otherwise, I don’t know what distance to use. We could choose a default as is sometimes done, but I think that would still be correcting for something that isn’t happening in a binaural system. If others have experience or especially analytical answers to any of these questions, I’d be very interested as it’s something I’m actively working on.


David
VVAudio


It should be stated that our current implementation is very much a
prototype and will require a good deal of refinement and personalization.

It might be both worthwhile and feasible for you to investigate the use of 
individual HRIRs/HRTFs. (Measured OR derived from 3D scans/photographs, as I 
have suggested. The latter method is definitely more complicated, but way 
faster.)



Best regards,

Stefan



Len Moskowitz wrote:

Jaunt VR has developed a virtual reality camera. They're using TetraMic
for recording audio, decoding with headtracking for playback over
headphones and speakers. For video playback they're using the Oculus Rift.

http://time.com/49228/jaunt-wants-to-help-hollywood-make-virtual-reality-movies/
http://gizmodo.com/meet-the-crazy-camera-that-could-make-movies-for-the-oc-1557318674

Citing from this link:

A close-up of the 3D microphone that allows for 3D spatialized audio. If
you're wearing headphones, there's actually headtracking for the Oculus to
tell which direction you're looking--when you change your view, the sound
mix will also change to match, in order to keep the sound in the same space.

I have suggested this possibility before, for example here:

http://comments.gmane.org/gmane.comp.audio.sursound/5172

(obviously thinking of some  < audio-only > application, without any
video. It was already clear that the Oculus Rift included all necessary
hardware for HT audio decoding, although Oculus didn't do this in 2012 or
2013.)

This suggestion led (by influence or coincidence) to some further
developments, which could be followed on the sursound list:

http://comments.gmane.org/gmane.comp.audio.sursound/5387

To be frank: At least two "groups" of people on this list have
demonstrated head-tracked decoding of FOA recently and < before > Jaunt VR,
done in a very similar  fashion. I could name Hector Centeno and Bo-Erik
Sandholm (Bo-Erik introduced the external HT hardware, whereas the Android
app by Hector already existed), further Matthias Kronlachner at IEM Graz.
If not more people...

Far from complaining about this, I would welcome this coincidence or
"coincidence". (The "VR movie" people and "our" list colleagues use
basically the same HT decoder technology,  and maybe even decoding
software.) Because this all shows that Ambisonics is mature enough to be
used even for some very sophisticated applications, if we speak about
cinematic VR demonstrations... (We are all using "the power of HT decoded
FOA,  in VR worlds, VR movies, and maybe even for 3D audio music
recordings"...  ;-) )

Seeing the recent and ongoing development activities in areas like UHD,
Mpeg-H 3D audio aka ISO/IEC 23008-3, gaming, VR, 3D movies and now "VR
movies" (this is not a technical term yet), it is probably a good question
why surround sound/3D audio is used in so many areas, but < still not > for
(published) music recordings. (This situation looks increasingly <
unbelievable >.)


Anyway: Congratulations to Len and TetraMic, who are involved in these
activities!


Now, I have some suggestions for further improvement, to our colleagues
and also Jaunt VR/TetraMic:

If the reference quality for HT binaural systems is about this

http://smyth-research.com/technology.html,

you would still have to employ personalized HRTF (HRIR/BRIR) data sets
in your decoder. (HRIR is anechoic. BRIR includes room acoustics.)

It is probably possible to calculate both HRIR and BRIR data sets from 3D
scans, or even from "plain" photographs. (This has been done at least in
the case of HRIR/HRTF data sets, derived from optical 3D scans or
photographs of the torso/head/ear shapes. Probably there is still ample
space to improve the existing methods to calculate HRIRs/HRTFs from optical
data. For example, you could compare your calculation algorithm against
corresponding real-world acoustical measurements and follow some
evolutionary improvement strategy, matching calculation results and actual
measurements closer and closer after each algorithm generation. Just a
quick idea...)
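As one concrete way to score such a matching loop: a simple log-spectral distance between a calculated and a measured HRIR could serve as the fitness function for the evolutionary strategy. A rough sketch (the metric choice is my assumption, not an established pipeline):

```python
import numpy as np

def log_spectral_distance(hrir_calc, hrir_meas, n_fft=512):
    """RMS difference in dB between the magnitude responses of a
    calculated and a measured HRIR -- a candidate fitness score for
    an evolutionary calculated-vs-measured matching loop."""
    H1 = np.abs(np.fft.rfft(hrir_calc, n_fft)) + 1e-12  # avoid log(0)
    H2 = np.abs(np.fft.rfft(hrir_meas, n_fft)) + 1e-12
    diff_db = 20.0 * np.log10(H1 / H2)
    return np.sqrt(np.mean(diff_db ** 2))
```

A real fitness function would likely also weight the error perceptually (e.g. on an ERB scale) and include interaural time differences, but even a crude distance like this gives the optimizer a gradient to follow.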

To calculate some (reverberant) BRIR data set (the transfer function of some
listener in a room), you could maybe apply some form of acoustical
raytracing.

It would be far easier to < calculate > personalized HRIR/BRIR data sets
than to measure them. (Because acoustical full-sphere measurements would
require measuring hundreds or thousands of different positions, over a
full or at least half 3D sphere.)


Beside the suggestion to investigate the use of individual HRIRs/HRTFs, I
have a direct question to Jaunt VR:

What specific set of HRIRs/HRTFs (or BRIRs?) are you currently using as
part of  your Ambisonics --> head-tracked binaural decoder?

(I would imagine that you will have tested some existing collections, and
chosen some specific set according to your listening results. Because you
are using data sets and probably also software of other people/parties, I
believe it would be fair enough to answer this question. )


Best regards,

Stefan

P.S.: If this is possible, I also would be curious to hear what HT update
frequency you are using for the audio decoder, and maybe to ask some other
questions.






http://www.engadget.com/2014/04/03/jaunt-vr/
Len Moskowitz (mosko...@core-sound.com)
Core Sound LLC
www.core-sound.com
Home of TetraMic

_______________________________________________
Sursound mailing list
Sursound@music.vt.edu
https://mail.music.vt.edu/mailman/listinfo/sursound

