On May 16, 2014, at 7:54 AM, Stefan Schreiber <st...@mail.telepac.pt> wrote:

> 
> Different people keep telling me that anechoic HRIR/HRTF sets don't sound 
> very good (if applied to deliver virtual surround via headphones); 
> preferably you should use BRIRs (reverberant, or < non-anechoic >  O:-) ). This 
> is not what film sound people would like to do, because the room acoustics 
> and "soundscapes" in different movie scenes usually change a lot. The 
> different acoustical environments would have to be captured via the SFM, or 
> synthesized (DAW).
> 
> It might well be the case that "dry" HRIR sets don't work well for normal 
> headphones.
> 
> In < your > case, things seem to work:
> 
> 

I have also read in multiple sources, particularly Bruce Wiggins, that adding 
room impulse responses helps with the perception.  Perhaps it’s easier to 
convince someone that they are sitting in a small room listening to a set of 
surround sound speakers when they are, in fact, sitting in a small room 
listening to a pair of headphones.  But add the VR visuals?  I think everyone 
expected the visuals to help lock in the audio and vice versa, but I do wonder 
whether the room impulses would still help.

Mostly, I didn’t end up adding any room responses, for reasons closer to what 
Adam said: given the different environments the system is attempting to 
convey, it’s not clear what room you would use.  However, I plan to make this 
same library available for other purposes, both open source and commercial, 
and I expect that adding room responses would be useful to many applications, 
so I will probably add that feature.  Perhaps when I do, the kind folks at 
Jaunt will let us know how it works when combined with visuals matching the 
original environment.

> 
>> The decoder consists of four
>> virtual cardioids spaced at 90° offsets, and the HRIR at the appropriate
>> angle & ear is applied to each.  I'm essentially describing his code though
>> so he can chime in with additional details.
>> 
> 
> This is actually not quite the case... Nevertheless, this is a smart attempt 
> to understand how a soundfield mike might work, and how an SFM recording 
> might be decoded to a binaural representation for headphones.   ;-)

Actually that’s just about exactly what my decoder is doing, and how some of 
the other B-format to binaural systems I read about seem to work.  There are 
details, of course.  For example, the decoder can easily be changed to use more 
and different virtual speakers as long as it can find a matching HRIR to use, 
and it is possible to read any of the samples from the Listen database.  But in 
my own listening tests, such as they are, I never found that adding more than 
four speakers helped.  It just seems to get muddy.  I really wanted to add 
height; in fact, the preset is in the code, but I don’t think it sounds better, 
at least the way I’m doing it so far.
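
For anyone who wants to see the shape of it, here is roughly what such a 
decoder looks like.  This is only a minimal sketch, not my actual code: the 
FuMa weighting on W, the choice of corner azimuths, and the load_hrir() helper 
are all assumptions for illustration.

    import numpy as np
    from scipy.signal import fftconvolve

    def rotate_yaw(X, Y, yaw_rad):
        """Head tracking in B-format is just a soundfield rotation:
        counter-rotate X/Y by the listener's yaw before decoding."""
        Xr = X * np.cos(yaw_rad) + Y * np.sin(yaw_rad)
        Yr = -X * np.sin(yaw_rad) + Y * np.cos(yaw_rad)
        return Xr, Yr

    def virtual_cardioid(W, X, Y, az_rad):
        """First-order virtual cardioid steered to az_rad, from
        horizontal B-format.  Assumes the FuMa convention, where W
        carries a -3 dB gain, hence the sqrt(2) on W."""
        return 0.5 * (np.sqrt(2.0) * W
                      + X * np.cos(az_rad) + Y * np.sin(az_rad))

    def decode_binaural(W, X, Y, load_hrir):
        """Square decode: four virtual cardioids at 90-degree offsets,
        each convolved with the HRIR pair for its direction and summed
        per ear.  load_hrir(az_deg) is a hypothetical helper returning
        (left_ir, right_ir) of one common length, e.g. pulled from the
        Listen database."""
        left, right = 0.0, 0.0
        for az_deg in (45.0, 135.0, 225.0, 315.0):
            feed = virtual_cardioid(W, X, Y, np.radians(az_deg))
            ir_l, ir_r = load_hrir(az_deg)
            left = left + fftconvolve(feed, ir_l)
            right = right + fftconvolve(feed, ir_r)
        return left, right

Changing the speaker count or layout is then just a different list of 
azimuths, with a matching HRIR lookup for each.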

My questions are around the details of a square decode.  Cardioids, of course, 
are neither max-rE nor max-rV patterns, and I suspect a more optimized decode 
might be better in any situation.  It’s easy to try, but I have no subjective 
answer yet myself.  Then there are shelf filters and NFC (near-field 
compensation).  I think shelf filters, as a way to optimize rE/rV, will 
probably help in most cases too.  NFC, I think, will depend on the situation: 
if you are using room responses, then NFC at the distance of the speakers in 
that room might help; otherwise, I don’t know what distance to use.  We could 
choose a default, as is sometimes done, but I think that would still be 
correcting for something that isn’t happening in a binaural system.  If others 
have experience with, or especially analytical answers to, any of these 
questions, I’d be very interested, as it’s something I’m actively working on.
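
To make the rE point concrete: for a horizontal first-order decode, the usual 
dual-band trick is to keep rV (unity) weighting at low frequencies and move to 
max-rE weighting above a crossover, which in 2D means scaling the first-order 
components by cos(45°) ≈ 0.707 relative to W (it would be about 0.577 for a 
full 3D decode).  A minimal sketch of that idea follows; the 700 Hz crossover 
and the simple Butterworth band split are illustrative assumptions, not a 
recommendation.

    import numpy as np
    from scipy.signal import butter, sosfilt

    def shelf_max_re(W, X, Y, fs, crossover_hz=700.0):
        """Dual-band weighting for a first-order horizontal decode:
        unity (rV) weights below the crossover, max-rE weights above.
        The weighted W/X/Y then feed the same virtual-speaker decode
        as before."""
        g1_hi = np.cos(np.pi / 4.0)  # 2D first-order max-rE gain
        sos_lo = butter(2, crossover_hz, 'lowpass', fs=fs, output='sos')
        sos_hi = butter(2, crossover_hz, 'highpass', fs=fs, output='sos')

        def shelve(sig, hi_gain):
            # Pass the low band through and scale the high band.
            return sosfilt(sos_lo, sig) + hi_gain * sosfilt(sos_hi, sig)

        # W keeps unity gain in both bands; X/Y are shelved down above
        # the crossover so the decode tends toward max rE there.
        return W, shelve(X, g1_hi), shelve(Y, g1_hi)

NFC would be another per-order filter applied in the same place, which is why 
the "what distance?" question matters: the filter’s corner frequency depends 
directly on the assumed speaker distance.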


David
VVAudio


> 
>> It should be stated that our current implementation is very much a
>> prototype and will require a good deal of refinement and personalization.
>> 
> 
> It might be both worthwhile and feasible for you to investigate the use of 
> individual HRIRs/HRTFs. (Measured OR derived from 3D scans/photographs, as I 
> have suggested. The latter method is definitely more complicated, but way 
> faster.)



> 
> 
> Best regards,
> 
> Stefan
> 
> 
> 
>> 
>> 
>>> Len Moskowitz wrote:
>>> 
>>>> Jaunt VR has developed a virtual reality camera. They're using TetraMic
>>>> for recording audio, decoding with headtracking for playback over
>>>> headphones and speakers. For video playback they're using the Oculus Rift.
>>>> 
>>>> http://time.com/49228/jaunt-wants-to-help-hollywood-make-virtual-reality-movies/
>>>> http://gizmodo.com/meet-the-crazy-camera-that-could-make-movies-for-the-oc-1557318674
>>>> 
>>> Citing from this link:
>>> 
>>>> A close-up of the 3D microphone that allows for 3D spatialized audio. If
>>>> you're wearing headphones, there's actually headtracking for the Oculus to
>>>> tell which direction you're looking--when you change your view, the sound
>>>> mix will also change to match, in order to keep the sound in the same 
>>>> space.
>>>> 
>>> I have suggested this possibility before, for example here:
>>> 
>>> http://comments.gmane.org/gmane.comp.audio.sursound/5172
>>> 
>>> (obviously thinking of some < audio-only > application, without any
>>> video. It was already clear that the Oculus Rift included all necessary
>>> hardware for HT audio decoding, although Oculus didn't do this in 2012 or
>>> 2013.)
>>> 
>>> This suggestion led (by influence or coincidence) to some further
>>> developments, which could be followed on the sursound list:
>>> 
>>> http://comments.gmane.org/gmane.comp.audio.sursound/5387
>>> 
>>> To be frank: At least two "groups" of people on this list have
>>> demonstrated head-tracked decoding of FOA recently and < before > Jaunt VR,
>>> done in a very similar fashion. I could name Hector Centeno and Bo-Erik
>>> Sandholm (Bo-Erik introduced the external HT hardware, whereas the Android
>>> app by Hector already existed), and also Matthias Kronlachner at IEM Graz.
>>> If not more people...
>>> 
>>> Far from complaining about this, I would welcome this coincidence or
>>> "coincidence". (The "VR movie" people and "our" list colleagues use
>>> basically the same HT decoder technology, and maybe even the same decoding
>>> software.) This all shows that Ambisonics is mature enough to be used even
>>> for some very sophisticated applications, if we speak about cinematic VR
>>> demonstrations... (We are all using "the power of HT-decoded FOA, in VR
>>> worlds, VR movies, and maybe even for 3D audio music recordings"...  ;-) )
>>> 
>>> Seeing the recent and ongoing development activities in areas like UHD,
>>> MPEG-H 3D Audio aka ISO/IEC 23008-3, gaming, VR, 3D movies and now "VR
>>> movies" (this is not a technical term yet), it is probably a good question
>>> why surround sound/3D audio is used in so many areas, but < still not > for
>>> (published) music recordings. (This situation looks increasingly
>>> < unbelievable >.)
>>> 
>>> 
>>> Anyway: Congratulations to Len and TetraMic, who are involved in these
>>> activities!
>>> 
>>> 
>>> Now, I have some suggestions for further improvement, to our colleagues
>>> and also Jaunt VR/TetraMic:
>>> 
>>> If the reference quality for HT binaural systems is about this
>>> 
>>> http://smyth-research.com/technology.html,
>>> 
>>> you would still have to incorporate personalized HRTF (HRIR/BRIR) data sets
>>> into your decoder. (HRIR is anechoic; BRIR includes room acoustics.)
>>> 
>>> It is probably possible to calculate both HRIR and BRIR data sets from 3D
>>> scans, or even from "plain" photographs. (This has been done at least in
>>> the case of HRIR/HRTF data sets, derived from optical 3D scans or
>>> photographs of the torso/head/ear shapes. There is probably still ample
>>> room to improve the existing methods for calculating HRIRs/HRTFs from
>>> optical data. For example, you could compare your calculation algorithm
>>> against corresponding real-world acoustical measurements and follow some
>>> evolutionary improvement strategy, matching calculation results and actual
>>> measurements more and more closely with each algorithm generation. Just a
>>> quick idea...)
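
A natural error measure for the matching loop Stefan describes would be a 
log-spectral distance between measured and calculated responses, averaged 
over all measured directions.  A minimal sketch of that score (the FFT length 
is arbitrary here, and real work would add smoothing and perceptual 
weighting):

    import numpy as np

    def log_spectral_distance(hrir_measured, hrir_calculated, n_fft=512):
        """RMS difference, in dB, between the magnitude responses of a
        measured and a calculated HRIR for one direction and ear.  An
        evolutionary strategy would try to drive the average of this
        score over all directions toward zero with each generation."""
        H_m = np.abs(np.fft.rfft(hrir_measured, n_fft)) + 1e-12
        H_c = np.abs(np.fft.rfft(hrir_calculated, n_fft)) + 1e-12
        diff_db = 20.0 * np.log10(H_m / H_c)
        return np.sqrt(np.mean(diff_db ** 2))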
>>> 
>>> To calculate some (reverberant) BRIR data set (the transfer function of
>>> some listener in a room), you could maybe apply some form of acoustical
>>> raytracing.
>>>
>>> It would be far easier to < calculate > personalized HRIR/BRIR data sets
>>> than to measure them. (Because acoustical full-sphere measurements would
>>> require measuring hundreds or thousands of different positions, over a
>>> full or at least half 3D sphere.)
>>> 
>>> 
>>> Beside the suggestion to investigate the use of individual HRIRs/HRTFs, I
>>> have a direct question to Jaunt VR:
>>> 
>>> What specific set of HRIRs/HRTFs (or BRIRs?) are you currently using as
>>> part of your Ambisonics --> head-tracked binaural decoder?
>>> 
>>> (I would imagine that you have tested some existing collections and chosen
>>> a specific set according to your listening results. Because you are using
>>> data sets, and probably also software, of other people/parties, I believe
>>> it would be fair enough to answer this question.)
>>> 
>>> 
>>> Best regards,
>>> 
>>> Stefan
>>> 
>>> P.S.: If this is possible, I would also be curious to hear what HT update
>>> frequency you are using for the audio decoder, and maybe to ask some other
>>> questions.
>>> 
>>>> http://www.engadget.com/2014/04/03/jaunt-vr/
>>>>
>>>> Len Moskowitz (mosko...@core-sound.com)
>>>> Core Sound LLC
>>>> www.core-sound.com
>>>> Home of TetraMic
>>>> 
>>> 
> 

_______________________________________________
Sursound mailing list
Sursound@music.vt.edu
https://mail.music.vt.edu/mailman/listinfo/sursound
