[Sursound] Surround formats and lossy compression

2013-04-05 Thread Eric Carmichel
-so-average” listener hears. Perhaps the “missing” information in lossy
compression schemes provides useful or subtle information to those whose
perception isn’t normal or average. Furthermore, we don’t know how adding
masking noise (speech or weighted noise) to material reproduced from MP3s
might affect an outcome. Here’s an interesting experiment: convert a stereo
MP3 to wav (you’re not gaining anything... yet), flip the polarity of one
channel (i.e., a 180-degree “phase” change, but without moving the time
line), and mix the channels to create a 50/50 mono mixdown. In many
instances, you’ll hear odd artifacts that aren’t explained by simple phase
cancellations. In other words, mixing the original source material (master
tracks, not the MP3) down to mono won’t give rise to the artifacts. So,
there’s something about the encoding or decoding that affects files in
unpredictable ways when changing back to a “lossless” (e.g., wav) file.
Because there are audiometric test protocols that rely on phase flipping or
on combining signal and noise, I’d most certainly avoid lossy compression
schemes. If the tests were as simple as speech detection thresholds, I don’t
foresee any harm in using MP3 files. But for differential diagnoses,
research, etc., stick with lossless files, whether analog, digital, wav, or
FLAC.
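
For anyone who wants to try the null test above, here’s a minimal Python
sketch (file names are placeholders; it assumes the soundfile package, and
that the MP3 has already been decoded to a stereo wav by your tool of
choice):

import soundfile as sf

# Stereo wav decoded from the MP3 under test (decode step done elsewhere).
x, fs = sf.read("decoded_from_mp3.wav")  # x has shape (num_samples, 2)

# Flip the polarity of the right channel (a 180-degree inversion, with no
# shift along the time line), then mix the two channels 50/50 to mono.
mono = 0.5 * (x[:, 0] - x[:, 1])

sf.write("null_test_mono.wav", mono, fs)

Repeat the same mixdown on the original master tracks; artifacts that
survive in the MP3-derived null but not in the master-derived null are the
codec’s contribution, not simple phase cancellation.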

In summary, my reason for not recommending MP3s is that they are already
“psychoacoustically tainted” and not equivalent to the actual stimuli, even
if normal-hearing listeners perceive them as equivalent. Frequency response
isn’t the culprit. And with today’s technology, there’s very little reason
to “conserve” memory just to keep speech (wav) files small.

*Additional notes

MP3 processing may not entirely remove a sound that is otherwise masked;
instead, the resolution, or bit depth, can be greatly reduced in instances
of “unheard” sounds. Simply re-sampling a file to a lower sampling rate, by
contrast, is a linear reduction: re-sampling a wav file from 44.1 kHz to 32
kHz gives a 32/44.1 = 0.726 reduction in file size. Discussions of
re-sampling among audio geeks, by the way, get into the ultra-boring topics
of dithering (ever heard “dithering down” used by recording engineers?),
noise shaping, filter types, blah blah, but only a small percentage of the
people who like to toss these words around can do the math.
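
Since “dithering down” came up, here’s a toy sketch (a hedged illustration,
not production code) of requantizing to 16 bits with TPDF dither, assuming
numpy and soundfile; real mastering tools typically add noise shaping on
top of this:

import numpy as np
import soundfile as sf

x, fs = sf.read("master_24bit.wav")  # read as floats in [-1.0, 1.0)

q = 1.0 / 32768.0  # one 16-bit quantization step (1 LSB)

# TPDF dither: the sum of two independent uniform noises, each up to
# half an LSB, giving a triangular distribution of +/- 1 LSB peak.
dither = (np.random.uniform(-0.5, 0.5, x.shape)
          + np.random.uniform(-0.5, 0.5, x.shape)) * q

# Requantize: add the dither, then round to the nearest 16-bit step.
y = np.clip(np.round((x + dither) / q) * q, -1.0, 1.0 - q)

sf.write("dithered_16bit.wav", y, fs, subtype="PCM_16")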

I can’t state that I’ve generated a pure tone, saved it as a wav file,
converted it to MP3 Pro, and then examined how many bits were actually
being used to create the sinusoid. Opening the MP3 in a wav editor such as
Sound Forge or Audition probably doesn’t allow one to “see” how the MP3 is
being operated on in order to edit or play the file (again, MP3 compression
is proprietary as well as lossy). Zooming in on the MP3 will probably
reveal 16 bits per sample, yet the file reduction is considerable (approx.
10x smaller than the wav file it was derived from).
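
The round-trip itself is easy to set up, though. A sketch, assuming
numpy/soundfile plus the lame encoder on your path (lame handles plain MP3,
not the proprietary MP3 Pro variant):

import subprocess
import numpy as np
import soundfile as sf

fs = 44100
t = np.arange(fs * 5) / fs                 # 5 seconds of samples
tone = 0.5 * np.sin(2 * np.pi * 1000 * t)  # 1 kHz pure tone, approx -6 dBFS
sf.write("tone.wav", tone, fs, subtype="PCM_16")

# Encode to MP3 at 128 kbps, then decode back to wav.
subprocess.run(["lame", "-b", "128", "tone.wav", "tone.mp3"], check=True)
subprocess.run(["lame", "--decode", "tone.mp3", "tone_rt.wav"], check=True)

y, _ = sf.read("tone_rt.wav")
# The decoded file is padded by the encoder/decoder delay, so lengths differ.
print("original samples:", len(tone), "decoded samples:", len(y))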

When it comes to bit depth and sample rate, one of the biggest reasons for not 
using mega-fidelity files (24-bit, 96 kS/s) isn’t one of memory allocation, but 
one of battery use. Yep, the processing power needed for super audio files is 
greater than for lower-fidelity files. Apple (so I’m told) limits sample rate 
based on power consumption, not memory used. If you really want to open a can 
of worms, get the audio geeks to argue over 16- and 24-bit audio files. I put 
more merit in bit depth than sampling rate, but mostly for reasons having to do 
with dynamic range.

Lossless compression codecs require processing power, too, but unless you’re 
doing audiometry in the field and power is at a premium, there shouldn’t be any 
problems regarding power. There may be an intrinsic latency when presenting 
material, but this would be on the order of milliseconds (or microseconds). 
Latency would only be a problem if other time-sensitive processing was involved 
(e.g., the use of VST or RTAS plug-ins for research). I really can’t think of 
practical reasons not to use FLAC files. A lot of this gets back to the quality 
of the master material, and what software was used to convert to FLAC or 
whatever.
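
If you want to put a number on the decode cost, timing a FLAC read is
trivial; a sketch assuming soundfile (whose libsndfile back end decodes
FLAC natively):

import time
import soundfile as sf

t0 = time.perf_counter()
x, fs = sf.read("stimulus.flac")
elapsed = time.perf_counter() - t0
print(f"decoded {len(x) / fs:.1f} s of audio in {elapsed * 1000:.1f} ms")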

Hope this isn’t too confusing.

Best,
Eric C.


Re: [Sursound] Surround formats and lossy compression

2013-04-05 Thread Eric Benjamin


[Sursound] A format on zoom H2N

2013-04-05 Thread umashankar manthravadi
I have just posted a recording on SoundCloud made with a newly modified
Zoom H2n: http://soundcloud.com/umashankar-manthravadi/ambitest-02.
SoundCloud turned the B-format file into a mono file. Is there anything I
can do to make the B-format file available (apart from putting it on live
drive)?

umashankar

  


Re: [Sursound] Surround formats and lossy compression

2013-04-05 Thread Eric Carmichel
Hi Eric B.,
Thanks for your detailed and informative reply.
While drafting a recent Sursound post regarding KEMAR, I did a Google
search to make sure I was spelling Zwislocki correctly. One reference that
appeared at the top had something to do with inner-ear simulation software;
I need to go back and check it out. Anyway, I don't know whether the "front
end" of available inner-ear simulation software allows the user to study
neural coding with an arbitrary source or audio file. Analyzing complex
sounds at the neural level (particularly the innervation of inner hair
cells) would require a lot of data-logging channels, or replaying the
stimulus over and over while systematically moving the 'logger' to the many
virtual receptors.

If such a simulation exists, one might be able to measure differences (at
the neural level) between a wav file and its mp3 counterpart. Considering
that neural firing appears to be a lot more complicated than basilar
membrane motion alone (which is primarily mechanical, except for motile
outer hair cell contributions to membrane elasticity), we might expect to
use statistical measures to decide what significant differences, if any,
exist. Even then, this wouldn't provide conclusive evidence when it comes
to perception.

I realize there has been a long-standing debate over audio file types and
rates, but my guess is that the subjects used in such studies consisted of
normal-hearing young people with no familial history of hearing loss.
Normal hearing generally means thresholds of 10 dB HL or better at the
audiometric test frequencies (the highest test frequency being 8 kHz). Such
screening measures are generally employed to ensure a 'normal' (and
homogeneous) population. Perhaps people claiming to possess 'golden ears'
have participated in mp3 vs. wav studies, too. But this kind of sidesteps
the subtle issue of outliers, or people who have (for example) auditory
processing disorders or other abnormalities that are independent of hearing
thresholds. We have a reasonable grasp on how mammals hear (the
physiological aspect), but we don't know a whole lot about how we listen.

Of course, just as with hybrid mixing, the way to avoid potential pitfalls
or danger in a research or clinical environment is to avoid the lossy file
types altogether.
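
Short of a neural-level simulation, a first-pass, purely waveform-level
comparison is straightforward; a sketch (file names are placeholders; it
assumes numpy/soundfile, matching channel counts, and that the pair has
been time-aligned, since MP3 encoders pad the start):

import numpy as np
import soundfile as sf

ref, fs = sf.read("master.wav")
test, _ = sf.read("decoded_mp3.wav")  # the MP3, decoded to wav beforehand

# Trim to a common length and form the codec residual.
n = min(len(ref), len(test))
residual = ref[:n] - test[:n]

def rms(s):
    return float(np.sqrt(np.mean(s ** 2)))

# RMS level of the residual relative to the reference, in dB.
print(f"codec residual: {20 * np.log10(rms(residual) / rms(ref[:n])):.1f} dB")

Of course, a small residual at the waveform level still says nothing about
perception, which is the whole point above.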
Best,
Eric Carmichel (also not to be confused with Eric Carmichael--my last name has 
an unusual spelling. Cheers!)




___
Sursound mailing list
Sursound@music.vt.edu
https://mail.music.vt.edu/mailman/listinfo/sursound