On Wed, Oct 31, 2007 at 05:48:20PM +0100, Alexandre Ratchov wrote:
> On Wed, Oct 31, 2007 at 11:23:31AM -0400, Nick Guenther wrote:
> > On 10/31/07, Brian A Seklecki (Mobile)
> > <[EMAIL PROTECTED]> wrote:
> > > Some *BSD systems are adjusting PCM driver support to allow multiple
> > > process to open /dev/dsp / /dev/audio multiple times in-exclusively,
> > > mitigating the needs for piss-poor software API multiplex'ing solutions
> > > a-la ARTS/ESD.
> > 
> > Oh awesome! Is /Open/BSD one of those?
> > 
> 
> no; character devices (such as /dev/audio) keep per-unit state
> (encoding, rate, ...). To mix multiple audio streams per-stream
> state must be kept. That's why arts/esd/jack/... exist.

You don't need arts/esd/jack because of this. This can be solved in the kernel.
The kernel opens the audio device at the highest sampling rate among those
requested, or, if the rate cannot be switched without an audible glitch, at
the highest rate the hardware supports (48 or 96 kHz, or a user-configured
rate if that would be too much of a burden).

Then if the kernel runs the device at 48 kHz and an app opens it at 44.1 kHz,
the kernel resamples. According to the Nyquist theorem, you first need to
emulate the reconstruction lowpass filter at 22.05 kHz programmatically, and
then resample the output stream at 48 kHz and send it out. This requires an
intermediate stream at the least common multiple of the two rates, which for
this ugly case is 7056 kHz (44.1 kHz * 160 = 48 kHz * 147). But you don't
actually have to shove data at a 7056 kHz sampling rate through the kernel,
since you use only every 147th sample for the output. So some nifty math or a
multipass filter does the job, at the expense of your brain exploding.
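
To make the "nifty math" concrete, here is a minimal Python sketch, not kernel
code and not taken from any existing driver. The names resample_ratio and
resample are made up for illustration, and the filter kernel h is assumed to
be supplied by the caller. It works out the up/down factors from the gcd of
the two rates, then computes only the intermediate samples that are actually
kept, so the zero-stuffed high-rate stream is never built.

```python
from math import gcd

def resample_ratio(rate_in, rate_out):
    """Return (up, down, intermediate_rate) for rational resampling."""
    g = gcd(rate_in, rate_out)
    up = rate_out // g      # zeros-plus-one inserted per input sample
    down = rate_in // g     # keep every down-th intermediate sample
    return up, down, rate_in * up

def resample(x, up, down, h):
    """Resample x by up/down using FIR kernel h (hypothetical helper).

    Conceptually: insert up-1 zeros after each input sample, convolve
    with h, keep every down-th result. Only the kept results are computed.
    """
    n_out = (len(x) * up + down - 1) // down
    y = []
    for n in range(n_out):
        t = n * down  # index into the virtual intermediate stream
        # Input sample k contributes h[t - k*up] * x[k] when that tap exists.
        k_min = max(0, -(-(t - len(h) + 1) // up))  # ceiling division
        k_max = min(len(x) - 1, t // up)
        acc = 0.0
        for k in range(k_min, k_max + 1):
            acc += h[t - k * up] * x[k]
        y.append(acc)
    return y

up, down, mid_rate = resample_ratio(44100, 48000)
print(up, down, mid_rate)
```

Each output sample costs roughly len(h) / up multiplications, not len(h),
because the taps that would have hit stuffed zeros are skipped.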

Now, how is the lowpass done? It has to be really sharp, otherwise you get
an 8-bit-era-like ringing in the sound. It must not have a large delay
(you don't want to hear the explosion in your Quake a second later), and it
must not be computationally expensive either.

The suitable type of filter is called finite impulse response (FIR) and
it's just a naive convolution with a short kernel. Now how do you calculate
this kernel to get the best response? You take your kernel buffer and imagine
it's cyclically wrapped. Then you calculate the ideal response into it
through an FFT - that's perfectly sharp. But the real response won't be
cyclically wrapped: it occurs just once in time and is zero stretching
into both infinities, so we have to fix this.

You take a Hann window http://en.wikipedia.org/wiki/Hann_window and apply it
to the kernel, and you've got it. The Hann window is just one cycle of a sine
wave, plus or minus some pushing around. You don't even need a sin() call per
sample for this; you can calculate it by repeated complex multiplication with
one pre-calculated complex number.
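
A rough Python sketch of that recipe, under some assumptions of mine: the
function names are invented, the kernel length n is odd, the cutoff is given
as a fraction of the sampling rate, and a plain inverse DFT stands in for the
FFT to keep the example self-contained (same result, just slower). The Hann
window is generated by repeatedly multiplying by one precomputed complex
number, so the loop itself calls no trig function.

```python
import cmath
import math

def hann_by_rotation(n):
    """Hann window of n points (n >= 2) from one precomputed complex step."""
    step = cmath.exp(2j * math.pi / (n - 1))  # the only trig evaluation
    z = 1 + 0j
    w = []
    for _ in range(n):
        w.append(0.5 * (1.0 - z.real))  # 0.5 * (1 - cos(2*pi*i/(n-1)))
        z *= step                        # rotate to the next angle
    return w

def ideal_cyclic_kernel(n, cutoff):
    """Inverse DFT of a brick-wall response, centred in an n-point buffer."""
    bins = [1.0 if min(k, n - k) <= cutoff * n else 0.0 for k in range(n)]
    mid = n // 2
    h = []
    for i in range(n):
        t = i - mid  # time relative to the centre tap
        acc = sum(b * cmath.exp(2j * math.pi * k * t / n)
                  for k, b in enumerate(bins))
        h.append(acc.real / n)
    return h

def design_lowpass(n, cutoff):
    """Window the cyclic ideal kernel and normalise it to unity gain at DC."""
    ideal = ideal_cyclic_kernel(n, cutoff)
    win = hann_by_rotation(n)
    h = [a * b for a, b in zip(ideal, win)]
    total = sum(h)
    return [v / total for v in h]

kernel = design_lowpass(31, 0.25)
```

Rounding error in the rotation grows slowly with n; for the short kernels
discussed here it is far below audibility, but a very long window might want
the step re-derived every so often.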

How long do you make your kernel? The longer the kernel, the more
computationally intensive, but the sharper the transition so you lose less of
the high frequencies.
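
To put rough numbers on that trade-off, the following Python sketch compares
a few kernel lengths. It is illustrative only: the helper names are mine, the
kernel here is a closed-form Hann-windowed sinc (not the FFT construction),
the cutoff of 0.25 and the probe frequency of 0.30 (both fractions of the
sampling rate) are arbitrary choices, and the delay figure assumes a
symmetric kernel running at 48 kHz.

```python
import cmath
import math

def hann_sinc(n, cutoff):
    """Hann-windowed sinc lowpass; cutoff is a fraction of the sampling rate."""
    mid = (n - 1) / 2.0
    h = []
    for i in range(n):
        t = i - mid
        if t == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        h.append(ideal * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1))))
    total = sum(h)
    return [v / total for v in h]  # unity gain at DC

def gain(h, freq):
    """Magnitude of the frequency response at freq (fraction of the rate)."""
    return abs(sum(v * cmath.exp(-2j * math.pi * freq * i)
                   for i, v in enumerate(h)))

def delay_ms(n, rate_hz):
    """Delay of a symmetric n-tap kernel, in milliseconds."""
    return (n - 1) / 2.0 / rate_hz * 1000.0

for n in (15, 63, 255):
    h = hann_sinc(n, 0.25)
    print(n, "taps: gain just above cutoff =", gain(h, 0.30),
          " delay =", delay_ms(n, 48000), "ms")
```

The longer kernels let less through just above the cutoff, while the delay
and the per-sample work both grow linearly with the number of taps.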

mplayer has to do this stuff all the time so it's full of this code. It does it
not only to accommodate various sample rates, but also when you slow down or
speed up your video. Maybe the code could be taken from mplayer.

CL<

> 
> -- Alexandre
