Dear Hann Long Kwan,

>
> I am student working on a school project to implement MS stereo on dist10
> code. I am new to mp3. I tried MS sparsing based on ISO 11172-3 which does
> not seem to work too well. I had KIV it. Now, I am trying to implement it
> using the ISO 13818-7 MS stereo part. I am not very sure how to implement
> the algorithm on mp3.
>

ISO 13818-7 is actually MPEG-2 AAC, and there are some small differences
that I'll explain later...

> Here is my algorithm:
> 1. Calculate and store nb(b) and en(b) for L/R and M/S as in calculate
> threshold(part 1). However when calculating M/S threshold, do we use the
> minimum c(w) values from L/R. Is that the tonality index for L/R?

According to 13818-7, in each psychoacoustic band you need to use the
minimum of C[w] and the maximum of TB[b] of the L and R channels for both
the M and S channels, where w is the FFT frequency index and b is the
partition (psychoacoustic) band index.
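
A rough sketch of that combination in C, in case it helps. In the psy model
notation c(w) is the unpredictability measure and tb(b) is the tonality
index. The function names and the flat array layout are my own assumptions
for illustration; they are not taken from dist10 or LAME.

```c
#include <stddef.h>

/* Hypothetical helper: unpredictability c(w) shared by the M and S channels,
   taken per FFT line as the minimum of the L and R values. */
static void ms_unpredictability(const double *c_l, const double *c_r,
                                double *c_ms, size_t n_lines)
{
    for (size_t w = 0; w < n_lines; w++)
        c_ms[w] = (c_l[w] < c_r[w]) ? c_l[w] : c_r[w];
}

/* Hypothetical helper: tonality index tb(b) shared by the M and S channels,
   taken per partition as the maximum of the L and R values. */
static void ms_tonality(const double *tb_l, const double *tb_r,
                        double *tb_ms, size_t n_parts)
{
    for (size_t b = 0; b < n_parts; b++)
        tb_ms[b] = (tb_l[b] > tb_r[b]) ? tb_l[b] : tb_r[b];
}
```

The same c_ms[] and tb_ms[] arrays would then feed the threshold
calculation for both M and S.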

> 2. Using the raw threshold and energies from M/S and L/R, calculate bmax
> and all final thresholds.

Yes. The ISO docs recommend doing imaging every time, but some experiments
show that imaging is required only if the difference between the L and R
thresholds is less than some value. LAME uses 1.58, for example. Check out
psymodel.c in the LAME distribution; imaging is implemented somewhere in
that file.
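
As a sketch of that gate only: if the 1.58 is read as a linear power ratio,
it corresponds to roughly 2 dB (10*log10(1.58) is about 1.99). The function
name below is hypothetical, and the imaging adjustment itself is not shown;
take that from the standard or from the LAME source.

```c
/* Hypothetical gate: returns 1 when the L and R thresholds of one band are
   within a factor `ratio` of each other in both directions, 0 otherwise.
   Assumes positive, power-domain thresholds. */
static int lr_thresholds_close(double thr_l, double thr_r, double ratio)
{
    return thr_l <= ratio * thr_r && thr_r <= ratio * thr_l;
}
```

You would call it per band, e.g. lr_thresholds_close(thr_l[b], thr_r[b],
1.58), and apply imaging to the M and S thresholds only when it returns 1.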

> 3. Are the final threshold used to calculate the pre-echos controls? As in
> the nb(b) in step 11 as stated in the standards?

No, the M/S thresholds are used only for coding; pre-echo control is applied
only to the L and R thresholds.

> 4. We now have 4 finals threshold ie L/R and M/S. Which ones do we use to
> calculate the ratios in l3psy.c in mp3?
>

Now, here comes the tricky part. 13818-7 does not say anything about
choosing when to use L/R and when to use M/S. Beware: the difference is that
in AAC you can switch to M/S on a per-scalefactor-band basis. MP3, however,
allows switching to M/S only for an entire frame (this is a flaw in the MP3
design, due to some misunderstandings in the MPEG committee, but it is way
too late to talk about this ;-).

Anyway, one straightforward idea is to use the following algorithm:

1. Compute the difference between the L and R thresholds
2. Compute the PE (perceptual entropy) for the L+R and M+S channels

Switch to M/S mode if and only if the difference between the L and R
thresholds is smaller than some value (say 5 dB) AND the M+S PE is smaller
than the L+R PE. Again, check LAME psymodel.c for more details...
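
A minimal sketch of that per-frame decision in C. How the L/R difference is
aggregated over the bands is my assumption here (the largest ratio over all
bands); the function names are hypothetical too. To avoid a log10 call the
limit is given as a power ratio: 5 dB is about 3.16, since 10^(5/10) is
roughly 3.162.

```c
#include <stddef.h>

/* Hypothetical helper: largest L/R threshold ratio (always >= 1) over all
   bands. Assumes positive, power-domain thresholds. */
static double max_lr_threshold_ratio(const double *thr_l, const double *thr_r,
                                     size_t n_bands)
{
    double worst = 1.0;
    for (size_t b = 0; b < n_bands; b++) {
        double r = thr_l[b] / thr_r[b];
        if (r < 1.0)
            r = 1.0 / r;          /* make the ratio symmetric in L and R */
        if (r > worst)
            worst = r;
    }
    return worst;
}

/* Returns 1 to code the whole frame as M/S, 0 to keep L/R.
   pe_lr is the PE summed over L and R, pe_ms the PE summed over M and S. */
static int use_ms_for_frame(double lr_ratio, double pe_lr, double pe_ms,
                            double max_ratio)
{
    return lr_ratio < max_ratio && pe_ms < pe_lr;
}
```

For example, use_ms_for_frame(max_lr_threshold_ratio(thr_l, thr_r, n),
pe_lr, pe_ms, 3.16) gives the frame-level switch described above.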

Regards,
Ivan Dimkovic



_______________________________________________
mp3encoder mailing list
[EMAIL PROTECTED]
http://minnie.tuhs.org/mailman/listinfo/mp3encoder