Dear Hann Long Kwan,

> I am student working on a school project to implement MS stereo on dist10
> code. I am new to mp3. I tried MS sparsing based on ISO 11172-3 which does not
> seem to work too well. I had KIV it. Now, I am trying to implement it using
> the ISO 13818-7 MS stereo part. I am not very sure how to implement the
> algorithm on mp3.

ISO 13818-7 is actually MPEG-2 AAC, and there are some small differences that I'll explain later...

> Here is my algorithm:
> 1. Calculate and store nb(b) and en(b) for L/R and M/S as in calculate
> threshold(part 1). However when calculating M/S threshold, do we use the minimum
> c(w) values from L/R. Is that the tonality index for L/R?

According to 13818-7, in each psychoacoustic band you need to use the minimum of C[w] and the maximum of TB[b] of the L and R channels for both the M and S channels, where w is the FFT frequency index and b is the partition (psychoacoustic) band index.

> 2. Using the raw threshold and energies from M/S and L/R, calculate bmax and
> all final thresholds.

Yes, but note that the ISO docs recommend doing imaging every time, while some experiments show that imaging is required only if the difference between the L and R thresholds is less than some value. LAME uses 1.58, for example. Check out psymodel.c in the LAME distribution; imaging is implemented somewhere in there.

> 3. Are the final threshold used to calculate the pre-echos controls? As in
> the nb(b) in step 11 as stated in the standards?

No, the M/S thresholds are used only for coding; pre-echo control is applied only to the L and R thresholds.

> 4. We now have 4 finals threshold ie L/R and M/S. Which ones do we use to
> calculate the ratios in l3psy.c in mp3?

Now, here comes the tricky part. 13818-7 does not say anything about choosing when to use LR and when to use MS. Beware: the difference is that in AAC you can switch to M/S on a per-scalefactor-band basis. MP3, however, allows switching to MS only for an entire frame (this is a flaw in the MP3 design, caused by some misunderstandings in the MPEG committee, but it is way too late to talk about this ;-)

Anyway, one straightforward idea is to use the following algorithm:

1. Compute the difference between the L and R thresholds
2. Compute the PE (perceptual entropy) for the L+R and M+S channels

Switch to M/S mode if and only if the difference between L and R is smaller than some value (say 5 dB) AND the M+S PE is smaller than the L+R PE.

Again, check LAME's psymodel.c for more details...

Regards,
Ivan Dimkovic

_______________________________________________
mp3encoder mailing list
[EMAIL PROTECTED]
http://minnie.tuhs.org/mailman/listinfo/mp3encoder
