Hans Georg Krauthaeuser wrote:
> Hi All,
>
> I was playing with scipy.stats.itemfreq when I observed the following
> overflow:
>
> In [119]:for i in [254,255,256,257,258]:
>    .....:     l=[0]*i
>    .....:     print i, stats.itemfreq(l), l.count(0)
>    .....:
> 254 [ [ 0 254]] 254
> 255 [ [ 0 255]] 255
> 256 [ [0 0]] 256
> 257 [ [0 1]] 257
> 258 [ [0 2]] 258
>
> itemfreq is pretty small (in stats.py):
>
> ----------------------------------------------------------------------
> def itemfreq(a):
>     """
>     Returns a 2D array of item frequencies.  Column 1 contains item values,
>     column 2 contains their respective counts.  Assumes a 1D array is passed.
>
>     Returns: a 2D frequency table (col [0:n-1]=scores, col n=frequencies)
>     """
>     scores = _support.unique(a)
>     scores = sort(scores)
>     freq = zeros(len(scores))
>     for i in range(len(scores)):
>         freq[i] = add.reduce(equal(a,scores[i]))
>     return array(_support.abut(scores, freq))
> ----------------------------------------------------------------------
>
> It seems that add.reduce is the source for the overflow:
>
> In [116]:from scipy import *
>
> In [117]:for i in [254,255,256,257,258]:
>    .....:     l=[0]*i
>    .....:     print i, add.reduce(equal(l,0))
>    .....:
> 254 254
> 255 255
> 256 0
> 257 1
> 258 2
>
> Is there any possibility to avoid the overflow?
>
> BTW:
> Python 2.3.5 (#2, Aug 30 2005, 15:50:26)
> [GCC 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)] on linux2
>
> scipy_version.scipy_version  -->  '0.3.2'
>
>
> Thanks and best regards
> Hans Georg Krauthäuser

After some further investigation:

In [150]:add.reduce(array(equal([0]*256,0),typecode='l'))
Out[150]:256

In [151]:add.reduce(equal([0]*256,0))
Out[151]:0

The problem occurs with arrays with typecode 'b' (as returned by equal).
Workaround patch for itemfreq is obvious, but ... is it a bug or a feature?

regards
Hans Georg
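
For anyone who wants to see the effect without scipy 0.3.2, here is a minimal sketch in present-day NumPy. This is an assumption on my side, not the code path from the session above: current NumPy widens small integers in add.reduce by default, so the 8-bit accumulator has to be requested explicitly with dtype=np.uint8 to mimic the typecode 'b' behaviour (the wrap at 256 is consistent with an unsigned 8-bit accumulator). The names count_equal_8bit, count_equal_wide and itemfreq_wide are hypothetical helpers for illustration only; itemfreq_wide is just one possible shape of the "obvious" workaround, not the actual stats.py patch.

```python
import numpy as np


def count_equal_8bit(a, value):
    """Count matches with an 8-bit accumulator (assumed stand-in for typecode 'b').

    The running sum is kept in uint8, so it wraps modulo 256.
    """
    mask = np.equal(a, value).astype(np.uint8)
    return int(np.add.reduce(mask, dtype=np.uint8))


def count_equal_wide(a, value):
    """Count matches after widening the mask, analogous to typecode='l' above."""
    mask = np.equal(a, value).astype(np.int64)
    return int(np.add.reduce(mask))


def itemfreq_wide(a):
    """Sketch of an itemfreq-like table: column 0 = values, column 1 = counts."""
    a = np.asarray(a)
    scores = np.unique(a)  # np.unique returns the unique values already sorted
    freq = np.array([count_equal_wide(a, s) for s in scores], dtype=np.int64)
    return np.column_stack((scores, freq))


if __name__ == "__main__":
    for n in [254, 255, 256, 257, 258]:
        l = [0] * n
        # columns: length, 8-bit count (wraps), widened count
        print(n, count_equal_8bit(l, 0), count_equal_wide(l, 0))
    print(itemfreq_wide([0] * 256))
```

With the 8-bit accumulator the counts for 256, 257 and 258 come out as 0, 1 and 2, the same pattern as in the session above, while the widened version returns the true counts. In current NumPy, np.unique(a, return_counts=True) would give the counts directly; the loop is kept here only to stay close to the structure of the quoted itemfreq.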