counting unique numpy subarrays

duncan smith Fri, 04 Dec 2015 11:47:16 -0800

Hello,
      I'm trying to find a computationally efficient way of identifying
unique subarrays, counting them and returning an array containing only
the unique subarrays and a corresponding 1D array of counts. The
following code works, but is a bit slow.


###############

from collections import Counter
import numpy

def bag_data(data):
    # data (a numpy array) is bagged along axis 0
    # returns an array of the unique subarrays and a corresponding
    # 1D array of counts
    vec_shape = data.shape[1:]
    # tuples are hashable, so each flattened subarray can be a Counter key
    counts = Counter(tuple(arr.flatten()) for arr in data)
    data_out = numpy.zeros((len(counts),) + vec_shape)
    cnts = numpy.zeros(len(counts))
    # items() runs on both Python 2 and 3; iteritems() is Python 2 only
    for i, (tup, cnt) in enumerate(counts.items()):
        data_out[i] = numpy.array(tup).reshape(vec_shape)
        cnts[i] = cnt
    return data_out, cnts

###############

I've been looking through the numpy docs, but don't seem to be able to
come up with a clean solution that avoids Python loops. TIA for any
useful pointers. Cheers.

Duncan
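
For reference, one loop-free direction can be sketched as follows. It
assumes NumPy 1.13 or later, where numpy.unique accepts an axis
argument; the name bag_data_unique is hypothetical and not part of the
code above. Unlike the Counter version, the unique subarrays come back
in sorted order and keep the input dtype, and object arrays are not
handled.

```python
import numpy

def bag_data_unique(data):
    # Hypothetical helper; assumes NumPy >= 1.13 for the axis argument.
    data = numpy.asarray(data)
    # With axis=0, each subarray data[i] is treated as a single item:
    # duplicates are collapsed and return_counts gives the multiplicities.
    uniq, cnts = numpy.unique(data, axis=0, return_counts=True)
    return uniq, cnts

if __name__ == "__main__":
    sample = numpy.array([[[1, 2], [3, 4]],
                          [[0, 0], [0, 0]],
                          [[1, 2], [3, 4]]])
    uniq, cnts = bag_data_unique(sample)
    print(uniq.shape)
    print(cnts)
```

Whether this is actually faster than the Counter approach will depend on
the data and would need to be timed.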
-- 
https://mail.python.org/mailman/listinfo/python-list
