Heli wrote:

> Dear all,
>
> I need to loop over a numpy array and then do the following search. The
> following is taking almost 60(s) for an array (npArray1 and npArray2 in
> the example below) with around 300K values.
>
>
> for id in np.nditer(npArray1):
>     newId=(np.where(npArray2==id))[0][0]
>
>
> Is there any way I can make the above faster? I need to run the script
> above on much bigger arrays (50M). Please note that my two numpy arrays in
> the lines above, npArray1 and npArray2 are not necessarily the same size,
> but they are both 1d.

You mean you are looking for the index of the first occurrence in npArray2 of every value in npArray1? I don't know how to do this in numpy (I'm not an expert), but even basic Python might be acceptable:

    # Map each value in npArray2 to the index of its first occurrence.
    lookup = {}
    for i, v in enumerate(npArray2):
        if v not in lookup:
            lookup[v] = i

    # Each lookup is now a single dict access instead of a full scan.
    for v in npArray1:
        print(lookup.get(v, "<not found>"))

That way you iterate once (in Python) instead of 2*len(npArray1) times (in C) over npArray2.
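
For comparison, the same first-occurrence lookup can probably be done inside numpy as well. The following is only a sketch, not something from the original post: the function name first_indices and the missing=-1 sentinel are made up for illustration. It relies on np.unique(..., return_index=True) returning the index of the first occurrence of each distinct value, and on np.searchsorted to locate each query value in the sorted distinct values.

```python
import numpy as np

def first_indices(haystack, needles, missing=-1):
    """Index of the first occurrence in haystack of each value in needles.

    Hypothetical helper for illustration; values that do not occur in
    haystack are reported as `missing`.
    """
    haystack = np.asarray(haystack)
    needles = np.asarray(needles)

    # Sorted distinct values, plus the index of the first occurrence
    # of each one in the original (unsorted) array.
    values, first = np.unique(haystack, return_index=True)
    if values.size == 0:
        return np.full(needles.shape, missing, dtype=np.intp)

    # Binary search for every needle in the sorted distinct values.
    pos = np.searchsorted(values, needles)
    # Needles larger than every value get pos == len(values); clip so
    # that the indexing below stays in range.
    pos = np.clip(pos, 0, values.size - 1)

    # A needle is present only if the value at its position matches.
    found = values[pos] == needles
    return np.where(found, first[pos], missing)

npArray2 = np.array([5, 3, 5, 7, 3])
npArray1 = np.array([3, 7, 5, 9])
print(first_indices(npArray2, npArray1))
```

Note one difference from the loop in the question: np.where(...)[0][0] raises an IndexError when a value is absent, whereas this sketch returns the sentinel. The cost is one sort of npArray2 plus a binary search per element of npArray1, so it should scale much better than a full scan per element, but I have not timed it on arrays of the sizes mentioned.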