> Date: Thu, 10 Mar 2016 08:48:48 -0800
> Subject: Re: looping and searching in numpy array
> From: heml...@gmail.com
> To: python-list@python.org
>
> On Thursday, March 10, 2016 at 2:02:57 PM UTC+1, Peter Otten wrote:
> > Heli wrote:
> >
> > > Dear all,
> > >
> > > I need to loop over a numpy array and then do the following search. The
> > > following is taking almost 60(s) for an array (npArray1 and npArray2 in
> > > the example below) with around 300K values.
> > >
> > >
> > > for id in np.nditer(npArray1):
> > >
> > >     newId=(np.where(npArray2==id))[0][0]
> > >
> > >
> > > Is there anyway I can make the above faster? I need to run the script
> > > above on much bigger arrays (50M). Please note that my two numpy arrays in
> > > the lines above, npArray1 and npArray2 are not necessarily the same size,
> > > but they are both 1d.
> >
> > You mean you are looking for the index of the first occurence in npArray2
> > for every value of npArray1?
> >
> > I don't know how to do this in numpy (I'm not an expert), but even basic
> > Python might be acceptable:
> >
> > lookup = {}
> > for i, v in enumerate(npArray2):
> >     if v not in lookup:
> >         lookup[v] = i
> >
> > for v in npArray1:
> >     print(lookup.get(v, "<not found>"))
> >
> > That way you iterate once (in Python) instead of 2*len(npArray1) times (in
> > C) over npArray2.
>
> Dear Peter,
>
> Thanks for your reply. This really helped. It reduces the script time from
> 61(s) to 2(s).
>
> I am still very interested in knowing the correct numpy way to do this, but
> till then your fix works great.
Hi,

I suppose you have seen this already (in particular the first link):
http://numpy-discussion.10968.n7.nabble.com/Implementing-a-quot-find-first-quot-style-function-td33085.html

I don't think it's part of numpy yet.

Albert-Jan
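
For what it's worth, here is a rough sketch of one vectorized way to get the index of the first occurrence in npArray2 for every value of npArray1, using a stable sort plus np.searchsorted. This is not the "find first" function discussed in the linked thread, just one possible approach. The name first_indices and the -1 "not found" marker are my own choices for illustration, and it assumes both arrays are 1d with values that can be sorted and compared exactly (integer ids, say):

```python
import numpy as np

def first_indices(haystack, needles, missing=-1):
    """Index of the first occurrence in haystack for each value in needles.

    Hypothetical helper: values not present in haystack map to `missing`.
    """
    haystack = np.asarray(haystack)
    needles = np.asarray(needles)
    if haystack.size == 0:
        return np.full(needles.shape, missing, dtype=np.intp)

    # A stable sort keeps equal values in their original order, so the
    # first element of each run of equal values is the first occurrence.
    order = np.argsort(haystack, kind="stable")
    sorted_hay = haystack[order]

    # side="left" gives the start of the run for each needle.
    pos = np.searchsorted(sorted_hay, needles, side="left")

    # Needles larger than every haystack value land past the end; clip
    # so the comparison below stays in bounds.
    pos = np.minimum(pos, haystack.size - 1)
    found = sorted_hay[pos] == needles

    return np.where(found, order[pos], missing)


npArray2 = np.array([5, 3, 5, 7, 3])
npArray1 = np.array([3, 5, 9, 7, 1])
print(first_indices(npArray2, npArray1))
```

The sort costs O(n log n) on npArray2 and each lookup is a binary search, so nothing loops in Python. Whether it beats the dict approach on 50M values probably depends on dtype and memory, so it is worth timing both.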