[Numpy-discussion] Optimized np.digitize for equidistant bins

2020-12-18 Thread Martín Chalela
Hi all! I was wondering if there is a way to avoid np.digitize when
dealing with equidistant bins. For example:
bins = np.linspace(0, 1, 20)

The main problem I encountered is that digitize calls np.searchsorted. This
is the correct way, I think, for generic bins, i.e. bins that have
different widths. However, in the special, but not uncommon, case of
equidistant bins, the searchsorted call can be very expensive and
unnecessary. One can perform a simple calculation like the following:

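import numpy as np

def digitize_eqbins(x, bins):
    """
    Return the indices of the bins to which each value in input array belongs.
    Assumes equidistant bins.
    """
    nbins = len(bins) - 1
    # np.int was only an alias for the builtin int and has been removed from
    # recent NumPy releases, so use int directly.
    digit = (nbins * (x - bins[0]) / (bins[-1] - bins[0])).astype(int)
    return digit + 1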
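
For context, the relationship between digitize and searchsorted mentioned above can be checked directly. This is only a sketch: the sample data and the seed are assumptions chosen for illustration. For monotonically increasing bins, np.digitize with the default right=False should give the same indices as np.searchsorted with side="right".

```python
import numpy as np

bins = np.linspace(0, 1, 20)        # 20 edges, i.e. 19 equal-width bins
rng = np.random.default_rng(0)      # seeded only so the sketch is repeatable
x = rng.uniform(0, 1, size=1000)    # illustrative samples in [0, 1)

# digitize (right=False) versus a direct binary search on the edges
via_digitize = np.digitize(x, bins)
via_searchsorted = np.searchsorted(bins, x, side="right")
same = np.array_equal(via_digitize, via_searchsorted)
```

Each lookup is a binary search over the edges, which is the cost the closed-form index calculation avoids when the bins are equidistant.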

Is there a better way of computing this for equidistant bins?

Thank you!
Martin.
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Optimized np.digitize for equidistant bins

2020-12-18 Thread Joseph Fox-Rabinovitz
Bin index is just value floor divided by the bin size.
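
A minimal sketch of that idea, assuming a hypothetical layout of 10 bins of width 0.5 starting at 0 (the names lo, width and nbins are illustrative, not from the thread). The width is chosen to be exactly representable in binary floating point so the comparison with np.digitize is not affected by rounding.

```python
import numpy as np

lo, width, nbins = 0.0, 0.5, 10              # hypothetical bin layout
edges = lo + width * np.arange(nbins + 1)    # 0.0, 0.5, ..., 5.0
x = np.array([0.0, 0.25, 0.5, 1.74, 4.99])   # all values inside [lo, 5.0)

# Floor-divide the offset from the first edge by the bin width; the +1
# matches np.digitize, which numbers the first bin 1 rather than 0.
idx = ((x - lo) // width).astype(int) + 1
expected = np.digitize(x, edges)
```

For these in-range values both approaches give the indices 1, 1, 2, 4, 10.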

On Fri, Dec 18, 2020, 09:59 Martín Chalela wrote:

> Is there a better way of computing this for equidistant bins?


Re: [Numpy-discussion] Optimized np.digitize for equidistant bins

2020-12-18 Thread Martín Chalela
Right! I just thought there would/should be a "digitize" function that did
this.

On Fri, Dec 18, 2020 at 14:16, Joseph Fox-Rabinovitz (<
jfoxrabinov...@gmail.com>) wrote:

> Bin index is just value floor divided by the bin size.


Re: [Numpy-discussion] Optimized np.digitize for equidistant bins

2020-12-18 Thread Joseph Fox-Rabinovitz
There is: np.floor_divide.
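
As a rough sketch of how np.floor_divide could stand in for np.digitize on uniform bins, the hypothetical helper below (digitize_uniform and its parameters are assumptions, not a NumPy API) also clips the result so that out-of-range values get 0 and len(edges), as np.digitize does. Non-finite inputs are not handled, and for widths that are not exactly representable (0.1, for instance) values very close to an edge may land in a neighbouring bin because of floating-point rounding.

```python
import numpy as np

def digitize_uniform(x, lo, width, nbins):
    """Hypothetical helper: 1-based bin index for equal-width bins."""
    x = np.asarray(x, dtype=float)
    # Floor division gives the 0-based bin; +1 matches np.digitize numbering.
    idx = np.floor_divide(x - lo, width).astype(np.intp) + 1
    # Values below lo map to 0, values at or beyond the last edge to nbins + 1.
    return np.clip(idx, 0, nbins + 1)

lo, width, nbins = 0.0, 0.5, 10                 # illustrative layout
edges = lo + width * np.arange(nbins + 1)       # 11 edges: 0.0 ... 5.0
x = np.array([-1.0, 0.0, 0.49, 0.5, 4.99, 5.0, 7.3])

fast = digitize_uniform(x, lo, width, nbins)
reference = np.digitize(x, edges)
```

With this exactly representable width the two results agree, including the values outside the bin range.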

On Fri, Dec 18, 2020, 14:38 Martín Chalela wrote:

> Right! I just thought there would/should be a "digitize" function that did
> this.


Re: [Numpy-discussion] Optimized np.digitize for equidistant bins

2020-12-18 Thread Martín Chalela
Thank you, Joseph.

On Fri, Dec 18, 2020 at 16:56, Joseph Fox-Rabinovitz (<
jfoxrabinov...@gmail.com>) wrote:

> There is: np.floor_divide.