[Numpy-discussion] How is "round to N decimal places" defined for binary floating point numbers?

2023-12-28 Thread Stefano Miccoli via NumPy-Discussion
I have always been puzzled about how to correctly define the python built-in 
`round(number, ndigits)` when `number` is a binary float and `ndigits` is 
greater than zero.
Apparently CPython and numpy disagree:
>>> round(2.765, 2)
2.77
>>> np.round(2.765, 2)
2.76

My questions for the numpy devs are:
- Is there an authoritative source that explains what `round(number, ndigits)` 
means when the digits are counted in a base different from the one used in the 
floating point representation?
- Which was the first programming language to implement an intrinsic function 
`round(number, ndigits)` where ndigits is always decimal, irrespective of the 
representation of the floating point number? (I’m not interested in algorithms 
for printing a decimal representation, but in languages that allow storing and 
performing computations with the rounded value.)
- Is `round(number, ndigits)` a useful function that deserves a rigorous 
definition, or is its use limited to fuzzy situations, where accuracy can be 
safely traded for speed?

Personally I cannot think of sensible uses of `round(number, ndigits)` for 
binary floats: whenever you positively need `round(number, ndigits)`, you 
should use a decimal floating point representation.
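For what it’s worth, the discrepancy can be inspected with the standard
library’s `decimal` module, which exposes the exact value of the stored binary
double. This is only a minimal sketch, assuming the usual IEEE 754 binary64
floats:

```python
from decimal import Decimal

# The literal 2.765 is stored as the nearest binary64 double,
# which is not exactly 2.765:
stored = Decimal(2.765)   # exact decimal expansion of the stored double
assert stored != Decimal("2.765")

# CPython's round() correctly rounds the *stored* value to 2 decimals:
assert round(2.765, 2) == 2.77
```

CPython rounds the exact stored value, while numpy’s scale-round-unscale
approach introduces an extra rounding step in the multiplication, which is why
the two can disagree on the last digit.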

Stefano

___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: How is "round to N decimal places" defined for binary floating point numbers?

2023-12-29 Thread Stefano Miccoli via NumPy-Discussion
Oscar Gustafsson wrote:
> I would take it that round x to N radix-R digits means
> round_to_integer(x * R**N)/R**N
> (ignoring floating-point issues)

Yes, this is the tried-and-true way: first define the function in exact 
arithmetic, then ask the floating point implementation to return an 
approximate result within a given tolerance, say 1/2 ulp.
This is what CPython does, and it seems to be quite hard to get right, at 
least judging from the code of the current implementation. BTW I think that numpy 
uses the same definition, but simply accepts a bigger (unspecified) 
tolerance in its implementation.

Maybe I should have phrased my question differently: is this definition the 
only accepted one, or are there different formulations which give rise to 
faster implementations?
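The exact-arithmetic definition above can be prototyped with the standard
library’s `fractions` module; this is only a sketch for checking results, not
an efficient implementation:

```python
from fractions import Fraction

def round_exact(x: float, ndigits: int) -> float:
    """round_to_integer(x * 10**ndigits) / 10**ndigits, evaluated in
    exact rational arithmetic, with the result converted back to float."""
    scale = Fraction(10) ** ndigits
    # Fraction(x) is exact for any finite float; round() on an exact
    # rational uses round-half-to-even with no representation error.
    return float(round(Fraction(x) * scale) / scale)

# Agrees with CPython's correctly rounded built-in:
assert round_exact(2.765, 2) == round(2.765, 2) == 2.77
```

Since `Fraction(x)` converts the stored binary value exactly, this reproduces
CPython’s behaviour, including the cases that surprise users expecting decimal
semantics (e.g. `round_exact(2.675, 2)` gives 2.67).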


[Numpy-discussion] Re: Introducing quarterly date units to datetime64 and timedelta64

2024-02-24 Thread Stefano Miccoli via NumPy-Discussion
Actually quarters (3-month sub-year groupings) are already supported as 
‘M8[3M]’ and ‘m8[3M]’:
>>> np.datetime64('2024-05').astype('M8[3M]') - np.datetime64('2020-03').astype('M8[3M]')
numpy.timedelta64(17,'3M')
So explicitly introducing a ‘Q’ time unit would only enable a more intuitive 
representation/parsing of dates and durations.

I’m moderately negative on this proposal:
- there is no native support of quarters in Python
- ISO 8601-1 does not support sub-year groupings
- the ISO 8601-2 extension representing sub-year groupings is not sufficiently 
widespread to be adopted by numpy (e.g. '2001-34' expresses "second quarter of 
2001”, but I suppose nobody would guess this meaning)

In other words, without a clear normative reference, implementing quarters in 
numpy would risk introducing a custom/arbitrary notation.
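For reference, ISO 8601-2 assigns the sub-year grouping codes 33–36 to the
four quarters, which is why '2001-34' denotes the second quarter of 2001. A
hypothetical helper illustrating the mapping (the function name and output
format are my own, not an existing API):

```python
import datetime

def iso8601_2_quarter(d: datetime.date) -> str:
    """Format a date using the ISO 8601-2 sub-year grouping codes
    33-36, which denote quarters Q1-Q4."""
    q = (d.month - 1) // 3 + 1   # calendar quarter, 1..4
    return f"{d.year}-{32 + q}"

assert iso8601_2_quarter(datetime.date(2001, 5, 17)) == "2001-34"  # Q2
```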

Stefano



[Numpy-discussion] Comparison between numpy scalars returns numpy bool class and not native python bool class

2024-06-27 Thread Stefano Miccoli via NumPy-Discussion
It is well known that ‘np.bool' is not interchangeable with python ‘bool’, and 
in fact 'issubclass(np.bool, bool)’ is false.

On the contrary, numpy floats subclass python 
floats—'issubclass(np.float64, float)' is true—so I’m wondering if the fact that 
scalar comparison returns an np.bool breaks the Liskov substitution principle. 
In fact '(np.float64(1) > 0) is True' is unexpectedly false.

I was hit by this behaviour because in python structural pattern matching, the 
‘a > 1’ subject will match neither ’True’ nor ‘False’ if ‘a' is a numpy 
scalar. In this short example

import numpy as np

a = np.float64(1)
assert isinstance(a, float)

match a > 1:
    case True | False:
        print('python float')
    case _:
        print('Huh?: numpy float')

the default clause is matched. If we set instead ‘a = float(1)’, the first 
clause will be matched. The surprise factor is quite high here, in my opinion.
(Let me add that ‘True', ‘False', ‘None' are special in python structural 
pattern matching, because they are matched by identity and not by equality.)

I’m not sure if this behaviour can be avoided, or if we have to live with the 
fact that numpy floats are to be kept well contained and never mixed with 
python floats.

Stefano



[Numpy-discussion] Re: Comparison between numpy scalars returns numpy bool class and not native python bool class

2024-06-28 Thread Stefano Miccoli via NumPy-Discussion


> On 27 Jun 2024, at 23:48, Aaron Meurer  wrote:
> 
> Apparently the reason this happens is that True, False, and None are
> compared using 'is' in structural pattern matching (see

Please let me stress that the ‘match/case’ snippet was only a concrete example 
of a situation in which, say, ‘f(a)’ gives the correct result when ‘a’ is a 
‘float’ instance and breaks down when ‘a’ is a ‘np.float64’ instance.
Now the fact that numpy floats are subclasses of python floats is quite a 
strong promise that this should never be the case…
Realistically this can be solved in a couple of ways.

(i) Refactoring ‘f(a)’ so that it is aware of the numpy float quirks… not 
always possible, especially if ‘f(a)’ belongs to an external package.

(ii) Sanitizing numpy floats, let’s say by ‘f(a.item())’ in the calling code.

(iii) Ensuring that scalar comparisons always return python bools and not 
‘np.bool'


(i) and (ii) are quite simple user-side workarounds, but sometimes the surprise 
factor is high, as in the given code snippet. 

On the contrary, (iii) is a radical solution on the library side, but I’m not 
sure it’s worth implementing for a few edge cases. In fact ‘b is True’ is 
an anti-pattern in python, so the places in which this behaviour 
surfaces should be rare.

Stefano



[Numpy-discussion] Re: Precision changes to sin/cos in the next release?

2023-05-31 Thread Stefano Miccoli via NumPy-Discussion


On 31 May 2023, at 16:32, numpy-discussion-requ...@python.org wrote:

It seems fairly clear that with this recent change, the feeling is that the 
tradeoff is bad and that too much accuracy was lost, for not enough real-world 
gain. However, we now had several years worth of performance work with few 
complaints about accuracy issues. So I wouldn't throw out the baby with the 
bath water now and say that we always want the best accuracy only. It seems to 
me like we need a better methodology for evaluating changes. Contributors have 
been pretty careful, but looking back at SIMD PRs, there were usually detailed 
benchmarks but not always detailed accuracy impact evaluations.

Cheers,
Ralf


If I can throw my 2 cents in, my feeling is that most users will notice 
neither the decrease in accuracy nor the increase in speed.
(I failed to mention, I'm an engineer, so a few ULPs are almost nothing for 
me; unless I have to solve a very ill-conditioned problem, but then I do not 
blame numpy, but myself for formulating such a bad model ;-)

The only real problem is for code that relies on these assumptions:

assert np.sin(np.pi/2) == -np.cos(np.pi) == 1

which will fail in numpy==1.25.rc0 but should hold true for numpy~=1.24.3, at 
least on most runtime environments.
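A quick way to quantify "a few ULPs" with the standard library, comparing
numpy’s routines against the platform libm. This is only a rough sketch:
`math.sin` is not guaranteed correctly rounded either, so it measures
disagreement, not true error:

```python
import math
import numpy as np

def ulp_diff(a: float, b: float) -> float:
    """Absolute difference between two floats, measured in units
    in the last place of the larger magnitude."""
    return abs(a - b) / math.ulp(max(abs(a), abs(b)))

x = 1.2345
print(ulp_diff(float(np.sin(x)), math.sin(x)))
print(ulp_diff(float(np.cos(x)), math.cos(x)))
```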

I do not have strong feelings on this issue: in an ideal world code would have 
unit-testing modules and assertions scattered here and there in order to make 
all implicit assumptions explicit. Adapting to the new routines should be 
fairly simple.
Of course we do not live in an ideal world and there will definitely be a 
number of users who will experience hard-to-debug failures linked to these new 
trig routines.

But again I prefer to remain neutral.

Stefano




[Numpy-discussion] Re: np.ndenumerate doesn't obey mask?

2024-10-23 Thread Stefano Miccoli via NumPy-Discussion

> On 22 Oct 2024, at 18:00, numpy-discussion-requ...@python.org wrote:
> 
> From: Neal Becker 
> Subject: [Numpy-discussion] np.ndenumerate doesn't obey mask?
> Date: 21 October 2024 at 18:52:41 CEST
> To: Discussion of Numerical Python 
> Reply-To: Discussion of Numerical Python 
> 
> 
> I was using ndenumerate with a masked array, and it seems that the mask is 
> ignored.  Is this true?  If so, isn't that a bug?
> 

This is expected behaviour: to skip masked elements you should use the function 
from the “.ma" namespace, “numpy.ma.ndenumerate". By design “numpy.ndenumerate" 
does not skip masked elements.
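For the record, a short sketch of the difference between the two functions:

```python
import numpy as np

a = np.ma.masked_array([1, 2, 3], mask=[False, True, False])

# np.ndenumerate iterates over the plain data, ignoring the mask:
plain = [int(v) for _, v in np.ndenumerate(a)]
assert plain == [1, 2, 3]

# np.ma.ndenumerate (numpy >= 1.23) skips masked elements by default:
masked = [int(v) for _, v in np.ma.ndenumerate(a)]
assert masked == [1, 3]
```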

Stefano


[Numpy-discussion] Re: Extend ldexp to handle complex inputs

2024-09-27 Thread Stefano Miccoli via NumPy-Discussion
‘np.ldexp’ exists mainly for compatibility with the C/C++ functions ldexp, 
ldexpf, ldexpl, which are defined for float/double/long double.
Quoting the C refs:

> On binary systems (where FLT_RADIX is 2), ldexp is equivalent to scalbn.
> The function ldexp ("load exponent"), together with its dual, frexp, can be 
> used to manipulate the representation of a floating-point number without 
> direct bit manipulations.
> On many implementations, ldexp is less efficient than multiplication or 
> division by a power of two using arithmetic operators.

So in general I do not think that extending ‘np.ldexp’ to complex makes much 
sense, since it would be peculiar to the numpy implementation and would destroy 
the C/C++ equivalence.
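In the meantime, scaling a complex array by a power of two is exact in the
scale factor anyway, so plain multiplication is a perfectly good user-side
workaround (a sketch, not a proposed API):

```python
import numpy as np

z = np.array([1 + 2j, -0.5 + 0.25j])

# np.ldexp(1.0, n) builds 2.0**n exactly; multiplying by it scales both
# the real and imaginary parts, which is what a complex ldexp would do.
scaled = z * np.ldexp(1.0, 3)
assert np.array_equal(scaled, z * 8)
assert np.array_equal(scaled, np.array([8 + 16j, -4 + 2j]))
```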

Stefano


[Numpy-discussion] Why ufunc out arg has to be an ndarray?

2025-01-08 Thread Stefano Miccoli via NumPy-Discussion
From time to time I find myself overwriting a python buffer with the output of 
a ufunc, for example like this:

import array
import numpy as np

a = array.array('f', (1,1,1))
np.exp2(a, out=np.asarray(a))
assert a.tolist() == [2, 2, 2]

Here I have to wrap the output as `out=np.asarray(a)` because the more natural 
`np.exp2(a, out=a)` raises "TypeError: return arrays must be of ArrayType”

In general, ufuncs are quite aggressive in utilizing the buffer protocol for 
input arguments. I was wondering, why are they so reluctant to do the same for 
the output argument?
Is this a design choice? Efficiency? Legacy? Would implementing `np.ufunc(a, 
out=a)` be dangerous or cumbersome?
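For completeness, wrapping the buffer once up front avoids repeating the
`np.asarray` call at every ufunc invocation. It is the same zero-copy view,
just hoisted out:

```python
import array
import numpy as np

a = array.array('f', (1, 1, 1))
v = np.asarray(a)      # zero-copy view through the buffer protocol

np.exp2(v, out=v)      # accepted: out is now a genuine ndarray
assert a.tolist() == [2.0, 2.0, 2.0]

np.log2(v, out=v)      # further ufuncs can reuse the same view
assert a.tolist() == [1.0, 1.0, 1.0]
```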

Stefano
