Re: Document if width_bucket's low and high are inclusive/exclusive

2025-06-21 Thread Dean Rasheed
On Sat, 21 Jun 2025 at 18:09, Tom Lane  wrote:
>
> While looking at those comments, I also noted that there is a
> strange inconsistency between width_bucket_array and
> width_bucket_float8/width_bucket_numeric.  Namely, the latter
> two reject an "operand" that is NaN, while width_bucket_array
> goes out of its way to accept it and treat it in our usual
> fashion as sorting higher than all non-NaNs.
>
> Clearly these functions must reject NaN histogram bounds, for
> the same reason they reject infinite bounds.  But I don't see
> any reason why they couldn't treat a NaN operand as valid.
> Should we change them?  (I imagine this'd be a HEAD-only
> change, and probably v19 material at this point.)
>

Yes, I think that's a good idea (for v19 I would have thought).
Allowing the operand to be NaN definitely seems preferable to throwing
an error, since the operand might well come from data in a table
containing NaNs.

Regards,
Dean




Re: Document if width_bucket's low and high are inclusive/exclusive

2025-06-21 Thread Tom Lane
Dean Rasheed  writes:
> On Sat, 21 Jun 2025 at 18:09, Tom Lane  wrote:
>> Clearly these functions must reject NaN histogram bounds, for
>> the same reason they reject infinite bounds.  But I don't see
>> any reason why they couldn't treat a NaN operand as valid.
>> Should we change them?  (I imagine this'd be a HEAD-only
>> change, and probably v19 material at this point.)

> Yes, I think that's a good idea (for v19 I would have thought).
> Allowing the operand to be NaN definitely seems preferable to throwing
> an error, since the operand might well come from data in a table
> containing NaNs.

I started a new thread for that, since it's no longer docs material:

https://www.postgresql.org/message-id/2822872.1750540911%40sss.pgh.pa.us

regards, tom lane




Re: Document if width_bucket's low and high are inclusive/exclusive

2025-06-21 Thread Dean Rasheed
On Fri, 20 Jun 2025 at 22:19, Tom Lane  wrote:
>
> So concretely, how about the attached?
>

LGTM (though I'm not sure it really needs the word "therefore" in the
first hunk).

There are also a couple of code comments that need fixing --
width_bucket_float8() comes with the following comment:

 * 'bound1' and 'bound2' are the lower and upper bounds of the
 * histogram's range, respectively. 'count' is the number of buckets
 * in the histogram. width_bucket() returns an integer indicating the
 * bucket number that 'operand' belongs to in an equiwidth histogram
 * with the specified characteristics. An operand smaller than the
 * lower bound is assigned to bucket 0. An operand greater than the
 * upper bound is assigned to an additional bucket (with number
 * count+1). We don't allow "NaN" for any of the float8 inputs, and we
 * don't allow either of the histogram bounds to be +/- infinity.

so at the very least, that should be made to say "greater than or
equal to", instead of "greater than". Similarly for
width_bucket_numeric().

Also, since PG14, type numeric has supported infinity, so its comment
should probably include that last part about not allowing +/- infinity
in the histogram bounds.

Regards,
Dean




Re: Document if width_bucket's low and high are inclusive/exclusive

2025-06-21 Thread Tom Lane
Dean Rasheed  writes:
> On Fri, 20 Jun 2025 at 22:19, Tom Lane  wrote:
>> So concretely, how about the attached?

> LGTM (though I'm not sure it really needs the word "therefore" in the
> first hunk).

OK, done that way.

> There are also a couple of code comments that need fixing --

Good points, also done.

While looking at those comments, I also noted that there is a
strange inconsistency between width_bucket_array and
width_bucket_float8/width_bucket_numeric.  Namely, the latter
two reject an "operand" that is NaN, while width_bucket_array
goes out of its way to accept it and treat it in our usual
fashion as sorting higher than all non-NaNs.

Clearly these functions must reject NaN histogram bounds, for
the same reason they reject infinite bounds.  But I don't see
any reason why they couldn't treat a NaN operand as valid.
Should we change them?  (I imagine this'd be a HEAD-only
change, and probably v19 material at this point.)

regards, tom lane