Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-09-28 Thread Martin Kalcher
On 26.09.22 at 22:16, Tom Lane wrote: With our current PRNG infrastructure it doesn't cost much to have a separate PRNG for every purpose. I don't object to having array_shuffle() and array_sample() share one PRNG, but I don't think it should go much further than that. Thanks for your thoug

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-09-29 Thread Martin Kalcher
an be affected by setseed().) regards, tom lane New patch: array_shuffle() and array_sample() use pg_global_prng_state now. Martin From b9433564f925521f5f6bcebd7cd74a3e12f4f354 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PAT

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-16 Thread Martin Kalcher
On 16.07.22 at 23:56, Thomas Munro wrote: On Fri, Jul 15, 2022 at 8:36 PM Martin Kalcher wrote: I would like to see a function like this inside the intarray extension. Is there any way to get to this point? How is the process to deal with such proposals? Hi Martin, I'm redirecting th

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-17 Thread Martin Kalcher
On 17.07.22 at 05:37, Tom Lane wrote: Actually ... is there a reason to bother with an intarray version at all, rather than going straight for an in-core anyarray function? It's not obvious to me that an int4-only version would have major performance advantages. regards

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-17 Thread Martin Kalcher
On 17.07.22 at 08:00, Thomas Munro wrote: I went to see what Professor Lemire would have to say about all this, expecting to find a SIMD rabbit hole to fall down for some Sunday evening reading, but the main thing that jumped out was this article about the modulo operation required by textbook

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-17 Thread Martin Kalcher
On 17.07.22 at 05:32, David G. Johnston wrote:
+SELECT sample('{1,2,3,4,5,6,7,8,9,10,11,12}', 6) != sample('{1,2,3,4,5,6,7,8,9,10,11,12}', 6);
+ ?column?
+--
+ t
+(1 row)
+
While small, there is a non-zero chance for both samples to be equal. This test should probably just go, I don
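To put a number on the "non-zero chance" mentioned in this message, here is a small Python sketch (not part of the patch). It assumes sample() draws 6 of the 12 distinct elements uniformly and without replacement. Whether the result order is itself random is not visible in the snippet, so both cases are computed; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb, perm

def equal_sample_probability(total, picked, ordered):
    """Chance that two independent uniform samples come out identical.

    ordered=True  : results are random arrangements (order matters).
    ordered=False : results keep the input order (only the subset matters).
    """
    outcomes = perm(total, picked) if ordered else comb(total, picked)
    return Fraction(1, outcomes)

if __name__ == "__main__":
    print(equal_sample_probability(12, 6, ordered=True))   # 1/665280
    print(equal_sample_probability(12, 6, ordered=False))  # 1/924
```

Either way the regression test would fail occasionally, which is the concern raised in the quoted review.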

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-17 Thread Martin Kalcher
586dffe6e590ba9 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] introduce array_shuffle() --- src/backend/utils/adt/arrayfuncs.c | 73 ++ src/include/catalog/pg_proc.dat | 3 ++ 2 files changed, 76 insertions(+) d

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-17 Thread Martin Kalcher
On 18.07.22 at 00:46, Tom Lane wrote: This does not look particularly idiomatic, or even type-safe. What you should have done was use deconstruct_array to get an array of Datums and isnull flags, then shuffled those, then used construct_array to build the output. (Or, perhaps, use construct_m

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-17 Thread Martin Kalcher
598a1dace99c9059514e0fdcb90b9240ed3 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] introduce array_shuffle() --- src/backend/utils/adt/arrayfuncs.c | 178 + src/include/catalog/pg_proc.dat | 3 + 2 files chang

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-18 Thread Martin Kalcher
On 18.07.22 at 01:20, Tom Lane wrote: I would expect that shuffle() only shuffles the first dimension and keeps the inner arrays intact. This argument is based on a false premise, i.e. that Postgres thinks multidimensional arrays are arrays-of-arrays. They aren't, and we're not going to start m

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-18 Thread Martin Kalcher
Martin From baec08168357098287342c92672ef97361a91371 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] introduce array_shuffle() --- src/backend/utils/adt/arrayfuncs.c | 61 ++ src/include/catalog/pg_proc.dat | 3 ++ 2 files changed, 64

[PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Martin Kalcher
ements from an array. Is someone interested in looking at it? What are the next steps? Martin From 5498bb2d9f1fab4cad56cd0d3a6eeafa24a26c8e Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] Introduce array_shuffle() and array_sample() * array_s

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Martin Kalcher
On 18.07.22 at 21:29, Tom Lane wrote: The preferred thing to do is to add it to our "commitfest" queue, which will ensure that it gets looked at eventually. The currently open cycle is 2022-09 [1] (see the "New Patch" button there). Thanks Tom, did that. I am not sure if "SQL Commands" is the

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Martin Kalcher
On 18.07.22 at 23:03, Tom Lane wrote: I wrote: Martin had originally proposed (2), which I rejected on the grounds that we don't treat multi-dimensional arrays as arrays-of-arrays for any other purpose. Actually, after poking at it for awhile, that's an overstatement. It's true that the type

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Martin Kalcher
On 19.07.22 at 00:18, Tom Lane wrote: Independently of the dimensionality question --- I'd imagined that array_sample would select a random subset of the array elements but keep their order intact. If you want the behavior shown above, you can do array_shuffle(array_sample(...)). But if we ra
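The distinction drawn in this message, sampling that keeps element order versus sampling followed by a shuffle, can be modelled outside PostgreSQL. A minimal Python sketch follows; sample_keep_order and shuffle are hypothetical stand-ins for the proposed SQL functions, not the patch's C code.

```python
import random

def sample_keep_order(items, n, rng=random):
    """Pick n elements at random but return them in their original order."""
    # Sample positions rather than values, then restore positional order.
    chosen = sorted(rng.sample(range(len(items)), n))
    return [items[i] for i in chosen]

def shuffle(items, rng=random):
    """Return a shuffled copy, leaving the input untouched."""
    copy = list(items)
    rng.shuffle(copy)
    return copy

if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6]
    ordered = sample_keep_order(data, 3)
    print(ordered)           # three elements in their original relative order
    print(shuffle(ordered))  # the same three elements in random order
```

Composing the two, as in shuffle(sample_keep_order(...)), gives the randomly ordered sample, while the order-preserving variant cannot be recovered from an already shuffled result.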

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-19 Thread Martin Kalcher
On 19.07.22 at 00:52, Martin Kalcher wrote: On the contrary! I am pretty sure there are people out there wanting sampling-without-shuffling. I will think about that. I gave it some thought. Even though there might be use cases where a stable order is desired, I would consider them edge

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-19 Thread Martin Kalcher
On 18.07.22 at 23:48, Martin Kalcher wrote: If we go with (1) array_shuffle() and array_sample() should shuffle each element individually and always return a one-dimensional array. select array_shuffle('{{1,2},{3,4},{5,6}}'); --- {1,4,3,5,6,2} select ar
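Option (1) as described in this message amounts to "flatten, then shuffle". A rough Python model, using nested lists in place of PostgreSQL arrays; flatten and shuffle_flat are hypothetical names and say nothing about how the patch implements it:

```python
import random

def flatten(nested):
    """Yield the scalar elements of a nested list in row-major order."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

def shuffle_flat(nested, rng=random):
    """Shuffle every scalar individually; the result is always one-dimensional."""
    flat = list(flatten(nested))
    rng.shuffle(flat)
    return flat

if __name__ == "__main__":
    # Some permutation of 1..6, analogous to {1,4,3,5,6,2} in the message.
    print(shuffle_flat([[1, 2], [3, 4], [5, 6]]))
```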

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-19 Thread Martin Kalcher
[7,8]]], 1); --- {1,2,3,4,5,6,7,8} select array_flatten(array[[[1,2],[3,4]],[[5,6],[7,8]]], 2); --- {{1,2,3,4},{5,6,7,8}} Martin From 2aa6d72ff0f4d8835ee2f09f8cdf16b7e8005e56 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] Introduce arr

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Martin Kalcher
On 21.07.22 at 10:41, Dean Rasheed wrote: A couple of quick comments on the current patch: Thank you for your feedback! It's important to mark these new functions as VOLATILE, not IMMUTABLE, otherwise they won't work as expected in queries. See https://www.postgresql.org/docs/current/xfunc-

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Martin Kalcher
On 21.07.22 at 14:25, Dean Rasheed wrote: I'm inclined to say that we want a new pg_global_prng_user_state that is updated by setseed(), and used by random(), array_shuffle(), array_sample(), and any other user-facing random functions we add later. I like the idea. How would you organize the

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Martin Kalcher
esql.org/docs/current/parallel-safety.html, which explains why setseed() and random() are parallel restricted. Here is an updated patch that marks the functions VOLATILE PARALLEL RESTRICTED and uses pg_prng_uint64_range() rather than rand(). From 26676802f05d00c31e0b2d5997f61734aa421fca Mon Sep

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-22 Thread Martin Kalcher
ode in utils/adt is organized by type and this way it is clear that this is a thin wrapper around pg_prng. What do you think? From ceda50f1f7f7e0c123de9b2ce2cc7b5d2b2b7db6 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] Introduce array_shuf

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-23 Thread Martin Kalcher
On 22.07.22 at 11:31, Martin Kalcher wrote: I came to the same conclusions and went with Option 1 (see patch). Mainly because most code in utils/adt is organized by type and this way it is clear that this is a thin wrapper around pg_prng. Small patch update. I realized the new functions

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-24 Thread Martin Kalcher
3d8388c Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] Introduce array_shuffle() and array_sample() * array_shuffle() shuffles the elements of an array. * array_sample() chooses max n elements from an array by random. The new functions shar

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-25 Thread Martin Kalcher
If someone wants a limit, they can easily "LEAST(#1 dim size, other limit)" to get it, it is easy enough with a strict function. Convinced. It errors out now if n is out of bounds. Martin From afb7c022abd26b82a4fd3611313a83f144909554 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date:
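The design choice in this message, a strict function that rejects an out-of-range n while callers who want a cap clamp it themselves, can be sketched in Python. The name sample_strict and the error text are hypothetical, and min() plays the role that LEAST() plays in the quoted suggestion.

```python
import random

def sample_strict(items, n, rng=random):
    """Sample n elements; raise instead of silently clamping n."""
    if n < 0 or n > len(items):
        raise ValueError(f"sample size must be between 0 and {len(items)}")
    return rng.sample(list(items), n)

if __name__ == "__main__":
    data = [1, 2, 3]
    # Caller-side clamp: never ask for more than the array holds.
    print(len(sample_strict(data, min(len(data), 10))))  # 3
    try:
        sample_strict(data, 10)
    except ValueError as exc:
        print(exc)  # sample size must be between 0 and 3
```

Keeping the function strict means a too-large n surfaces as an error rather than as a silently shorter result.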

[Patch] Fix bounds check in trim_array()

2022-07-25 Thread Martin Kalcher
:int[], 10); {} select trim_array('{}'::int[], 100); ERROR: number of elements to trim must be between 0 and 64 The attached patch fixes that check. Martin From b6173a8f8f94cddd5347db482b8e4480c0e546e7 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Mon, 25 Jul 2022 16:26:14 +

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-08-03 Thread Martin Kalcher
Patch update without merge conflicts. Martin From 0ecffcf3ed2eb59d045941b69bb86a34b93f3391 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH v3] Introduce array_shuffle() and array_sample() * array_shuffle() shuffles the elements of an array

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-09-22 Thread Martin Kalcher
On 22.09.22 at 17:23, Andres Freund wrote: Hi, On 2022-08-04 07:46:10 +0200, Martin Kalcher wrote: Patch update without merge conflicts. Due to the merge of the meson-based build, this patch needs to be adjusted. See https://cirrus-ci.com/build/6580671765282816 Looks like it'd ju