Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-09-29 Thread Martin Kalcher
an be affected by setseed().) regards, tom lane New patch: array_shuffle() and array_sample() use pg_global_prng_state now. Martin From b9433564f925521f5f6bcebd7cd74a3e12f4f354 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PAT

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-09-28 Thread Martin Kalcher
On 26.09.22 at 22:16, Tom Lane wrote: With our current PRNG infrastructure it doesn't cost much to have a separate PRNG for every purpose. I don't object to having array_shuffle() and array_sample() share one PRNG, but I don't think it should go much further than that. Thanks for your thoug

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-09-22 Thread Martin Kalcher
On 22.09.22 at 17:23, Andres Freund wrote: Hi, On 2022-08-04 07:46:10 +0200, Martin Kalcher wrote: Patch update without merge conflicts. Due to the merge of the meson based build, this patch needs to be adjusted. See https://cirrus-ci.com/build/6580671765282816 Looks like it'd ju

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-08-03 Thread Martin Kalcher
Patch update without merge conflicts. Martin From 0ecffcf3ed2eb59d045941b69bb86a34b93f3391 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH v3] Introduce array_shuffle() and array_sample() * array_shuffle() shuffles the elements of an array

[Patch] Fix bounds check in trim_array()

2022-07-25 Thread Martin Kalcher
:int[], 10); {} select trim_array('{}'::int[], 100); ERROR: number of elements to trim must be between 0 and 64 The attached patch fixes that check. Martin From b6173a8f8f94cddd5347db482b8e4480c0e546e7 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Mon, 25 Jul 2022 16:26:14 +

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-25 Thread Martin Kalcher
If someone wants a limit, they can easily "LEAST(#1 dim size, other limit)" to get it, it is easy enough with a strict function. Convinced. It errors out now if n is out of bounds. Martin From afb7c022abd26b82a4fd3611313a83f144909554 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date:

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-24 Thread Martin Kalcher
3d8388c Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] Introduce array_shuffle() and array_sample() * array_shuffle() shuffles the elements of an array. * array_sample() chooses max n elements from an array by random. The new functions shar

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-23 Thread Martin Kalcher
On 22.07.22 at 11:31, Martin Kalcher wrote: I came to the same conclusions and went with Option 1 (see patch). Mainly because most code in utils/adt is organized by type and this way it is clear that this is a thin wrapper around pg_prng. Small patch update. I realized the new functions

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-22 Thread Martin Kalcher
ode in utils/adt is organized by type and this way it is clear that this is a thin wrapper around pg_prng. What do you think? From ceda50f1f7f7e0c123de9b2ce2cc7b5d2b2b7db6 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] Introduce array_shuf

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Martin Kalcher
esql.org/docs/current/parallel-safety.html, which explains why setseed() and random() are parallel restricted. Here is an updated patch that marks the functions VOLATILE PARALLEL RESTRICTED and uses pg_prng_uint64_range() rather than rand(). From 26676802f05d00c31e0b2d5997f61734aa421fca Mon Sep

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Martin Kalcher
On 21.07.22 at 14:25, Dean Rasheed wrote: I'm inclined to say that we want a new pg_global_prng_user_state that is updated by setseed(), and used by random(), array_shuffle(), array_sample(), and any other user-facing random functions we add later. I like the idea. How would you organize the

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Martin Kalcher
On 21.07.22 at 10:41, Dean Rasheed wrote: A couple of quick comments on the current patch: Thank you for your feedback! It's important to mark these new functions as VOLATILE, not IMMUTABLE, otherwise they won't work as expected in queries. See https://www.postgresql.org/docs/current/xfunc-

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-19 Thread Martin Kalcher
[7,8]]], 1); --- {1,2,3,4,5,6,7,8} select array_flatten(array[[[1,2],[3,4]],[[5,6],[7,8]]], 2); --- {{1,2,3,4},{5,6,7,8}} Martin From 2aa6d72ff0f4d8835ee2f09f8cdf16b7e8005e56 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] Introduce arr

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-19 Thread Martin Kalcher
On 18.07.22 at 23:48, Martin Kalcher wrote: If we go with (1) array_shuffle() and array_sample() should shuffle each element individually and always return a one-dimensional array. select array_shuffle('{{1,2},{3,4},{5,6}}'); --- {1,4,3,5,6,2} select ar
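The two behaviours debated in this thread for multi-dimensional input can be sketched as follows. This is a minimal Python model, assuming nested lists stand in for a two-dimensional array; the function names `shuffle_flattened` and `shuffle_first_dimension` are hypothetical and are not part of the patch.

```python
import random

def shuffle_flattened(rows, rng):
    """Option (1): treat every element individually, return a 1-D result."""
    flat = [value for row in rows for value in row]
    rng.shuffle(flat)
    return flat

def shuffle_first_dimension(rows, rng):
    """Alternative: reorder the inner arrays but keep each one intact."""
    out = [list(row) for row in rows]
    rng.shuffle(out)
    return out

rng = random.Random(2022)  # fixed seed only to make the sketch repeatable
data = [[1, 2], [3, 4], [5, 6]]
print(shuffle_flattened(data, rng))
print(shuffle_first_dimension(data, rng))
```

The first variant loses the array's shape; the second preserves the inner arrays, which is what a reader thinking in terms of arrays-of-arrays would expect.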

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-19 Thread Martin Kalcher
On 19.07.22 at 00:52, Martin Kalcher wrote: On the contrary! I am pretty sure there are people out there wanting sampling-without-shuffling. I will think about that. I gave it some thought. Even though there might be use cases where a stable order is desired, I would consider them edge

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Martin Kalcher
On 19.07.22 at 00:18, Tom Lane wrote: Independently of the dimensionality question --- I'd imagined that array_sample would select a random subset of the array elements but keep their order intact. If you want the behavior shown above, you can do array_shuffle(array_sample(...)). But if we ra
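The distinction drawn here, an order-preserving sample that can be composed with a shuffle, can be illustrated with a small Python sketch. It is a model of the idea only, not the patch's implementation; `array_sample_ordered` and `array_shuffle` below are hypothetical Python helpers that merely borrow the names under discussion.

```python
import random

def array_sample_ordered(items, n, rng):
    """Pick n elements at random, keeping them in their original order."""
    if not 0 <= n <= len(items):
        raise ValueError("n must be between 0 and the array length")
    # Sample positions rather than values, then sort the positions so the
    # result is a subsequence of the input.
    chosen = sorted(rng.sample(range(len(items)), n))
    return [items[i] for i in chosen]

def array_shuffle(items, rng):
    """Return a shuffled copy of items."""
    out = list(items)
    rng.shuffle(out)
    return out

rng = random.Random(1)  # fixed seed only to make the sketch repeatable
data = ["a", "b", "c", "d", "e", "f"]
subset = array_sample_ordered(data, 3, rng)  # original order kept
print(subset)
print(array_shuffle(subset, rng))            # composition gives a shuffled sample
```

With this split, a caller who wants a randomly ordered sample composes the two functions, while a caller who needs a stable order uses the sample alone.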

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Martin Kalcher
On 18.07.22 at 23:03, Tom Lane wrote: I wrote: Martin had originally proposed (2), which I rejected on the grounds that we don't treat multi-dimensional arrays as arrays-of-arrays for any other purpose. Actually, after poking at it for awhile, that's an overstatement. It's true that the type

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Martin Kalcher
On 18.07.22 at 21:29, Tom Lane wrote: The preferred thing to do is to add it to our "commitfest" queue, which will ensure that it gets looked at eventually. The currently open cycle is 2022-09 [1] (see the "New Patch" button there). Thanks Tom, did that. I am not sure if "SQL Commands" is the

[PATCH] Introduce array_shuffle() and array_sample()

2022-07-18 Thread Martin Kalcher
ements from an array. Is someone interested in looking at it? What are the next steps? Martin From 5498bb2d9f1fab4cad56cd0d3a6eeafa24a26c8e Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] Introduce array_shuffle() and array_sample() * array_s

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-18 Thread Martin Kalcher
Martin From baec08168357098287342c92672ef97361a91371 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] introduce array_shuffle() --- src/backend/utils/adt/arrayfuncs.c | 61 ++ src/include/catalog/pg_proc.dat| 3 ++ 2 files changed, 64

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-18 Thread Martin Kalcher
On 18.07.22 at 01:20, Tom Lane wrote: I would expect that shuffle() only shuffles the first dimension and keeps the inner arrays intact. This argument is based on a false premise, ie that Postgres thinks multidimensional arrays are arrays-of-arrays. They aren't, and we're not going to start m

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-17 Thread Martin Kalcher
598a1dace99c9059514e0fdcb90b9240ed3 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] introduce array_shuffle() --- src/backend/utils/adt/arrayfuncs.c | 178 + src/include/catalog/pg_proc.dat| 3 + 2 files chang

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-17 Thread Martin Kalcher
On 18.07.22 at 00:46, Tom Lane wrote: This does not look particularly idiomatic, or even type-safe. What you should have done was use deconstruct_array to get an array of Datums and isnull flags, then shuffled those, then used construct_array to build the output. (Or, perhaps, use construct_m
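The shape of the approach suggested here (split the array into values and null flags, shuffle both in lockstep, rebuild the array) can be modelled in Python. This is only an illustration of the data flow: `deconstruct` and `construct` below are hypothetical stand-ins, not the PostgreSQL C functions deconstruct_array and construct_array, and `None` stands in for SQL NULL.

```python
import random

def deconstruct(array):
    """Split an array into parallel lists of values and is-null flags."""
    values = [0 if v is None else v for v in array]  # placeholder for NULLs
    nulls = [v is None for v in array]
    return values, nulls

def construct(values, nulls):
    """Rebuild an array from parallel value and is-null lists."""
    return [None if isnull else v for v, isnull in zip(values, nulls)]

def shuffle_array(array, rng):
    """Fisher-Yates shuffle that moves each value together with its null flag."""
    values, nulls = deconstruct(array)
    for i in range(len(values) - 1, 0, -1):
        j = rng.randint(0, i)  # both bounds inclusive
        values[i], values[j] = values[j], values[i]
        nulls[i], nulls[j] = nulls[j], nulls[i]
    return construct(values, nulls)

rng = random.Random(17)  # fixed seed only to make the sketch repeatable
print(shuffle_array([1, None, 2, 3, None], rng))
```

Keeping the null flags in a separate list means the shuffle never has to interpret the element values, which is what makes the approach independent of the element type.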

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-17 Thread Martin Kalcher
586dffe6e590ba9 Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] introduce array_shuffle() --- src/backend/utils/adt/arrayfuncs.c | 73 ++ src/include/catalog/pg_proc.dat| 3 ++ 2 files changed, 76 insertions(+) d

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-17 Thread Martin Kalcher
On 17.07.22 at 05:32, David G. Johnston wrote: +SELECT sample('{1,2,3,4,5,6,7,8,9,10,11,12}', 6) != sample('{1,2,3,4,5,6,7,8,9,10,11,12}', 6); + ?column? +-- + t +(1 row) + While small, there is a non-zero chance for both samples to be equal. This test should probably just go, I don

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-17 Thread Martin Kalcher
On 17.07.22 at 08:00, Thomas Munro wrote: I went to see what Professor Lemire would have to say about all this, expecting to find a SIMD rabbit hole to fall down for some Sunday evening reading, but the main thing that jumped out was this article about the modulo operation required by textbook
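As general background on the modulo step mentioned here: a textbook shuffle reduces a raw random word to an index with a modulo, which is slightly biased whenever the bound does not divide the number of possible raw values. The sketch below contrasts that with a rejection loop. It is an assumption-laden illustration, not a summary of the article and not the patch's code; `bounded_modulo`, `bounded_unbiased`, and the `bits` parameter are hypothetical.

```python
import random

def bounded_modulo(raw, bound):
    """Textbook reduction: biased unless bound divides the raw value range."""
    return raw % bound

def bounded_unbiased(next_raw, bound, bits=32):
    """Draw uniformly from [0, bound) by rejecting raw values in the biased tail.

    next_raw is a callable returning a uniform integer in [0, 2**bits).
    """
    if bound <= 0 or bound > (1 << bits):
        raise ValueError("bound must be in 1..2**bits")
    span = 1 << bits
    limit = span - (span % bound)  # largest multiple of bound not above span
    while True:
        raw = next_raw()
        if raw < limit:
            return raw % bound

# With 3-bit raw values (0..7) and bound 3, plain modulo favours 0 and 1.
counts = [0, 0, 0]
for raw in range(8):
    counts[bounded_modulo(raw, 3)] += 1
print(counts)

rng = random.Random(5)  # fixed seed only to make the sketch repeatable
print(bounded_unbiased(lambda: rng.getrandbits(32), 10))
```

The rejection loop costs an extra draw only when the raw value lands in the short tail, so for small bounds relative to the word size it almost never repeats.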

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-17 Thread Martin Kalcher
On 17.07.22 at 05:37, Tom Lane wrote: Actually ... is there a reason to bother with an intarray version at all, rather than going straight for an in-core anyarray function? It's not obvious to me that an int4-only version would have major performance advantages. regards

Re: Proposal to introduce a shuffle function to intarray extension

2022-07-16 Thread Martin Kalcher
On 16.07.22 at 23:56, Thomas Munro wrote: On Fri, Jul 15, 2022 at 8:36 PM Martin Kalcher wrote: I would like to see a function like this inside the intarray extension. Is there any way to get to this point? How is the process to deal with such proposals? Hi Martin, I'm redirecting th