User-defined function with anyrange[]

2018-07-05 Thread Paul A Jungwirth
Hello,

I want to make an aggregate function range_agg(anyrange) that returns
anyrange[]. But when I try to define it, Postgres tells me it doesn't
know what an anyrange[] is. I get this error:

ERROR:  type anyrange[] does not exist

I also tried taking an anyrange and returning an anyarray, which does
let me define the function, but upon calling it I get an error. For
example:

paul=# CREATE OR REPLACE FUNCTION range_agg4(anyrange)
RETURNS anyarray
AS $$
BEGIN
  RETURN ARRAY[$1];
END;
$$
LANGUAGE plpgsql;

paul=# select range_agg4(daterange('2016-05-04', '2016-07-01'));
ERROR:  invalid input syntax for type date: "[2016-05-04,2016-07-01)"
CONTEXT:  PL/pgSQL function range_agg4(anyrange) while casting
return value to function's return type

So I guess Postgres resolves anyarray from the range's subtype: daterange in
means date[] out (not daterange[] out).

The best I can think of is to define range_agg(anyelement) that
returns anyarray. That actually works how I hope:

paul=# CREATE OR REPLACE FUNCTION range_agg3(anyelement)
RETURNS anyarray
AS $$
BEGIN
  RETURN ARRAY[$1];
END;
$$
LANGUAGE plpgsql;

paul=# select range_agg3(daterange('2016-05-04', '2016-07-01'));
         range_agg3
-----------------------------
 {"[2016-05-04,2016-07-01)"}
(1 row)

But of course that is not as restricted as I'd like. Are there any
better options?

Thanks,
Paul



Re: User-defined function with anyrange[]

2018-07-08 Thread Paul A Jungwirth
> you can use overloading to define
> several functions of the same name, and just write out one for each
> range type you actually need this functionality for.

Thanks! I was hoping to avoid that, but it's what I wound up doing
after all, as you can see here:

https://github.com/pjungwir/range_agg

In the README I give instructions to create a new aggregate with a
custom range type, and it is actually not bad at all, since the same C
function can serve as implementation in all cases.
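For anyone curious, the per-type definitions look roughly like this. This is
only an illustrative sketch (the names and signatures here are made up, not
the extension's actual API): one CREATE AGGREGATE per concrete range type,
each delegating to the same shared C transition function.

```sql
-- Hypothetical sketch: one overload per concrete range type,
-- all backed by the same C implementation.
CREATE AGGREGATE range_agg(daterange) (
  sfunc     = range_agg_transfn,          -- shared C transition function
  stype     = internal,
  finalfunc = range_agg_finalfn_daterange -- per-type final function
);

CREATE AGGREGATE range_agg(int8range) (
  sfunc     = range_agg_transfn,
  stype     = internal,
  finalfunc = range_agg_finalfn_int8range
);
```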

Thanks again,
Paul



Re: How to split an array into columns

2018-08-24 Thread Paul A Jungwirth
On Fri, Aug 24, 2018 at 2:01 AM, a <372660...@qq.com> wrote:
>
> Say if I have an float8 array:
>
> id| data
> --|---
> a | {1,2}
> b | {2,4}
>
> . . .
>
> Since I would have around 200,000 rows, I would prefer it having enough 
> capacity to carry out the calculation such as sum().

Is this something that would help you?:

https://github.com/pjungwir/aggs_for_vecs

(also on pgxn: https://pgxn.org/dist/aggs_for_vecs)

It would let you say `SELECT vec_to_sum(data)` and get `{3,6}`.

Paul



flinfo NULL in DirectFunctionCall callee?

2019-08-26 Thread Paul A Jungwirth
Hello,

I'm writing a multirange_cmp function which calls range_cmp to compare
individual elements. It looks like this:

cmp = DatumGetInt32(DirectFunctionCall2(
range_cmp,
RangeTypePGetDatum(r1),
RangeTypePGetDatum(r2)));

But I get a segfault when range_cmp tries to reach its
fcinfo->flinfo->fn_extra, because it has a NULL flinfo, as you can see
from these debugging messages:

NOTICE:  multirange_cmp fcinfo = 0x7f875a0a66c0
NOTICE:  multirange_cmp flinfo = 0x7f875a0a6690
NOTICE:  multirange_cmp fn_extra = 0x7f875a099450
NOTICE:  range_cmp fcinfo = 0x7ffee5ff9820
NOTICE:  range_cmp flinfo = 0x0

Is it expected for flinfo to be null when using DirectFunctionCall*?
Is there something else I should use instead? It looks like
FunctionCall2 would let me pass my own flinfo, but I'm not sure how to
set that up first.

Thanks!
Paul




Re: flinfo NULL in DirectFunctionCall callee?

2019-08-26 Thread Paul A Jungwirth
On Mon, Aug 26, 2019 at 8:12 AM Tom Lane  wrote:
>
> Paul A Jungwirth  writes:
> > Is it expected for flinfo to be null when using DirectFunctionCall*?
>
> Yes, per the comment on those functions:
>
>  * These are for invocation of a specifically named function with a
>  * directly-computed parameter list.  Note that neither arguments nor result
>  * are allowed to be NULL.  Also, the function cannot be one that needs to
>  * look at FmgrInfo, since there won't be any.

Thanks Tom! Alas I saw that same comment in fmgr.h but it was missing
the last sentence. Would you like a patch updating it there? Is that
the kind of thing you make CF entries for? (And sorry this is veering
into pgsql-hackers territory)

> Use fmgr_info() or a sibling to fill in the FmgrInfo.

Thanks for this too! I wound up just adding a range_cmp_internal to
avoid the whole issue, which is a pattern already used by a bunch of
the other range_* functions.

Yours,
Paul




Re: Work hours?

2019-08-27 Thread Paul A Jungwirth
On Tue, Aug 27, 2019 at 3:27 PM stan  wrote:
> I need to write a function that, given a month, and a year as input returns
> the "work hours" in that month. In other words something like
>
> 8 * the count of all days in the range Monday to Friday) within that
> calendar month.

This gives you all the weekdays in August 2019:

select t::date
from generate_series('2019-08-01'::date, '2019-08-31'::date,
                     interval '1 day') s(t)
where extract(dow from t) not in (0, 6);

(Note the upper bound is '2019-08-31', not '2019-09-01': generate_series
includes its end point, so using the first of the next month would pick up an
extra day whenever it falls on a weekday.)

From there you could count & multiply by 8 (e.g. `select count(*) * 8`
instead). You'll probably want to remove holidays first though. If
those lived in another table you could do a NOT EXISTS to remove them
before you count.
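Putting those pieces together, it might look like this. The `holidays` table
and its `holiday` date column are my own invention here, just to show the
NOT EXISTS shape:

```sql
-- 8 hours per weekday in the month, minus rows matching a
-- (hypothetical) holidays table.
SELECT count(*) * 8 AS work_hours
FROM generate_series('2019-08-01'::date, '2019-08-31'::date,
                     interval '1 day') s(t)
WHERE extract(dow FROM t) NOT IN (0, 6)
  AND NOT EXISTS (SELECT 1 FROM holidays h WHERE h.holiday = t::date);
```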

Yours,
Paul




When does Postgres use binary I/O?

2019-09-17 Thread Paul A Jungwirth
Hello,

I've read the docs at [1] and also this interesting recent post about
adding binary I/O to the hashtypes extension. I wrote send & recv
functions for my new multirange types, but I'm not sure how to test
them. After running installcheck or installcheck-world, the code
coverage report says they aren't tested, nor are the send/recv
functions for ranges or arrays. When does Postgres actually use these
functions? Is there a good way to test them?

Thanks,
Paul

[1] https://www.postgresql.org/docs/11/xtypes.html
[2] 
http://www.myrkraverk.com/blog/2019/08/postgresql-retroactively-adding-binary-i-o-to-a-type/




Re: When does Postgres use binary I/O?

2019-09-22 Thread Paul A Jungwirth
On Sun, Sep 22, 2019 at 11:53 AM Tom Lane  wrote:
> I thought of an easier-to-maintain approach to that part than having
> a reference file.  Binary send functions are invokable from SQL,
> so we could just imagine adding test cases along the lines of
>
> regression=# select int8send(42);
>   int8send
> --------------------
>  \x000000000000002a
> (1 row)
>
> for each data type.  This would be enough to detect endianness issues,
> garbage in padding bytes, etc.

I just finished my multirange patch (well, "finished" :-), so I might
be willing to sign up for this. Would you scatter these tests around
in the various existing files? Or add a new cross-cutting file (like
opr_sanity etc)?

> The receive functions are not so easy to call, so we still need a
> round-trip test, but you could imagine a TAP test framework for that.
> Or, perhaps, the thing to do would be to provide a generic test function
> that takes a value, runs it through the type's send and then receive
> functions, and returns the result (or just complains if it gets different
> bits out ...)

So you're saying the latter option is to add a new function that
someone can call from SQL, that just round-trips a value through send
+ recv? And then call that from an ordinary regress test? I guess the
tests themselves can't define the function (like they define
binary_coercible), because you need to call *_recv from C, so this
would actually be a function we ship and document, right? That seems
within my abilities.

Should I move this thread over to pgsql-hackers for this?

Paul




Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread Paul A Jungwirth
> That said, I think it is reasonable that a PostgreSQL JSON function
> behaves in the way that JSON users would expect, so here is my +1 for
> interpreting an SQL NULL as a JSON null in the above case

Just to chime in as another application developer: the current
functionality does seem pretty surprising and dangerous to me. Raising
an exception seems pretty annoying. Setting the key's value to a JSON
null would be fine, but I also like the idea of removing the key
entirely, since that gives you strictly more functionality: you can
always set the key to a JSON null by passing one in, if that's what
you want. But there are lots of other functions that convert SQL NULL
to JSON null:

postgres=# select row_to_json(row(null)), json_build_object('foo',
null), json_object(array['foo', null]), json_object(array['foo'],
array[null]);
 row_to_json | json_build_object |  json_object   |  json_object
-------------+-------------------+----------------+----------------
 {"f1":null} | {"foo" : null}    | {"foo" : null} | {"foo" : null}
(1 row)

(The jsonb variants give the same results.)

I think those functions are very similar to json_set here, and I'd
expect json_set to do what they do (i.e. convert SQL NULL to JSON
null).
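In the meantime, one workaround is a thin wrapper that coalesces an SQL NULL
into a JSON null before calling jsonb_set. The function name here is my own,
not anything shipped with Postgres:

```sql
-- Wrapper that treats SQL NULL as JSON null instead of
-- propagating NULL through the whole document.
CREATE FUNCTION jsonb_set_coalesce(target jsonb, path text[], new_value jsonb)
RETURNS jsonb AS $$
  SELECT jsonb_set(target, path, coalesce(new_value, 'null'::jsonb));
$$ LANGUAGE sql IMMUTABLE;

-- SELECT jsonb_set_coalesce('{"foo":1}', '{bar}', NULL);
-- gives {"foo": 1, "bar": null} rather than SQL NULL.
```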

Paul




Rules documentation example

2019-11-11 Thread Paul A Jungwirth
Hello,

I'm reading the docs about the Postgres Rule system here:

https://www.postgresql.org/docs/12/rules-views.html

That page says:

> It turns out that the planner will collapse this tree into a two-level query 
> tree: the bottommost SELECT commands will be “pulled up” into the middle 
> SELECT since there's no need to process them separately. But the middle 
> SELECT will remain separate from the top, because it contains aggregate 
> functions. If we pulled those up it would change the behavior of the topmost 
> SELECT, which we don't want.

But I don't see an aggregate function. Is it referring to min? That's the
two-parameter version defined on the same page, which isn't an aggregate:

CREATE FUNCTION min(integer, integer) RETURNS integer AS $$
SELECT CASE WHEN $1 < $2 THEN $1 ELSE $2 END
$$ LANGUAGE SQL STRICT;

Is that an error in the docs, or am I missing something?

Does a non-aggregate function also prevent the subqueries from being
pulled up? Will all levels of that query actually get combined, or
does something else prevent it?

Thanks,
Paul




Re: a proposal for a new functionality: "SELECT * [EXCEPT col1 [,col2]]

2020-02-25 Thread Paul A Jungwirth
On Tue, Feb 25, 2020 at 2:18 AM Stanislav Motyčka
 wrote:
> Sometimes (for tables with many columns) it would be better and easier to 
> write "SELECT" statement with clause "EXCEPT":
> "SELECT * [EXCEPT col1 [,col2]] FROM ..."

I've wanted this feature lots of times and would be delighted to see
it in Postgres.

On Tue, Feb 25, 2020 at 6:51 AM Miles Elam  wrote:
> Do you mean
>   "select everything from tablex except for tablex.col1, and also select 
> tablex.col2 and tabley.col1"
> or
>   "select everything from tablex except for tablex.col1 AND tablex.col2, and 
> also select tabley.col1"
> ?

I take the proposal to mean this:

SELECT listOfColumns [EXCEPT listOfColumns] FROM ...

not this:

SELECT listOfColumns [EXCEPT (listOfColumns) [listOfColumns [EXCEPT
(listOfColumns)]]]... FROM ...

So there is always a single EXCEPT clause (if any) and it comes after
the entire SELECT clause. Then there is no ambiguity. Also this
approach makes the feature easy to understand and use. I don't see any
benefit to letting people interleave selected & excepted columns.
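So under that reading, a query would look something like this (the syntax is
of course hypothetical at this point, and the table and column names are made
up for illustration):

```sql
-- Hypothetical syntax: select every column of wide_table
-- except the two listed, without spelling out the rest by hand.
SELECT * EXCEPT (big_blob, audit_log) FROM wide_table;
```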

Regards,
Paul




Re: Postgresql range_agg() Return empty list

2024-07-10 Thread Paul A Jungwirth
On Wed, Jul 10, 2024 at 6:37 PM Han Tang  wrote:
> I am using range_agg() function, it works fine with original table value
>
> But when I try to do some update for range value, it will give back an empty 
> list
>
> Select range_agg(b.r)
> From (
>   Select int8range(lower(bin_range)+1, upper(bin_range)+5) as r
>   From bin_data) as b;
>

The value '(,)' signifies a range with no lower bound and no upper
bound. So '{(,)}' is the multirange containing just that range
(unbounded in both directions): in other words, not an empty list but
every integer. Ranges use null to indicate an unbounded side, so it is
the same as '(null,null)'. It looks like the inputs in your second
screenshot must include some ranges with null lower & upper bounds, so
when you combine them you get everything.
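You can reproduce the effect directly: one unbounded range absorbs everything
else in the union:

```sql
-- A single null-bounded range makes the aggregated multirange unbounded.
SELECT range_agg(r)
FROM (VALUES (int8range(1, 5)),
             (int8range(NULL, NULL))) v(r);
-- returns {(,)}
```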

It appears that query is on lines 153-156 of some larger sequence? So
I'm guessing the table contents are not the same as when you generated
the first screenshot.

If you think something else is going on, perhaps you could share the
full contents of bin_data.

Yours,
Paul




Temporal Foreign Keys

2017-12-28 Thread Paul A Jungwirth
Since there have been a couple threads on the hackers list about
temporal features [1, 2], I thought I'd share an extension I've been
writing for temporal foreign keys:

https://github.com/pjungwir/time_for_keys

There is a big test suite, but right now it is still basically a
proof-of-concept, restricted to single-column-integer+tstzrange,
implemented mostly in plpgsql, with no extra support when a
table/column is dropped/renamed.

I'd like to add support for:

- multi-column foreign keys
- other FK types than integers
- other range types than tstzrange (at least tsrange, but maybe any range)
- better catalog integration: show one constraint instead of four
constraint triggers, properly restrict/cascade when a table/column is
dropped, make the triggers work when a table/column is renamed.
- support ON UPDATE/DELETE CASCADE/SET NULL/SET DEFAULT
- I think I also need to add FOR KEY SHARE locking to get correct
concurrent behavior.

I think all that would be easier if I rewrote the plpgsql parts in C.
I've been reading the code in commands/tablecmds.c,
catalog/pg_constraint.c, utils/adt/ri_triggers.c,
catalog/dependency.c, etc., and I think I'm ready to get started. I
think I'd be doing a lot of copy/paste/adjust from the normal RI code.
But first I wanted to share what I have so far and see if anyone could
offer advice or feedback.

Some other extensions that have been helpful for me to read are:

- https://github.com/arkhipov/temporal_tables
- https://github.com/CartoDB/pg_schema_triggers for detecting
DROP/ALTER commands

Right now this is an extension, but I'd be honored to contribute it to
the core project eventually if folks want that. One issue there is
that (traditional) foreign keys require indexes, and you can't index a
thing+range without btree_gist, another extension.
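Concretely, the constraint you'd want on the referenced side looks something
like this (table and column names made up; the EXCLUDE constraint builds the
GiST index behind the scenes, and mixing the scalar column into it is what
requires btree_gist):

```sql
CREATE EXTENSION btree_gist;  -- lets GiST index a scalar alongside a range

CREATE TABLE rooms (
  id       integer,
  valid_at tstzrange,
  -- no two rows may share an id over overlapping time ranges:
  EXCLUDE USING gist (id WITH =, valid_at WITH &&)
);
```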

Oh, sorry if this belongs on the hackers list instead of general. I
wasn't really sure which was best!

Thanks,
Paul

[1] 
https://www.postgresql.org/message-id/CALNdv1h7TUP24Nro53KecvWB2kwA67p%2BPByDuP6_1GeESTFgSA%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/78aadf6b-86d4-21b9-9c2a-51f1efb8a499%40postgrespro.ru



Re: Using AWS ephemeral SSD storage for production database workload?

2018-01-29 Thread Paul A Jungwirth
> I have come across some ideas on the Internet
> where people hinted at running production PostgreSQL workloads
> on AWS ephemeral SSD storage.

I think people were more serious about that before AWS introduced
PIOPS. I wouldn't do this unless I had streaming replication to a
standby, plus regular backups (e.g. with WAL-E). Actually that's what
I use even with regular EBS volumes, and it's not hard to set up. The
standby gives you fast failover/recovery, and WAL-E gives you an extra
layer of insurance.

Another worry about ephemeral storage is that you can't add more, and
the amount you get depends on the instance type. Also it seems like
modern instances don't come with as much ephemeral storage as they
used to, although maybe that impression is mistaken.

I agree that PIOPS is still expensive though. Some people get around
that by running a RAID0 array of gp2 EBS volumes. The conversation at
https://news.ycombinator.com/item?id=13842044 has some details. I've
set it up for one client, and it wasn't too bad. It's been running
fine for 6 months or so.

Paul