User-defined function with anyrange[]
Hello,

I want to make an aggregate function range_agg(anyrange) that returns anyrange[]. But when I try to define it, Postgres tells me it doesn't know what an anyrange[] is. I get this error:

    ERROR:  type anyrange[] does not exist

I also tried taking an anyrange and returning an anyarray, which does let me define the function, but upon calling it I get an error. For example:

    paul=# CREATE OR REPLACE FUNCTION range_agg4(anyrange)
    RETURNS anyarray
    AS $$
    BEGIN
      RETURN ARRAY[$1];
    END;
    $$ LANGUAGE plpgsql;

    paul=# select range_agg4(daterange('2016-05-04', '2016-07-01'));
    ERROR:  invalid input syntax for type date: "[2016-05-04,2016-07-01)"
    CONTEXT:  PL/pgSQL function range_agg4(anyrange) while casting return value to function's return type

So I guess it thinks that daterange in means date[] out (not daterange[] out).

The best I can think of is to define range_agg(anyelement) that returns anyarray. That actually works how I hope:

    paul=# CREATE OR REPLACE FUNCTION range_agg3(anyelement)
    RETURNS anyarray
    AS $$
    BEGIN
      RETURN ARRAY[$1];
    END;
    $$ LANGUAGE plpgsql;

    paul=# select range_agg3(daterange('2016-05-04', '2016-07-01'));
              range_agg3
    -----------------------------
     {"[2016-05-04,2016-07-01)"}
    (1 row)

But of course that is not as restricted as I'd like. Are there any better options?

Thanks,
Paul
Re: User-defined function with anyrange[]
> you can use overloading to define
> several functions of the same name, and just write out one for each
> range type you actually need this functionality for.

Thanks! I was hoping to avoid that, but it's what I wound up doing after all, as you can see here:

https://github.com/pjungwir/range_agg

In the README I give instructions for creating a new aggregate with a custom range type, and it is actually not bad at all, since the same C function can serve as the implementation in all cases.

Thanks again,
Paul
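P.S. For anyone finding this later, here is roughly what one per-type definition looks like. This is just an illustrative sketch, with a plain SQL transition function standing in for the extension's shared C implementation (and note that Postgres 14+ ships its own built-in range_agg, so on a modern version you'd pick another name):

    -- Append each input range to an accumulating array:
    CREATE FUNCTION range_agg_append(daterange[], daterange)
    RETURNS daterange[]
    AS $$ SELECT $1 || $2 $$
    LANGUAGE sql IMMUTABLE;

    CREATE AGGREGATE range_agg(daterange) (
      SFUNC = range_agg_append,
      STYPE = daterange[],
      INITCOND = '{}'
    );

You repeat that pair for each range type you need, which is the overloading suggested above.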
Re: How to split an array into columns
On Fri, Aug 24, 2018 at 2:01 AM, a <372660...@qq.com> wrote:
>
> Say if I have an float8 array:
>
> id | data
> ---|-------
> a  | {1,2}
> b  | {2,4}
>
> . . .
>
> Since I would have around 200,000 rows, I would prefer it having enough
> capacity to carry out the calculation such as sum().

Is this something that would help you?:

https://github.com/pjungwir/aggs_for_vecs

(also on pgxn: https://pgxn.org/dist/aggs_for_vecs)

It would let you say `SELECT vec_to_sum(data)` and get `{3,6}`.

Paul
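P.S. If you'd rather avoid an extension, here is a plain-SQL sketch of the same element-wise sum, assuming your table is named t and all the arrays have equal length (with 200,000 rows the extension's C aggregate should be considerably faster):

    -- Unnest each array with its position, sum per position,
    -- then reassemble the per-position sums into one array:
    SELECT array_agg(s ORDER BY i) AS sums
    FROM (
      SELECT i, sum(v) AS s
      FROM t, unnest(t.data) WITH ORDINALITY AS u(v, i)
      GROUP BY i
    ) per_position;
    -- => {3,6} for the sample data above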
flinfo NULL in DirectFunctionCall callee?
Hello,

I'm writing a multirange_cmp function which calls range_cmp to compare individual elements. It looks like this:

    cmp = DatumGetInt32(DirectFunctionCall2(
        range_cmp,
        RangeTypePGetDatum(r1),
        RangeTypePGetDatum(r2)));

But I get a segfault when range_cmp tries to reach its fcinfo->flinfo->fn_extra, because it has a NULL flinfo, as you can see from these debugging messages:

    NOTICE:  multirange_cmp fcinfo = 0x7f875a0a66c0
    NOTICE:  multirange_cmp flinfo = 0x7f875a0a6690
    NOTICE:  multirange_cmp fn_extra = 0x7f875a099450
    NOTICE:  range_cmp fcinfo = 0x7ffee5ff9820
    NOTICE:  range_cmp flinfo = 0x0

Is it expected for flinfo to be null when using DirectFunctionCall*? Is there something else I should use instead? It looks like FunctionCall2 would let me pass my own flinfo, but I'm not sure how to set that up first.

Thanks!
Paul
Re: flinfo NULL in DirectFunctionCall callee?
On Mon, Aug 26, 2019 at 8:12 AM Tom Lane wrote:
>
> Paul A Jungwirth writes:
> > Is it expected for flinfo to be null when using DirectFunctionCall*?
>
> Yes, per the comment on those functions:
>
>  * These are for invocation of a specifically named function with a
>  * directly-computed parameter list.  Note that neither arguments nor result
>  * are allowed to be NULL.  Also, the function cannot be one that needs to
>  * look at FmgrInfo, since there won't be any.

Thanks Tom! Alas, I saw that same comment in fmgr.h, but there it is missing the last sentence. Would you like a patch updating it there? Is that the kind of thing you make CF entries for? (And sorry this is veering into pgsql-hackers territory.)

> Use fmgr_info() or a sibling to fill in the FmgrInfo.

Thanks for this too! I wound up just adding a range_cmp_internal to avoid the whole issue, which is a pattern already used by a bunch of the other range_* functions.

Yours,
Paul
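P.S. For the archives, here is a sketch of what the fmgr_info() route would look like (not what I actually committed; F_RANGE_CMP is range_cmp's OID constant from utils/fmgroids.h):

    #include "fmgr.h"
    #include "utils/fmgroids.h"     /* for F_RANGE_CMP */

    FmgrInfo    cmp_finfo;
    int32       cmp;

    /* Fill in a real FmgrInfo so range_cmp has an flinfo to cache into. */
    fmgr_info(F_RANGE_CMP, &cmp_finfo);

    cmp = DatumGetInt32(FunctionCall2(&cmp_finfo,
                                      RangeTypePGetDatum(r1),
                                      RangeTypePGetDatum(r2)));

In a real function you would build the FmgrInfo once and stash it in your own fn_extra rather than repeating fmgr_info() on every comparison, since the whole point of fn_extra is to cache across calls.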
Re: Work hours?
On Tue, Aug 27, 2019 at 3:27 PM stan wrote:
> I need to write a function that, given a month, and a year as input returns
> the "work hours" in that month. In other words something like
>
> 8 * the count of all days in the range (Monday to Friday) within that
> calendar month.

This gives you all the weekdays in August 2019:

    select t::date
    from generate_series('2019-08-01'::date, '2019-09-01'::date, interval '1 day') s(t)
    where extract(dow from t) not in (0, 6);

From there you could count & multiply by 8 (e.g. `select count(*) * 8` instead). You'll probably want to remove holidays first though. If those lived in another table you could do a NOT EXISTS to remove them before you count.

Yours,
Paul
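P.S. Putting those pieces together, assuming a hypothetical holidays table with a single date column named day:

    select count(*) * 8 as work_hours
    from generate_series('2019-08-01'::date, '2019-09-01'::date, interval '1 day') s(t)
    where extract(dow from t) not in (0, 6)
      -- skip holidays:
      and not exists (select 1 from holidays h where h.day = t::date);

(The 2019-09-01 endpoint is included by generate_series, but it's a Sunday, so the day-of-week filter drops it.)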
When does Postgres use binary I/O?
Hello,

I've read the docs at [1] and also this interesting recent post about adding binary I/O to the hashtypes extension [2]. I wrote send & recv functions for my new multirange types, but I'm not sure how to test them. After running installcheck or installcheck-world, the code coverage report says they aren't tested, nor are the send/recv functions for ranges or arrays.

When does Postgres actually use these functions? Is there a good way to test them?

Thanks,
Paul

[1] https://www.postgresql.org/docs/11/xtypes.html
[2] http://www.myrkraverk.com/blog/2019/08/postgresql-retroactively-adding-binary-i-o-to-a-type/
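P.S. One path I do know about is binary COPY, which goes through each column type's send function on the way out and its recv function on the way back in. So I can at least exercise these by hand like this (using int4range here since my multiranges aren't committed anywhere):

    -- in psql:
    \copy (select int4range(1, 5)) to '/tmp/r.bin' with (format binary)
    create temp table t (r int4range);
    \copy t from '/tmp/r.bin' with (format binary)
    select * from t;   -- [1,5)

But nothing in the regression tests seems to take that path.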
Re: When does Postgres use binary I/O?
On Sun, Sep 22, 2019 at 11:53 AM Tom Lane wrote:
> I thought of an easier-to-maintain approach to that part than having
> a reference file. Binary send functions are invokable from SQL,
> so we could just imagine adding test cases along the lines of
>
> regression=# select int8send(42);
>       int8send
> --------------------
>  \x000000000000002a
> (1 row)
>
> for each data type. This would be enough to detect endianness issues,
> garbage in padding bytes, etc.

I just finished my multirange patch (well, "finished" :-), so I might be willing to sign up for this. Would you scatter these tests around in the various existing files? Or add a new cross-cutting file (like opr_sanity etc.)?

> The receive functions are not so easy to call, so we still need a
> round-trip test, but you could imagine a TAP test framework for that.
> Or, perhaps, the thing to do would be to provide a generic test function
> that takes a value, runs it through the type's send and then receive
> functions, and returns the result (or just complains if it gets different
> bits out ...)

So you're saying the latter option is to add a new function that someone can call from SQL, that just round-trips a value through send + recv? And then call that from an ordinary regress test? I guess the tests themselves can't define the function (like they define binary_coercible), because you need to call *_recv from C, so this would actually be a function we ship and document, right? That seems within my abilities. Should I move this thread over to pgsql-hackers for this?

Paul
Re: jsonb_set() strictness considered harmful to data
> That said, I think it is reasonable that a PostgreSQL JSON function
> behaves in the way that JSON users would expect, so here is my +1 for
> interpreting an SQL NULL as a JSON null in the above case

Just to chime in as another application developer: the current functionality does seem pretty surprising and dangerous to me. Raising an exception seems pretty annoying. Setting the key's value to a JSON null would be fine, but I also like the idea of removing the key entirely, since that gives you strictly more functionality: you can always set the key to a JSON null by passing one in, if that's what you want.

But there are lots of other functions that convert SQL NULL to JSON null:

    postgres=# select row_to_json(row(null)), json_build_object('foo', null),
                      json_object(array['foo', null]), json_object(array['foo'], array[null]);
     row_to_json | json_build_object |  json_object   |  json_object
    -------------+-------------------+----------------+----------------
     {"f1":null} | {"foo" : null}    | {"foo" : null} | {"foo" : null}
    (1 row)

(The jsonb variants give the same results.) I think those functions are very similar to jsonb_set here, and I'd expect jsonb_set to do what they do (i.e. convert SQL NULL to JSON null).

Paul
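P.S. In case a concrete example helps, here is the gotcha in question:

    -- jsonb_set is STRICT, so an SQL NULL new value nulls out the whole result:
    select jsonb_set('{"a": 1}'::jsonb, '{a}', NULL);    -- returns SQL NULL
    -- whereas a JSON null behaves the way most people expect:
    select jsonb_set('{"a": 1}'::jsonb, '{a}', 'null');  -- {"a": null}

In an UPDATE, that first form silently replaces the entire column with NULL.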
Rules documentation example
Hello,

I'm reading the docs about the Postgres rule system here:

https://www.postgresql.org/docs/12/rules-views.html

That page says:

> It turns out that the planner will collapse this tree into a two-level query
> tree: the bottommost SELECT commands will be “pulled up” into the middle
> SELECT since there's no need to process them separately. But the middle
> SELECT will remain separate from the top, because it contains aggregate
> functions. If we pulled those up it would change the behavior of the topmost
> SELECT, which we don't want.

But I don't see an aggregate function. Is it referring to MIN? But that is the two-parameter version defined on the same page, and it isn't an aggregate:

    CREATE FUNCTION min(integer, integer) RETURNS integer AS $$
        SELECT CASE WHEN $1 < $2 THEN $1 ELSE $2 END
    $$ LANGUAGE SQL STRICT;

Is that an error in the docs, or am I missing something? Does a non-aggregate function also prevent the subqueries from being pulled up? Will all levels of that query actually get combined, or does something else prevent it?

Thanks,
Paul
Re: a proposal for a new functionality: "SELECT * [EXCEPT col1 [,col2]]
On Tue, Feb 25, 2020 at 2:18 AM Stanislav Motyčka wrote:
> Sometimes (for tables with many columns) it would be better and easier to
> write "SELECT" statement with clause "EXCEPT":
> "SELECT * [EXCEPT col1 [,col2]] FROM ..."

I've wanted this feature lots of times and would be delighted to see it in Postgres.

On Tue, Feb 25, 2020 at 6:51 AM Miles Elam wrote:
> Do you mean
> "select everything from tablex except for tablex.col1, and also select
> tablex.col2 and tabley.col1"
> or
> "select everything from tablex except for tablex.col1 AND tablex.col2, and
> also select tabley.col1"
> ?

I take the proposal to mean this:

    SELECT listOfColumns [EXCEPT listOfColumns] FROM ...

not this:

    SELECT listOfColumns [EXCEPT (listOfColumns) [listOfColumns [EXCEPT (listOfColumns)]]]... FROM ...

So there is always a single EXCEPT clause (if any), and it comes after the entire SELECT clause. Then there is no ambiguity. Also this approach makes the feature easy to understand and use. I don't see any benefit to letting people interleave selected & excepted columns.

Regards,
Paul
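P.S. Until something like this exists, one workaround is to generate the column list from the catalog and paste it into your query. A sketch, using the hypothetical tablex and col1/col2 from above:

    -- Build the SELECT list minus the excluded columns:
    SELECT string_agg(quote_ident(column_name), ', ' ORDER BY ordinal_position)
    FROM information_schema.columns
    WHERE table_schema = 'public'
      AND table_name = 'tablex'
      AND column_name NOT IN ('col1', 'col2');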
Re: Postgresql range_agg() Return empty list
On Wed, Jul 10, 2024 at 6:37 PM Han Tang wrote:
> I am using range_agg() function, it works fine with original table value
>
> But when I try to do some update for range value, it will give back an empty
> list
>
> Select range_agg(b.r)
> From (
>   Select int8range(lower(bin_range)+1, upper(bin_range)+5) as r
>   From bin_data) as b;

The value '(,)' signifies a range with no lower bound and no upper bound. So '{(,)}' is the multirange that contains just that range, unbounded in both directions: in other words, not an empty list but every integer. Ranges use null to indicate unbounded, so it is the same as '(null,null)'.

It looks like the inputs in your second screenshot must include some ranges with null lower & upper bounds, so then when you combine them you get everything. It appears that query is on lines 153-156 of some larger sequence? So I'm guessing the table contents are not the same as when you generated the first screenshot. If you think something else is going on, perhaps you could share the full contents of bin_data.

Yours,
Paul
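P.S. You can reproduce the effect directly:

    -- One unbounded input swallows everything else in the aggregate:
    select range_agg(r)
    from (values (int8range(1, 5)),
                 (int8range(null, null))) t(r);
    -- => {(,)}

Note also that lower(bin_range)+1 is null whenever lower(bin_range) is null, so an unbounded input range stays unbounded through your arithmetic.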
Temporal Foreign Keys
Since there have been a couple of threads on the hackers list about temporal features [1, 2], I thought I'd share an extension I've been writing for temporal foreign keys:

https://github.com/pjungwir/time_for_keys

There is a big test suite, but right now it is still basically a proof of concept: restricted to single-column-integer+tstzrange, implemented mostly in plpgsql, with no extra support when a table/column is dropped/renamed. I'd like to add support for:

- multi-column foreign keys
- other FK types than integers
- other range types than tstzrange (at least tsrange, but maybe any range)
- better catalog integration: show one constraint instead of four constraint triggers, properly restrict/cascade when a table/column is dropped, make the triggers work when a table/column is renamed
- ON UPDATE/DELETE CASCADE/SET NULL/SET DEFAULT
- FOR KEY SHARE locking, which I think I need for correct concurrent behavior

I think all that would be easier if I rewrote the plpgsql parts in C. I've been reading the code in commands/tablecmds.c, catalog/pg_constraint.c, utils/adt/ri_triggers.c, catalog/dependency.c, etc., and I think I'm ready to get started. I think I'd be doing a lot of copy/paste/adjust from the normal RI code. But first I wanted to share what I have so far and see if anyone could offer advice or feedback.

Some other extensions that have been helpful for me to read are:

- https://github.com/arkhipov/temporal_tables
- https://github.com/CartoDB/pg_schema_triggers for detecting DROP/ALTER commands

Right now this is an extension, but I'd be honored to contribute it to the core project eventually if folks want that. One issue there is that (traditional) foreign keys require indexes, and you can't index a thing+range without btree_gist, another extension.

Oh, sorry if this belongs on the hackers list instead of general. I wasn't really sure which was best!

Thanks,
Paul

[1] https://www.postgresql.org/message-id/CALNdv1h7TUP24Nro53KecvWB2kwA67p%2BPByDuP6_1GeESTFgSA%40mail.gmail.com
[2] https://www.postgresql.org/message-id/78aadf6b-86d4-21b9-9c2a-51f1efb8a499%40postgrespro.ru
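P.S. To make the btree_gist point concrete, here is a sketch with illustrative table names:

    CREATE EXTENSION btree_gist;  -- required to mix a scalar and a range in one GiST index

    CREATE TABLE rooms (
      id        integer,
      valid_at  tstzrange,
      -- a temporal "primary key": no overlapping valid_at for the same id
      EXCLUDE USING gist (id WITH =, valid_at WITH &&)
    );

Without btree_gist there is no GiST opclass for plain integer, so the exclusion constraint (and the index a temporal foreign key would want to use) can't be built.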
Re: Using AWS ephemeral SSD storage for production database workload?
> I have come across some ideas on the Internet
> where people hinted at running production PostgreSQL workloads
> on AWS ephemeral SSD storage.

I think people were more serious about that before AWS introduced PIOPS. I wouldn't do this unless I had streaming replication to a standby, plus regular backups (e.g. with WAL-E). Actually that's what I use even with regular EBS volumes, and it's not hard to set up. The standby gives you fast failover/recovery, and WAL-E gives you an extra layer of insurance.

Another worry about ephemeral storage is that you can't add more: the amount you get depends on the instance type. Also it seems like modern instances don't come with as much ephemeral storage as they used to, although maybe that impression is mistaken.

I agree that PIOPS is still expensive though. Some people get around that by running a RAID0 array of gp2 EBS volumes. The conversation at https://news.ycombinator.com/item?id=13842044 has some details. I've set it up for one client, and it wasn't too bad. It's been running fine for 6 months or so.

Paul