Postgres read jsonb content from stdin

2020-12-26 Thread Markur Sens
Hello,

I'm trying to build a few data pipelines with Unix tools but can't figure
out how to insert in a slurp mode (e.g. not COPY line by line) content
inside a variable.

Consider the following script (using a heredoc)

json_url="https://file.json";
local_file="/tmp/a.json"

curl -s -m 10 -A 'Mozilla/5.0 (X11; Linux x86_64; rv:30.0) Gecko/20100101
Firefox/30.0' \
--max-redirs 0 -o ${local_file} ${json_url}

psql "$PG_URI" -qAt <<SQL
create table if not exists my_table (data jsonb);

insert into my_table(data) values (pg_read_file('${local_file}')::jsonb)
on conflict do nothing;
SQL

The question is, how can I achieve the same result without having to hit
the disk due to the temporary file?
I tried running it using pg_read_file('/dev/stdin')::jsonb

Re: Postgres read jsonb content from stdin

2020-12-26 Thread Markur Sens
Hello,

Hadn't really thought of using a foreign table up to this point...
thanks for that.

Will the first solution handle formatting issues (e.g. single quotes)
gracefully?

I think I'd tried it in the past and it didn't work.

PS: I have such a script that handles the intricacies, but it would still
emit to stdout (hence the simplified curl in the example).


On Sat, Dec 26, 2020 at 2:40 PM Ian Lawrence Barwick wrote:

> On Sat, 26 Dec 2020 at 20:19, Markur Sens wrote:
> >
> > Hello,
> >
> > I'm trying to build a few data pipelines with Unix tools but can't
> figure out how to insert in a slurp mode (e.g. not COPY line by line)
> content inside a variable.
> >
> > Consider the following script (using a heredoc)
> >
> > json_url="https://file.json";
> > local_file="/tmp/a.json"
> >
> > curl -s -m 10 -A 'Mozilla/5.0 (X11; Linux x86_64; rv:30.0)
> Gecko/20100101 Firefox/30.0' \
> > --max-redirs 0 -o ${local_file} ${json_url}
> >
> > psql "$PG_URI" -qAt <<SQL
> > create table if not exists my_table (data jsonb);
> >
> > insert into my_table(data) values (pg_read_file('${local_file}')::jsonb)
> > on conflict do nothing;
> > SQL
> >
> > The question is, how can I achieve the same result, without having to
> > hit the disk due to the temporary file.
> > I tried running it using pg_read_file('/dev/stdin')::jsonb
>
> It can be done like this:
>
> $ curl http://localhost/json.txt
> {"bar": "baz", "balance": 7.77, "active": false}
>
> $ psql -v jsonval="`curl -s http://localhost/json.txt`" \
>     -d 'host=localhost dbname=postgres user=postgres' <<SQL
> INSERT INTO json_test values(:'jsonval')
> SQL
>
> INSERT 0 1
> Time: 0.374 ms
>
> though TBH if I were doing that on a regular basis, I'd do it via a script
> which
> could cope with errors retrieving the remote file, etc.
>
> If the data source (URL) is constant, you could try something along these
> lines
> with file_fdw:
>
> CREATE EXTENSION file_fdw;
>
> CREATE SERVER json_curl FOREIGN DATA WRAPPER file_fdw;
> CREATE FOREIGN TABLE json_src (
>   json_data jsonb
> )
> SERVER json_curl
> OPTIONS (
>   PROGRAM 'curl -s http://localhost/json.txt'
> );
>
>
> Better alternatives may be available.
>
>
> Regards
>
> Ian Barwick
>
>
>
> --
> EnterpriseDB: https://www.enterprisedb.com
>
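For context, a hedged sketch of how the foreign table above might then be used; `my_table` is the target table name assumed from the original script, and this assumes the JSON payload arrives as a single line (file_fdw's default text format treats each input line as one row):

```sql
-- Each scan of json_src re-runs the curl PROGRAM and parses its
-- stdout as jsonb (assumes the payload fits on one line).
SELECT json_data FROM json_src;

-- Hypothetical load into the table from the original script:
INSERT INTO my_table (data)
SELECT json_data FROM json_src
ON CONFLICT DO NOTHING;
```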


Docker Image with Python support ?

2021-01-13 Thread Markur Sens
Is there any available Postgres image that has been compiled with the
--with-python flag?

I could copy and modify this (commenting out what I don't need):
https://github.com/docker-library/postgres/blob/03e769531fff4c97cb755e4a608b24935c27/13/alpine/Dockerfile
but that looks like too much work for a simple flag.
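For what it's worth, a rough sketch of the kind of change involved, not a tested Dockerfile; the package names are guesses for the Alpine variant, and the relevant configure switch is --with-python:

```dockerfile
# Hypothetical fragment: install the Python toolchain, then add
# --with-python to the image's existing ./configure invocation.
RUN apk add --no-cache python3 python3-dev

RUN ./configure \
        --with-python \
        PYTHON=python3 \
    && make -j "$(nproc)" \
    && make install
```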


Alternative to slow SRF in SELECT?

2022-05-17 Thread Markur Sens
I have the following case:

select
    my_srf_func(otherfunc(h))
from (values ('…'::mytype),
             ('…'::mytype),
             ('…'::mytype),
             ('…'::mytype),
             ('…'::mytype))
     as temp(h);


I think it’s obvious what I’m trying to do.

My_srf_func is similar to unnest(array) function.

The problem now is that the result size of this could be around 400-500
elements for every call of otherfunc(h), so when my_srf_func unnests
these I should expect 2K+ items even for this simple case.

It’s also not advised to have SRFs in the SELECT list of a query, because I do
get some unexpected/unintuitive results, but most importantly it’s way too slow.

Any alternatives on how I could rework the query or my_srf_func to speed
things up, without having too many subqueries?

I guess there should be some kind of pattern there.

If it’s any help, both my_srf_func and otherfunc are coded in PL/Python.

Thanks.
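One common rework of the query above, sketched here under the assumption that my_srf_func returns a simple set of values: move the set-returning call out of the SELECT list into the FROM clause with LATERAL, so the planner treats it as a join term rather than a per-row SRF in the target list:

```sql
SELECT t.item
FROM (VALUES ('…'::mytype),
             ('…'::mytype),
             ('…'::mytype)) AS temp(h)
CROSS JOIN LATERAL my_srf_func(otherfunc(temp.h)) AS t(item);
```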



PG_GETARG_TEXT_PP vs PG_GETARG_TEXT_P

2022-06-11 Thread Markur Sens
In the “Extending SQL” chapter I see both of these forms are mentioned,
but I can’t find info about when to use which one.



Re: PG_GETARG_TEXT_PP vs PG_GETARG_TEXT_P

2022-06-11 Thread Markur Sens



> On 12 Jun 2022, at 12:06 AM, Tom Lane  wrote:
> 
> Markur Sens  writes:
>> In the “Extending SQL” chapter I see both of these forms are mentioned. 
>> But can’t find info about when to use which one.
> 
> PG_GETARG_TEXT_P returns a traditional-format, 4-byte-header value.
> 
> PG_GETARG_TEXT_PP is allowed to return either that or a 1-byte-header
> value, in case that's what the input is.
> 
> PG_GETARG_TEXT_PP is preferred in new code since it can avoid one
> step of palloc-and-copy-the-value; the only real downside is you
> have to use the appropriate macros to get the string's start address
> and length.
> 
>   regards, tom lane

Ah Thanks. I wouldn’t have guessed that. 
And I don’t see this mentioned in the header files either. 

Is it worth adding a relevant comment in the documentation section? 
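For reference, a sketch of the calling convention Tom describes; the function and its behaviour are made up, but the macros are the ones that pair with PG_GETARG_TEXT_PP:

```c
#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(my_text_len);

/* Hypothetical example: return the byte length of a text argument. */
Datum
my_text_len(PG_FUNCTION_ARGS)
{
    /* May be a 1-byte- or 4-byte-header varlena; no copy is forced. */
    text *arg = PG_GETARG_TEXT_PP(0);

    /* The *_ANY macros cope with either header format. */
    char *data = VARDATA_ANY(arg);        /* string start, not NUL-terminated */
    int   len  = VARSIZE_ANY_EXHDR(arg);  /* length in bytes, header excluded */

    (void) data;                          /* unused in this sketch */
    PG_RETURN_INT32(len);
}
```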





Using vars in jsonpath operator?

2023-06-16 Thread Markur Sens
I understand that in a WHERE clause a GIN index can be used for the following
predicate:

a.data @? '$.results.docs[*].accs[*] ? (@.id == "123")'

However, I have a join query with the following condition:

jsonb_path_exists(a.data, '$.results.docs[*].accs[*] ? (@.number == $id)',
jsonb_build_object('id', b.id))

but this cannot use a GIN index AFAIK and results in a seq scan.

Is there any way to use the operator syntax above by passing the b.id
variable somehow, so that the GIN index can be used?






Re: pl/pgsql outside the DB, (i.e. on the terminal) possible ?

2024-03-07 Thread Markur Sens


> On 7 Mar 2024, at 4:56 PM, Achilleas Mantzios - cloud 
>  wrote:
> 
> Hello
> 
> I notice both my kids struggling with either C or Python as first programming 
> languages. I believe both are unsuitable for use as introductory languages to 
> college juniors.

https://scratch.mit.edu/ would be more suitable, maybe ? 

> 
> Python IMHO is too advanced, too rich, weird indentation rules, no simple for 
> loop etc.
> 
> C, ok, punishing little kids with segmentation faults, calling by value VS by 
> reference and 3ple pointers is pure sadism.
> 
> So this brings me to memory good old PASCAL from the time I was junior, circa 
> 1986. PL/SQL resembles PASCAL heavily. Everything seems well defined, strong 
> typing, simplicity, I think it has everything a programming language should 
> have in order to be taught as an introductory language. But it lacks IO and 
> file handling.
> 
> So, I ask, have there been any efforts to bring PL/PGSQL to the terminal?

IMHO the best you can do is bash + psql + PL/pgSQL, via heredocs.
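A minimal sketch of that combination; PG_URI is a placeholder connection string, and RAISE NOTICE stands in for a print statement:

```shell
#!/bin/bash
# Run a PL/pgSQL block from the terminal via a quoted heredoc.
psql "$PG_URI" -q <<'SQL'
DO $$
DECLARE
    i integer;
BEGIN
    FOR i IN 1..3 LOOP
        RAISE NOTICE 'i = %', i;
    END LOOP;
END;
$$;
SQL
```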
> Thanks!
> 
> 
> 



CI/CD Boilerplate for extension binaries without source code ?

2022-01-10 Thread Markur Sens
Hi, 

For a gig, I’m developing a small Postgres extension, adding a few data types 
and some functions (some of them written in C and some of them written in 
plpython3).

Now, my client doesn’t want the source code to be visible and/or open even 
internally to other teams; They just want a bunch of yum/deb/apt packages to be 
distributed to servers & users internally.
Docker is also an option, but for now I prefer to rely just on my Makefile
(using PGXS) and produce deb/apt/yum packages.

Can you recommend any CI/CD boilerplate that can help with this? Especially the
plpython3 part can be tricky, as the CREATE FUNCTION body obviously contains
visible Python code and is included in the extension--0.1.sql file that goes
under /share/postgresql/extension/.

Most of the sophisticated extensions (PostGIS, Citus) I’ve looked at have
either too complex CI/CD processes or ship source code with them. And the
plpython part seems quite unique, I think.







Additional accessors via the Extension API ?

2022-02-19 Thread Markur Sens
Suppose I have defined an additional type in a PG extension.

Is it possible to add custom accessors to that type -much like jsonb does- but 
use an API/hook without touching the core PG grammar & parser? 

Hypothetical Examples: 

Assuming I have a TextFile type I’d like to implement syntax like:

('/home/me/a.txt'::TextFile).firstline
('/home/me/a.txt'::TextFile).lastline
('/home/me/a.txt'::TextFile).countlines()
('/home/me/a.txt'::TextFile).size()
('/home/me/a.txt'::TextFile).datemodified()


The only relevant patch I could find is [1] but it’s a dead-end

[1] https://www.postgresql.org/message-id/20210501072458.adqjoaqnmhg4l34l%40nol





Re: Additional accessors via the Extension API ?

2022-02-20 Thread Markur Sens



> On 20 Feb 2022, at 12:12 PM, Julien Rouhaud  wrote:
> 
> Hi,
> 
> On Sun, Feb 20, 2022 at 08:07:20AM +0200, Markur Sens wrote:
>> Suppose  I have defined an additional type in a PG extension.
>> 
>> Is it possible to add custom accessors to that type -much like jsonb does-
>> but use an API/hook without touching the core PG grammar & parser?
> 
> Unfortunately no.
> 
>> Hypothetical Examples:
>> 
>> Assuming I have a TextFile type I’d like to implement syntax like:
>> 
>> ('/home/me/a.txt'::TextFile).firstline
>> ('/home/me/a.txt'::TextFile).lastline
>> ('/home/me/a.txt'::TextFile).countlines()
>> ('/home/me/a.txt'::TextFile).size()
>> ('/home/me/a.txt'::TextFile).datemodified()
> 
> Maybe you could rely on some old grammar hack to have something a bit similar,
> as (expr).funcname is an alias for funcname(expr).  For instance:

Is this documented & expected behavior, or does it just happen to work?

> 
> # create function f1(int) returns text as $$
> begin
> return 'val: ' || $1::text;
> end;
> $$ language plpgsql;
> 
> # create table t as select 1 as id;
> 
> # select (5).f1, (id).f1 from t;
>    f1   |   f1
> --------+--------
>  val: 5 | val: 1
> (1 row)
> 
> I don't know if that would be enough for your needs.  Otherwise, the only
> option would be to create an operator instead, like mytype -> 'myaccessor'
> or something like that.


Yes, that’s what I’m doing at the moment:
Syntax like type -> 'accessor' is pretty straightforward to implement as an
operator, as the rightarg is text.

Things get more complicated as I’m adding support for
mytype -> function(arg=1)

For that case I have to create an intermediate type for function(arg) so that
I can then define the left and right args for the -> operator.
But it’s a lot of boilerplate code.
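That pattern might look something like this sketch (all names are hypothetical; the C functions behind the operators are assumed to exist in the extension):

```sql
-- Plain accessor: myval -> 'firstline'
CREATE FUNCTION mytype_get(v mytype, accessor text) RETURNS text
    AS 'MODULE_PATHNAME', 'mytype_get' LANGUAGE C STRICT;
CREATE OPERATOR -> (LEFTARG = mytype, RIGHTARG = text,
                    FUNCTION = mytype_get);

-- Parameterised accessor: myval -> somefunc(arg => 1).
-- somefunc() returns an intermediate "call descriptor" type, which
-- becomes the right argument of a second -> operator.
CREATE TYPE mytype_call AS (fname text, arg integer);

CREATE FUNCTION somefunc(arg integer) RETURNS mytype_call
    AS $$ SELECT ROW('somefunc', arg)::mytype_call $$ LANGUAGE SQL;

CREATE FUNCTION mytype_apply(v mytype, c mytype_call) RETURNS text
    AS 'MODULE_PATHNAME', 'mytype_apply' LANGUAGE C STRICT;
CREATE OPERATOR -> (LEFTARG = mytype, RIGHTARG = mytype_call,
                    FUNCTION = mytype_apply);
```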






Re: Additional accessors via the Extension API ?

2022-02-20 Thread Markur Sens


> On 20 Feb 2022, at 12:35 PM, Julien Rouhaud  wrote:
> 
> On Sun, Feb 20, 2022 at 12:31:22PM +0200, Markur Sens wrote:
>>> 
>>> Maybe you could rely on some old grammar hack to have something a bit 
>>> similar,
>>> as (expr).funcname is an alias for funcname(expr).  For instance:
>> 
>> Is this documented & expected behavior or it’s just happens to work?
> 
> I don't think it's documented but it's an expected behavior, see
> 
> https://github.com/postgres/postgres/blob/master/src/backend/parser/parse_func.c#L57-L88
> 

Ah, thanks for this.

> /*
> * Parse a function call
> *
> * For historical reasons, Postgres tries to treat the notations tab.col
> * and col(tab) as equivalent: if a single-argument function call has an
> * argument of complex type and the (unqualified) function name matches
> * any attribute of the type, we can interpret it as a column projection.


> * Conversely a function of a single complex-type argument can be written
> * like a column reference, allowing functions to act like computed 
> columns.
> *
> * If both interpretations are possible, we prefer the one matching the
> * syntactic form, but otherwise the form does not matter.
> *
> * Hence, both cases come through here.  If fn is null, we're dealing with
> * column syntax not function syntax.  In the function-syntax case,
> * the FuncCall struct is needed to carry various decoration that applies
> * to aggregate and window functions.
> [...]