Re: Migrating PostgreSQL Stored Procedures to MSSQL 2019 for example
I would suggest there is no easy way to do this with a tool. PostgreSQL is powerful because, in addition to PL/pgSQL, you can write functions in other languages (Python, Perl, Tcl, JavaScript, C++, ...) and there are many popular extensions, so it is very possible to encapsulate complex application/business logic within the database. But if it is just a database with a bit of PL/pgSQL exposing data to a front-end application that does all the work, then maybe a tool might work.

You may have noticed how powerful AWS SCT is at converting MSSQL and Oracle to PostgreSQL, but alas, I believe it will only convert PostgreSQL to Postgres-like databases (Aurora and MySQL). Not to say someone somewhere hasn't built one, but a claimed strength of PostgreSQL is probably a hindrance in this instance.

Hector Vass
Data Engineer
07773 352559


On Fri, 12 Aug 2022, 11:24 Scott Simpson, wrote:

> Hi,
>
> I need to migrate many PostgreSQL stored procedures and functions to MSSQL.
>
> I can't find anything online that seems to handle this task.
>
> Are there any tools that you have that can do this job?
>
>
> *Kind Regards*
>
> *Zellis* | Scott Simpson | Senior Engineer
>
> Thorpe Park
> United Kingdom
> Work : +44 (0)20 3986 3523
> Email : scott.simp...@zellis.com
> Web : www.Zellis.com <http://www.zellis.com/>
Re: How to eliminate extra "NOT EXISTS"-query here?
Not sure you need to use an array; why not simple table joins? So a table with your criteria x y z t joined to stuff to give you candidates that do match, then a left join with coalesce to add the 'd':

select
  --a.id, b.test_id,
  coalesce(a.id, b.test_id) as finalresult
from test a
left join (
  select test_id
  from stuff a
  inner join (values ('x'),('y'),('z'),('t')) b (v) using (v)
  group by 1
) b on (a.id = b.test_id);

Regards
Hector Vass


On Sat, Nov 25, 2023 at 4:08 PM Tom Lane wrote:

> Andreas Joseph Krogh writes:
> > -- This works, but I'd rather not do the extra EXISTS
> > select * from test t
> > WHERE (NOT ARRAY ['x', 'y', 'z', 't']::varchar[] <@
> >          (select array_agg(s.v) from stuff s WHERE s.test_id = t.id)
> >   OR NOT EXISTS (
> >     select * from stuff s where s.test_id = t.id
> >   )
> > )
> > ;
>
> > So, I want to return all entries in test not having any of ARRAY ['x', 'y',
> > 'z', 't'] referenced in the table stuff, and I'd like to have test.id="d"
> > returned as well, but in order to do that I need to execute the "or not
> > exists"-query. Is it possible to avoid that?
>
> Probably not directly, but perhaps you could improve the performance of
> this query by converting the sub-selects into a left join:
>
> select * from test t
>   left join
>     (select s.test_id, array_agg(s.v) as arr from stuff s group by s.test_id) ss
>     on ss.test_id = t.id
> WHERE (NOT ARRAY ['x', 'y', 'z', 't']::varchar[] <@ ss.arr)
>    OR ss.test_id IS NULL;
>
> Another possibility is
>
> ...
> WHERE (ARRAY ['x', 'y', 'z', 't']::varchar[] <@ ss.arr) IS NOT TRUE
>
> but I don't think that's more readable really, and it will save little.
>
> In either case, this would result in computing array_agg once for
> each group of test_id values in "stuff", while your original computes
> a similar aggregate for each row in "test". So whether this is better
> depends on the relative sizes of the tables, although my proposal
> avoids random access to "stuff" so it will have some advantage.
>
> regards, tom lane
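[Editor's note] For anyone who wants to try the queries in this thread, here is a minimal sketch of the tables they assume. The names are taken from the quoted queries; the column types and sample data are guesses, not from the original posts.

```
-- Assumed layout: test(id) and stuff(test_id, v); types are illustrative.
CREATE TABLE test  (id      varchar PRIMARY KEY);
CREATE TABLE stuff (test_id varchar REFERENCES test (id), v varchar);

INSERT INTO test  VALUES ('a'), ('b'), ('d');
INSERT INTO stuff VALUES ('a', 'x'), ('a', 'y'), ('a', 'z'), ('a', 't'),
                         ('b', 'x');

-- With this data the original query should return 'b' (only some of
-- x, y, z, t are present) and 'd' (no rows in stuff at all), but not 'a'.
```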
Re: How to eliminate extra "NOT EXISTS"-query here?
It is not equivalent to the use of NOT ARRAY, and it is entirely possible I have misunderstood the requirement. Do you have some more test cases that the non-array solution does not work for?

Regards
Hector Vass
07773 352559


On Mon, Nov 27, 2023 at 9:29 AM Dominique Devienne wrote:

> On Sat, Nov 25, 2023 at 5:53 PM hector vass wrote:
>
>> Not sure you need to use array why not simple table joins, so a table
>> with your criteria x y z t joined to stuff to give you candidates that do
>> match, then left join with coalesce to add the 'd'
>>
>> select
>>   --a.id, b.test_id,
>>   coalesce(a.id, b.test_id) as finalresult
>> from test a
>> left join (
>>   select test_id
>>   from stuff a
>>   inner join (values ('x'),('y'),('z'),('t')) b (v) using (v)
>>   group by 1
>> ) b on (a.id = b.test_id);
>
> Hi Hector. Hopefully this is not a stupid question...
>
> How is that equivalent to the `NOT ARRAY ... <@ ...` though?
> The inner-join-distinct above will return test_id's on any match, but you
> can't know if all array values are matches. Which is different from
>
> > Is the first array contained by the second
>
> from the <@ operator, no?
> I'm unfamiliar with these operators, so am I missing something?
> Just trying to understand the logic here. Thanks, --DD
Re: Get back the number of columns of a result-set prior to JSON aggregation
I think you are just trying to get the number of columns in the underlying table; there is no real cost to reading the metadata (a variant that also skips dropped columns is sketched after the quoted message below):

select count(id),
       (select count(attrelid) from pg_attribute
        where attrelid = 't1'::regclass and attnum > 0),
       json_agg(t)
from t1 t;

select count(id),
       (select count(attrelid) from pg_attribute
        where attrelid = 't2'::regclass and attnum > 0),
       json_agg(t)
from t2 t;

Regards
Hector Vass
07773 352559


On Tue, Nov 28, 2023 at 12:12 PM Dominique Devienne wrote:

> Hi. I've got a nice little POC using PostgreSQL to implement a REST API server.
> This uses json_agg(t) to generate the JSON of tables (or subqueries in general),
> which means I always get back a single row (and column, before I added the
> count(t.*)).
>
> But I'd like to get statistics on the number of rows aggregated (easy, count(*)),
> but also the number of columns of those rows! And I'm stuck for the latter...
>
> Is there a (hopefully efficient) way to get back the cardinality of a
> select-clause basically?
> Obviously programmatically I can get the row and column count from the result-set,
> but I see the result of json_agg() myself, while I want the value prior to
> json_agg().
>
> Is there a way to achieve this?
>
> Thanks, --DD
>
> PS: In the example below, would return 1 for the 1st query, and 2 for the 2nd.
>
> ```
> migrated=> create table t1 (id integer);
> CREATE TABLE
> migrated=> insert into t1 values (1), (2);
> INSERT 0 2
> migrated=> create table t2 (id integer, name text);
> CREATE TABLE
> migrated=> insert into t2 values (1, 'one'), (2, 'two');
> INSERT 0 2
> migrated=> select count(t.*), json_agg(t) from t1 t;
>  count |  json_agg
> -------+-------------
>      2 | [{"id":1}, +
>        |  {"id":2}]
> (1 row)
>
> migrated=> select count(t.*), json_agg(t) from t2 t;
>  count |         json_agg
> -------+--------------------------
>      2 | [{"id":1,"name":"one"}, +
>        |  {"id":2,"name":"two"}]
> (1 row)
> ```
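[Editor's note] One caveat not in the original reply: columns removed with ALTER TABLE ... DROP COLUMN remain in pg_attribute with attisdropped set, so a count that filters them out may be safer. A variant of the query above, using the same t1 example table:

```
-- Count only live user columns (attnum > 0 excludes system columns,
-- NOT attisdropped excludes dropped ones).
select count(t.*),
       (select count(*)
        from pg_attribute
        where attrelid = 't1'::regclass
          and attnum > 0
          and not attisdropped) as column_count,
       json_agg(t)
from t1 t;
```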
Re: Mimic ALIAS in Postgresql?
On Tue, 16 Jan 2024, 17:21 Ron Johnson, wrote:

> Some RDBMSs have CREATE ALIAS, which allows you to refer to a table by a
> different name (while also referring to it by the original name).
>
> We have an application running on DB2/UDB which (for reasons wholly
> unknown to me, and probably also to the current developer) extensively uses
> this with two schemas: MTUSER and MTQRY. For example, it sometimes refers to
> MTUSER.sometable and other times refers to it as MTQRY.sometable.
>
> My goal is to present a way to migrate from UDB to PG with as few
> application changes as possible. Thus, the need to mimic aliases.
>
> Maybe updatable views?
> CREATE VIEW mtqry.sometable AS SELECT * FROM mtuser.sometable;

I think views will work. An alternative might be to interpose a proxy that rewrites the SQL; https://www.galliumdata.com/ gives you an idea of what this might look like, although you could do a lite version yourself.
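[Editor's note] If there are many tables, the views can be generated from the catalog rather than written by hand. A rough sketch, assuming the schemas are literally named mtuser and mtqry and that mtqry already exists:

```
-- Create one pass-through view in mtqry for every table in mtuser.
DO $$
DECLARE
    tbl text;
BEGIN
    FOR tbl IN
        SELECT tablename FROM pg_tables WHERE schemaname = 'mtuser'
    LOOP
        EXECUTE format('CREATE VIEW mtqry.%I AS SELECT * FROM mtuser.%I',
                       tbl, tbl);
    END LOOP;
END;
$$;
```

Because each generated view is a plain SELECT * over a single table, PostgreSQL treats it as automatically updatable, so INSERT/UPDATE/DELETE through the mtqry name should behave much like the alias the application expects.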
Re: creating a subset DB efficiently ?
On Fri, Mar 8, 2024 at 4:22 PM David Gauthier wrote:

> Here's the situation
>
> - The DB contains data for several projects.
> - The tables of the DB contain data for all projects (data is not
>   partitioned on project name or anything like that).
> - The "project" identifier (table column) exists in a few "parent" tables
>   with many child... grandchild,... tables under them connected with foreign
>   keys defined with "on delete cascade". So if a record in one of the parent
>   table records is deleted, all of its underlying, dependent records get
>   deleted too.
> - New projects come in, and old ones need to be removed and "archived" in
>   DBs of their own. So there's a DB called "active_projects" and there's a
>   DB called "project_a_archive" (identical metadata).
> - The idea is to copy the data for project "a" that's in "active_projects"
>   to the "project_a_archive" DB AND delete the project a data out of
>   "active_projects".
> - Leave "project_a_archive" up and running if someone needs to attach to
>   that and get some old/archived data.
>
> The brute-force method I've been using is...
> 1) pg_dump "active_projects" to a (huge) file then populate
>    "project_a_archive" using that (I don't have the privs to create database,
>    IT creates an empty one for me, so this is how I do it).
> 2) go into the "project_a_archive" DB and run... "delete from par_tbl_1
>    where project <> 'a' ", "delete from par_tbl_2 where project <> 'a' ",
>    etc... leaving only project "a" data in the DB.
> 3) go into the "active_projects" DB and "delete from par_tbl_1 where
>    project = 'a' ", etc... removing project "a" from the "active_projects" DB.
>
> Ya, not very elegant, it takes a long time and it takes a lot of
> resources. So I'm looking for ideas on how to do this better.
>
> Related question...
> The "delete from par_tbl_a where project <> 'a' " is taking forever. I
> fear it's because it's trying to journal everything in case I want to
> rollback. But this is just in the archive DB and I don't mind taking the
> risk if I can speed this up outside of a transaction. How can I run a
> delete command like this without the rollback recovery overhead?

> (I don't have the privs to create database, IT creates an empty one for me,
> so this is how I do it).

That's a shame. You can do something similar with schemas and tablespaces:

- Template your existing schema to create a new schema for the project (pg_dump -s).
- Create a tablespace for this new project and schema.
- You can then move the physical tablespace to cheaper disk and use symbolic links, or archive and/or back it up at the schema level with pg_dump -n (a rough sketch of that step is shown below).

As long as you don't put anything in the public schema, all you are really sharing is roles; otherwise it is a bit like having a separate database.
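[Editor's note] A rough sketch of the schema-level archive step, assuming the data for each project has been moved into its own schema; the schema, database and file names here are illustrative, not from the original thread:

```
# Dump only the project_a schema from the live database...
pg_dump -n project_a -Fc -f project_a.dump active_projects

# ...restore it into the (empty) archive database IT created...
pg_restore -d project_a_archive project_a.dump

# ...then remove the schema from the live database in one cascading step,
# which avoids the slow row-by-row DELETE described above.
psql -d active_projects -c 'DROP SCHEMA project_a CASCADE;'
```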
Fwd: Getting error while upgrading
On Sat, Mar 9, 2024 at 12:18 PM omkar narkar wrote:

> Hello Team,
>
> I am trying to upgrade my edb 10.5 community version to postgres 15.6
> version and while doing this i am getting error regarding OIDS are not
> stable across Postgresql version (sys.callback_queue_table.user_data).
> Kindly help me to get the solution of this issue.
>
> Thanks and regards,
> Omkar Narkar

You usually get this error if there are composite data types, or data types that cannot be translated between 10.5 and 15.6. The clue may be in the error message just before it says 'OIDs are not stable across PostgreSQL versions'.

You state EDB 10.5 community, so I am guessing you are using pg_upgrade and going from Windows to Linux? I am impressed if you can do that; would you not end up with collation issues?

If you are using pg_upgrade, what does pg_upgrade --check say?

I would dump the schema to a SQL file (pg_dump -s > dumped.sql) and then run the SQL one command at a time to track down where you are going to have a problem.
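[Editor's note] For reference, a sketch of the two checks suggested above, if pg_upgrade is indeed in play. The binary and data directory paths and the database name are hypothetical and will differ on your system:

```
# --check only reports incompatibilities; it does not modify either cluster.
pg_upgrade --check \
    --old-bindir=/usr/edb/as10/bin       --new-bindir=/usr/pgsql-15/bin \
    --old-datadir=/var/lib/edb/as10/data --new-datadir=/var/lib/pgsql/15/data

# Dump just the schema so the problem object (e.g. a composite type) can be
# located by replaying the statements one at a time.
pg_dump -s mydb > dumped.sql
```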
Re: Insert with Jsonb column hangs
On Sat, Mar 9, 2024 at 3:02 PM kuldeep singh wrote:

> Hi,
>
> We are inserting data close to 1M record & having a single Jsonb column
> but query is getting stuck.
>
> We are using insert into select * .. , so all the operations are within
> the DB.
>
> If we are running select query individually then it is returning the data
> in 40 sec for all rows but with insert it is getting stuck.
>
> PG Version - 15.
>
> What could be the problem here ?
>
> Regards,
> KD

Inserting 1M rows, especially of JSON that can be large, variable in size, stored as blobs and indexed, is perhaps not the correct way to do this. Insert performance will also depend on your tuning; supporting transactions, many users, or bulk processing are three sides of a compromise.

You should perhaps consider that INSERT is for inserting a few rows into live tables. You might be better off using COPY or \copy (or pg_dump if you are just trying to replicate a large table); a small \copy sketch follows.
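[Editor's note] A minimal \copy sketch for the "just replicate a large table" case mentioned above; the table and file names are made up. \copy runs client-side in psql, so it also works without server file access:

```
-- Export the source rows, then bulk-load them into the target table.
\copy (SELECT * FROM source_table) TO 'source_table.csv' CSV
\copy target_table FROM 'source_table.csv' CSV
```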
Re: Insert with Jsonb column hangs
COPY syntax can include any valid SELECT statement:

COPY (any valid select statement joining tables and converting rows with row_to_json) TO 'some_dump_file';

or it can copy a view:

CREATE VIEW myview AS (any valid select statement joining tables and converting rows with row_to_json);
COPY myview TO 'some_dump_file';

A concrete sketch with invented table names follows after the quoted message below.

Regards
Hector Vass
07773 352559


On Sat, Mar 9, 2024 at 4:01 PM kuldeep singh wrote:

> Copy may not work in our scenario since we need to join data from multiple
> tables & then convert it to json using row_to_json. This json data
> eventually needs to be stored in a target table.
>
> Will it be better if we break the process into batches of like 10,000 rows
> & insert the data in its individual transactions? Or any other better
> solution available?
>
> On Sat, Mar 9, 2024 at 9:01 PM hector vass wrote:
>
>> On Sat, Mar 9, 2024 at 3:02 PM kuldeep singh wrote:
>>
>>> Hi,
>>>
>>> We are inserting data close to 1M record & having a single Jsonb column
>>> but query is getting stuck.
>>>
>>> We are using insert into select * .. , so all the operations are within
>>> the DB.
>>>
>>> If we are running select query individually then it is returning the
>>> data in 40 sec for all rows but with insert it is getting stuck.
>>>
>>> PG Version - 15.
>>>
>>> What could be the problem here ?
>>>
>>> Regards,
>>> KD
>>
>> insert 1M rows especially JSON that can be large, variable in size and
>> stored as blobs and indexed is not perhaps the correct way to do this
>> insert performance will also depend on your tuning. Supporting
>> transactions, users or bulk processing are 3x sides of a compromise.
>> you should perhaps consider that insert is for inserting a few rows into
>> live tables ... you might be better using copy or \copy, pg_dump if you are
>> just trying to replicate a large table
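[Editor's note] As mentioned above, a concrete sketch of COPY over a joined query. The orders/customers tables and the output path are invented stand-ins for whatever is actually being joined:

```
-- Server-side COPY of a joined query, one JSON document per output line.
COPY (
    SELECT row_to_json(j)
    FROM (
        SELECT o.id, o.total, c.name
        FROM orders o
        JOIN customers c ON c.id = o.customer_id
    ) AS j
) TO '/tmp/orders.json';
```

Note that COPY ... TO a file writes on the server and needs the corresponding file-access privilege; the psql \copy equivalent writes on the client instead.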
Re: Insert with Jsonb column hangs
On Sat, Mar 9, 2024 at 4:10 PM Adrian Klaver wrote:

> On 3/9/24 08:00, kuldeep singh wrote:
> > Copy may not work in our scenario since we need to join data from
> > multiple tables & then convert it to json using row_to_json. This
> > json data eventually needs to be stored in a target table.
>
> Per:
>
> https://www.postgresql.org/docs/current/sql-copy.html
>
> "
> COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
>
> <...>
>
> query
>
>     A SELECT, VALUES, INSERT, UPDATE, or DELETE command whose results
>     are to be copied. Note that parentheses are required around the query.
>
>     For INSERT, UPDATE and DELETE queries a RETURNING clause must be
>     provided, and the target relation must not have a conditional rule, nor
>     an ALSO rule, nor an INSTEAD rule that expands to multiple statements.
> "
>
> > Will it be better if we break the process into batches of like 10,000
> > rows & insert the data in its individual transactions? Or any other
> > better solution available?
> >
> > On Sat, Mar 9, 2024 at 9:01 PM hector vass <hector.v...@gmail.com> wrote:
> >
> >     On Sat, Mar 9, 2024 at 3:02 PM kuldeep singh
> >     <kuldeeparor...@gmail.com> wrote:
> >
> >         Hi,
> >
> >         We are inserting data close to 1M record & having a single Jsonb
> >         column but query is getting stuck.
> >
> >         We are using insert into select * .. , so all the operations are
> >         within the DB.
> >
> >         If we are running select query individually then it is returning
> >         the data in 40 sec for all rows but with insert it is getting stuck.
> >
> >         PG Version - 15.
> >
> >         What could be the problem here ?
> >
> >         Regards,
> >         KD
> >
> >     insert 1M rows especially JSON that can be large, variable in size
> >     and stored as blobs and indexed is not perhaps the correct way to do
> >     this. insert performance will also depend on your tuning. Supporting
> >     transactions, users or bulk processing are 3x sides of a compromise.
> >     you should perhaps consider that insert is for inserting a few rows
> >     into live tables ... you might be better using copy or \copy, pg_dump
> >     if you are just trying to replicate a large table
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com

What Adrian Klaver said ^. I discovered even this works:

create view myview as (select row_to_json(mytable) from mytable);
create table newtable as select * from myview where 1=0;
copy myview to program 'psql mydb postgres -c ''copy newtable from stdin'' ';
Re: Please recommend postgresql.conf improvements for osm2pgsql loading Europe
On Sat, 30 Mar 2024, 10:04 Alexander Farber, wrote:

> Thank you, Justin -
>
> On Sat, Mar 30, 2024 at 4:33 AM Justin Clift wrote:
>
>> On 2024-03-30 05:53, Alexander Farber wrote:
>> > I use the following postgresql.conf in my Dockerfile
>> > ( the full version at https://stackoverflow.com/a/78243530/165071 ),
>> > when loading a 28 GByte large europe-latest.osm.pbf
>>
>> Not specific conf file improvements, but for an initial data load
>> have you done things like turning off fsync(), deferring index
>> creation until after the data load finishes, and that kind of thing?
>
> I will try the following commands in my Dockerfile then
> and later report back on any improvements:
>
> RUN set -eux && \
>     pg_ctl init && \
>     echo "shared_buffers = 1GB"                >> $PGDATA/postgresql.conf && \
>     echo "work_mem = 50MB"                     >> $PGDATA/postgresql.conf && \
>     echo "maintenance_work_mem = 10GB"         >> $PGDATA/postgresql.conf && \
>     echo "autovacuum_work_mem = 2GB"           >> $PGDATA/postgresql.conf && \
>     echo "wal_level = minimal"                 >> $PGDATA/postgresql.conf && \
>     echo "checkpoint_timeout = 60min"          >> $PGDATA/postgresql.conf && \
>     echo "max_wal_size = 10GB"                 >> $PGDATA/postgresql.conf && \
>     echo "checkpoint_completion_target = 0.9"  >> $PGDATA/postgresql.conf && \
>     echo "max_wal_senders = 0"                 >> $PGDATA/postgresql.conf && \
>     echo "random_page_cost = 1.0"              >> $PGDATA/postgresql.conf && \
>     echo "password_encryption = scram-sha-256" >> $PGDATA/postgresql.conf && \
>     echo "fsync = off"                         >> $PGDATA/postgresql.conf && \
>     pg_ctl start && \
>     createuser --username=postgres $PGUSER && \
>     createdb --username=postgres --encoding=UTF8 --owner=$PGUSER $PGDATABASE && \
>     psql --username=postgres $PGDATABASE --command="ALTER USER $PGUSER WITH PASSWORD '$PGPASSWORD';" && \
>     psql --username=postgres $PGDATABASE --command='CREATE EXTENSION IF NOT EXISTS postgis;' && \
>     psql --username=postgres $PGDATABASE --command='CREATE EXTENSION IF NOT EXISTS hstore;' && \
>     osm2pgsql --username=$PGUSER --database=$PGDATABASE --create --cache=6 --hstore --latlong /data/map.osm.pbf && \
>     rm -f /data/map.osm.pbf && \
>     pg_ctl stop && \
>     echo "fsync = on"                          >> $PGDATA/postgresql.conf && \
>     echo '# TYPE  DATABASE     USER      ADDRESS    METHOD'          > $PGDATA/pg_hba.conf && \
>     echo "local   all          postgres             peer"           >> $PGDATA/pg_hba.conf && \
>     echo "local   $PGDATABASE  $PGUSER              scram-sha-256"  >> $PGDATA/pg_hba.conf && \
>     echo "host    $PGDATABASE  $PGUSER   0.0.0.0/0  scram-sha-256"  >> $PGDATA/pg_hba.conf
>
> The later fsync = on will override the former, right?
>
> Best regards
> Alex

2 hrs sounds reasonable for Europe; it's a big place in terms of OSM data, and osm2pgsql does processing to convert to geometry objects before doing anything on the PostgreSQL side. If you examine the --log-sql output for a small test country you can see what it does on the PostgreSQL side.

osm2pgsql gives you options to trim the output to only what you need (so if you don't want waterways, traffic features, parking places or places of worship etc., why load them?).

Hopefully you have found the excellent Geofabrik (https://download.geofabrik.de/) source for OSM data. Rather than loading this data afresh each update cycle, you would be better off simply loading the changes, i.e. the .osc files; Osmosis will create the equivalent of a diff file for you.

It looks like you are already using osm2pgsql's recommended postgresql.conf settings, so I'd be surprised if this was way off.
Getting as close to the tin as possible, rather than virtual machines and containers, will also help; there is a lot of I/O going on here.

If you are only interested in the geography, you might consider Geofabrik's shapefiles, available for many countries; they have already done some of the work for you.

Apologies if you are already a long way down this route and are just asking about the final stage of loading the osm2pgsql output into PostgreSQL, but however well you do here I would only expect small marginal gains. If you do switch to loading change files, a rough sketch is below.
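[Editor's note] A rough sketch of the change-file approach, not taken from the thread. The file names and cache size are illustrative, and applying diffs with --append requires the initial import to have been run with --slim so the intermediate tables are kept:

```
# Initial load, run once, keeping the slim (intermediate) tables for later updates.
osm2pgsql --create --slim --username=$PGUSER --database=$PGDATABASE \
          --cache=6000 --hstore --latlong /data/europe-latest.osm.pbf

# Subsequent update cycles: apply only the changes instead of reloading everything.
osm2pgsql --append --slim --username=$PGUSER --database=$PGDATABASE \
          --cache=6000 --hstore --latlong /data/changes.osc.gz
```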