Re: Question on overall design

2023-12-11 Thread veem v
Thank you for your response.

Apologies if it sounds silly, but is it advisable to use just one database in
this flow, say for example Snowflake for the OLAP use case, with a cache
layer (like Redis) on top of it to cater to our OLTP use case, rather than
having a dedicated OLTP database like Aurora PostgreSQL?

Basically, I wanted to know if there is anything we would be missing by not
having an OLTP database here for our use case.


On Sun, 10 Dec, 2023, 10:25 pm Ron Johnson,  wrote:

> * PG has pgbench; *maybe* you can hack it to work on Oracle.
> * If you want to know how well an RDBMS will work on your workload, then
> you must provide it with a simulated workload.  Right?
> * AWS RDS Postgresql has a dashboard that *might* be similar to AWR.  Or
> it might not...
> * We departitioned because SELECT statements were *slow*.  All partitions
> were scanned, even when the partition key was specified in the WHERE clause.
>
> On Sun, Dec 10, 2023 at 8:45 AM veem v  wrote:
>
>> Thank you so much Ron.  I have some more doubts related to this.
>>
>> We were wondering if there is any utility in PG with which we can generate
>> a large sample data volume, which we could run on our on-premise Oracle
>> Exadata box and again on Aurora PostgreSQL in the cloud to compare read
>> and write performance. Does any such utility exist, to quickly get some
>> large data volume? But I think what you are saying is that we should use
>> real data to get the actual, or closest possible, benchmarks - correct me
>> if I'm wrong?
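>>
>> For instance, would something like pgbench's data generator be the way to
>> go? A sketch (scale factor made up; pgbench loads its own schema, not
>> ours):
>>
>>   pgbench -i -s 1000 testdb        # ~100 million rows in pgbench_accounts
>>   pgbench -c 16 -j 4 -T 600 testdb # 16 clients, 4 threads, 10 minutes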
>>
>> We used to query the data dictionary views (the AWR views) in Oracle to
>> see current and historical performance statistics like CPU, IO, and memory
>> usage, object-level contention, etc. Do we have such a thing available in
>> Aurora PostgreSQL, so as to monitor performance, get some idea of how well
>> the load test goes, and see what capacity is available or whether we are
>> saturating it?
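>>
>> For example, is querying pg_stat_statements (if that extension is
>> available to us in Aurora) the closest equivalent? Something like the
>> below (column names per PostgreSQL 13+):
>>
>>   SELECT query, calls, total_exec_time, mean_exec_time, shared_blks_read
>>   FROM pg_stat_statements
>>   ORDER BY total_exec_time DESC
>>   LIMIT 10;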
>>
>> When you said "*Beware of excessive partitioning.  We had to
>> "departition" most tables, because of the impact on performance.*", as I
>> understand it, partitioning helps reduce IO while reading, because
>> partition pruning means less data is scanned; and while writing there is
>> almost minimal overhead in identifying the partition into which the
>> inserted/updated/deleted data will land. So can you please help me
>> understand what exact performance impact you are highlighting here? Are
>> you talking about the size of the partitions or the total count of the
>> partitions? In our case we are planning either daily or hourly range
>> partitioning, depending on data volume; I'm not sure if there is a sweet
>> spot for how large each partition in PostgreSQL should be. If you are
>> pointing to a high partition count, then in our case, if we persist ~90
>> days of data, a transaction table would have ~90 daily range partitions
>> or ~2160 hourly range partitions in Aurora PostgreSQL. It would be
>> helpful if you could explain a bit about the exact performance impact you
>> faced with partitioning in Aurora PostgreSQL.
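>>
>> For reference, the kind of daily range partitioning we have in mind
>> (hypothetical table and column names):
>>
>>   CREATE TABLE txn (
>>       txn_id    bigint,
>>       part_date timestamp NOT NULL,
>>       payload   text,
>>       PRIMARY KEY (txn_id, part_date)
>>   ) PARTITION BY RANGE (part_date);
>>
>>   CREATE TABLE txn_2023_12_11 PARTITION OF txn
>>       FOR VALUES FROM ('2023-12-11') TO ('2023-12-12');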
>>
>> *"Use ora2pg to export views in the Oracle database. It's very easy; a
>> small EC2 VM running Linux with enough disk space lets you automate the
>> extraction from Oracle and importation into AWS Postgresql.)"*
>>
>> I need to explore this a bit more, I believe. We have an on-premise Oracle
>> database, so we can move data directly to Aurora PostgreSQL in the cloud.
>> Another thing: we have some sample data available in Snowflake on AWS, but
>> I'm not sure whether some mechanism is available to move that same data to
>> Aurora PostgreSQL?
>>
>> On Sun, 10 Dec 2023 at 02:27, Ron Johnson 
>> wrote:
>>
>>> I don't know anything about Aurora; I only have experience with RDS
>>> Postgresql.
>>>
>>> We successfully migrated from on-prem Oracle (12c, I think) to RDS
>>> Postgresql 12, and were very happy: little down time (I take pride in
>>> designing/implementing that; thanks, ora2pg!), with lower disk (8TB, down
>>> to 5TB) and CPU usage.
>>>
>>> I'm not sure what the TPS was in Oracle, but the server level "we" are
>>> on (I'm no longer with that company, and don't remember the specifics;
>>> 48 vCPU / 132 GB RAM, I think, with 10,000 IOPS) is over-powered for
>>> *our* needs.
>>>
>>> You're going to have to spin up a full-sized instance, import a *lot* of
>>> real data(*) into a production-identical schema and then run your batch
>>> load process using test data (or copies of real batches).  That's the only
>>> way you'll *really* know.
>>>
>>> Beware of excessive partitioning.  We had to "departition" most tables,
>>> because of the impact on performance.
>>>
>>> (*) Use ora2pg to export views in the Oracle database.  It's *very*
>>> easy; a small EC2 VM running Linux with enough disk space lets you automate
>>> the extraction from Oracle and importation into AWS Postgresql.
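>>>
>>> For example, something along these lines (options from memory; check
>>> ora2pg --help and the docs before relying on them):
>>>
>>>   ora2pg -c ora2pg.conf -t TABLE -o schema.sql   # export DDL
>>>   ora2pg -c ora2pg.conf -t COPY -b ./data        # export data as COPY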
>>>

Assistance Needed: Error during PostgreSQL Configuration

2023-12-11 Thread Ayush Vatsa
Hello, PostgreSQL community,
I'm reaching out to you about an issue I ran into while trying to configure
PostgreSQL after cloning its source from the GitHub repository.
During the configuration process, I encountered an error message that
appeared to be related to a syntax issue within the 'config.status' file.
Specifically, the error message reads as follows:
./config.status: line 486: syntax error near unexpected token `)'
./config.status: line 486: ` *\'*) ac_optarg=`$as_echo "$ac_optarg" | sed
"s/'/'''/g"` ;;'
I have tried to resolve this on my own, but unfortunately, I haven't been
successful in finding a solution.
Could you please provide guidance or assistance in resolving this issue? I
would appreciate any insights or suggestions you have to help me move past
this obstacle.
Looking forward to your response.

Thanks
Ayush Vatsa
SDE Amazon


Re: running \copy through perl dbi ?

2023-12-11 Thread Vincent Veyron
On Fri, 8 Dec 2023 10:45:28 -0500
David Gauthier  wrote:
> 
> I'm trying to run a PG client side "\copy" command from a perl script.  I
> tried using $dbh->do("\\copy ...") but it barfed when it saw the '\'...
> ERROR:  syntax error at or near "\"
> 
> I can do this with a command line approach, attaching to the DB and then
> running it using...

Duh! I just realized that what I proposed with system() is a command line 
approach.

As David Johnston mentioned, you can use the SQL COPY command.

However, you then need to deal with permissions so that the server may read
the file, so I wonder which approach is cleaner?


-- 

Bien à vous, Vincent Veyron

https://marica.fr
Logiciel de gestion des contentieux juridiques, des contrats et des sinistres 
d'assurance





Re: Question on overall design

2023-12-11 Thread Dominique Devienne
On Sun, Dec 10, 2023 at 5:56 PM Ron Johnson  wrote:

> * We departitioned because SELECT statements were *slow*.  All partitions
> were scanned, even when the partition key was specified in the WHERE clause.
>

Surely that's not the case on newer PostgreSQL, is it? Otherwise what's the
point of partitioning?
Also, I remember reading something about recent improvements with a large
number of partitions, no?

As someone who's interested in partitioning, I'd appreciate details.
Thanks, --DD


Re: running \copy through perl dbi ?

2023-12-11 Thread Chris Travers
\copy in psql just wraps PostgreSQL's COPY FROM STDIN.

If you are trying to do it from your own client program, it is trivial to
switch to that call instead.

On Mon, Dec 11, 2023 at 4:09 PM Vincent Veyron 
wrote:

> On Fri, 8 Dec 2023 10:45:28 -0500
> David Gauthier  wrote:
> >
> > I'm trying to run a PG client side "\copy" command from a perl script.  I
> > tried using $dbh->do("\\copy ...") but it barfed when it saw the '\'...
> > ERROR:  syntax error at or near "\"
> >
> > I can do this with a command line approach, attaching to the DB and then
> > running it using...
>
> Duh! I just realized that what I proposed with system() is a command line
> approach.
>
> As David Johnston mentioned, you can use the SQL COPY command.
>
> However, you then need to deal with permissions so that the server may
> read the file, so I wonder which approach is cleaner?
>

I wouldn't do COPY FROM FILE in that case.  I would do COPY FROM STDIN and
then write the data.

Here are the general docs in the DBD::Pg module:
https://metacpan.org/pod/DBD::Pg#COPY-support

The general approach is to COPY FROM STDIN, then use pg_putcopydata for
each row, and finally pg_putcopyend to close it out.  It's not too
different from what psql does behind the scenes.
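A minimal sketch (table, columns, and connection details are made up):

  use strict;
  use warnings;
  use DBI;

  my $dbh = DBI->connect('dbi:Pg:dbname=testdb', undef, undef,
                         { RaiseError => 1 });

  # Start a COPY FROM STDIN, then stream tab-separated rows.
  $dbh->do('COPY mytable (id, val) FROM STDIN');
  $dbh->pg_putcopydata("1\tfoo\n");
  $dbh->pg_putcopydata("2\tbar\n");
  $dbh->pg_putcopyend();   # finish the COPY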

>
>
> --
>
> Bien à vous, Vincent Veyron
>
> https://marica.fr
> Logiciel de gestion des contentieux juridiques, des contrats et des
> sinistres d'assurance
>
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Assistance Needed: Error during PostgreSQL Configuration

2023-12-11 Thread Ian Lawrence Barwick
On Mon, 11 Dec 2023 at 18:09, Ayush Vatsa  wrote:
>
> Hello, PostgreSQL community,
> I'm reaching out to you about an issue I ran into while trying to configure
> PostgreSQL after cloning its source from the GitHub repository.
> During the configuration process, I encountered an error message that 
> appeared to be related to a syntax issue within the 'config.status' file.

FWIW, 'config.status' is a shell script generated by 'configure' (the log
itself goes to 'config.log').

> Specifically, the error message reads as follows:
> ./config.status: line 486: syntax error near unexpected token `)'
> ./config.status: line 486: ` *\'*) ac_optarg=`$as_echo "$ac_optarg" | sed 
> "s/'/'''/g"` ;;'
> I have tried to resolve this on my own, but unfortunately, I haven't been 
> successful in finding a solution.
> Could you please provide guidance or assistance in resolving this issue? I 
> would appreciate any insights or suggestions you have to help me move past 
> this obstacle.
> Looking forward to your response.

You should provide, at the very least, the following information:
- the PostgreSQL version you are trying to build
- the environment you are trying to build it in

A very hand-wavy guess, but possibly your local "sed" installation is
not GNU sed?

Regards

Ian Barwick




Re: Assistance Needed: Error during PostgreSQL Configuration

2023-12-11 Thread Tom Lane
Ian Lawrence Barwick  writes:
> On Mon, 11 Dec 2023 at 18:09, Ayush Vatsa  wrote:
>> ./config.status: line 486: syntax error near unexpected token `)'
>> ./config.status: line 486: ` *\'*) ac_optarg=`$as_echo "$ac_optarg" | sed 
>> "s/'/'''/g"` ;;'

> You should provide, at the very least, the following information:
> - the PostgreSQL version you are trying to build
> - the environment you are trying to build it in

+1

> A very hand-wavy guess, but possibly your local "sed" installation is
> not GNU sed?

I'm also wondering if the shell in use is nonstandard.  The configure
script would likely already have failed if you tried to run it with,
say, csh; but maybe more-subtly-incompatible shells could give rise
to this symptom.
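If you want to test that theory, autoconf-generated configure scripts
honor CONFIG_SHELL, so you can force a known-good shell, e.g.:

  CONFIG_SHELL=/bin/bash /bin/bash ./configure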

regards, tom lane




Re: Question on overall design

2023-12-11 Thread Ron Johnson
On Mon, Dec 11, 2023 at 4:41 AM Dominique Devienne 
wrote:

> On Sun, Dec 10, 2023 at 5:56 PM Ron Johnson 
> wrote:
>
>> * We departitioned because SELECT statements were *slow*.  All
>> partitions were scanned, even when the partition key was specified in the
>> WHERE clause.
>>
>
> Surely that's not the case on newer PostgreSQL, is it? Otherwise what's the
> point of partitioning?
> Also, I remember reading something about recent improvements with a large
> number of partitions, no?
>
> As someone who's interested in partitioning, I'd appreciate details.
> Thanks, --DD
>

This was on 12.5.  v13 was just released, and we weren't confident about
running a mission-critical system on a .1 version.

All "transaction" tables were partitioned by month on partion_date, while
the PK was table_name_id, partition_date.

Queries were _slow_, even when the application knew the partition_date range
(since queries might span months).  PG just wouldn't prune.

When I departitioned the tables, performance became acceptable.
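You could see it in the plans: a pruned plan lists only the matching
partitions, but ours scanned every one.  Something like this (hypothetical
table name) shows which partitions the planner keeps:

  EXPLAIN (COSTS OFF)
  SELECT * FROM transactions
  WHERE partition_date >= DATE '2021-06-01'
    AND partition_date <  DATE '2021-07-01';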


Re: how can I fix my accent issues?

2023-12-11 Thread Igniris Valdivia Baez
Hello to all, thanks for your answers. I've changed the encoding using this:
ALTER DATABASE testdb
SET client_encoding = WIN1252;

Now when we try to select data from a table we get this error:

ERROR: character with byte sequence 0xe2 0x80 0x8b in encoding "UTF8"
has no equivalent in encoding "WIN1252"
SQL state: 22P05

I want to clarify that the Postgres on dev is in a Docker environment that
already has databases in it, so we can't change the encoding for the whole
container.
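(As I understand it, the client encoding can also be set per session rather
than per database, e.g.:

  SET client_encoding TO 'WIN1252';

but we hit the same conversion error either way, since the stored UTF8 data
contains a character WIN1252 can't represent.)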

thanks in advance

On Sat, 9 Dec 2023 at 1:01, Laurenz Albe
() wrote:
>
> On Fri, 2023-12-08 at 23:58 -0500, Igniris Valdivia Baez wrote:
> > Hello, I have an ETL process collecting data from a PostgreSQL database
> > and xls files and inserting it into a PostgreSQL database. That process
> > works fine in a local DB on Postgres 14 with UTF8 encoding and
> > Spanish_Cuba.1952 collation, but when I execute it in dev, which is on
> > Postgres 15 with UTF8 and collation en_US.utf8, the words with accents
> > and ñ look like question marks. What can I do to fix this?
>
> If the data you are sending are encoded in WINDOWS-1252 (I assume that
> "1952" is just a typo), you should set the client encoding to WIN1252,
> so that PostgreSQL knows how to convert the data correctly.
>
> You can do that in several ways; the simplest might be to set the
> environment variable PGCLIENTENCODING to WIN1252.
>
> Yours,
> Laurenz Albe




Re: how can I fix my accent issues?

2023-12-11 Thread Adrian Klaver



On 12/11/23 10:54 AM, Igniris Valdivia Baez wrote:

Hello to all, thanks for your answers. I've changed the encoding using this:
ALTER DATABASE testdb
SET client_encoding = WIN1252;

Now when we try to select data from a table we get this error:

ERROR: character with byte sequence 0xe2 0x80 0x8b in encoding "UTF8"
has no equivalent in encoding "WIN1252"
SQL state: 22P05



That is not surprising, since, per a previous post from you, your database
has:

"... postgres 15 and UTF8 with collation
en_US.utf8 ..."

It is entirely possible there are values in the database that have no
corresponding sequence in WIN1252.


At this point you will need to stick to UTF8.
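If a WIN1252 consumer really needs this data, one option is to strip the
offending character first: 0xe2 0x80 0x8b is U+200B (ZERO WIDTH SPACE),
i.e. chr(8203).  A sketch, with made-up table/column names:

  -- find affected rows
  SELECT * FROM mytable WHERE mycol LIKE '%' || chr(8203) || '%';

  -- remove the character
  UPDATE mytable SET mycol = replace(mycol, chr(8203), '');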





I want to clarify that the Postgres on dev is in a Docker environment that
already has databases in it, so we can't change the encoding for the whole
container.

thanks in advance

On Sat, 9 Dec 2023 at 1:01, Laurenz Albe
() wrote:

On Fri, 2023-12-08 at 23:58 -0500, Igniris Valdivia Baez wrote:

Hello, I have an ETL process collecting data from a PostgreSQL database
and xls files and inserting it into a PostgreSQL database. That process
works fine in a local DB on Postgres 14 with UTF8 encoding and
Spanish_Cuba.1952 collation, but when I execute it in dev, which is on
Postgres 15 with UTF8 and collation en_US.utf8, the words with accents
and ñ look like question marks. What can I do to fix this?

If the data you are sending are encoded in WINDOWS-1252 (I assume that
"1952" is just a typo), you should set the client encoding to WIN1252,
so that PostgreSQL knows how to convert the data correctly.

You can do that in several ways; the simplest might be to set the
environment variable PGCLIENTENCODING to WIN1252.

Yours,
Laurenz Albe







Re: Question on overall design

2023-12-11 Thread Chris Travers
On Tue, Dec 12, 2023 at 2:11 AM Ron Johnson  wrote:

> On Mon, Dec 11, 2023 at 4:41 AM Dominique Devienne 
> wrote:
>
>> On Sun, Dec 10, 2023 at 5:56 PM Ron Johnson 
>> wrote:
>>
>>> * We departitioned because SELECT statements were *slow*.  All
>>> partitions were scanned, even when the partition key was specified in the
>>> WHERE clause.
>>>
>>
>> Surely that's not the case on newer PostgreSQL, is it? Otherwise what's
>> the point of partitioning?
>> Also, I remember reading something about recent improvements with a large
>> number of partitions, no?
>>
>> As someone who's interested in partitioning, I'd appreciate details.
>> Thanks, --DD
>>
>
> This was on 12.5.  v13 was just released, and we weren't confident about
> running a mission-critical system on a .1 version.
>

Something's wrong if all partitions are scanned even when the partition
clause is explicit in the where clause.

There are, however, some things which can cause problems here, such as type
casts of the partition key, or the partition key being brought in from a
join.

>
> All "transaction" tables were partitioned by month on partion_date, while
> the PK was table_name_id, partition_date.
>
> Queries were _slow_, even when the application knew the partition_date
> range (since queries might span months).  PG just wouldn't prune.
>

Was there a datatype issue here?  Like having a partition key of type
timestamp, but the query casting from date?

>
> When I departitioned the tables, performance became acceptable.
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Question on overall design

2023-12-11 Thread Ron Johnson
On Mon, Dec 11, 2023 at 10:34 PM Chris Travers 
wrote:

> On Tue, Dec 12, 2023 at 2:11 AM Ron Johnson 
> wrote:
>
>> On Mon, Dec 11, 2023 at 4:41 AM Dominique Devienne 
>> wrote:
>>
>>> On Sun, Dec 10, 2023 at 5:56 PM Ron Johnson 
>>> wrote:
>>>
>>>> * We departitioned because SELECT statements were *slow*.  All
>>>> partitions were scanned, even when the partition key was specified in the
>>>> WHERE clause.

>>>
>>> Surely that's not the case on newer PostgreSQL, is it? Otherwise what's
>>> the point of partitioning?
>>> Also, I remember reading something about recent improvements with a
>>> large number of partitions, no?
>>>
>>> As someone who's interested in partitioning, I'd appreciate details.
>>> Thanks, --DD
>>>
>>
>> This was on 12.5.  v13 was just released, and we weren't confident about
>> running a mission-critical system on a .1 version.
>>
>
> Something's wrong if all partitions are scanned even when the partition
> clause is explicit in the where clause.
>
> There are, however, some things which can cause problems here, such as
> type casts of the partition key, or the partition key being brought in
> from a join.
>

Here's a snippet.  part_date (type timestamp without time zone) is the
partition key:

  and separation0_.part_date>=to_date('01-Jun-2021', 'DD-Mon-YYYY')
  and transmissi1_.part_date>=to_date('01-Jun-2021', 'DD-Mon-YYYY')
  and separation0_.part_date=transmissi1_.part_date


>
>> All "transaction" tables were partitioned by month on partion_date, while
>> the PK was table_name_id, partition_date.
>>
>> Queries were _slow_, even when the application knew the partition_date
>> range (since queries might span months).  PG just wouldn't prune.
>>
>
> Was there a datatype issue here?  Like having a partition key of type
> timestamp, but the query casting from date?
>

The partition key was of type timestamp, while the right-hand side of the
predicate would be whatever to_date() generated.