Re: Enforce primary key on every table during dev?

2018-03-01 Thread marcelo


On 01/03/2018 16:00 , Ron Johnson wrote:

On 03/01/2018 12:32 PM, Daevor The Devoted wrote:



On Thu, Mar 1, 2018 at 8:18 PM, Ron Johnson > wrote:



On 03/01/2018 11:47 AM, Daevor The Devoted wrote:


On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar
mailto:rakeshkumar...@aol.com>> wrote:


>Adding a surrogate key to such a table just adds overhead,
although that could be useful
>in case specific rows need updating or deleting without
also modifying the other rows with
>that same data - normally, only insertions and selections
happen on such tables though,
>and updates or deletes are absolutely forbidden -
corrections happen by inserting rows with
>an opposite transaction.

I routinely add surrogate keys like a serial col to a table
already having nice candidate keys
to make it easy to join tables.  SQL starts looking ungainly
when you have a 3 col primary
key and need to join it with child tables.


I was always of the opinion that a mandatory surrogate key (as
you describe) is good practice.
Sure there may be a unique key according to business logic
(which may consist of those "ungainly" multiple columns), but
guess what, business logic changes, and then you're screwed!


And so you drop the existing index and build a new one.  I've
done it before, and I'll do it again.


So using a primary key whose sole purpose is to be a primary key
makes perfect sense to me.


I can't stand synthetic keys.  By their very nature, they're so
purposelessly arbitrary, and allow you to insert garbage into the
table.


Could you perhaps elaborate on how a surrogate key allows one to 
insert garbage into the table? I'm afraid I don't quite get what 
you're saying.


If your only unique index is a synthetic key, then you can insert the 
same "business data" multiple times with different synthetic keys.



--
Angular momentum makes the world go 'round.
IMHO, business logic can and must preclude "garbage insertion". Unless 
you are inserting data directly into the database using SQL, any n-tier 
architecture will be checking data validity.





Re: Enforce primary key on every table during dev?

2018-03-01 Thread marcelo



On 01/03/2018 16:42 , Ron Johnson wrote:

On 03/01/2018 01:11 PM, marcelo wrote:


On 01/03/2018 16:00 , Ron Johnson wrote:

[snip]

If your only unique index is a synthetic key, then you can insert
the same "business data" multiple times with different synthetic keys.


--
Angular momentum makes the world go 'round.

IMHO, business logic can and must preclude "garbage insertion".
Unless you are inserting data directly into the database using SQL, any
n-tier architecture will be checking data validity.


Any n-tier architecture that's bug-free.


Do you know about unit testing?





Re: Enforce primary key on every table during dev?

2018-03-01 Thread marcelo



On 01/03/2018 17:21 , Ron Johnson wrote:

On 03/01/2018 02:08 PM, marcelo wrote:



On 01/03/2018 16:42 , Ron Johnson wrote:

On 03/01/2018 01:11 PM, marcelo wrote:


On 01/03/2018 16:00 , Ron Johnson wrote:

[snip]
If your only unique index is a synthetic key, then you can insert 
the same "business data" multiple times with different synthetic 
keys.



--
Angular momentum makes the world go 'round.
IMHO, business logic can and must preclude "garbage insertion". 
Unless you are inserting data directly into the database using SQL, any 
n-tier architecture will be checking data validity.

Any n-tier architecture that's bug-free.


Do you know about unit testing?


Way Back When Dinosaurs Still Roamed The Earth and I first learned the 
trade, the focus was on proper design instead of throwing crud against 
the wall and hoping tests caught any bugs.  Because, of course, unit 
tests are only as good as you imagination in devising tests.



So, you are fully convinced that there's no bug-free software... Just as 
I (and you) can code following the business rules, you (and I) can 
design unit tests not from "imagination" but from those same rules.
Moreover: you can have a surrogate key (to speed up foreign keys) and 
simultaneously put a unique constraint on the columns that require it. 
What's the question?
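
A minimal sketch of that combination (the table and columns are only illustrative):

create table invoice (
    id             serial primary key,     -- surrogate key; cheap to reference from child tables
    customer_code  text    not null,
    invoice_number integer not null,
    unique (customer_code, invoice_number) -- the business-logic key is still enforced
);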






Re: Enforce primary key on every table during dev?

2018-03-01 Thread marcelo



On 01/03/2018 17:32 , David G. Johnston wrote:
On Thu, Mar 1, 2018 at 1:24 PM, Ron Johnson >wrote:


Why have the overhead of a second unique index?  If it's "ease of
joins", then I agree with Francisco Olarte and use the business
logic keys in your joins even though it's a bit of extra work.


​The strongest case, for me, when a surrogate key is highly desirable 
is when there is no truly natural key and the best key for the model 
is potentially alterable. Specifically, the "name" of something.  If I add 
myself to a database and make name unique, as "David Johnston", then 
someone else comes along with the same name and now I want to add the 
new person as, say, "David A. Johnston" AND rename my existing record to 
"David G. Johnston".  I keep the needed uniqueness and don't need to 
cobble together other data elements.  Or, if I were to use email 
address as the key, the same physical entity can now change their 
address without me having to cascade-update all FK instances too. 
Avoiding the FK cascade when enforcing a non-ideal PK is a major good 
reason to assign a surrogate.


David J.


I have myself suffered exactly what David describes in his example...
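
A sketch of the situation David describes (hypothetical tables; the point is that the FK references the surrogate id, so the rename touches a single row):

create table person (
    id   serial primary key,
    name text not null unique
);

create table task (
    id        serial primary key,
    person_id integer not null references person (id)
);

-- Rename the existing record; no foreign keys need a cascade update,
-- because task rows point at person.id, not at person.name.
update person set name = 'David G. Johnston' where name = 'David Johnston';
insert into person (name) values ('David A. Johnston');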




Re: Enforce primary key on every table during dev?

2018-03-01 Thread marcelo



On 01/03/2018 18:41 , Adrian Klaver wrote:

On 03/01/2018 01:26 PM, Ron Johnson wrote:

On 03/01/2018 03:14 PM, Adrian Klaver wrote:

On 03/01/2018 01:03 PM, Ron Johnson wrote:

On 03/01/2018 02:32 PM, David G. Johnston wrote:


There's always the "account number", which is usually synthetic. 
Credit Card numbers are also synthetic. 


Actually, no:

https://en.wikipedia.org/wiki/Payment_card_number

There is a method to the madness, not just random issuance of 
numbers. That made it relatively easy for folks to *generate 
numbers*. Hence the addition of CSC codes.


Right.  And how do the issuers generate the individual account 
identifier within their IIN ranges?


Who knows, that is their business, though there is nothing to say they 
don't use some sort of internal 'natural' logic. It has been a while 
since we have gone down this rabbit hole on this list, mostly because 
it is an issue that is usually left at 'we agree to disagree'. Though 
the thing that always strikes me is the assumption that a 
number/surrogate key is less 'natural' than some other sort of tag or 
combination of tags. Because that is what PKs are: a tag to identify 
a record.

+1.






ICD numbers are (relatively) synthetic, too.

But that doesn't mean we have to use them willy-nilly everywhere.


--
Angular momentum makes the world go 'round.










Re: Enforce primary key on every table during dev?

2018-03-01 Thread marcelo



On 01/03/2018 19:05 , Gavin Flower wrote:

On 02/03/18 06:47, Daevor The Devoted wrote:


On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar > wrote:



    >Adding a surrogate key to such a table just adds overhead,
    although that could be useful
    >in case specific rows need updating or deleting without also
    modifying the other rows with
    >that same data - normally, only insertions and selections happen
    on such tables though,
    >and updates or deletes are absolutely forbidden - corrections
    happen by inserting rows with
    >an opposite transaction.

    I routinely add surrogate keys like serial col to a table already
    having a nice candidate keys
    to make it easy to join tables.  SQL starts looking ungainly when
    you have a 3 col primary
    key and need to join it with child tables.


I was always of the opinion that a mandatory surrogate key (as you 
describe) is good practice.
Sure there may be a unique key according to business logic (which may 
consist of those "ungainly" multiple columns), but guess what, 
business logic changes, and then you're screwed! So using a primary 
key whose sole purpose is to be a primary key makes perfect sense to me.


I once worked in a database that had primary keys of at least 4 
columns, all character fields; the primary key could easily exceed 45 
characters.  The parent-child structure was at least 4 levels deep.


A child table only needs to know its parent, so there is no logical 
need to include its parent's and higher tables' primary keys, and then 
have to add a field to make the composite primary key unique!  So if 
every table has an int (or long) primary key, then a child only needs a 
single field to reference its parent.
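
A sketch of that point (illustrative tables; with composite natural keys, child would have to carry every key column of every ancestor):

create table grandparent (
    id serial primary key
);

create table parent (
    id             serial primary key,
    grandparent_id integer not null references grandparent (id)
);

create table child (
    id        serial primary key,
    parent_id integer not null references parent (id)   -- one column, however deep the hierarchy
);

-- joining up the hierarchy needs one column per level:
select c.id
from child c
join parent p      on p.id = c.parent_id
join grandparent g on g.id = p.grandparent_id;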


Some apparently safe Natural Keys might change unexpectedly.  A few 
years back there was a long thread on Natural versus Surrogate keys - 
plenty of examples where using Natural Keys can give grief when they 
have to be changed!  I think it best to isolate a database from 
external changes as much as is practicable.


Surrogate keys also simplify coding, be it in SQL or Java, or whatever 
language is flavour of the month.  Also they make setting up test data 
and debugging easier.


I almost invariably define a Surrogate key when I design tables.


Cheers,
Gavin





+5. I fully agree.





Re: Enforce primary key on every table during dev?

2018-03-02 Thread marcelo



On 02/03/2018 01:10 , Daevor The Devoted wrote:



On Fri, Mar 2, 2018 at 12:05 AM, Gavin Flower 
mailto:gavinflo...@archidevsys.co.nz>> 
wrote:


On 02/03/18 06:47, Daevor The Devoted wrote:


On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar
mailto:rakeshkumar...@aol.com>
>> wrote:


    >Adding a surrogate key to such a table just adds overhead,
    although that could be useful
    >in case specific rows need updating or deleting without also
    modifying the other rows with
    >that same data - normally, only insertions and selections
happen
    on such tables though,
    >and updates or deletes are absolutely forbidden - corrections
    happen by inserting rows with
    >an opposite transaction.

    I routinely add surrogate keys like serial col to a table
already
    having a nice candidate keys
    to make it easy to join tables.  SQL starts looking
ungainly when
    you have a 3 col primary
    key and need to join it with child tables.


I was always of the opinion that a mandatory surrogate key (as
you describe) is good practice.
Sure there may be a unique key according to business logic
(which may consist of those "ungainly" multiple columns),
but guess what, business logic changes, and then you're
screwed! So using a primary key whose sole purpose is to be a
primary key makes perfect sense to me.


I once worked in a database that had primary keys of at least 4
columns, all character fields; the primary key could easily exceed 45
characters.  The parent-child structure was at least 4 levels deep.

A child table only needs to know its parent, so there is no
logical need to include its parent's and higher tables' primary keys,
and then have to add a field to make the composite primary key
unique!  So if every table has an int (or long) primary key, then a
child only needs a single field to reference its parent.

Some apparently safe Natural Keys might change unexpectedly.  A
few years back there was a long thread on Natural versus
Surrogate keys - plenty of examples where using Natural Keys can
give grief when they have to be changed!  I think it best to
isolate a database from external changes as much as is practicable.

Surrogate keys also simplify coding, be it in SQL or Java, or
whatever language is flavour of the month.  Also they make setting
up test data and debugging easier.

I almost invariably define a Surrogate key when I design tables.


Cheers,
Gavin


Thank you! I think you have expressed far more clearly what I have 
been trying to say. +10 to you.

Me too. Another +10.




Re: Do we need yet another IDE (SQL development assistant) for PostgreSQL?

2018-07-15 Thread marcelo

For Windows platforms only, there is
https://www.sqlmanager.net/products/postgresql/manager/ whose Lite
version is free.
I use it nearly daily and it works like a charm.

On 15/07/2018 16:08 , Dmitry Igrishin wrote:

Thank you all for your responses!

Okay, if I decide to start this project, the Linux platform will be
supported.

As I see it, pgspa (PostgreSQL Server Programming Assistant) should
have both a command line interface
and a GUI for visualizing some aspects of the work. The tool should
be friendly to your favorite editors (Emacs,
VSCode etc.) for easy integration into the daily workflow. (There are no
plans to write a feature-rich text editor inside
the IDE. It's more rational to provide integration with
existing text editors for professional programmers.)
What do you think?







Logical locking beyond pg_advisory

2018-09-16 Thread marcelo

I need a mechanism of "logical locking" more ductile than the
pg_advisory family.
I'm thinking of a table ("lock_table") that would be part of the
database, with columns
* tablename varchar - name of the table "locked"
* rowid integer, - id of the row "locked"
* ownerid varchar, - identifier of the "user" who acquired the lock
* acquired timestamp - to be able to release "abandoned" locks after a
certain time

and a group of functions
1) lock_table (tablename varchar, ownerid varchar) bool - get to lock
over the entire table, setting rowid to zero
2) unlock_table (tablename varchar, ownerid varchar) bool - unlock the
table, if the owner is the recorded one
3) locked_table (tablename varchar, ownerid varchar) bool - ask if the
table is locked by some user other than the ownerid argument
4) lock_row (tablename varchar, rowid integer, ownerid varchar) bool -
similar to pg_try_advisory_lock
5) unlock_row (tablename varchar, rowid integer, ownerid varchar) bool -
similar to pg_advisory_unlock
6) unlock_all (ownerid varchar) bool - unlock all locks owned by ownerid

The timeout (default, maybe 15 minutes) is implicitly applied if the
lock is held by another user (there will be no notification).
Redundant locks are not queued; they simply return true, maybe after an
update of the acquired column.
Successful locks insert a new row, except in the rare case of a timeout,
which becomes an update (of ownerid and acquired).
Unlock operations delete the corresponding row.
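
A minimal sketch of how the table and one of the functions could look, following the proposal above (the timeout handling is simplified and the names are just the proposed ones):

create table lock_table (
    tablename varchar   not null,
    rowid     integer   not null,                 -- 0 means "the whole table"
    ownerid   varchar   not null,
    acquired  timestamp not null default now(),
    primary key (tablename, rowid)
);

create function lock_row(p_tablename varchar, p_rowid integer, p_ownerid varchar)
returns boolean language plpgsql as $$
begin
    -- refresh a lock we already hold, or take over an abandoned one
    update lock_table
       set ownerid = p_ownerid, acquired = now()
     where tablename = p_tablename and rowid = p_rowid
       and (ownerid = p_ownerid or acquired < now() - interval '15 minutes');
    if found then
        return true;
    end if;
    begin
        insert into lock_table (tablename, rowid, ownerid)
        values (p_tablename, p_rowid, p_ownerid);
        return true;
    exception when unique_violation then
        return false;    -- someone else holds a live lock on this row
    end;
end;
$$;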

My question is threefold:
a) What is the opinion on the project?
b) What are the consequences of the large number of inserts and deletions?
c) Performance. In fact, pg_advisory* implies a network roundtrip, but
(I think) no table operations.
TIA





Re: Logical locking beyond pg_advisory

2018-09-17 Thread marcelo
/Would be nice if you explain more about what kind of problem you want 
to solve./


There are two main questions "to solve":
a) Generally speaking, I like to get some form of "exclusive access" to 
the row before updating or deleting. None of the optimistic/pessimistic 
automatic variants of concurrency management satisfies me. Nor, of course, 
do the "versioning" variants.
b) Some of the tables I'm working on have a "number" column (_not the 
PK_) which sometimes comes from material, external sources, but sometimes 
must be assigned by the system. This could be solved in two main ways:
b.1) Use a trigger to get the famous "max(n) + 1". In at least one of 
the cases, the automatically generated number must fall into different 
ranges conditioned by the value of another column, and the ranges vary 
between versions of the database.
b.2) "Lock the entire table", programmatically get the next number for 
the correct range, assign it and free the table lock.


Of course, and to be clear, all database manipulations are done through 
applications.
Until yesterday, I was working with the "advisory" family. Having a 
bigint as the only "lock identifier", I was working with a hash of the 
table name XORed with the id of the row, or zero for the entire table. 
(All my tables have an auto-sequential integer id as PK.)
Even if I found a very robust hash algorithm for the table name, I 
cannot rule out some collision once the id is XORed in. I tested five or 
six table names, with two ids each, without collision. But...
Of course, I need the "full table lock" for inserts. So it's a concern 
quite separate from updates and deletions. But...
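
For what it's worth, the advisory-lock functions also come in a two-key form taking two int4 arguments, so the table's OID can serve as the first key and the row id (or zero) as the second. That avoids the hand-rolled hashing and cross-table collisions entirely. A sketch, with a placeholder table name (note that very large OIDs may map to negative integers, which is still a valid lock key):

select pg_try_advisory_lock('mytable'::regclass::oid::int, 42);  -- lock row 42 of mytable
select pg_try_advisory_lock('mytable'::regclass::oid::int, 0);   -- "whole table" lock, by convention
select pg_advisory_unlock('mytable'::regclass::oid::int, 42);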


TIA


On 17/09/2018 03:19 , Fabrízio de Royes Mello wrote:


On Sun, Sep 16, 2018 at 17:53, marcelo <mailto:marcelo.nico...@gmail.com>> wrote:

>
> I need a mechanism of "logical locking" more ductile than the 
pg_advisory family.
> I'm thinking of a table ("lock_table") that would be part of the 
database, with columns

> * tablename varchar - name of the table "locked"
> * rowid integer, - id of the row "locked"
> * ownerid varchar, - identifier of the "user" who acquired the lock
> * acquired timestamp - to be able to release "abandoned" locks after 
a certain time

>
> and a group of functions
> 1) lock_table (tablename varchar, ownerid varchar) bool - get to 
lock over the entire table, setting rowid to zero
> 2) unlock_table (tablename varchar, ownerid varchar) bool - unlock 
the table, if the owner is the recorded one
> 3) locked_table (tablename varchar, ownerid varchar) bool - ask if 
the table is locked by some user other than the ownerid argument
> 4) lock_row (tablename varchar, rowid integer, ownerid varchar) bool 
- similar to pg_try_advisory_lock
> 5) unlock_row (tablename varchar, rowid integer, ownerid varchar) 
bool - similar to pg_advisory_unlock

> 6) unlock_all (ownerid varchar) bool - unlock all locks owned by ownerid
>
> The timeout (default, maybe 15 minutes) is implicitly applied if the 
lock is taken by another user (there will be no notification).
> Redundant locks are not queued, they simply return true, may be 
after an update of the acquired column.
> Successful locks insert a new row, except the rare case of a 
timeout, which becomes an update (ownerid and acquired)

> Unlock operations deletes the corresponding row
>
> My question is double
> a) What is the opinion on the project?

Would be nice if you explain more about what kind of problem you want 
to solve.


> b) What are the consequences of the large number of inserts and 
deletions


The first thing came to my mind with this approach is table bloat.

> c) Performance. In fact, pg_advisory* implies a network roundtrip, 
but (I think) no table operations.


Yeap... no table operations.

Regards,

--
   Fabrízio de Royes Mello         Timbira - http://www.timbira.com.br/
   PostgreSQL: Consulting, Development, 24x7 Support and Training






Re: Logical locking beyond pg_advisory

2018-09-17 Thread marcelo



On 17/09/2018 12:21 , Chris Travers wrote:



On Mon, Sep 17, 2018 at 5:09 PM Merlin Moncure <mailto:mmonc...@gmail.com>> wrote:


On Sun, Sep 16, 2018 at 3:53 PM marcelo mailto:marcelo.nico...@gmail.com>> wrote:
>
> I need a mechanism of "logical locking" more ductile than the
pg_advisory family.
> I'm thinking of a table ("lock_table") that would be part of the
database, with columns
> * tablename varchar - name of the table "locked"
> * rowid integer, - id of the row "locked"
> * ownerid varchar, - identifier of the "user" who acquired the lock
> * acquired timestamp - to be able to release "abandoned" locks
after a certain time
>
> and a group of functions
> 1) lock_table (tablename varchar, ownerid varchar) bool - get to
lock over the entire table, setting rowid to zero
> 2) unlock_table (tablename varchar, ownerid varchar) bool -
unlock the table, if the owner is the recorded one
> 3) locked_table (tablename varchar, ownerid varchar) bool - ask
if the table is locked by some user other than the ownerid argument
> 4) lock_row (tablename varchar, rowid integer, ownerid varchar)
bool - similar to pg_try_advisory_lock
> 5) unlock_row (tablename varchar, rowid integer, ownerid
varchar) bool - similar to pg_advisory_unlock
> 6) unlock_all (ownerid varchar) bool - unlock all locks owned by
ownerid
>
> The timeout (default, maybe 15 minutes) is implicitly applied if
the lock is taken by another user (there will be no notification).
> Redundant locks are not queued, they simply return true, may be
after an update of the acquired column.
> Successful locks insert a new row, except the rare case of a
timeout, which becomes an update (ownerid and acquired)
> Unlock operations deletes the corresponding row
>
> My question is double
> a) What is the opinion on the project?
> b) What are the consequences of the large number of inserts and
deletions
> c) Performance. In fact, pg_advisory* implies a network
roundtrip, but (I think) no table operations.

Why can't you use the advisory lock functions?  The challenge with
manually managed locks are they they are slow and you will lose the
coordination the database provides you.  For example, if your
application crashes you will have to clean up all held locks yourself.
Building out that infrastructure will be difficult.


First, I think in an ideal world, you wouldn't handle this problem 
with either approach but sometimes you have to.


I have done both approaches actually.  LedgerSMB uses its own lock 
table because locks have to persist across multiple HTTP requests and 
we have various automatic cleanup processes.


When I was working on  the queue management stuff at Novozymes we used 
advisory locks extensively.


These two approaches have serious downsides:
1.  Lock tables are *slow* and require careful thinking through 
cleanup scenarios.  In LedgerSMB we tied them to the application session 
with an ON DELETE event that would unlock the row.  We estimated that 
for every 2 seconds that the db spent doing useful work, it spent 42 
seconds managing the locks.  Additionally, the fact that locks take 
effect on snapshot advance is a problem here.


2.  In my talk, "PostgreSQL at 10TB and  Beyond" I talk about a 
problem we had using advisory locks for managing rows that were being 
processed for deletion.  Since the deletion was the scan for items at 
the head of an index, under heavy load we could spend long enough 
checking dead rows that the locks could go away with our snapshot 
failing to advance.  This would result in duplicate processing.  So 
the fact that advisory locks don't really follow snapshot semantics is 
a really big problem here since it means you can have race conditions 
in advisory locks that can't happen with other locking issues.  I 
still love advisory locks but they are not a good tool for this.


The real solution most of the time is actually to lock the rows by 
selecting FOR UPDATE and possibly SKIP LOCKED. The way update/delete 
row locking in PostgreSQL works is usually good enough except in a few 
rare edge cases.  Only in *very rare* cases do lock tables or advisory 
locks make sense for actual row processing.



merlin



--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor 
lock-in.

http://www.efficito.com/learn_more
I'm using an ORM (Devart's) to access the database, so I cannot "select 
... FOR UPDATE". The application paradigm is that a user has a list of 
records (after a query) and can update or delete any of them as 
the business rules allow it. So, at least an advisory lock is a must.
I'm conv

Re: Logical locking beyond pg_advisory

2018-09-17 Thread marcelo



On 17/09/2018 14:27 , Chris Travers wrote:



On Mon, Sep 17, 2018 at 6:04 PM marcelo <mailto:marcelo.nico...@gmail.com>> wrote:




I'm using an ORM (Devart's) to access the database, so I cannot
"select ... FOR UPDATE". The application paradigm is that a user
has a list of records (after a query) and can update or
delete any of them as the business rules allow it. So, at least
an advisory lock is a must.
I'm convinced by now: I would stay with advisory locks...
hoping no app crash will occur...


I would say to fix this in the ORM rather than reinvent what the 
database already gives you in the database.



You are right. But you know...


Thank you all.
Marcelo





--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor 
lock-in.

http://www.efficito.com/learn_more






Select into table%ROWTYPE failed

2018-09-18 Thread marcelo

I'm testing a trigger function in a 9.4 installation.
It's for bill number assignment, but with a twist: there are various 
numbering ranges. These ranges are defined by a text code, a minimum and 
a maximum. Every bill has some code taken from the set defined in a 
specific table (billnumberrange).
The first approach was the obvious "select into" a row instance, using 
table%ROWTYPE as the destination.

That failed, leaving all fields of the instance null.
But selecting into just the interesting fields works OK. The trigger function 
follows; the lines from the initial approach are commented out.


CREATE FUNCTION nextbillnumber() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
DECLARE
  lastnumber integer;
  lowerlimit integer;
  upperlimit integer;
  -- ranger billnumberrange%ROWTYPE;
BEGIN
  if NEW.billnumber = 0 THEN
    select billnumberrange.lowerlimit, billnumberrange.upperlimit
      from billnumberrange
      where billnumberrange.groupcode = NEW.groupcode
      into lowerlimit, upperlimit;
      -- where billnumberrange.groupcode = NEW.groupcode into ranger;
      -- RAISE NOTICE 'first select result % % <> %',
      --     ranger.groupcode, ranger.lowerlimit, ranger.upperlimit;
      --     (this NOTICE showed   <> )
    RAISE NOTICE 'first select result % <> %', lowerlimit, upperlimit;
      -- this shows the expected values

    select max(billnumber)
      from bill
      where billnumber BETWEEN lowerlimit and upperlimit
      -- where billnumber BETWEEN ranger.lowerlimit and ranger.upperlimit
      into lastnumber;
    RAISE NOTICE 'second select result %', FOUND;
    if lastnumber is null THEN
        lastnumber := lowerlimit;
        -- lastnumber := ranger.lowerlimit;
    end if;
    RAISE NOTICE 'lastnumber is %', lastnumber;
    NEW.billnumber = lastnumber + 1;
  end if;
  return NEW;
END;
$$;

What was wrong in the first approach?
TIA






Re: Select into table%ROWTYPE failed

2018-09-18 Thread marcelo

Ohh, you are right!
Thank you!

On 18/09/2018 14:10 , Tom Lane wrote:

marcelo  writes:

What was wrong in the first approach?

plpgsql's "SELECT INTO" expects a one-for-one match between the output
columns of the SELECT and the columns of the INTO destination.  So I'd
expect something like this to work:

DECLARE
   ranger billnumberrange%ROWTYPE;
BEGIN
  SELECT * FROM billnumberrange WHERE ... INTO ranger;

Your example clearly wasn't selecting all the columns, and it
wasn't clear whether you paid any attention to column ordering;
but both of those matter.

regards, tom lane
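
Applied to the trigger above, the relevant fragment of a corrected first approach would look something like this (a sketch based on Tom's explanation; names taken from the original function):

DECLARE
  ranger billnumberrange%ROWTYPE;
BEGIN
  -- select every column, in table order, so it matches the %ROWTYPE variable
  SELECT * INTO ranger
    FROM billnumberrange
   WHERE billnumberrange.groupcode = NEW.groupcode;

  RAISE NOTICE 'first select result % % <> %',
      ranger.groupcode, ranger.lowerlimit, ranger.upperlimit;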








ORM

2018-09-28 Thread marcelo
For a new, big and convoluted project I (am/was) using Devart's 
LinqConnect as the ORM.
But today I experienced some inexplicable "object reference not set to 
an instance of an object" exceptions, or others more specific to these 
libraries.

I would like to change the ORM.
Any experiences would be appreciated. Of course, I prefer open source 
software, at least to be able to chase errors through debugging.

TIA





Re: ORM

2018-09-29 Thread marcelo



On 28/09/2018 21:39 , Adrian Klaver wrote:

On 9/28/18 3:49 PM, marcelo wrote:
For a new, big and convoluted project I (am/was) using Devart's 
LinqConnect as the ORM.
But today I experienced some inexplicable "object reference not set 
to an instance of an object" exceptions, or others more specific to 
these libraries.

I would like to change the ORM.
Any experiences would be appreciated. Of course, I prefer open 
source software, at least to be able to chase errors through debugging.


Information that might help folks steer you to alternatives:

1) What programming language(s) are you using?

C#, on the .NET Framework 4.5.


2) The OS'es involved

Linux for the database server, Windows for the client machines.


3) Are you using any frameworks between the database and the end user?

Yes, a very big one, developed by me (using open source libraries in 
some spots), which tries to implement the MVP pattern. It's nearly completely 
agnostic regarding the DAL layer, but I cannot deny that it may be 
somewhat influenced by Devart's model.


4) Is an ORM even necessary?

Yes; there are around 50 entities with lots of foreign keys. It's 
well worth having the code pre-plumbed to get "parents" and "child 
collections". Ah, and I'm working alone.




TIA













Re: ORM

2018-09-29 Thread marcelo




On 29/09/2018 16:09 , Edson Carlos Ericksson Richter wrote:

On 28/09/2018 19:49, marcelo wrote:
For a new, big and convoluted project I (am/was) using Devart's 
LinqConnect as the ORM.
But today I experienced some inexplicable "object reference not set 
to an instance of an object" exceptions, or others more specific to 
these libraries.

I would like to change the ORM.
Any experiences would be appreciated. Of course, I prefer open 
source software, at least to be able to chase errors through debugging.

TIA






For .NET I know nothing.
For Java, on the other hand, EclipseLink is fantastic.

Regards,

Edson.



Thank you. Do you know of a port to .NET?






How to know if a database has changed

2017-12-11 Thread marcelo
The installation I'm planning will manage several databases, but not all 
of them will change every day.
In order to plan/script the pg_dump usage, I would need to know, at the 
end of each day, which databases have had some change activity.

How can this be done?
TIA



pg_dump and logging

2017-12-11 Thread marcelo
When pg_dump runs on a database, is it guaranteed that the log is fully 
applied, or at least taken into account for the dump?

TIA



Re: How to know if a database has changed

2017-12-12 Thread marcelo

Hi Sam

You are right, and here is the reason behind my question: The server 
where postgres will be installed is not on 24/7. It turns on in the 
morning and goes off at the end of the day. The idea is that, as part of 
the shutdown process, a local backup is made. The next day, that backup 
will be copied to the cloud.
In order not to lengthen the shutdown process, we are trying to limit 
pg_dump to the databases that have had some change, not so much in their 
schema as in their data.
Of course, adding a trigger for every table and CUD operation on every 
database is not an option.
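
One rough way to spot data-change activity is to compare the cumulative write counters in pg_stat_database between one shutdown and the next (a sketch; as Andreas notes in the quoted message below, this cannot detect DDL, and the counters restart after a stats reset):

select datname,
       xact_commit,
       tup_inserted + tup_updated + tup_deleted as writes
from   pg_stat_database
where  datname not in ('template0', 'template1');
-- store this snapshot somewhere and pg_dump only the databases whose
-- "writes" value changed since the previous snapshot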


On 11/12/17 23:23, Sam Gendler wrote:
I think there's a more useful question, which is why do you want to do 
this?  If it is just about conditional backups, surely the cost of 
backup storage is low enough, even in S3 or the like, that a duplicate 
backup is an afterthought from a cost perspective? Before you start 
jumping through hoops to make your backups conditional, I'd first do 
some analysis and figure out what the real cost of the thing I'm 
trying to avoid actually is, since my guess is that you are deep into 
a premature optimization  
here, where either the cost of the duplicate backup isn't 
consequential or the frequency of duplicate backups is effectively 0.  
It would always be possible to run some kind of checksum on the backup 
and skip storing it if it matches the previous backup's checksum if 
you decide that there truly is value in conditionally backing up the 
db.  Sure, that would result in dumping a db that doesn't need to be 
dumped, but if your write transaction rate is so low that backups end 
up being duplicates on a regular basis, then surely you can afford the 
cost of a pg_dump without any significant impact on performance?


On Mon, Dec 11, 2017 at 10:49 AM, Andreas Kretschmer 
mailto:andr...@a-kretschmer.de>> wrote:




Am 11.12.2017 um 18:26 schrieb Andreas Kretschmer:

it's just a rough idea...


... and not perfect, because you can't capture ddl in this way.



Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company.

www.2ndQuadrant.com 







Strange behavior with missing column in SQL function

2018-07-30 Thread Marcelo Lacerda
Here's the code that reproduces the behavior:
http://paste.debian.net/1035412/

I have already discussed this in the IRC channel but there doesn't seem to
be a consensus on whether this is a bug. Here's a brief transcript of
RhodiumToad's opinion:

> this isn't new, goes back to 9.1 at least
> basically, the error path in sql_fn_post_column_ref is a bit confused.
> seeing r.c it tries to resolve it as parameter.field, fails, and rather
than reporting the error directly as being a missing field, it just returns
with the reference unresolved
>then the outer parser code, having failed to resolve it as table.column
and having had the hook function not override it, reports it on the
assumption that it's a missing table
> so it's probably been this way for as long as named parameters have
worked in language sql
> msl09: as far as I can tell it's just giving the wrong error in an error
path, everything that's supposed to work does work
> msl09: but the error is definitely misleading

My question is "Is this a bug? Should it be reported?"
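
For readers without access to the paste, a minimal reproduction along the lines discussed in the thread (a sketch; the exact error text may vary by version):

create table mytable (a integer, b integer);

create or replace function myfunction(myrow mytable)
returns integer as $$
  select myrow.c + myrow.b;   -- "c" is not a column of mytable
$$ language sql;

-- instead of complaining about the missing field, the error blames a
-- missing table, something like:
--   ERROR:  missing FROM-clause entry for table "myrow"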


Re: Strange behavior with missing column in SQL function

2018-07-31 Thread Marcelo Lacerda
> CREATE OR REPLACE FUNCTION myfunction(myrow mytable)
> RETURNS INTEGER AS $$
> SELECT myrow.c + myrow.b FROM myrow;
> $$ LANGUAGE sql;

> where "myrow" is a table with a different set of column names from
> "mytable".  The existing behavior for that is to seek the column name
> in "myrow" (the table), failing that to seek it in the parameter,
> and only to throw an error if both fail.
Wow, I never thought this would be possible. Why didn't the designers of the
language use "myrow mytable%ROWTYPE" for rows of a table as a parameter,
given that it's a valid type in PL/pgSQL? I figure that way it would have
been much easier to disambiguate the definitions.


check_function_bodies not doing much

2018-08-07 Thread Marcelo Lacerda
I was trying to get postgres to warn me that I'm referencing a table that
doesn't exist inside a function, so I was told on IRC to check the
setting "check_function_bodies"; however, when I use it with a plpgsql
function it doesn't actually check whether the tables in the body exist. Is this
the correct behavior?

Example:
http://paste.debian.net/1037080/


Re: check_function_bodies not doing much

2018-08-07 Thread Marcelo Lacerda
That's a whole different nightmare that I'm expecting.

"Yep I double-checked all my functions to see if any would break if I
change this field mytable.a into 2 fields mytable.a1 and mytable.a2 and
everything is ok."

*1 month later*

"Why is this error log for this application that I wrote one year ago so
big? I haven't changed anything!"

Error table mytable has no column a
Error table mytable has no column a
Error table mytable has no column a
...

It's frustrating that the references a function makes to the tables and
fields it accesses aren't taken into account when validating whether a
change to the structure of the database breaks the APIs that the database
exposes.

On Tue, Aug 7, 2018 at 6:44 PM Merlin Moncure  wrote:

> On Tue, Aug 7, 2018 at 2:31 PM Tom Lane  wrote:
> >
> > Marcelo Lacerda  writes:
> > > I was trying to get postgres to warn me that I'm referencing a table
> that
> > > it doesn't exists inside a function so I was told on the IRC to check
> the
> > > setting "check_function_bodies", however when I use it in a plpgsql
> > > function it doesn't actually check if the tables in the body exist. Is
> this
> > > the correct behavior?
> >
> > Yes.  It's supposed to be a syntax check, not a check that the function
> > would work when executed.  (Depending on the particular PL you're using,
> > which you didn't mention, it might be a pretty weak syntax check too.)
> >
> > An example of why a thorough check would be inadvisable is that a trigger
> > function might contain references to OLD and NEW that are in code paths
> > protected by checks on the trigger event type.  That could be perfectly
> > OK, but a static check couldn't tell.
> >
> > I believe there are some external tools floating around that check things
> > more aggressively, and hence with a higher rate of false positives.
>
> The only valid use of this GUC that I can think of is to work around
> this problem;
> postgres=# create or replace function f() returns void as
> $$
>   create temp table x(id int);
>   delete from x;
> $$ language sql;
> ERROR:  relation "x" does not exist
>
> ...I've since given up on writing plain sql functions except for
> inline cases though so I don't use it anymore.  Static resolution of
> tables is not very useful since the state of the database as the time
> of function creation is different than what it might be when the
> function is run (as opposed to compiled languages obviously).
>
> merlin
>


Re: check_function_bodies not doing much

2018-08-07 Thread Marcelo Lacerda
I'll take a look at it. Thanks for the recommendation.

On Tue, Aug 7, 2018 at 7:22 PM Pavel Stehule 
wrote:

>
>
> 2018-08-08 0:02 GMT+02:00 Marcelo Lacerda :
>
>> That's a whole different nightmare that I'm expecting.
>>
>
>
>
>>
>> "Yep I double-checked all my functions to see if any would break if I
>> change this field mytable.a into 2 fields mytable.a1 and mytable.a2 and
>> everything is ok."
>>
>> *1 month later*
>>
>> "Why is this error log for this application that I wrote one year ago so
>> big? I haven't changed anything!"
>>
>> Error table mytable has no column a
>> Error table mytable has no column a
>> Error table mytable has no column a
>> ...
>>
>> It's frustrating that the references that a function make to the tables
>> and fields it access aren't taken in account for the validation of whether
>> a change to the structure of the database breaks the APIs that the database
>> exposes.
>>
>
> This cannot be done due to possible dynamic SQL. And plpgsql_check solves
> this issue really well.
>
> Regards
>
> Pavel
>
>
>>
>> On Tue, Aug 7, 2018 at 6:44 PM Merlin Moncure  wrote:
>>
>>> On Tue, Aug 7, 2018 at 2:31 PM Tom Lane  wrote:
>>> >
>>> > Marcelo Lacerda  writes:
>>> > > I was trying to get postgres to warn me that I'm referencing a table
>>> that
>>> > > it doesn't exists inside a function so I was told on the IRC to
>>> check the
>>> > > setting "check_function_bodies", however when I use it in a plpgsql
>>> > > function it doesn't actually check if the tables in the body exist.
>>> Is this
>>> > > the correct behavior?
>>> >
>>> > Yes.  It's supposed to be a syntax check, not a check that the function
>>> > would work when executed.  (Depending on the particular PL you're
>>> using,
>>> > which you didn't mention, it might be a pretty weak syntax check too.)
>>> >
>>> > An example of why a thorough check would be inadvisable is that a
>>> trigger
>>> > function might contain references to OLD and NEW that are in code paths
>>> > protected by checks on the trigger event type.  That could be perfectly
>>> > OK, but a static check couldn't tell.
>>> >
>>> > I believe there are some external tools floating around that check
>>> things
>>> > more aggressively, and hence with a higher rate of false positives.
>>>
>>> The only valid use of this GUC that I can think of is to work around
>>> this problem;
>>> postgres=# create or replace function f() returns void as
>>> $$
>>>   create temp table x(id int);
>>>   delete from x;
>>> $$ language sql;
>>> ERROR:  relation "x" does not exist
>>>
>>> ...I've since given up on writing plain sql functions except for
>>> inline cases though so I don't use it anymore.  Static resolution of
>>> tables is not very useful since the state of the database as the time
>>> of function creation is different than what it might be when the
>>> function is run (as opposed to compiled languages obviously).
>>>
>>> merlin
>>>
>>
>
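
For reference, a minimal usage sketch of the plpgsql_check extension Pavel mentions (assuming it is installed on the server; the function name is just a placeholder):

create extension if not exists plpgsql_check;

-- reports references to missing tables/columns inside the body,
-- which check_function_bodies alone does not catch
select * from plpgsql_check_function('myfunction()');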


Clarification on the release notes of postgresql 12 regarding pg_upgrade

2019-10-04 Thread Marcelo Lacerda
There are a few instances where the release notes seem to indicate that the
administrator should use pg_dump to upgrade a database so that the btree
improvements become available.

Here they are:

1.
> In new btree indexes, the maximum index entry length is reduced by eight
> bytes, to improve handling of duplicate entries (Peter Geoghegan)
> ...
> This means that a REINDEX operation on an index pg_upgrade'd from a
> previous release could potentially fail.

2.
> Improve performance and space utilization of btree indexes with many
> duplicates (Peter Geoghegan, Heikki Linnakangas)
> ...
> Indexes pg_upgrade'd from previous releases will not have these benefits.

3.
> Allow multi-column btree indexes to be smaller (Peter Geoghegan, Heikki
> Linnakangas)
> ...
> Indexes pg_upgrade'd from previous releases will not have these benefits.


My questions are:

1. Is this a current limitation of pg_upgrade that will be dealt with later?

2. Are we going to see more such cases where pg_upgrade leaves the
database unable to benefit from newer features?

3. What's the recommendation for administrators with databases that are too
large to be upgraded with pg_dump?
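
Not an authoritative answer, but since the new btree format only applies to indexes created on the new version, the usual way to get the benefits after a pg_upgrade is to rebuild the indexes in place rather than dump/restore the whole database; PostgreSQL 12 can do that without blocking writes (the index and table names below are placeholders):

-- rebuild a single index without blocking concurrent writes (new in v12)
REINDEX INDEX CONCURRENTLY my_index;

-- or rebuild everything for one table during a maintenance window
REINDEX TABLE my_table;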


Problem enabling LDAP login

2021-02-22 Thread Marcelo Lacerda
I'm having some trouble configuring LDAP login to postgres. I have
configured LDAP in pg_hba.conf and postgres picks up the correct
configuration during login, but I get an error message whenever I attempt to
log in with psql to a database named teste.

psql: error: could not connect to server: FATAL:  LDAP authentication
failed for user "my_user_id"

Looking at the log I get a more detailed message but it isn't very useful
either

2021-02-22 10:41:29 -03:172.17.1.139(54178):my_user_id@teste:[50727]: LOG:
 could not search LDAP for filter "(sAMAccountName=my_user_id)" on server
"my_ldap_server": Operations error
2021-02-22 10:41:29 -03:172.17.1.139(54178):my_user_id@teste:[50727]:
DETAIL:  LDAP diagnostics: 04DC: LdapErr: DSID-0C09075A, comment: In
order to perform this operation a successful bind must be completed on the
connection., data 0, v1db1
2021-02-22 10:41:29 -03:172.17.1.139(54178):my_user_id@teste:[50727]:
FATAL:  LDAP authentication failed for user "my_user_id"
2021-02-22 10:41:29 -03:172.17.1.139(54178):my_user_id@teste:[50727]:
DETAIL:  Connection matched pg_hba.conf line 96: "hostall
all 172.17.1.0/24 ldap ldapserver="my_ldap_server"
ldapbasedn="mybasedn" ldapsearchattribute="sAMAccountName"
ldapbinddn="CN=my_bind_dn" ldapbindpasswd="my_bind_password"

Searching the postgres source code, I was able to find where the problem
happens [1]; however, it makes things a bit more confusing because, from my
understanding, postgres was able to successfully bind to the LDAP server at
line 2676, but at line 2700 it reported that a bind was necessary before
querying.

Looking at the TCP packets with tcpdump, I was able to notice that openldap
does in fact receive the my_user_id information from the server, but for some
reason it tries to perform another operation afterwards and fails.

Can anyone help me with this?

[1] -
https://github.com/postgres/postgres/blob/ca3b37487be333a1d241dab1bbdd17a211a88f43/src/backend/libpq/auth.c#L2700


Re: Re: Problem enabling LDAP login

2021-02-22 Thread Marcelo Lacerda
Thank you João, that solved the problem!


In-depth commercial postgresql training

2019-06-27 Thread Marcelo Lacerda
Our company is looking for commercial training in postgresql. We want a
training option that's as in-depth as possible (going as far as being able
to read and patch postgresql source code). Is there any company that offers
something like that?

Another important thing to mention is that due to legal reasons we can't
make payments in any currency other than BRL.


Yum Update nothing provides libarmadillo.so.12()(64bit) needed by gdal36-libs-3.6.4-6PGDG.rhel9.x86_64 from pgdg-common

2024-02-02 Thread Marcelo Marques
*PROBLEM*

*yum update nothing provides libarmadillo.so.12()(64bit) needed by
gdal36-libs-3.6.4-6PGDG.rhel9.x86_64 from pgdg-common  *

[root@rhel9-pg14 ~]# yum update --exclude=postgis33_14
--exclude=postgresql14*
Updating Subscription Management repositories.
EPEL_9

   26 kB/s | 2.3 kB
00:00
Red Hat Enterprise Linux 9 for x86_64 - AppStream (RPMs)

   50 kB/s | 4.5 kB
00:00
Red Hat CodeReady Linux Builder for RHEL 9 x86_64 (RPMs)

   49 kB/s | 4.5 kB
00:00
Red Hat Enterprise Linux 9 for x86_64 - BaseOS (RPMs)

  40 kB/s | 4.1 kB
00:00
Error:
 Problem: cannot install the best update candidate for package
gdal36-libs-3.6.4-5PGDG.rhel9.x86_64
 * - nothing provides libarmadillo.so.12()(64bit) needed by
gdal36-libs-3.6.4-6PGDG.rhel9.x86_64 from pgdg-common*
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to
use not only best candidate packages)
[root@rhel9-pg14 ~]#

*NOTE:*  --skip-broken still returns the same error message, and --nobest
just skips the gdal36-libs update

*REMARKS*
similar issue https://postgrespro.com/list/thread-id/2679095

*"Pushed GDAL 3.6.4-6 RPMS to the repos built against armadillo 12 for
RHEL 9, 8 and Fedora."*

*RHEL 9.3 DETAILS*

[root@rhel9-pg14 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux release 9.3 (Plow)
[root@rhel9-pg14 ~]#

[root@rhel9-pg14 ~]# rpm -qa | grep postgres | sort
postgresql14-14.10-1PGDG.rhel9.x86_64
postgresql14-contrib-14.10-1PGDG.rhel9.x86_64
postgresql14-libs-14.10-1PGDG.rhel9.x86_64
postgresql14-server-14.10-1PGDG.rhel9.x86_64

[root@rhel9-pg14 ~]# rpm -qa | grep postgis | sort
postgis34_14-3.4.0-1PGDG.rhel9.x86_64
[root@rhel9-pg14 ~]#

[root@rhel9-pg14 ~]# subscription-manager repos --list-enabled
+--+
Available Repositories in /etc/yum.repos.d/redhat.repo
+--+
Repo ID:   rhel-9-for-x86_64-appstream-rpms
Repo Name: Red Hat Enterprise Linux 9 for x86_64 - AppStream (RPMs)
Repo URL:
https://red-inf-cap-p01.esri.com/pulp/content/ESRI/LEP_Patch/cv_patch/content/dist/rhel9/$releasever/x86_64/appstream/os
Enabled:   1

Repo ID:   ESRI_EPEL_9_EPEL_9
Repo Name: EPEL_9
Repo URL:
https://red-inf-cap-p01.esri.com/pulp/content/ESRI/LEP_Patch/cv_patch/custom/EPEL_9/EPEL_9
Enabled:   1

Repo ID:   rhel-9-for-x86_64-baseos-rpms
Repo Name: Red Hat Enterprise Linux 9 for x86_64 - BaseOS (RPMs)
Repo URL:
https://red-inf-cap-p01.esri.com/pulp/content/ESRI/LEP_Patch/cv_patch/content/dist/rhel9/$releasever/x86_64/baseos/os
Enabled:   1

Repo ID:   codeready-builder-for-rhel-9-x86_64-rpms
Repo Name: Red Hat CodeReady Linux Builder for RHEL 9 x86_64 (RPMs)
Repo URL:
https://red-inf-cap-p01.esri.com/pulp/content/ESRI/LEP_Patch/cv_patch/content/dist/rhel9/$releasever/x86_64/codeready-builder/os
Enabled:   1

[root@rhel9-pg14 ~]# ls -la /etc/yum.repos.d/
total 52
drwxr-xr-x.   2 root root85 Jan 31 16:13 .
drwxr-xr-x. 136 root root  8192 Feb  2 10:33 ..
-rw-r--r--.   1 root root 13067 Dec 25 15:11 pgdg-redhat-all.repo
-rw-r--r--.   1 root root 14013 Oct 17 04:10 pgdg-redhat-all.repo.old
-rw-r--r--.   1 root root  4657 Jan 31 16:09 redhat.repo
[root@rhel9-pg14 ~]#

Thanks,
Marcelo Marques
Principal Product Engineer, Esri, www.esri.com


Re: Yum Update nothing provides libarmadillo.so.12()(64bit) needed by gdal36-libs-3.6.4-6PGDG.rhel9.x86_64 from pgdg-common

2024-02-04 Thread Marcelo Marques
BUG #18328: yum update nothing provides libarmadillo.so.12()(64bit) needed
by gdal36-libs-3.6.4-6PGDG.rhel9.x86
<https://www.postgresql.org/message-id/18328-1e16fac373918f71%40postgresql.org>


On Fri, Feb 2, 2024 at 11:05 AM Marcelo Marques 
wrote:

> *PROBLEM*
>
> *yum update nothing provides libarmadillo.so.12()(64bit) needed by
> gdal36-libs-3.6.4-6PGDG.rhel9.x86_64 from pgdg-common  *
>
> [root@rhel9-pg14 ~]# yum update --exclude=postgis33_14
> --exclude=postgresql14*
> Updating Subscription Management repositories.
> EPEL_9
>
>26 kB/s | 2.3 kB
> 00:00
> Red Hat Enterprise Linux 9 for x86_64 - AppStream (RPMs)
>
>50 kB/s | 4.5 kB
> 00:00
> Red Hat CodeReady Linux Builder for RHEL 9 x86_64 (RPMs)
>
>49 kB/s | 4.5 kB
> 00:00
> Red Hat Enterprise Linux 9 for x86_64 - BaseOS (RPMs)
>
> 40 kB/s | 4.1 kB
>   00:00
> Error:
>  Problem: cannot install the best update candidate for package
> gdal36-libs-3.6.4-5PGDG.rhel9.x86_64
>  * - nothing provides libarmadillo.so.12()(64bit) needed by
> gdal36-libs-3.6.4-6PGDG.rhel9.x86_64 from pgdg-common*
> (try to add '--skip-broken' to skip uninstallable packages or '--nobest'
> to use not only best candidate packages)
> [root@rhel9-pg14 ~]#
>
> *NOTE:*  --skip-broken still returns the same error message, and --nobest
> just skips the gdal36-libs update
>
> *REMARKS*
> similar issue https://postgrespro.com/list/thread-id/2679095
>
> *"Pushed GDAL 3.6.4-6 RPMS to the repos built against armadillo 12 for RHEL 
> 9, 8 and Fedora."*
>
> *RHEL 9.3 DETAILS*
>
> [root@rhel9-pg14 ~]# cat /etc/redhat-release
> Red Hat Enterprise Linux release 9.3 (Plow)
> [root@rhel9-pg14 ~]#
>
> [root@rhel9-pg14 ~]# rpm -qa | grep postgres | sort
> postgresql14-14.10-1PGDG.rhel9.x86_64
> postgresql14-contrib-14.10-1PGDG.rhel9.x86_64
> postgresql14-libs-14.10-1PGDG.rhel9.x86_64
> postgresql14-server-14.10-1PGDG.rhel9.x86_64
>
> [root@rhel9-pg14 ~]# rpm -qa | grep postgis | sort
> postgis34_14-3.4.0-1PGDG.rhel9.x86_64
> [root@rhel9-pg14 ~]#
>
> [root@rhel9-pg14 ~]# subscription-manager repos --list-enabled
> +--+
> Available Repositories in /etc/yum.repos.d/redhat.repo
> +--+
> Repo ID:   rhel-9-for-x86_64-appstream-rpms
> Repo Name: Red Hat Enterprise Linux 9 for x86_64 - AppStream (RPMs)
> Repo URL:
> https://red-inf-cap-p01.esri.com/pulp/content/ESRI/LEP_Patch/cv_patch/content/dist/rhel9/$releasever/x86_64/appstream/os
> Enabled:   1
>
> Repo ID:   ESRI_EPEL_9_EPEL_9
> Repo Name: EPEL_9
> Repo URL:
> https://red-inf-cap-p01.esri.com/pulp/content/ESRI/LEP_Patch/cv_patch/custom/EPEL_9/EPEL_9
> Enabled:   1
>
> Repo ID:   rhel-9-for-x86_64-baseos-rpms
> Repo Name: Red Hat Enterprise Linux 9 for x86_64 - BaseOS (RPMs)
> Repo URL:
> https://red-inf-cap-p01.esri.com/pulp/content/ESRI/LEP_Patch/cv_patch/content/dist/rhel9/$releasever/x86_64/baseos/os
> Enabled:   1
>
> Repo ID:   codeready-builder-for-rhel-9-x86_64-rpms
> Repo Name: Red Hat CodeReady Linux Builder for RHEL 9 x86_64 (RPMs)
> Repo URL:
> https://red-inf-cap-p01.esri.com/pulp/content/ESRI/LEP_Patch/cv_patch/content/dist/rhel9/$releasever/x86_64/codeready-builder/os
> Enabled:   1
>
> [root@rhel9-pg14 ~]# ls -la /etc/yum.repos.d/
> total 52
> drwxr-xr-x.   2 root root85 Jan 31 16:13 .
> drwxr-xr-x. 136 root root  8192 Feb  2 10:33 ..
> -rw-r--r--.   1 root root 13067 Dec 25 15:11 pgdg-redhat-all.repo
> -rw-r--r--.   1 root root 14013 Oct 17 04:10 pgdg-redhat-all.repo.old
> -rw-r--r--.   1 root root  4657 Jan 31 16:09 redhat.repo
> [root@rhel9-pg14 ~]#
>
> Thanks,
> Marcelo Marques
> Principal Product Engineer, Esri, www.esri.com
>
>


Re: Yum Update nothing provides libarmadillo.so.12()(64bit) needed by gdal36-libs-3.6.4-6PGDG.rhel9.x86_64 from pgdg-common

2024-02-06 Thread Marcelo Marques
I appreciate the reply.
We have an internal Satellite that pulls from the EPEL repo.
The Satellite repos are refreshed daily by the IT department.
It seems, then, that the EPEL repo has not been refreshed.
I will check with my IT department.
Thanks again for your help.

On Mon, Feb 5, 2024 at 8:18 AM Devrim Gündüz  wrote:

> Hi,
>
> On Mon, 2024-02-05 at 08:16 -0800, Marcelo Marques wrote:
> > Yes, the EPEL repo is enabled.
>
> (Please keep the list CC'ed)
>
> Armadillo 12 packages *are* in the EPEL repo:
>
> https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/a/
>
> If you have a local mirror, make sure that it is enabled. Also make sure
> that armadillo is not in the excludes list in either dnf.conf or in the
> repo config files.
>
> -HTH
>
> Regards,
> --
> Devrim Gündüz
> Open Source Solution Architect, PostgreSQL Major Contributor
> Twitter: @DevrimGunduz , @DevrimGunduzTR
>


Re: Yum Update nothing provides libarmadillo.so.12()(64bit) needed by gdal36-libs-3.6.4-6PGDG.rhel9.x86_64 from pgdg-common

2024-02-06 Thread Marcelo Marques
The gdal-libs package belongs to PostGIS.
PostGIS Support Ticket #5664 -
https://trac.osgeo.org/postgis/ticket/5664#ticket

On Tue, Feb 6, 2024 at 10:12 AM Marcelo Marques 
wrote:

> I appreciate the reply.
> We have an internal Satellite that we pull from the EPEL repo.
> The Satellite Repos are refreshed daily by the IT department.
> It seems then that the EPEL repo has not been refreshed.
> I will check with my IT department.
> Thanks again for your help.
>
> On Mon, Feb 5, 2024 at 8:18 AM Devrim Gündüz  wrote:
>
>> Hi,
>>
>> On Mon, 2024-02-05 at 08:16 -0800, Marcelo Marques wrote:
>> > Yes, the EPEL repo is enabled.
>>
>> (Please keep the list CC'ed)
>>
>> Armadillo 12 packages *are* in the EPEL repo:
>>
>> https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/a/
>>
>> If you have a local mirror, make sure that it is enabled. Also make sure
>> that armadillo is not in the excludes list in either dnf.conf or in the
>> repo config files.
>>
>> -HTH
>>
>> Regards,
>> --
>> Devrim Gündüz
>> Open Source Solution Architect, PostgreSQL Major Contributor
>> Twitter: @DevrimGunduz , @DevrimGunduzTR
>>
>


Can't Remote connection by IpV6

2024-06-06 Thread Marcelo Marloch
Hi everyone, is it possible to connect remotely through IPv6? IPv4 works fine
but I can't connect through v6.

postgresql.conf is set to listen on all addresses and pg_hba.conf is set with
"host all all :: md5"; I've tried ::/0 and ::0/0 but had no success.

My provider is out of IPv4 addresses and they're handing out IPs behind CGNAT;
if I want a public IPv4 address I have to sign up for a very expensive service.

thanks a lot


ERROR: could not open relation with OID XXXX

2024-08-25 Thread Marcelo Zabani
Hi all,

I can reproduce the error in the subject from time to time when querying
catalog tables while DDL is happening concurrently. Here's a bash script
that reproduces it (not always, you might have to run it many times until
you see ERROR:  could not open relation with OID ):

#!/usr/bin/env bash
psql -c "create table test(x serial primary key); select oid, relname from
pg_class where relname='test'"
# The next two queries will run concurrently
psql -c "select oid, relname, pg_sleep(3), pg_get_indexdef(oid) from
pg_class join pg_index on indexrelid=pg_class.oid WHERE
relname='test_pkey';" 2>&1 1>/tmp/pgbug.log &
sleep 1
psql -c "drop table test"
cat /tmp/pgbug.log
wait

I am confused as to how this is possible. I assume that if the row with the
test_pkey index exists in the pg_index catalog table, then the snapshot of
the catalog tables contains the test table itself and is generally
consistent, so querying the catalog should not run into such errors.

I've seen this happen in Production without pg_sleep in the mix, too. I
added pg_sleep to the example above only because it makes the error easier
to reproduce.

Is there something I can do to avoid this? Is my understanding of how the
catalog tables work wrong?

Thanks,
Marcelo.


Re: ERROR: could not open relation with OID XXXX

2024-08-25 Thread Marcelo Zabani
> What version (including minor number)?

On my computer I can reproduce the error with v16.1. At my job I know it's
v15, but I can't access it right now, so I don't know the minor version. In any
case, at my job we run much bigger queries; I'll elaborate on why below.

> Just out of curiosity, *WHY* do you do this?  It's never occurred to me
to do that; maybe it's something useful that I've been overlooking.

A project developed by me, codd - https://github.com/mzabani/codd - applies
PostgreSQL migrations and checks tables, column names/order/types,
indexes, etc. ("the full schema") to ensure databases in every environment
match what developers have on their computers. This is why the queries are
bigger, e.g.
https://github.com/mzabani/codd/blob/master/src/Codd/Representations/Database/Pg12.hs#L523-L540

One as-yet-untested hypothesis is that codd is picking up temporary tables
that are created during deployment and soon destroyed by the application, and
that postgres runs into this error because it executes some of these
functions _before_ filtering out temporary relations (WHERE relpersistence
<> 't'). This might be possible depending on the query plan, I believe. But
again, it is an untested hypothesis.

On Sun, Aug 25, 2024 at 11:31 AM Ron Johnson 
wrote:

> On Sun, Aug 25, 2024 at 9:42 AM Marcelo Zabani  wrote:
>
>> Hi all,
>>
>> I can reproduce the error in the subject from time to time when querying
>> catalog tables while DDL is happening concurrently. Here's a bash script
>> that reproduces it (not always, you might have to run it many times until
>> you see ERROR:  could not open relation with OID ):
>>
> [snip]
>
>> I've seen this happen in Production without pg_sleep in the mix, too. I
>> added pg_sleep to the example above only because it makes the error easier
>> to reproduce.
>>
>
> What version (including minor number)?
>
>
>> Is there something I can do to avoid this? Is my understanding of how the
>> catalog tables work wrong?
>>
>
> Just out of curiosity, *WHY* do you do this?  It's never occurred to me
> to do that; maybe it's something useful that I've been overlooking.
>
> --
> Death to America, and butter sauce.
> Iraq lobster!
>


Re: ERROR: could not open relation with OID XXXX

2024-08-25 Thread Marcelo Zabani
> we do some special stuff for catalogs

That is good to know, thanks!

> I believe you could actually lock the pg_class rows for update. Just add
FOR UPDATE at the end of the query.

Thanks, but I tried that and got "ERROR:  permission denied for table
pg_class", even if I try it only for tables the user owns.

At least for the use-case of avoiding this error due to temporary
tables/indexes (which are a part of normal application execution), I was
thinking of using materialized CTEs that filter those out, and only after
that using other functions that, for example, take OIDs and return
definitions. Other kinds of DDL that create non-temporary tables can be
"blamed" on developers in my case.
Do you think using those materialized CTEs could help? And do you think this
can be considered a bug that I should report, or is it just too edge-casey to
consider?
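
For illustration, the kind of query I have in mind is something like this (a
sketch, not codd's actual query; it assumes filtering on relpersistence is
enough to exclude temporary objects):

  WITH non_temp_indexes AS MATERIALIZED (
      SELECT c.oid
      FROM pg_class c
      JOIN pg_index i ON i.indexrelid = c.oid
      WHERE c.relpersistence <> 't'
  )
  SELECT oid, pg_get_indexdef(oid) FROM non_temp_indexes;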

Regards.

On Sun, Aug 25, 2024 at 12:06 PM Tomas Vondra  wrote:

> On 8/25/24 15:42, Marcelo Zabani wrote:
> > Hi all,
> >
> > I can reproduce the error in the subject from time to time when querying
> > catalog tables while DDL is happening concurrently. Here's a bash script
> > that reproduces it (not always, you might have to run it many times
> > until you see ERROR:  could not open relation with OID ):
> >
> > #!/usr/bin/env bash
> > psql -c "create table test(x serial primary key); select oid, relname
> > from pg_class where relname='test'"
> > # The next two queries will run concurrently
> > psql -c "select oid, relname, pg_sleep(3), pg_get_indexdef(oid) from
> > pg_class join pg_index on indexrelid=pg_class.oid WHERE
> > relname='test_pkey';" 2>&1 1>/tmp/pgbug.log &
> > sleep 1
> > psql -c "drop table test"
> > cat /tmp/pgbug.log
> > wait
> >
> > I am confused as to how this is possible. I assume if the row with the
> > test_pkey index exists in the pg_index catalog table, that the snapshot
> > of the catalog tables contains the test table itself and is generally
> > consistent, so querying the catalog should not run into such errors.
> >
>
> I think you're assuming the whole query runs with a single snapshot, and
> AFAIK that's not quite accurate - we do some special stuff for catalogs,
> for example. There's also the additional complexity of maintaining a
> cache on catalogs, invalidating it, etc.
>
> I don't have a great simple "this happens because X" explanation, but a
> lot of this relies on proper locking - in particular, that we lock all
> the objects before execution, which also invalidates all the caches etc.
>
> But that can't happen here, because we only realize we need to access
> the OID very late in the execution, when we get to pg_get_indexdef.
>
> > I've seen this happen in Production without pg_sleep in the mix, too. I
> > added pg_sleep to the example above only because it makes the error
> > easier to reproduce.
> >
>
> It's a race condition, essentially. The sleep just makes it easier to
> hit, but it can happen without it.
>
> > Is there something I can do to avoid this? Is my understanding of how
> > the catalog tables work wrong?
> >
>
> I believe you could actually lock the pg_class rows for update. Just add
> FOR UPDATE at the end of the query.
>
>
> regards
>
> --
> Tomas Vondra
>


Re: ERROR: could not open relation with OID XXXX

2024-08-25 Thread Marcelo Zabani
Thank you all for your comments.

> I would think that the queries in that case would be running as a
superuser in order to do the migrations.

Users of codd can choose the role that applies their migrations. Codd even
supports individual migrations running with ad-hoc users (so that a
migration can use the _postgres_ user to create the application's database,
for example) and users are free to add statements like `SET ROLE` inside
their migrations, too. So it's sadly not possible AFAICT to force superuser
onto them.

But I think I have plenty of things to try to avoid this problem, from
retrying like Tomas suggested to materialized CTEs that filter out
temporary relations before functions like pg_get_indexdef are called.

I will give these things a shot.

Regards.

On Sun, Aug 25, 2024 at 1:13 PM Adrian Klaver 
wrote:

> On 8/25/24 08:36, Marcelo Zabani wrote:
> >  > we do some special stuff for catalogs
> >
> > That is good to know, thanks!
> >
> >  > I believe you could actually lock the pg_class rows for update. Just
> > add FOR UPDATE at the end of the query.
> >
> > Thanks, but I tried that and got "ERROR:  permission denied for table
> > pg_class", even if I try it only for tables the user owns.
> >
>
> As I understand it this issue came up in:
>
> https://github.com/mzabani/codd
>
> I would think that the queries in that case would be running as a
> superuser in order to do the migrations.
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: RESET, NULL and empty-string valued settings and transaction isolation

2024-10-19 Thread Marcelo Zabani
Thank you both for the explanations and the link to the discussion of
Pavel's patches.


RESET, NULL and empty-string valued settings and transaction isolation

2024-10-19 Thread Marcelo Zabani
Hi, according to the docs, RESET restores a setting to its default value, where
"The default value is defined as the value that the parameter would have
had, if no SET had ever been issued for it in the current session".

This confuses me, given that the value starts out NULL in the session and
then turns into an empty string after a SET followed by a RESET:
$ psql
=> select current_setting('my.test', true) is null; -- true
=> set my.test = 'abc';
=> reset my.test;
=> select current_setting('my.test', true) is null; -- false
=> select current_setting('my.test', true)=''; -- true

A similar effect happens with transactions and SET LOCAL:
=> begin;
=*> set local my.test='abc';
=*> rollback; -- commit works too
=> select current_setting('my.test', true) = ''; -- true

Is this expected? Even if I am misunderstanding the docs, the effect isn't
very nice: SQL like
current_setting('my.some_boolean_setting')::boolean will fail after a
transaction has SET LOCAL that setting, a side-effect that can be particularly
confusing and that basically requires nullif(.., '') or other explicit
checks around every current_setting call site in practice.
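
In practice that seems to mean guarding every read, for example something like
this (a sketch):

  SELECT coalesce(nullif(current_setting('my.some_boolean_setting', true), ''),
                  'false')::boolean;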

Thanks in advance,
Marcelo.


Re: Logging queries executed by SPI_execute

2025-02-03 Thread Marcelo Fernandes
On Mon, Feb 3, 2025 at 6:46 PM Pavel Stehule  wrote:
> The queries executed by SPI are never executed on the top level. These 
> queries are marked as nested.
>
> So you need to use auto_explain 
> https://www.postgresql.org/docs/current/auto-explain.html with active 
> auto_explain.log_nested_statements

Thanks Pavel, I have tried that but couldn't get it to work.

I have added this to my postgresql.conf file:

session_preload_libraries = 'auto_explain'
auto_explain.log_nested_statements = true

After running pg_repack, which calls the SPI_execute* functions, I can't see
the corresponding statements in the logs.

I have a workaround in place, which is to add elog calls such as

elog(LOG, "...")

to capture the queries. But of course, a postgresql.conf setting would be
handier.
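
Concretely, the workaround looks something like this (a sketch with made-up
names, not pg_repack's actual code; it assumes an SPI connection is already
open):

  #include "postgres.h"
  #include "executor/spi.h"

  static void
  run_and_log(const char *sql)
  {
      int ret;

      /* write the statement to the server log before handing it to SPI */
      elog(LOG, "SPI statement: %s", sql);

      ret = SPI_execute(sql, false, 0);   /* read_only = false, no row limit */
      if (ret < 0)
          elog(ERROR, "SPI_execute failed with code %d", ret);
  }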

Best,
Marcelo




Logging queries executed by SPI_execute

2025-02-02 Thread Marcelo Fernandes
Hi there,

I have been trying to debug which queries an extension is firing. After reading
the code for the extension, I noticed that all the statements are issued via
the SPI interface, more specifically using the SPI_execute* family of
functions.

However, these statements don't seem to show up in the log file, which has
become a bit of a dead end in my attempt to fully analyse which queries are
fired in the end.

My postgresql.conf is fairly simple, but I have added "log_statement = 'all'"
and thus was expecting to see information for these statements too.

Is there anything I am missing? Some configuration or perhaps another way to
find out about these statements?

It may not be relevant, but the extension I am having a look at is pg_repack if
anyone is interested. I am trying to get a deeper understanding of what the
function repack_apply does.

Thanks,
Marcelo.




What is the story behind _SPI_PLAN_MAGIC?

2025-02-02 Thread Marcelo Fernandes
Hi there,

Reading through the SPI code I see this definition:

  #define _SPI_PLAN_MAGIC 569278163

Which is used in the _SPI_plan struct in src/include/executor/spi_priv.h:

  typedef struct _SPI_plan
  {
  int magic;
  ...
  }

What is its purpose?

Thank you.
Marcelo




Re: What is the story behind _SPI_PLAN_MAGIC?

2025-02-02 Thread Marcelo Fernandes
On Mon, Feb 3, 2025 at 3:17 PM Tom Lane  wrote:
> Just to catch programming errors, ie passing the wrong pointer
> value to some SPI function.  See the checks for it in spi.c.

Aha! Perfect, I thought it was something like that. Thank you!
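
For anyone else who lands here, the check is roughly of this shape
(paraphrased, not the exact spi.c source):

  if (plan == NULL || plan->magic != _SPI_PLAN_MAGIC)
      elog(ERROR, "invalid SPI plan pointer passed to SPI function");
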
Marcelo.




Re: Best Approach for Swapping a Table with its Copy

2025-02-13 Thread Marcelo Fernandes
On Fri, Feb 14, 2025 at 12:35 PM Adrian Klaver
 wrote:
> 1) In a previous post you said:
>
> "Yes, in this scenario the copy is already created, and triggers keep
> the copy in sync with the original table."
>
> In that case the copy will already have TOAST tables associated with it.

If we follow the idea behind repack_swap, we would have swapped the OIDs of
the two tables.

This means you have to swap the TOAST tables in the catalogue as well.

Otherwise the new table will be linked to the old TOAST table and the old
table will be linked to the new one; we want the opposite.

> 2) What are the FK relationships and how many?

I think that for theoretical purposes we can just say there are "N" FKs.
Because no matter how many there are, they need to be updated to point towards
the new table.

> Also could you just not add the FK's as NOT VALID?

That's an interesting compromise I hadn't thought of. Thanks.

However, ideally I'd like to swap the catalogue entries instead, as that would
be a cleaner approach: it wouldn't require dropping the old constraints,
creating NOT VALID ones, and then optionally validating them later.
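
For reference, the compromise being discussed would look something like this
(a sketch; table and constraint names are illustrative):

  ALTER TABLE referencing_table DROP CONSTRAINT fk_original;
  ALTER TABLE referencing_table
      ADD CONSTRAINT fk_original FOREIGN KEY (original_id)
      REFERENCES copy_table (id) NOT VALID;
  -- and optionally later, without a long-lived exclusive lock:
  ALTER TABLE referencing_table VALIDATE CONSTRAINT fk_original;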

Regards,
  - Marcelo




Best Approach for Swapping a Table with its Copy

2025-02-12 Thread Marcelo Fernandes
Hi folks,

I have a scenario where I need to swap an original table with a copy of that
table.

The copy has an exclusion constraint that the original does not have. The main
challenge is to ensure that the swap does not break any existing foreign keys
to the original table and handles the associated TOAST tables correctly.

Both tables are fairly large and exist in a system where there are no
maintenance time windows where the application is shut down.

My key questions are:

- What is the best process for swapping the original table with the copy in
  such a way that the foreign key relations are preserved?

- Are there any special considerations for managing the TOAST tables during
  this swap?

- Should I perform this operation in multiple steps, or is there a
  straightforward way to achieve this atomically?

- Are there any risks of potential issues I should be aware of when doing this
  swap? Specifically related to foreign key integrity and TOAST data?

Thank you!
  - Marcelo




Re: Best Approach for Swapping a Table with its Copy

2025-02-16 Thread Marcelo Fernandes
On Sat, Feb 15, 2025 at 4:12 AM Greg Sabino Mullane  wrote:
> The pg_repack link posted earlier has the details on how it is done. But 
> messing with system catalogs like this is highly discouraged, for good 
> reasons. Still, if you need to go that route, test heavily and post the 
> solutions here for feedback.

I'm trying to digest what pg_repack does by reproducing the same behaviour
using SQL. I have come up with the script below to test the whole scenario, but
I have two major problems:

- The foreign keys are not being updated to point to the new table.

- pg_repack seems to update relcache entries. I am not sure how to do that with
  SQL. See:
  
https://github.com/marcelofern/pg_repack/blob/9f36c65bd57ca1b228025687843758556b56df8e/lib/repack.c#L1373-L1387

And for reference, here is the script I have used so far. I'm keen for any
suggestions on how to swap the foreign keys so that they point at the new
table.

```sql
-- Create the original table that will be later swapped by its copy.
DROP TABLE IF EXISTS original CASCADE;
CREATE TABLE original (
id SERIAL PRIMARY KEY,
name VARCHAR(5000) NOT NULL, -- necessary for the TOAST table.
value INTEGER NOT NULL
);
-- Insert 10_000 rows into it.
INSERT INTO original (name, value)
SELECT
'item_' || generate_series(1, 10000) AS name,
(generate_series(1, 10000) % 10000) + 1 AS value;

-- Create the copy table, this table will be swapped for the original table
-- later
DROP TABLE IF EXISTS copy;
CREATE TABLE copy (
id SERIAL PRIMARY KEY,
name VARCHAR(5000) NOT NULL,
value INTEGER NOT NULL
);
-- Pull all the data from the original table into the copy table.
INSERT INTO copy SELECT id, name, value FROM ONLY original;

-- Create a table with a foreign key to the original table to verify if the
-- swap addresses the foreign key table.
DROP TABLE IF EXISTS table_with_fk;
CREATE TABLE table_with_fk (
id SERIAL PRIMARY KEY,
original_id INTEGER NOT NULL,
CONSTRAINT fk_original FOREIGN KEY (original_id)
REFERENCES original(id)
ON DELETE CASCADE
ON UPDATE CASCADE
);
-- Insert 10_000 rows into it.
INSERT INTO table_with_fk (original_id)
SELECT generate_series(1, 10000);

-- Analogously, create a table with a NOT VALID foreign key.
DROP TABLE IF EXISTS table_with_not_valid_fk;
CREATE TABLE table_with_not_valid_fk (
id SERIAL PRIMARY KEY,
original_id INTEGER NOT NULL,
CONSTRAINT not_valid_fk_original FOREIGN KEY (original_id)
REFERENCES original(id)
ON DELETE CASCADE
ON UPDATE CASCADE
NOT VALID
);
-- Insert 10_000 rows
INSERT INTO table_with_not_valid_fk (original_id)
SELECT generate_series(1, 10000);

-- All tables must have 10_000 rows in them.
SELECT count(*) FROM original;
SELECT count(*) FROM copy;
SELECT count(*) FROM table_with_fk;
SELECT count(*) FROM table_with_not_valid_fk;

-- See relation info for the tables and their TOASTs.
SELECT
X.relname,
X.reltablespace,
X.oid,
X.reltoastrelid,
X.relowner,
X.relkind,
X.relfrozenxid,
X.relminmxid,
X.relpages,
X.reltuples,
X.relallvisible,
X.relfilenode,
TOAST_X.indexrelid as toast_indexrelid
FROM pg_catalog.pg_class X
LEFT JOIN
  pg_catalog.pg_index TOAST_X ON X.reltoastrelid = TOAST_X.indrelid
AND TOAST_X.indisvalid
WHERE X.oid IN (('original')::regclass, ('copy')::regclass)
ORDER BY X.relname;
-- -[ RECORD 1 ]----+---------
-- relname          | copy
-- reltablespace    | 0
-- oid              | 22522
-- reltoastrelid    | 22526
-- relowner         | 10
-- relkind          | r
-- relfrozenxid     | 2068
-- relminmxid       | 1
-- relpages         | 64
-- reltuples        | 10000
-- relallvisible    | 64
-- relfilenode      | 22522
-- toast_indexrelid | 22527
-- -[ RECORD 2 ]----+---------
-- relname          | original
-- reltablespace    | 0
-- oid              | 22513
-- reltoastrelid    | 22517
-- relowner         | 10
-- relkind          | r
-- relfrozenxid     | 2065
-- relminmxid       | 1
-- relpages         | 64
-- reltuples        | 10000
-- relallvisible    | 64
-- relfilenode      | 22513
-- toast_indexrelid | 22518

-- Take note of the dependencies for the toast table to compare later.
SELECT
d1.objid AS original_objid,
d2.objid AS copy_objid
FROM pg_depend d1, pg_depend d2
WHERE d1.objid = ('original'::regclass) AND d2.objid = ('copy'::regclass);
-- -[ RECORD 1 ]--+--
-- original_objid | 22513
-- copy_objid | 22522

-- Start table swap inside a transaction.
BEGIN;
LOCK TABLE original, copy IN ACCESS EXCLUSIVE MODE;

SELECT * FROM pg_class
WHERE relname in ('original', 'copy')
FOR UPDATE;

WITH swapped AS (
SELECT
c1.oid AS original_oid, c2.oid AS copy_oid,
c1.relfilenode AS original_filenode, c2.relfilenode AS copy_filenode,
c1.reltablespace AS original_tablespace, c2.reltablespace AS
copy_tablespace,
c1.reltoastrelid AS original_toast, c2.reltoastrelid AS copy_toast,
c1.relfrozenxi

Re: Best Approach for Swapping a Table with its Copy

2025-02-13 Thread Marcelo Fernandes
On Thu, Feb 13, 2025 at 7:37 PM Laurenz Albe  wrote:
> Yes, but only if you are willing to write C code that runs inside the
> database server.  That way, you can do anything (and cause arbitrary
> damage).
>
> The big challenge here would be to do the swap in a safe way.  How do
> you intend to guarantee that the foreign keys are valid without a table
> scan?  How do you handle concurrent data modifications?

Exactly! This is part of what I am trying to figure out (-:

The plan that I have in mind so far to swap a table safely is:

1. Run the following query to grab information about the toast and the toast
   index of the tables involved in the swapping:

  SELECT
X.oid,
X.reltoastrelid,
X.relowner,
TOAST_X.indexrelid
  FROM pg_catalog.pg_class X
LEFT JOIN
  pg_catalog.pg_index TOAST_X ON X.reltoastrelid =
TOAST_X.indrelid AND TOAST_X.indisvalid
  WHERE X.oid = ('my_sweet_table')::regclass;

2. Open a transaction and acquire an access exclusive lock.
3. Ensure that the cloned table has the same owner as the original table

  ALTER TABLE copy_table OWNER TO owner_of_the_original_table;

4. Now I need to swap all the data in the catalogue tables that point to the
   old table and the toast to point to the new one and vice-versa (in case I
   need to rollback).
5. Commit

Step 4 is what I am stuck on. What bits of the catalogue do I need to change?
And for what reasons?

It may be a risky operation, as you say, and I might decide not to pursue it
in the end, but first I must understand (-:

Regards,
  - Marcelo




Re: Best Approach for Swapping a Table with its Copy

2025-02-13 Thread Marcelo Fernandes
On Fri, Feb 14, 2025 at 4:09 AM Greg Sabino Mullane  wrote:
> It is surgically replacing all pointers to the old data with pointers to the 
> new data. Yes, with lots of system catalog shenanigans.

Love your analogy Greg, thanks for that.

> It's technically possible to do something similar for your use case, but it's 
> not trivial. All the cab to trailer wires must be precisely changed. 
> Everything directly related to the data must be swapped: heap, indexes, toast.

I'd really appreciate knowing more about how I can do this, as I think this is
the crux of what I am trying to solve.

If you have pointers, thoughts, or resources where I can better understand
what's involved, that would be much appreciated.

In terms of where I am at currently, I summarised in my previous reply:

On Fri, Feb 14, 2025 at 11:59 AM Marcelo Fernandes  wrote:
> The plan that I have in mind so far to swap a table safely is:
>
> 1. Run the following query to grab information about the toast and the toast
>index of the tables involved in the swapping:
>
>   SELECT
> X.oid,
> X.reltoastrelid,
> X.relowner,
> TOAST_X.indexrelid
>   FROM pg_catalog.pg_class X
> LEFT JOIN
>   pg_catalog.pg_index TOAST_X ON X.reltoastrelid =
> TOAST_X.indrelid AND TOAST_X.indisvalid
>   WHERE X.oid = ('my_sweet_table')::regclass;
>
> 2. Open a transaction and acquire an access exclusive lock.
> 3. Ensure that the cloned table has the same owner as the original table
>
>   ALTER TABLE copy_table OWNER TO owner_of_the_original_table;
>
> 4. Now I need to swap all the data in the catalogue tables that point to the
>old table and the toast to point to the new one and vice-versa (in case I
>need to rollback).
> 5. Commit
>
> Step 4 is what I am stuck on. What bits of the catalogue do I need to change?
> And for what reasons?
>
> It may be a risky operation, as you say, and I might decide not to pursue it
> in the end, but first I must understand (-:

Regards,
  - Marcelo




Re: Best Approach for Swapping a Table with its Copy

2025-02-12 Thread Marcelo Fernandes
On Thu, Feb 13, 2025 at 1:33 PM Adrian Klaver  wrote:
> Not seeing it:
>
> https://reorg.github.io/pg_repack/
>
> "Details
> Full Table Repacks
>
> [...]
>
> pg_repack will only hold an ACCESS EXCLUSIVE lock for a short period
> during initial setup (steps 1 and 2 above) and during the final
> swap-and-drop phase (steps 6 and 7). For the rest of its time, pg_repack
> only needs to hold an ACCESS SHARE lock on the original table, meaning
> INSERTs, UPDATEs, and DELETEs may proceed as usual."
>
> During the ACCESS EXCLUSIVE stages you will not have access.

The operations under ACCESS EXCLUSIVE are short-lived. In my benchmarks, the
"repack_swap" function finishes on the order of milliseconds.
The operations appear to be catalogue-only. I'd love someone to confirm this,
though, because I am not a specialist in C or Postgres extensions.

Here is the code if you want to have a look:

https://github.com/reorg/pg_repack/blob/306b0d4f7f86e807498ac00baec89ecd33411398/lib/repack.c#L843

What I am after is the same, but I seek a deeper understanding of what it does,
and why it does it. For example, it swaps relfilenode. Why?

> Not only that with pg_repack you are not changing the table definition,
> whereas in your case you are introducing a new constraint and associated
> index.

Correct, but I am not using pg_repack. I have cloned the table using my own
tool. I'm citing pg_repack because it does perform a table swap (a bloated
table is swapped for a new, non-bloated table).

Given that pg_repack is known to work well on large databases, it follows
that its approach to swapping tables is robust.

> Do you have a dev setup where you can test alternatives with a test
> sample of data?

Do you mean alternatives to table-cloning-and-swapping?

Regards,
  - Marcelo




Re: Best Approach for Swapping a Table with its Copy

2025-02-12 Thread Marcelo Fernandes
On Thu, Feb 13, 2025 at 10:02 AM Adrian Klaver
 wrote:
> This needs more information:
> 1) Postgres version.

That would be for Postgres 15.

> 2) The table definition.

This is a simplified version of the original table:

CREATE TABLE bookings (
id SERIAL PRIMARY KEY,
resource_id INT NOT NULL,
start_date DATE NOT NULL,
end_date DATE NOT NULL
);

> 3) The exclusion constraint definition.

The copy table would have an exclusion constraint such as:

ALTER TABLE bookings
ADD CONSTRAINT no_date_overlap_for_resource_id
EXCLUDE USING gist (
resource_id WITH =,
daterange(start_date, end_date, '[]') WITH &&
);

> 4) Definition of what 'fairly large' is.

This table is over 400GB

> 5) How is the application interfacing with the database?

This is a web application that interfaces with the database using psycopg.

Also pulling in your question in the other reply:

> Why can't you just add the exclusion constraint to the original table?

With unique constraints, one can use a unique index to create the constraint
concurrently.

With check constraints, one can create the constraint as invalid and then
validate it while only requiring a share update exclusive lock.
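
For comparison, those two online paths look roughly like this on the simplified
table (a sketch; constraint and index names are made up):

  -- unique: build the index concurrently, then attach it as a constraint
  CREATE UNIQUE INDEX CONCURRENTLY bookings_resource_start_uq
      ON bookings (resource_id, start_date);
  ALTER TABLE bookings
      ADD CONSTRAINT bookings_resource_start_uq
      UNIQUE USING INDEX bookings_resource_start_uq;

  -- check: add as NOT VALID, then validate without a long exclusive lock
  ALTER TABLE bookings
      ADD CONSTRAINT bookings_dates_ok CHECK (end_date >= start_date) NOT VALID;
  ALTER TABLE bookings VALIDATE CONSTRAINT bookings_dates_ok;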

But with exclusion constraints, neither of those techniques is available. In
that sense, there is no way to create this type of constraint on a large table
without copying the original table, adding the constraint, and performing a
table swap.

This is done to avoid having to hold an exclusive lock for a long time, which
would cause application outages.

Hope that clarifies the situation a bit better
  - Marcelo




Re: Best Approach for Swapping a Table with its Copy

2025-02-12 Thread Marcelo Fernandes
On Thu, Feb 13, 2025 at 10:40 AM Adrian Klaver
 wrote:
> Do you know this will not fail on the existing data?

Yes, all the current data in the original table respects the constraint.

> Do you have room for a complete copy of the table?

Yes, in this scenario the copy is already created, and triggers keep the copy
in sync with the original table.

> I am not seeing how this can be done without some outage for that table.

Swapping tables without an outage is possible. I believe this is part of what
pg_repack does when repacking a table, since it relies on copying the whole
table and then swapping it for the original.

My main question is "How?". I know that the pg_repack involves swapping the
relfilenode values and something about TOAST tables, but I am not super
acquainted with pg_repack code or debugging tools to verify precisely what it
does.

> What sort of time frame is acceptable?

The scan phase on this table is very slow, and on top of that the exclusion
constraint needs to build the underlying index that services it.

Anything that takes more than 10s in this system is prohibitive; in that sense,
creating the constraint without a table copy is not viable for a table of this
size.

Regards,




Default Value Retention After Dropping Default

2025-02-23 Thread Marcelo Fernandes
Hi folks,

I am experiencing an interesting behavior in PostgreSQL and would like to seek
some clarification.

In the following snippet, I first add a column with a default value, then drop
that default. However, when I query the table, the column still retains the
dropped default for existing rows:

  SET client_min_messages=debug1;

  DROP TABLE IF EXISTS foo CASCADE;
  CREATE TABLE foo (id SERIAL PRIMARY KEY);

  INSERT INTO foo (id) SELECT generate_series(1, 10000);

  ALTER TABLE foo ADD COLUMN bar varchar(255) NOT NULL DEFAULT 'default';
  ALTER TABLE foo ALTER COLUMN bar DROP DEFAULT;

  SELECT * from foo order by id desc limit 5;
  --   id   |   bar
  -- -------+---------
  --  10000 | default
  --   9999 | default
  --   9998 | default
  --   9997 | default
  --   9996 | default

In this example, even after dropping the default value from the bar column, the
rows that were previously inserted (prior to dropping the default) still show
'default' as their value in the bar column.

It does not seem that the table has been rewritten or rescanned; otherwise the
debug1 messages would have been triggered.

Can anyone explain how PostgreSQL "knows about" the default value that has just
been dropped, and what happens under the scenes? I am keen on a deep
understanding of how Postgres achieves this.

Here is what I could find in the docs, but it does not satisfy my question:

> From PostgreSQL 11, adding a column with a constant default value no longer
> means that each row of the table needs to be updated when the ALTER TABLE
> statement is executed. Instead, the default value will be returned the next
> time the row is accessed, and applied when the table is rewritten, making the
> ALTER TABLE very fast even on large tables.

https://www.postgresql.org/docs/current/ddl-alter.html#DDL-ALTER-ADDING-A-COLUMN
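
Poking around the catalogs, pg_attribute has columns that look related
(atthasmissing / attmissingval), which is as far as I got (query sketch; I
have not dug into the source):

  SELECT attname, atthasdef, atthasmissing, attmissingval
  FROM pg_attribute
  WHERE attrelid = 'foo'::regclass AND attname = 'bar';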




Best way to check if a table is empty

2025-04-05 Thread Marcelo Fernandes
Hi folks,

I came up with three strategies to verify whether a table is empty.

I wanted to sound out the community on whether my assumptions are correct for
each of these strategies, and also to discuss which strategy is best.

## Strategy 1 [possibly best?]

SELECT EXISTS (SELECT 1 FROM foo LIMIT 1);

Pros:
  1. Works with any table.
  2. Relatively fast (if the table is well organised).

Cons:
  1. Sequential Scan
  2. If the table is bloated, it reads more buffers.

## Strategy 2

SELECT min(id) FROM foo;

Pros:
  1. Does an index-only scan on a field that presumably has a PK index.
  2. Works well even if the table is bloated.

Cons:
  1. Sequential Scan if the table does not have a PK index.
  2. Reads a few more buffers than Strategy 1 when the table is well organised.
  3. Performs worse if the index is bloated.

## Strategy 3 [worst]

SELECT count(*) FROM foo;

Pros:
  1. Uses a widespread and intuitive operation (count)

Cons:
  1. Very slow on large tables as it performs a Sequential Scan.


How does all of that sound? Are there further strategies I should consider?
Anything I have missed in the Strategies above?
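
For context, I have been comparing the strategies roughly like this (a sketch):

  EXPLAIN (ANALYZE, BUFFERS) SELECT EXISTS (SELECT 1 FROM foo LIMIT 1);
  EXPLAIN (ANALYZE, BUFFERS) SELECT min(id) FROM foo;
  EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM foo;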

Regards,
Marcelo.




pg_get_serial_sequence not working for manually set seq

2025-04-21 Thread Marcelo Fernandes
Hi folks,

I've been testing the pg_get_serial_sequence function and noticed that I can
only get reliable results when using a SERIAL or IDENTITY column.

However, shouldn't it work for manually set sequences too?

In the docs[0] we have that this function:

> Returns the name of the sequence associated with a column, or NULL if no
> sequence is associated with the column

But according to my test below, that does not hold for manually set sequences
on a column.

Is this expected behaviour?

Test:

-- Identity column ✓
DROP TABLE IF EXISTS foo CASCADE;
CREATE TABLE foo (id INT GENERATED ALWAYS AS IDENTITY);
SELECT pg_get_serial_sequence('foo', 'id');
--  pg_get_serial_sequence
-- ------------------------
--  public.foo_id_seq

-- Test with a serial column ✓
DROP TABLE IF EXISTS bar CASCADE;
CREATE TABLE bar (id SERIAL);
SELECT pg_get_serial_sequence('bar', 'id');
--  pg_get_serial_sequence
-- ------------------------
--  public.bar_id_seq

-- Manually set seq ✗
DROP TABLE IF EXISTS buzz CASCADE;
CREATE SEQUENCE seq;
CREATE TABLE buzz (id INTEGER);
ALTER TABLE buzz ALTER COLUMN id SET DEFAULT nextval('seq');
SELECT pg_get_serial_sequence('buzz', 'id');
-- No results
--  pg_get_serial_sequence
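
-- A follow-up experiment suggests the function keys off sequence ownership
-- rather than the column DEFAULT (sketch; I would expect public.seq here):
ALTER SEQUENCE seq OWNED BY buzz.id;
SELECT pg_get_serial_sequence('buzz', 'id');
-- expected: public.seq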


[0] https://www.postgresql.org/docs/current/functions-info.html




Resetting the lock_timeout value for a transaction

2025-04-27 Thread Marcelo Fernandes
Hi folks,

I have been using:

  -- Setting the SESSION lock timeout to 10 seconds
  SET lock_timeout = '10s';

  -- Setting the TRANSACTION lock timeout to 20 seconds
  BEGIN;
  SET LOCAL lock_timeout = '20s';
  COMMIT;

However, I have been caught out by the behaviour of "RESET lock_timeout;"
inside versus outside a transaction.

  -- Resets the lock_timeout value for the SESSION.
  RESET lock_timeout;

  -- WARNING: This will reset BOTH the SESSION and TRANSACTION lock_timeouts.
  BEGIN;
  SET LOCAL lock_timeout = '20s';
  RESET lock_timeout;
  COMMIT;

I would have expected "RESET lock_timeout;" inside a transaction to reset the
value of lock_timeout only for that specific transaction, or else that there
would be an equivalent "RESET LOCAL lock_timeout;" for that purpose.

But I can't find anything that does just that.
Am I missing something? Example script for convenience:

  -- This is the default lock_timeout (0s)
  SHOW lock_timeout;

  -- Set the SESSION lock timeout (42s)
  SET lock_timeout = '42s';
  SHOW lock_timeout;

  BEGIN;

  -- WARNING: This will set a new value for the SESSION lock_timeout from within
  -- the transaction because it is missing the LOCAL key word!
  SET lock_timeout = '10s';
  SHOW lock_timeout;

  -- Set it again but this time only for the transaction. This value will not
  -- affect the session lock_timeout.
  SET LOCAL lock_timeout = '9s';
  SHOW lock_timeout;

  -- Reset BOTH the SESSION and Transaction lock_timeout (both go back to 0, the
  -- default).
  RESET lock_timeout;
  SHOW lock_timeout;

  COMMIT;

  -- Should now be 0s because it was reset inside the transaction.
  SHOW lock_timeout;
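
  -- One thing I have considered but not yet verified (an assumption on my
  -- part): SET LOCAL with DEFAULT might act as a transaction-scoped reset,
  -- leaving the session value alone after COMMIT.
  BEGIN;
  SET LOCAL lock_timeout TO DEFAULT;
  SHOW lock_timeout;
  COMMIT;
  SHOW lock_timeout;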

Thanks, Marcelo




Changing default fillfactor for the whole database

2025-04-26 Thread Marcelo Fernandes
Hi there,

I have a scenario where virtually all user tables in the database will need to
have a lower fill factor.

It would have been handy to have a way to set this default, but as of now, I
don't think the default can be changed.

I have looked at the `share/postgresql.conf.sample` file and couldn't see
anything indicating this value can be changed.

Though I'm aware there's an ALTER TABLE command that can be used to set it per
table, I think it would be good to avoid some future mistakes (forgetting to
set it) by being able to change the default value.
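
For reference, the per-table command I mean is along these lines (table name
and value are illustrative):

  ALTER TABLE some_table SET (fillfactor = 70);
  -- existing pages only pick the new value up after a rewrite,
  -- e.g. VACUUM FULL or CLUSTER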

Does this make sense? Have I missed something about being able to change this
on a database level?

Thanks,
Marcelo.