Re: selecting timestamp

2018-02-27 Thread Joe Conway
On 02/27/2018 12:16 PM, chris wrote:
> 
> What is the best way of selecting current timestamp in UTC? 
> 
> SELECT
> CURRENT_TIMESTAMP   as ct1
> ,timezone('UTC',CURRENT_TIMESTAMP) as ct2
> ,timezone('utc',now()) as ct3
> ,CURRENT_TIMESTAMP at time zone 'UTC' as ct4
> ,NOW() at time zone 'utc' as ct5

Depends on whether you want a volatile or stable answer:

\x
SELECT clock_timestamp(),
   clock_timestamp() AT TIME ZONE 'UTC',
   now(),
   now() AT TIME ZONE 'UTC'
FROM generate_series(1,3);
-[ RECORD 1 ]---+------------------------------
clock_timestamp | 2018-02-27 12:30:10.46699-08
timezone        | 2018-02-27 20:30:10.466991
now             | 2018-02-27 12:30:10.466692-08
timezone        | 2018-02-27 20:30:10.466692
-[ RECORD 2 ]---+------------------------------
clock_timestamp | 2018-02-27 12:30:10.467017-08
timezone        | 2018-02-27 20:30:10.467017
now             | 2018-02-27 12:30:10.466692-08
timezone        | 2018-02-27 20:30:10.466692
-[ RECORD 3 ]---+------------------------------
clock_timestamp | 2018-02-27 12:30:10.467023-08
timezone        | 2018-02-27 20:30:10.467023
now             | 2018-02-27 12:30:10.466692-08
timezone        | 2018-02-27 20:30:10.466692

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Reindex doesn't eliminate bloat

2018-03-13 Thread Joe Conway
On 03/12/2018 09:16 PM, Ron Johnson wrote:
> On 03/12/2018 10:48 PM, Nikolay Samokhvalov wrote:
>> Those queries from wiki for table and index bloat estimation are for
>> estimation only. In many cases they show very wrong results. Better
>> (yet not ideal) approach is using pgstattuple extension (though I'm
>> not sure it existed back in 2009).
>>
>> Can you provide table and index definition and, if you can, some
>> sample data?
> 
> Sadly, no sample data.  (It's all PCI controlled.)
> 
> Index idx_item_mapping_rp7_y2016m03itemmapping_custom_userfield_801 has
> 40% bloat.

Assuming the data in the indexed column(s) is not highly correlated with
the physical table order (i.e. it is roughly random), about 50% density
is theoretically expected. In fact, in some empirical testing, I have
seen a long term steady state value of closer to 44% if I remember
correctly (but perhaps that was related to the way I was testing). For a
discussion on why this is the case, see for example:

https://www.postgresql.org/message-id/flat/87oa4xmss7.fsf%40news-spur.riddles.org.uk#87oa4xmss7@news-spur.riddles.org.uk

So what is being reported as 40% bloat is probably not actually bloat
at all.
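
If you want to measure the actual index density rather than estimate it,
the pgstattuple extension mentioned above can do that. A minimal sketch,
assuming the extension is available on your installation:

8<---
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- avg_leaf_density near 50% is normal for randomly ordered keys
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('idx_item_mapping_rp7_y2016m03itemmapping_custom_userfield_801');
8<---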

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Space Stalker in SQL Output

2018-06-27 Thread Joe Conway
On 06/27/2018 12:45 PM, Susan Hurst wrote:
> Wow!  The -A option worked perfectly!
> 
> Thanks for the syntax lesson Steve and Jerry!

If you are going to be doing lots of scripting with Postgres, you might
want to take a look here: https://github.com/jconway/shebang

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



Re: Will there ever be support for Row Level Security on Materialized Views?

2018-08-28 Thread Joe Conway
On 08/27/2018 07:38 PM, Ken Tanzer wrote:
> On Mon, Aug 27, 2018 at 4:23 PM David Rowley
> <david.row...@2ndquadrant.com> wrote:
> 
>> If I had to guess what's going on here then I'd say that nobody has
>> been sufficiently motivated to work on this yet. If that's the case,
>> everyone who reads your email is not the person working on this
>> feature, so can't answer your question.   I just answered to suggest
>> the reasons why you might not be getting an answer.
> 
> Thanks, that makes sense, and I appreciate that.

Since no one else mentioned it, I would add that questions about
development of Postgres features might be better directed at the hackers
list rather than here.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: PostgreSQL FIPS 140-2 on Window

2018-10-01 Thread Joe Conway
On 10/01/2018 11:44 AM, Bradley May wrote:
> I understand the PostgreSQL 9.x installed on a RHEL distribution can
> be configured and supports FIPS 140-2 when using and properly
> configuring OpenSSL FIPS. My question is can the same be accomplished
> with a Windows installation, as easily or similar to the RHEL
> installation/configuration procedures?
> 
> Apologies for such an abstract questions, but I remember reading
> somewhere that PostgreSQL 9.x  when installed on Windows does not
> support FIPS 140-2 without installing a more commercial product that
> has performed the additional compilation requirements.

As I understand it, FIPS 140-2 support is both "mechanical" (as in your
application will use only FIPS 140-2 approved algorithms and openssl
will be in "FIPS mode" if asked) as well as "compliance" (as in using
software that is actually certified to be FIPS 140-2 compliant).

While without a support subscription you can get "mechanical" FIPS 140-2
support with properly patched OpenSSL library (e.g. using CentOS
configured for FIPS 140-2 system-wide), you will not have FIPS 140-2
"compliance" unless you pay for support from a company that maintains
the certification (e.g. Red Hat, Ubuntu, or SUSE).

The challenge on Windows is to find an SSL library that:

1. Works with Postgres (i.e. openssl or something compatible)
2. Enables Postgres to have mechanical compliance (i.e. works
   system wide in a way that is transparent to Postgres)
3. Is backed by a company that has FIPS 140-2 certification for it

The only one that I have run across that appears to meet all three of
these on Windows is wolfSSL:

  https://www.wolfssl.com/products/wolfssl/

Note that I have no affiliation with them, nor have I actually tried the
product. It claims to have an "OpenSSL Compatibility Layer", so perhaps
it might work for you. If you try it, I'd love to hear back how it goes :-)

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Error: rows returned by function are not all of the same row type

2019-07-05 Thread Joe Conway
On 7/5/19 5:39 AM, Andrey Sychev wrote:
> Thank you very much for answering my question, Tom.
> Yes,  I  have  always  assumed  that  returning  from function without
> calling  SPI_freetuptable  is not good idea, but I do not know another
> way to achieve same result.


Please do not top post on the Postgres lists.


> 1. As I understand there are tests for SFRM_Materialize in code above.
> In  my  case  a  caller  of my function is PL/pgSQL  procedure.
> Does it accept this returning mode?


See contrib/tablefunc/tablefunc.c crosstab_hash() as an example.

> 2. Our current production server versions is 9.1 and 9.6.
> Do this versions support returning of tuplestore?

Works since Postgres 7.3 if I recall correctly.

> 3. Currently my function defined as "RETURNS SETOF".
> Does  definition  of  the function need to be changed if I rewrite code to
> return tuplestore?

No

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: I think that my data is saved correctly, but when printing again, other data appears

2019-10-28 Thread Joe Conway
On 10/27/19 8:01 PM, Yessica Brinkmann wrote:
> Thank you so much for the answers.
> By telling me this: "MemoryContextStrdup enables you to create a copy of
> a string in an explicitly specified memory context." I better understood
> the function of MemoryContextStrdup.
> And thank you very much to Mr. Jony Cohen for giving me the reference of
> his work. I really was already researching about his work, and I
> included it in the state of the art of my thesis, but I didn't look at
> the source code.
> I will be looking at the source code for a help, and especially in this
> case to see for the moment, the maintenance of the context for its
> subsequent restoration.

For better understanding of how Postgres manages memory, you might want
to also read this:

https://github.com/postgres/postgres/blob/master/src/backend/utils/mmgr/README

and possibly browse through this:

https://github.com/postgres/postgres/blob/master/src/backend/utils/mmgr/mcxt.c

HTH,

Joe
-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Help with configuring pgAudit

2019-11-20 Thread Joe Conway
On 11/20/19 8:09 AM, Dave Hughes wrote:
> Hey,
> Thanks for reaching out.  It looks like I have that parameter set as
> well.  Here is a list of settings I have turned on in postgresql.conf
> since I installed pgAudit:
> 
> shared_preload_libraries = 'pgaudit'
> log_destination = 'csvlog'
> logging_collector = on
> log_directory = '/work/PostgreSQL/10/data'
> log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
> log_connections = on
> log_disconnections = on
> log_line_prefix = '<%m:%r:%u@%d:[%p]:>'
> pgaudit.log = 'ddl'
> 
> Is there some other setting I may be missing?

I don't see anything wrong that jumps out except perhaps the OS postgres
user does not have sufficient permissions to write to
'/work/PostgreSQL/10/data'.

Beyond that, please see the supplemental PDF here for an example setup
instruction (section 2.2/appendix B, and possibly section 2.3 as well):


https://dl.dod.cyber.mil/wp-content/uploads/stigs/zip/U_PostgreSQL_9-x_V1R6_STIG.zip

Also read through the pgaudit README if you have not already done so:

  https://github.com/pgaudit/pgaudit

Finally (again if not already done), see the docs section on logging:

  https://www.postgresql.org/docs/12/runtime-config-logging.html

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Help with configuring pgAudit

2019-11-21 Thread Joe Conway
On 11/20/19 5:54 PM, Dave Hughes wrote:
> Thanks for the tips Joe!  After fighting with this all day, I realized
> while I was testing this, I was logging into the database as the
> "postgres" user.  For some reason those actions were not being logged. 
> But once I logged in as another superuser account I have, I saw all my
> ddl statements being audited in the log file.   So it was working after
> all, but just not for the "postgres" user.  Do you happen to know if
> that behavior is how pgaudit is supposed to work?  You'd think even the
> "postgres" user activity would be logged as well?

I'm not sure what you are doing wrong, but the "postgres" user actions
should get logged just like everything else.

For grins I followed the aforementioned supplemental PDF sections 2.2
(pgaudit), except I used csvlog instead of stderr as the destination, and
2.3 (logging), started up postgres, logged in as postgres, created a
table, and then tailed the postgres log:


tail -n 33 $PGDATA/postgresql.conf
shared_preload_libraries = 'pgaudit'

# Enable catalog logging - default is 'on'
pgaudit.log_catalog='on'
# Specify the verbosity of log information (INFO, NOTICE, LOG, WARNING, DEBUG)
pgaudit.log_level='log'
# Log the parameters being passed
pgaudit.log_parameter='on'
# Log each relation (TABLE, VIEW, etc.) mentioned in a SELECT or DML statement
pgaudit.log_relation='off'
# For every statement and substatement, log the statement and parameters every time
pgaudit.log_statement_once='off'
# Define the master role to use for object logging
# pgaudit.role=''
# Choose the statements to log:
# READ - SELECT, COPY
# WRITE - INSERT, UPDATE, DELETE, TRUNCATE, COPY
# FUNCTION - Function Calls and DO Blocks
# ROLE - GRANT, REVOKE, CREATE/ALTER/DROP ROLE
# DDL - All DDL not included in ROLE
# MISC - DISCARD, FETCH, CHECKPOINT, VACUUM
pgaudit.log='ddl, role, read'

log_line_prefix = '%m %u %d: '
log_destination = 'csvlog'
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%a.log'
log_file_mode = 0600
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0


(restart postgres)


psql test
psql (11.1)
Type "help" for help.

test=# CREATE TABLE pgatest(id int);
CREATE TABLE
test=# \q


tail -n 1 $PGDATA/pg_log/postgresql-Thu.csv
2019-11-21 10:07:39.320
EST,"postgres","test",14809,"[local]",5dd6a829.39d9,1,"CREATE
TABLE",2019-11-21 10:07:21 EST,3/8,394984,LOG,0,"AUDIT:
SESSION,1,1,DDL,CREATE TABLE,,,CREATE TABLE pgatest(id
int);,","psql"

Joe
-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Help with configuring pgAudit

2019-11-21 Thread Joe Conway
On 11/21/19 1:27 PM, Dave Hughes wrote:
> Thank you so much for all your help!  I found out my issue on accident
> actually.  I backed up all my user accounts into a SQL scripts and after
> reviewing it, I noticed there were some lines that said:
> ALTER ROLE postgres SET "pgauid.log" to 'Role';
> ALTER ROLE postgres SET "pgaudit.log_level" to 'notice';
> ALTER ROLE postgres SET "pgaudit.log_client" to 'on';
> 
> I think these commands were leftover from when I ran an integrity check
> on the pgaudit install (it crashed for other reasons) so it never
> cleaned up these settings.  Once I reset those settings back, it's
> working perfectly now.  
> 
> Thanks again for helping me getting this thing setup and working!

Ah, makes sense now.

For future reference, you can inspect the pgaudit (and other) active
settings using the pg_settings system view, e.g.:

select name, setting, source
from pg_settings where name like 'pgaudit.%';
            name            |     setting     |       source
----------------------------+-----------------+--------------------
 pgaudit.log                | ddl, role, read | configuration file
 pgaudit.log_catalog        | on              | configuration file
 pgaudit.log_client         | off             | default
 pgaudit.log_level          | log             | configuration file
 pgaudit.log_parameter      | on              | configuration file
 pgaudit.log_relation       | off             | configuration file
 pgaudit.log_statement_once | off             | configuration file
 pgaudit.role               |                 | default
(8 rows)
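
For future cleanup, leftover per-role overrides like the ones you
described can be cleared with ALTER ROLE ... RESET -- a small sketch:

8<---
ALTER ROLE postgres RESET pgaudit.log;
ALTER ROLE postgres RESET pgaudit.log_level;
ALTER ROLE postgres RESET pgaudit.log_client;
-- or clear every per-role override for the role in one shot
ALTER ROLE postgres RESET ALL;
8<---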

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: I think that my data is saved correctly, but when printing again, other data appears

2019-11-26 Thread Joe Conway
On 11/25/19 4:38 PM, Yessica Brinkmann wrote:
> Well, as I was told that I should save the
> CurrentMemoryContext before starting the SPI, Let's say it would be
> before doing SPI_connect (). Is this correct? And then I must use
> MemoryContextStrdup. As you told me the MemoryContextStrdup It is
> used to create a copy of a string in a specific memory context. Well,
> where in the source code should I use MemoryContextStrdup? After 
> doing the SPI_connect () or where? I would use it from 
> MemoryContextStrdup to copy the data variable as I understand it, But
> in what context would I have to create the copy? In a new context or
> what would the theme be like? Should I use AllocSetContextCreate to
> create the new context or what would the theme be like? And if I have
> to create the new context with AllocSetContextCreate, where in the
> source code will I have to create it? After doing SPI_connect () or
> where? The truth is that I also read the source code of 
> https://github.com/cohenjo/pg_idx_advisor but I don't see that The 
> MemoryContextStrdup that they told me to use has been used there. 
> Sorry for the inconvenience and see the same thing again. But as I 
> indicated, I also read the explanations they told me to read (which
> were explanations about the memory contexts in Postgresql mainly) but
> there were no examples of source code. And since there is nothing on
> the Internet of examples that will help me to use, I am asking
> again. And the truth is that I didn't find examples of this in the
> Postgres source code, just definitions, That is the source code
> where MemoryContextStrdup is defined. It may be very easy for you and
> you will see it very clearly, but for me it really is not, and there
> are no examples of use on the Internet. I really searched a lot and
> found nothing. I would greatly appreciate a help please.
Sorry but I am not going to try to address that wall of text ;-)
But here is some general information about how that stuff works:
-
1. The most common pattern is something like this:

   MemoryContext oldcontext;

   /* target_context here stands for whatever context you
    * want the allocations to end up in
    */
   oldcontext = MemoryContextSwitchTo(target_context);

   /* do stuff that allocates memory
    * using PostgreSQL allocation functions
    * e.g. palloc, pstrdup, other exported
    * backend functions, etc
    */

   MemoryContextSwitchTo(oldcontext);

2. MemoryContextStrdup() is similar to the above, except in that case
   you do not need MemoryContextSwitchTo(). It directly allocates into
   the specified memory context without all the switching back and
   forth. If you are simply copying one string and need it in a context
   other than what is current, it is more convenient. But either method
   could be used.

3. When you run SPI_connect() the memory context is switched
   transparently for you to a special SPI memory context. When you run
   SPI_finish() the original memory context (the one in effect before
   SPI_connect) is restored.

4. Depending on what you are trying to do, use method #1 or method #2 if
   needed, including while doing SPI related things (in between
   SPI_connect and SPI_finish)

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Range contains element filter not using index of the element column

2019-11-27 Thread Joe Conway
On 11/27/19 6:33 AM, Lauri Kajan wrote:
> On Wed, Nov 27, 2019 at 1:05 PM Игорь Выскорко wrote:
> 
> Hi!
> Do you use GIST index?
> According to
> https://www.postgresql.org/docs/12/rangetypes.html#RANGETYPES-INDEXING
> <@ operator is supported:
> > A GiST or SP-GiST index can accelerate queries involving these
> range operators: =, &&, <@, @>, <<, >>, -|-, &<, and &>
> 
> 
> Hi,
> 
> I have understood that gist indexes can be used if the column is range
> type but my column is just plain timestamp.
> I tried actually to add gist index for the timestamp column. That was
> not possible without installing the btree_gist extension. But that
> didn't work.


Try this:

create table tstest(id int, ts timestamptz);
insert into tstest
  select
   g.i,
   now() - (g.i::text || ' days')::interval
  from generate_series(1, 10) as g(i);

create index tstest_gin
 on tstest using gist((tstzrange(ts,ts,'[]')));

explain analyze
 select * from tstest
 where
  tstzrange(ts,ts,'[]') <@
  tstzrange(now()- '9 days'::interval,
now()-'7 days'::interval,'(]');
                                  QUERY PLAN
------------------------------------------------------------------------------
 Bitmap Heap Scan on tstest  (cost=24.17..590.16 rows=500 width=12)
(actual time=0.069..0.070 rows=2 loops=1)
   Recheck Cond: (tstzrange(ts, ts, '[]'::text) <@ tstzrange((now() - '9
days'::interval), (now() - '7 days'::interval), '(]'::text))
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on tstest_gin  (cost=0.00..24.04 rows=500
width=0) (actual time=0.063..0.063 rows=2 loops=1)
 Index Cond: (tstzrange(ts, ts, '[]'::text) <@ tstzrange((now()
- '9 days'::interval), (now() - '7 days'::interval), '(]'::text))
 Planning Time: 20.920 ms
 Execution Time: 0.115 ms
(7 rows)

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Interpolation problem - pg 12.4 - full correct version!

2020-08-19 Thread Joe Conway
On 8/19/20 3:08 PM, David G. Johnston wrote:
> On Wed, Aug 19, 2020 at 11:51 AM Pól Ua Laoínecháin wrote:
> 
> 
> I think my *MAJOR* problem is that I've developed what is,
> essentially, a totally brute force approach - and this simply won't
> work at the scenario becomes more complex - take a look at the CASE
> statement - it's horrible and would only become exponentially worse as
> the number NULLs rises.
> 
> So, my question is: Is there a recognised technique (using SQL only,
> not PL/pgSQL - soutions based on the latter are easy to find) whereby
> I can do a basic Linear Interpolation?
> 
> I don't have a recognized technique, nor care to ponder one right now, but 
> what
> you've described would best be done in pure SQL using WITH RECURSIVE, which
> provides an iterative approach to SQL result building.  Which is more commonly
> done in a procedural language.  The algorithm you describe is an iterative
> algorithm and so I'm wondering why place the arbitrary restriction on using 
> pure
> SQL when it likely wouldn't provide a very readable nor performant solution
> relative to a procedural (pl/pgsql or otherwise) one?


Might want to consider using PL/R with something like the R imputeTS package:

https://cran.r-project.org/web/packages/imputeTS/readme/README.html

HTH,

Joe
-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Christopher Browne

2020-11-04 Thread Joe Conway
On 11/4/20 7:45 PM, Jan Wieck wrote:
> Christopher Browne was my colleague and friend at Afilias and past for more 
> than
> a decade. I have countless memories that connect us. He will be missed.
> 
> 
> Jan


He will definitely be missed. Rest in Peace Chris.

Joe


> On Wed, Nov 4, 2020, 18:29 Steve Singer wrote:
> 
> 
> It is with much sadness that I am letting the community know that Chris
> Browne passed away recently.
> 
> Chris had been a long time community member and was active on various
> Postgresql mailing lists.  He was a member of the funds committee, the 
> PgCon
> program committee and served on the board of the PostgreSQL Community
> Association of Canada. Chris was a maintainer of the Slony
> replication system and worked on various PG related tools.
> 
> I worked with Chris for over 10 years and anytime someone had a problem he
> would jump at the chance to help and propose solutions. He
> always had time to listen to your problem and offer ideas or explain how
> something worked.
> 
> I will miss Chris
> 
> Steve
> 
> 
> 


-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: postgres-10 with FIPS

2020-12-09 Thread Joe Conway
On 12/9/20 4:51 AM, Aravindhan Krishnan wrote:
> The paid version I had mentioned about was the paid OS (ubuntu) for FIPS
> compliancy. I understand that postgres as is completely available for 
> open-source.
> 
> Since we can't get the paid version of the OS to support FIPS compliancy the
> idea was to build postgres against FIPS compliant SSL/crypto of 1.0.2g and get
> it to work on ubuntu 20.04 for which I was interested in the configure option.

Actual FIPS compliance is held by the distributor of the SSL library you use.

While you can, for example, configure a CentOS 7 system to be in "FIPS mode", it
is still not "FIPS compliant" if you didn't get the bits (the SSL library
itself) from Red Hat (which you did not if you are running CentOS).

The situation is the same with Ubuntu, except as far as I am aware you cannot
even get your hands on the SSL library for "FIPS mode" from Ubuntu unless you
pay them, unlike CentOS.

So no matter what you do with Postgres itself, you will not be FIPS compliant
without paying RHEL/Ubuntu/SUSE or getting your stack certified yourself (which
is not likely something you will want to do and would cost you more anyway).

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: CROSSTAB( .. only one column has values... )

2021-01-05 Thread Joe Conway
On 1/5/21 12:11 PM, Tim Clarke wrote:
> imho the crosstab() function isn't a good implementation. The biggest
> failure it has is that you must know exactly how many output columns you
> will have in the result prior to running it


*That* is a function of how Postgres set returning functions work, and not
specific to crosstab(). It is not easily fixed. Patches to fix that would be
welcomed!

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: CROSSTAB( .. only one column has values... )

2021-01-05 Thread Joe Conway
On 1/5/21 10:46 AM, Adam Tauno Williams wrote:
> I'm using the crosstab feature and do not understand why I am only
> getting values in the first column.



> So I put this in as a crosstab:
> 
> SELECT * FROM crosstab(
> $$
 

>  $$) AS final_result (invoice_date VARCHAR, "parts" BIGINT, "rental"
> BIGINT, "sales" BIGINT, "service" BIGINT);
> 
>  - and I get the results of -
> 
> invoice_date  parts  rental  sales   service
> ------------  -----  ------  ------  -------
> 2001-09           1  (null)  (null)  (null)
> 2007-07           1  (null)  (null)  (null)
> 2013-02        5353  (null)  (null)  (null)
> 2013-02        3454  (null)  (null)  (null)
> 2013-03        3512  (null)  (null)  (null)
> 2013-03        5366  (null)  (null)  (null)
> ...
> 
> Only the first column has values, all the rest are NULL.
> I assume I am missing something stupid.

Seems you are using the wrong form of the crosstab() function. See

  https://www.postgresql.org/docs/current/tablefunc.html#id-1.11.7.47.5.7.2

"The main limitation of the single-parameter form of crosstab is that it treats
all values in a group alike, inserting each value into the first available
column. If you want the value columns to correspond to specific categories of
data, and some groups might not have data for some of the categories, that
doesn't work well. The two-parameter form of crosstab handles this case by
providing an explicit list of the categories corresponding to the output 
columns."

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: permanent setting of config variables

2021-02-18 Thread Joe Conway
On 2/18/21 4:09 AM, Joao Miguel Ferreira wrote:
> I have a few PL/pgSQL functions that use queires like "SHOW company.product 
> INTO
> _product_name" and "SHOW company.cluster INTO _cluster_number".
> 
> But these are failing because I don't know how to set those values on a
> permanent basis, or maybe on startup would also be ok.
> 
> I did read the "ALTER DATABASE " and the "set_config(...)" documentation and I
> experimented a bit with that but without succes.
> 
> So my question would be: how to permanently set user specific config values 
> that
> would become accessible to the "SHOW " SQL comand.

Perhaps I don't understand your issue, but this works for me:

8<---
nmx=# alter database nmx set a.b = 'c';
ALTER DATABASE

nmx=# \q


psql nmx
psql (12.5)
Type "help" for help.

nmx=# show a.b;
 a.b
-
 c
(1 row)

CREATE OR REPLACE FUNCTION test() RETURNS text AS $$
DECLARE
 pn text;
BEGIN
 SHOW a.b INTO pn;
 RETURN pn;
END;
$$ LANGUAGE plpgsql;

nmx=# SELECT test();
 test
--
 c
(1 row)

nmx=# \q


psql nmx
psql (12.5)
Type "help" for help.

nmx=# SELECT test();
 test
--
 c
(1 row)
8<---
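
For per-user rather than per-database defaults, the same idea works with
ALTER ROLE, and inside functions current_setting() is an alternative to
SHOW ... INTO. A small sketch (hypothetical role name):

8<---
ALTER ROLE someuser SET a.b = 'c';

SELECT current_setting('a.b');        -- errors if the setting is absent
SELECT current_setting('a.b', true);  -- returns NULL instead of erroring
8<---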

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Permission inconsistency with views that call functions

2021-02-23 Thread Joe Conway
On 2/22/21 10:32 PM, David Wheeler wrote:
> I’m seeing some inconsistency with how permissions are enforced within views. 
> In
> particular, if the view accesses a table directly, then the table is 
> accessible,
> however if the view uses a function to access the table then permission is 
> denied.

Without looking too closely at the details, I can almost guarantee that the
issue is that FUNCTIONs default to "SECURITY INVOKER" whereas VIEWs operate as
if they were "SECURITY DEFINER". See slide 33 here:

http://joeconway.com/presentations/security-pgcon2020.pdf

The solution to your issue is possibly to make the function "SECURITY DEFINER".
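
A minimal sketch of that, with hypothetical object names:

8<---
-- make an existing function run with the privileges of its owner,
-- the way a view effectively does
ALTER FUNCTION my_lookup(int) SECURITY DEFINER;

-- or declare it that way up front
CREATE OR REPLACE FUNCTION my_lookup(i int) RETURNS text AS
$$ SELECT t.val FROM protected_table t WHERE t.id = i $$
LANGUAGE sql SECURITY DEFINER;
8<---

The usual caveats about pinning a safe search_path on SECURITY DEFINER
functions apply.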

I have mused previously (not sure if I ever did on the lists, but in any case)
that it would be cool if VIEWs could have the option to be either DEFINER or
INVOKER so that VIEWs and FUNCTIONs could be treated the same, but no efforts
have been made in that direction as far as I am aware.

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: PostgreSQL 14.4 ERROR: out of memory issues

2022-08-04 Thread Joe Conway

On 8/4/22 08:34, Aleš Zelený wrote:

SELECT ... simple join of two tables...
      WHERE opd.id_data_provider = _id_data_provider
        AND CASE WHEN _external_id IS NULL
                 THEN external_id IS NULL
                 ELSE external_id = _external_id
            END
        AND CASE WHEN _external_complete_id IS NULL
                 THEN _external_complete_id IS NULL


Unrelated to your question, but shouldn't that actually read:

   AND CASE WHEN _external_complete_id IS NULL
THEN external_complete_id IS NULL
                      ^^^


                 ELSE external_complete_id = _external_complete_id
            END;
$function$
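
As an aside, assuming the corrected column reference above, each of those
CASE expressions is equivalent to a NULL-safe comparison, which reads a
bit more compactly -- a sketch:

8<---
AND external_id IS NOT DISTINCT FROM _external_id
AND external_complete_id IS NOT DISTINCT FROM _external_complete_id
8<---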


--
Joe Conway
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: how to check specific user_name have “SELECT ON ALL TABLES IN SCHEMA ” privilege or not ?

2022-09-12 Thread Joe Conway

On 9/11/22 12:03, David G. Johnston wrote:
On Sunday, September 11, 2022, yanliang lei <msdnch...@163.com> wrote:


Dear everyone,
how to check whether a specific user_name has the “SELECT ON ALL TABLES IN
SCHEMA ” privilege or not?


This is not an appropriate list to ask general usage questions.  We have 
a -general list for those.


(therefore moving thread to pgsql-general)

But, that isn’t a privilege.  Its a grant action.  The privilege is just 
select on a table in a schema.  So you have to check every table in the 
schema individually.  The functions and the catalogs documented to 
retrieve this information - start putting them together into a query.  
I’m unable to go look for the specific details right now.


For an extension that will help you with this, see
 https://github.com/CrunchyData/crunchy_check_access
Either use that directly, or see how it gathers the info you are 
interested in:


https://github.com/CrunchyData/crunchy_check_access/blob/master/sql/check_access.sql
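
If you prefer to roll your own query along the lines David suggested
rather than install an extension, a rough sketch (hypothetical role and
schema names):

8<---
-- list tables in schema 'myschema' that role 'someuser' cannot SELECT from
SELECT n.nspname, c.relname
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE n.nspname = 'myschema'
   AND c.relkind IN ('r', 'p', 'v', 'm')
   AND NOT has_table_privilege('someuser', c.oid, 'SELECT');
8<---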

HTH,

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Index corruption revealed after upgrade to 11.17, could date back to at least 11.12

2022-10-30 Thread Joe Conway

On 10/28/22 16:24, Peter J. Holzer wrote:

On 2022-10-28 17:23:02 +0300, Kristjan Mustkivi wrote:

By the way, index rebuild while completing successfully did not fix
the indexes - the data in the tables was still missing even after the
successful rebuild command.


This is surprising. As I understand it, REINDEX scans the complete table
and builds a new index. So after a REINDEX the index should be complete
and usable.

However, reading
https://www.postgresql.org/docs/current/sql-reindex.html I'm unsure what
happens if you have issued a REINDEX for a table, schema, etc. and
rebuilding an index fails: Does it abort at that time or does it just
issue a notice and continue to rebuild the other indexes? In the latter
case it migh be easy to miss a problem.


Debian 9 appears to be glibc 2.24 and Debian 11 is glibc 2.31. Therefore 
the issue here is almost certainly corrupt indexes due to the glibc changes.


Note that some forms of the problem can manifest in primary keys or 
unique indexes that were valid in the original collation behavior, but 
contained duplicate rows according to the new collation. Those indexes 
would not be able to be reindexed without manual intervention.


I don't know offhand what happens if you concurrently reindex an entire 
database and just a few indexes fail to rebuild, but I suspect as you 
mentioned above the failure might easily be missed and invalid indexes 
would be left behind. I think you will find both indisready and 
indisvalid false for these indexes.
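
To find any such leftovers, something like this sketch should work:

8<---
-- indexes left invalid/not-ready, e.g. by a failed REINDEX CONCURRENTLY
SELECT n.nspname AS schema_name, c.relname AS index_name
  FROM pg_index i
  JOIN pg_class c ON c.oid = i.indexrelid
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE NOT i.indisvalid OR NOT i.indisready;
8<---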


HTH,

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: IMMUTABLE function to cast enum to/from text?

2022-11-10 Thread Joe Conway

On 11/10/22 14:52, Philip Semanchuk wrote:

Hi all,
I know that Postgres' enum_in()/enum_out() functions have a
volatility class of STABLE, and STABLE is required because enum
elements can be renamed. We have an enum in our database used in a
number of custom functions, all of which require casting the enum
to/from text. Since enum_in() and enum_out() are STABLE, that means
our functions that rely on those casts must also be STABLE, and as a
result we can't use them in generated columns.

I have seen conversations that suggest creating a custom IMMUTABLE
function to perform the cast, but I can't figure out how to do that
except with a CASE statement that enumerates every possible value. Is
there a more elegant approach?


When asking for help here, it always helps us to help you if you provide 
a self-contained set of SQL that illustrates what you are looking for.


That said, I think you are looking for something like this:

CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
CREATE TABLE person (name text, current_mood mood);
INSERT INTO person VALUES ('Moe', 'happy');
CREATE OR REPLACE FUNCTION mood2text(mood)
RETURNS text AS
$$
 select $1
$$ STRICT IMMUTABLE LANGUAGE sql;

SELECT name, mood2text(current_mood) FROM person;
 name | mood2text
--+---
 Moe  | happy
(1 row)

HTH,

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: How do the Linux distributions create the Linux user/group "postgres"?

2023-01-09 Thread Joe Conway

On 1/9/23 07:41, Matthias Apitz wrote:

Please note: I'm talking about the user and group "postgres" in the
Linux OS and not in the PostgreSQL server.

We're compiling PostgreSQL from source (actually 14.1) and distribute that
to our customers. They're asked to setup user and group "postgres"
before creating the cluster. As nowadays there are a lot of setup such
things in bigger installations, like LDAP or AD, etc. I'd like to know
how other installations for Linux deal with this?


See for example the PGDG RPM spec file:

https://git.postgresql.org/gitweb/?p=pgrpms.git;a=blob;f=rpm/redhat/main/non-common/postgresql-15/main/postgresql-15.spec;h=60cd42147a7563ba76c401643d0a7c79b59d2520;hb=HEAD

8<---
%pre server
groupadd -g 26 -o -r postgres >/dev/null 2>&1 || :
useradd -M -g postgres -o -r -d /var/lib/pgsql -s /bin/bash \
-c "PostgreSQL Server" -u 26 postgres >/dev/null 2>&1 || :
8<---

HTH,

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: How do the Linux distributions create the Linux user/group "postgres"?

2023-01-09 Thread Joe Conway

On 1/9/23 09:11, Ron wrote:

On 1/9/23 07:15, Joe Conway wrote:

On 1/9/23 07:41, Matthias Apitz wrote:

Please note: I'm talking about the user and group "postgres" in the
Linux OS and not in the PostgreSQL server.

We're compiling PostgreSQL from source (actually 14.1) and distribute that
to our customers. They're asked to setup user and group "postgres"
before creating the cluster. As nowadays there are a lot of setup such
things in bigger installations, like LDAP or AD, etc. I'd like to know
how other installations for Linux deal with this?


See for example the PGDG RPM spec file:

https://git.postgresql.org/gitweb/?p=pgrpms.git;a=blob;f=rpm/redhat/main/non-common/postgresql-15/main/postgresql-15.spec;h=60cd42147a7563ba76c401643d0a7c79b59d2520;hb=HEAD 



8<---
%pre server
groupadd -g 26 -o -r postgres >/dev/null 2>&1 || :
useradd -M -g postgres -o -r -d /var/lib/pgsql -s /bin/bash \
-c "PostgreSQL Server" -u 26 postgres >/dev/null 2>&1 || :
8<---


What if (for example in an already-installed package) uid and gid 26 already
exist?


The "|| :" at the end of those lines is essentially "or true" -- i.e. if 
the first command fails *due to a preexisting uid/gid*, the line still 
evaluates as successful. Whether that is a good thing or not is in the 
eye of the beholder I guess.


I have thought about, and played with a bit, the idea of making the 
uid/gid in the community spec file a variable so that it could be passed 
in at rpmbuild time. But that quickly bloomed into a host of related 
issues -- e.g. the default "/var/lib/pgsql" for the home dir, the 
default "postgres" for the user/group names, and/or what to do if the 
username does not exist but the directory does, etc -- I ran out of 
energy to chase it all down.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: glibc initdb options vs icu compatibility questions (PG15)

2023-01-16 Thread Joe Conway

On 1/16/23 08:26, Laurenz Albe wrote:

On Mon, 2023-01-16 at 09:30 +0100, Robert Sjöblom wrote:
We have a fleet of postgres 10 servers (1 primary, 2 replicas) that 
we're now planning to upgrade. We've historically been forced to use the 
same distro (centos7) and libc version, or rely on pg_dump/restore, 
across pg versions due to the fact that the servers/databases were 
initialized with the following options:

--lc-collate=sv_SE.UTF-8

We're upgrading all servers to pg15 through logical replication, and 
with that we'd like to remove our dependency on any specific libc 
version (to avoid corruption issues etc). We hope to do this by 
initializing our pg15 clusters with swedish icu locale 
(--icu-locale=sv-SE-x-icu). By using icu like this, we should have the 
same sorting behavior as the pg10 servers have today? By our 
understanding, we then should be able to use physical replication across 
different OS versions, without worrying about index corruption etc?


According to 
http://peter.eisentraut.org/blog/2022/09/26/icu-features-in-postgresql-15, 
we still need to set a regular libc locale/option; does this mean that 
the dependency on libc is still present as it pertains to corruption 
issues and being forced to use the same libc version across all replicas?


I'd say no.  The problem is the collation, and for that, the ICU
collation will be used.


check


The libc locale is for other aspects of locale.


Although I am not aware of any specific/known issues, keep in mind that 
there is still a possibility of changes across major glibc versions for 
things affected by LC_CTYPE, LC_TIME, LC_MONETARY, and LC_NUMERIC - 
lower()/upper() for example.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Properly handle OOM death?

2023-03-13 Thread Joe Conway

On 3/13/23 13:21, Israel Brewster wrote:
I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit 
more memory constrained than I would like, such that every week or so 
the various processes running on the machine will align badly and the 
OOM killer will kick in, killing off postgresql, as per the following 
journalctl output:


Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A 
process of this unit has been killed by the OOM killer.
Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed 
with result 'oom-kill'.
Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: 
Consumed 5d 17h 48min 24.509s CPU time.


And the service is no longer running.

When this happens, I go in and restart the postgresql service, and 
everything is happy again for the next week or two.


Obviously this is not a good situation. Which leads to two questions:

1) is there some tweaking I can do in the postgresql config itself to 
prevent the situation from occurring in the first place?
2) My first thought was to simply have systemd restart postgresql 
whenever it is killed like this, which is easy enough. Then I looked at 
the default unit file, and found these lines:


# prevent OOM killer from choosing the postmaster (individual backends will
# reset the score to 0)
OOMScoreAdjust=-900
# restarting automatically will prevent "pg_ctlcluster ... stop" from 
working,

# so we disable it here. Also, the postmaster will restart by itself on most
# problems anyway, so it is questionable if one wants to enable external
# automatic restarts.
#Restart=on-failure

Which seems to imply that the OOM killer should only be killing off 
individual backends, not the entire cluster to begin with - which should 
be fine. And also that adding the restart=on-failure option is probably 
not the greatest idea. Which makes me wonder what is really going on?


First, are you running with a cgroup memory.limit set (e.g. in a container)?

Assuming no, see:

https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT

That will tell you:
1/ Turn off memory overcommit: "Although this setting will not prevent 
the OOM killer from being invoked altogether, it will lower the chances 
significantly and will therefore lead to more robust system behavior."


2/ set /proc/self/oom_score_adj to -1000 rather than -900 
(OOMScoreAdjust=-1000): the value -1000 is important as it is a "magic" 
value which prevents the process from being selected by the OOM killer 
(see: 
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/oom.h#L6) 
whereas -900 just makes it less likely.


All that said, even if the individual backend gets killed, the 
postmaster will still go into crash recovery. So while technically 
postgres does not restart, the effect is much the same. So see #1 above 
as your best protection.


HTH,

Joe

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Properly handle OOM death?

2023-03-13 Thread Joe Conway

On 3/13/23 13:55, Israel Brewster wrote:

1) They reference a “Memory cgroup out of memory”, which refers back
to the opening comment on Joe Conway’s message - this would imply to
me that I *AM* running with a cgroup memory.limit set. Not sure how
that changes things?


cgroup memory limit is enforced regardless of the actual host level 
memory pressure. As an example, if your host VM has 128 GB of memory, 
but your cgroup memory limit is 512MB, you will get an OOM kill when the 
sum memory usage of all of your postgres processes (and anything else 
sharing the same cgroup) exceeds 512 MB, even if the host VM has nothing 
else going on consuming memory.


You can check if a memory is set by reading the corresponding virtual 
file, e.g:


8<---
# cat 
/sys/fs/cgroup/memory/system.slice/postgresql.service/memory.limit_in_bytes

9223372036854710272
8<---

A few notes:
1/ The specific path to memory.limit_in_bytes might vary, but this 
example is the default for the RHEL 8 postgresql 10 RPM.


2/ The value above, 9223372036854710272 basically means "no limit" has 
been set.


3/ The example assumes cgroup v1. There are very few distro's that 
enable cgroup v2 by default, and generally I have not seen much cgroup 
v2 usage in the wild (although I strongly recommend it), but if you are 
using cgroup v2 the names have changed. You can check by doing:


8<--cgroupv2 enabled-
# stat -fc %T /sys/fs/cgroup/
cgroup2fs
8<--cgroupv1 enabled-
# stat -fc %T /sys/fs/cgroup/
tmpfs
8<---


2) All the entries contain the line "oom_score_adj:0”, which would
seem to imply that the postmaster, with its -900 score is not being
directly targeted by the OOM killer.


Sounds correct

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Properly handle OOM death?

2023-03-13 Thread Joe Conway

On 3/13/23 14:50, Israel Brewster wrote:

Looks like V2:

root@novarupta:~# stat -fc %T /sys/fs/cgroup/
cgroup2fs


Interesting -- it does indeed look like you are using cgroup v2

So the file you want to look at in that case is:
8<---
cat 
/sys/fs/cgroup/system.slice/system-postgresql.slice/postgresql@14.service/memory.max

4294967296

cat 
/sys/fs/cgroup/system.slice/system-postgresql.slice/postgresql@14.service/memory.high

3221225472
8<---
If the value comes back as "max" it means no limit is set.

In this example (on my Linux Mint machine with a custom systemd unit 
file) I have memory.max set to 4G and memory.high set to 3G.


The value of memory.max determines when the OOM killer will strike. The 
value of memory.high will determine when the kernel goes into aggressive 
memory reclaim (trying to avoid memory.max and thus an OOM kill).


The corresponding/relevant systemd unit file parameters are:
8<---
MemoryAccounting=yes
MemoryHigh=3G
MemoryMax=4G
8<---

There are other ways that memory.max may get set, but it seems most 
likely that the systemd unit file is doing it (if it is in fact set).


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Properly handle OOM death?

2023-03-13 Thread Joe Conway

On 3/13/23 15:18, Israel Brewster wrote:

root@novarupta:~# cat 
/sys/fs/cgroup/system.slice/system-postgresql.slice/postgresql@13-main.service/memory.max
max
root@novarupta:~# cat 
/sys/fs/cgroup/system.slice/system-postgresql.slice/postgresql@13-main.service/memory.high
max
root@novarupta:~#

which would presumably indicate that it’s a system level limit being
exceeded, rather than a postgresql specific one?


Yep


The syslog specifically says "Memory cgroup out of memory”, if that means
something (this is my first exposure to cgroups, if you couldn’t
tell).


I am not entirely sure, but without actually testing it I suspect that 
since memory.max = high (that is, the limit is whatever the host has 
available) the OOM kill is technically a cgroup OOM kill even though it 
is effectively a host level memory pressure event.


Did you try setting "vm.overcommit_memory=2"?

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Properly handle OOM death?

2023-03-13 Thread Joe Conway

On 3/13/23 16:18, Israel Brewster wrote:

On Mar 13, 2023, at 11:42 AM, Joe Conway  wrote:
I am not entirely sure, but without actually testing it I suspect
that since memory.max = high (that is, the limit is whatever the
host has available) the OOM kill is technically a cgroup OOM kill
even though it is effectively a host level memory pressure event.


Sorry, actually meant "memory.max = max" here



Did you try setting "vm.overcommit_memory=2"?



root@novarupta:~# sysctl -w vm.overcommit_memory=2
sysctl: setting key "vm.overcommit_memory", ignoring: Read-only file system



I’m thinking I wound up with a container rather than a full VM after
all - and as such, the best solution may be to migrate to a full VM
with some swap space available to avoid the issue in the first place.
I’ll have to get in touch with the sys admin for that though.


Hmm, well big +1 for having swap turned on, but I recommend setting 
"vm.overcommit_memory=2" even so.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Properly handle OOM death?

2023-03-19 Thread Joe Conway

On 3/18/23 18:02, Tomas Pospisek wrote:

On 13.03.23 21:25, Joe Conway wrote:

Hmm, well big +1 for having swap turned on, but I recommend setting 
"vm.overcommit_memory=2" even so.


I've snipped out the context here, since my advice is very unspecific:
do use swap only as a safety net. Once your system starts swapping
performance goes down the toilet.



While I agree with this statement in principle, it is exactly the notion 
that "once your system starts swapping performance goes down the toilet" 
that leads people to conclude that having lots of memory and disabling 
swap will solve all their problems.


Because of how the Linux kernel works, you should, IMHO, always have 
some swap available. For more on why, see:


https://chrisdown.name/2018/01/02/in-defence-of-swap.html

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Very slow queries followed by checkpointer process killed with signal 9

2023-04-03 Thread Joe Conway

On 4/2/23 21:40, Tom Lane wrote:

Jason McLaurin  writes:

Is there anywhere you'd suggest we start looking for hints? I'd be
interested in increasing relevant logging verbosity so that we can see when
key background processes are running, both in Postgres core and Timescale.


It might be helpful to try to identify which wait events the slow
queries are blocking on (pg_stat_activity.wait_event_type and
.wait_event).  I'm not sure if you're going to be able to extract
useful data, because your query on pg_stat_activity is likely to
be slow too.  But it's a place to start.

Also, given that you're evidently incurring the wrath of the OOM
killer, you should try to understand why the kernel thinks it's
under memory pressure.  Do you have too many processes, or perhaps
you've configured too much shared memory?


Given this:


This is Postgres 14.5 running in the TimescaleDB Docker image.


Possibly the docker image is running with a cgroup memory.limit set?

The OOM killer will trigger on any cgroup limit even if the host has 
plenty of free memory.



--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: SV: SV: Implementing pgaudit extension on Microsoft Windows

2019-02-13 Thread Joe Conway
On 2/13/19 3:32 AM, Niels Jespersen wrote:
> Dive into the source for pgaudit and try to understand how it works,
> perhaps getting a small example extension working along the way. 
> Perhaps contacting the authors of pgaudit to have them confirm that
> there  exists working implementations of pgaudit on Windows (and if
> so if they are on PostgreSQL 11).

pgaudit definitely *should* work on Windows. If you can post your other
log related conf settings I will fire up a Windows VM and try to
duplicate the issue.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



Re: SV: SV: Implementing pgaudit extension on Microsoft Windows

2019-02-13 Thread Joe Conway
On 2/13/19 9:17 AM, Joe Conway wrote:
> On 2/13/19 3:32 AM, Niels Jespersen wrote:
>> Dive into the source for pgaudit and try to understand how it works,
>> perhaps getting a small example extension working along the way. 
>> Perhaps contacting the authors of pgaudit to have them confirm that
>> there  exists working implementations of pgaudit on Windows (and if
>> so if they are on PostgreSQL 11).
> 
> pgaudit definitely *should* work on Windows. If you can post your other
> log related conf settings I will fire up a Windows VM and try to
> duplicate the issue.

I just compiled pgaudit on a Win10 VM and loaded it into a Postgres 11
Win64 build downloaded from the PGDG community download. Works perfectly
for me. Will definitely need to understand more details about your
installation. Probably not appropriate to do that on this list, so
please submit an issue on github.com/pgaudit/pgaudit

Thanks,

Joe


-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Implementing pgaudit extension on Microsoft Windows

2019-02-20 Thread Joe Conway
On 2/20/19 5:58 AM, Niels Jespersen wrote:
> I did not submit an issue. Instead, I read your input on another
> existing issue. In that, you recommended to place pgaudit source in
> contrib in a PostgreSQL source tree and build the whole thing.

I was hoping you would see that -- it looked pretty much the same as
your issue.

> I did just that this morning using 11.2 (my first compile of
> PostgreSQL (which succeeeded)). The resulting pgaudit.dll works fine.
> So far at least, I haven't tried all features, but pgaudit.log =
> 'all' generates lots of log.

Awesome -- thanks for the follow up!

> I  also pulled the commands to build pgaudit from the build log and
> can now compile pgaudit to a working dll against only a binary
> installation.


Oh, I didn't even know that was possible. Can you describe in a little
more detail how that is done? I only know the minimum possible to get by
with the Windows build system...

> Perhaps my experiences could be added to the project. I am willing to
> write a short markdown document describing this, if there is
> interest.

That would be awesome, although I bet it applies to more than just
pgaudit. I have had people report similar issues over the years with
plr.dll, for example.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: why not using a mountpoint as PGDATA?

2019-02-27 Thread Joe Conway
On 2/27/19 11:49 AM, Peter J. Holzer wrote:
> On 2019-02-27 10:42:12 -0500, Tom Lane wrote:
>> Luca Ferrari  writes:
>> > On Wed, Feb 27, 2019 at 12:33 PM Julien Rouhaud  wrote:
>> >> You can see most obvious reasons at
>> >> https://bugzilla.redhat.com/show_bug.cgi?id=1247477
> [...]
>> The case that I can recall most clearly was actually in the other
>> direction: during system bootup, some NFS volume that was being abused
>> this way (mount point == data dir) was slow to mount.  Compounding the
>> problem, postgres was being started through some init script that would
>> helpfully run initdb if it saw the specified data directory was empty.
>> So, rather than failing like a properly paranoid DBA would wish, it
>> ran initdb and then started the postmaster.
> 
> Ouch.
> 
> I wonder though why that directory was writable by the postgres user.
> But maybe the helpful start script chown'ed it to fix the "wrong"
> permissions.

FWIW, if you want to read the whole gory details of that incident, here
it is:

https://www.postgresql.org/message-id/flat/41D04FA4.7010402%40joeconway.com#dfc38927745e238d49569ffd5b33beba

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Query much slower on 9.6.5 than on 9.3.5

2019-04-07 Thread Joe Conway
On 4/5/19 6:23 AM, Rob Northcott wrote:
> Ah, fair enough – I’ll get into the habit of doing that then.  Thanks again.
> 
> Rob

> *From:*Ron 
> *Sent:* 05 April 2019 11:07
> *To:* pgsql-general@lists.postgresql.org
> *Subject:* Re: Query much slower on 9.6.5 than on 9.3.5
> 
> pg_dump/restore gets rid of all the dead space, and you should *always*
> run an ANALYZE after pg_restore, since pg_restore doesn't populate the
> statistics tables.

After the dump/restore cycle I would recommend a "VACUUM FREEZE ANALYZE"
if you can tolerate the time it takes, as this will not only ensure that
the statistics are populated, but it will also get future rewrites of
the rows out of the way all at once (visibility hint bits and freezing
of the tuples).
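
In other words, right after the restore finishes:

8<---
-- one pass to set hint bits, freeze tuples, and collect statistics
VACUUM FREEZE ANALYZE;
8<---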

Joe
-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: how to add more than 1600 columns in a table?

2019-04-24 Thread Joe Conway
On 4/24/19 4:17 PM, pabloa98 wrote:
> Sadly today we hit the 1600 columns limit of Postgresql 11.
> 
> How could we add more columns?
> 
> Note: Tables are OK. We truly have 2400 columns now. Each column
> represents a value in a matrix.

As everyone else has mentioned, your use case sounds like arrays are
appropriate. Note that PostgreSQL supports 2 dimensional (actually more
than 2d if you wanted) arrays which are essentially perfect for
representing a matrix.

If this works for your data model it will likely save a ton of storage
space and perform much better than alternatives.
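
A minimal sketch of what that might look like, with a hypothetical table
name and dimensions:

8<---
CREATE TABLE matrix_data (
    id  int PRIMARY KEY,
    m   float8[][]    -- e.g. 40 x 60 = 2400 cells
);

INSERT INTO matrix_data VALUES (1, '{{1.0,2.0},{3.0,4.0}}');

SELECT m[2][1] FROM matrix_data WHERE id = 1;  -- element access (1-based)
8<---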

FWIW, if you are needing to do matrix math, you might want to look into
PL/R (https://github.com/postgres-plr/plr) as it supports 2d arrays as
arguments which are converted directly into R matrices.

I don't know for sure but likely PL/Python could be used to process
matrices as well.

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Is _ a supported way to create a column of array type?

2019-04-26 Thread Joe Conway
On 4/26/19 6:50 AM, Piotr Findeisen wrote:
> Presto type system however distinguishes array(integer),
> array(array(integer))...
> (using Presto therms). And execution is (expectedly) not as flexible.

Fine point here -- Postgres does not actually have arrays-of-arrays, it
has multi-dimension arrays.

So in other words, you cannot have an array of element type _int4, but
rather you literally have an array of element type int4 with two
dimensions, etc.
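
A quick way to see that (sketch):

8<---
SELECT pg_typeof('{{1,2},{3,4}}'::int4[][]);
-- integer[]      (the extra [] pair is accepted but not a distinct type)

SELECT array_dims('{{1,2},{3,4}}'::int4[]);
-- [1:2][1:2]     (a single int4 array with two dimensions)
8<---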

Joe
-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Row data is reflected in DETAIL message when constraints fail on insert/update

2019-06-22 Thread Joe Conway
On 6/22/19 6:00 AM, William Denton wrote:
>>I guess the original question basically boils down to "Given a
>>rogue/dumb app, and a DBA who neglected his job, is it PG's
>>business (or even within its possibilities) to mop up ?"
> 
> It feels like you aren't setting people up to land in the pit of success. 
> It's easy to sit back and call people negligent because they failed to 
> change settings from their defaults. Data breaches are all too common 
> due to mis-configured systems, we can all have a good laugh at the 
> poor people who have suffered breaches due to defaults that 
> come/came with s3, mongo and many other data stores, but why must 
> we operate on that level to rather than being a little more defensive?
> 
> How is it useful in a normally configured database to return row data in
> error messages?
> 
> Is the client application supposed to parse that data?
> 
> Must the client perform another query to discover column names and 
> attributes so the data can be parsed?

+many

I agree entirely.

> I can definitely see a use for it during debugging and development where a 
> human has their eyes on what the database is returning, but I would argue 
> if you wanted that information for debugging purposes you would enable 
> verbose logging.

Exactly.

Dev and test environments it makes sense to send verbose messages to the
client. In production it does not.

> I have spent a few minutes searching google for terms like 
> "harden postgres for production" or "locking down postgres" or  
> "postgres production configuration". NONE mention log_error_verbosity. 
> Searching the postgres wiki yields no results for log_error_verbosity. Only 
> once you start searching for the problems caused by log_error_verbosity 
> can you become aware that this setting exists and should be changed in 
> production environments. Yet the only mention on of this parameter on any
> postgres site (docs or wiki) is the one pasted below Calling people
> negligent
> for not knowing something, when you have failed to tell them seems
> disingenuous.
> 
> Further, the documentation for log_error_verbosity mentions nothing
> about the 
> data returned to the client. This text is explicitly talking about the
> server log.
> 
>>Controls the amount of detail written in the server log for each
>>message that is logged. Valid values are TERSE, DEFAULT, and VERBOSE,
>>each adding more fields to displayed messages. TERSE excludes the
>>logging of DETAIL, HINT, QUERY, and CONTEXT error information. VERBOSE
>>output includes the SQLSTATE error code (see also Appendix A) and the
>>source code file name, function name, and line number that generated
>>the error. Only superusers can change this setting.
> 
> I would suggest that row data should be reclassified as only appearing in 
> VERBOSE configurations as there is nothing an application client could
> do with 
> that information, it is only useful to a human operating interactively
> with the db.

In my experience the log files are generally required to be locked down
similarly to postgres itself, as was pointed out by (I think) Tom
elsewhere on this thread. To me, logging these details to the log file
is not the issue. And log files can be controlled in such a way as to
protect them even from the superuser and/or be shipped off to a log
collector for audit purposes.

The issue is the level of detail sent to the client. There is a setting,
client_min_messages, which could be used as a broad hammer to clamp what
gets sent to the client except that it is user settable. I think if we
made that superuser set instead it might at least be a step in the right
direction, although I have not tested to see what the results of an
error look like at the client in that case.

Separately I previously submitted a patch to optionally redact the
messages that go to the client. That would allow the client app to get
the SQL error code at least, but not the message. But it was rejected in
the last commitfest. See the relevant thread here:

https://www.postgresql.org/message-id/flat/2811772.0XtDgEdalL@peanuts2

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: How To: A large [2D] matrix, 100,000+ rows/columns

2023-06-09 Thread Joe Conway

On 6/8/23 22:17, Pat Trainor wrote:
Imagine something akin to stocks, where you have a row for every stock, 
and a column for every stock. Except where the same stock is the row & 
col, a number is at each X-Y (row/column), and that is the big picture. 
I need to have a very large matrix to maintain & query, and if not 
(1,600 column limit), then how could such data be broken down to work?


 100,000 rows *
 100,000 columns *
 8 bytes (assuming float8)
= about 80 GB per matrix if I got the math correct.

Is this really a dense matrix or is it sparse? What kind of operations?

Does it really need to be stored as such or could it be stored as 
vectors that are converted to a matrix on the fly when needed?


Seems like using python or R makes more sense. Perhaps it might make 
sense to store the data in Postgres and use plpython or plr. But it is 
hard to say without more details.
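
If it does land in Postgres, one sketch (names are made up) that avoids
both the 1,600 column limit and the roughly 1 GB per-field limit is to
store one row vector per matrix row and assemble the matrix on the fly
(client side, or in plr/plpython) when needed:

8<---
CREATE TABLE matrix_rows (
  matrix_name text,
  row_num     int,
  vals        float8[],   -- one 100,000-element row vector, ~800 kB
  PRIMARY KEY (matrix_name, row_num)
);

-- single element: row 123, column 456
SELECT vals[456]
  FROM matrix_rows
 WHERE matrix_name = 'corr' AND row_num = 123;
8<---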



--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: "paths" between two ROLEs

2023-06-13 Thread Joe Conway

On 6/13/23 04:17, Dominique Devienne wrote:

Hi. We emulated a legacy security model (enforced in C/C++ code)
into "layers" of PostgreSQL ROLEs and GRANTs, thus enforced database-side.

To troubleshoot and validate that emulation, I'd like to introspect ROLE 
membership to:


1) Output the ROLE "path(s)" between any two ROLEs. Typically between 
the LOGIN USER and the ROLE that control access to a particular SCHEMA. 
In our model, there can be several ways the two end-roles are connected, 
involving a variable number of roles. So it has to be a recursive query.


2) target-end ROLEs (controlling access to SCHEMAs, again) follow a 
naming convention, so they can be identified using a LIKE pattern. 
Output all target ROLEs (aggregating each "paths" to the source-ROLE in 
an text[]) a given LOGIN USER has access to.


I'd appreciate either example SQL for the above; or hints to achieve the 
above.
My CTE "foo" is not great, thus reaching out to the community to avoid 
wasting too much time on this on my own.


This shows the path between roles taken which provides a particular 
privilege for a particular object:


https://github.com/CrunchyData/crunchy_check_access

It might do for you as-is, or at least you can use it as an example.
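
If you would rather roll your own, a starting point might be a recursive
walk over pg_auth_members, roughly like this (untested sketch; the
starting role name and the LIKE pattern are placeholders for your naming
convention):

8<---
WITH RECURSIVE memb(roleid, path) AS (
  SELECT r.oid, ARRAY[r.rolname::text]
    FROM pg_roles r
   WHERE r.rolname = 'some_login_user'          -- placeholder
  UNION ALL
  SELECT m.roleid, memb.path || g.rolname::text
    FROM memb
    JOIN pg_auth_members m ON m.member = memb.roleid
    JOIN pg_roles g ON g.oid = m.roleid
)
SELECT path
  FROM memb
  JOIN pg_roles tgt ON tgt.oid = memb.roleid
 WHERE tgt.rolname LIKE 'schema\_%\_acl';       -- placeholder pattern
8<---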

HTH,

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: PostgreSQL Commercial Support

2023-06-26 Thread Joe Conway

On 6/26/23 05:56, Oduba, Ernest wrote:
We want to start using PostgreSQL and we intent to have a commercial 
support. Kindly advise who can assist with this request for further 
engagement.


See:

https://www.postgresql.org/support/professional_support/

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: postgreSQL UPPER Method is converting the character "µ" into "M"

2023-09-07 Thread Joe Conway

On 9/6/23 11:22, Helmut Bender wrote:

Am 06.09.23 um 16:04 schrieb Sai Teja:
In my local it is windows OS and locale is English_united_states.1252 
and in local it is converting as expected ( testµ into TESTµ)


You wrote PostgreSQL is hosted on an utf-8 system. That's why it's
working different.

And be careful: the M isn't the uppercase latin m, but in fact the
uppercase µ (U+039C in unicode)! If you would compare it to latin
alphabet 'TESTM' it would be different.



Yep, there are interactions between the encoding used by your terminal, 
the server encoding, and the client encoding.


You have to be particularly careful when cutting and pasting that the 
client encoding and your terminal encoding match or you can get 
surprising results.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: libpq crashing on macOS during connection startup

2023-11-30 Thread Joe Conway

On 11/30/23 09:45, John DeSoi wrote:

I have a macOS web server using Postgres that has been very stable until a 
month or two ago. If I restart the web server the problem seems to go away for 
a while, but starts happening again within days. I thought it was a PHP issue 
as discussed in the link below, but I just noticed in the crash report it seems 
to be something related to a call from libpq.

https://github.com/shivammathur/homebrew-php/issues/1862

Any ideas or suggestions appreciated.



Did you recently get an OpenSSL upgrade to v3.2.0? This is a shot in the 
dark, but perhaps related to the discussion here?


https://www.postgresql.org/message-id/flat/CAN55FZ1eDDYsYaL7mv%2BoSLUij2h_u6hvD4Qmv-7PK7jkji0uyQ%40mail.gmail.com


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Generic File Access Function to read program output

2024-02-07 Thread Joe Conway

On 2/7/24 05:28, Carsten Klein wrote:

Hi there,

on PostgreSQL 14, I'm using function pg_read_file to read a JSON file on
the server. After that, the JSON file gets casted to jsonb and with
function jsonb_array_elements I'm iterating over the "records", which I
transform into a PostgreSQL ROWTYPE with jsonb_populate_record...

Since the source files are actually XML files, these are turned into
JSON files with Node JS and the fast-xml-parser module (processing JSON
is much faster and more comfortable than processing XML in PostgreSQL).

The command line of this conversion process is like this:

# node /opt/my_node_apps/xml_to_json.js 

In order to do this without temporary JSON files (which need to be
deleted at some time), it would be great to have a new Generic File
Access Function

pg_read_program_output(command)


Although one could argue, that it's not a Generic *File* Access
Function, that function would be a worthwhile addition and could use the
same semantics and rules as with the

COPY table_name FROM PROGRAM 'command'

statement. Also the implementation (running a command with the shell and
capture it's STDOUT) is nearly the same.

In contrast to the other Generic File Access Functions, it will be
almost impossible to restrict access to programs or commands within the
database cluster directory (could be a complex shell command). Aside
from that this makes no sense since, typically, there are no executable
programs in those directories.

Even worse, it's likely also not possible to restrict the source of the
content read (the STDOUT) to be any of these directories, since the
program could just dump anything to its STDOUT.

AFAIT, that's not really an issue but only makes this new Generic File
Access Function special, in that these restrictions and the meaning of
role pg_read_server_files just do not apply for it.

Do you know if there is already such a function, maybe provided by an
extension I do not yet know?



Maybe write your own in plpython or plperlu?
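
For example, a minimal PL/Python sketch (assumes the plpython3u extension
is available and that you trust whoever can call it; the function name
just mirrors your proposal):

8<---
CREATE EXTENSION IF NOT EXISTS plpython3u;

CREATE OR REPLACE FUNCTION pg_read_program_output(command text)
RETURNS text AS $$
  import subprocess
  # run the shell command and return its stdout; raise on non-zero exit
  return subprocess.run(command, shell=True, check=True,
                        capture_output=True, text=True).stdout
$$ LANGUAGE plpython3u;

-- this runs arbitrary shell commands as the server's OS user, so
-- REVOKE EXECUTE ... FROM PUBLIC and grant it narrowly.

-- usage (the argument is a placeholder), mirroring COPY ... FROM PROGRAM:
-- SELECT pg_read_program_output('node /opt/my_node_apps/xml_to_json.js input.xml')::jsonb;
8<---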

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Pgxs - How to reference another extension

2024-03-11 Thread Joe Conway

On 3/11/24 09:48, Michał Kłeczek wrote:




On 11 Mar 2024, at 14:08, Artur Zakirov  wrote:

On Mon, 11 Mar 2024 at 13:26, Michał Kłeczek  wrote:




On 11 Mar 2024, at 11:41, Michał Kłeczek  wrote:

Hi,

I am trying to create an extension that delegates some calls to 
btree_gist functions:


DirectFunctionCall5Coll(
  gbt_text_consistent, …other arguments);

Basic PGXS Makefile does not work - I get linker error:

Undefined symbols for architecture arm64:
"_gbt_text_consistent", referenced from:


Anyone could provide me with some hints?


I’ve added:
PG_LDFLAGS += -L$(shell $(PG_CONFIG) --pkglibdir) -lbtree_gist


You can try FunctionCall5Coll() or OidFunctionCall5Coll() functions.

OidFunctionCall5Coll() calls fmgr_info() and FunctionCall5Coll(). All
you need is the Oid of the target function.


What I am trying to do is wrapping and decoration of gbt_text_consistent 
function.
The reason I want to use DirectFunctionCall5Coll is that other variants 
require catalog lookup
as I don't have the oid of the wrapped function.
The lookup itself is problematic as the only piece of information I have 
is the strategy number.
What’s more - the result of the lookup should be cached in fn_extra and 
that makes things even more complex.


Is there any way to simply link against another extension library?



I used this successfully in the past with postgis:

postgis_libdir := $(shell pg_config --pkglibdir)
postgis_libver := $(shell ls -1 $(postgis_libdir) | grep "^postgis")
SHLIB_LINK += -L$(postgis_libdir) -l:$(postgis_libver) -llwgeom

HTH,

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Feature Request: Option for TLS no SSLRequest with psql

2024-05-11 Thread Joe Conway

On 5/11/24 11:19, AJ ONeal wrote:
Could we get a `postgrestls://` or `sslmode=tls` or --tls option that 
instructs psql​ to sends straight TLS, skipping the 000804d2162f / 
000804d21630 + N / Y / S handshake?



You should probably be following the hackers mailing list. See:

https://www.postgresql.org/message-id/flat/CAM-w4HOEAzxyY01ZKOj-iq%3DM4-VDk%3DvzQgUsuqiTFjFDZaebdg%40mail.gmail.com

and commits:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=91044ae4baeac2e501e34164a69bd5d9c4976d21

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=d39a49c1e459804831302807c724fa6512e90cf0


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Feature Request: Option for TLS no SSLRequest with psql

2024-05-11 Thread Joe Conway

On 5/11/24 11:59, Joe Conway wrote:

On 5/11/24 11:19, AJ ONeal wrote:
Could we get a `postgrestls://` or `sslmode=tls` or --tls option that 
instructs psql​ to sends straight TLS, skipping the 000804d2162f / 
000804d21630 + N / Y / S handshake?



You should probably be following the hackers mailing list. See:

https://www.postgresql.org/message-id/flat/CAM-w4HOEAzxyY01ZKOj-iq%3DM4-VDk%3DvzQgUsuqiTFjFDZaebdg%40mail.gmail.com

and commits:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=91044ae4baeac2e501e34164a69bd5d9c4976d21

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=d39a49c1e459804831302807c724fa6512e90cf0


And as of a few moments ago:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=407e0b023cdb449dde65fd370c6cc48f5b8a5579

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: PG16.1 security breach?

2024-06-07 Thread Joe Conway

On 6/7/24 07:04, Zwettler Markus (OIZ) wrote:
I am running the following on Postgres 16.1 in database "postgres" as a 
superuser:





create or replace function oiz.f_set_dbowner (p_dbowner text, p_dbname text)





create role testuser with password 'testuser' login;




than this new role is able to execute the function oiz.f_set_dbowner 
immediately even I did not grant execute on this function to this role!


See:
https://www.postgresql.org/docs/current/sql-createfunction.html

In particular, this part:
8<
Another point to keep in mind is that by default, execute privilege is 
granted to PUBLIC for newly created functions (see Section 5.7 for more 
information). Frequently you will wish to restrict use of a security 
definer function to only some users. To do that, you must revoke the 
default PUBLIC privileges and then grant execute privilege selectively. 
To avoid having a window where the new function is accessible to all, 
create it and set the privileges within a single transaction. For example:

8<--------
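
Applied to your function, that would look roughly like this (the role
receiving EXECUTE is just a placeholder):

8<--------
REVOKE ALL ON FUNCTION oiz.f_set_dbowner(text, text) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION oiz.f_set_dbowner(text, text) TO dbowner_admin;  -- placeholder role
8<--------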

HTH,

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: PG16.1 security breach?

2024-06-13 Thread Joe Conway

On 6/12/24 18:56, Tom Lane wrote:

Ron Johnson  writes:

On Wed, Jun 12, 2024 at 4:36 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

I think my point is that a paragraph like the following may be a useful
addition:

If one wishes to remove the default privilege granted to public to execute
all newly created procedures it is necessary to revoke that privilege for
every superuser in the system



That seems... excessive.


More to the point, it's wrong.  Superusers have every privilege there
is "ex officio"; we don't even bother to look at the catalog entries
when considering a privilege check for a superuser.  Revoking their
privileges will accomplish nothing, and it does nothing about the
actual source of the problem (the default grant to PUBLIC) either.

What I'd do if I didn't like this policy is some variant of

ALTER DEFAULT PRIVILEGES IN SCHEMA public
   REVOKE EXECUTE ON FUNCTIONS FROM PUBLIC;


In a past blog[1] I opined that this cleans up the default security 
posture fairly completely:


8<--
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
REVOKE EXECUTE ON ALL ROUTINES IN SCHEMA public FROM PUBLIC;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
REVOKE EXECUTE ON ROUTINES FROM PUBLIC;

-- And/or possibly, more drastic options:
-- REVOKE USAGE ON SCHEMA public FROM PUBLIC;
-- DROP SCHEMA public CASCADE;

REVOKE TEMPORARY ON DATABASE <database> FROM PUBLIC;
REVOKE USAGE ON LANGUAGE sql, plpgsql FROM PUBLIC;
8<--


Repeat for each schema that you think might be publicly readable
(which is only public by default).


indeed


BTW, in PG 15 and up, the public schema is not writable by
default, which attacks basically the same problem from a different
direction.


also a good point


[1] 
https://www.crunchydata.com/blog/postgresql-defaults-and-impact-on-security-part-2


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Question about when PostgreSQL 11.0 was released

2021-03-08 Thread Joe Conway
On 3/8/21 1:22 PM, Bysani, Ram wrote:
> I am not finding the release cadence for the PostgreSQL databases. Please let 
> me
> know how I can find the base and dot version release dates for:
> 
> PostgreSQL 11
> PostgreSQL 12
> PostgreSQL 13


I'm not sure a consolidated list of the minor release dates exists, but the
major release dates can be seen in the table here:

https://www.postgresql.org/support/versioning/

You can pull the minor release dates from the release notes yourself -- for
example from this link for pg11:

https://www.postgresql.org/docs/11/release.html

go to this for pg 11.11

https://www.postgresql.org/docs/11/release-11-11.html

and you can see the release date was 2021-02-11

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: Question about when PostgreSQL 11.0 was released

2021-03-09 Thread Joe Conway
On 3/9/21 8:20 AM, Greg Sabino Mullane wrote:
> Ram Bysani asked:
>> I am not finding the release cadence for the PostgreSQL databases.
>> Please let me know how I can find the base and dot version release dates for:
> 
> You can see all the release dates on one page here:
> 
> https://bucardo.org/postgres_all_versions

Nice!

Joe
-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: Compare with default value?

2021-03-13 Thread Joe Conway

On 3/13/21 1:05 PM, Ulrich Goebel wrote:

I would like to get the rows, where a column has the default value,
similar to:

select id from tbl where col = default

Is there a chance?


It isn't pretty, and not all that practical, but for example:

8<
create table def(id int, f1 text not null default 'foo');

insert into def values(1,'hello'),(2,default),(3,default),(4,'bar');


SELECT *
FROM def
WHERE $$'$$ || f1 || $$'::text$$ =
 (SELECT pg_get_expr(d.adbin, d.adrelid, true)
  FROM pg_attrdef d
  JOIN pg_attribute a on d.adrelid = a.attrelid
   AND d.adnum = a.attnum
  WHERE a.attrelid = 'def'::regclass
  AND a.attname = 'f1');
 id | f1
+-
  2 | foo
  3 | foo
(2 rows)
8<

Hope this helps,

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: How to deny access to Postgres when connected from host/non-local

2021-04-03 Thread Joe Conway

On 4/2/21 7:06 PM, A. Reichstadt wrote:

Hello,

I try to deny access to all databases on my server if the user “postgres" tries 
to connect from a non-local host. Here is what I did in pg_hba.conf:



# TYPE  DATABASE        USER            ADDRESS                 METHOD

# "local" is for Unix domain socket connections only
local   all             all                                     md5
# IPv4 local connections:
host    all             all             127.0.0.1/32            md5
# IPv6 local connections:
host    all             all             ::1/128                 md5
# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication     all                                     md5
host    replication     all             127.0.0.1/32            md5
host    replication     all             ::1/128                 md5
host    all             all             0.0.0.0/0               md5
local   all             postgres                                trust
host    all             postgres        0.0.0.0/0               reject


But it continues to allow for Postgres to connect from anywhere through PGAdmin 
but also as a direct connection to port 5432. I also relaunched the server. This 
is version 12.


What else do I have to do?

Thanks for any help.


See:
https://www.postgresql.org/docs/13/auth-pg-hba-conf.html

In particular:

  "Each record specifies a connection type, a client IP
   address range (if relevant for the connection type),
   a database name, a user name, and the authentication
   method to be used for connections matching these
   parameters. The first record with a matching
   connection type, client address, requested database,
   and user name is used to perform authentication."

So your reject line is never being reached.
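
In other words, the reject entry has to appear before the catch-all md5
entry for it to ever match, along these lines:

host    all             postgres        0.0.0.0/0               reject
host    all             all             0.0.0.0/0               md5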

HTH,

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: EXCLUDE USING and tstzrange

2021-06-04 Thread Joe Conway

On 6/4/21 10:58 AM, Laura Smith wrote:

One other question, what's the syntax for manipulating only the upper
bound of a range.

Say I have a Postgres function that does a "SELECT INTO" for an
existing tsrange.  Is there an easy way to change the variable's
upper bound whilst leaving the "old" lower bound intact ?


There may be easier/better ways, but for example this works:

8<--
insert into test
 values(42, '[2021-01-01, 2021-06-03)');
INSERT 0 1

select test_range from test where test_id = '42';
 test_range
-
 ["2021-01-01 00:00:00-05","2021-06-03 00:00:00-04")
(1 row)

update test
 set test_range = tstzrange(lower(test_range),
'2021-06-04', '[)')
where test_id = '42';
UPDATE 1

select test_range from test where test_id = '42';
 test_range
-
 ["2021-01-01 00:00:00-05","2021-06-04 00:00:00-04")
(1 row)
8<--

HTH,

Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: Struggling with EXCLUDE USING gist

2021-06-04 Thread Joe Conway

On 6/4/21 1:32 PM, Laura Smith wrote:

What is the solution then ?  I need to keep historical versions but
at the same time I need a "current" version.  If I am not able to use
"infinity" as bounds for "current" version then clearly I'm wasting
my time trying to use EXCLUDE AS for version tracking because clearly
using fixed timestamps instead of "infinity" for tstzrange would be a
hacky fix that will be fragile and prone to breakage.


This is not exactly the same thing you are trying to do (I think), but 
maybe you can get some useful ideas from this:


https://www.joeconway.com/presentations/RLS_TimeTravel-FOSDEM2019.pdf

HTH,

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: Need to omit time during weekends from age calculations

2021-06-07 Thread Joe Conway

On 6/7/21 3:12 PM, David Gauthier wrote:

Hi:

I suspect I'm not the first to ask about this but couldn't find anything 
after googling for a bit.  So here goes


I'd like to get the "age" difference between two times which span either 
all or part of a weekend but exclude any time that transpired during the 
weekend.

Example (please pardon the non-timestamp vals here...)

age('Monday-Noon','Prev-Friday-Noon')
would give me '1 day'.

...and...

age('Sunday-Noon','Prev-Friday-Noon')
would give me '12 hours'

You get the picture.

Has this wheel already been invented ?
I don't see an easy way to do this off-hand.
All Ears :-)


Perhaps not the prettiest of solutions, but what about something like this?

8<
CREATE OR REPLACE FUNCTION nonweekendhours(startts timestamptz, endts timestamptz)
RETURNS interval AS $$
  SELECT
    (SUM(case when extract(dow from g.ts) > 0
               and extract(dow from g.ts) < 6 then 1
              else 0 end) || ' hours')::interval
  FROM generate_series(startts, endts - '1 hour'::interval, '1 hour') AS g(ts)
$$ LANGUAGE sql;

SELECT nonweekendhours('2021-06-04 12:00:00','2021-06-06 12:00:00');
 nonweekendhours
-
 12:00:00
(1 row)

SELECT nonweekendhours('2021-06-04 12:00:00','2021-06-07 12:00:00');
 nonweekendhours
-
 24:00:00
(1 row)

SELECT nonweekendhours('2021-06-04 12:00:00','2021-06-11 12:00:00');
 nonweekendhours
-
 120:00:00
(1 row)
8<

HTH,

Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: removing "serial" from table definitions.

2021-06-24 Thread Joe Conway

On 6/24/21 8:33 AM, Marc Mamin wrote:

Hi,

Is there  a way to change a data type from serial to int?

I tried with :

   ALTER TABLE "admin".db_jobs ALTER COLUMN id TYPE int USING id::int;

But this seems not to change anything, as if Postgres would consider the 
statement as a no-op.


serial is not an actual data type -- it is essentially an integer with a 
default and an automatically created sequence. See:


https://www.postgresql.org/docs/13/datatype-numeric.html#DATATYPE-SERIAL


My problem is that "serial" is not exported with pg_dump.

Creating a db from the dump will hence result in a different table 
definition (which is equivalent, though).


We are trying a transfer/migration tool on Azure, that check the table 
definitions between the source and target before starting the data 
transfer, and it blocks on that difference.


From the linked doc above:
--
The data types smallserial, serial and bigserial are not true types, but 
merely a notational convenience for creating unique identifier columns 
(similar to the AUTO_INCREMENT property supported by some other 
databases). In the current implementation, specifying:


CREATE TABLE tablename (
colname SERIAL
);

is equivalent to specifying:

CREATE SEQUENCE tablename_colname_seq AS integer;
CREATE TABLE tablename (
colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
);
ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;
---

I haven't checked, but I am guessing that creating the table using the 
former method results in a dump that looks like the latter? In that 
case, just define the table in the second way to begin with and they 
will match from the migration tool's standpoint, I should think.
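
One way to see that on the existing table (a sketch using your table name):

8<---
-- the sequence that the "serial" column owns, and the column's default
SELECT pg_get_serial_sequence('admin.db_jobs', 'id');

SELECT column_default
  FROM information_schema.columns
 WHERE table_schema = 'admin'
   AND table_name   = 'db_jobs'
   AND column_name  = 'id';
8<---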


Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: pgcrypto - real life examples to encrypt / decrypt

2021-08-03 Thread Joe Conway

On 8/3/21 8:43 AM, Luca Ferrari wrote:

On Tue, Aug 3, 2021 at 1:03 PM Vikas Sharma  wrote:

My question is, can I use the gpg public/secret key instead of the 'Secret 
password' in above PGP_Sym_encrypt/decrypt?  I can create a wrapper function to 
read the public/secret keys to hide it from appearing as clear text.


I think you are looking for something like:

pgp_pub_encrypt( clear_text,
dearmor( '-----BEGIN PGP PUBLIC KEY BLOCK-----
 ...
 -----END PGP PUBLIC KEY BLOCK-----' ) );




still researching how to encrypt a column with sensitive data as a best 
practice to use in OLTP production with minimal impact on performance.


Clearly, as you add more stuff to do, performances will be lower. I
strongly recommend you to analyze if column encryption is really what
you need for your purposes, because in my little experience it is
often too much work with regard to other approaches (e.g., disk and
backup encryption).


Generally agreed. This topic is vast and complex and probably beyond 
what most people want to discuss by typing (at least for me) ;-)


That said, you might find this extension written by Bruce Momjian useful:

https://momjian.us/download/pgcryptokey/
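
And if you do stay with the plain pgcrypto public-key functions, the
decrypt side mirrors the encrypt snippet quoted above -- roughly like
this (the column, key block, and passphrase are placeholders):

8<---
SELECT pgp_pub_decrypt(
         encrypted_col,
         dearmor('-----BEGIN PGP PRIVATE KEY BLOCK-----
 ...
 -----END PGP PRIVATE KEY BLOCK-----'),
         'secret key passphrase');
8<---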

HTH,

Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: Pg stuck at 100% cpu, for multiple days

2021-08-30 Thread Joe Conway

On 8/30/21 10:36 AM, hubert depesz lubaczewski

Anyway - it's 12.6 on aarm64. Couple of days there was replication
slot started, and now it seems to be stuck.



#0  hash_seq_search (status=status@entry=0xdd90f380) at 
./build/../src/backend/utils/hash/dynahash.c:1448
#1  0xc3042060 in RelfilenodeMapInvalidateCallback (arg=, relid=105496194) at ./build/../src/backend/utils/cache/relfilenodemap.c:64
#2  0xc3033aa4 in LocalExecuteInvalidationMessage (msg=0x9b66eec8) 
at ./build/../src/backend/utils/cache/inval.c:595
#3  0xc2ec8274 in ReorderBufferExecuteInvalidations (rb=0xc326bb00 , 
txn=0xc326b998 , txn=0xc326b998 ) 
at ./build/../src/backend/replication/logical/reorderbuffer.c:2149
#4  ReorderBufferCommit (rb=0xc326bb00 , xid=xid@entry=2668396569, 
commit_lsn=187650393290540, end_lsn=, 
commit_time=commit_time@entry=683222349268077, origin_id=origin_id@entry=0, 
origin_lsn=origin_lsn@entry=0) at 
./build/../src/backend/replication/logical/reorderbuffer.c:1770
#5  0xc2ebd314 in DecodeCommit (xid=2668396569, parsed=0xdd90f7e0, 
buf=0xdd90f960, ctx=0xf5d396a0) at 
./build/../src/backend/replication/logical/decode.c:640
#6  DecodeXactOp (ctx=ctx@entry=0xf5d396a0, buf=0xdd90f960, 
buf@entry=0xdd90f9c0) at 
./build/../src/backend/replication/logical/decode.c:248
#7  0xc2ebd42c in LogicalDecodingProcessRecord (ctx=0xf5d396a0, 
record=0xf5d39938) at 
./build/../src/backend/replication/logical/decode.c:117
#8  0xc2ecfdfc in XLogSendLogical () at 
./build/../src/backend/replication/walsender.c:2840
#9  0xc2ed2228 in WalSndLoop (send_data=send_data@entry=0xc2ecfd98 
) at ./build/../src/backend/replication/walsender.c:2189
#10 0xc2ed2efc in StartLogicalReplication (cmd=0xf5d175a8) at 
./build/../src/backend/replication/walsender.c:1133
#11 exec_replication_command (cmd_string=cmd_string@entry=0xf5c0eb00 "START_REPLICATION SLOT cdc 
LOGICAL 1A2D/4B3640 (\"proto_version\" '1', \"publication_names\" 'cdc')") at 
./build/../src/backend/replication/walsender.c:1549
#12 0xc2f258a4 in PostgresMain (argc=, 
argv=argv@entry=0xf5c78cd8, dbname=, username=) at 
./build/../src/backend/tcop/postgres.c:4257
#13 0xc2eac338 in BackendRun (port=0xf5c68070, port=0xf5c68070) 
at ./build/../src/backend/postmaster/postmaster.c:4484
#14 BackendStartup (port=0xf5c68070) at 
./build/../src/backend/postmaster/postmaster.c:4167
#15 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1725
#16 0xc2ead364 in PostmasterMain (argc=, argv=) at ./build/../src/backend/postmaster/postmaster.c:1398
#17 0xc2c3ca5c in main (argc=5, argv=0xf5c07720) at 
./build/../src/backend/main/main.c:228

The thing is - I can't close it with pg_terminate_backend(), and I'd
rather not kill -9, as it will, I think, close all other connections,
and this is prod server.



still makes me ask: why does Pg end up in such place, where it
doesn't do any syscalls, doesn't accept pg_terminate_backend(), and
is using 100% of cpu?

src/backend/utils/hash/dynahash.c:1448 is in the middle of a while loop,
which is apparently not exiting.


There is no check for interrupts in there and it is a fairly tight loop 
which would explain both symptoms.


As to how it got that way, I have to assume data corruption or a bug of 
some sort. I would repost the details to hackers for better visibility.


Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: Account privileges

2021-11-17 Thread Joe Conway

On 11/16/21 9:07 PM, Prathima Mulpuri wrote:

Hi all, I need some help regarding Postgres and I have checked and
tried many queries. I am working on RDS Postgres 13 and in the
process of preparing the script for auditing all the user privileges
. As a part of our auditing, I need a script to list down all the
privileges of all the users to each database and if any privilege
that I need is missing, it should automatically execute the grant
/revoke (for example account1 should have only select privileges. If
it is granted with anything else it should revoke the permissions and
if select is not granted it should grant the select privilege) .
This should be done for all the databases in an instance in 1
script. I want to use cursor to list the databases and to run the
check and execute queries using a function or a stored procedure. The
results of the script should be sent to an email.

Please share any ideas and solutions for my requirement.


I am not aware of an existing solution that does all of those things, 
but you could probably build one starting with the "check_access" extension:


https://github.com/CrunchyData/crunchy_check_access

You can see examples of use of the extension here:

https://blog.crunchydata.com/blog/postgresql-defaults-and-impact-on-security-part-1

In particular, you could use check_access to enumerate all privileges 
when in a known-good state, save that output somewhere as the required 
baseline state (e.g. in a text file), and then compare later audit runs 
against that baseline (e.g. using diff).


Automation of remediation is left as an exercise for you ;-)

HTH,

Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: Feature idea: Dynamic Data Making

2017-12-10 Thread Joe Conway
On 12/05/2017 01:03 PM, Alvaro Herrera wrote:
> Riccardo Bassani wrote:
>> As SQL Server 2016 (
>> https://docs.microsoft.com/en-us/sql/relational-databases/security/dynamic-data-masking
>> ) , or Oracle does: it could be a neat feature having Dynamic Data
>> Masking, even if it has some limitations, it could help in the
>> application development, imho.
>> 
>> I know it's possible to put this logic in the application (if you look
>> in internet for "spring framework data masking" you find a lot of
>> results).
>> 
>> What do you think?
>> I know there is more important things to do (like partitioning or
>> parallelism), but it could be neat to see that in a future release .
> 
> Data masking is pretty important actually.  Since you seem to have an
> interest in seeing it done, care to submit a patch?


I was recently discussing this with some other hackers, and the
consensus seemed to be that this should be implemented with a new kind
of POLICY that operates at the field level. We didn't go into much
detail beyond that though.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Help in dealing with OOM

2024-10-14 Thread Joe Conway

On 10/14/24 14:37, Siraj G wrote:

This is from the OS log (/var/log/kern.log):

oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/system-postgresql.slice/postgresql@12-main.service,task=postgres,pid=2334587,uid=114
Oct 14 09:58:10 gce-k12-prod-as1-erp-pg-secondary kernel: [6905020.514569] Out of memory: Killed process 2334587 (postgres) total-vm:26349584kB, anon-rss:3464kB, file-rss:0kB, shmem-rss:21813032kB, UID:114 pgtables:49024kB oom_score_adj:0



1. Do you happen to have swap disabled? If so, don't do that.

2. Does the postgres cgroup have memory.limit (cgroup v1) or memory.max
   (cgroup v2) set?

3. If #2 answer is no, have you followed the documented guidance here
   (in particular vm.overcommit_memory=2):

 
https://www.postgresql.org/docs/12/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT



--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: glibc updarte 2.31 to 2.38

2024-09-22 Thread Joe Conway

On 9/22/24 12:53, Adrian Klaver wrote:

On 9/22/24 09:48, Paul Förster wrote:

Hi Joe


On 22. Sep, 2024, at 15:47, Joe Conway  wrote:

Note that moving to ICU might improve things, but there are similar potential 
issues with ICU as well. The trick there would be to get your OS distro 
provider to maintain the same ICU version across major versions of the distro, 
which is not the case currently. Nor does the PGDG repo do that.


Then I strongly suggest that the PostgreSQL developers develop a fail safe 
sorting mechanism that holds for generations of locale changes.


https://www.postgresql.org/docs/17/release-17.html#RELEASE-17-HIGHLIGHTS

Add a builtin platform-independent collation provider (Jeff Davis)

This supports C and C.UTF-8 collations.



Yep, what he said

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: glibc updarte 2.31 to 2.38

2024-09-22 Thread Joe Conway

On 9/21/24 15:19, Paul Foerster wrote:

I already expressed the idea of changing all locales to ICU. The
problem there is that I'd have to create new instances and then move
each database individually. I wish I could convert already running
databases… This also takes time. Still, I think I'm going to try
this route. It's always a gamble if reindexing is needed or not with
any glibc change.



Note that moving to ICU might improve things, but there are similar 
potential issues with ICU as well. The trick there would be to get your 
OS distro provider to maintain the same ICU version across major 
versions of the distro, which is not the case currently. Nor does the 
PGDG repo do that.



--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: glibc updarte 2.31 to 2.38

2024-09-19 Thread Joe Conway

On 9/19/24 11:14, Tom Lane wrote:

Paul Foerster  writes:

we have SLES 15.5 which has glibc 2.31. Our admin told us that
he's about to install the SLES 15.6 update which contains glibc
2.38.



2) Does the glibc update have any impact?

Maybe.  We don't really track glibc changes, so I can't say for sure,
but it might be advisable to reindex indexes on string columns.



Every glibc major version change potentially impacts the sorting of some 
strings, which would require reindexing. Whether your actual data trips 
into any of these changes is another matter.


You could check by doing something equivalent to this on every 
collatable column with an index built on it, in every table:


8<---
WITH t(s) AS (SELECT <column> FROM <table> ORDER BY 1)
 SELECT md5(string_agg(t.s, NULL)) FROM t;
8<---

Check the before and after glibc upgrade result -- if it is the same, 
you are good to go. If not, rebuild the index before *any* DML is done 
to the table.
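
To get a list of candidate indexes to check (and possibly rebuild), a
rough catalog query along these lines can help -- a sketch only; it
ignores expression indexes, partition bounds, and constraints on
collatable columns, which need separate attention:

8<---
SELECT DISTINCT i.indexrelid::regclass AS index_name,
                i.indrelid::regclass   AS table_name
  FROM pg_index i
  JOIN pg_attribute a ON a.attrelid = i.indrelid
                     AND a.attnum = ANY (i.indkey)
  JOIN pg_class c ON c.oid = i.indrelid
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE a.attcollation <> 0
   AND n.nspname NOT IN ('pg_catalog', 'information_schema')
 ORDER BY 2, 1;
8<---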


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: glibc updarte 2.31 to 2.38

2024-09-19 Thread Joe Conway

On 9/19/24 13:07, Joe Conway wrote:

On 9/19/24 11:14, Tom Lane wrote:

Paul Foerster  writes:

we have SLES 15.5 which has glibc 2.31. Our admin told us that
he's about to install the SLES 15.6 update which contains glibc
2.38.



2) Does the glibc update have any impact?

Maybe.  We don't really track glibc changes, so I can't say for sure,
but it might be advisable to reindex indexes on string columns.



Every glibc major version change potentially impacts the sorting of some
strings, which would require reindexing. Whether your actual data trips
into any of these changes is another matter.

You could check by doing something equivalent to this on every
collatable column with an index built on it, in every table:

8<---
WITH t(s) AS (SELECT <column> FROM <table> ORDER BY 1)
   SELECT md5(string_agg(t.s, NULL)) FROM t;
8<---

Check the before and after glibc upgrade result -- if it is the same,
you are good to go. If not, rebuild the index before *any* DML is done
to the table.



... and I should have mentioned that in a similar way, if you have any 
tables that are partitioned by range on collatable columns, the 
partition boundaries potentially are affected. Similarly, constraints 
involving expressions on collatable columns may be affected.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: glibc updarte 2.31 to 2.38

2024-09-19 Thread Joe Conway

On 9/19/24 13:56, Paul Foerster wrote:

On 19 Sep 2024, at 17:14, Tom Lane  wrote:
Maybe.  We don't really track glibc changes, so I can't say for sure,
but it might be advisable to reindex indexes on string columns.



Advisable is a word I unfortunately can't do much with. We have
terabytes and terabytes of data in hundreds of databases each having
potentially hundreds of columns that are candidates. Just reindexing
and taking down applications during that time is not an option in a
24x7 high availability environment.


See my thread-adjacent email, but suffice to say that if there are 
collation differences that do affect your tables/data, and you allow any 
inserts or updates, you may wind up with corrupted data (e.g. duplicate 
data in your otherwise unique indexes/primary keys).


For more examples about that see 
https://joeconway.com/presentations/glibc-SCaLE21x-2024.pdf


A potential alternative for you (discussed at the end of that 
presentation) would be to create a new branch based on your original 
SLES 15.5 glibc RPM equivalent to this:


https://github.com/awslabs/compat-collation-for-glibc/tree/2.17-326.el7

There is likely a non-trivial amount of work involved (the port from the 
AL2 rpm to the RHEL7 rpm took me the better part of a couple of days), 
but once done your collation is frozen to the specific version you had 
on 15.5.



--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Automatic upgrade of passwords from md5 to scram-sha256

2025-01-13 Thread Joe Conway

On 1/12/25 17:59, Tom Lane wrote:

"Peter J. Holzer"  writes:

The web framework Django will automatically and transparently rehash any
password with the currently preferred algorithm if it isn't stored that
way already.


Really?  That implies that the framework has access to the original
cleartext password, which is a security fail already.


Can PostgreSQL do that, too? (I haven't found anything)


No.  The server has only the hashed password, it can't reconstruct
the original.


If the password for the user is stored as an MD5 hash, the server
replies to the startup message with an AuthenticationCleartextPassword
respnse to force the client to send the password in the clear
(obviously you only want to do that if the connection is TLS-encrypted
or otherwise safe from eavesdropping).


I think this idea is a nonstarter, TLS or not.  We're generally moving
in the direction of never letting the server see cleartext passwords.
It's already possible to configure libpq to refuse such requests
(see require_auth parameter), although that hasn't been made the
default.



Given PQchangePassword[1] in pg17, I wonder if the next step could be to 
have libpq somehow know/detect that an algorithm change is needed and 
execute that (or some equivalent) from the client side? And presumably 
we could ask pgjdbc to implement something similar.
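
In the meantime it is at least easy to find which roles still have an
md5 verifier stored (requires superuser, since it reads pg_authid); each
such user can then re-save their password, e.g. with \password in psql,
once password_encryption is set to scram-sha-256:

8<---
SELECT rolname
  FROM pg_authid
 WHERE rolpassword LIKE 'md5%';
8<---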



Joe

[1] 
https://www.postgresql.org/docs/17/libpq-misc.html#LIBPQ-PQCHANGEPASSWORD

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Need help understanding has_function_privilege

2025-03-20 Thread Joe Conway

On 3/19/25 04:24, Cosimo Simeone wrote:

Hi!
I'd need help understanding what i am doing wrong...

Where am I:
=# \c
psql (15.8 (Homebrew), server 14.15 (Debian 14.15-1.pgdg120+1))
You are now connected to database "postgres" as user "postgres".

Init:
=# create role my_user;
=# create schema my_schema;
=# create function my_schema.my_func(p1 text) returns integer as
$$select 1::integer;$$ language sql;


See:

https://www.postgresql.org/docs/current/sql-createfunction.html#:~:text=execute%20privilege%20is%20granted%20to%20PUBLIC
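
In other words, with the objects from your example (a quick sketch):

8<---
SELECT has_function_privilege('my_user', 'my_schema.my_func(text)', 'execute');
-- t, because EXECUTE is granted to PUBLIC by default

REVOKE EXECUTE ON FUNCTION my_schema.my_func(text) FROM PUBLIC;

SELECT has_function_privilege('my_user', 'my_schema.my_func(text)', 'execute');
-- now f (assuming my_user is not the owner and has no other grant)
8<---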

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Kubernetes, cgroups v2 and OOM killer - how to avoid?

2025-04-08 Thread Joe Conway

On 4/8/25 13:58, Ancoron Luciferis wrote:

On 2025-04-07 15:21, Joe Conway wrote:

On 4/5/25 07:53, Ancoron Luciferis wrote:

I've been investigating this topic every now and then but to this day
have not come to a setup that consistently leads to a PostgreSQL backend
process receiving an allocation error instead of being killed externally
by the OOM killer.

Why this is a problem for me? Because while applications are accessing
their DBs (multiple services having their own DBs, some high-frequency),
the whole server goes into recovery and kills all backends/connections.

While my applications are written to tolerate that, it also means that
at that time, esp. for the high-frequency apps, events are piling up,
which then leads to a burst as soon as connectivity is restored. This in
turn leads to peaks in resource usage in other places (event store,
in-memory buffers from apps, ...), which sometimes leads to a series of
OOM killer events being triggered, just because some analytics query
went overboard.

Ideally, I'd find a configuration that only terminates one backend but
leaves the others working.

I am wondering whether there is any way to receive a real ENOMEM inside
a cgroup as soon as I try to allocate beyond its memory.max, instead of
relying on the OOM killer.

I know the recommendation is to have vm.overcommit_memory set to 2, but
then that affects all workloads on the host, including critical infra
like the kubelet, CNI, CSI, monitoring, ...

I have already gone through and tested the obvious:

https://www.postgresql.org/docs/current/kernel-resources.html#LINUX- 
MEMORY-OVERCOMMIT


Importantly vm.overcommit_memory set to 2 only matters when memory is 
constrained at the host level.


As soon as you are running in a cgroup with a hard memory limit, 
vm.overcommit_memory is irrelevant.


You can have terabytes of free memory on the host, but if cgroup memory 
usage exceeds memory.limit (cgv1) or memory.max (cgv2) the OOM killer 
will pick the process in the cgroup with the highest oom_score and whack 
it.


Unfortunately there is no equivalent to vm.overcommit_memory within the 
cgroup.



And yes, I know that Linux cgroups v2 memory.max is not an actual hard
limit:

https://www.kernel.org/doc/html/latest/admin-guide/cgroup- 
v2.html#memory-interface-files


Read that again -- memory.max *is* a hard limit (same as memory.limit in 
cgv1).


   "memory.max

     A read-write single value file which exists on non-root cgroups. The
     default is “max”.

     Memory usage hard limit. This is the main mechanism to limit memory
     usage of a cgroup. If a cgroup’s memory usage reaches this limit and
     can’t be reduced, the OOM killer is invoked in the cgroup."


Yes, I know it says "hard limit", but then any app still can go beyond
(might just be on me here to assume any "hard limit" to imply an actual
error when trying to go beyond). The OOM killer then will kick in
eventually, but not in any way that any process inside the cgroup could
prevent. So there is no signal that the app could react to saying "hey,
you just went beyond what you're allowed, please adjust before I kill you".


No, that really is a hard limit and the OOM killer is *really* fast. 
Once that is hit there is no time to intervene. The soft limit 
(memory.high) is the one you want for that.


Or you can monitor PSI and try to anticipate problems, but that is 
difficult at best. If you want to see how that is done, check out 
senpai: https://github.com/facebookincubator/senpai/blob/main/README.md



If you want a soft limit use memory.high.

   "memory.high

     A read-write single value file which exists on non-root cgroups. The
     default is “max”.

     Memory usage throttle limit. If a cgroup’s usage goes over the high
     boundary, the processes of the cgroup are throttled and put under
     heavy reclaim pressure.

     Going over the high limit never invokes the OOM killer and under
     extreme conditions the limit may be breached. The high limit should
     be used in scenarios where an external process monitors the limited
     cgroup to alleviate heavy reclaim pressure.

You want to be using memory.high rather than memory.max.


Hm, so solely relying on reclaim? I think that'll just get the whole
cgroup into ultra-slow mode and would not actually prevent too much
memory allocation. While this may work out just fine for the PostgreSQL
instance, it'll for sure have effects on the other workloads on the same
node (which I have apparently, more PG instances).

Apparently, I also don't see a way to even try this out in a Kubernetes
environment, since there doesn't seem to be a way to set this field
through some workload manifests field.


Yeah, that part I have no idea about. I quit looking at kubernetes 
related things about 3 years ago. Although, this link seems to indicate 
there is a way related to how it does QoS: 
https://kubernetes.

Re: Kubernetes, cgroups v2 and OOM killer - how to avoid?

2025-04-08 Thread Joe Conway

On 4/5/25 07:53, Ancoron Luciferis wrote:

I've been investigating this topic every now and then but to this day
have not come to a setup that consistently leads to a PostgreSQL backend
process receiving an allocation error instead of being killed externally
by the OOM killer.

Why this is a problem for me? Because while applications are accessing
their DBs (multiple services having their own DBs, some high-frequency),
the whole server goes into recovery and kills all backends/connections.

While my applications are written to tolerate that, it also means that
at that time, esp. for the high-frequency apps, events are piling up,
which then leads to a burst as soon as connectivity is restored. This in
turn leads to peaks in resource usage in other places (event store,
in-memory buffers from apps, ...), which sometimes leads to a series of
OOM killer events being triggered, just because some analytics query
went overboard.

Ideally, I'd find a configuration that only terminates one backend but
leaves the others working.

I am wondering whether there is any way to receive a real ENOMEM inside
a cgroup as soon as I try to allocate beyond its memory.max, instead of
relying on the OOM killer.

I know the recommendation is to have vm.overcommit_memory set to 2, but
then that affects all workloads on the host, including critical infra
like the kubelet, CNI, CSI, monitoring, ...

I have already gone through and tested the obvious:

https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT


Importantly vm.overcommit_memory set to 2 only matters when memory is 
constrained at the host level.


As soon as you are running in a cgroup with a hard memory limit, 
vm.overcommit_memory is irrelevant.


You can have terabytes of free memory on the host, but if cgroup memory 
usage exceeds memory.limit (cgv1) or memory.max (cgv2) the OOM killer 
will pick the process in the cgroup with the highest oom_score and whack it.


Unfortunately there is no equivalent to vm.overcommit_memory within the 
cgroup.



And yes, I know that Linux cgroups v2 memory.max is not an actual hard
limit:

https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#memory-interface-files


Read that again -- memory.max *is* a hard limit (same as memory.limit in 
cgv1).


  "memory.max

A read-write single value file which exists on non-root cgroups. The
default is “max”.

Memory usage hard limit. This is the main mechanism to limit memory
usage of a cgroup. If a cgroup’s memory usage reaches this limit and
can’t be reduced, the OOM killer is invoked in the cgroup."


If you want a soft limit use memory.high.

  "memory.high

A read-write single value file which exists on non-root cgroups. The
default is “max”.

Memory usage throttle limit. If a cgroup’s usage goes over the high
boundary, the processes of the cgroup are throttled and put under
heavy reclaim pressure.

Going over the high limit never invokes the OOM killer and under
extreme conditions the limit may be breached. The high limit should
be used in scenarios where an external process monitors the limited
cgroup to alleviate heavy reclaim pressure.

You want to be using memory.high rather than memory.max.

Also, I don't know what kubernetes recommends these days, but it used to 
require you to disable swap. In more recent versions of kubernetes you 
are able to run with swap enabled but I have no idea what the default is 
-- make sure you run with swap enabled.


The combination of some swap being available, and the throttling under 
heavy reclaim will likely mitigate your problems.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Moving from Linux to Linux?

2025-03-11 Thread Joe Conway

On 3/11/25 12:30, Devrim Gündüz wrote:
Streaming replication won't work between SLES and RHEL, mostly because 
of glibc incompatibilities. Use logical replication.


To be clear, it probably will "work" for some definition of "work" as 
long as the architecture is the same. In other words if you try it, it 
might well appear to work ok.


However as Devrim correctly points out, differences in glibc versions 
almost certainly will mean that you will have corrupt indexes on 
collatable columns. These indexes can be fixed via REINDEX, but if you 
don't do that *before* allowing any DML you risk further complications 
such as inability to REINDEX due to duplicate primary key values.
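
For example, after such a move, and before opening the database up for
writes, something along these lines per database (the database name is a
placeholder and must be the one you are connected to):

8<---
REINDEX DATABASE mydb;
-- or target individual indexes with REINDEX INDEX CONCURRENTLY
-- if the database is already taking traffic
8<---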


HTH,

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Moving from Linux to Linux?

2025-03-13 Thread Joe Conway

On 3/13/25 06:10, Paul Foerster wrote:

The other option, which may be equally untenable, is to upgrade in-
place to pg17 and convert everything to use the new built-in
collation provider. That ought to be portable across different
versions of Linux.



> Is C.UTF8 really the same as en_US.UTF8? I ask because though we use
> en_US.UTF8, we are located in Switzerland and using non English
> characters is not exactly the exception. We have characters from all
> over the world in our databases. There must be no sorting
> differences between en_US.UTF8 and C.UTF8. Otherwise we will run
> into trouble with unhappy customers. So, C.UTF8 would only be an
> option if the collation would be identical.


Definitely not exactly the same. It does handle all the same characters 
(UTF8).
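
For example (a quick sketch; "C" sorts by code point just like the 
builtin C.UTF8 locale does, and "en_US.utf8" has to exist as a 
collation on your server):

  -- code point order: 'Zebra' comes first, 'Älg' sorts after 'banana'
  SELECT s FROM unnest(ARRAY['Zebra','apple','Älg','banana']) AS t(s)
  ORDER BY s COLLATE "C";

  -- linguistic en_US order: Älg, apple, banana, Zebra
  SELECT s FROM unnest(ARRAY['Zebra','apple','Älg','banana']) AS t(s)
  ORDER BY s COLLATE "en_US.utf8";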


How often do you really depend on the ordering returned to the client 
being exactly what the end user wants to see? Often the final ordering is 
done in the application, or could be done there. You can also use a 
COLLATE clause (as in the example above) to get exactly the ordering you 
want when you need it.


The builtin collation provider has two big advantages -- 1) it should be 
stable and portable, and 2) it should perform substantially faster (in 
simple tests I have seen 10X speedups).
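
On 17, something along these lines creates a database that uses it (a 
sketch; the database name is just an example):

  CREATE DATABASE appdb
    TEMPLATE template0
    ENCODING 'UTF8'
    LOCALE_PROVIDER builtin
    BUILTIN_LOCALE 'C.UTF-8';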



>> The problem you might find with libicu is that different versions
>> of ICU can have the same issues as different versions of glibc,
>> and you might not have the same ICU version available on SLES and
>> RHEL.



> Yes, I know. As far as I have been told, libicu is far less prone to
> major collation changes than glibc is.


Less prone, yes, but it still happens. And when it happens you can get 
corruption of your indexes, which can lead to things like duplicate 
primary/unique keys and data going to the wrong partition, to cite two 
examples.
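
You can at least detect when that has happened. A sketch (any rows 
returned mean the recorded version of an ICU collation no longer matches 
what the library currently provides):

  SELECT collname, collversion,
         pg_collation_actual_version(oid) AS current_version
  FROM pg_collation
  WHERE collprovider = 'i'
    AND collversion IS DISTINCT FROM pg_collation_actual_version(oid);

On newer releases the per-database default is tracked separately in 
pg_database.datcollversion.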



> Also, libicu offers the possibility to pin a version for a certain
> time. Our sysadmins will naturally not be able to pin a glibc
> version without wrecking an inevitable server upgrade.


Yes, in theory libicu can be pinned more easily than glibc, for sure. 
The reality is that you would either need your Linux distro to provide 
that pinned version as you upgrade, which I don't think any of them do 
(or in the case of SUSE to RHEL they would have to match from the get 
go), or you would have to take on maintaining your own pinned version 
going forward. That latter option is essentially the same as the glibc 
compatibility library approach though, so perhaps not horrible.



>> If you want to explore the compatibility library approach contact
>> me off list and I will try to get you started. It has been a
>> couple of years since I touched it, but when I did it took me a
>> couple of days to get from the AL2 (glibc 2.26) branch (which was
>> done first) to the RHEL 7 (glibc 2.17) branch.



> I just took a quick glance. I don't have a Github account (and also
> don't want one 🤣). I can do a git clone, but that's basically all I
> know. Also, right now, I'm just exploring possibilities. As far as I
> understand the readme on Github, this will replace the glibc on Red
> Hat with one with adapted collation rules? If this is the case, then
> our admins will definitely say no to this.


No, it does not replace glibc. It extracts just the locale functionality 
from glibc into its own portable library, pretty much like libicu. Then 
you can link against it. So for example you wind up with "glibc 2.26 
locale semantics" with Postgres on your Linux distro which has glibc 
2.34 installed. All of the non-locale functionality comes from the 
system glibc 2.34.



>> [1] https://www.joeconway.com/presentations/2025-PGConf.IN-glibc.pdf
>
> This is a really good one. Thanks very much for this.


You should probably watch this presentation in its entirety:
  https://www.youtube.com/watch?v=KTA6oau7tl8
Jeremy does a really good job of dispelling misconceptions, and if I 
remember correctly Jeff Davis talks about the builtin provider.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Moving from Linux to Linux?

2025-03-12 Thread Joe Conway

On 3/12/25 16:55, Paul Foerster wrote:

> Hi Adrian,
>
> On 12 Mar 2025, at 21:26, Adrian Klaver  wrote:
>
>> A good summary of the glibc issue:
>>
>> https://wiki.postgresql.org/wiki/Locale_data_changes
>>
>> With distro information:
>>
>> https://wiki.postgresql.org/wiki/Locale_data_changes#What_Linux_distributions_are_affected


> I know the article, thanks. We needed a full reindex when upgrading from SLES
> 15.2 to 15.3. We're on 15.5 now, partly already on 15.6. Thank god, we didn't
> have that ugly database back then that we have now. But I see the next doomsday
> coming on the horizon. 🤣
>
> Also, we try to get rid of the libc locale provider and switch to libicu,
> wherever possible.


If you are desperate you could start with 
https://github.com/awslabs/compat-collation-for-glibc and create a new 
branch for your current version of SLES/glibc and deploy the resulting 
rpm to RHEL. At least in theory. FWIW I was able to get the glibc locale 
compatibility library for AL2 (glibc 2.26) to work fine on a Linux Mint 
system (glibc 2.34).


For more on the compatibility library you could watch a presentation[1] 
starting here:

  https://youtu.be/0E6O-V8Jato?t=1749

The other option, which may be equally untenable, is to upgrade in-place 
to pg17 and convert everything to use the new built-in collation 
provider. That ought to be portable across different versions of Linux.
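
To see what you are starting from, something like this works (a sketch; 
the datlocprovider and datcollversion columns only exist on 15 and 
later):

  SELECT datname, datcollate, datctype, datlocprovider, datcollversion
  FROM pg_database;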


The problem you might find with libicu is that different versions of ICU 
can have the same issues as different versions of glibc, and you might 
not have the same ICU version available on SLES and RHEL.


If you want to explore the compatibility library approach contact me off 
list and I will try to get you started. It has been a couple of years 
since I touched it, but when I did it took me a couple of days to get 
from the AL2 (glibc 2.26) branch (which was done first) to the RHEL 7 
(glibc 2.17) branch.


HTH,

[1] https://www.joeconway.com/presentations/2025-PGConf.IN-glibc.pdf

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com