Re: Concurrent CTE

2018-04-04 Thread kpi6288
Did you look at this approach using dblink already? 

https://gist.github.com/mjgleaso/8031067

In your situation you will have to modify the example, but it may give you an
idea of where to start.
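
The core of the technique is roughly this (a sketch, not taken from the gist;
the connection string, table names and result column lists are placeholders):

-- requires: CREATE EXTENSION dblink;
SELECT dblink_connect('c1', 'dbname=mydb');
SELECT dblink_connect('c2', 'dbname=mydb');

-- send both statements without waiting; they now run concurrently
SELECT dblink_send_query('c1', 'SELECT * FROM table1');
SELECT dblink_send_query('c2', 'SELECT * FROM table2');

-- collect the results; the column definition lists must match the
-- remote queries (placeholders here)
SELECT * FROM dblink_get_result('c1') AS aa(id int, payload text);
SELECT * FROM dblink_get_result('c2') AS bb(id int, payload text);

SELECT dblink_disconnect('c1');
SELECT dblink_disconnect('c2');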

Klaus

> -Original Message-
> From: Artur Formella 
> Sent: Tuesday, April 3, 2018 22:01
> To: pgsql-general@lists.postgresql.org
> Subject: Concurrent CTE
> 
> Hello!
> We have a lot of big CTEs (~40 statements, ~1000 lines) for very dynamic OLTP
> content, with an average response time of 50-300 ms. Our setup has 96 threads
> (Intel Xeon Gold 6128), 256 GB RAM and 12 SSDs (3 tablespaces). DB size < RAM.
> Simplifying the problem:
> 
> WITH aa AS (
>    SELECT * FROM table1
> ), bb AS (
>    SELECT * FROM table2
> ), cc AS (
>    SELECT * FROM table3
> ), dd AS (
>    SELECT * FROM aa, bb
> ), ee AS (
>    SELECT * FROM aa, bb, cc
> ), ff AS (
>    SELECT * FROM ee, dd
> ), gg AS (
>    SELECT * FROM table4
> ), hh AS (
>    SELECT * FROM aa
> )
> SELECT * FROM gg,hh,ff /* primary statement */
> 
> Execution now:
> time-->
> Thread1: aa | bb | cc | dd | ee | ff | gg | hh | primary
> 
> And the question: is it possible to achieve a more concurrent execution plan
> to reduce the response time? For example:
> Thread1: aa | dd | ff | primary
> Thread2: bb | ee | gg
> Thread3: cc | -- | hh
> 
> Table1, table2 and table3 are located on separate tablespaces and are
> independent.
> Partial results (aa, bb, cc, dd, ee) are quite big and slow (full text
> search, arrays, custom collations, function scans...).
> 
> We are considering abandoning the CTE and rewriting it in RxJava, but we are
> afraid of downloading partial results and sending them back with WHERE IN (...).
> 
> Thanks!
> 
> Artur Formella
> 
> 





Disable TRUST authentication by using ClientAuthentication_hook

2018-07-13 Thread kpi6288
I'd like to disable the TRUST authentication method for certain servers
where modification of pg_hba.conf and restarting a service is fairly easy
for a number of users. 

 

I looked at this example:
https://wiki.postgresql.org/images/e/e3/Hooks_in_postgresql.pdf
It appears that creating a ClientAuthentication_hook and calling
ereport(ERROR) in case Port->HbaLine contains TRUST would do the job. Is that
right?

 

I am aware that this would not make the server entirely secure, but it would
at least make it a bit more difficult to break in.
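
As a side note, on PostgreSQL 10 or later the pg_hba_file_rules view can at
least report any trust entries from SQL (an audit only, not a block):

SELECT line_number, type, database, user_name, address, auth_method
FROM pg_hba_file_rules
WHERE auth_method = 'trust';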

 

Thanks Klaus



Re: Disable TRUST authentication by using ClientAuthentication_hook

2018-07-13 Thread kpi6288
 

> I'm not sure this is such a good idea. You may need the trust authentication
> method, for example if you forgot the superuser password. Otherwise, there's
> a good chance you might use the ClientAuthentication hook to do what you want.

 

Thanks for your feedback. 

 

Klaus

 



Re: Disable TRUST authentication by using ClientAuthentication_hook

2018-07-13 Thread kpi6288
> -Original Message-
> From: Tom Lane 
> 
> > If you're a server admin you can disable the extension (editing the
> > shared_preload_libraries GUC), change the password and then enable the
> > extension again...

I am aware of this and all the other points. 

> Or more to the point: exactly what is the threat model here?  

It is like your garage door: locking it with a simple 50-year-old key is still
better than just wedging it shut. It is certainly not as good as reinforcing
the door and fitting a modern, solid lock to it.

> ISTM that someone with enough privilege to alter pg_hba.conf can probably
> suppress loading of an extension too, so that the security added by this
> idea is not just questionable but completely illusory.

This is a valid point of concern. However, the settings in pg_hba.conf need to
be documented so that IP address ranges etc. can be modified. A few people
have access to this file, and it is likely that they would look into the
manuals and find alternative settings. The configuration of libraries is not
obvious to everyone.

> 
> What would actually move the goalposts a bit is to build a modified server
> which doesn't have the TRUST code path at all, so that there is no question of
> installing an extension or not; then somebody who wants to defeat the
> security needs to be able to replace the server executable.  But you don't
> need any hook if you do that.

That is true, but I came across a discussion in which a proposal to add
build-time options for authentication methods was, for several reasons, not
implemented. I'm trying to avoid modifying the source code if I can. I agree
that I may have to build a modified server if I don't find a better solution.

Regards Klaus




Re: Do we need yet another IDE (SQL development assistant) for PostgreSQL?

2018-07-16 Thread kpi6288
We – and the majority of our customers – are mainly focused on Windows. We use
pgadmin iii and our own assistants. pgadmin iv is still too slow on Windows
compared to pgadmin iii. That is one reason why we still use PostgreSQL 9.6.

 

That said, one requirement for us on a commercial tool would be royalty-free
distribution to our customers. It should, however, provide the functions of
pgadmin iii.

 

Regards Klaus

 

 

From: Dmitry Igrishin 
Sent: Sunday, July 15, 2018 18:59
To: pgsql-gene...@postgresql.org
Subject: Do we need yet another IDE (SQL development assistant) for PostgreSQL?

 

Hello all,

 

Colleagues, there is an idea to develop a commercial IDE for PostgreSQL under
Windows.

At the initial stage it would be not so much an IDE as an assistant for
server-side development.

What features would you like to see in such a tool? Thanks.



CTE with JOIN of two tables is much faster than a regular query

2018-08-18 Thread kpi6288
Running PostgreSQL 9.5 on Windows. 

 

The CTE mentioned below completes the query in 4.5 seconds while the regular
query takes 66 seconds. I read from EXPLAIN ANALYSE that the regular query
starts with a full table scan over "Doc" while the CTE joins the two tables
first and applies the filter condition in the 2nd step. 

I believe that some rows in "Doc" which are not referenced by "F" contain a
large amount of data in the field "szText" and this will slow down the ILIKE
operator. 

 

What can I do to improve the performance of the regular query without using
a CTE? 

 

This is a much simplified extract from a larger application: 

 

CREATE TABLE Doc (
  oID UUID NOT NULL PRIMARY KEY,
  uDocID UUID NOT NULL UNIQUE,
  szText TEXT
);

CREATE TABLE F (
  oID UUID NOT NULL PRIMARY KEY,
  uDocRef UUID,
  CONSTRAINT F_fkey1 FOREIGN KEY (uDocRef) REFERENCES Doc (uDocID)
);

 

-- just in case
ALTER TABLE Doc ALTER uDocID SET STATISTICS 1;
ALTER TABLE Doc ALTER szText SET STATISTICS 1;
VACUUM ANALYSE Doc;

SELECT COUNT(*) FROM Doc;
=> 125946 records

ALTER TABLE F ALTER uDocRef SET STATISTICS 1;
VACUUM ANALYSE F;

SELECT COUNT(*) FROM F;
=> 32605 records

 

Result with CTE: 

 

EXPLAIN ANALYSE
WITH a AS (
  SELECT F.oID, Doc.szText
  FROM F
  JOIN Doc ON F.uDocRef = Doc.udocid
)
SELECT *
FROM a
WHERE szText ILIKE '%480GB%';

 

"CTE Scan on a  (cost=9463.42..10197.03 rows=52 width=48) (actual
time=478.770..4551.613 rows=10 loops=1)"

"  Filter: (sztext ~~* '%480GB%'::text)"

"  Rows Removed by Filter: 32595"

"  CTE a"

"->  Hash Join  (cost=973.61..9463.42 rows=32605 width=359) (actual
time=36.998..100.337 rows=32605 loops=1)"

"  Hash Cond: (doc.udocid = f.udocref)"

"  ->  Seq Scan on doc  (cost=0.00..7691.46 rows=125946 width=359)
(actual time=0.008..18.269 rows=125946 loops=1)"

"  ->  Hash  (cost=566.05..566.05 rows=32605 width=32) (actual
time=35.825..35.825 rows=32605 loops=1)"

"Buckets: 32768  Batches: 1  Memory Usage: 2294kB"

"->  Seq Scan on f  (cost=0.00..566.05 rows=32605 width=32)
(actual time=0.005..14.677 rows=32605 loops=1)"

"Planning time: 4.689 ms"

"Execution time: 4554.893 ms"

 

Result with regular query: 

 

EXPLAIN ANALYSE
SELECT F.oID, Doc.szText
FROM F
JOIN Doc ON F.uDocRef = Doc.udocid
WHERE szText ILIKE '%480GB%';

 

"Hash Join  (cost=8006.56..8694.93 rows=5 width=359) (actual
time=66500.415..66506.978 rows=10 loops=1)"

"  Hash Cond: (f.udocref = doc.udocid)"

"  ->  Seq Scan on f  (cost=0.00..566.05 rows=32605 width=32) (actual
time=0.002..3.143 rows=32605 loops=1)"

"  ->  Hash  (cost=8006.32..8006.32 rows=19 width=359) (actual
time=66500.023..66500.023 rows=16 loops=1)"

"Buckets: 1024  Batches: 1  Memory Usage: 19kB"

"->  Seq Scan on doc  (cost=0.00..8006.32 rows=19 width=359) (actual
time=8864.720..66499.991 rows=16 loops=1)"

"  Filter: (sztext ~~* '%480GB%'::text)"

"  Rows Removed by Filter: 125930"

"Planning time: 263.542 ms"

"Execution time: 66507.003 ms"

 

 

 

 



Re: CTE with JOIN of two tables is much faster than a regular query

2018-08-18 Thread kpi6288



> -Original Message-
> From: Andreas Kretschmer 
> Sent: Saturday, August 18, 2018 12:27
 
> On 18.08.2018 at 11:36, kpi6...@gmail.com wrote:
> > What can I do to improve the performance of the regular query without
> > using a CTE?
> 
> try to rewrite it to a subselect:
> 
> select ... from ... join (select ... from ... where ...) x on ...
> 

Do you mean like this?

EXPLAIN ANALYSE
SELECT F.oID, D.szText
FROM F
JOIN (SELECT Doc.uDocID, Doc.szText FROM Doc WHERE szText ILIKE '%480GB%') AS D
  ON D.uDocID = F.uDocRef;

Just as bad as my regular query: 

"Hash Join  (cost=8006.56..8694.93 rows=5 width=359) (actual
time=66777.898..66784.630 rows=10 loops=1)"
"  Hash Cond: (f.udocref = doc.udocid)"
"  ->  Seq Scan on f  (cost=0.00..566.05 rows=32605 width=32) (actual
time=0.002..3.563 rows=32605 loops=1)"
"  ->  Hash  (cost=8006.32..8006.32 rows=19 width=359) (actual
time=66777.471..66777.471 rows=16 loops=1)"
"Buckets: 1024  Batches: 1  Memory Usage: 19kB"
"->  Seq Scan on doc  (cost=0.00..8006.32 rows=19 width=359) (actual
time=9013.317..66777.438 rows=16 loops=1)"
"  Filter: (sztext ~~* '%480GB%'::text)"
"  Rows Removed by Filter: 125930"
"Planning time: 236.354 ms"
"Execution time: 66784.651 ms"




Re: Re: CTE with JOIN of two tables is much faster than a regular query

2018-08-18 Thread kpi6288
> -Original Message-
> From: Adrian Klaver 
> Sent: Saturday, August 18, 2018 16:24
> 
> To try to replicate what the CTE is doing I would try:
> SELECT  *
> FROM  Doc
> JOIN  (SELECT uDocRef, F.oID, Doc.szText
> FROM F JOIN   Doc ON  F.uDocRef = Doc.udocid) AS D
> ON D.uDocRef = Doc.udocid
> WHERE D.szText ILIKE '%480GB%'

No difference - still starting with the full scan on Doc and lasting 67 
seconds: 

"Nested Loop  (cost=8006.98..8700.40 rows=5 width=750) (actual 
time=66845.857..66852.705 rows=10 loops=1)"
"  ->  Hash Join  (cost=8006.56..8694.93 rows=5 width=391) (actual 
time=66845.838..66852.613 rows=10 loops=1)"
"Hash Cond: (f.udocref = doc_1.udocid)"
"->  Seq Scan on f  (cost=0.00..566.05 rows=32605 width=32) (actual 
time=0.002..3.428 rows=32605 loops=1)"
"->  Hash  (cost=8006.32..8006.32 rows=19 width=359) (actual 
time=66845.431..66845.431 rows=16 loops=1)"
"  Buckets: 1024  Batches: 1  Memory Usage: 19kB"
"  ->  Seq Scan on doc doc_1  (cost=0.00..8006.32 rows=19 
width=359) (actual time=9042.984..66845.398 rows=16 loops=1)"
"Filter: (sztext ~~* '%480GB%'::text)"
"Rows Removed by Filter: 125930"
"  ->  Index Scan using doc_udocid_key on doc  (cost=0.42..1.08 rows=1 
width=375) (actual time=0.008..0.008 rows=1 loops=10)"
"Index Cond: (udocid = f.udocref)"
"Planning time: 252.162 ms"
"Execution time: 66852.737 ms"





Re: CTE with JOIN of two tables is much faster than a regular query

2018-08-18 Thread kpi6288
> -Original Message-
> From: Stephen Frost 
> Sent: Saturday, August 18, 2018 16:39

Hello,

> 
> > What can I do to improve the performance of the regular query without
> > using a CTE?
> 
> You could possibly build a trigram index on the field you're searching, which
> could avoid the full table scan.  Of course, that index could be quite large,
> so there's downsides to that.  If these are words you're looking for then you
> could use PG's full text indexing to build indexes on the words and then use
> that instead.  If you are fine working with words but are concerned about
> misspellings then you can extract out the distinct words, build a trigram
> index on those, find the most similar words based on the input and then
> search for those words using the FTI.
> 
> Unfortunately, we don't currently pay attention to things like average string
> length when considering the cost of performing an 'ilike', so we figure that
> doing the filtering first and then the join will be faster, but that
> obviously falls over in some cases, like this one.  Using the CTE forces PG
> to (today, at least) do the join first, but that isn't really good to rely on.

A trigram index would be a possible help in this particular scenario, but its
size and the cost of updating it in other parts of the application would
probably create other issues. I may try it, though.

But thanks for confirming my assumption. I just thought that it should be
obvious to the optimizer to do the join first and filter on that result. But I
read your post as saying that there is nothing I can do to modify the behavior
of the optimizer. Or is there a way to specify the cost for an operator (ILIKE
in this case) on a specific column?

Thanks 
Klaus




Re: CTE with JOIN of two tables is much faster than a regular query

2018-08-18 Thread kpi6288
> -Original Message-
> From: Tom Lane 
> Sent: Saturday, August 18, 2018 17:29
> 
> Well, it's simpler than that: filter quals are always evaluated at the
lowest
> possible plan level.  

Thank you. This "always" was not clear to me, but it explains a few similar
cases (with not-so-extreme differences) that I could not understand.
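
For the archives: since quals are not pushed down into a subquery that
contains LIMIT or OFFSET, an OFFSET 0 fence should reproduce the join-first
behaviour of the CTE without writing one (a sketch against the tables above):

SELECT *
FROM (
  SELECT F.oID, Doc.szText
  FROM F
  JOIN Doc ON F.uDocRef = Doc.udocid
  OFFSET 0  -- optimization fence: keeps the ILIKE out of the subquery
) AS a
WHERE a.szText ILIKE '%480GB%';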

Regards
Klaus




Re: CTE with JOIN of two tables is much faster than a regular query

2018-08-18 Thread kpi6288
> -Original Message-
> From: Ravi Krishna 
> Sent: Saturday, August 18, 2018 18:25
> 
> > What can I do to improve the performance of the regular query without
> > using a CTE?
> 
> Why do you care?  When I find that I can write SQL three different ways, I
> will go for the most efficient one.  So why not accept the CTE version of
> this SQL?  Just curious.

We're using object mapping / entity frameworks (e.g. XPO, Entity Framework
Core). These frameworks support regular queries out of the box; CTEs require
additional effort and are more difficult to maintain.

Regards
Klaus




Re: CTE with JOIN of two tables is much faster than a regular query

2018-08-19 Thread kpi6288
> -Original Message-
> From: Tom Lane 
> Sent: Saturday, August 18, 2018 17:29
> 
> In any case, given that the ILIKE selects so few rows (and the planner knows
> it!), finding a way to index that is clearly the right answer.

A trigram index took 9 minutes to build but improved the regular query from
67 seconds down to 500 ms. 
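
For reference, the index was presumably along these lines (a sketch, assuming
the pg_trgm extension; the index name is made up):

CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- GIN trigram index so that ILIKE '%...%' can use an index scan
CREATE INDEX doc_sztext_trgm_idx ON Doc USING gin (szText gin_trgm_ops);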

Although this is an impressive improvement, I'm afraid that the index might
create delays in other parts of the application (INSERT / UPDATE). We will
probably rework the design of this particular table.

Thanks to everyone who helped me in this matter. 

Regards 

Klaus 




Re: Re: CTE with JOIN of two tables is much faster than a regular query

2018-08-19 Thread kpi6288
> -Original Message-
> From: Tim Cross 
> Sent: Sunday, August 19, 2018 04:57
> >
> > We're using object mapping / entity frameworks (e.g. XPO, Entity
> > Framework Core). These frameworks support regular queries out of the
> > box; CTEs require additional effort and are more difficult to maintain.
> >
> 
> Ah, another reason to avoid object mapping/entity frameworks! I guess
> really the same reason - loss of flexibility and expressive power.

While I agree that you lose control over certain details, we are overall quite
happy using the frameworks.

The frameworks nowadays provide the ability to call procedures if required - 
but using the objects directly is more convenient for the developers. SQL 
procedures add (just like a CTE) an additional layer to the application design 
which needs maintenance. That's fine if it really helps overall but we try to 
avoid it if it isn't needed. 

Regards
Klaus





Re: Forks of pgadmin3?

2019-03-22 Thread kpi6288
This is probably my 10th attempt to move from pgadminIII to pgadmin4. At least
the performance has significantly improved over time and now seems acceptable.

The biggest drawback, however, is that all elements are locked up in one
browser window – I cannot find any option to detach a query window and put it
on a different monitor.

95% of my time I use pgadminIII just to type select and update statements and 
review the output rows. 

I know that I can do this in psql but it’s not handy with many columns. 

For that reason we currently stay with pgadminIII (and this is also one of
several reasons for us to delay any move from 9.6 to a more recent version).

 

Klaus

 

From: Tony Shelver 
Sent: Friday, March 22, 2019 15:34
Cc: PG-General Mailing List 
Subject: Re: Forks of pgadmin3?

 

Or just persevere with pgadmin4 for a few months? It's pretty common for
people to hate any major changes to a tool that they are very comfortable with.

 

This year I've invested the time to learn a few new toolsets (not necessarily
on PostgreSQL) and found it to be well worthwhile.

 

At least pgAdmin4 is up to date with all the new features in 11.



Re: Forks of pgadmin3?

2019-03-25 Thread kpi6288
Thanks for the link, but we're very reluctant to use Java-based programs.
The main reason is that we need to do some work on servers whose admins simply
do not allow Java to be installed.
The screenshots look very promising, however.

Regards
Klaus

> -Original Message-
> From: Thomas Kellerer 
> Sent: Monday, March 25, 2019 12:06
> To: pgsql-general@lists.postgresql.org
> Subject: Re: Forks of pgadmin3?
> 
> kpi6...@gmail.com wrote on 22.03.2019 at 17:25:
> > 95% of my time I use pgadminIII just to type select and update
> > statements and review the output rows.
> >
> > I know that I can do this in psql but it’s not handy with many
> > columns.
> 
> An alternative you might want to try is SQL Workbench/J:
> https://www.sql-workbench.eu/
> 
> Full disclosure: I am the author of that tool.
> 
> It's a cross DBMS tool, but my primary focus is Postgres.
> 
> It focuses on running SQL queries rather than being a DBA tool.
> 
> Regards
> Thomas





Re: Forks of pgadmin3?

2019-03-25 Thread kpi6288
Thank you, I was not aware of this option - this certainly helps. 

 

Regards

Klaus

 

From: Murtuza Zabuawala 
Sent: Friday, March 22, 2019 18:48
To: kpi6...@gmail.com
Cc: PostgreSQL mailing lists 
Subject: Re: Forks of pgadmin3?

 

Opening the Query Tool or Debugger window in a separate browser tab is a
configurable option in pgAdmin4. Go to:

 

File -> Preferences -> Query Tool -> Display -> Open in new browser tab 

Set it to: True

 



Use left hand column for null values

2018-01-19 Thread kpi6288
I'm trying to fill up columns containing NULL with the most recent NOT NULL
value from left to right. 

Example: 

Select 2, 1, null, null, 3 

Should be converted into 

2, 1, 1, 1, 3

 

The following query works but I wonder if there is an easier way for tables
with 50 or more columns: 

 

with a (c1, c2, c3, c4, c5) as (
  values (2, 1, null::int, null::int, 3)
)
select
  c1,
  coalesce (c2, c1) as c2,
  coalesce (c3, c2, c1) as c3,
  coalesce (c4, c3, c2, c1) as c4,
  coalesce (c5, c4, c3, c2, c1) as c5
from a

 

Thanks

Klaus

 

 



Re: Use left hand column for null values

2018-01-19 Thread kpi6288
> [David G. Johnston]
> My $0.02
> 
> Unless you have some need to generalize I'd say just brute-force it and be
> done.
> 
> You could maybe play with arrays to get something that looks different but I
> don't think it would be much shorter to code or easier to understand.

 

My concern was performance, because I feared that coalesce would evaluate e.g.
column c1 50 times. But I did some testing and it seems to scale linearly.
100,000 rows with 50 columns take 25 seconds on my test machine. Not exciting,
but something I can work with.

 

I already played with arrays, but what I came up with was slower than the
plain coalesce.
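
For the archives, one such array variant (a sketch; for each position it picks
the rightmost non-null element at or before that position):

WITH a(c1, c2, c3, c4, c5) AS (
  VALUES (2, 1, NULL::int, NULL::int, 3)
), packed AS (
  SELECT ARRAY[c1, c2, c3, c4, c5] AS v FROM a
)
SELECT ARRAY(
  SELECT (SELECT v[j]
          FROM generate_subscripts(v, 1) AS j
          WHERE j <= i AND v[j] IS NOT NULL
          ORDER BY j DESC
          LIMIT 1)
  FROM generate_subscripts(v, 1) AS i
  ORDER BY i
) AS filled
FROM packed;
-- => {2,1,1,1,3}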

 

 

Klaus