Re: Concurrent CTE
Did you look at this approach using dblink already? https://gist.github.com/mjgleaso/8031067

In your situation you will have to modify the example, but it may give you an idea where to start.

Klaus

> -Original Message-
> From: Artur Formella
> Sent: Tuesday, April 3, 2018 22:01
> To: pgsql-general@lists.postgresql.org
> Subject: Concurrent CTE
>
> Hello!
> We have a lot of big CTEs (~40 statements, ~1000 lines) for very dynamic OLTP
> content and avg response time 50-300ms. Our setup has 96 threads (Intel
> Xeon Gold 6128), 256 GB RAM and 12 SSDs (3 tablespaces). DB size < RAM.
> Simplifying the problem:
>
> WITH aa AS (
>   SELECT * FROM table1
> ), bb AS (
>   SELECT * FROM table2
> ), cc AS (
>   SELECT * FROM table3
> ), dd AS (
>   SELECT * FROM aa,bb
> ), ee AS (
>   SELECT * FROM aa,bb,cc
> ), ff AS (
>   SELECT * FROM ee,dd
> ), gg AS (
>   SELECT * FROM table4
> ), hh AS (
>   SELECT * FROM aa
> )
> SELECT * FROM gg,hh,ff /* primary statement */
>
> Execution now:
> time-->
> Thread1: aa | bb | cc | dd | ee | ff | gg | hh | primary
>
> And the question: is it possible to achieve a more concurrent execution plan to
> reduce the response time? For example:
> Thread1: aa | dd | ff | primary
> Thread2: bb | ee | gg
> Thread3: cc | -- | hh
>
> Table1, table2 and table3 are located on separate tablespaces and are
> independent.
> Partial results (aa,bb,cc,dd,ee) are quite big and slow (full text search, arrays,
> custom collations, function scans...).
>
> We are considering dropping the CTE approach and rewriting it in RxJava, but we
> are afraid of downloading partial results and sending them back with WHERE IN (...).
>
> Thanks!
>
> Artur Formella
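One way to get that kind of overlap, in the spirit of the linked gist, is dblink's asynchronous functions. A minimal sketch, assuming the dblink extension is installed and a database named mydb; the connection names and column lists are placeholders, not tested code:

-- Open two extra connections back into the same database.
SELECT dblink_connect('c1', 'dbname=mydb');
SELECT dblink_connect('c2', 'dbname=mydb');

-- Fire both independent queries without waiting for results;
-- each runs in its own backend process.
SELECT dblink_send_query('c1', 'SELECT id, payload FROM table1');
SELECT dblink_send_query('c2', 'SELECT id, payload FROM table2');

-- Collect the results; the two scans have been running concurrently.
SELECT * FROM dblink_get_result('c1') AS aa(id int, payload text);
SELECT * FROM dblink_get_result('c2') AS bb(id int, payload text);

SELECT dblink_disconnect('c1');
SELECT dblink_disconnect('c2');

The downside Artur mentions remains, though: the partial results travel through the extra connections instead of staying inside a single plan.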
Disable TRUST authentication by using ClientAuthentication_hook
I'd like to disable the TRUST authentication method on certain servers where modifying pg_hba.conf and restarting the service is fairly easy for a number of users.

I looked at this example: https://wiki.postgresql.org/images/e/e3/Hooks_in_postgresql.pdf

It appears that creating a ClientAuthentication_hook and calling ereport(ERROR) in case Port->HbaLine contains TRUST would do the job. Is that right?

I am aware that this would not make the server entirely secure, but it would at least make it a bit more difficult to get in.

Thanks
Klaus
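A minimal sketch of such a hook, following the pattern from the slides linked above; the file and function names are invented and this is untested, not a finished implementation:

/* reject_trust.c - hypothetical sketch of the hook described above */
#include "postgres.h"
#include "fmgr.h"
#include "libpq/auth.h"   /* ClientAuthentication_hook */
#include "libpq/hba.h"    /* uaTrust */

PG_MODULE_MAGIC;

static ClientAuthentication_hook_type prev_hook = NULL;

static void
reject_trust_auth(Port *port, int status)
{
    /* chain to any previously installed hook first */
    if (prev_hook)
        prev_hook(port, status);

    /* refuse any connection whose matched pg_hba.conf line says "trust" */
    if (port->hba && port->hba->auth_method == uaTrust)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION),
                 errmsg("trust authentication is disabled on this server")));
}

void
_PG_init(void)
{
    prev_hook = ClientAuthentication_hook;
    ClientAuthentication_hook = reject_trust_auth;
}

Loaded via shared_preload_libraries, this would abort the connection before the session starts whenever the matched pg_hba.conf entry uses trust.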
Re: Disable TRUST authentication by using ClientAuthentication_hook
> I'm not sure this is such a good idea. You may need the trust authentication method,
> for example if you forgot the superuser password. Otherwise, there's a good chance
> you might use the ClientAuthentication hook to do what you want.

Thanks for your feedback.

Klaus
Re: Disable TRUST authentication by using ClientAuthentication_hook
> -Original Message-
> From: Tom Lane
>
> > If you're a server admin you can disable the extension (editing the
> > shared_preload_libraries GUC), change the password and then enable the
> > extension again...

I am aware of this and all the other points.

> Or more to the point: exactly what is the threat model here?

It is similar to your garage door: locking it with a simple 50-year-old key is still better than just wedging it shut. It is certainly not as good as reinforcing the door and fitting a modern, solid lock.

> ISTM that someone with enough privilege to alter pg_hba.conf can probably suppress
> loading of an extension too, so that the security added by this idea is not just
> questionable but completely illusory.

This is a valid point of concern. However, the settings in pg_hba.conf need to be documented to allow modification of IP address ranges etc. A few people have access to this, and it is likely that they look into the manuals and find alternative settings. The configuration of libraries is not clear to everyone.

> What would actually move the goalposts a bit is to build a modified server
> which doesn't have the TRUST code path at all, so that there is no question of
> installing an extension or not; then somebody who wants to defeat the
> security needs to be able to replace the server executable. But you don't
> need any hook if you do that.

That is true, but I came across a discussion where a proposal to add build-time options for authentication methods was rejected for several reasons. I'm trying to avoid modifying the source code if I can. I agree that I may have to build a modified server if I don't find a better solution.

Regards
Klaus
Re: Do we need yet another IDE (SQL development assistant) for PostgreSQL?
We, and the majority of our customers, are mainly focused on Windows. We use pgAdmin III and our own assistants. pgAdmin 4 is still too slow on Windows compared to pgAdmin III. That is one reason why we still use PostgreSQL 9.6.

That said, one requirement on a commercial tool for us would be royalty-free distribution to our customers. It should, however, provide the functions of pgAdmin III.

Regards
Klaus

From: Dmitry Igrishin
Sent: Sunday, July 15, 2018 18:59
To: pgsql-gene...@postgresql.org
Subject: Do we need yet another IDE (SQL development assistant) for PostgreSQL?

Hello all, Colleagues.

There is an idea to develop a commercial IDE for PostgreSQL under Windows. At the initial stage, not so much an IDE as an assistant for server-side development. What features would you like to see in such an instrument?

Thanks.
CTE with JOIN of two tables is much faster than a regular query
Running PostgreSQL 9.5 on Windows.

The CTE mentioned below completes the query in 4.5 seconds while the regular query takes 66 seconds. I read from EXPLAIN ANALYSE that the regular query starts with a full table scan over "Doc" while the CTE joins the two tables first and applies the filter condition in the 2nd step.

I believe that some rows in "Doc" which are not referenced by "F" contain a large amount of data in the field "szText", and this slows down the ILIKE operator.

What can I do to improve the performance of the regular query without using a CTE?

This is a much simplified extract from a larger application:

CREATE TABLE Doc (
  oID    UUID NOT NULL PRIMARY KEY,
  uDocID UUID NOT NULL UNIQUE,
  szText TEXT
);

CREATE TABLE F (
  oID     UUID NOT NULL PRIMARY KEY,
  uDocRef UUID,
  CONSTRAINT F_fkey1 FOREIGN KEY (uDocRef) REFERENCES Doc (uDocID)
);

-- just in case
ALTER TABLE Doc ALTER uDocID SET STATISTICS 1;
ALTER TABLE Doc ALTER szText SET STATISTICS 1;
VACUUM ANALYSE Doc;
SELECT COUNT(*) FROM Doc; => 125946 records

ALTER TABLE F ALTER uDocRef SET STATISTICS 1;
VACUUM ANALYSE F;
SELECT COUNT(*) FROM F; => 32605 records

Result with CTE:

EXPLAIN ANALYSE
WITH a AS (
  SELECT F.oID, Doc.szText
  FROM F
  JOIN Doc ON F.uDocRef = Doc.udocid
)
SELECT * FROM a WHERE szText ILIKE '%480GB%';

"CTE Scan on a  (cost=9463.42..10197.03 rows=52 width=48) (actual time=478.770..4551.613 rows=10 loops=1)"
"  Filter: (sztext ~~* '%480GB%'::text)"
"  Rows Removed by Filter: 32595"
"  CTE a"
"    ->  Hash Join  (cost=973.61..9463.42 rows=32605 width=359) (actual time=36.998..100.337 rows=32605 loops=1)"
"          Hash Cond: (doc.udocid = f.udocref)"
"          ->  Seq Scan on doc  (cost=0.00..7691.46 rows=125946 width=359) (actual time=0.008..18.269 rows=125946 loops=1)"
"          ->  Hash  (cost=566.05..566.05 rows=32605 width=32) (actual time=35.825..35.825 rows=32605 loops=1)"
"                Buckets: 32768  Batches: 1  Memory Usage: 2294kB"
"                ->  Seq Scan on f  (cost=0.00..566.05 rows=32605 width=32) (actual time=0.005..14.677 rows=32605 loops=1)"
"Planning time: 4.689 ms"
"Execution time: 4554.893 ms"

Result with regular query:

EXPLAIN ANALYSE
SELECT F.oID, Doc.szText
FROM F
JOIN Doc ON F.uDocRef = Doc.udocid
WHERE szText ILIKE '%480GB%';

"Hash Join  (cost=8006.56..8694.93 rows=5 width=359) (actual time=66500.415..66506.978 rows=10 loops=1)"
"  Hash Cond: (f.udocref = doc.udocid)"
"  ->  Seq Scan on f  (cost=0.00..566.05 rows=32605 width=32) (actual time=0.002..3.143 rows=32605 loops=1)"
"  ->  Hash  (cost=8006.32..8006.32 rows=19 width=359) (actual time=66500.023..66500.023 rows=16 loops=1)"
"        Buckets: 1024  Batches: 1  Memory Usage: 19kB"
"        ->  Seq Scan on doc  (cost=0.00..8006.32 rows=19 width=359) (actual time=8864.720..66499.991 rows=16 loops=1)"
"              Filter: (sztext ~~* '%480GB%'::text)"
"              Rows Removed by Filter: 125930"
"Planning time: 263.542 ms"
"Execution time: 66507.003 ms"
Re: CTE with JOIN of two tables is much faster than a regular query
> -Original Message-
> From: Andreas Kretschmer
> Sent: Saturday, August 18, 2018 12:27
>
> On 18.08.2018 at 11:36, kpi6...@gmail.com wrote:
> > What can I do to improve the performance of the regular query without
> > using a CTE?
>
> try to rewrite it to a subselect:
>
> select ... from ... join (select ... from ... where ...) x on ...

Do you mean like this?

EXPLAIN ANALYSE
SELECT F.oID, D.szText
FROM F
JOIN (SELECT Doc.uDocID, Doc.szText
      FROM Doc
      WHERE szText ILIKE '%480GB%') AS D
  ON D.uDocID = F.uDocRef;

Just as bad as my regular query:

"Hash Join  (cost=8006.56..8694.93 rows=5 width=359) (actual time=66777.898..66784.630 rows=10 loops=1)"
"  Hash Cond: (f.udocref = doc.udocid)"
"  ->  Seq Scan on f  (cost=0.00..566.05 rows=32605 width=32) (actual time=0.002..3.563 rows=32605 loops=1)"
"  ->  Hash  (cost=8006.32..8006.32 rows=19 width=359) (actual time=66777.471..66777.471 rows=16 loops=1)"
"        Buckets: 1024  Batches: 1  Memory Usage: 19kB"
"        ->  Seq Scan on doc  (cost=0.00..8006.32 rows=19 width=359) (actual time=9013.317..66777.438 rows=16 loops=1)"
"              Filter: (sztext ~~* '%480GB%'::text)"
"              Rows Removed by Filter: 125930"
"Planning time: 236.354 ms"
"Execution time: 66784.651 ms"
Re: Re: CTE with JOIN of two tables is much faster than a regular query
> -Original Message-
> From: Adrian Klaver
> Sent: Saturday, August 18, 2018 16:24
>
> To try to replicate what the CTE is doing I would try:
>
> SELECT *
> FROM Doc
> JOIN (SELECT uDocRef, F.oID, Doc.szText
>       FROM F JOIN Doc ON F.uDocRef = Doc.udocid) AS D
>   ON D.uDocRef = Doc.udocid
> WHERE D.szText ILIKE '%480GB%'

No difference - still starting with the full scan on Doc and lasting 67 seconds:

"Nested Loop  (cost=8006.98..8700.40 rows=5 width=750) (actual time=66845.857..66852.705 rows=10 loops=1)"
"  ->  Hash Join  (cost=8006.56..8694.93 rows=5 width=391) (actual time=66845.838..66852.613 rows=10 loops=1)"
"        Hash Cond: (f.udocref = doc_1.udocid)"
"        ->  Seq Scan on f  (cost=0.00..566.05 rows=32605 width=32) (actual time=0.002..3.428 rows=32605 loops=1)"
"        ->  Hash  (cost=8006.32..8006.32 rows=19 width=359) (actual time=66845.431..66845.431 rows=16 loops=1)"
"              Buckets: 1024  Batches: 1  Memory Usage: 19kB"
"              ->  Seq Scan on doc doc_1  (cost=0.00..8006.32 rows=19 width=359) (actual time=9042.984..66845.398 rows=16 loops=1)"
"                    Filter: (sztext ~~* '%480GB%'::text)"
"                    Rows Removed by Filter: 125930"
"  ->  Index Scan using doc_udocid_key on doc  (cost=0.42..1.08 rows=1 width=375) (actual time=0.008..0.008 rows=1 loops=10)"
"        Index Cond: (udocid = f.udocref)"
"Planning time: 252.162 ms"
"Execution time: 66852.737 ms"
Re: CTE with JOIN of two tables is much faster than a regular query
> -Original Message-
> From: Stephen Frost
> Sent: Saturday, August 18, 2018 16:39

Hello,

> > What can I do to improve the performance of the regular query without
> > using a CTE?
>
> You could possibly build a trigram index on the field you're searching, which
> could avoid the full table scan. Of course, that index could be quite large, so
> there's downsides to that. If these are words you're looking for then you
> could use PG's full text indexing to build indexes on the words and then use
> that instead. If you are fine working with words but are concerned about
> misspellings then you can extract out the distinct words, build a trigram index
> on those, find the most similar words based on the input and then search for
> those words using the FTI.
>
> Unfortunately, we don't currently pay attention to things like average string
> length when considering the cost of performing an 'ilike', so we figure that
> doing the filtering first and then the join will be faster, but that obviously falls
> over in some cases, like this one. Using the CTE forces PG to (today, at least)
> do the join first, but that isn't really good to rely on.

A trigram index would be a possible help in this particular scenario, but its size and the cost of updating it in other parts of the application would probably create other issues. I may try it, though. But thanks for confirming my assumption.

I just thought that it should be obvious to the optimizer to do the join first and filter on its result. But I read from your post that there is nothing I can do to modify the behavior of the optimizer. Or is there a way to specify the cost of an operator (ILIKE in this case) on a specific column?

Thanks
Klaus
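For reference, a sketch of the trigram index Stephen suggests, assuming the contrib extension pg_trgm is available; the index name is made up:

-- pg_trgm ships as a contrib extension.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- A GIN trigram index lets ILIKE '%480GB%' be answered from the index
-- instead of re-reading every szText value.
CREATE INDEX doc_sztext_trgm_idx ON Doc USING gin (szText gin_trgm_ops);

Klaus reports the effect of exactly this kind of index further down in the thread.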
Re: CTE with JOIN of two tables is much faster than a regular query
> -Original Message-
> From: Tom Lane
> Sent: Saturday, August 18, 2018 17:29
>
> Well, it's simpler than that: filter quals are always evaluated at the lowest
> possible plan level.

Thank you. This "always" was not clear to me, but it explains a few similar cases (with not-so-extreme differences) that I could not understand.

Regards
Klaus
Re: CTE with JOIN of two tables is much faster than a regular query
> -Original Message-
> From: Ravi Krishna
> Sent: Saturday, August 18, 2018 18:25
>
> > What can I do to improve the performance of the regular query without using a CTE?
>
> Why do you care? When I find that I can write an SQL query 3 different ways, I will
> go for the most efficient one. So why not accept the CTE version of this SQL?
> Just curious.

We're using object mapping / entity frameworks (e.g. XPO, Entity Framework Core). These frameworks support regular queries out of the box; CTEs require additional effort and are more difficult to maintain.

Regards
Klaus
Re: CTE with JOIN of two tables is much faster than a regular query
> -Original Message-
> From: Tom Lane
> Sent: Saturday, August 18, 2018 17:29
>
> In any case, given that the ILIKE selects so few rows (and the planner knows
> it!), finding a way to index that is clearly the right answer.

A trigram index took 9 minutes to build but improved the regular query from 67 seconds down to 500 ms. Although this is an impressive improvement, I'm afraid that the index might create delays in other parts of the application (INSERT / UPDATE). We will probably rework the design of this particular table.

Thanks to everyone who helped me in this matter.

Regards
Klaus
Re: Re: CTE with JOIN of two tables is much faster than a regular query
> -Original Message-
> From: Tim Cross
> Sent: Sunday, August 19, 2018 04:57
>
> > We're using object mapping / entity frameworks (e.g. XPO, Entity
> > Framework Core). These frameworks support regular queries out of the
> > box; CTEs require additional effort and are more difficult to maintain.
>
> Ah, another reason to avoid object mapping/entity frameworks! I guess
> really the same reason - loss of flexibility and expressive power.

While I agree that you lose control over certain details, we are overall quite happy using the frameworks. The frameworks nowadays provide the ability to call procedures if required, but using the objects directly is more convenient for the developers. SQL procedures add (just like a CTE) an additional layer to the application design which needs maintenance. That's fine if it really helps overall, but we try to avoid it if it isn't needed.

Regards
Klaus
Re: Forks of pgadmin3?
This is probably my 10th attempt to move from pgAdmin III to pgAdmin 4. At least the performance has significantly improved over time and now seems acceptable.

The biggest drawback, however, is that all elements are locked up in one browser window; I cannot find any option to detach a query window and put it on a different monitor.

95% of my time I use pgAdmin III just to type select and update statements and review the output rows. I know that I can do this in psql, but it's not handy with many columns.

For that reason we currently stay with pgAdmin III (and this is for us also one of several reasons to delay any move from 9.6 to a more recent version).

Klaus

From: Tony Shelver
Sent: Friday, March 22, 2019 15:34
Cc: PG-General Mailing List
Subject: Re: Forks of pgadmin3?

Or just persevere with pgAdmin 4 for a few months? Pretty common for people to hate any major changes to a tool that they are very comfortable with. This year I've invested the time to learn a few new toolsets (not on PostgreSQL necessarily) and found it to be well worthwhile. At least pgAdmin 4 is up to date with all the new features in 11.
Re: Forks of pgadmin3?
Thanks for the link, but we're very reluctant to use Java-based programs. The main reason is that we need to do some work on servers whose admins simply do not allow Java to be installed. The screenshots look very promising, however.

Regards
Klaus

> -Original Message-
> From: Thomas Kellerer
> Sent: Monday, March 25, 2019 12:06
> To: pgsql-general@lists.postgresql.org
> Subject: Re: Forks of pgadmin3?
>
> kpi6...@gmail.com wrote on 22.03.2019 at 17:25:
> > 95% of my time I use pgAdmin III just to type select and update
> > statements and review the output rows.
> >
> > I know that I can do this in psql but it's not handy with many
> > columns.
>
> An alternative you might want to try is SQL Workbench/J: https://www.sql-workbench.eu/
>
> Full disclosure: I am the author of that tool.
>
> It's a cross-DBMS tool, but my primary focus is Postgres.
>
> It focuses on running SQL queries rather than being a DBA tool.
>
> Regards
> Thomas
Re: Forks of pgadmin3?
Thank you, I was not aware of this option - this certainly helps.

Regards
Klaus

From: Murtuza Zabuawala
Sent: Friday, March 22, 2019 18:48
To: kpi6...@gmail.com
Cc: PostgreSQL mailing lists
Subject: Re: Forks of pgadmin3?

Opening the Query Tool or Debugger window in a separate browser window is a configurable option in pgAdmin4:

Go to File -> Preferences -> Query Tool -> Display -> Open in new browser tab and set it to True.
Use left-hand column for null values
I'm trying to fill up columns containing NULL with the most recent NOT NULL value from left to right. Example:

SELECT 2, 1, null, null, 3

should be converted into

2, 1, 1, 1, 3

The following query works, but I wonder if there is an easier way for tables with 50 or more columns:

WITH a (c1, c2, c3, c4, c5) AS (
  VALUES (2, 1, null::int, null::int, 3)
)
SELECT c1,
       coalesce(c2, c1) AS c2,
       coalesce(c3, c2, c1) AS c3,
       coalesce(c4, c3, c2, c1) AS c4,
       coalesce(c5, c4, c3, c2, c1) AS c5
FROM a;

Thanks
Klaus
Re: Use left-hand column for null values
> [David G. Johnston]
> My $0.02
>
> Unless you have some need to generalize I'd say just brute-force it and be
> done.
>
> You could maybe play with arrays to get something that looks different but I
> don't think it would be much shorter to code or easier to understand.

My concern was performance, because I feared that coalesce would evaluate e.g. column c1 50 times. But I did some testing and it seems to scale linearly. 100,000 rows with 50 columns take 25 seconds on my test machine. Not exciting, but something I can work with.

I played already with arrays, but what I came up with was slower than the plain coalesce.

Klaus
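For reference, a sketch of one possible array-based variant of the kind discussed above (not necessarily what Klaus benchmarked; fill_nulls_ltr is a made-up name):

-- Hypothetical helper: carry the last non-NULL value forward,
-- left to right, across an integer array.
CREATE OR REPLACE FUNCTION fill_nulls_ltr(vals int[])
RETURNS int[]
LANGUAGE plpgsql IMMUTABLE AS
$$
DECLARE
    result int[] := vals;
    i      int;
BEGIN
    FOR i IN 2 .. coalesce(array_length(result, 1), 0) LOOP
        IF result[i] IS NULL THEN
            result[i] := result[i - 1];  -- inherit the value to the left
        END IF;
    END LOOP;
    RETURN result;
END;
$$;

-- SELECT fill_nulls_ltr(ARRAY[2, 1, NULL, NULL, 3]);  --> {2,1,1,1,3}

Packing 50 columns into an array and unpacking the result adds its own overhead, which is consistent with the observation above that the array version came out slower than plain coalesce.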