po 31. 8. 2020 v 10:04 odesílatel Thorsten Schöning <tschoen...@am-soft.de> napsal:
> Hi all, > > I have lots of queries in which I need to restrict access to rows > using some decimal row-ID and am mostly doing so with using the > operator IN in WHERE-clauses. Additionally I'm mostly embedding the > IDs as ","-seperated list into the query directly, e.g. because I > already hit a limitation of ~32k parameters of the JDBC-driver[1] for > Postgres. > > I really thought that in most cases simply sending a large amount of > IDs embedded into the query is better than looping, because it safes > roundtrips to access the DB, the planner of the DB has all pieces of > information it needs to decide best strategies etc. OTOH, with recent > tests and an increased number of IDs of about factor 100, I have > additional load in Tomcat before actually sending the query to the DB > already and in the DB itself as well of course. I've attached an > example query and plan. > > > -> Hash (cost=242592.66..242592.66 rows=6825 width=39) (actual > time=91.117..91.117 rows=40044 loops=3) > > Buckets: 65536 (originally 8192) Batches: 1 (originally 1) > Memory Usage: 3016kB > > -> Hash Join (cost=137.57..242592.66 rows=6825 width=39) (actual > time=10.194..82.412 rows=40044 loops=3) > > Hash Cond: (meter.meter_bcd = meter_bcd.id) > > -> Index Scan using pk_meter on meter > (cost=0.42..242237.10 rows=40044 width=25) (actual time=9.350..71.276 > rows=40044 loops=3) > > Index Cond: (id = ANY ('{[...]}'::integer[])) > > -> Hash (cost=135.73..135.73 rows=113 width=22) (actual > time=0.830..0.830 rows=113 loops=3) > > Do you know of any obvious limitations of the JDBC-driver of handling > such large queries? In the end, the query is mostly large text with > only very few bind parameters. > > Do you know of any obvious problem in Postgres itself with that query, > when parsing it or alike? Do things simply take how long they take and > are mostly comparable to looping or is there some additional overhead > the larger the query itself gets? From my naive expectation, comparing > IDs shouldn't care if things get looped or transmitted at once. > > I'm just trying to collect some input for where to look at to optimize > things in the future. Thanks! > It is not good - it increases the memory necessary for query parsing, optimizer and executor are slower. Postgres currently has not any optimization for processing searching in these long lists - so this search is very slow against other methods. I think this is a signal so something in the design database or architecture is wrong. Sure, there can be exception, but the Postgres has not any optimization for this design Regards Pavel > [1]: https://github.com/pgjdbc/pgjdbc/issues/90 > > Mit freundlichen Grüßen, > > Thorsten Schöning > > -- > Thorsten Schöning E-Mail: thorsten.schoen...@am-soft.de > AM-SoFT IT-Systeme http://www.AM-SoFT.de/ > > Telefon...........05151- 9468- 55 > Fax...............05151- 9468- 88 > Mobil..............0178-8 9468- 04 > > AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln > AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow