You use temporary tables extensively.
>
>
David J.
On Tue, 13 May 2025 at 03:19, Maxim Boguk wrote:
> On Mon, May 12, 2025 at 6:01 PM David Rowley wrote:
>> This is just an artifact of the fact that runtime pruning is not factored
>> into the costs. Note the cost of the generic plan. The plan_cache_mode GUC
>> is about the
> 2. You have a massive default_statistics_target for a table involved.
>
This is just an artifact of the fact that runtime pruning is not factored
into the costs. Note the cost of the generic plan. The plan_cache_mode GUC
is about the only way to overrule the choice to use the custom plan.
David
>
the number you've chosen gives you the best performance. It's very
common for people to over-partition and not properly consider the
overheads of partitioning.
David
point in the past, it might not have been
and you have some bloat either in an index or in a catalogue table as
a result.
David
On Thu, 9 Jan 2025, 12:26 Feike Steenbergen,
wrote:
> I'm trying to change a few applications to fully use this, as PostgreSQL
> 17 added this support.
>
> The application does something like this:
>
> - fetch information from a source system and store it in a temp table
> - run a MERGE with a ta
u. However, I did already mention the cost part in the
final paragraph of my email and I also hinted on why it might be more
difficult than you might imagine to implement. I don't want to put
anyone off making improvements in this area. I only aim to highlight
that it's not trivial to do so.
David
tails, another two tables and
> it's query are similar.
>
> Thanks,
>
> James
>
> David Mullineux 於 2024年12月22日週日 上午12:41寫道:
>
>> Depends on a lot of thongs...Visibility map sounds like it's impacted
>> here. Are your inserts towards the index (like a
Depends on a lot of thongs...Visibility map sounds like it's impacted here.
Are your inserts towards the index (like a monotonically increasing serial
id) or scattered around the index values ? How big is the table index
and shared buffers ? An example would really help
On Sat, 21 Dec 2024,
Yes.
Also, are you sure you mean deadlock ? It sounded like this is just a
simple lock conflict and not a deadlock.
The first transaction could be blocked on something else (even tho it's not
ally short and quick ).
Lock requests are in a queue
On Mon, 9 Dec 2024, 21:16 Eric Schwarzenbach,
wrot
On Wed, 20 Nov 2024, 22:26 Tom Lane, wrote:
> [ please don't top-quote, it makes the conversation hard to follow ]
>
> David Mullineux writes:
> > On Wed, 20 Nov 2024, 15:46 Tom Lane, wrote:
> >> Yeah. Also, are you building with openssl, or not?
>
> > N
join search and quite a large portion of planning once
for each combination of CTE inlined vs not-inlined. If you had a query
with a dozen or so CTEs, that's going to be a very costly thing to
plan.
David
[1] https://www.postgresql.org/docs/current/queries-with.html
duced will be more evident if you use
EXPLAIN ANALYZE VERBOSE. The verbose option will show which joins are
marked "Inner Unique". If you want to learn more about the
optimisation you've broken, see [1].
David
[1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9c7f5229ad68d7e0e4dd149e3f80257893e404d4
No, not at all!
On Wed, 20 Nov 2024, 15:46 Tom Lane, wrote:
> =?UTF-8?Q?Fr=C3=A9d=C3=A9ric_Yhuel?= writes:
> > On 9/11/24 12:47, David Mullineux wrote:
> >> This feels like a build configuration problem. Just can't put my finger
> >> on it yet.
>
> &g
7;s no longer in my scollback buffer.
You can set random_page_cost for just the session you're connected to
and try it. SET random_page_cost = ; before running
EXPLAIN (ANALYZE, BUFFERS).
David
On Fri, 25 Oct 2024 at 22:26, Ba Jinsheng wrote:
> I guess the better query plan is not considered when comparing the cost of
> paths?
You might want to change effective_cache_size is set high enough.
Something like 50-75% of RAM is likely fine.
David
mpt did the author give to which tool, and
maybe, if we are feeling generous, how much editing of the response was
performed.
David J.
on
it yet.
On Wed, 11 Sept 2024, 10:40 Peter Eisentraut, wrote:
> On 10.09.24 15:58, David Mullineux wrote:
> > I'm getting a bit concerned by the slow performance of generating uidds
> > on latest dev code versus older versions. Here I compare the time to
> > generate
e modify step ?
>
>
>
This tells you when parallelism is used:
https://www.postgresql.org/docs/current/when-can-parallel-query-be-used.html
David J.
ion.
You might try the -general list if you want to brainstorm workarounds
because pg_dump itself doesn't provide any command line options to give you
this specific subset of your database.
David J.
all 32 hash partitions since the first query estimated 8 rows
> only ?
> extend statistics may help estimate count(partitionkeyid) based on
> other columns bind variables, but looks like that did not help table join
> case.
I can't quite follow this. You'll need to better explain where you're
getting these numbers for me to be able to understand.
David
ache_size to encourage the nested loop -> index
scan plan. Good ranges for effective_cache_size is anywhere between 50
- 75% of your servers's RAM. However, that might not be ideal if your
server is under memory pressure from other running processes. It also
depends on how large shared_buffers are as a percentage of total RAM.
David
On Mon, 1 Jul 2024 at 22:20, Pavel Stehule wrote:
> The planners get min/max range from indexes. So some user's indexes can be
> bloated too with similar effect
I considered that, but it doesn't apply to this query as there are no
range quals.
David
haps you have lots of bloat in your system catalogue tables. That
could happen if you make heavy use of temporary tables. There are many
other reasons too. It's maybe worth doing some vacuum work on the
catalogue tables.
David
On Wednesday, June 12, 2024, nikhil kumar wrote:
>
> Can anyone please help on SMTP configuration for send gmail. If any
> document please let me know.
>
This seems like an exceedingly unusual place to be asking for such help…
David J.
y ANALYZE on tracktrip, or perhaps increasing the
statistics targets on the columns being queried here.
If there's a correlation between the "a" and "route" columns then you
might want to try CREATE STATISTICS:
CREATE STATISTICS ON a,route FROM tracktrip;
ANALYZE tracktrip;
David
I was told that partitioned table indexed must always start with the
partition key columns.
Is this always the case or does it depend on use case? When would you want
to create indexes in this way?
The documentation just mentions that it is strictly unnecessary but can be
helpful. My understandin
quest for the explain
(analyze, buffers) output with track_io_timing on will help confirm
this.
If it is just reading empty pages that's causing this issue then
adding that missing index would improve the situation after running
just plain VACUUM each time there's a bulk delete.
David
on the planning time as it seems
unlikely that disabling an enable* GUC would result in increased
planning time. However, it does not seem impossible that that *could*
happen.
David
0;
EXPLAIN (SUMMARY ON) ;
RESET enable_hashjoin;
The following will show others that you could try.
select name,setting from pg_settings where name like 'enable%';
David
[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=1e731ed12aa
executed or after the top
node returns NULL.
If you're using psql, if you do \timing on, how long does EXPLAIN take
without ANALYZE? That also goes through executor startup and shutdown.
It just skips the running the executor part.
David
hint_plan extension so as expected hints should not
>> get ignored by the optimizer .*
>>
>
Sounds like a bug you should go tell the pg_hint_plan authors about then.
David J.
d. Simply restating your first email isn’t
productive.
You cannot enforce the number of workers used, only the the maximum. That
is you knob.
David J.
semantically, not sure about execution dynamics)
as:
FROM periods AS p
LEFT JOIN LATERAL (SELECT count(distinct? d.id) FROM data AS d WHERE
d.timestamp >= p.period_start AND d.timestamp < p.period_end AND d.sn =
'BLAH') AS cnt_d
-- NO grouping required at this query level
David J.
skewed dataset, then
this might not be very good.
You might find things run better if you adjust postgresql.conf and set
plan_cache_mode = force_custom_plan then select pg_reload_conf();
Please also check the documentation so that you understand the full
implications for that.
David
On Tue, 12 Sept 2023 at 02:27, Tom Lane wrote:
>
> David Rowley writes:
> > I'm not sure if you're asking for help here because you need planning
> > to be faster than it currently is, or if it's because you believe that
> > planning should always be
ack if you find PG14 to be much faster here.
You could also experiment with a set of tables which are empty. It's
possible getting the relation sizes are a factor to consider here.
mdnblocks() needs to do a bit more work when the relation has multiple
segments.
David
index or are just a duplicate of some other index.
Getting rid of those 3 will save some time in create_index_paths().
David
be. Better to focus on trying to make it faster. I
suggest you create the asins_statistics (asin_id) index. However, I
can't say with any level of confidence that the planner would opt to
use that index if it did exist. Lowering random_page_cost or
increasing effective_cache_size would increase the chances of that.
David
7;, time) >=
'2021-01-01' is the same as time >= '2021-01-01'. It would be
possible to make PostgreSQL do that, but that's a core code change,
not something that you can do from SQL.
David
* from pg_stat_user_tables where relid = 'media.block'::regclass;
David
e given Constants have been hashed, finding the partition is
just a single divide operation away.
David
not then -general.
You may want to install the auto-explain extension to get the inner query
explain plan(s) into the logs for examination.
David J.
definitive answer, just trying to get some hints from
> more experienced users before I fill up the drives with terabytes of data.
>
>
Store a hash (and other metadata, like the hashing algorithm) as well as
the path to some system better designed for object storage and retrieval
instead.
David J.
I
suggest starting with a minor version update then posting again if still
having problems.
David J.
multiple passes would be required. The chunk of memory for dead
tuple storage is capped at 1GB.
David
lanner to flip to a Hash
Aggregate which would eliminate the Sort before aggregation. You'd
only need to sort 236 rows after the Hash Aggregate for the ORDER BY.
Plus, what Justin said.
David
e
> PARTITION BY RANGE (EXTRACT(YEAR FROM dob)) LIST (SUBSTRING(hash, 1, 1));
Effectively, multi-level partitioning gives you that, It's just the
DDL is different from how you wrote it.
David
ew tips about partitioning in [1] that you may
wish to review.
David
[1] https://www.postgresql.org/docs/devel/ddl-partitioning.html
On Fri, 10 Feb 2023 at 06:40, Benjamin Tingle wrote:
> Thanks for the helpful response david! I'll have a shot at getting the patch
> to work myself & submitting to pgsql-hackers.
I took some time today for this and fixed up a few mistakes in the
patch and added it to the Marc
rsuing this then feel free to take the patch
to the pgsql-hackers mailing list and propose it. It's unlikely I'll
get time to do that for a while, but I will keep a branch locally with
it to remind me in case I do at some point in the future.
David
[1]
https://git.postgresql.org/git
ause if you decided that the check on the domain
should be "value is not null" (don't do this though...) the column addition
would have to fail for existing rows (modulo defaults...).
David J.
On Thu, 2 Feb 2023 at 14:49, Thomas Munro wrote:
> If I had more timerons myself, I'd like to try to make parallel
> function scans, or parallel CTE scans, work...
I've not really looked in detail but I thought parallel VALUES scan
might be easier than those two.
David
ose to set it to
"on"?
David
are
repeated by setting some negative value, as described in the
documents. You'll need to analyze the table again after changing this
setting.
David
[1] https://www.postgresql.org/docs/12/sql-altertable.html
On Thu, 15 Dec 2022 at 21:12, James Pang (chaolpan) wrote:
>We had some load test ( DML inserts/deletes/updates/ on tens of hash
> partition tables) and found that PGV14 slow down 10-15% compared with PGV13.
> Same test server, same schema tables and data. From pg_stat_statements, sql
> e
t'll take the non-Const path for planning generic
plans.
David
arams values, but when generic_plan,
> planner() use boundparams=NULL, it try to calculate average value based on
> mcv list of the index attributes (starttime,endtime) ?
IIRC, generic plan estimates become based on distinct estimations
rather than histograms or MCVs.
David
rmine the number of times
a given node will be executed before we can determine how worthwhile
JITting an expression in a node will be.
David
> [1]
> https://www.postgresql.org/message-id/B6025887-D73F-4B5B-9925-4DA4B675F7E5%40elevated-dev.com
ould experiment with in particular ?
Perhaps if you want to keep a small high-chun table in check you might
want to consider if autovacuum_naptime is set low enough. You may not
care if the space being consumed in the standard 1min
autovacuum_naptime is small enough not to be of concern.
David
[1] https://www.postgresql.org/docs/release/12.0/
[2] https://www.postgresql.org/docs/13/sql-update.html
it is overall time that matters, it's no use boasting the COPY
is fast if you end up spending hours waiting for ALTER TABLE at the end.
David J.
On Sun, Jun 12, 2022 at 2:47 PM Josh wrote:
>
> This was only possible because I was dealing with arrays though, and an
> operation such as `in (select unnest...)` can be easily converted to `=
> any(...)`. However for the general case,
In the general case you don't have subqueries inside join
On Tuesday, May 31, 2022, Praneel Devisetty
wrote:
>
> Initially it was processing 1000 tables per minute. Performance is
>> gradually dropping and now after 24 hr it was processing 90 tables per
>> minute.
>>
>
That seems like a fairly problematic metric given the general vast
disparities in siz
plan you want if you requite the query and replace your
date range with shipping_date = '2022-05-01'. Your use of WHERE TRUE
indicates to me that you might be building this query in an
application already, so maybe you can just tweak that application to
test if the start and end dates are the same and use equality when
they are.
David
[1] https://commitfest.postgresql.org/38/3524/
sense? Or is this something PG is already doing, and I just
> haven’t found the right magic words / built my index correctly to unlock it?
> (I notice that the last example is an index-only scan; would I get this
> behaviour from the previous two queries if I made the index a coverin
On Wed, 4 May 2022 at 00:21, Benjamin Coutu wrote:
> Thanks David, using extended statistics for both (and only for both) tables
> solved this problem.
Oh, whoops. I did get that backwards. The estimate used by the
Memoize costing code is from the outer side of the join, which
ould be. Not the other way around, which is
your case.
create statistics extdataregular_field_index_stats (ndistinct) on
field, index from extdataregular;
analyze extdataregular;
would likely put that right.
David
nel" value matching your WHERE
clause.
I guess "channel" must not be the primary key to "valueseries" and
that's why you use an IN().
The above query would return an error if multiple rows were returned
by the subquery.
David
quot;array_agg(DISTINCT paymenttype)" and then
checking for various array results will be considerably more efficient.
Or do a combination: write the set-oriented query in an SQL function. You
should not need pl/pgsql for this and avoiding it should improve
performance.
David J.
p.s., The conventio
just
trust the join order that exists in the query.
https://www.postgresql.org/docs/current/explicit-joins.html
Lastly, if you can leverage prepared statements you can at least amortize
the cost (depending on whether a generic plan performs sufficiently
quickly).
I'll admit I'm no expert at this. I'd probably just follow the
join_collapse_limit advice and move on if it works. Maybe adding a
periodic check to see if anything has changed.
David J.
On Tue, Mar 1, 2022 at 9:37 AM Andrew Zakharov wrote:
> David, - yes, creation composite foreign/primary key is not a problem. But
> the main question is what method should I use for partitioning by composite
> key gid, region_code?
>
The convention here is to inline or bottom-po
straint there.
Or maybe allow for duplicates across region codes and save space by using a
smaller data type (int or bigint - while renaming the column to "rid" or
some such) - combined with having the non-partitioned reference table being
defined as (region_code, rid, gid).
David J.
obably verify all of that by perusing the PostgreSQL
documentation. Don't know what to recommend regarding Linxu, user land,
kernel mode, and CPUs...
David J.
agree buffers for that query does not seem to account for nearly two
minutes...though as RDS is a shared resource I'd probably chalk at least
some of it to contention on the underlying hardware (disk likely being more
problematic than memory).
David J.
declaring the
column not null and put the sentinel value directly into the record.
David J.
nough I
hadn't really considered that it would be anything but stock PostgreSQL on
a personal VM setup for testing.
David J.
erwise it's completely wrong.
>
>
Related to my preceding observation, from the explain (buffers) docs:
“…and the time spent reading and writing data file blocks (in milliseconds)
if track_io_timing
<https://www.postgresql.org/docs/current/runtime-config-statistics.html#GUC-TRACK-IO-TIMING>
is
enabled.“
David J.
00sec ?
>
> What architecture and OS/version are you running ?
> How did you install postgres? From a package or compiled from source ?
>
The docs indicate you’ll only see I/O Timing information if using EXPLAIN
BUFFERS but I’m not seeing any of the other buffer-related information in
these plans. Thoughts?
David J.
a null bitmap [1] to the stored tuple. And now for every single
column the system has to check whether a specific column’s value is null or
not. Given the number of columns in your table, that this is noticeable is
not surprising.
David J.
[1] https://www.postgresql.org/docs/current/storage-page-layout.html
isn't sufficient to narrow down the problem. It
can still either be the rule processing or the union processing that is
seeming to make a wrong plan choice.
That isn't meant to discount the possibility that this case is actually
correct - or at least the best we do presently for one or more technical
reasons that I'm not familiar with...
David J.
On Wed, Oct 20, 2021 at 6:58 AM Tom Lane wrote:
> "David G. Johnston" writes:
> > On Tuesday, October 19, 2021, Michael Lewis wrote:
> >> On Tue, Oct 19, 2021 at 3:48 PM Mithran Kulasekaran <
> >> mithranakulaseka...@gmail.com> wrote:
> >>
.name is null
>
The are functionally equivalent, though the timing of the expression
evaluation differs slightly.
It could also be written as an anti-join:
Select * from template as t where not exists (select 1 from
template_staging as ts where t.name = ts.name)
David J.
ed to either use your real
scenario's data to help demonstrate the issue or create a self-contained
test that is at least closer to what it produces (this approach still
benefits from seeing what is happening for real).
David J.
at
for whatever reasons individuals may have no one has chosen to volunteer or
fund such development. I don't even remember seeing a proposal in the past
5 or so years.
David J.
s in PostgreSQL?(similar to
> all_objects in Oracle).
>
>
With pg_catalog tables. But I’m not aware of anything that combines all
object types into a single result. Seems like an easy enough query to put
together though.
David J.
cs is trying to
convey.
https://www.postgresql.org/docs/current/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY
But also note that it is "each operation" that gets access to that limit.
David J.
then only 765591 groups fit in the 10GB
of memory.
David
that.
Hmm, math check?
postgres=# select pg_size_pretty(power(2,31)::numeric*1024);
pg_size_pretty
2048 GB
(1 row)
David
kB
You might want to keep going higher with hash_mem_multiplier until you
see no "Disk Usage" there. As mentioned, v11 didn't spill to disk and
just used all the memory it pleased. That was a bit dangerous as it
could result in OOM, so it was fixed.
David
to 1 per transaction.
David
global index over all partitions
then maybe you're better off just using a non-partitioned table to
store this data.
David
ou need does seem more specific to Oid than
relid.
David
7;re running.
If this "register 10 _same_ documents" thing requires running some
query, then you might want to look at EXPLAIN (ANALYZE, BUFFERS) for
that query. You might want to consider doing SET track_io_timing =
on; Perhaps Linux is having to read more buffers from disk than
Windows.
David.
sion on these since results can come from any
> > partition.
>
> Why is it not using constraint exclusion on the above two conditions(1 and 2)
> included in the where clause ?
>
> Both sets are pointing to different tables.
It's because of the OR condition. If it was an AND condition then the
planner wouldn't have to consider the fact that records in other
partitions might be required for the join.
David
On Sat, 22 May 2021 at 10:59, Nagaraj Raj wrote:
> ERROR: empty range bound specified for partition "mytable_z" DETAIL:
> Specified lower bound ('Z') is greater than or equal to upper bound ('[').
> SQL state: 42P17
It looks like '[' does not come after 'Z' in your collation.
David
er 2); --etc
Change the modulus to the number of partitions you want and ensure you
create a partition for each modulus. In this case, it would be 0 to 9.
David
.
You might want to consider HASH partitioning if you're just looking
for a way to keep your tables and indexes to a more manageable size.
You've not really mentioned your use case here, so it's hard to give
any advice.
There are more details about partitioning in
https://www.postgresql.org/docs/current/ddl-partitioning.html
David
x27;alpha';
QUERY PLAN
---
Seq Scan on mytable_a mytable (cost=0.00..27.00 rows=7 width=32)
Filter: (a = 'alpha'::text)
(2 rows)
The mytable_b is not scanned.
David
ndexscan; show that index scanning is switched on?
David
just hide
it from you. When you add it yourself you'll be able to use it in the
subquery and you'll be able to filter out the partitions that you
don't want.
I really think you're driving yourself down a difficult path by
expecting queries with whole-row vars to be optimised just as well as
using select * or explicitly listing the columns.
David
1 - 100 of 226 matches
Mail list logo