Multiple startup messages over the same connection

2024-01-22 Thread Vladimir Churyukin
Hello,

A question about protocol design - would it be possible to extend the
protocol, so it can handle multiple startup / authentication messages over
a single connection? Are there any serious obstacles? (possible issues with
re-initialization of backends, I guess?)
If that is possible, it could improve one important edge case: currently, if
you have to talk to multiple databases on a single host, you need to open a
separate connection to each of them. In some cases (multitenancy, for
example), you may have thousands of databases on a host, which leads to
inefficient connection utilization on clients (and on the db side too). A lot
of other RDBMSes don't have this limitation.

thank you,
-Vladimir Churyukin


Re: Multiple startup messages over the same connection

2024-01-23 Thread Vladimir Churyukin
On Mon, Jan 22, 2024 at 11:43 PM Heikki Linnakangas  wrote:

> On 22/01/2024 21:58, Vladimir Churyukin wrote:
> > A question about protocol design - would it be possible to extend the
> > protocol, so it can handle multiple startup / authentication messages
> > over a single connection? Are there any serious obstacles? (possible
> > issues with re-initialization of backends, I guess?)
> > If that is possible, it could improve one important edge case - where
> > you have to talk to multiple databases on a single host currently, you
> > need to open a separate connection to each of them. In some cases
> > (multitenancy for example), you may have thousands of databases on a
> > host, which leads to inefficient connection utilization on clients (on
> > the db side too). A lot of other RDBMSes  don't have this limitation.
>
> The protocol and the startup message are the least of your problems.
> Yeah, it would be nice if you could switch between databases, but the
> assumption that one backend operates on one database is pretty deeply
> ingrained in the code.


Yes, I suspected that's why it hasn't been implemented so far, but what is
the main problem there?
Is the issue global-data cleanup / re-initialization after the database is
switched?
Is it third-party extensions that make the same assumption and may break?
Anything else?

-Vladimir Churyukin


Re: Improving inferred query column names

2023-02-22 Thread Vladimir Churyukin
On Wed, Feb 22, 2023, 12:40 PM Andres Freund  wrote:

> Hi,
>
> On 2023-02-11 12:47:04 -0800, Vladimir Churyukin wrote:
> > That is a good idea for simple cases, I'm just curious how it would look
> > like for more complex cases (you can have all kinds of expressions as
> > parameters for aggregate function calls).
> > If it works only for simple cases, I think it would be confusing and not
> > very helpful.
>
> I don't think it needs to be perfect to be helpful.
>


It doesn't need to be perfect, but it needs to be consistent. So far you've
proposed a rule that replaces () with _. What is the plan for expressions -
how do we convert them to names (with deduplication, I guess, because two
similar expressions could potentially map to the same name)?
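
For example (a hypothetical case, reusing pg_stat_io from your original mail
purely for illustration): two aggregates over different expressions that
contain the same columns could easily mangle to the same name:

SELECT SUM(reads + writes), SUM(reads - writes) FROM pg_stat_io;
-- today both columns are named "sum"; a naive function_column rule would
-- presumably produce "sum_reads_writes" twice, so either deduplication or
-- keeping the full expression text seems necessary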


> > Wouldn't it make more sense to just deduplicate the names by adding
> > numerical postfixes, like sum_1, sum_2?
>
> That'd be considerably worse than what we do today imo, because any
> reordering
> / added aggregate would lead to everything else changing as well.
>


Ok, that I kinda agree with - not necessarily worse overall, but worse for
some cases. The proposal above, keeping the names exactly the same as the
full expression text, is probably the best we can do then. It takes care of
possible duplication, isn't position-sensitive, and is consistent. The only
downside is somewhat unusual column names that you'd have to quote to refer
to. But is that a real issue?

-Vladimir Churyukin



Re: [PATCH] Improve ability to display optimizer analysis using OPTIMIZER_DEBUG

2023-01-03 Thread Vladimir Churyukin
As an end user who spends a lot of time optimizing pretty complicated
queries, I'd say that something like this could be useful.
Right now the optimizer is mostly a black box: why it chooses one plan over
another is a mystery. I have some general ideas about it,
and I can even read and sometimes debug the optimizer's code to dig deeper
(although it's not always possible to reproduce the same behavior as in the
production system anyway).
I'm mostly interested in finding where exactly the optimizer went wrong and
what the best way to fix it would be. Currently Postgres is not doing a
great job in that department.
EXPLAIN output can tell you about mispredictions, but the logic behind
choosing particular plans is still obscure, because the reasons for the
optimizer's decisions are not visible.
If exposing OPTIMIZER_DEBUG through a GUC can help with that, I think it
would be a useful addition.
Those are general considerations; I'm not somebody who actually uses
OPTIMIZER_DEBUG regularly (but maybe I would if it were accessible
through a GUC),
I'm just saying this is an area where improvements would be very much
welcome.

-Vladimir

On Tue, Jan 3, 2023 at 4:57 PM David Rowley  wrote:

> On Tue, 3 Jan 2023 at 19:59, Ankit Kumar Pandey 
> wrote:
> >
> >
> > On 03/01/23 08:38, David Rowley wrote:
> > > Do you actually have a need for this or are you just trying to tick
> > > off some TODO items?
> > >
> > I would say latter but reason I picked it up was more on side of
> > learning optimizer better.
>
> I think it's better you leave this then. I think if someone comes
> along and demonstrates the feature's usefulness and can sell us having
> it so we can easily enable it by GUC then maybe that's the time to
> consider it. I don't think ticking off a TODO item is reason enough.
>
> > Also from the thread,
> >
> >
> https://www.postgresql.org/message-id/20120821.121611.501104647612634419.t-is...@sraoss.co.jp
> >
> > > +1. It would also be popular with our academic users.
> > >
> > There could be potential for this as well.
>
> I think the argument is best coming from someone who'll actually use it.
>
> David
>
>
>


Re: [PATCH] Improve ability to display optimizer analysis using OPTIMIZER_DEBUG

2023-01-03 Thread Vladimir Churyukin
On Tue, Jan 3, 2023 at 7:41 PM David Rowley  wrote:

> On Wed, 4 Jan 2023 at 16:15, Vladimir Churyukin 
> wrote:
> > As an end user that spends a lot of time optimizing pretty complicated
> queries, I'd say that something like this could be useful.
>
> I think we really need to at least see that it *is* useful, not that
> it *could be* useful.  For example, as an end user, you might not find
> it great that the output is sent to stdout rather than to the window
> that you execute the query in.
>

That's true; as an end user I would expect to see the output as part of the
query results, not on stdout.


> From what I can see here, the motivation to make this a useful feature
> is backwards from what is normal. I think if you're keen to see a
> feature that allows you better visibility into rejected paths then you
> need to prove this is it rather than speculating that it might be
> useful.
>
>
You can't see people using the feature until you make it usable. The fact
that it's not useful right now (because it's implemented as a compile-time
flag that prints to stdout, for example)
doesn't mean it won't be useful once it becomes more convenient. Probably
the best way to find out is to create a *convenient* extension and see if
people start using it.


> There was a bit of work being done in [1] with the end goal of having
> the ability for add_path to call a hook function before it outright
> rejects a path.  Maybe that would be a better place to put this and
> then write some contrib module that provides some extended output in
> EXPLAIN. That might require some additional fields so that we could
> carry forward additional information that we'd like to show in
> EXPLAIN. I imagine it's not ok just to start writing result lines in
> the planner. The EXPLAIN format must be considered too and explain.c
> seems like the place that should be done. add_path might need to
> become a bit more verbose about the reason it rejected a certain path
> for this to be useful.
>

I agree, extended EXPLAIN output would be a much better solution than
writing to stdout. It could be implemented as an extra EXPLAIN flag,
something like EXPLAIN (TRACE).
One of the issues here is that the result would be pretty long, and may
consist of multiple parts, so something like returning multiple refcursors
might be necessary so that a client can fetch multiple result sets;
otherwise it won't be human-readable. Human readability isn't necessarily
the goal, though: if the purpose is to make it machine-readable and to build
tools that interpret the results, JSON format and a single result set would
be fine.
The result could be represented as a list of trace events showing the
planner's logic (the traces could be generated by the hook you mentioned
and/or by some other additional hooks).
Is that what you were talking about?
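
To make it concrete, I imagine something along these lines (the first
statement is existing syntax; the TRACE option in the second is purely
hypothetical, just to illustrate the idea):

EXPLAIN (ANALYZE, FORMAT JSON) SELECT * FROM pg_class WHERE relname = 'pg_proc';
-- today: a single JSON document describing the chosen plan

-- EXPLAIN (ANALYZE, TRACE, FORMAT JSON) SELECT * FROM pg_class WHERE relname = 'pg_proc';
-- hypothetical: the same document plus a list of trace events (paths
-- considered by add_path, rejection reasons, etc.) emitted by the hooks
-- discussed above
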
Another thing: since people keep picking up this TODO item on
https://wiki.postgresql.org/wiki/Todo, maybe it's better to modify
or remove it, so they don't spend time working on something that is pretty
much a dead end currently?

-Vladimir


Re: [PATCH] Improve ability to display optimizer analysis using OPTIMIZER_DEBUG

2023-01-04 Thread Vladimir Churyukin
On Tue, Jan 3, 2023 at 9:55 PM Tom Lane  wrote:

> David Rowley  writes:
> > The thing I had in mind was some mode that would record additional
> > details during planning that could be tagged onto the final plan in
> > createplan.c so that EXPLAIN could display them. I just think that
> > EXPLAIN is the place where people go to learn about _what_ the plan is
> > and it might also be the place where they might expect to go to find
> > out more details about _why_ that plan was chosen. I by no means have
> > a fully bakes idea on what that would look like, but I just think that
> > dumping a bunch of lines to stdout is not going to be useful to many
> > people and we need to think of something better in order to properly
> > make this useful.
>
> There's a number of problems in this area, but I think the really
> fundamental issue is that for speed reasons the planner wants to
> reject losing plan alternatives as quickly as possible.  So we simply
> don't pursue those alternatives far enough to produce anything that
> could serve as input for EXPLAIN (in its current form, anyway).
>

That's not necessarily a fundamental issue for EXPLAIN (in theory, at least;
I'm not sure whether the current implementation has fundamental
limitations).
When somebody runs EXPLAIN, they don't necessarily care that much about its
performance, as long as it returns results in a reasonable time.
So if the planner does some extra work in that mode to better show why the
specific path was chosen, it should probably be ok from a performance
perspective.


> What that means is that a trace of add_path decisions just can't be
> very useful to an end user: there isn't enough data to present the
> decisions in a recognizable form, besides which there is too much
> noise because most of the rejected options are in fact silly.
> So indeed we find that even hard-core developers aren't interested
> in consuming the data in that form.
>

Even if the output is not very human-readable, it can still be useful if
there are tools that consume it and extract the meaningful data while
omitting the noise (assuming the meaningful data exists there, of course).


> Another issue is that frequently the problem is that we never
> considered the desired plan at all, so that even if you had a
> perfectly readable add_path trace it wouldn't show you what you want
> to see.  This might happen because the planner is simply incapable of
> producing that plan shape from the given query, but often it happens
> for reasons like "function F() used in the query is marked volatile,
> so we didn't flatten a subquery or consider an indexscan or whatever".
> I'm not sure how we could produce output that would help people
> discover that kind of problem ... but I am sure that an add_path
> trace won't do it.
>
> So, not only am I pretty down on exposing OPTIMIZER_DEBUG in
> its current form, but I don't really believe that adding hooks
> to add_path would allow an extension to produce anything of value.
> I'd for sure want to see a convincing demonstration to the contrary
> before we slow down that hot code path by adding hook calls.


Pardon my ignorance, but I'm curious: how are changes in the planner code
currently validated?
Let's say you add some extra logic that introduces different paths in some
cases, or adjust some constants. How do you validate that this doesn't slow
something else down dramatically?
I see some EXPLAIN output checks in the regression tests (not that many
though), so I'm curious how planning regressions are currently tested -
not the simple ones, where you have a small input and a predictable
plan/output, but the kind that show up with more or less realistic data
distributions on medium / large datasets.

-Vladimir Churyukin


Making auto_explain more useful / convenient

2023-11-11 Thread Vladimir Churyukin
Hello,

I'm not sure whether these ideas have been circulated before.
We use auto_explain a lot to investigate slow/problematic queries.
One of the main usability issues comes from the fact that the EXPLAIN
output is logged rather than returned to the caller in some way. If you
have a large cluster with lots of replicas, there is also the extra
inconvenience of log accumulation, searching, etc.
Why not have an option to return the EXPLAIN results as a NoticeResponse
instead? That would make it more convenient to use.

Another thing is tangentially related...
I think it may be good to have a number of options to generate
significantly shorter output, similar to EXPLAIN. EXPLAIN is great, but
sometimes people need more concise and specific information, for example
the total number of buffers and reads for a given query (this is pretty
common), whether or not certain nodes appeared in the plan (a seq scan, a
scan of certain index(es)), how bad the cardinality misestimation was on
certain nodes, etc.
It's not totally clear yet what the best way to define those rules would
be, but I think we can come up with something reasonable. Logging or
returning shorter messages like that causes less overhead than logging the
full EXPLAIN output and can potentially allow for better query monitoring
overall.
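
(For context, auto_explain can already be told to collect most of the raw
data this would be based on. These are existing settings, shown here only as
the baseline the compact output would be distilled from:)

LOAD 'auto_explain';
SET auto_explain.log_min_duration = '500ms';  -- only queries slower than this
SET auto_explain.log_analyze = on;            -- actual row counts and timings
SET auto_explain.log_buffers = on;            -- buffer hits / reads per node
SET auto_explain.log_format = 'json';         -- machine-readable output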

Do you see any potential issues with implementing those? Of course there
should be more details, like what kind of configuration parameters to add,
how to define rules for the 2nd case, etc. Just wanted to check if there
are any objections in general.

Thank you,
-Vladimir Churyukin.


Re: Making auto_explain more useful / convenient

2023-11-11 Thread Vladimir Churyukin
Thank you, that answers the first part of my question.

On Sat, Nov 11, 2023 at 2:43 AM Sergei Kornilov  wrote:

> Hello
>
> auto_explain.log_level is available since postgresql 12.
>
> postgres=# load 'auto_explain';
> LOAD
> postgres=# set auto_explain.log_min_duration to 0;
> SET
> postgres=# set auto_explain.log_level to 'notice';
> SET
> postgres=# select 1;
> NOTICE:  duration: 0.010 ms  plan:
> Query Text: select 1;
> Result  (cost=0.00..0.01 rows=1 width=4)
>  ?column?
> --
> 1
>
> regards, Sergei
>


Re: Making auto_explain more useful / convenient

2023-11-11 Thread Vladimir Churyukin
On Sat, Nov 11, 2023 at 7:49 AM Tom Lane  wrote:

> Vladimir Churyukin  writes:
> > Why not have an option to return EXPLAIN results as a NoticeResponse
> > instead? That would make its usage more convenient.
>
> That seems quite useless to me, and likely actually counterproductive.
> If you are manually investigating query performance, you can just use
> EXPLAIN directly.  The point of auto_explain, ISTM, is to capture info
> about queries issued by automated applications.  So something like the
> above could only work if you taught every one of your applications to
> capture the NOTICE output, separate it from random other NOTICE
> output, and then (probably) log it somewhere central for later
> inspection.  That's a lot of code to write, and at the end you'd
> only have effectively duplicated existing tooling such as pgbadger.
> Also, what happens in applications you forgot to convert?
>
>
Sergei Kornilov just gave the right answer for this one above in the thread.
Unfortunately, there are a lot of scenarios where you can't use pgbadger or
any other log analysis, or where it's not convenient.
There are a bunch of cloud-hosted forks of Postgres, for example, and not
all of them give you this functionality.
In AWS, for example, you need to download all the logs first, which
complicates things significantly.
The goal here is not to investigate the performance of a single query, but
rather constant monitoring of a bunch of (or all) queries, so you can
detect plan degradations right away.


> > Another thing is tangentially related...
> > I think it may be good to have a number of options to generate
> > significantly shorter output similar to EXPLAIN. EXPLAIN is great, but
> > sometimes people need more concise and specific information, for example
> > total number of buffers and reads by certain query (this is pretty
> common),
> > whether or not we had certain nodes in the plan (seq scan, scan of
> certain
> > index(es)), how bad was cardinality misprediction on certain nodes, etc.
>
> Maybe, but again I'm a bit skeptical.  IME you frequently don't know
> what you're looking for until you've seen the bigger picture.  Zeroing
> in on details like this could be pretty misleading.
>
>
If you don't know what you're looking for, then it's not very useful, I
agree.
But in many cases you do know. There are certain generic "signs of trouble"
that you can detect by
the amount of data the query processor scans, by the cache hit rate for
certain queries, by the presence of seq scans or scans of certain indexes,
by large differences between predicted and actual rows, and by other things
that may be relevant to your app/queries specifically that you want to
monitor.
We're already doing similar analysis on our side (a multi-terabyte db
cluster with hundreds of millions to billions of queries running daily).
But it's not efficient enough because:
1. the problem I mentioned above - access to logs is limited in cloud
environments
2. EXPLAIN output can be huge, and its size causes performance issues;
compact output is much preferable for mass processing
(even more so if the output goes to notice messages rather than to the
logs, which is why I said this is tangentially related)

Since notice output is already possible, half of the problem is solved.
I'll try to come up with possible options for more compact output
then, unless you think it's completely futile.
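
A rough sketch of the kind of compact extraction I mean, using only what
already exists (EXPLAIN's JSON format and its "Actual Rows" / "Shared Read
Blocks" keys). compact_plan_summary is just a hypothetical helper name; it
actually executes the query (like EXPLAIN ANALYZE does), and it only looks
at the root plan node rather than walking the whole tree:

CREATE OR REPLACE FUNCTION compact_plan_summary(query text)
RETURNS TABLE (node_type text, est_rows numeric, actual_rows numeric,
               shared_hit bigint, shared_read bigint)
LANGUAGE plpgsql AS $$
DECLARE
    plan json;
    root json;
BEGIN
    -- FORMAT JSON returns the whole plan as a single document
    EXECUTE 'EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) ' || query INTO plan;
    root := plan -> 0 -> 'Plan';
    RETURN QUERY SELECT
        root ->> 'Node Type',
        (root ->> 'Plan Rows')::numeric,      -- planner's estimate
        (root ->> 'Actual Rows')::numeric,    -- what actually came out
        (root ->> 'Shared Hit Blocks')::bigint,
        (root ->> 'Shared Read Blocks')::bigint;
END;
$$;

-- example:
-- SELECT * FROM compact_plan_summary('SELECT count(*) FROM pg_class');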

thank you,
-Vladimir Churyukin


Re: Improving inferred query column names

2023-02-11 Thread Vladimir Churyukin
That is a good idea for simple cases; I'm just curious what it would look
like for more complex cases (you can have all kinds of expressions as
arguments to aggregate function calls).
If it works only for simple cases, I think it would be confusing and not
very helpful.
Wouldn't it make more sense to just deduplicate the names by adding
numerical postfixes, like sum_1, sum_2?
For backwards compatibility, I guess you could have a GUC flag controlling
the behavior that can be set to backwards-compatibility mode if required.
The previous behavior could be declared deprecated and removed (along with
the flag) once the current version becomes unsupported
(or with a different deprecation policy; I'm not sure what the general rule
for breaking changes and deprecation currently is).
If there is a clearly defined deprecation policy and a backwards-
compatibility option, that should be fine, no? Just my 2 cents.
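
To make the two naming schemes concrete (reusing the pg_stat_io example from
your mail below; only the first line is current behavior, the rest is what's
being proposed):

SELECT SUM(reads), SUM(writes) FROM pg_stat_io;
-- today:                columns "sum", "sum"
-- numeric postfixes:    columns "sum_1", "sum_2"            (my suggestion above)
-- function + argument:  columns "sum_reads", "sum_writes"   (your proposal)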

-Vladimir Churyukin

On Sat, Feb 11, 2023 at 11:24 AM Andres Freund  wrote:

> Hi,
>
> A common annoyance when writing ad-hoc analytics queries is column naming
> once
> aggregates are used.
>
> Useful column names:
> SELECT reads, writes FROM pg_stat_io;
> column names: reads, writes
>
> Not useful column names:
> SELECT SUM(reads), SUM(writes) FROM pg_stat_io;
> column names: sum, sum
>
> So i often end up manually writing:
> SELECT SUM(reads) AS sum_reads, SUM(writes) AS sum_writes, ... FROM
> pg_stat_io;
>
>
> Of course we can't infer useful column names for everything, but for
> something
> like this, it should't be too hard to do better. E.g. by combining the
> function name with the column name in the argument, if a single plain
> column
> is the argument.
>
> I think on a green field it'd be clearly better to do something like the
> above.  What does give me pause is that it seems quite likely to break
> existing queries, and to a lesser degree, might break applications relying
> on
> inferred column names
>
> Can anybody think of a good way out of that? It's not like that problem is
> going to go away at some point...
>
> Greetings,
>
> Andres Freund
>
>
>


Re: Multiple startup messages over the same connection

2024-05-18 Thread Vladimir Churyukin
On Mon, Jan 22, 2024 at 11:43 PM Heikki Linnakangas  wrote:

> On 22/01/2024 21:58, Vladimir Churyukin wrote:
> > A question about protocol design - would it be possible to extend the
> > protocol, so it can handle multiple startup / authentication messages
> > over a single connection? Are there any serious obstacles? (possible
> > issues with re-initialization of backends, I guess?)
> > If that is possible, it could improve one important edge case - where
> > you have to talk to multiple databases on a single host currently, you
> > need to open a separate connection to each of them. In some cases
> > (multitenancy for example), you may have thousands of databases on a
> > host, which leads to inefficient connection utilization on clients (on
> > the db side too). A lot of other RDBMSes  don't have this limitation.
>
> The protocol and the startup message are the least of your problems.
> Yeah, it would be nice if you could switch between databases, but the
> assumption that one backend operates on one database is pretty deeply
> ingrained in the code.
>
> --
> Heikki Linnakangas
> Neon (https://neon.tech)
>
>
Sorry to revive this old thread; I just want to check on one thing.
Let's say we keep the one-database-per-backend rule - I understand at this
point it would be really hard to change.
What if, on a new startup message, we just signal the postmaster about it,
so it takes over the socket and spawns a new backend, and after that we
terminate the old one? How does that sound in terms of implementation
complexity?
I guess the process of passing control from the child process back to the
parent could be a bit tricky, but doable?
Is there anything I'm missing that would be a no-go for this?
The end goal is to minimize the large overhead for clients that have to deal
with a large number of connections on multi-tenant systems (say, one client
dealing with thousands of databases on the same database server).

-Vladimir Churyukin


Bypassing shared_buffers

2023-06-14 Thread Vladimir Churyukin
Hello,

There is often a need to test particular queries as executed in the
worst-case scenario, i.e. right after a server restart or with no (or a
minimal amount of) data in shared buffers. In Postgres that's currently hard
to achieve (other than restarting the server completely to run a single
query, which is not practical). Is there a simple way to introduce a GUC
variable that makes queries bypass shared_buffers and always read from
storage? It would make testing like that orders of magnitude simpler. I
mean, are there serious technical obstacles, or any other objections to the
idea in principle?

 Thanks,
-Vladimir Churyukin


Re: Bypassing shared_buffers

2023-06-14 Thread Vladimir Churyukin
To be clear, I'm talking about bypassing shared buffers for reading data /
indexes only, not about disabling it completely (which I guess is
impossible anyway).

-Vladimir Churyukin

On Wed, Jun 14, 2023 at 5:57 PM Vladimir Churyukin 
wrote:

> Hello,
>
> There is often a need to test particular queries executed in the
> worst-case scenario, i.e. right after a server restart or with no or
> minimal amount of data in shared buffers. In Postgres it's currently hard
> to achieve (other than to restart the server completely to run a single
> query, which is not practical). Is there a simple way to introduce a GUC
> variable that makes queries bypass shared_buffers and always read from
> storage? It would make testing like that orders of magnitude simpler. I
> mean, are there serious technical obstacles or any other objections to that
> idea in principle?
>
>  Thanks,
> -Vladimir Churyukin
>


Re: Bypassing shared_buffers

2023-06-14 Thread Vladimir Churyukin
Ok, got it, thanks.
Is there any alternative approach to measuring the performance as if the
cache were empty?
The goal is basically to calculate the maximum possible I/O time for a
query, to get a range between the minimum and maximum timings.
It's ok if it's only done during an EXPLAIN ANALYZE call, not for regular
executions.
One thing I can think of: even if the data in storage might be stale, issue
the read calls anyway, purely for measurement purposes.
For EXPLAIN ANALYZE that should be fine, since it doesn't return real data
anyway.
Is it possible that some pages don't exist in storage at all? Is there a
different way to simulate something like that?

-Vladimir Churyukin

On Wed, Jun 14, 2023 at 6:22 PM Tom Lane  wrote:

> Vladimir Churyukin  writes:
> > There is often a need to test particular queries executed in the
> worst-case
> > scenario, i.e. right after a server restart or with no or minimal amount
> of
> > data in shared buffers. In Postgres it's currently hard to achieve (other
> > than to restart the server completely to run a single query, which is not
> > practical). Is there a simple way to introduce a GUC variable that makes
> > queries bypass shared_buffers and always read from storage? It would make
> > testing like that orders of magnitude simpler. I mean, are there serious
> > technical obstacles or any other objections to that idea in principle?
>
> It's a complete non-starter.  Pages on disk are not necessarily up to
> date; but what is in shared buffers is.
>
> regards, tom lane
>


Re: Bypassing shared_buffers

2023-06-14 Thread Vladimir Churyukin
Do you foresee any difficulties in implementing the "unwarm" operation? It
requires a cache-flush operation,
so I'm curious how complicated that is (there is probably a reason it isn't
supported by Postgres by now? MSSQL and Oracle have supported stuff like
that for a long time).
A cluster restart is not an option for us, unfortunately, as it would be
required for pretty much every query, and there are a lot of them.
The ideal solution, if possible, would be to test this in parallel with
other activity...
Evicting all the other data using pg_prewarm is an interesting idea though
(if a large prewarm operation really does evict all the previously cached
data reliably).
It's a bit hacky, but thanks, I think it's possible to make this work with
some effort.
It will require exclusive access just for that testing, which is not ideal
but may work for us.
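
Roughly what I have in mind for the pg_prewarm-based eviction, as a sketch
(dummy_big_table is a placeholder for any large relation we don't care
about; it has to be at least shared_buffers in size for the eviction to be
complete):

CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- fill shared_buffers with unrelated data, pushing out the relation under test
SELECT pg_prewarm('dummy_big_table');

-- the OS page cache has to be dropped separately (Linux, as root):
--   sync && echo 3 > /proc/sys/vm/drop_caches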

-Vladimir Churyukin


On Wed, Jun 14, 2023 at 7:29 PM Thomas Munro  wrote:

> On Thu, Jun 15, 2023 at 1:37 PM Vladimir Churyukin
>  wrote:
> > Ok, got it, thanks.
> > Is there any alternative approach to measuring the performance as if the
> cache was empty?
>
> There are two levels of cache.  If you're on Linux you can ask it to
> drop its caches by writing certain values to /proc/sys/vm/drop_caches.
> For PostgreSQL's own buffer pool, it would be nice if someone would
> extend the pg_prewarm extension to have a similar 'unwarm' operation,
> for testing like that.  But one thing you can do is just restart the
> database cluster, or use pg_prewarm to fill its buffer pool up with
> other stuff (and thus kick out the stuff you didn't want in there).
>


Re: Bypassing shared_buffers

2023-06-14 Thread Vladimir Churyukin
It could be cheaper if the testing is done for many SELECT queries
sequentially - you pretty much need to flush dirty buffers just once.

-Vladimir Churyukin

On Wed, Jun 14, 2023 at 7:43 PM Tom Lane  wrote:

> Thomas Munro  writes:
> > There are two levels of cache.  If you're on Linux you can ask it to
> > drop its caches by writing certain values to /proc/sys/vm/drop_caches.
> > For PostgreSQL's own buffer pool, it would be nice if someone would
> > extend the pg_prewarm extension to have a similar 'unwarm' operation,
> > for testing like that.  But one thing you can do is just restart the
> > database cluster, or use pg_prewarm to fill its buffer pool up with
> > other stuff (and thus kick out the stuff you didn't want in there).
>
> But that'd also have to push out any dirty buffers.  I'm skeptical
> that it'd be noticeably cheaper than stopping and restarting the
> server.
>
> regards, tom lane
>


Re: Bypassing shared_buffers

2023-06-15 Thread Vladimir Churyukin
On Thu, Jun 15, 2023 at 12:32 AM Konstantin Knizhnik 
wrote:

>
>
> On 15.06.2023 4:37 AM, Vladimir Churyukin wrote:
> > Ok, got it, thanks.
> > Is there any alternative approach to measuring the performance as if
> > the cache was empty?
> > The goal is basically to calculate the max possible I/O time for a
> > query, to get a range between min and max timing.
> > It's ok if it's done during EXPLAIN ANALYZE call only, not for regular
> > executions.
> > One thing I can think of is even if the data in storage might be
> > stale, issue read calls from it anyway, for measuring purposes.
> > For EXPLAIN ANALYZE it should be fine as it doesn't return real data
> > anyway.
> > Is it possible that some pages do not exist in storage at all? Is
> > there a different way to simulate something like that?
> >
>
> I do not completely understand what you want to measure: how fast cache
> be prewarmed or what is the performance
> when working set doesn't fit in memory?
>
>
No, it's not about the working set or prewarming speed.
We're trying to see what the worst-case performance is in terms of I/O, i.e.
when the database has just started up or the data/indexes being queried are
not cached at all.

> Why not changing `shared_buffers` size to some very small values (i.e.
> 1MB) doesn't work?
>
> As it was already noticed, there are levels of caching: shared buffers
> and OS file cache.
> By reducing size of shared buffers you rely mostly on OS file cache.
> And actually there is no big gap in performance here - at most workloads
> I didn't see more than 15% difference).
>

I thought about the option of setting minimal shared_buffers, but it
requires a server restart anyway, something I'd like to avoid.

You can certainly flush OS cache `echo 3 > /proc/sys/vm/drop_caches` and
> so simulate cold start.
> But OS cached will be prewarmed quite fast (unlike shared buffer because
> of strange Postgres ring-buffer strategies which cause eviction of pages
> from shared buffers even if there is a lot of free space).
>
> So please more precisely specify the goal of your experiment.
> "max possible I/O time for a query" depends on so many factors...
> Do you consider just one client working in isolation or there will be
> many concurrent queries and background tasks like autovacuum and
> checkpointer  competing for the resources?
>

> My point is that if you need some deterministic result then you will
> have to exclude a lot of different factors which may affect performance
> and then ... you calculate speed of horse in vacuum, which has almost no
> relation to real performance.
>
>
Exactly, we need more or less deterministic results for how bad the I/O
timings can get.
Even though these aren't necessarily the numbers we will be getting in real
life, they give us an idea of the distribution,
and that's useful because we care about the long tail (p99+) of our queries.
For simplicity, let's say it's a single client only (it would be hard to
apply the proposed solutions reliably with other things running in parallel
anyway).

-Vladimir Churyukin