[PG13] Planning (time + buffers) data structure in explain plan (format text)
Hi all, As far as I understand, in the upcoming version 13, information about buffers used during planning is now available in the explain plan. […] Planning Time: 0.203 ms Buffers: shared hit=14 […] In the JSON format, the data structure is a bit different: […] "Planning": { "Planning Time": 0.533, "Shared Hit Blocks": 14, "Shared Read Blocks": 0, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 0, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0 }, […] For a matter of consistency, I wonder if it would be possible to format it like the following: […] Planning: Planning Time: 0.203 ms Buffers: shared hit=14 […] Note: a similar way to format information is already used for JIT. […] JIT: Functions: 3 Options: Inlining false, Optimization false, Expressions true, Deforming true Timing: Generation 0.340 ms, Inlining 0.000 ms, Optimization 0.168 ms, Emission 1.907 ms, Total 2.414 ms […] Regards, Pierre
Re: [PG13] Planning (time + buffers) data structure in explain plan (format text)
Le 07/08/2020 à 14:52, Julien Rouhaud a écrit : > Hi, > > On Fri, Aug 7, 2020 at 2:30 PM Pierre Giraud wrote: >> >> Hi all, >> >> As far as I understand, in the upcoming version 13, information about >> buffers used during planning is now available in the explain plan. > > Indeed. > >> […] >> Planning Time: 0.203 ms >>Buffers: shared hit=14 >> […] >> >> In the JSON format, the data structure is a bit different: >> >> […] >> "Planning": { >>"Planning Time": 0.533, >>"Shared Hit Blocks": 14, >>"Shared Read Blocks": 0, >>"Shared Dirtied Blocks": 0, >>"Shared Written Blocks": 0, >>"Local Hit Blocks": 0, >>"Local Read Blocks": 0, >>"Local Dirtied Blocks": 0, >>"Local Written Blocks": 0, >>"Temp Read Blocks": 0, >>"Temp Written Blocks": 0 >> }, >> […] >> >> For a matter of consistency, I wonder if it would be possible to format >> it like the following: >> >> […] >> Planning: >>Planning Time: 0.203 ms >>Buffers: shared hit=14 >> […] > > I agree that this output looks more consistent with other output, > including JIT as you mentioned. I'll send a patch for that if there's > no objection. Thanks a lot! > > Out of curiosity, is the current text output actually harder to parse > than the one you're suggesting? > I don't want to speak in the name of developers of others parsing tools but this should not require a lot of work to parse the output I'm proposing. It would be nice to have their opinion though, especially Hubert depesz Lubaczewski's since he already integrated the change: https://gitlab.com/depesz/Pg--Explain/-/commit/4a760136ee69ee4929625d4e4022f79ac60b763f However, as far as I know, he's not doing anything with the buffers information with the "Planning" section yet. To answer your question, I think that the new output would make the parser a little bit easier to write because it would make things a bit clearer (ie. more separated) so less prone to errors.
Re: [PG13] Planning (time + buffers) data structure in explain plan (format text)
Can you please show what the plan would look like for? =# explain (buffers on, summary on, format JSON) select * from t; Le 20/08/2020 à 09:58, Fujii Masao a écrit : > > > On 2020/08/20 13:00, David Rowley wrote: >> On Thu, 20 Aug 2020 at 03:31, Fujii Masao >> wrote: >>> With the patch, for example, whatever "summary" settng is, "buffers on" >>> displays the planner's buffer usage if it happens. >> >> I forgot to mention earlier, you'll also need to remove the part in >> the docs that mentions BUFFERS requires ANALYZE. > > Thanks for the review! I removed that. > Attached is the updated version of the patch. > I also added the regression test performing "explain (buffers on)" > without "analyze" option. > > Regards, >
Re: [PG13] Planning (time + buffers) data structure in explain plan (format text)
Le 20/08/2020 à 17:41, Fujii Masao a écrit : > > > On 2020/08/20 22:34, Julien Rouhaud wrote: >> On Thu, Aug 20, 2020 at 12:29 PM David Rowley >> wrote: >>> >>> On Thu, 20 Aug 2020 at 19:58, Fujii Masao >>> wrote: On 2020/08/20 13:00, David Rowley wrote: > I forgot to mention earlier, you'll also need to remove the part in > the docs that mentions BUFFERS requires ANALYZE. Thanks for the review! I removed that. Attached is the updated version of the patch. I also added the regression test performing "explain (buffers on)" without "analyze" option. >>> >>> Looks good to me. >> >> Looks good to me too. > > David and Julien, thanks for the review! I'd like to wait for > Pierre's opinion about this change before committing the patch. > > Pierre, > could you share your opinion about this change? It looks good to me too. Thanks a lot! Let's not forget to notify Hubert (depesz) once integrated. > > Regards, >
Explain: Duplicate key "Workers" in JSON format
Hi, I'm currently working on a tool to visualize an execution plan [1]. For those who know PEV, it's actually a fork of this great tool since it hasn't been active for more than 2 years. Among other things, I'd like to show information for the parallel queries. First of which is information about the workers. I'm facing a problem when I am trying to parse a plan in the JSON format. The "Workers" key may be duplicated. While it's not invalid to have keys with the same name several times at the same level in a JSON object, it makes it almost impossible to get the full info when parsed. Indeed when parsing such a JSON string only the last key is kept. Part of the information is lost. JSON validators warn us with the following message : "Duplicate key, names should be unique." Here's an example of a plan in VERBOSE mode. [ { "Plan": { "Node Type": "Gather Merge", "Parallel Aware": false, "Actual Startup Time": 1720.052, "Actual Total Time": 4252.290, "Actual Rows": 1000, "Actual Loops": 1, "Output": ["c1", "c2"], "Workers Planned": 2, "Workers Launched": 2, "Plans": [ { "Node Type": "Sort", "Parent Relationship": "Outer", "Parallel Aware": false, "Actual Startup Time": 1558.638, "Actual Total Time": 2127.522, "Actual Rows": 333, "Actual Loops": 3, "Output": ["c1", "c2"], "Sort Key": ["t1.c1"], "Sort Method": "external merge", "Sort Space Used": 126152, "Sort Space Type": "Disk", "Workers": [ { "Worker Number": 0, "Sort Method": "external merge", "Sort Space Used": 73552, "Sort Space Type": "Disk" }, { "Worker Number": 1, "Sort Method": "external merge", "Sort Space Used": 73320, "Sort Space Type": "Disk" } ], "Workers": [ { "Worker Number": 0, "Actual Startup Time": 1487.846, "Actual Total Time": 1996.879, "Actual Rows": 2692973, "Actual Loops": 1 }, { "Worker Number": 1, "Actual Startup Time": 1468.256, "Actual Total Time": 2012.744, "Actual Rows": 2684443, "Actual Loops": 1 } ], "Plans": [ { "Node Type": "Seq Scan", "Parent Relationship": "Outer", "Parallel Aware": true, "Relation Name": "t1", "Schema": "public", "Alias": "t1", "Actual Startup Time": 0.211, "Actual Total Time": 372.858, "Actual Rows": 333, "Actual Loops": 3, "Output": ["c1", "c2"], "Workers": [ { "Worker Number": 0, "Actual Startup Time": 0.029, "Actual Total Time": 368.356, "Actual Rows": 2692973, "Actual Loops": 1 }, { "Worker Number": 1, "Actual Startup Time": 0.033, "Actual Total Time": 368.874, "Actual Rows": 2684443, "Actual Loops": 1 } ] } ] } ] }, "Planning Time": 0.170, "Triggers": [ ], "Execution Time": 4695.141 } ] As you can see, the "Workers" key is duplicated in the Sort node. Here's the equivalent in TEXT format: - Gather Merge (cost=735306.27..1707599.95 rows=864 width=17) (actual time=1560.468..3749.583 rows=1000 loops=1) Output: c1, c2 Workers Planned: 2 Workers Launched: 2 -> Sort (cost=734306.25..744722.95 rows=4166682 width=17) (actual time=1474.182..1967.788 rows=333 loops=3) Output: c1, c2 Sort Key: t1.c1 Sort Method: external merge Disk: 125168kB Worker 0: Sort Method: external merge Disk: 73768kB Worker 1: Sort Method: external merge Disk: 74088kB Worker 0: actual time=1431.136..1883.370 rows=2700666 loops=1 Worker 1: actual time=1431.175..1891.630 rows=2712505 loops=1 -> Parallel Seq Scan on public.t1 (cost=0.00..105264.82 rows=4166682 width=17) (actual time=0.214..386.014 rows=333 loops=3) Output: c1, c2 Worker 0: actual time=0.027..382.325 rows=2700666 loops=1 Worker 1: actual time=0.038..384.951 rows=2712505 loops=1 Planning Time: 0.180 ms Execution Time: 4166.867 ms (18 rows) - I think that the text format should stay as is. For the JSON format however it would be better in my opinion if "Workers" data i
Re: Poll: are people okay with function/operator table redesign?
Le 16/04/2020 à 00:18, Tom Lane a écrit : > As I threatened to do earlier, I made a pass at converting table 9.10 > to a couple of the styles under discussion. (This is just a > draft-quality patch, so it might have some minor bugs --- the point > is just to see what these styles look like.) > > I've concluded after looking around that the ideas involving not having > a at all, but just a or the like, are not very > well-advised. That would eliminate, or at least greatly degrade, the > visual distinction between the per-function material and the surrounding > commentary. Which does not seem like a winner to me; for example it > would make it quite hard to skip over the detailed material when you're > just trying to skim the docs. > > We did have a number of people suggesting that just reordering things as > "description, signature, examples" might be a good idea, so I gave that > a try; attached is a rendition of a portion of 9.10 in that style (the > "v1" image). It's not bad, but there's still going to be a lot of > wasted whitespace in tables that include even one long function name. > (9.10's longest is "regexp_split_to_array", so it's showing this problem > significantly.) > > I also experimented with Jonathan's idea of dropping the separate > function name and allowing the function signature to span left into > that column -- see "v2" images. This actually works really well, > and would work even better (IMO) if we could get rid of the inter-row > and inter-column rules within a function entry. I failed to > accomplish that with rowsep/colsep annotations, but from remarks > upthread I suppose there might be a CSS way to accomplish it. (But > the rowsep/colsep annotations *do* work in PDF output, so I kept them; > that means we only need a CSS fix and not some kind of flow-object > magic for PDF.) > > To allow direct comparison of these 9.10 images against the situation > in HEAD, I've also attached an extract of 9.10 as rendered by my > browser with "STYLE=website". As you can see this is *not* quite > identical to how it renders on postgresql.org, so there is still some > unexplained differential in font or margins or something. But if you > look at those three PNGs you can see that either v1 or v2 has a pretty > substantial advantage over HEAD in terms of the amount of space > needed. v2 would be even further ahead if we could eliminate some of > the vertical space around the intra-function row split, which again > might be doable with CSS magic. > > The main disadvantage I can see to the v2 design is that we're back > to having two per function, which is inevitably going to result > in PDF builds putting page breaks between those rows. But you can't > have everything ... and maybe we could find a way to discourage such > breaks if we tried. What about putting everything into one and use a block with some left padding/margin for description + example. This would solve the PDF page break issue as well as the column separation border one. The screenshot attached uses a tag for the descrition/example block. > > Another issue is that v2 won't adapt real well to operator tables; > the operator name won't be at the left. I don't have a lot of faith > in the proposal to fix that with font tricks. Maybe we could stick > to something close to the layout that table 9.30 has in HEAD (ie > repeating the operator name in column 1), since we won't have long > operator names messing up the format. Again, CSS'ing our way > out of the internal lines and extra vertical space within a single > logical table cell would make that layout look nicer. > > On balance I quite like the v2 layout and would prefer to move forward > with that, assuming we can solve the remaining issues via CSS or style > sheets. > > In addition to screenshots, I've attached patches against HEAD that > convert both tables 9.10 and 9.33 into v1 and v2 styles. > > regards, tom lane >
Re: Poll: are people okay with function/operator table redesign?
Le 16/04/2020 à 16:43, Tom Lane a écrit : > Pierre Giraud writes: >> Le 16/04/2020 à 00:18, Tom Lane a écrit : >>> The main disadvantage I can see to the v2 design is that we're back >>> to having two per function, which is inevitably going to result >>> in PDF builds putting page breaks between those rows. But you can't >>> have everything ... and maybe we could find a way to discourage such >>> breaks if we tried. > > Further experimentation shows that the PDF toolchain is perfectly willing > to put a page break *within* a multi-line ; if there is any > preference to break between rows instead, it's pretty weak. So that > argument is a red herring and we shouldn't waste time chasing it. > However, there'd still be some advantage in not being dependent on CSS > hackery to make it look nice in HTML. > > What we're down to wanting, at this point, is basically a para with > hanging indent. > >> What about putting everything into one and use a block with >> some left padding/margin for description + example. >> This would solve the PDF page break issue as well as the column >> separation border one. >> The screenshot attached uses a tag for the descrition/example block. > > That looks about right, perhaps, but could you be a little clearer about > how you accomplished that? Attached you will find the HTML structure with associated styles. Sorry I haven't tried to do this from the DocBook sources. I hope this helps though. Regards # HTML Function Description Example age(timestamp, timestamp) → interval Subtract arguments, producing a “symbolic” result that uses years and months, rather than just days age(timestamp '2001-04-10', timestamp '1957-06-13') → 43 years 9 mons 27 days # CSS #docContent table.table code.literal { font-style: italic; } code.returnvalue { font-weight: bold; } code.function { font-weight: bold; } .replaceable > code { font-weight: normal !important; } #docContent table.table dl { margin-left: 30px; margin-top: 5px; } #docContent table.table dl > dd { margin-bottom: 0; }
Re: [PATCH] Add extra statistics to explain for Nested Loop
Le 17/10/2020 à 06:26, Julien Rouhaud a écrit : > On Sat, Oct 17, 2020 at 12:15 PM Pavel Stehule > wrote: >> >> so 17. 10. 2020 v 0:11 odesílatel Anastasia Lubennikova >> napsal: >>> >>> On 16.10.2020 12:07, Julien Rouhaud wrote: >>> >>> Le ven. 16 oct. 2020 à 16:12, Pavel Stehule a >>> écrit : pá 16. 10. 2020 v 9:43 odesílatel napsal: > > Hi, hackers. > For some distributions of data in tables, different loops in nested loop > joins can take different time and process different amounts of entries. > It makes average statistics returned by explain analyze not very useful > for DBA. > To fix it, here is the patch that add printing of min and max statistics > for time and rows across all loops in Nested Loop to EXPLAIN ANALYSE. > Please don't hesitate to share any thoughts on this topic! +1 This is great feature - sometimes it can be pretty messy current limited format >>> >>> >>> +1, this can be very handy! >>> >>> Cool. >>> I have added your patch to the commitfest, so it won't get lost. > > Thanks! I'll also try to review it next week. > >>> https://commitfest.postgresql.org/30/2765/ >>> >>> I will review the code next week. Unfortunately, I cannot give any >>> feedback about usability of this feature. >>> >>> User visible change is: >>> >>> - -> Nested Loop (actual rows=N loops=N) >>> + -> Nested Loop (actual min_rows=0 rows=0 max_rows=0 loops=2) >> >> >> This interface is ok - there is not too much space for creativity. > > Yes I also think it's ok. We should also consider usability for tools > like explain.depesz.com, I don't know if the current output is best. > I'm adding Depesz and Pierre which are both working on this kind of > tool for additional input. Same for me and PEV2. It should be fairly easy to parse this new format. > >> I can imagine displaying variance or average - but I am afraid about very >> bad performance impacts. > > The original counter (rows here) is already an average right? > Variance could be nice too. Instrumentation will already spam > gettimeofday() calls for nested loops, I don't think that computing > variance would add that much overhead? Thus, it's an average value. And to be mentioned: a rounded one! Which I found a bit tricky to figure out.
doc: Remove LC_COLLATE and LC_CTYPE from SHOW command
Hi, Since version 16, the read-only server settings lc_collate and lc_ctype have been removed [1]. Unfortunately, the documentation has been updated only partially. Attached is a patch to also update the page for the "SHOW" SQL command and remove the corresponding "LC_COLLATE" and "LC_CTYPE" parameters. I successfully built the doc in HTML locally. This patch has been created against the current master branch but it should equally be applied to REL_16_STABLE and REL_17_STABLE. Thanks, Pierre [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b0f6c437160db640d4ea3e49398ebc3ba39d1982 From 32575b9478ecff4dd27a7215f9d34e6eb2abf6d0 Mon Sep 17 00:00:00 2001 From: Pierre GIRAUD Date: Wed, 4 Dec 2024 09:03:02 +0100 Subject: [PATCH v1] doc: remove LC_COLLATE and LC_CTYPE from SHOW command The corresponding read-only server settings have been removed since in PG16. See commit b0f6c437160db6. --- doc/src/sgml/ref/show.sgml | 24 1 file changed, 24 deletions(-) diff --git a/doc/src/sgml/ref/show.sgml b/doc/src/sgml/ref/show.sgml index b3747b119f..5fbb5bbe01 100644 --- a/doc/src/sgml/ref/show.sgml +++ b/doc/src/sgml/ref/show.sgml @@ -77,30 +77,6 @@ SHOW ALL - -LC_COLLATE - - - Shows the database's locale setting for collation (text - ordering). At present, this parameter can be shown but not - set, because the setting is determined at database creation - time. - - - - - -LC_CTYPE - - - Shows the database's locale setting for character - classification. At present, this parameter can be shown but - not set, because the setting is determined at database creation - time. - - - - IS_SUPERUSER -- 2.43.0