Execute INSERT during a parallel operation
Hi,

I'm trying to use SPI to save the executed plans in the ExecutorEnd hook. When the plan involves multiple workers, the INSERT triggers an error: cannot execute INSERT during a parallel operation. I wonder if there is a different hook I can use when there's a Gather node, or some other way to get around this?

Thank you,
Donald Dong
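For context, a minimal sketch of the kind of ExecutorEnd hook being described (assuming the usual loadable-module setup; the plan_log table and the guard that simply skips the INSERT while parallel mode is active are illustrative assumptions, not a recommended fix):

#include "postgres.h"
#include "fmgr.h"
#include "access/parallel.h"
#include "access/xact.h"
#include "executor/executor.h"
#include "executor/spi.h"

PG_MODULE_MAGIC;

static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;

static void
save_plan_ExecutorEnd(QueryDesc *queryDesc)
{
    /*
     * One possible workaround: skip the INSERT while parallel mode is still
     * active (or when running inside a worker), since SPI would otherwise
     * fail with "cannot execute INSERT during a parallel operation".
     * Error handling is omitted; "plan_log" is a hypothetical table.
     */
    if (!IsInParallelMode() && !IsParallelWorker())
    {
        SPI_connect();
        SPI_execute("INSERT INTO plan_log(note) VALUES ('captured plan here')",
                    false, 0);
        SPI_finish();
    }

    if (prev_ExecutorEnd)
        prev_ExecutorEnd(queryDesc);
    else
        standard_ExecutorEnd(queryDesc);
}

void
_PG_init(void)
{
    prev_ExecutorEnd = ExecutorEnd_hook;
    ExecutorEnd_hook = save_plan_ExecutorEnd;
}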
Why could GEQO produce plans with lower costs than the standard_join_search?
Hi,

I was expecting the plans generated by standard_join_search to have lower costs than the plans from GEQO, but the results I have from a join order benchmark show that GEQO produces plans with lower costs most of the time! I wonder what is causing this observation? From my understanding, standard_join_search does a complete search, so I'm not sure how GEQO managed to do better than that.

Thank you,
Donald Dong
Re: Why could GEQO produce plans with lower costs than the standard_join_search?
Hi,

Thank you very much for the explanation! I think the join order benchmark I used [1] is somewhat representative; however, I probably didn't use the most accurate cost estimation.

I find the cost from cheapest_total_path->total_cost is different from the cost from queryDesc->planstate->total_cost. What I saw was that GEQO tends to form paths with a lower cheapest_total_path->total_cost (aka the fitness of the children). However, standard_join_search is more likely to produce a lower queryDesc->planstate->total_cost, which is the cost we get using EXPLAIN.

I wonder why those two total costs are different? If the total_cost from the planstate is more accurate, could we use that instead as the fitness in geqo_eval?

[1] https://github.com/gregrahn/join-order-benchmark

Regards,
Donald Dong

> On May 7, 2019, at 4:44 PM, Tom Lane wrote:
>
> Donald Dong writes:
>> I was expecting the plans generated by standard_join_search to have lower costs
>> than the plans from GEQO. But after the results I have from a join order
>> benchmark show that GEQO produces plans with lower costs most of the time!
>
>> I wonder what is causing this observation? From my understanding,
>> standard_join_search is doing a complete search. So I'm not sure how the GEQO
>> managed to do better than that.
>
> standard_join_search is *not* exhaustive; there's a heuristic that causes
> it not to consider clauseless joins unless it has to.
>
> For the most part, GEQO uses the same heuristic (cf desirable_join()),
> but given the right sort of query shape you can probably trick it into
> situations where it will be forced to use a clauseless join when the
> core code wouldn't. It'd still be surprising for that to come out with
> a lower cost estimate than a join order that obeys the heuristic,
> though. Clauseless joins are generally pretty awful.
>
> I'm a tad suspicious about the representativeness of your benchmark
> queries if you find this is happening "most of the time".
>
> regards, tom lane
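As a side note, here is a minimal sketch of one way to log the number being used as the fitness (assuming a small loadable module; the naming is mine and not from a posted patch). The planstate-side number corresponds to the top plan node's total_cost once planning has finished:

#include "postgres.h"
#include "fmgr.h"
#include "optimizer/paths.h"

PG_MODULE_MAGIC;

static join_search_hook_type prev_join_search_hook = NULL;

static RelOptInfo *
log_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
{
    RelOptInfo *rel = prev_join_search_hook
        ? prev_join_search_hook(root, levels_needed, initial_rels)
        : standard_join_search(root, levels_needed, initial_rels);

    /* the GEQO-fitness-style number: cheapest path of the final join rel */
    elog(LOG, "cheapest_total_path->total_cost = %.2f",
         rel->cheapest_total_path->total_cost);

    /*
     * The other number only exists after planning finishes: the top plan
     * node's cost is available as queryDesc->plannedstmt->planTree->total_cost,
     * e.g. from an ExecutorStart or ExecutorEnd hook.
     */
    return rel;
}

void
_PG_init(void)
{
    prev_join_search_hook = join_search_hook;
    join_search_hook = log_join_search;
}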
Re: Why could GEQO produce plans with lower costs than the standard_join_search?
On May 22, 2019, at 11:42 AM, Tom Lane wrote:
>
> Donald Dong writes:
>> I find the cost from cheapest_total_path->total_cost is different
>> from the cost from queryDesc->planstate->total_cost. What I saw was
>> that GEQO tends to form paths with lower
>> cheapest_total_path->total_cost (aka the fitness of the children).
>> However, standard_join_search is more likely to produce a lower
>> queryDesc->planstate->total_cost, which is the cost we get using
>> explain.
>
>> I wonder why those two total costs are different? If the total_cost
>> from the planstate is more accurate, could we use that instead as the
>> fitness in geqo_eval?
>
> You're still asking us to answer hypothetical questions unsupported
> by evidence. In what case does that really happen?

Hi,

My apologies if this is not the minimal necessary setup, but here is more information about what I saw using the following query (JOB/1a.sql):

SELECT MIN(mc.note) AS production_note,
       MIN(t.title) AS movie_title,
       MIN(t.production_year) AS movie_year
FROM company_type AS ct,
     info_type AS it,
     movie_companies AS mc,
     movie_info_idx AS mi_idx,
     title AS t
WHERE ct.kind = 'production companies'
  AND it.info = 'top 250 rank'
  AND mc.note NOT LIKE '%(as Metro-Goldwyn-Mayer Pictures)%'
  AND (mc.note LIKE '%(co-production)%' OR mc.note LIKE '%(presents)%')
  AND ct.id = mc.company_type_id
  AND t.id = mc.movie_id
  AND t.id = mi_idx.movie_id
  AND mc.movie_id = mi_idx.movie_id
  AND it.id = mi_idx.info_type_id;

I attached the query plan and debug_print_rel output for GEQO and standard_join_search.

       planstate->total_cost   cheapest_total_path
GEQO   54190.13                54239.03
STD    54179.02                54273.73

Here I observe GEQO produces a lower cheapest_total_path->total_cost, but its planstate->total_cost is higher than what standard_join_search produces.

Regards,
Donald Dong

Finalize Aggregate  (cost=54190.12..54190.13 rows=1 width=68)
  ->  Gather  (cost=54189.90..54190.11 rows=2 width=68)
        Workers Planned: 2
        ->  Partial Aggregate  (cost=53189.90..53189.91 rows=1 width=68)
              ->  Nested Loop  (cost=15318.71..53189.55 rows=46 width=45)
                    Join Filter: (mc.movie_id = t.id)
                    ->  Hash Join  (cost=15318.28..53162.39 rows=46 width=32)
                          Hash Cond: (mc.company_type_id = ct.id)
                          ->  Parallel Hash Join  (cost=15317.21..53160.33 rows=185 width=36)
                                Hash Cond: (mc.movie_id = mi_idx.movie_id)
                                ->  Parallel Seq Scan on movie_companies mc  (cost=0.00..37814.90 rows=7320 width=32)
                                      Filter: (((note)::text !~~ '%(as Metro-Goldwyn-Mayer Pictures)%'::text) AND (((note)::text ~~ '%(co-production)%'::text) OR ((note)::text ~~ '%(presents)%'::text)))
                                ->  Parallel Hash  (cost=15253.60..15253.60 rows=5089 width=4)
                                      ->  Hash Join  (cost=2.43..15253.60 rows=5089 width=4)
                                            Hash Cond: (mi_idx.info_type_id = it.id)
                                            ->  Parallel Seq Scan on movie_info_idx mi_idx  (cost=0.00..13685.15 rows=575015 width=8)
                                            ->  Hash  (cost=2.41..2.41 rows=1 width=4)
                                                  ->  Seq Scan on info_type it  (cost=0.00..2.41 rows=1 width=4)
                                                        Filter: ((info)::text = 'top 250 rank'::text)
                          ->  Hash  (cost=1.05..1.05 rows=1 width=4)
                                ->  Seq Scan on company_type ct  (cost=0.00..1.05 rows=1 width=4)
                                      Filter: ((kind)::text = 'production companies'::text)
                    ->  Index Scan using title_pkey on title t  (cost=0.43..0.58 rows=1 width=25)
                          Index Cond: (id = mi_idx.movie_id)

Finalize Aggregate  (cost=54179.01..54179.02 rows=1 width=68)
  ->  Gather  (cost=54178.79..54179.00 rows=2 width=68)
        Workers Planned: 2
        ->  Partial Aggregate  (cost=53178.79..53178.80 rows=1 width=68)
              ->  Nested Loop  (cost=37881.27..53178.44 rows=46 width=45)
                    Join Filter: (mc.movie_id = t.id)
                    ->  Parallel Hash Join  (cost=37880.84..53151.28 rows=46 width=32)
                          Hash Cond: (mi_idx.movie_id = mc.movie_id)
                          ->  Hash Join  (cost=2.43..15253.60
Re: Why could GEQO produce plans with lower costs than the standard_join_search?
On May 23, 2019, at 9:02 AM, Tom Lane wrote:
>
> Donald Dong writes:
>> On May 22, 2019, at 11:42 AM, Tom Lane wrote:
>>> You're still asking us to answer hypothetical questions unsupported
>>> by evidence. In what case does that really happen?
>
>> I attached the query plan and debug_print_rel output for GEQO and
>> standard_join_search.
>
>>        planstate->total_cost   cheapest_total_path
>> GEQO   54190.13                54239.03
>> STD    54179.02                54273.73
>
>> Here I observe GEQO produces a lower
>> cheapest_total_path->total_cost, but its planstate->total_cost is higher
>> than what standard_join_search produces.
>
> Well,
>
> (1) the plan selected by GEQO is in fact more expensive than
> the one found by the standard search. Not by much --- as Andrew
> observes, this difference is less than what the planner considers
> "fuzzily the same" --- but nonetheless 54190.13 > 54179.02.
>
> (2) the paths you show do not correspond to the finally selected
> plans --- they aren't even the same shape. (The Gathers are in
> different places, to start with.) I'm not sure where you were
> capturing the path data, but it looks like you missed top-level
> parallel-aggregation planning, and that managed to find some
> plans that were marginally cheaper than the ones you captured.
> Keep in mind that GEQO only considers join planning, not
> grouping/aggregation.
>
> Andrew's point about fuzzy cost comparison is also a good one,
> though we needn't invoke it to explain these particular numbers.

Oh, that's very good to know! I captured the path at the end of the join_search_hook. If I understood correctly, top-level parallel aggregation is applied later, so GEQO does not take it into consideration during join searching?

By looking at the captured costs, I thought GEQO found a better join order than standard_join_search. However, the final plan using the join order produced by GEQO turns out to be more expensive. Would that imply that if GEQO sees a join order identical to the one produced by standard_join_search, it will discard it since its cheapest_total_path has a higher cost, even though the final plan may be cheaper?

Here is another query (JOB/27a.sql) which has more significant cost differences:

       planstate->total_cost   cheapest_total_path
GEQO   343016.77               343016.75
STD    342179.13               344137.33

Regards,
Donald Dong
Re: Why could GEQO produce plans with lower costs than the standard_join_search?
On May 23, 2019, at 10:43 AM, Tom Lane wrote:
> Donald Dong writes:
>> On May 23, 2019, at 9:02 AM, Tom Lane wrote:
>>> (2) the paths you show do not correspond to the finally selected
>>> plans --- they aren't even the same shape. (The Gathers are in
>>> different places, to start with.) I'm not sure where you were
>>> capturing the path data, but it looks like you missed top-level
>>> parallel-aggregation planning, and that managed to find some
>>> plans that were marginally cheaper than the ones you captured.
>>> Keep in mind that GEQO only considers join planning, not
>>> grouping/aggregation.
>
>> By looking at the captured costs, I thought GEQO found a better join
>> order than the standard_join_search. However, the final plan using
>> the join order produced by GEQO turns out to be more expensive. Would
>> that imply if GEQO sees a join order which is identical to the one
>> produced by standard_join_search, it will discard it since the
>> cheapest_total_path has a higher cost, though the final plan may be
>> cheaper?
>
> I suspect what's really going on is that you're looking at the wrong
> paths. The planner remembers more paths for each rel than just the
> cheapest-total-cost one, the reason being that total cost is not the
> only figure of merit. The plan that is winning in the end, it looks
> like, is parallelized aggregation on top of a non-parallel join plan,
> but the cheapest_total_path uses up the opportunity for a Gather on
> a parallelized scan/join. If we were just doing a scan/join and
> no aggregation, that path would have been the basis for the final
> plan, but it's evidently not being chosen here; the planner is going
> to some other scan/join path that is not parallelized.

It seems the paths in the final rel (path list, cheapest parameterized paths, cheapest startup path, and cheapest total path) are all the same path for this particular query (JOB/1a.sql). Am I missing anything?

Since the total cost of the cheapest-total-path is what GEQO currently uses to evaluate the fitness (minimizing), I'm expecting the cheapest-total-cost to measure how good a join order is. So it is pretty surprising to me that a join order from standard_join_search, with a higher cheapest-total-cost, ends up being better. Perhaps the cheapest-total-cost should not be the best/only choice for fitness?

Regards,
Donald Dong
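For reference, the fitness in question is computed in geqo_eval() (src/backend/optimizer/geqo/geqo_eval.c) roughly as follows (abridged, not a verbatim excerpt):

/* tour is the candidate join order being evaluated */
joinrel = gimme_tree(root, tour, num_gene);

if (joinrel != NULL)
{
    Path   *best_path = joinrel->cheapest_total_path;

    fitness = best_path->total_cost;    /* lower total cost == fitter tour */
}
else
    fitness = DBL_MAX;                  /* invalid join order */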
Different row estimations on base rels
Hi,

I noticed the estimated rows of the base relations during join searching are *very* different from the estimations in the final plan.

Join search (rows of the initial_rels):

RELOPTINFO (ct): rows=1 width=4
RELOPTINFO (it): rows=1 width=4
RELOPTINFO (mc): rows=17567 width=32
RELOPTINFO (mi_idx): rows=1380035 width=8
RELOPTINFO (t): rows=2528356 width=25

The final plan:

Seq Scan on company_type ct  (cost=0.00..1.05 rows=1 width=4)
Seq Scan on info_type it  (cost=0.00..2.41 rows=1 width=4)
Parallel Seq Scan on movie_companies mc  (cost=0.00..37814.90 rows=7320 width=32)
Parallel Seq Scan on movie_info_idx mi_idx  (cost=0.00..13685.15 rows=575015 width=8)
Index Scan using title_pkey on title t  (cost=0.43..0.58 rows=1 width=25)

Looking at the joinrel->rows, I would expect relation t to have the largest size; however, this is not true at all. I wonder what's causing this observation, and how to get estimations close to the final plan?

Thank you,
Donald Dong
Re: Different row estimations on base rels
On May 29, 2019, at 1:36 PM, Robert Haas wrote:
> Well, it's all there in the code. I believe the issue is that the
> final estimates are based on the number of rows that will be returned
> from the relation, which is often less, and occasionally more, than
> the total of the rows in the relation. The reason it's often less is
> because there might be a WHERE clause or similar which rules out some
> of the rows. The reason it might be more is because a nested loop
> could return the same rows multiple times.

Yes, indeed. I was confused, and I guess I could've thought about it a bit more before posting here. Thank you for answering this question!

Regards,
Donald Dong
Print baserestrictinfo for varchar fields
Hi,

I noticed that debug_print_rel outputs "unknown expr" when the fields in baserestrictinfo are typed as varchar.

create table tbl_a(id int, info varchar(32));

RELOPTINFO (tbl_a): rows=4 width=86
	baserestrictinfo: unknown expr = pattern

My approach is to handle the RelabelType case in print_expr. After the patch, I get:

RELOPTINFO (tbl_a): rows=4 width=86
	baserestrictinfo: tbl_a.info = pattern

I wonder if this is a proper way of fixing it?

Thank you,
Donald Dong

001_print_baserestrictinfo_varchar.patch
Description: Binary data
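As an illustration of the described approach (not the attached patch), the new branch in print_expr() in src/backend/optimizer/path/allpaths.c might look roughly like this:

else if (IsA(expr, RelabelType))
{
    const RelabelType *r = (const RelabelType *) expr;

    /* print the argument as if the binary-compatible cast weren't there */
    print_expr((const Node *) r->arg, rtable);
}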
undefined symbol: PQgssEncInUse
Hi,

After I run make temp-install on HEAD with a clean build, I fail to start psql (tmp_install/usr/local/pgsql/bin/psql) and get an error message:

./psql: symbol lookup error: ./psql: undefined symbol: PQgssEncInUse

However, make check and other tests still work. For me, everything is fine up until commit b0b39f72b9904bcb80f97b35837ccff1578aa4b8. I wonder if this only occurs for me?

Thank you,
Donald Dong
Re: undefined symbol: PQgssEncInUse
On May 29, 2019, at 8:23 PM, Paul Guo wrote: > Have you used the correct libpq library? If yes, you might want to check the > build logs and related files to see where is wrong. In my environment, it's > ok with both gssapi enabled or disabled. Thank you! Resetting libpq's path fixes it. Regards, Donald Dong
Re: Ryu floating point output patch
> On Jan 18, 2019, at 2:05 AM, Andrew Gierth wrote:
>
> BTW, doing that in a thread about a commitfest patch confuses the
> commitfest app and cfbot (both of which think it's a new version of the
> patch under discussion), so best avoided.

Oops. Thank you. Noted.

I think the previous additional-digits behavior still exists in the latest patch. For example:

=# set extra_float_digits = 0;
SET
=# select float4in('1.123456789');
 float4in
----------
  1.12346
(1 row)

=# set extra_float_digits = 1;
SET
=# select float4in('1.123456789');
 float4in
-----------
 1.1234568
(1 row)

extra_float_digits is increased by 1, but two digits are added. Without the patch, it would render ' 1.123457' instead.
Re: Ryu floating point output patch
> On Jan 19, 2019, at 5:12 PM, Andrew Gierth > wrote: > >>>>>> "Donald" == Donald Dong writes: > > Donald> I think the previous additional digits behavior still exist > Donald> in the latest patch. For example: > > Donald> =# set extra_float_digits = 0; > Donald> SET > Donald> =# select float4in('1.123456789'); > Donald> float4in > Donald> -- > Donald> 1.12346 > Donald> (1 row) > > Donald> =# set extra_float_digits = 1; > Donald> SET > Donald> =# select float4in('1.123456789'); > Donald> float4in > Donald> --- > Donald> 1.1234568 > Donald> (1 row) > > Donald> The extra_float_digits is increased by 1, but two digits are > Donald> added. > > That's intentional; this patch takes any value of extra_float_digits > greater than 0 to mean "generate the shortest precise output". > > In practice setting extra_float_digits to 1 is rare to nonexistent; > almost all users of extra_float_digits set it to either 3 (to get > precise values), 2 (for historical reasons since we didn't allow more > than 2 in older versions), or less than 0 (to get rounded output for > more convenient display when precision is not required). Makes sense! Thank you for explaining. I wonder if it's necessary to update the doc accordingly? Such as https://www.postgresql.org/docs/11/datatype-numeric.html#DATATYPE-FLOAT and https://www.postgresql.org/docs/11/runtime-config-client.html .
Install JIT headers
Hi,

In standard_planner, we set the proper JIT flags on the resulting PlannedStmt node. We also offer a planner hook so extensions can add customized planners. The JIT flags in jit/jit.h, however, are not present in the installed headers. I think it's probably better to install the jit headers?

diff --git a/src/include/Makefile b/src/include/Makefile
index 6bdfd7db91..041702809e 100644
--- a/src/include/Makefile
+++ b/src/include/Makefile
@@ -19,7 +19,7 @@ all: pg_config.h pg_config_ext.h pg_config_os.h
 # Subdirectories containing installable headers
 SUBDIRS = access bootstrap catalog commands common datatype \
 	executor fe_utils foreign \
-	lib libpq mb nodes optimizer parser partitioning postmaster \
+	jit lib libpq mb nodes optimizer parser partitioning postmaster \
 	regex replication rewrite \
 	statistics storage tcop snowball snowball/libstemmer tsearch \
 	tsearch/dicts utils port port/atomics port/win32 port/win32_msvc \

Regards,
Donald Dong
Analyze all plans
Hi,

I'm working on an extension which analyzes all possible plans generated by the planner. I believe this extension would be useful for benchmarking the planner (e.g. the performance of the estimation and the cost model) and for better understanding the cases where the planner makes a suboptimal move.

Here are my procedures:

1. Enumerate all the plans
   1.1 Create a hook for add_path so that it won't discard the expensive paths from the planner's point of view.
   1.2 Collect all the paths from final_rel->pathlist, and turn them into PlannedStmt nodes by patching standard_planner.
   1.3 Mark the cheapest path from the planner's point of view.
2. Explain all the plans
   2.1 First explain the cheapest plan
       2.1.1 If analyzing, collect the execution time and use it to set a timer interrupt.
   2.2 Explain the remaining plans
       2.2.1 The other plans can be disastrous; they may never finish in a reasonable amount of time. If analyzing, the timer interrupt shall stop the executor.
       2.2.2 Move on to the next plan

Are those procedures reasonable? I'm able to implement all the steps except for 2.2.1.

- Attempt 1
  Our most common way of handling timeouts is to kill the process. However, that would terminate the entire EXPLAIN statement, yet there are still plans to be explained.

- Attempt 2
  Fork before explaining, so it would be possible to kill the child process without disrupting the EXPLAIN statement. However, simply allocating shared memory for the QueryDesc does not work (PANIC: failed to re-find shared lock object). To me, this plan is more doable, but it seems there are other mechanisms I need to be aware of in order to execute a plan in the child process and report the result in the parent process?

What do you think? I would appreciate any feedback.

Thank you,
Donald Dong
Re: Analyze all plans
On Jan 23, 2019, at 6:46 AM, Tom Lane wrote: > > Oleksandr Shulgin writes: >> On Wed, Jan 23, 2019 at 9:44 AM Donald Dong wrote: >>> 1. Enumerate all the plans > >> So enumerating all possible plans stops being practical for even slightly >> complicated queries. > > Yes. You can *not* disable the planner's aggressive pruning of losing > paths and subpaths without ending up with something that's completely > impractical for anything beyond toy queries. That's just from the > standpoint of planner runtime. Adding on the cost of actually creating > a finished plan and then running it for long enough to get a reliable > runtime for each case would ... well, let's just say you'd be safely dead > before you got any interesting answers. Thank you for the feedback. I didn't think about this carefully. I thought the planner would use GEQO when the number of joins is large, but indeed it's still n! for normal path selection. Now I keep top-k cheapest sub plans, where k is configured by users. If the k is set up properly, setting a timeout may not be necessary. However, if I do want to set a timeout, I would run into the issues I had in step 2. I'm looking forward to hearing more thoughts :) Thanks again!
Actual Cost
Hi,

When explaining a query, I think knowing the actual rows and pages, in addition to the operation type (e.g. seqscan), would be enough to calculate the actual cost. The actual cost metric could be useful when we want to look into how far off the planner's estimation is, and the correlation between time and cost. Would it be a feature worth considering?

Thank you,
Donald Dong
Re: Actual Cost
On Feb 16, 2019, at 6:44 PM, Tomas Vondra wrote: > > On 2/17/19 3:40 AM, David Fetter wrote: >> >> As someone not volunteering to do any of the work, I think it'd be a >> nice thing to have. How large an effort would you guess it would be >> to build a proof of concept? > > I don't quite understand what is meant by "actual cost metric" and/or > how is that different from running EXPLAIN ANALYZE. Here is an example: Hash Join (cost=3.92..18545.70 rows=34 width=32) (actual cost=3.92..18500 time=209.820..1168.831 rows=47 loops=3) Now we have the actual time. Time can have a high variance (a change in system load, or just noises), but I think the actual cost would be less likely to change due to external factors. On 2/17/19 3:40 AM, David Fetter wrote: > On Sat, Feb 16, 2019 at 03:10:44PM -0800, Donald Dong wrote: >> Hi, >> >> When explaining a query, I think knowing the actual rows and pages >> in addition to the operation type (e.g seqscan) would be enough to >> calculate the actual cost. The actual cost metric could be useful >> when we want to look into how off is the planner's estimation, and >> the correlation between time and cost. Would it be a feature worth >> considering? > > As someone not volunteering to do any of the work, I think it'd be a > nice thing to have. How large an effort would you guess it would be > to build a proof of concept? Intuitively it does not feel very complicated to me, but I think the interface when we're planning (PlannerInfo, RelOptInfo) is different from the interface when we're explaining a query (QueryDesc). Since I'm very new, if I'm doing it myself, it would probably take many iterations to get to the right point. But still, I'm happy to work on a proof of concept if no one else wants to work on it. regards, Donald Dong
Re: Actual Cost
On Feb 16, 2019, at 9:31 PM, Tom Lane wrote: > > Donald Dong writes: >> On Feb 16, 2019, at 6:44 PM, Tomas Vondra wrote: >>> I don't quite understand what is meant by "actual cost metric" and/or >>> how is that different from running EXPLAIN ANALYZE. > >> Here is an example: > >> Hash Join (cost=3.92..18545.70 rows=34 width=32) (actual cost=3.92..18500 >> time=209.820..1168.831 rows=47 loops=3) > >> Now we have the actual time. Time can have a high variance (a change >> in system load, or just noises), but I think the actual cost would be >> less likely to change due to external factors. > > I'm with Tomas: you have not explained what you think those > numbers mean. Yeah, I was considering the actual cost to be the output of the cost model given the actual rows and pages after we instrument the execution: plug in the values which are no longer estimations. For a hash join, we could use the actual inner_rows_total to get the actual cost. For a seqscan, we can use the actual rows to get the actual CPU cost. regards, Donald Dong
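To make the idea concrete, a rough sketch for the sequential-scan case, mirroring the shape of cost_seqscan(); actual_pages, actual_rows, and qual_cost_per_tuple are hypothetical inputs here, not existing executor fields:

#include "postgres.h"
#include "optimizer/cost.h"

/* re-evaluate the seqscan cost formula with instrumented counts */
static Cost
actual_seqscan_cost(double actual_pages, double actual_rows,
                    Cost qual_cost_per_tuple)
{
    Cost disk_cost = seq_page_cost * actual_pages;
    Cost cpu_cost = (cpu_tuple_cost + qual_cost_per_tuple) * actual_rows;

    return disk_cost + cpu_cost;
}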
Re: Actual Cost
On Feb 17, 2019, at 10:56 AM, Tom Lane wrote:
> Perhaps, but refactoring to get that seems impractically invasive &
> expensive, since e.g. index AM cost estimate functions would have to
> be redefined, plus we'd have to carry around some kind of cost vector
> rather than single numbers for every Path ...

Maybe we could walk through the final plan tree and fill in the expression, with another tree structure to hold the cost vectors?

On Feb 17, 2019, at 8:29 AM, Jeff Janes wrote:
> I don't think there is any way to assign an actual cost. For example how do
> you know if a buffer read was "actually" seq_page_cost or random_page_cost?
> And if there were a way, it too would have a high variance.

Thanks for pointing that out! I think it's probably fine to use the same assumptions as the cost model? For example, we assume 3/4ths of accesses are sequential and 1/4th are not.

> What would I find very useful is a verbosity option to get the cost estimates
> expressed as a multiplier of each *_cost parameter, rather than just as a
> scalar.

Yeah, such an expression would also help if we want to plug in the actual values.

On Feb 17, 2019, at 4:11 AM, Tomas Vondra wrote:
> Perhaps I'm just too used to comparing the rows/pages directly, but I
> don't quite see the benefit of having such "actual cost". Mostly because
> the cost model is rather rough anyway.

Yeah, I think the "actual cost" is only "actual" for the cost model - the cost it would output given the exact row/page numbers. Some articles/papers have shown that row estimation is the primary reason for the planner to go wrong, so showing the actual cost (or the updated assumption) might also be useful if people want to compare different plans and refer to a more accurate quantitative measure.

regards,
Donald Dong
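For illustration only, one hypothetical shape for the per-parameter breakdown mentioned above (no such structure exists in the tree):

/*
 * Record how many units of each cost GUC a node consumes, so that
 *   total_cost = seq_pages * seq_page_cost + random_pages * random_page_cost + ...
 * and an "actual cost" could be obtained by re-evaluating the same
 * dot product with updated counts.
 */
typedef struct CostVector
{
    double      seq_pages;          /* multiplied by seq_page_cost */
    double      random_pages;       /* multiplied by random_page_cost */
    double      cpu_tuples;         /* multiplied by cpu_tuple_cost */
    double      cpu_index_tuples;   /* multiplied by cpu_index_tuple_cost */
    double      cpu_operators;      /* multiplied by cpu_operator_cost */
} CostVector;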
Re: Actual Cost
On Feb 17, 2019, at 11:05 AM, Donald Dong wrote:
>
> On Feb 17, 2019, at 10:56 AM, Tom Lane wrote:
>> Perhaps, but refactoring to get that seems impractically invasive &
>> expensive, since e.g. index AM cost estimate functions would have to
>> be redefined, plus we'd have to carry around some kind of cost vector
>> rather than single numbers for every Path ...
>
> Maybe we could walk through the final plan tree and fill the expression? With
> another tree structure to hold the cost vectors.

Here is a draft patch. I added a new structure called CostInfo to the Plan node. The CostInfo is added in create_plan, and the cost calculation is centered on CostInfo. Is this a reasonable approach?

Thank you,
Donald Dong

01_actual_cost_seqscan_001.patch
Description: Binary data
Implicit make rules break test examples
Hi,

In src/test/examples, the implicit make rules produce errors:

make -C ../../../src/backend generated-headers
make[1]: Entering directory '/home/ddong/postgresql/bld/src/backend'
make -C catalog distprep generated-header-symlinks
make[2]: Entering directory '/home/ddong/postgresql/bld/src/backend/catalog'
make[2]: Nothing to be done for 'distprep'.
make[2]: Nothing to be done for 'generated-header-symlinks'.
make[2]: Leaving directory '/home/ddong/postgresql/bld/src/backend/catalog'
make -C utils distprep generated-header-symlinks
make[2]: Entering directory '/home/ddong/postgresql/bld/src/backend/utils'
make[2]: Nothing to be done for 'distprep'.
make[2]: Nothing to be done for 'generated-header-symlinks'.
make[2]: Leaving directory '/home/ddong/postgresql/bld/src/backend/utils'
make[1]: Leaving directory '/home/ddong/postgresql/bld/src/backend'
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -O2 -I/home/ddong/postgresql/bld/../src/interfaces/libpq -I../../../src/include -I/home/ddong/postgresql/bld/../src/include -D_GNU_SOURCE -c -o testlibpq.o /home/ddong/postgresql/bld/../src/test/examples/testlibpq.c
gcc -L../../../src/port -L../../../src/common -L../../../src/common -lpgcommon -L../../../src/port -lpgport -L../../../src/interfaces/libpq -lpq -Wl,--as-needed -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags testlibpq.o -o testlibpq
testlibpq.o: In function `exit_nicely':
testlibpq.c:(.text.unlikely+0x5): undefined reference to `PQfinish'
testlibpq.o: In function `main':
testlibpq.c:(.text.startup+0x22): undefined reference to `PQconnectdb'
testlibpq.c:(.text.startup+0x2d): undefined reference to `PQstatus'
testlibpq.c:(.text.startup+0x44): undefined reference to `PQexec'
…

I think the -lpq flag has no effect when it appears in the middle of the arguments. It works if I move the flag to the end:

gcc -L../../../src/port -L../../../src/common -L../../../src/common -lpgcommon -L../../../src/port -lpgport -L../../../src/interfaces/libpq -Wl,--as-needed -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags testlibpq.o -o testlibpq -lpq

So I added an explicit rule to rearrange the flags:

gcc testlibpq.o -o testlibpq -L../../../src/port -L../../../src/common -L../../../src/common -lpgcommon -L../../../src/port -lpgport -L../../../src/interfaces/libpq -lpq -Wl,--as-needed -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags

Then the make command works as expected. This is my first time writing a patch. Please let me know what you think!

Thank you, and happy new year!
Donald Dong
Re: Implicit make rules break test examples
On Mon, Dec 31, 2018 at 11:24 PM Donald Dong wrote: > > Hi, > > In src/test/example, the implicit make rules produce errors: > > make -C ../../../src/backend generated-headers > make[1]: Entering directory '/home/ddong/postgresql/bld/src/backend' > make -C catalog distprep generated-header-symlinks > make[2]: Entering directory '/home/ddong/postgresql/bld/src/backend/catalog' > make[2]: Nothing to be done for 'distprep'. > make[2]: Nothing to be done for 'generated-header-symlinks'. > make[2]: Leaving directory '/home/ddong/postgresql/bld/src/backend/catalog' > make -C utils distprep generated-header-symlinks > make[2]: Entering directory '/home/ddong/postgresql/bld/src/backend/utils' > make[2]: Nothing to be done for 'distprep'. > make[2]: Nothing to be done for 'generated-header-symlinks'. > make[2]: Leaving directory '/home/ddong/postgresql/bld/src/backend/utils' > make[1]: Leaving directory '/home/ddong/postgresql/bld/src/backend' > gcc -Wall -Wmissing-prototypes -Wpointer-arith > -Wdeclaration-after-statement -Werror=vla -Wendif-labels > -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing > -fwrapv -fexcess-precision=standard -Wno-format-truncation -O2 > -I/home/ddong/postgresql/bld/../src/interfaces/libpq > -I../../../src/include -I/home/ddong/postgresql/bld/../src/include > -D_GNU_SOURCE -c -o testlibpq.o > /home/ddong/postgresql/bld/../src/test/examples/testlibpq.c > gcc -L../../../src/port -L../../../src/common -L../../../src/common > -lpgcommon -L../../../src/port -lpgport > -L../../../src/interfaces/libpq -lpq -Wl,--as-needed > -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags testlibpq.o -o > testlibpq > testlibpq.o: In function `exit_nicely': > testlibpq.c:(.text.unlikely+0x5): undefined reference to `PQfinish' > testlibpq.o: In function `main': > testlibpq.c:(.text.startup+0x22): undefined reference to `PQconnectdb' > testlibpq.c:(.text.startup+0x2d): undefined reference to `PQstatus' > testlibpq.c:(.text.startup+0x44): undefined reference to `PQexec’ > … > > I think the -lpq flag does not have any effects in the middle of the > arguments. It works if move the flag to the end: > > gcc -L../../../src/port -L../../../src/common -L../../../src/common > -lpgcommon -L../../../src/port -lpgport > -L../../../src/interfaces/libpq -Wl,--as-needed > -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags testlibpq.o -o > testlibpq -lpq > > So I added an explicit rule to rearrange the flags: > > gcc testlibpq.o -o testlibpq -L../../../src/port -L../../../src/common > -L../../../src/common -lpgcommon -L../../../src/port -lpgport > -L../../../src/interfaces/libpq -lpq -Wl,--as-needed > -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags > > Then the make command works as expected. This is my first time writing > a patch. Please let me know what you think! > > Thank you, > Happy new year! > Donald Dong -- Donald Dong explicit_make_rule_test_example_v1.patch Description: Binary data
Re: Implicit make rules break test examples
Thank you for the explanation! That makes sense. It is strange that it does not work for me. > What platform are you on exactly, and what toolchain (gcc and ld > versions) are you using? I'm using Ubuntu 18.04.1 LTS. gcc version: gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0 ld version: GNU ld (GNU Binutils for Ubuntu) 2.30 Regards, Donald Dong On Tue, Jan 1, 2019 at 9:54 AM Tom Lane wrote: > Donald Dong writes: > > In src/test/example, the implicit make rules produce errors: > > Hm. "make" in src/test/examples works fine for me. > > The only way I can account for the results you're showing is if your > linker is preferring libpq.a to libpq.so, so that reading the library > before the *.o files causes none of it to get pulled in. But that > isn't the default behavior on any modern platform AFAIK, and certainly > isn't considered good practice these days. Moreover, if that's what's > happening, I don't see how you would have managed to build PG at all, > because there are a lot of other places where our Makefiles write > $(LDFLAGS) before the *.o files they're trying to link. Maybe we > shouldn't have done it like that, but it's been working for everybody > else. > > What platform are you on exactly, and what toolchain (gcc and ld > versions) are you using? > > regards, tom lane > -- Donald Dong
Re: Implicit make rules break test examples
> I observe that ecpg's Makefile.regress is already doing #3: > %: %.o > $(CC) $(CFLAGS) $< $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@ > so what we'd be talking about is moving that to some more global spot, > probably Makefile.global. (I wonder why the target is not specified > as $@$(X) here?) Thank you for pointing that out! I think #3 is a better choice since it's less invasive and would potentially avoid similar problems in the future. I think may worth the risks of breaking some extensions. I moved the rule to the Makefile.global and added $(X) in case it's set. I also updated the order in Makefile.linux in the same patch since they have the same cause. I'm not sure if changes are necessary for other platforms, and I am not able to test it, unfortunately. I've built it again on Ubuntu and tested src/test/examples and src/interfaces/libpq/test. There are no errors. Thank you again for the awesome explanation, Donald Dong On Tue, Jan 1, 2019 at 11:54 AM Tom Lane wrote: > > Donald Dong writes: > > Thank you for the explanation! That makes sense. It is strange that it does > > not work for me. > > Yeah, I still can't account for the difference in behavior between your > platform and mine (I tried several different ones here, and they all > manage to build src/test/examples). However, I'm now convinced that > we do have an issue, because I found another place that does fail on my > platforms: src/interfaces/libpq/test gives failures like > > uri-regress.o: In function `main': > uri-regress.c:58: undefined reference to `pg_printf' > > the reason being that the link command lists -lpgport before > uri-regress.o, and since we only make the .a flavor of libpgport, it's > definitely going to be order-sensitive. (This has probably been busted > since we changed over to using snprintf.c everywhere, but nobody noticed > because this test isn't normally run.) > > The reason we haven't noticed this already seems to be that in all the > places where it matters, we have explicit link rules that order things > like this: > > $(CC) $(CFLAGS) $(OBJS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X) > > However, the places that are having problems are trying to rely on > gmake's implicit rule, which according to their manual is > > Linking a single object file > `N' is made automatically from `N.o' by running the linker > (usually called `ld') via the C compiler. The precise command > used is `$(CC) $(LDFLAGS) N.o $(LOADLIBES) $(LDLIBS)'. > > So really the problem here is that we're doing the wrong thing by > injecting -l switches into LDFLAGS; it would be more standard to > put them into LDLIBS (or maybe LOADLIBES, though I think that's > not commonly used). > > I hesitate to try to change that though. The places that are messing with > that are injecting both -L and -l switches, and we want to keep putting > the -L switches into LDFLAGS because of the strong likelihood that the > initial (autoconf-derived) value of LDFLAGS contains -L switches; our > switches pointing at within-tree directories need to come first. > So the options seem to be: > > 1. Redefine our makefile conventions as being that internal -L switches > go into LDFLAGS_INTERNAL but internal -l switches go into LDLIBS_INTERNAL, > and we use the same recursive-expansion dance for LDLIBS[_INTERNAL] as for > LDFLAGS[_INTERNAL], and we have to start mentioning LDLIBS in our explicit > link rules. This would be a pretty invasive patch, I'm afraid, and would > almost certainly break some third-party extensions' Makefiles. 
It'd be > the cleanest solution in a green field, I think, but our Makefiles are > hardly a green field. > > 2. Be sure to override the gmake implicit link rule with an explicit > link rule everywhere --- basically like your patch, but touching more > places. This seems like the least risky alternative in the short > run, but we'd be highly likely to reintroduce such problems in future. > > 3. Replace the default implicit link rule with one of our own. > Conceivably this also breaks some extensions' Makefiles, though > I think that's less likely. > > I observe that ecpg's Makefile.regress is already doing #3: > > %: %.o > $(CC) $(CFLAGS) $< $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@ > > so what we'd be talking about is moving that to some more global spot, > probably Makefile.global. (I wonder why the target is not specified > as $@$(X) here?) > > I notice that the platform-specific makefiles in src/makefiles > are generally also getting it wrong in their implicit rules for > building shlibs, eg in Makefi
Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name
Hi Michael,

Thank you for the information!

> On Dec 11, 2018, at 9:55 PM, Michael Paquier wrote:
>
> Well, the conninfo and slot name accessible to the user are the values
> available only once the information of the WAL receiver in shared memory
> is ready to be displayed. Relying more on the GUC context has the
> advantage to never have in shared memory the original string and only
> store the clobbered one, which actually simplifies what's stored in
> shared memory because you can entirely remove ready_to_display (I forgot
> to remove that in the patch actually). If those parameters become
> reloadable, you actually rely only on what the param context has, and
> not on what the shared memory context may have or not.

I wonder why we need this addition?

src/backend/replication/walreceiver.c
+	memset(walrcv->slotname, 0, NAMEDATALEN);
+	if (PrimarySlotName)
+		strlcpy((char *) walrcv->slotname, PrimarySlotName, NAMEDATALEN);

src/backend/replication/walreceiver.c
288:	PrimaryConnInfo == NULL || PrimaryConnInfo[0] == '\0'
291:	errmsg("cannot connect to the primary server as \"primary_conninfo\" is not defined")));
392:	PrimarySlotName && PrimarySlotName[0] != '\0'

src/backend/access/transam/xlog.c
11824:	* If primary_conninfo is set, launch walreceiver to try
11833:	PrimaryConnInfo && strcmp(PrimaryConnInfo, "") != 0

I notice these patterns appear in different places. Maybe it would be better to have a common method such as pg_str_empty()?

Regards,
Donald Dong
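For illustration, such a helper could be as simple as the following (hypothetical; not in core):

#define pg_str_empty(str)  ((str) == NULL || (str)[0] == '\0')

/* e.g. the walreceiver check could then read */
if (pg_str_empty(PrimaryConnInfo))
    ereport(ERROR,
            (errmsg("cannot connect to the primary server as \"primary_conninfo\" is not defined")));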
Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name
On Jan 9, 2019, at 4:47 PM, Michael Paquier wrote:
> That's much cleaner to me to clean up the field for safety before
> starting the process. When requesting a WAL receiver to start,
> slotname and conninfo gets zeroed before doing anything, you are right
> that we could do without it actually.

Cool! I agree that cleaning up the field would make the logic cleaner.

> That's a pattern used quite a lot for many GUCs, so that's quite
> separate from this patch. Perhaps it would make sense to have a macro
> for that purpose for GUCs, I am not sure if that's worth it though.

Yeah. I think having such a macro would make the code a bit cleaner, though the duplication of logic is not too significant.
Re: Unified logging system for command-line programs
I think this patch is a nice improvement!

On Jan 3, 2019, at 2:08 PM, Andres Freund wrote:
> Or we could just add an ERROR variant that doesn't exit. Years back
> I'd proposed that we make the log level a bitmask, but it could also
> just be something like CALLSITE_ERROR or something roughly along those
> lines. There's a few cases in backend code where that'd be beneficial
> too.

I think the logging system can also be applied to pg_regress, and perhaps even to external frontend applications?

The patch cannot be applied directly on HEAD, so I patched it on top of 60d99797bf. When I call pg_log_error() in initdb, I see

Program received signal SIGSEGV, Segmentation fault.
__strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62
62      ../sysdeps/x86_64/multiarch/strlen-avx2.S: No such file or directory.
(gdb) bt
#0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62
#1  0x55568f96 in dopr.constprop ()
#2  0x55569ddb in pg_vsnprintf ()
#3  0x55564236 in pg_log_generic ()
#4  0xc240 in main ()

I'm not sure what is causing this behavior. I would appreciate references or docs for testing and debugging patches more efficiently; right now I'm having difficulties loading the symbols of initdb in gdb.

Thank you,
Donald Dong
How does the planner determine plan_rows ?
Hi,

I created some empty tables and ran `EXPLAIN ANALYZE` on `SELECT *`. I found the results have different row estimates, even though the tables are all empty.

=# CREATE TABLE t1(id INT, data INT);
=# EXPLAIN ANALYZE SELECT * FROM t1;
Seq Scan on t1  (cost=0.00..32.60 rows=2260 width=8) (actual time=0.003..0.003 rows=0 loops=1)

=# CREATE TABLE t2(data VARCHAR);
=# EXPLAIN ANALYZE SELECT * FROM t2;
Seq Scan on t2  (cost=0.00..23.60 rows=1360 width=32) (actual time=0.002..0.002 rows=0 loops=1)

=# CREATE TABLE t3(id INT, data VARCHAR);
=# EXPLAIN ANALYZE SELECT * FROM t3;
Seq Scan on t3  (cost=0.00..22.70 rows=1270 width=36) (actual time=0.001..0.001 rows=0 loops=1)

I found this behavior unexpected. I'm still trying to find out how/where the planner determines the plan_rows. Any help will be appreciated!

Thank you,
Donald Dong
Re: How does the planner determine plan_rows ?
Thank you for the great explanation! > On Jan 10, 2019, at 7:48 PM, Andrew Gierth > wrote: > >>>>>> "Donald" == Donald Dong writes: > > Donald> Hi, > Donald> I created some empty tables and run ` EXPLAIN ANALYZE` on > Donald> `SELECT * `. I found the results have different row numbers, > Donald> but the tables are all empty. > > Empty tables are something of a special case, because the planner > doesn't assume that they will _stay_ empty, and using an estimate of 0 > or 1 rows would tend to create a distorted plan that would likely blow > up in runtime as soon as you insert a second row. > > The place to look for info would be estimate_rel_size in > optimizer/util/plancat.c, from which you can see that empty tables get > a default size estimate of 10 pages. Thus: > > Donald> =# CREATE TABLE t1(id INT, data INT); > Donald> =# EXPLAIN ANALYZE SELECT * FROM t1; > Donald> Seq Scan on t1 (cost=0.00..32.60 rows=2260 width=8) (actual > Donald> time=0.003..0.003 rows=0 loops=1) > > An (int,int) tuple takes about 36 bytes, so you can get about 226 of > them on a page, so 10 pages is 2260 rows. > > Donald> =# CREATE TABLE t2(data VARCHAR); > Donald> =# EXPLAIN ANALYZE SELECT * FROM t2; > Donald> Seq Scan on t2 (cost=0.00..23.60 rows=1360 width=32) (actual > Donald> time=0.002..0.002 rows=0 loops=1) > > Size of a varchar with no specified length isn't known, so the planner > determines an average length of 32 by the time-honoured method of rectal > extraction (see get_typavgwidth in lsyscache.c), making 136 rows per > page. > > -- > Andrew (irc:RhodiumToad)
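The arithmetic above can be reproduced with a few lines of C, following the structure of estimate_rel_size() in optimizer/util/plancat.c (this assumes the default 8 kB block size, a 24-byte page header, and the per-tuple overheads mentioned above):

#include <stdio.h>

int
main(void)
{
    const int usable = 8192 - 24;     /* BLCKSZ - SizeOfPageHeaderData */
    const int overhead = 24 + 4;      /* MAXALIGN'd tuple header + line pointer */
    int widths[] = {8, 32, 36};       /* t1 (int,int), t2 (varchar), t3 (int,varchar) */

    for (int i = 0; i < 3; i++)
    {
        int per_page = usable / (widths[i] + overhead);

        /* prints 226/2260, 136/1360, 127/1270 -- matching the estimates */
        printf("width %2d: %3d rows/page, %4d rows for 10 pages\n",
               widths[i], per_page, per_page * 10);
    }
    return 0;
}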
Re: How does the planner determine plan_rows ?
> On Jan 10, 2019, at 8:01 PM, Tom Lane wrote:
>
> ... estimate_rel_size() in plancat.c is where to look to find out
> about the planner's default estimates when it's lacking hard data.

Thank you! Now I see how the planner uses the rows to estimate the cost and generate the best_plan. To me, tracing the function calls is not a simple task. I'm using cscope, and I use printf when I'm not entirely sure. I was considering using gdb, but I'm having issues referencing the source code in gdb. I'm very interested in learning how the professionals explore the codebase!
Re: Unified logging system for command-line programs
> On Jan 11, 2019, at 9:14 AM, Peter Eisentraut > wrote: > >> The patch cannot be applied directly on HEAD. So I patched it on top of >> 60d99797bf. > > Here is an updated patch with the merge conflicts of my own design > resolved. No functionality changes. > >> When I call pg_log_error() in initdb, I see >> >> Program received signal SIGSEGV, Segmentation fault. >> __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62 >> 62 ../sysdeps/x86_64/multiarch/strlen-avx2.S: No such file or directory. >> (gdb) bt >> #0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62 >> #1 0x55568f96 in dopr.constprop () >> #2 0x55569ddb in pg_vsnprintf () >> #3 0x55564236 in pg_log_generic () >> #4 0xc240 in main () > > What do you mean exactly by "I call pg_log_error()"? The existing calls > in initdb clearly work, at least some of them, that is covered by the > test suite. Are you adding new calls? Thank you. I did add a new call for my local testing. There are no more errors after re-applying the patch on master. >> I'm not sure what would be causing this behavior. I would appreciate >> references or docs for testing and debugging patches more efficiently. >> Now I'm having difficulties loading symbols of initdb in gdb. > > The above looks like you'd probably get a better insight by compiling > with -O0 or some other lower optimization setting. > > There is also this: > https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD Thank you for the reference. That's very helpful! I noticed in some places such as pg_log_error("no data directory specified"); fprintf(stderr, _("You must identify the directory where the data for this database system\n" ... and pg_log_warning("enabling \"trust\" authentication for local connections"); fprintf(stderr, _("You can change this by editing pg_hba.conf or using the option -A, or\n" "--auth-local and --auth-host, the next time you run initdb.\n")); , pg_log does not completely replace fprintf. Would it be better to use pg_log so the logging level can also filter these messages?
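Purely as an illustration of the question, the first example might then become something like the following (assuming an info-level variant such as pg_log_info() exists in the patched tree; the hint text is abbreviated here):

pg_log_error("no data directory specified");
pg_log_info("You must identify the directory where the data for this database system ...");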
Re: Ryu floating point output patch
> On Jan 11, 2019, at 10:40 AM, Andres Freund wrote:
>
> And of course it'd change the dump's text contents between ryu and
> non-ryu backends even with extra_float_digits = 3, but the resulting
> floats ought to be the same. It's just that ryu is better at figuring
> out what the minimal text representation is than the current code.

+1. I spent some time reading the Ryu paper, and I'm convinced. I think Ryu will be a good choice when we want as many digits as possible (and round-trip safety).

Since we have extra_float_digits and ndig, Ryu may give us more digits than we asked for. Maybe one way to respect ndig is to clear the last few digits of (a, b, c) in base 10 (this is the notation in the paper; I think the code uses different variable names), so that Ryu generates the desired string:

https://github.com/ulfjack/ryu/blob/b2ae7a712937711375a79f894106a0c6d92b035f/ryu/d2s.c#L264-L266

// Step 3: Convert to a decimal power base using 128-bit arithmetic.
uint64_t vr, vp, vm;
int32_t e10;

Also, I wonder why we need to make this change?

-int			extra_float_digits = 0;	/* Added to DBL_DIG or FLT_DIG */
+int			extra_float_digits = 1;	/* Added to DBL_DIG or FLT_DIG */
Re: Ryu floating point output patch
> On Jan 15, 2019, at 2:37 AM, Andrew Gierth wrote:
>
> Andres> strtod()'s job ought to computationally be significantly easier
> Andres> than the other way round, no? And if there's buggy strtod()
> Andres> implementations out there, why would they be guaranteed to do
> Andres> the correct thing with our current output, but not with ryu's?
>
> Funny thing: I've been devoting considerable effort to testing this, and
> the one failure mode I've found is very interesting; it's not a problem
> with strtod(), in fact it's a bug in our float4in caused by _misuse_ of
> strtod().

Hi,

I'm trying to reproduce the results by calling float4in in my test code, but I'm having some difficulties linking it:

testfloat4.c:(.text+0x34): undefined reference to `float4in'
testfloat4.c:(.text+0x3c): undefined reference to `DirectFunctionCall1Coll'

I tried offering float.o to the linker in addition to my test program, and I also tried linking all the objects (*.o), but the errors still exist. I attached my changes as a patch. I wonder whether creating separate test programs is a good way of testing the code, and I'm interested in learning what I missed.

Thank you.

test_float4.patch
Description: Binary data
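One possible alternative to a standalone binary (sketched below; an illustration, not the attached patch) is to exercise float4in from inside the backend via a small loadable module, paired with a corresponding CREATE FUNCTION ... LANGUAGE C declaration:

#include "postgres.h"
#include "fmgr.h"
#include "utils/builtins.h"
#include "utils/fmgrprotos.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(float4_roundtrip);

/* e.g. SELECT float4_roundtrip('1.123456789'); returns the parsed float4 */
Datum
float4_roundtrip(PG_FUNCTION_ARGS)
{
    char   *str = text_to_cstring(PG_GETARG_TEXT_PP(0));
    Datum   val = DirectFunctionCall1(float4in, CStringGetDatum(str));

    PG_RETURN_FLOAT4(DatumGetFloat4(val));
}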