Re: Load TIME fields - proposed performance improvement

2020-09-28 Thread Peter Smith
On Tue, Sep 29, 2020 at 2:10 AM Tom Lane wrote: > I think so too, so I pushed it. Thanks! Kind Regards, Peter Smith Fujitsu Australia

Re: Load TIME fields - proposed performance improvement

2020-09-28 Thread Tom Lane
Peter Smith writes: > I only have a couple of questions, more for curiosity than anything else. > 1. Why is there sometimes an extra *tm = &tt; variable introduced? > (e.g. GetSQLCurrentTime, GetSQLLocalTime). Why not just declare struct > pg_tm tm; and pass the &tm same as what GetSQLCurrentDate

Re: Load TIME fields - proposed performance improvement

2020-09-27 Thread Peter Smith
On Sat, Sep 26, 2020 at 2:17 AM Tom Lane wrote: > That led me to refactor the patch as attached. (I'd first thought ... Thanks for refactoring the patch. Everything looks good to me. As expected, observations for the v02 patch gave pretty much the same results as v01 (previous mail). v02 perf

Re: Load TIME fields - proposed performance improvement

2020-09-25 Thread Tom Lane
Peter Smith writes: > The patch has been re-implemented based on previous advice. > Please see attached. Hm, so: 1. The caching behavior really needs to account for the possibility of the timezone setting being changed intra-transaction. That's not very likely, perhaps, but we can't deliver a w

Re: Load TIME fields - proposed performance improvement

2020-09-24 Thread Peter Smith
The patch has been re-implemented based on previous advice. Please see attached. ~ Test: A test table was created and 20 million rows inserted as follows: test=# create table t1 (id int, a timestamp, b time without time zone default '01:02:03', c date default CURRENT_DATE, d time with time zon

Re: Load TIME fields - proposed performance improvement

2020-09-21 Thread Peter Smith
Hi Tom. Thanks for your feedback. On Tue, Sep 22, 2020 at 12:44 PM Tom Lane wrote: > Still, for the size of the patch I'm envisioning, it'd be well > worth the trouble. The OP patch I gave was just a POC to test the effect and to see if the idea was judged as worthwhile... I will rewrite/fix

Re: Load TIME fields - proposed performance improvement

2020-09-21 Thread Tom Lane
I wrote: > Interesting idea, but this implementation is leaving a *lot* > on the table. If we want to cache the result of > timestamp2tm applied to GetCurrentTransactionStartTimestamp(), > there are half a dozen different call sites that could make > use of such a cache, eg, GetSQLCurrentDate and

Re: Load TIME fields - proposed performance improvement

2020-09-21 Thread Tom Lane
Peter Smith writes: > IIUC the code is calling GetCurrentDateTime only to acquire the > current TX timestamp as a struct pg_tm in order to derive some > timezone information. > ... > I have attached a patch which caches this struct, so now those 225 > million calls are reduced to just 1 call. Int

Load TIME fields - proposed performance improvement

2020-09-21 Thread Peter Smith
Hi Hackers. I have a test table with multiple (10) columns defined as TIME WITHOUT TIME ZONE. When loading this table with a lot of data (e.g. "COPY tbl FROM /my/path/2GB.csv WITH (FORMAT CSV)") I observed it was spending an excessive amount of time within the function GetCurrentDateTime. IIUC t