At Thu, 10 Sep 2020 21:55:27 +0300, Surafel Temesgen <surafel3...@gmail.com> wrote in
> On Thu, Sep 10, 2020 at 1:17 PM vignesh C <vignes...@gmail.com> wrote:
>
> > > We have a patch for column matching feature [1] that may need a header
> > > line to be further processed. Even without that I think it is preferable to
> > > process the header line for nothing than adding those checks to the loop,
> > > performance-wise.
> >
> > I had seen that patch, I feel that change to match the header if the
> > header is specified can be addressed in this patch if that patch gets
> > committed first or vice versa. We are doing a lot of processing for
> > the data which we need not do anything. Shouldn't this be skipped if
> > not required. Similar check is present in NextCopyFromRawFields also
> > to skip header.
>
> The existing check is unavoidable but we can live better without the checks
> added by the patch. For very large files the loop may iterate millions of
> times if it is not in billion and I am sure doing the check that many times
> will incur noticeable performance degradation than further processing a
> single line.
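
To make the trade-off in the quoted text concrete, here is a toy model in C. It is only a sketch and not the actual COPY code: Stats, count_fields, load_check_in_loop and load_header_processed are hypothetical names, and counting commas stands in for the real per-line work. It compares adding a header test to the per-line loop with simply processing the header line like any other line and discarding it afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counters for comparing the two loop shapes (hypothetical, illustration only). */
typedef struct
{
    size_t extra_checks;     /* header tests evaluated inside the loop */
    size_t lines_processed;  /* lines that went through count_fields() */
    size_t fields_parsed;    /* total fields seen by count_fields() */
    size_t rows_kept;        /* data rows remaining after the header is dropped */
} Stats;

/* Stand-in for per-line work: count comma-separated fields. */
size_t count_fields(const char *line)
{
    size_t fields = 1;

    for (const char *p = line; *p != '\0'; p++)
        if (*p == ',')
            fields++;
    return fields;
}

/* Shape A: test for the header inside the loop, once per line. */
Stats load_check_in_loop(const char **lines, size_t nlines, bool header)
{
    Stats s = {0, 0, 0, 0};

    assert(lines != NULL || nlines == 0);
    for (size_t i = 0; i < nlines; i++)
    {
        s.extra_checks++;            /* paid on every iteration */
        if (header && i == 0)
            continue;                /* skip the header without processing it */
        s.fields_parsed += count_fields(lines[i]);
        s.lines_processed++;
        s.rows_kept++;
    }
    return s;
}

/* Shape B: no test in the loop; the header is processed "for nothing". */
Stats load_header_processed(const char **lines, size_t nlines, bool header)
{
    Stats s = {0, 0, 0, 0};

    assert(lines != NULL || nlines == 0);
    for (size_t i = 0; i < nlines; i++)
    {
        s.fields_parsed += count_fields(lines[i]);
        s.lines_processed++;
    }
    /* Drop the header once, outside the loop. */
    s.rows_kept = s.lines_processed - ((header && nlines > 0) ? 1 : 0);
    return s;
}
```

In this model, shape A evaluates the test once per line, so the number of extra checks grows with the file, while shape B spends a fixed amount of wasted work on one line. The sketch only counts operations; whether the extra branch is measurable in practice would have to be benchmarked.
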

FWIW, I thought the same thing when I saw the additional if-conditions. They lose more than they gain.

As for the first part, the patch exposes COPY_NEW_FE to CopyGetData, which I don't think that function should need to know about. Given that it doesn't seem to offer a noticeable performance gain, I don't think we should do that. On the contrary, if incoming data were intermittently delayed for some reason (a heavily loaded client or the network in between), this patch would make things worse by waiting for the delayed bits before processing the bits already received.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center