On Thu, Apr 9, 2020 at 3:55 AM Ants Aasma <a...@cybertec.at> wrote:
>
> On Wed, 8 Apr 2020 at 22:30, Robert Haas <robertmh...@gmail.com> wrote:
> > - The portion of the time that is used to split the lines is not
> > easily parallelizable. That seems to be a fairly small percentage for
> > a reasonably wide table, but it looks significant (13-18%) for a
> > narrow table. Such cases will gain less performance and be limited to
> > a smaller number of workers. I think we also need to be careful about
> > files whose lines are longer than the size of the buffer. If we're not
> > careful, we could get a significant performance drop-off in such
> > cases. We should make sure to pick an algorithm that seems like it
> > will handle such cases without serious regressions and check that a
> > file composed entirely of such long lines is handled reasonably
> > efficiently.
>
> I don't have a proof, but my gut feel tells me that it's fundamentally
> impossible to ingest csv without a serial line-ending/comment
> tokenization pass.
>
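That matches my intuition as well. Just to keep the discussion concrete,
below is a rough sketch (entirely untested, all names invented here, not
taken from any patch) of what I understand that serial pass to be: a
single quote-aware scan that records record boundaries while ignoring
newlines inside quoted fields, so that workers can then parse the ranges
between boundaries independently. Real COPY would of course also have to
deal with \r\n, escape characters, and encodings.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_BOUNDS 8192

/*
 * Serial, quote-aware boundary scan over a chunk of CSV input.
 * Records the offset just past each record-terminating newline;
 * a newline inside a quoted field is data, not a boundary.
 */
static int
find_record_bounds(const char *buf, size_t len, char quote,
                   size_t *bounds, int max_bounds)
{
    bool    in_quotes = false;
    int     nbounds = 0;

    for (size_t i = 0; i < len && nbounds < max_bounds; i++)
    {
        char    c = buf[i];

        if (c == quote)
        {
            /* a doubled quote inside a quoted field is an escape */
            if (in_quotes && i + 1 < len && buf[i + 1] == quote)
                i++;
            else
                in_quotes = !in_quotes;
        }
        else if (c == '\n' && !in_quotes)
            bounds[nbounds++] = i + 1;
    }
    return nbounds;
}

int
main(void)
{
    const char *csv = "a,\"x\ny\",c\n1,2,3\n";
    size_t      bounds[MAX_BOUNDS];
    int         n;

    n = find_record_bounds(csv, strlen(csv), '"', bounds, MAX_BOUNDS);

    /* expect two boundaries; the newline inside quotes is skipped */
    for (int i = 0; i < n; i++)
        printf("record ends at offset %zu\n", bounds[i]);
    return 0;
}

The point being that the in_quotes state at any byte depends on every
byte before it, which is why I don't see how the scan itself can be
split across workers.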
I think even if we try to do it via multiple workers it might not be
better. In such a scheme, every worker would need to publish the end
boundary of its chunk, and the next worker would have to keep checking
whether the previous one has updated that end pointer. I think this can
add significant synchronization overhead for cases where tuples are
around 100 bytes, which will be a common case.

> The current line splitting algorithm is terrible.
> I'm currently working with some scientific data where on ingestion
> CopyReadLineText() is about 25% on profiles. I prototyped a
> replacement that can do ~8GB/s on narrow rows, more on wider ones.
>

Good to hear. I think that will be a good project on its own, and it
might also give a boost to parallel copy, since it would further shrink
the non-parallelizable portion of the work.

> For rows that are consistently wider than the input buffer I think
> parallelism will still give a win - the serial phase is just memcpy
> through a ringbuffer, after which a worker goes away to perform the
> actual insert, letting the next worker read the data. The memcpy is
> already happening today, CopyReadLineText() copies the input buffer
> into a StringInfo, so the only extra work is synchronization between
> leader and worker.
>
> > - There could also be similar contention on the heap. Say the tuples
> > are narrow, and many backends are trying to insert tuples into the
> > same heap page at the same time. This would lead to many lock/unlock
> > cycles. This could be avoided if the backends avoid targeting the same
> > heap pages, but I'm not sure there's any reason to expect that they
> > would do so unless we make some special provision for it.
>
> I thought there already was a provision for that. Am I mis-remembering?
>

COPY uses heap_multi_insert() to insert a batch of tuples, and I think
each batch should ideally land on a different page, mostly a freshly
extended one. So I am not sure whether this will be a problem, or at
least a problem serious enough to require special handling. But if it
does turn out to be a problem, we definitely need a better way to deal
with it.

> > - What else? I bet the above list is not comprehensive.
>
> I think parallel copy patch needs to concentrate on splitting input
> data to workers. After that any performance issues would be basically
> the same as a normal parallel insert workload. There may well be
> bottlenecks there, but those could be tackled independently.
>

I agree.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com