On Wed, Apr 15, 2020 at 10:45 PM Andres Freund <and...@anarazel.de> wrote:
>
> Hi,
>
> On 2020-04-15 20:36:39 +0530, Kuntal Ghosh wrote:
> > I was thinking from this point of view - the sooner we introduce
> > parallelism in the process, the greater the benefits.
>
> I don't really agree. Sure, that's true from a theoretical perspective,
> but the incremental gains may be very small, and the cost in complexity
> very high. If we can get single threaded splitting of rows to be >4GB/s,
> which should very well be attainable, the rest of the COPY work is going
> to dominate the time. We shouldn't add complexity to parallelize more
> of the line splitting, caring too much about scalable datastructures,
> etc when the bottleneck after some straightforward optimization is
> usually still in the parallelized part.
>
> I'd expect that for now we'd likely hit scalability issues in other
> parts of the system first (e.g. extension locks, buffer mapping).
>
Got your point. In this particular case, a single producer is fast
enough (or probably we can make it fast enough) to generate enough
chunks for multiple consumers so that they don't stay idle and wait for
work.
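
To make the single-producer / multiple-consumer shape concrete, here is
a minimal sketch. It is not the COPY code or any proposed patch: the
names producer, consumer and parallel_copy, the chunk size, and the
worker count are all hypothetical, and Python threads stand in for the
worker processes purely to show the structure (they give no real CPU
parallelism here).

```python
import io
import queue
import threading

CHUNK_LINES = 4     # hypothetical: number of lines handed out per chunk
N_WORKERS = 3       # hypothetical: number of consumers
_DONE = object()    # sentinel telling a consumer to stop


def producer(stream, work_q, n_workers):
    """Single-threaded line splitting: group raw lines into chunks."""
    chunk = []
    for line in stream:
        chunk.append(line)
        if len(chunk) == CHUNK_LINES:
            work_q.put(chunk)   # blocks if the consumers fall behind
            chunk = []
    if chunk:
        work_q.put(chunk)       # last, possibly short, chunk
    for _ in range(n_workers):
        work_q.put(_DONE)       # one stop marker per consumer


def consumer(work_q, out, lock):
    """Stand-in for the expensive per-row work done after splitting."""
    while True:
        chunk = work_q.get()
        if chunk is _DONE:
            break
        rows = [line.rstrip("\n").split(",") for line in chunk]
        with lock:
            out.extend(rows)


def parallel_copy(text, n_workers=N_WORKERS):
    # Bounded queue: the producer cannot run arbitrarily far ahead.
    work_q = queue.Queue(maxsize=2 * n_workers)
    out, lock = [], threading.Lock()
    workers = [
        threading.Thread(target=consumer, args=(work_q, out, lock))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    producer(io.StringIO(text), work_q, n_workers)
    for w in workers:
        w.join()
    return out  # row order is not preserved across consumers
```

As long as the producer fills the queue faster than the consumers drain
it, no consumer sits idle, which is the condition described above.
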
--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com