On Thu, Jun 11, 2020 at 1:41 PM Hamid Akhtar <hamid.akh...@gmail.com> wrote: > As far I understand, parallel backup is not a mandatory performance feature, > rather, one at user's discretion. This IMHO indicates that it will benefit > some users and it may not others. > > IMHO, parallel backup has obvious performance benefits, but we need to ensure > that users understand that there is potential for slower backup if there is > no competition for resources.
I am sure that nobody is arguing that the patch has to be beneficial in all cases in order to justify applying it. However, there are several good arguments against proceding with this patch: * Every version of the patch that has been reviewed by anybody has been riddled with errors. Over and over again, testers have found serious bugs, and code reviewers have noticed lots of problems, too. * This approach requires rewriting a lot of current functionality, either by moving it to the client side or by restructuring it to work with parallelism. That's a lot of work, and it seems likely to generate more work in the future as people continue to add features. It's one thing to add a feature that doesn't benefit everybody; it's another thing to add a feature that doesn't benefit everybody and also hinders future development. See http://postgr.es/m/ca+tgmozublxyr+pd_gi3mvgyv5hqdlm-gbrvxkun-lewaw1...@mail.gmail.com for more discussion of these issues. * The scenarios in which the patch delivers a performance benefit are narrow and somewhat contrived. In remote backup scenarios, AIUI, the patch hasn't been shown to help. In local backups, it does, but how likely is it that you are going to do your local backups over the wire protocol instead of by direct file copy, which is probably much faster? I agree that if your server is overloaded, having multiple processes competing for the server resources will allow backup to get a larger slice relative to other things, but that seems like a pretty hackish and inefficient solution to that problem. You could also argue that we could provide a feature to prioritize some queries over other queries by running them with tons of parallel workers just to convince the OS to give them more resources, and I guess that would work, but it would also waste tons of resources and possibly cripple or even crash your system if you used it enough. The same argument applies here. * Even when the patch does provide a benefit, it seems to max out at about 2.5X. Clearly it's nice to have something go 2.5X faster, but the point is that it doesn't scale beyond that no matter how many workers you add. That doesn't automatically mean that something is a bad idea, but it is a concern. At the very least, we should be able to say why it doesn't scale any better than that. * Actually, we have some hints about that. Over at http://postgr.es/m/20200503174922.mfzzdafa5g4rl...@alap3.anarazel.de Andres has shown that too much concurrency when copying files results in a dramatic performance reduction, and that a lot of the reason why concurrency helps in the first place has to do with the fact that pg_basebackup does not have any cache control (no fallocate, sync_file_range(WRITE), posix_fadvise(DONTNEED)). When those things are added the performance gets better and the benefits of concurrency are reduced. I suspect that would also be true for this patch. It would be unreasonable to commit a large patch, especially one that would hinder future development, if we could get the same benefits from a small patch that would not do so. I am not in a position to tell you how to spend your time, so you can certainly pursue this patch if you wish. However, I think it's probably not the best use of time. Even if you fixed all the bugs and reimplemented all of the functionality that needs reimplementing in order to make this approach work, it still doesn't make sense to commit the patch if either (a) we can obtain the same benefit, or most of it, from a much simpler patch or (b) the patch is going to make it significantly harder to develop other features that we want to have, especially if those features seem likely to be more beneficial than what this patch offers. I think both of those are likely true here. For an example of (b), consider compression of tar files on the server side before transmission to the client. If you take the approach this patch does and move tarfile construction to the client, that is impossible. Now you can argue (and perhaps you will) that this would just mean someone has to choose between using this feature and using that feature, and why should users not have such a choice? That is a fair argument, but my counter-argument is that users shouldn't be forced into making that choice. If the parallel feature is beneficial enough to justify having it, then it ought to be designed in such a way that it works with the other features we also want to have rather than forcing users to choose between them. Since I have already proposed (on the other thread linked above) a design that would make that possible, and this design does not, I have a hard time understanding why we would pick this one, especially given all of the other disadvantages which it seems to have. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company