On Tue, Apr 14, 2020 at 11:08 AM Stephen Frost <sfr...@snowman.net> wrote:
> Wouldn't it make sense to, given that we have some idea of what we want
> it to eventually look like, to make progress in that direction though?
Well, yes. :-)

> That is- I tend to agree with Andres that having this supported
> server-side eventually is what we should be thinking about as an
> end-goal (what is the point of pg_basebackup in all of this, after all,
> if the goal is to get a backup of PG from the PG server to s3..? why
> go through some other program or through the replication protocol?) and
> having the server exec'ing out to run shell script fragments to make
> that happen looks like it would be really awkward and full of potential
> risks and issues and agreement that it wouldn't be a good fit.

I'm fairly deeply uncomfortable with what Andres is proposing. I see that it's very powerful, and can do a lot of things, and that if you're building something that does sophisticated things with storage, you probably want an API like that. It does a great job making complicated things possible. However, I feel that it does a lousy job making simple things simple. Suppose you want to compress using your favorite compression program. Well, you can't. Your favorite compression program doesn't speak the bespoke PostgreSQL protocol required for backup plugins. Neither does your favorite encryption program. Either would be perfectly happy to accept a tarfile on stdin and dump out a compressed or encrypted version, as the case may be, on stdout, but sorry, no such luck. You need a special program that speaks the magic PostgreSQL protocol but otherwise does pretty much the exact same thing as the standard one.

It's possibly not the exact same thing. A special tool might, for example, use multiple threads for parallel compression rather than multiple processes, perhaps gaining a bit of efficiency. But it's doubtful whether all users care about such marginal improvements. All they're going to see is that they can use gzip and maybe lz4 because we provide the necessary special magic tools to integrate with those, but for some reason we don't have a special magic tool that they can use with their own favorite compressor, and so they can't use it. I think people are going to find that fairly unhelpful.

Now, it's a problem we can work around. We could have a "shell gateway" program which acts as a plugin, speaks the backup plugin protocol, and internally does fork-and-exec stuff to spin up copies of any binary you want to act as a filter. I don't see any real problem with that. I do think it's very significantly more complicated than just what Andres called an FFI. It's gonna be way easier to just write something that spawns shell processes directly than it is to write something that spawns a process and talks to it using this protocol and passes around file descriptors using the various different mechanisms that different platforms use for that, and then that process turns around and spawns some other processes and passes along the file descriptors to them. Now you've added a whole bunch of platform-specific code and a whole bunch of code to generate and parse protocol messages to achieve exactly the same thing that you could've done far more simply with a C API. Even accepting as a given the need to make the C API work separately on both the client and server side, you've probably at least doubled, and I suspect more like quadrupled, the amount of infrastructure that has to be built.
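To be concrete about "spawns shell processes directly", here's a rough, purely illustrative sketch, not code from any patch. The name spawn_filter and its signature are made up for the example: you hand it a command line like "gzip -c", it wires the child's stdin to a pipe and points its stdout at wherever the output should land, and the caller just writes the tar stream into the returned descriptor and eventually waits for the child.

#include <sys/types.h>
#include <unistd.h>

/*
 * Illustrative only: start "shell_cmd" (e.g. "gzip -c") with its stdin
 * attached to a new pipe and its stdout pointed at outfd.  Returns the
 * write end of the pipe, or -1 on failure.
 */
static int
spawn_filter(const char *shell_cmd, int outfd, pid_t *child)
{
    int     pipefd[2];
    pid_t   pid;

    if (pipe(pipefd) < 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    if (pid == 0)
    {
        /* child: tar stream arrives on stdin, filtered output goes to outfd */
        dup2(pipefd[0], STDIN_FILENO);
        dup2(outfd, STDOUT_FILENO);
        close(pipefd[0]);
        close(pipefd[1]);
        execl("/bin/sh", "sh", "-c", shell_cmd, (char *) NULL);
        _exit(1);               /* exec failed */
    }

    /* parent: keep only the write end of the pipe */
    close(pipefd[0]);
    *child = pid;
    return pipefd[1];
}

That's more or less the whole trick. Compare it with defining a new wire protocol, passing file descriptors across it using each platform's mechanism for that, and teaching a separate process to end up doing the same two dup2() calls.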
So...

> If, instead, we worked on a C-based interface which includes filters and
> storage drivers, and was implemented through libpgcommon, we could start
> with that being all done through pg_basebackup and work to hammer out
> the complications and issues that we run into there and, once it seems
> reasonably stable and works well, we could potentially pull that into
> the backend to be run directly without having to have pg_basebackup
> involved in the process.

...let's do this. Actually, I don't really mind if we target something that can work on both the client and server side initially, but based on C, not a new wire protocol with file descriptor passing. That new wire protocol, and the file descriptor passing infrastructure that goes with it, are things that I *really* think should be pushed off to version 2, because I think they're going to generate a lot of additional work and complexity, and I don't want to deal with all of it at once.

Also, I don't really see what's wrong with the server forking processes that exec("/usr/bin/lz4") or whatever. We do similar things in other places and, while it won't work for cases where you want to compress a shazillion files, that's not really a problem here anyway. At least at the moment, the server-side format is *always* tar, so the problem of needing a separate subprocess for every file in the data directory does not arise.

> There's been good progress in the direction of having more done by the
> backend already, and that's thanks to you and it's good work-
> specifically that the backend now has the ability to generate a
> manifest, with checksums included as the backup is being run, which is
> definitely an important piece.

Thanks. I'm actually pretty pleased about making some of that infrastructure available on the frontend side, and would like to go further in that direction over time. My only concern is that any given patch shouldn't be made to require too much collateral infrastructure work, and any infrastructure work that it will require should be agreed, so far as we can, early in the development process, so that there's time to do it at a suitably unhurried pace.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company