# Supporting HTTP remotes in "git archive"

We would like to allow remote archiving from HTTP servers. There are a few possible implementations to be discussed:

## Shallow clone to temporary repo

This approach builds on existing endpoints. Clients will connect to the remote server's git-upload-pack service and fetch a shallow clone of the requested commit into a temporary local repo. The write_archive() function is then called on the local clone to write out the requested archive.

### Benefits

* This can be implemented entirely in builtin/archive.c. No new service endpoints or server code are required.
* The archive is generated and compressed on the client side. This reduces CPU load on the server (for compressed archives), which would otherwise be a potential DoS vector.
* This provides a git-native way to archive from any HTTP server that supports the git-upload-pack service; some providers (including GitHub) do not currently allow the git-upload-archive service.

### Drawbacks

* Archives generated remotely may not be bit-for-bit identical to those generated locally if the versions of git used on the client and on the server differ.
* This requires more bandwidth than transferring a compressed archive generated on the server.

## Use git-upload-archive

This approach requires adding support for the git-upload-archive endpoint to the HTTP backend. Clients will connect to the remote server's git-upload-archive service, and the server will generate the archive and deliver it to the client.

### Benefits

* Matches existing "git archive" behavior for other remotes.
* Requires less bandwidth to send a compressed archive than a shallow clone.
* Resulting archive does not depend in any way on the client implementation.

### Drawbacks

* Implementation is more complicated; it will require changes to (at least) builtin/archive.c, http-backend.c, and builtin/upload-archive.c.
* Generates more CPU load on the server when compressing archives. This is potentially a DoS vector.
* Does not allow archiving from servers that don't support the git-upload-archive service.
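
To make the git-upload-archive request concrete, here is a minimal sketch of how the client-side arguments could be framed. It assumes an HTTP variant would reuse the framing that, as I understand it, the existing transports use: one pkt-line of the form `argument <arg>` per command-line argument, followed by a flush packet. The function names `pkt_line` and `archive_request` are hypothetical and do not come from git's source; how the body would be wrapped in an HTTP request is deliberately left out, since that is exactly what would need to be designed.

```python
# Sketch only: frames "git archive" arguments the way the existing
# git-upload-archive transports are understood to expect them.
# pkt_line() and archive_request() are illustrative names, not git APIs.

FLUSH_PKT = b"0000"          # special pkt-line marking the end of the request
MAX_PKT_LEN = 65520          # maximum total pkt-line length, header included


def pkt_line(payload: bytes) -> bytes:
    # A pkt-line is a 4-digit hex length (counting the 4 header bytes
    # themselves) followed by the payload.
    length = len(payload) + 4
    if length > MAX_PKT_LEN:
        raise ValueError("payload too large for a single pkt-line")
    return b"%04x" % length + payload


def archive_request(args):
    # One "argument <arg>\n" pkt-line per argument, then a flush packet.
    lines = [pkt_line(b"argument " + arg.encode() + b"\n") for arg in args]
    return b"".join(lines) + FLUSH_PKT


if __name__ == "__main__":
    print(archive_request(["--format=zip", "HEAD"]))
```

Whatever transport carries it, the request body stays this small; the cost of this approach sits in the server-side archive generation and in the new HTTP plumbing, not in the client framing.
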
## Add a new protocol v2 "archive" command

I am still a bit hazy on the exact details of this approach; please forgive any inaccuracies (I'm a new contributor and haven't examined custom v2 commands in much detail yet).

This approach builds on the existing v2 upload-pack endpoint. The client will issue an archive command (with options to select particular paths or a tree-ish). The server will generate the archive and deliver it to the client.

### Benefits

* Requires less bandwidth to send a compressed archive than a shallow clone.
* Resulting archive does not depend in any way on the client implementation.

### Drawbacks

* Generates more CPU load on the server when compressing archives. This is potentially a DoS vector.
* Servers must support the v2 protocol (although the client could potentially fall back to some other supported remote archive functionality).

### Unknowns

* I am not clear on the relative complexity of this approach compared to the others, and would appreciate any guidance offered.

## Summary

Personally, I lean towards the first approach. It could give us an opportunity to remove server-side complexity: there is no reason that the shallow-clone approach must be restricted to the HTTP transport, and we could re-implement other transports using this method. Additionally, it would allow clients to pull archives from remotes that would not otherwise support it.

That said, I am happy to work on whichever approach the community deems most worthwhile.
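
To illustrate the first approach, the shallow-clone flow can be approximated today with existing git commands. The sketch below is not the proposed implementation (which would live in builtin/archive.c and call write_archive() directly); it only shows the sequence of steps. The helper names `shallow_archive_commands` and `shallow_archive` are hypothetical, and the sketch assumes the requested ref is a branch or tag name that the server is willing to serve.

```python
import os
import subprocess
import tempfile


def shallow_archive_commands(repo_dir, url, ref, output, fmt="tar.gz"):
    # The three steps of the shallow-clone approach, as plain git invocations:
    # create a temporary bare repo, fetch only the requested ref at depth 1,
    # then archive the fetched commit locally.
    return [
        ["git", "init", "--quiet", "--bare", repo_dir],
        ["git", "-C", repo_dir, "fetch", "--quiet", "--depth=1", url, ref],
        ["git", "-C", repo_dir, "archive", f"--format={fmt}",
         f"--output={output}", "FETCH_HEAD"],
    ]


def shallow_archive(url, ref, output, fmt="tar.gz"):
    # "git -C" changes directory before running, so resolve the output
    # path up front to keep it relative to the caller's directory.
    output = os.path.abspath(output)
    with tempfile.TemporaryDirectory() as repo_dir:
        for cmd in shallow_archive_commands(repo_dir, url, ref, output, fmt):
            subprocess.run(cmd, check=True)
```

For example, `shallow_archive("https://example.com/repo.git", "main", "repo.tar.gz")` would fetch the tip of `main` from a placeholder URL and write the archive locally; the temporary repo is removed afterwards.
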