On 24.01.22 09:12, Chris wrote:
On 2022-01-23 10:19, Patrick M. Hausen wrote:
Hi all,
I did not really have an opinion on this, since we never used FTP,
but I was a bit surprised by the suggestion to use SSH instead.
It never occurred to us that anything but HTTP(S) was possible.
We simply run Nginx in a jail serving the packages that Poudriere
produces for us. Setup time/effort: 5 minutes.
Now after this comment:
On 22.01.2022 at 09:35, Chris <portmas...@bsdforge.com> wrote:
I find it's less "housekeeping" to use ftp(1) set up through inetd(8) for pkg repos than via ssh.
I understand the appeal of FTP.
Maybe this discussion is focusing on the wrong topic. Perhaps we should consider including a lightweight way to serve HTTP(S) in base? Like Lighttpd, which as far as I know comes with a BSD 3-clause equivalent license.
But then the general tendency has been to remove network services from base rather than introduce them, e.g. BIND.
So I really have no idea what the general opinion is, just wanted to throw in that IMHO HTTPS is the best protocol for the task, and if some way to serve it could be included in base, I for one would appreciate that.
OTOH Chris, what's keeping you from installing a web server just
serving static files?
Different environments, different requirements. But habit as much as anything else.
FTP is trivial and has always been available, so I never even need to think about it.
I perform mass installs/upgrades in large networks. There is no overhead using ftp, whether started one-shot or through inetd(8); the clients are all started/used at will.
It seems to me that removing features also removes value. IMHO the removal of a transport as trivial as ftp(1) brings little to the table for all concerned. But that's just me. :-)
Have you ever looked into an FTP protocol parser and what's required to get the different FTP configurations through the NAT-infested networks of today? FTP is an ugly protocol from the beginning of time that should have been put down decades ago. Even without pipelining, HTTP saves several network round trips, and poudriere already generates HTML and JSON status updates during builds as a read-only web UI.
This thread has shown that users have deployed complex, fragile workarounds for the limited protocol selection offered by pkg. I recommend adding a clean, official extension interface that spawns fetch helper processes from a well-known location outside of $PATH, derived from the URI schema (e.g. ${PREFIX}/libexec/pkg/fetch-${SCHEMA}). To keep helpers simple and small, pkg would prepare their execution environment (working directory, environment variables, minimal set of inherited file descriptors) and pass the repository URI as the first (and only?) argument. Each helper would read from standard input a stream of pairs, one per line, of file name (e.g. the package hash stored in the repository) and relative path, and fetch them into the inherited working directory. This would allow users to add their own transport helpers, similar to git.
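A minimal helper could then look something like this untested Python sketch (the helper name fetch-https, the exact line format, and the lack of error handling are all just illustrative assumptions on my part):

    #!/usr/bin/env python3
    # Hypothetical ${PREFIX}/libexec/pkg/fetch-https helper sketch.
    # pkg is assumed to pass the repository URI as the only argument
    # and to have prepared a fresh working directory for us.
    import sys
    import urllib.request

    def main():
        repo_uri = sys.argv[1].rstrip("/")
        for line in sys.stdin:
            line = line.rstrip("\n")
            if not line:
                continue
            # One "<file name> <relative path>" pair per line (assumed format).
            name, relpath = line.split(" ", 1)
            with urllib.request.urlopen(f"{repo_uri}/{relpath}") as resp, \
                    open(name, "wb") as out:
                while chunk := resp.read(64 * 1024):
                    out.write(chunk)
        # Exiting with EX_OK implicitly confirms all transfers not reported
        # otherwise (see the status protocol below).
        sys.exit(0)

    if __name__ == "__main__":
        main()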
To support progress updates and to allow pkg to start installing fetched packages as soon as possible, helpers could periodically write lines of the form "${BYTES_FETCHED} ${BYTES_TOTAL} ${FILE}" to standard output. A (permanent) transfer failure could be encoded as a negative $BYTES_FETCHED and a successfully completed transfer as $BYTES_FETCHED == $BYTES_TOTAL. If the helper doesn't know the file size, it should be allowed to use negative $BYTES_TOTAL values in all but the last progress update (per fetched file). All transfers not reported as successfully completed or permanently failed are implicitly confirmed by exiting with EX_OK; other exit codes implicitly fail all unconfirmed transfers.
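From a helper's point of view the status lines could be produced like this (just a sketch of the conventions above; the file names are made up):

    import sys

    def report(fetched, total, file):
        # One "${BYTES_FETCHED} ${BYTES_TOTAL} ${FILE}" line per update.
        print(f"{fetched} {total} {file}", flush=True)

    report(4096, -1, "foo-1.0.pkg")        # in progress, total size unknown
    report(123456, 123456, "foo-1.0.pkg")  # done: BYTES_FETCHED == BYTES_TOTAL
    report(-1, 0, "bar-2.0.pkg")           # permanent failure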
Pkg should clean up the working directory after the helper has exited, deleting partially transferred files (and anything else the helper may have left behind, taking care not to follow symlinks). Pkg should apply resource limits and drop privileges (when running as root) before exec()ing into the helper. Well-written helpers can use capsicum to provide further defense in depth.
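I imagine roughly this order of operations on the pkg side (sketched in Python only for brevity; the real implementation would be C, and the stdin/stdout pipe plumbing is omitted):

    import os
    import resource

    def spawn_helper(helper_path, repo_uri, workdir, uid, gid):
        pid = os.fork()
        if pid == 0:
            # Child: constrain and de-privilege before exec()ing the helper.
            os.chdir(workdir)
            # Example limits only; the real values are a policy decision.
            resource.setrlimit(resource.RLIMIT_CPU, (60, 60))
            resource.setrlimit(resource.RLIMIT_NOFILE, (64, 64))
            if os.getuid() == 0:
                os.setgroups([])
                os.setgid(gid)
                os.setuid(uid)
            os.execv(helper_path, [helper_path, repo_uri])
        return pid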
The package repository already contains the expected package sizes. As an optimization for dealing with out-of-sync mirrors, the known file sizes can be matched against positive file sizes reported by helpers to fail quickly.
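The check itself would be a one-liner, e.g.:

    def plausible_size(reported_total, expected_size):
        # Negative totals mean "size not known yet"; only a positive
        # mismatch proves the mirror is serving a stale repository.
        return reported_total < 0 or reported_total == expected_size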
Refactoring all supported protocols to use this interface would reduce
the complexity of pkg itself.
This design can be further extended with more features (and potential for bugs) until we end up with something similar to the git-annex external special remote protocol (https://git-annex.branchable.com/design/external_special_remote_protocol/), if there are enough relevant use cases justifying the additional complexity in pkg and its file transfer helpers.