Hello,

For those not at Guix Days:
We have split into groups discussing various topics. Each group is collecting notes on its discussion. I am starting this thread as a place for these notes, to be distributed as necessary.
To kick things off, I've attached my notes on the discussion of "distributed substitutes", which we clarified referred to participatory/peer-to-peer substitutes. I tried to group things conceptually based on where the conversation ended up, but "conclusions" per se are all under "Next steps" and "Open questions".
Thanks,
Juli
#+title: Participatory (p2p) Substitutes

* Angles
** Building
** Delivering

* Why
** substitute servers are slow
** resources
*** compute
**** speed
**** cost
*** storage
** resilience
These problems increase exponentially with users or packages or both.

* Problems
Source code is easier because we can have absolute knowledge of the hash of the source -- can cryptographically verify source. By contrast, crypto verification of a binary requires compilation. Need to trust the source of binary substitutes.
** Trust
Someone needs to supply the hash. Currently, this is the central Guix build farm.
** Content-addressed downloads
Need architecture for distributed (network topology) delivery. Can already content-address sources and binaries; just need a trusted hash. That is, same problem for source and substitutes.
** Nar files
Potentially inefficient?
** Obligations on users
Users may be expected to contribute back bandwidth, potentially build time, to the network.
** Privacy
What if we have private info in ~/gnu/store~, eg because of Guix Home managing dotfiles?
** Granularity
1. Different people have different security/privacy models.
2. People may want to use different transport mechanisms.

* Solutions
We seemed to quickly shift to envisioning an opt-in network of distributors, eg with a Guix system service. Above problems addressed below:
** Trust
1. a server/user you choose to trust gives you a hash; you can get this substitute from any server and hash it yourself.
   - need to trust central server
     + can talk to operator
2. apply ~guix challenge~ somehow
3. distribute trust over multiple nodes, eg strongly trust a few nodes, weakly trust more, test hashes against each other
   - could incorporate this into existing substitute certification infra
   - existing research in eg Tor exit node trust
4. zero-knowledge proof
   - expensive
   - more variables = more expensive
   - thus, likely not feasible
Conversation is tending towards consensus-based trust (trusting a hash if a plurality of trusted nodes agree on it) combined with "watchdog" application of ~guix challenge~.
** Content-addressed downloads
1. bittorrent
   - definitely tackles bandwidth usage
   - tends towards "supernodes" which advertise lots of smaller nodes
     + could run this on Guix infra
2. IPFS
3. (bespoke) OCapN/Spritely
   - could facilitate granular control of access
   - Spritely envisions distributed storage over ERIS, which is encrypted and complicates this space
4. ~guix publish~
** Nar files
** Obligations on users
1. have ~guix publish~ already
** Privacy
1. do not advertise hashes, only respond to requests for specific hashes
   - there is an attack on this (Tahoe-LAFS encountered this?)
2. only advertise specific substitutes, eg what's in the core Guix channel
   - could be used to triangulate what software someone uses by watching what they request
     + already the case if monitoring requests to central substitute server
     + could download and distribute software you don't use
3. may not solve all privacy issues, but must communicate privacy concerns to users (ie informed consent)
** Granularity
*** Privacy
1. opt-in to share specific nars or equiv (see above)
*** Delivery
1. provide abstract interface to a network

* Next steps
We already have content-addressed distribution.
1. more central substitute servers and mirrors around the world
2. abstract API for decentralized substitute delivery

* Open questions
1. trust mechanism
2. exact delivery mechanism
3. who does the work
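
To make the consensus-based trust idea from Solutions/Trust a bit more concrete, here is a rough Python sketch, not something the group wrote or agreed on. It assumes each trusted node reports a hash for a store item, that nodes carry a trust weight (strong vs weak trust), and that the client hashes whatever it downloaded from an arbitrary peer itself. The function names, node names, and weights are all hypothetical, and plain hex SHA-256 over raw bytes stands in for Guix's actual nar hashing and signature handling, which this does not reproduce.

```python
import hashlib
from collections import Counter


def plurality_hash(reports, weights):
    """Return the hash with the highest total trust weight, or None.

    reports: dict mapping node name -> hash string reported by that node.
    weights: dict mapping node name -> trust weight (hypothetical scale);
             nodes missing from this dict count for nothing.
    A tie for first place, or no trusted reports at all, yields None.
    """
    tally = Counter()
    for node, digest in reports.items():
        tally[digest] += weights.get(node, 0.0)

    if sum(tally.values()) <= 0:
        return None  # nobody we trust reported anything

    ranked = tally.most_common(2)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no plurality: the top two hashes are tied
    return ranked[0][0]


def accept_substitute(data, reports, weights):
    """Accept downloaded bytes only if their own hash matches the plurality hash.

    The bytes may come from any untrusted peer; trust rests on the hash.
    """
    expected = plurality_hash(reports, weights)
    if expected is None:
        return False
    return hashlib.sha256(data).hexdigest() == expected


if __name__ == "__main__":
    data = b"example nar bytes"
    good = hashlib.sha256(data).hexdigest()
    bad = hashlib.sha256(b"tampered").hexdigest()
    reports = {"farm-a": good, "farm-b": good, "peer-x": bad}
    weights = {"farm-a": 1.0, "farm-b": 1.0, "peer-x": 0.25}
    print(accept_substitute(data, reports, weights))
```

A disagreement like the one from ~peer-x~ above is the kind of case where the "watchdog" use of ~guix challenge~ could come in; how to react to it (rebuild locally, lower the node's weight, alert someone) is left open here, as it was in the discussion.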