On Saturday 20 June 2009 03:16:33 Goswin von Brederlow wrote:
> Joseph Rawson <umebos...@gmail.com> writes:
> > On Friday 19 June 2009 12:57:25 Goswin von Brederlow wrote:
> >> Or have a proxy that adds packages that are requested.
> >
> > When I woke up this morning, I was thinking that it might be
> > interesting to have an apt method that talks directly to reprepro.
> > It's just a vague idea now, but I'll give it some more thought later.
>
> Way too much latency to mirror a deb when requested and you need to
> run apt-get update for it to show up.
>
> The best you can do is add the package to the filter list and then
> fetch it directly. Then the next night the mirror will pick it up for
> future updates.

What I had in mind would eliminate a large part of the latency, and also keep from downloading the deb twice. Use a server application (I'll call it repserve for now) on the machine that hosts the reprepro repository.

apt-get update:

The apt method talks to repserve, repserve tells reprepro to run either update or checkupdate, and repserve then feeds the appropriate files from the reprepro lists/ directory (or directories) back to the apt-get process on the local machine. This would probably use a bit more bandwidth (at least for the first update), since apt-get would otherwise download .pdiff files, where reprepro just grabs the whole Packages.gz files.
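Concretely, the update handling inside repserve might look something like the sketch below. Everything named repserve here is an invention of mine, as is the directory layout; only the reprepro commands and the lists/ directory are real:

import subprocess
from pathlib import Path

# "repserve" does not exist yet; this only sketches the idea above.
REPO_BASE = Path("/srv/debrepos/debian")   # assumed reprepro basedir

def handle_update(codename):
    """Handle a client's apt-get update: refresh the mirror's indices,
    then return the list files to stream back to the client."""
    # update (or checkupdate for a dry run) are real reprepro commands;
    # which one to run could be a repserve configuration knob.
    subprocess.run(["reprepro", "-b", str(REPO_BASE), "update", codename],
                   check=True)
    # reprepro keeps the upstream index files it fetched under lists/;
    # repserve would stream these back to the waiting apt method.
    return sorted(p for p in (REPO_BASE / "lists").iterdir() if p.is_file())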
apt-get install, upgrade, build-dep:

The apt method determines which source in its apt lists the package should be retrieved from, then sends that information to repserve. Repserve looks in its repository (or repositories) to determine where those packages are (or whether they aren't mirrored yet), probably by scanning the filter lists. Repserve then tells reprepro to run update on the appropriate repositories (if necessary). Then repserve signals the local client (or the local client polls repserve), and the debs are transferred from the reprepro repos to the local client. After that, the repserve process could instruct reprepro to retrieve the sources, if it's configured to do that. It could also try to determine the build-deps for those packages and retrieve them, along with their sources, if it's configured to do that as well. With build-dep retrieval enabled, there might be a problem in having to explicitly list preferred alternatives, but this is mainly for packages that have drop-in replacements for libfoo-dev, like libgamin-dev providing libfam-dev. This is still just a rough idea.
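To make the install path concrete as well, here is a sketch under the same assumptions (repserve and the file locations are hypothetical; the FilterList format and the reprepro options are real):

import subprocess
from pathlib import Path

REPO_BASE = Path("/srv/debrepos/debian")          # assumed basedir
FILTER_LIST = REPO_BASE / "conf" / "filterlist"   # assumed FilterList file

def handle_install(codename, package):
    """Make sure `package` is mirrored, then return its path in the pool."""
    # reprepro FilterList files use the dpkg --get-selections format,
    # i.e. "package<tab>install" lines.
    wanted = f"{package}\tinstall\n"
    if wanted not in FILTER_LIST.read_text():
        with FILTER_LIST.open("a") as f:
            f.write(wanted)
        # --noskipold makes reprepro reprocess the (unchanged) upstream
        # lists so that the newly whitelisted package gets pulled in.
        subprocess.run(["reprepro", "-b", str(REPO_BASE),
                        "--noskipold", "update", codename], check=True)
    # Locate the .deb in the pool so repserve can signal the client.
    hits = sorted((REPO_BASE / "pool").rglob(f"{package}_*.deb"))
    return hits[-1] if hits else None

The same path could then kick off the source and build-dep retrieval described above.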
One of the interesting things about using an idea like this is that it can still allow reprepro to be used in the normal way, so you can have a couple of machines that instruct repserve to help maintain the repository, while other machines on the network just use reprepro directly through apache, ftp, etc. The "controlling" machines would have a sources.list like:

deb repserve://myhost/debrepos/debian lenny main contrib non-free

The repserve method on the client would send that line to the repserve server. The server would parse the line and match it to the appropriate repository from its configuration. The other hosts would just have this in sources.list:

deb http://myhost/debrepos/debian lenny main contrib non-free

The hosts using repserve could be the only ones with filter lists in reprepro, but it may be desirable to have filter lists for the other machines as well. This would help keep packages from disappearing from the pool while they are still needed. It may also be nice to use reprepro's snapshotting each time a repserve method updates a repository, although this may require using those snapshot URLs on the hosts that aren't using repserve.

> But now you made me think about this too. So here is what I think:
>
> - My bandwidth at home is fast enough to fetch packages directly. No
>   need to mirror at all.
>
> - I don't want to download a package multiple times (once per host) so
>   some shared proxy would be good.

My idea would keep that from happening, at the expense of latency. The latency would be minimal, though, as it would just depend on reprepro retrieving the package(s) and signalling the client that the package is ready. Using reprepro to add extra packages to the repository from upstream without doing a full update may not be possible, but if it were, the latency would certainly be minimal, and so would the bandwidth to the internet. I just looked at the manpage again, and this may be possible by using the --nolistsdownload option with the update/checkupdate commands.

> - Bootstrapping a chroot still benefits from local packages but a
>   shared proxy would do there too.
>
> - When I'm not at home I might not have network access or only a slow
>   one, so then I need a mirror. And my parents' computer has a Linux
>   that only I use and that needs a major update every time I visit.
>
> So the ideal setup would be an apt proxy that stores the packages in
> the normal pool structure and has a simple command to create
> Packages.gz, Sources.gz, Release and Release.gpg files so the cache
> directory can be copied onto a USB disk and used as a repository of
> its own.

Getting reprepro to do this would save a lot of the hassle, though getting reprepro to act as an apt proxy is also tricky. The current caching and proxying tools (the apt-proxy and apt-cacher packages) don't do as good a job of producing a proper repository as reprepro does. The Release file could be signed using an rsign method from the machine(s) that manage the repository, or it could be done locally on the server using gpg-agent or an unencrypted private key, depending on how the administrator prefers to manage it.

> Optionally, the apt proxy could prefetch package versions but for me
> that wouldn't be a high priority.
>
> Nice would be that it fetches sources along with binaries. When I find
> a bug in some software while traveling I would hate to not have the
> source available to fix it. But then it also needs to fetch
> Build-depends and their depends. So that would complicate matters a
> lot.

I mentioned that part above.

> MfG
>         Goswin

Overall, I think that reprepro does a good job of maintaining a local repository, and we shouldn't reimplement what it does. Reprepro also seems flexible enough to implement most of the backend with simple commands and options. I've never tried to implement a new apt method before, so that would take a bit more research on my part.
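For reference, here is a rough skeleton of what such a method might look like. The numbered messages (100/600/200/201) are APT's documented method interface; everything repserve-specific in the sketch is invented:

#!/usr/bin/python3
# Hypothetical /usr/lib/apt/methods/repserve.
import sys
import urllib.request

def send(header, **fields):
    print(header)
    for key, value in fields.items():
        print(f"{key.replace('_', '-')}: {value}")
    print(flush=True)

# A method announces its capabilities to apt first.
send("100 Capabilities", Version="1.0", Single_Instance="true")

def read_message():
    """Read one blank-line-terminated message from apt."""
    status, fields = None, {}
    for line in sys.stdin:
        line = line.rstrip("\n")
        if not line:
            if status:
                return status, fields
            continue
        if status is None:
            status = line                    # e.g. "600 URI Acquire"
        elif ": " in line:
            key, value = line.split(": ", 1)
            fields[key] = value
    return None, fields

while True:
    status, fields = read_message()
    if status is None:
        break
    if status.startswith("600"):             # 600 URI Acquire
        uri, dest = fields["URI"], fields["Filename"]
        send("200 URI Start", URI=uri)
        # A real method would first ask repserve to mirror the package;
        # here the actual transfer is hand-waved as plain HTTP.
        url = uri.replace("repserve://", "http://", 1)
        data = urllib.request.urlopen(url).read()
        with open(dest, "wb") as f:
            f.write(data)
        send("201 URI Done", URI=uri, Filename=dest, Size=len(data))

Dropped into /usr/lib/apt/methods/ under the name repserve, this is what the repserve:// lines in sources.list above would invoke.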
My uses:

- I have an automated installer that I test and improve frequently. Using a local mirror is a requirement for this. A partial mirror would help keep me from using as much space, and keep me from downloading packages I'll never use.

- I've been using full mirrors, but I need a partial mirror that I can carry with me, so I can use the installer on location instead of having to bring a machine back with me.

- I have a mirror of lenny-backports (source only). When I need to backport a package, I install a builder machine (using the automated installer) with virtualbox, send a .dsc from that mirror to the builder machine using cowpoke, then send the package to the local repository (in this case separate from the source mirror, with the packages set for auto-install, so I don't have to use the -t option in apt). It's also separate because there are a few packages from sid in there as well that aren't at backports.org.

--
Thanks: Joseph Rawson