Hi!

Maxim Cournoyer <maxim.courno...@gmail.com> wrote:

> Ludovic Courtès <l...@gnu.org> writes:
>
>> Hi Guix!
>>
>> I was inspired by Michael Stapelberg’s talk recently shared on IRC¹
>> (well worth watching!).  One of the takeaways for me is that many
>> actions should be done lazily, in particular populating caches.
>>
>> ‘guix install’ & co. spend a significant time populating such caches, in
>> particular the XDG caches² and the manual page database (mandb).
>>
>> I’m thinking we could get rid of the mandb hook.  However, the
>> functionality matters IMO (we need good tools so users can browse local
>> documentation; mandb is not that good but better than no search
>> mechanism.)  Here are several options that come to mind:
>>
>>   1. Provide a ‘man’ wrapper or modify the ‘man-db’ package such that
>>      the database gets built on the first use of ‘man -k’, unless it’s
>>      already up-to-date.
>
> That would mean the database would live in some user-specific writable
> area of the file system (where?), right?  And could use the
> common 'update' mechanism of man-db to make it as fast as possible.
>
> This sounds good from a performance perspective, but could introduce
> cache issues every now and then (if man-db changes a lot).  I wouldn't
> expect much problem given how mature man-db is, but that's one thing to
> consider.

I looked a bit at man-db, thinking it must have that already done more
or less.  Indeed, one can run “mandb -uc” to create the database.  The
problem is that it insists on writing databases and ‘CACHEDIR.TAG’ files
in the same directory as man pages.  In our case, these are all
read-only, so it just prints a warning for each directory and keeps
going.

It looks like man-db is not written with a situation like ours in mind.

>>   2. Add a phase in gnu-build-system.scm that creates a per-package
>>      database.  Change the mandb profile hook such that all it needs to
>>      do is “concatenate” all these GDBM databases (which should be much
>>      faster than browsing all the man pages as it currently does).
>
> I like that idea better, but I don't know how feasible it would be.

Yeah, dunno.

> What is taking so much time anyway?  Why is generating this database so
> compute intensive?  I don't grok why it should be so inefficient to scan
> a union'd tree for expected prefixes and append a bunch of file names
> together.

‘mandb-entries’ in (guix man-db) needs to open all the man pages in the
profile, decompress them, and read their header.  When there are many
man pages, that’s a lot of I/O and CPU usage.

One option I contemplated at one point is to simply have fewer man pages
in the first place.  :-)  There were packages that install man pages
when they shouldn’t.  This led to commits like
305eefc0627eb1d047e6fc4320d7e56897719ab8 and
4b797193d7508ddc53bb1ff7a267a0d50c1fe298 (and parent commits).

But even with that, this mandb hook will always get in the way, even
though few people use ‘man -k’.  I think we need a better solution to
the whole “search for documentation” problem.

Thanks,
Ludo’.
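
To make the cost described above for ‘mandb-entries’ more concrete, here
is a rough Python sketch of that kind of scan.  It is not the actual
(guix man-db) code: the names ‘read_entry’ and ‘scan_man_pages’ are made
up for illustration, and the parsing of the NAME section is deliberately
simplistic.  The point is only that every page has to be opened and
(usually) decompressed before a single line of it can be read.

```python
import gzip
import os
import re

# Matches a typical "name \- description" line of a NAME section.
NAME_LINE = re.compile(r"^(?P<names>.+?)\s+\\?-\s+(?P<desc>.+)$")


def read_entry(path):
    """Return (names, description) from the NAME section of one page, or None."""
    # Hypothetical helper: compressed pages must be decompressed on the fly.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as port:
        in_name = False
        for line in port:
            line = line.strip()
            if line.upper().startswith(".SH"):
                if in_name:
                    return None  # NAME section ended without a usable line
                in_name = line[3:].strip().strip('"').upper() == "NAME"
            elif in_name and line and not line.startswith("."):
                match = NAME_LINE.match(line)
                if match:
                    return match.group("names"), match.group("desc")
    return None


def scan_man_pages(root):
    """Walk ROOT (e.g. a profile's share/man) and collect one entry per page."""
    entries = {}
    for directory, _, files in os.walk(root):
        for name in sorted(files):
            entry = read_entry(os.path.join(directory, name))
            if entry is not None:
                entries[name] = entry
    return entries
```

Even in this stripped-down form, the work is one open, one
decompression stream, and some parsing per man page, so it grows
linearly with the number of pages in the profile.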