I also find it *very* useful to have doc-rfc and doc-iana. I'd keep fortune-data out of tradition. On the other hand, I'd ditch all of the linux magazines (lg, pluto, I think there are others?) without a second thought...
I'd argue that doc-rfc has sort of the same niche as doc-HOWTO. Not sure if I can define that niche, though, other than "developer-useful datasets"; the coastline data, like the tiger/line map data (5 cd's bzipped, mmmm) is domain-specific, but not {debian,linux}-developer-specific... and debian has always been de facto "of, by, and for" developers.

I like the idea of a packages-style installer for these things, though; if we go way back (pre-debian, pre-redhat) the BOGUS release (which evolved into RedHat, sort of) was a lot like the *BSD Ports system, in that you could have a URL and an md5sum in the config file... Ports has probably advanced, you'd really want:

  * set of mirrors (multiple or pattern URLs)
  * md5sum for exact match, but easy-to-upgrade option

The problem is that if a data set evolves independently of/faster than our release cycle, a package-by-reference in "stable" will eventually lose. Value-add point for cd distributors, I guess :-)
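
To make the "URL and an md5sum in the config file" idea concrete, here is a rough Python sketch of what such a package-by-reference fetcher might do: walk a list of mirrors, take the first copy whose md5 matches, and skip the check when no checksum is pinned (the "easy-to-upgrade" case). None of this is an existing Debian or Ports tool; the names fetch_by_reference, http_fetch and md5_hex are made up for illustration.

```python
import hashlib
import urllib.request
from typing import Callable, Iterable, Optional


def md5_hex(data: bytes) -> str:
    """Return the hex md5 digest of data (md5 only because the post names it)."""
    return hashlib.md5(data).hexdigest()


def http_fetch(url: str) -> bytes:
    """Default fetcher: download the whole body of url into memory."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def fetch_by_reference(
    mirrors: Iterable[str],
    expected_md5: Optional[str] = None,
    fetch: Callable[[str], bytes] = http_fetch,
) -> bytes:
    """Hypothetical helper: try each mirror in order, return the first good copy.

    If expected_md5 is given, a copy is accepted only on an exact match.
    If it is None, any copy that downloads is accepted, which trades
    integrity checking for the ability to follow upstream updates.
    """
    errors = []
    for url in mirrors:
        try:
            data = fetch(url)
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            errors.append(f"{url}: {exc}")
            continue
        if expected_md5 is not None and md5_hex(data) != expected_md5.lower():
            errors.append(f"{url}: md5 mismatch")
            continue
        return data
    raise RuntimeError("no mirror yielded a usable copy: " + "; ".join(errors))
```

The pinned-checksum mode is exactly where the release-cycle problem bites: once upstream publishes a new version of the data, every mirror fails the match and the reference in "stable" stops working until someone updates the checksum.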