Olly> Automatically uncompressing gzipped files for indexing isn't hard
Olly> to do, but what can you link to for them in the search results?
Olly> Of the four web browsers I just tried, only w3m showed the
Olly> contents of file:///usr/share/doc/coreutils/README.gz rather than
Olly> downloading it for me. Same for
Olly> http://localhost/doc/coreutils/README.gz it seems.

The dwww cgi uncompresses these files for you, so something like
http://localhost/cgi-bin/dwww/usr/share/doc/foo/README.gz works (in any
browser). Also, it may be possible to use mod_deflate in Apache to
transparently uncompress, though I have never tried that.

Olly> But as others have said, recoll is probably a better choice for a
Olly> Xapian-based solution for a desktop situation anyway.

Well, but I'm not running a typical desktop, at least not if by that is
meant a Gnome or KDE stacked system. The fact that recoll is bound to a
particular GUI is definitely a disadvantage. So no orphaning omega
please! :-)

What I'm ending up doing is a hack: I build a separate tree that is
mostly a symlink farm pointing to /usr/share/doc, except that gzipped
files are replaced by their uncompressed versions. Then I run omindex on
the new tree.

I've just tested this and it does the job. Indexing time is about 16
min, which is in between swish-e (9 min) and swish++ (27 min). Not
terrible, but maybe there is a way to speed it up by parallelization?
The omega docs seem to say nothing about concurrent access to the index.
Is it possible to run 2 indexer processes at once, each updating the
same index but with different files?

-- 
Ian Zimmerman <i...@buug.org>
gpg public key: 1024D/C6FF61AD
fingerprint: 66DC D68F 5C1B 4D71 2EE5 BD03 8A00 786C C6FF 61AD
Ham is for reading, not for eating.
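
The symlink-farm hack described above could be sketched roughly as
follows. This is a minimal Python illustration under stated assumptions,
not the script actually used for the timings: the function name
build_mirror is hypothetical, and the choices to skip name collisions
and unreadable .gz files are assumptions made to keep the sketch safe.

```python
import gzip
import os
import shutil


def build_mirror(src_root, dst_root):
    """Mirror src_root under dst_root (hypothetical helper, not from omega).

    Plain files become symlinks to the originals; *.gz files are written
    out uncompressed with the .gz suffix dropped.
    Returns a (linked, unpacked) pair of counts.
    """
    src_root = os.path.abspath(src_root)
    linked = unpacked = 0
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            is_gz = name.endswith(".gz")
            dst = os.path.join(target_dir, name[:-3] if is_gz else name)
            # Assumption: never overwrite an existing entry. This also keeps
            # us from writing through a symlink back into the source tree
            # when both README and README.gz exist.
            if os.path.lexists(dst):
                continue
            if is_gz:
                try:
                    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
                        shutil.copyfileobj(fin, fout)
                    unpacked += 1
                except (OSError, EOFError):
                    # Dangling symlink, unreadable or corrupt archive: skip it
                    # and remove any partial output.
                    if os.path.lexists(dst):
                        os.remove(dst)
            else:
                os.symlink(src, dst)
                linked += 1
    return linked, unpacked
```

Calling build_mirror("/usr/share/doc", DEST), where DEST is an empty
scratch directory of your choosing, and then running omindex on DEST
should give a setup along the lines of the one described.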