[EMAIL PROTECTED] (Pranav K. Tiwari) writes:

> Steve Youngs <[EMAIL PROTECTED]> writes:
>
>> * Pranav K Tiwari <[EMAIL PROTECTED]> writes:
>>
>> > Steve Youngs <[EMAIL PROTECTED]> writes:
>> >> * Pranav K Tiwari <[EMAIL PROTECTED]> writes:
>> >>
>> >> > To allow desktop search programs go through nnml articles, I would
>> >> > like to give an extension like .xyz, and tell these programs to
>> >> > treat these files like email.
>> >>
>> >> I think this is the wrong approach. Instead of modifying the
>> >> filenames to suit the search program, find a way to make the search
>> >> program work properly.
>> >>
>> >> It's really not that difficult, see...
>> >>
>> >> $ find <nnmldir> -type f -regex '^.*[0-9]+$'
>> >>
>> >
>> > The question is not about 'finding' these files, but about
>> > associating a 'type' with the file.
>>
>> But if you can find them, there's really no point in associating a
>> "type" to them.
>>
>> $ find <nnmldir> -type f -regex '^.*[0-9]+$' | \
>>     xargs some_app_needing_mail_files_as_input
>>
>> > Most indexing programs (google/yahoo/microsoft desktop search
>> > engines, X1) rely on file extensions to determine the filetype,
>> > and then index the contens of the file accordingly. It'll be good
>> > if they could deal with files with no extensions, but they don't
>> > (afaik).
>>
>> Yes they do. For example:
>>
>> <http://homepage.mac.com/pauljlucas/software/swish/>
>>
>> > So - with that in mind, the easiest way would be to change the way gnus
>> > nnml stores files, or write another backend that allows changing
>> > filenames.
>>
>> Maybe you should say what it is exactly that you want to do with your
>> nnml files.
>>
>
> swish is fine - that's what I've used till now. I've been unable to use
> it to index all of my email periodically. I would like to say, here's
> the top directory under which all my nnml mail is, and this should be
> indexed periodically. But swish runs out of memory (even with -e option,
> on my 512Meg Win2k machine) in trying to index my mails (some, 35-40
> nnml folders, each with 2000-5000 emails). So, the way I use swish is to
> have one index file per nnml folder, and I have modified the swish
> search function to search a list of index files.
>
> It works, but as you can see, it's not optimal. Maybe, my usage of swish
> is not correct - and if so, I'll be glad to be corrected.
>
> desktop search programs that I mentioned, all support a 'crawl' type of
> indexing where they can keep track of what has changed, and update their
> indices appropriately. And I have never had any trouble with memory with
> them. That's why I'll like to use any of those to index my mail, instead
> of swish that I'm using at present.
>
> -p

I've had some success with this by modifying nnml.el to store articles
with an extension. So, instead of storing each article as group/N, I
store it as group/N.nnml, and then configure the search engine to treat
.nnml files as text files. It works well - much better than swish_e for
the 50k emails that I have.

Diffs attached, in case anyone else cares.

regards,
-p

---------------------------------------------------------------------------
Index: lisp/nnml.el
===================================================================
RCS file: /usr/local/cvsroot/gnus/lisp/nnml.el,v
retrieving revision 7.8
diff -r7.8 nnml.el
512a513,517
> (defvar pkt:nnml-txt-ext ".nnml"
>   "*extension for nnml files")
> (defvar pkt:nnml-use-txt-extension t
>   "should text extension be used?")
>
513a519,526
>   (let (file)
>     (setq file (nnml-article-to-file-original article))
>     (if (file-exists-p file)
>         file
>       (if pkt:nnml-use-txt-extension
>           (concat file pkt:nnml-txt-ext)))))
>
> (defun nnml-article-to-file-original (article)
621a635,637
>             (setq text-ext
>                   (if pkt:nnml-use-txt-extension
>                       pkt:nnml-txt-ext))
640a657
>             text-ext

_______________________________________________
info-gnus-english mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/info-gnus-english
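
For anyone who would rather leave nnml.el untouched, a similar effect might be had outside Gnus by mirroring the articles into a separate tree with the extension added, and pointing the search engine at the mirror. The following is only a sketch and is not part of the patch: the function name mirror_nnml and both directory arguments are made up for illustration, it assumes a POSIX shell is available (for example Cygwin on Windows), and it copies files rather than linking them, so it costs extra disk space.

```shell
#!/bin/sh
# Hypothetical helper (not part of Gnus or of the patch in this thread):
# mirror numeric-named nnml article files into a parallel tree in which
# every copy carries a ".nnml" extension.
# Usage: mirror_nnml SRC_DIR DEST_DIR
mirror_nnml() {
    src=${1%/}      # drop one trailing slash so prefix stripping works
    dest=${2%/}
    # nnml stores articles as group/N, so look for file names that
    # start with a digit, then filter out anything with non-digits.
    find "$src" -type f -name '[0-9]*' | while IFS= read -r path; do
        name=${path##*/}
        case $name in
            *[!0-9]*) continue ;;   # e.g. 12.bak or 7.nnml: not a plain article
        esac
        rel=${path#"$src"/}         # path relative to SRC_DIR
        target=$dest/$rel.nnml
        mkdir -p "${target%/*}"
        # Copy only articles that are not mirrored yet, so repeated
        # runs touch nothing the search engine has already indexed.
        if [ ! -e "$target" ]; then
            cp -p "$path" "$target"
        fi
    done
}
```

Run periodically with the top nnml directory as the first argument, for example `mirror_nnml "$HOME/Mail" "$HOME/Mail-index"` (both paths are placeholders). Articles that are later edited or deleted in Gnus are not updated or removed in the mirror, and file names containing newlines are not handled.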
