Hi. I sent this mail to [EMAIL PROTECTED] to ask for information, and Nokubi replied to it.
at Date: Mon, 21 Feb 2000 18:08:06 +0900,
 on Subject: [debian-devel:11686] Re: new search engine for our web pages? [was: [EMAIL PROTECTED]: Re: ITP: namazu2],
  Taketoshi Sano <[EMAIL PROTECTED]> writes:

> Are there any idea to improve that point ?
>
> This issue is about to use namazu (or namazu2) for search of the whole page
> of www.debian.org. This mail is sent to [EMAIL PROTECTED], with
> Cc: to debian-www.debian.or.jp and debian-www.lists.debian.org.
>
> In article <[EMAIL PROTECTED]>,
>  at Sat, 19 Feb 2000 23:41:33 -0500,
>   "James A. Treacy" <[EMAIL PROTECTED]> writes:
>
> > On Sat, Feb 19, 2000 at 12:33:37PM +0100, Josip Rodin wrote:
> > > Hi everyone,
> > >
> > > Could we use this namazu program for searching the web pages?
> > >
> > Here is the part that stopped me cold:
> >   Indexing process will take fifty minutes to index 25 MByte files
> >   with Linux Box has Pentium 166 MHz + 64 MB.
> >
> > That's over 8 hours just to index the main part of the site (roughly
> > 200MB), which should be reindexed every day. For comparison, swish++
> > takes less than 10 minutes to index the Package section of the site
> > (roughly 97MB). Using this for the List Archives (around 2GB) isn't
> > even funny.
> >
> > --
> > James (Jay) Treacy
> > [EMAIL PROTECTED]
>
> The past (not to be updated) record of the List Archives can be
> indexed step by step, maybe. But everyday re-indexing may be too much.
>
> How do you think, Kitame, and Nokubi ? (Masayuki wrote you are
> the namazu "demigods", so you can answer to this issue, I hope.)
>
> FYI:
>
> The size of my local cvs copy for www.debian.or.jp:
>  $ du -s /Home/sano/work/Debian/Web/www.debian.or.jp/
>  5089    /Home/sano/work/Debian/Web/www.debian.or.jp
>
> The size of my local cvs copy for www.debian.org/english:
>  $ du -s /Home/sano/work/Debian/Web/webwml/english/
>  6809    /Home/sano/work/Debian/Web/webwml/english
>
> The size of my local cvs copy for www.debian.org/japanese:
>  $ du -s /Home/sano/work/Debian/Web/webwml/japanese/
>  1679    /Home/sano/work/Debian/Web/webwml/japanese
>
> # I don't get other language tree, but there may be many language trees
> # other than these two trees.
>
> The size of my local cvs copy for www.linux.or.jp/public:
>  $ du -s /Home/sano/work/JLUG/Web/main/public/
>  6220    /Home/sano/work/JLUG/Web/main/public
>
> --
> Taketoshi Sano: <[EMAIL PROTECTED]>,<[EMAIL PROTECTED]>,<[EMAIL PROTECTED]>

In <[EMAIL PROTECTED]>,
 at Date: Tue, 22 Feb 2000 09:11:18 +0900,
  on Subject: [debian-devel:11700] Re: new search engine for our web pages? [was:[EMAIL PROTECTED]: Re: ITP: namazu2],
   [EMAIL PROTECTED] (NOKUBI Takatsugu) writes:

knok> Excuse me, I'm not subscribe debian-www currently. I expect to work
knok> linux.debian.www newsgroup.
knok>
knok> In article <[EMAIL PROTECTED]>
knok>   [EMAIL PROTECTED] writes:
knok>
knok> >> The past (not to be updated) record of the List Archives can be
knok> >> indexed step by step, maybe. But everyday re-indexing may be too much.
knok> >>
knok> >> How do you think, Kitame, and Nokubi ? (Masayuki wrote you are
knok> >> the namazu "demigods", so you can answer to this issue, I hope.)
knok>
knok> At first, re-indexing is not heavier than first time indexing. It is
knok> "difference indexing". Untouched files are not targets of processing.
knok>
knok> Second, namazu/namazu2 can handle multiple index files. So some index
knok> processing can divide (and could be use some machines).
knok>
knok> BTW, I'm not "demigod". I'm just a member of Namazu Project :-)
knok> --
knok> NOKUBI Takatsugu
knok> E-mail: [EMAIL PROTECTED]
knok>   [EMAIL PROTECTED] (Debian-JP)

What do you think?

--
Taketoshi Sano: <[EMAIL PROTECTED]>,<[EMAIL PROTECTED]>,<[EMAIL PROTECTED]>
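
For readers unfamiliar with the two points Nokubi makes above ("difference indexing" and splitting the work across several index files), here is a rough illustration in Python. It is only a sketch of the general idea, not namazu's actual implementation: the function names, the use of modification times to detect changes, and the round-robin split are all assumptions made for this example.

```python
def changed_files(previous, current):
    """Return paths that are new or whose timestamp differs from the last run.

    `previous` and `current` map a file path to its modification time
    (hypothetical bookkeeping; a real indexer may track changes differently).
    """
    return sorted(
        path for path, mtime in current.items()
        if previous.get(path) != mtime
    )


def split_into_batches(paths, n):
    """Split paths round-robin into n batches, one per separate index."""
    return [paths[i::n] for i in range(n)]


# State recorded at the previous indexing run.
previous = {"index.html": 100, "News/a.html": 100, "lists/old.html": 50}
# State seen today: one page edited, one page added, the rest untouched.
current = {
    "index.html": 100,
    "News/a.html": 120,
    "lists/old.html": 50,
    "lists/new.html": 130,
}

todo = changed_files(previous, current)
print(todo)                         # ['News/a.html', 'lists/new.html']
print(split_into_batches(todo, 2))  # [['News/a.html'], ['lists/new.html']]
```

Only two of the four files need processing on the second run, and each batch could in principle be handed to a different machine and searched later as one of several indexes.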