That presumably isn't healthy.
-----Original Message-----
From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
Sent: 26 May 2006 21:27
To: java-user@lucene.apache.org
Subject: Re: Seeing what's occupying all the space in the index
It kind of sounds like those files are corrupted, but I can&
eckon I can merge the .fdt, .prx and .frq into a
compound index?
-----Original Message-----
From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
Sent: 26 May 2006 18:38
To: java-user@lucene.apache.org
Subject: Re: Seeing what's occupying all the space in the index
Can you try a smaller s
Lucene documents
indexed in it now, do you reckon I can merge the .fdt, .prx and .frq into a
compound index?
-----Original Message-----
From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
Sent: 26 May 2006 18:38
To: java-user@lucene.apache.org
Subject: Re: Seeing what's occupying all the space i
> Note that IndexReader has a main() that will list the contents of compound
index files.
It looks like some of my index is compound and some isn't. My not very well
informed guess is that an optimize() got interrupted somewhere along the
line.
If I try to optimize the index now, it throws except
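One way to check the "some compound, some not" guess is to group the files in the index directory by segment name and flag any segment that has a .cfs file alongside loose files. This is only a rough sketch: it assumes segment files are named `_<segment>.<extension>` (as in `_2lhqi.fnm`), and the function name `segments_with_mixed_files` is made up for illustration. Some per-segment files may legitimately sit next to a .cfs, so treat the output as a prompt for a closer look rather than proof of corruption.

```python
import os
from collections import defaultdict


def segments_with_mixed_files(index_dir):
    """Return {segment: sorted extensions} for segments that have a .cfs
    file plus at least one other file with the same segment prefix.

    Assumption: segment files are named "_<segment>.<ext>"; files such as
    "segments" or "deletable" (no leading underscore) are ignored.
    """
    by_segment = defaultdict(set)
    for name in os.listdir(index_dir):
        stem, ext = os.path.splitext(name)
        if stem.startswith("_") and ext:
            by_segment[stem].add(ext)
    return {
        seg: sorted(exts)
        for seg, exts in by_segment.items()
        if ".cfs" in exts and len(exts) > 1
    }
```

Calling `segments_with_mixed_files("/mnt/sdb1/lucene-index/index-1")` would return an empty dict if every segment is either fully compound or fully loose.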
I just tried to optimise my index, using the lucli command line client, and
got:
8<
lucli> optimize
Starting to optimize index.
java.io.IOException: Cannot overwrite:
/mnt/sdb1/lucene-index/index-1/_2lhqi.fnm
at
org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.j
Rob Staveley (Tom) wrote:
Is there a tool I can use to see how much of the index is occupied by the
different fields I am indexing?
Note that IndexReader has a main() that will list the contents of
compound index files.
Doug
--
TED]
Sent: 26 May 2006 18:38
To: java-user@lucene.apache.org
Subject: Re: Seeing what's occupying all the space in the index
It seems odd to me that if you are using the CFS format, why you would have
the .fdt, .frq and .prx files in addition to the .cfs files. My
understanding is all files (e
It seems odd to me that if you are using the CFS format, why you would
have the .fdt, .frq and .prx files in addition to the .cfs files. My
understanding is all files (except deletable and segment) get put inside
of the CFS file. Looking at my indices, I only have the CFS file. Are
you optim
ect: RE: Seeing what's occupying all the space in the index
are you by any chance using different field names for each document -- or do
you have a wide range of field names that aren't the same for each document?
... you mentioned indexing emails, email has a very loose header structur
> Is there anything I can learn from the index directory's file listing?
Running this nasty little BASH one-liner...
$ for i in `ls * | perl -nle 'if (/^.+(\..+)/) {print $1;}' | sort |
uniq`;do ls -l *$i | awk '{SUM = SUM + $5} END {if (SUM > 1e10) {print
"'$i': ", SUM}}'; done
... I see
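For anyone who finds the one-liner above hard to read, roughly the same per-extension totals can be computed with a short Python script. This is a sketch and not what was actually run: the names `size_by_extension` and `report` are invented here, and `min_bytes` stands in for the `SUM > 1e10` filter in the awk step.

```python
import os
from collections import defaultdict


def size_by_extension(index_dir):
    """Return {extension: total bytes} for regular files directly in index_dir.

    Files without an extension (e.g. "segments") are grouped under "(none)".
    """
    totals = defaultdict(int)
    for entry in os.scandir(index_dir):
        if entry.is_file():
            ext = os.path.splitext(entry.name)[1] or "(none)"
            totals[ext] += entry.stat().st_size
    return dict(totals)


def report(index_dir, min_bytes=0):
    """Return (extension, bytes) pairs, largest first, keeping only
    extensions whose total is at least min_bytes."""
    rows = sorted(size_by_extension(index_dir).items(),
                  key=lambda kv: kv[1], reverse=True)
    return [(ext, n) for ext, n in rows if n >= min_bytes]
```

For example, `report("/mnt/sdb1/lucene-index/index-1", min_bytes=10**10)` would list only the extensions that account for 10 GB or more.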
: PS: I am a newbie to the mailing list - I hope I've got the etiquette right
you may have figured this out already, but please don't CC email to
multiple lucene mailing lists -- in this particular case,
[EMAIL PROTECTED] is just a legacy alias that points at [EMAIL PROTECTED] -- so
there's *really* no
"Rob Staveley (Tom)" <[EMAIL PROTECTED]>
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: RE: Seeing what's occupying all the space in the index
:
: > I can't see how Luke is going to show me what's occupying most of my
: index.
:
: I
> I can't see how Luke is going to show me what's occupying most of my
index.
I do however notice that none of my stored fields are stored compressed.
Presumably Field.Store.COMPRESS is something that is new in Lucene 1.9 and
wasn't available in 1.4.3? However, it is still hard to see what's c
riginal Message-
From: Karel Tejnora [mailto:[EMAIL PROTECTED]
Sent: 26 May 2006 14:42
To: java-user@lucene.apache.org
Subject: Re: Seeing what's occupying all the space in the index
Or you can use ssh -X for X11 forwarding. I don't know how it works on
Windows (some X client app) bu
Or you can use ssh -X for X11 forwarding. I don't know how it works on
Windows (some X client app), but it works great on Linux with huge bandwidth.
have Luke's whistles and bells. Does Luke have a non-GUI equivalent,
Grant?
-----Original Message-----
From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
Sent: 26 May 2006 12:41
To: java-user@lucene.apache.org
Subject: Re: Seeing what's occupying all the space in the index
Give Luke a try.
che.org
Subject: Re: Seeing what's occupying all the space in the index
Give Luke a try. Google for "Luke Lucene" and you should find it.
Otherwise check the Lucene website for a reference.
Give Luke a try. Google for "Luke Lucene" and you should find it.
Otherwise check the Lucene website for a reference.
Rob Staveley (Tom) wrote:
In my index of e-mail message parts, it looks like 23K is being used up for
each indexed message part, which is way more than I'd expect.
I have a
In my index of e-mail message parts, it looks like 23K is being used up for
each indexed message part, which is way more than I'd expect.
I have a total of 37 fields per message part.
I tokenize, index and do not store message part bodies.
I store a <= 300 character synopsis of each message part.