Am I the only one who thinks this is not the way to go? MultiReader (or
MultiSearcher) is not going to fix your problems. Having 1.4B documents on
one machine is a big number; it does not matter how you partition them (unless
you have some really expensive hardware at your disposal). Did I miss the
point to speak about
> documentation.
>
> About clear(Object sentinel) - is it still a question (now that you
> understood getSentinelValue())? I think we should not make it final anyway.
> It restricts PQ extensions unnecessarily ...
>
> Shai
>
> On Wed, Sep 30, 2009 at 8:41
forget the question about initialize(); reading the javadoc before asking
already-answered questions helps a lot, sorry for the noise. ...NOTE in
getSentinelObject() javadoc...
- Original Message
> From: eks dev
> To: java-user@lucene.apache.org
> Sent: Wednesday, 30 September
o be sentinels again. And of course add a reset() method to TSDC.
>
> On Wed, Sep 30, 2009 at 5:26 PM, eks dev wrote:
>
> > Thanks Mark, Shai,
> > I was getting confused by so many possibilities to do the "almost the same
> > thing" ;)
> >
> > But have f
> You also do want to specify whether or not to collect docs in order if
> > you care about performance:
> >
> > public static TopScoreDocCollector create(int numHits, boolean
> > docsScoredInOrder)
> >
> > ie:
> >
> > TopScoreDocCollector.create(
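A minimal sketch of how the create(...) call described above is used end to end. The index setup below is illustrative only (a Lucene 2.9-era API is assumed, since that is the version under discussion); it is not code from the thread:

```java
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.RAMDirectory;

public class BestScoringOnly {
    public static void main(String[] args) throws Exception {
        // Illustrative in-memory index with a NAME field, as in the thread.
        RAMDirectory dir = new RAMDirectory();
        IndexWriter w = new IndexWriter(dir, new WhitespaceAnalyzer(),
                IndexWriter.MaxFieldLength.UNLIMITED);
        for (String t : new String[]{"maria", "maria maria", "marae"}) {
            Document d = new Document();
            d.add(new Field("NAME", t, Field.Store.YES, Field.Index.ANALYZED));
            w.addDocument(d);
        }
        w.close();

        IndexSearcher searcher = new IndexSearcher(dir, true);
        // numHits = how many best-scoring docs we want; docsScoredInOrder =
        // false because we do not care about doc-id order, only best scores,
        // and we do not need max-score tracking.
        TopScoreDocCollector tsdc = TopScoreDocCollector.create(2, false);
        searcher.search(new TermQuery(new Term("NAME", "maria")), tsdc);
        for (ScoreDoc sd : tsdc.topDocs().scoreDocs)
            System.out.println(sd.doc + " " + sd.score);
        searcher.close();
    }
}
```

This requires the Lucene 2.9 jar on the classpath; the document texts and numHits value are arbitrary.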
- Original Message
> From: eks dev
> To: java-user@lucene.apache.org
> Sent: Wednesday, 30 September, 2009 11:43:26
> Subject: TSDC, TopFieldCollector & co
>
> Hi All,
>
> What is the best way to achieve the following and what are the differences,
>
Hi All,
What is the best way to achieve the following and what are the differences, if
I say "I do not normalize scores, so I do not need max score tracking, I do not
care if hits are returned in doc id order, or any other order. I need only to
get maxDocs *best scoring* documents":
OPTION 1:
I do not know much about RAM FS, but I know for sure if you have enough memory
for RAMDirectory, you should go for it. That gives you the fastest and the most
stable performance: no OS swaps, no sudden performance drops... Uwe's tip is
very good if you or the OS occasionally need RAM for other things
> > How do you handle stop words in phrase queries?
ok, good point! You found another item for the list of BADs... but not for me,
as we do not use phrase Qs; to be honest, I do not even know how they are
implemented... but no, there are no positions in such a cache...
well, they remain slowe
t exist with new Lucene...
> >> > I did not verify it again on the old one, but hey, who cares. Trunk is
> clean
> >> and, at least so far, our favourite QA team has nothing to complain about
> >> ...
> >> >
> >> > They will keep it u
a while... so if somethings comes up you
> will hear from me...
> > Thanks again to all.
> >
> > Cheers, Eks
> >
> >
> >
> > - Original Message
> >> From: eks dev
> >> To: java-user@lucene.apache.org
> >> Sent: T
up you
will hear from me...
Thanks again to all.
Cheers, Eks
- Original Message
> From: eks dev
> To: java-user@lucene.apache.org
> Sent: Thursday, 16 July, 2009 14:40:26
> Subject: Re: speed of BooleanQueries on 2.9
>
>
> ok new facts, less chaos :)
>
ok new facts, less chaos :)
- LUCENE-1744 fixed it definitely; I have it confirmed
Also, we found another example of the Query that was stuck (t1 t2 t3)~2 ...
this is also fixed with LUCENE-1744
Re: "some queries are 4X slower than before". Was that a different issue?
(Because this issu
I am getting lost as well, maybe I managed to confuse myself and everybody else
here.
But all agree, it would be good to know why it works now
Re. Query rewriting.
This Query gets printed with
///
BooleanQuery q;
q.toString();
search(q, null, 200);
///
=> this is the Query that enters
Trace taken on the trunk version (with Yonik's bug fixed and LUCENE-1744, which
fixed the problem somehow)
full trace is too big (3.5Mb for this list), therefore only beginning and end:
Query: +(((NAME:maria NAME:marae^0.25171682 NAME:marai^0.2365632
NAME:marao^0.2365632 NAME:marau^0.2365632 NAME:mar
well, the QA team is not there, and I am "abusing" the customer's sysadmin, and
it will cost me only a beer if I stop now :)
Will post traces tomorrow, daylight does better ... I will have them done on
trunk version (fixed two bugs) ...
- Original Message
> From: Michael McCandless
> To
warmduscher :)
good night
- Original Message
> From: Uwe Schindler
> To: java-user@lucene.apache.org
> Sent: Thursday, 16 July, 2009 1:06:30
> Subject: RE: speed of BooleanQueries on 2.9
>
> Same here, too late! Good night!
> And the blood glucose level is very low, too - very bad
t; NAME:pikarski^0.23232001 NAME:piowarski^0.20281483 NAME:pirkarski^0.22073482
> NAME:plocharski^0.21168004 NAME:pokarski^0.20172001
> NAME:polikarski^0.20172001
> NAME:pukarski^0.20172001 NAME:pyekarska^0.26508
> NAME:siekarski^0.20281483))^2.0)
> >
> >
>
I just do not see how...
Also not really expected: this query runs over BS2; shouldn't +(whatever
whatever1 ...) run as BS? What does it mean to have MUST +() at the top level?
it is a bit late here, I am going to bed ...
Thanks a lot to all involved!
Eks
- Original Message -
)^2.0)
- Original Message
> From: eks dev
> To: java-user@lucene.apache.org; yo...@lucidimagination.com
> Sent: Wednesday, 15 July, 2009 23:57:22
> Subject: Re: speed of BooleanQueries on 2.9
>
>
>
> it works with current trunk, 10 Minutes ago built?!
>
it works with current trunk, 10 Minutes ago built?!
if I put lucene from yesterday, the same symptoms like yesterday...
Mike's instrumented version is running ...
- Original Message
> From: Yonik Seeley
> To: java-user@lucene.apache.org
> Sent: Wednesday, 15 July, 2009 23:34:29
DocIdSetIterators. The ones from Lucene core
> all implement the new API and do it more effectively than the example code :-)
>
> Or does Eks Dev use custom DocIdSetIterators?
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
>
> If I make a patch that adds verbosity to what BS is doing, can you run
> it & post the output?
can do, it can take some time
- Original Message
> From: Michael McCandless
> To: java-user@lucene.apache.org
> Sent: Wednesday, 15 July, 2009 20:54:25
> Subject: Re: speed of BooleanQuer
g
> >> >> Sent: Wednesday, 15 July, 2009 17:16:23
> >> >> Subject: Re: speed of BooleanQueries on 2.9
> >> >>
> >> >> So now I'm confused. Since your query has required (+) clauses, the
> >> >> setAllowDocsOutOfOrder should
> Is it possible for you to make the problem happen such that we get
> line numbers in this traceback?
sure, I will build lucene trunk with debug/line numbers enabled and ask
customer's QA to run it again...
> Is CPU pegged when it's stuck?
Yes! One core was 100% hot
- Original Mes
, 2009 at 7:04 PM, eks dev wrote:
> >> >
> >> > I do not know exactly why, but
> >> > when I BooleanQuery.setAllowDocsOutOfOrder(true); I have the problem,
> >> > but
> with
> >> setAllowDocsOutOfOrder(false); no problems whatsoever
> >> &
whatsoever
> >
> > not really scientific method to find such bug, but does the job and makes
> > me
> happy.
> >
> > Empirical, "deprecated methods are not to be taken as thoroughly tested, as
> they have short life expectancy"
> >
>
something weird happening w/ BooleanScorer...
indeed, my first impression was a JVM bug triggered under some rare
conditions... but we tried an old JVM (1.5), the latest 1.6 U14, -client
instead of -Xbatch -server... no changes
We never managed to wait so long to see it finish, so I am not sure if
- Original Message
> From: eks dev
> To: java-user@lucene.apache.org
> Sent: Wednesday, 15 July, 2009 0:24:43
> Subject: Re: speed of BooleanQueries on 2.9
>
>
> Mike, we are definit
earch(Unknown Source)
org.apache.lucene.search.Searcher.search(Unknown Source)
- Original Message
> From: eks dev
> To: java-user@lucene.apache.org
> Sent: Monday, 13 July, 2009 13:28:45
> Subject: Re: speed of BooleanQueries on 2.9
>
> Hi Mike,
>
> getMa
Hi Mike,
getMaxNumOfCandidates() in test was 200, Index is optimised and read-only
We found (due to an error in our warm-up code, funny) that only this Query runs
slower on 2.9.
A hint where to look: this Query contains the two most frequent
tokens in two particular fields
Hi Mike,
thanks for looking into it...
I am now positive: it was definitely a problem for the OS to map() a large
contiguous chunk of process memory... if I use this machine for a while as a
desktop (eclipse, ...), I get the same problem again... but after a cold
restart, mapping succeeds.
The proble
Is it possible that the same BooleanQuery on 2.9 runs significantly slower than
on 2.4?
we have some strange effects where the following query runs approx 4 times
(ouch!) slower on 2.9, test done by executing the same Query 1000 times... But!
if I run the test from some real Query log with mixed Qu
-Xms and -Xmx were set to the same value
imo, the problem was to convince the OS (Win XP) to map a huge contiguous
block... there were no other jvm processes running at the same time, just this
one... but after killing some desktop processes and restarting the machine it
worked.
hmm,
MMapDirectory has support for
for tips Uwe.
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -Original Message-
> > From: eks dev [mailto
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -Original Message-
> > From: eks dev [mailto:eks...@yahoo.co.uk]
> > Sent: Sunday, July 12, 2009 1:24 PM
> > To: java-user@lucene.apache.org
> > Subject: Re: OOM with 2.9
> >
> >
> >
Stack trace
java.io.IOException: Map failed
at sun.nio.ch.FileChannelImpl.map(Unknown Source)
at org.apache.lucene.store.MMapDirectory$MMapIndexInput.<init>(Unknown Source)
at org.apache.lucene.store.MMapDirectory$MMapIndexInput.<init>(Unknown Source)
at org.apache.lucene.store.MMapDirectory.openInput(Un
Hi,
We just upgraded to 2.9 and noticed some (to me) unexpected OOMs.
We use MMapDirectory, and after the upgrade, on exactly the same
index/machine/jvm/params/setup... we cannot open the index, as mapping screams
"No memory".
Any explanation why this could be the case?
---
depends on your architecture: will you partition your index? What is the max
expected size of your index (you said 128G and growing...)? What do you mean by
growing? In both options you have enough memory to load it into RAM...
I would definitely try to have fewer machines and a lot of memory, so that
also see,
http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/search/BooleanQuery.html#getAllowDocsOutOfOrder()
- Original Message
> From: Nigel
> To: java-user@lucene.apache.org
> Sent: Friday, 26 June, 2009 4:11:53
> Subject: Optimizing unordered queries
>
> I recently pos
You omitNorms(), did you also omitTf()?
when something like https://issues.apache.org/jira/browse/LUCENE-1345 gets
committed, you will have a possibility to see some benefits (e.g. by packing
single postings lists as Filters). The code there optimises exactly that case,
as filters contain no Sco
another performance tip: what helps "a lot" is sorting your collection before
you index.
if you can somehow logically partition your index, you can improve locality of
reference by sorting.
What I mean by this:
imagine index with following fields: zip, user_group, some text
if typical query
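The sorting step above can be sketched in plain Java; the record shape and field names (zip, user_group) mirror the example fields and are otherwise assumptions:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SortBeforeIndex {
    // Minimal stand-in for a document about to be indexed; field names
    // mirror the zip / user_group example above (hypothetical).
    static class Doc {
        final String zip, userGroup, text;
        Doc(String zip, String userGroup, String text) {
            this.zip = zip; this.userGroup = userGroup; this.text = text;
        }
    }

    // Sort by the fields a typical query filters on, so documents sharing
    // a zip/user_group end up in contiguous doc-id ranges (better locality
    // of reference when their postings are scanned together).
    static void sortForLocality(List<Doc> docs) {
        Collections.sort(docs, new Comparator<Doc>() {
            public int compare(Doc a, Doc b) {
                int c = a.zip.compareTo(b.zip);
                return c != 0 ? c : a.userGroup.compareTo(b.userGroup);
            }
        });
    }

    public static void main(String[] args) {
        List<Doc> docs = new ArrayList<Doc>();
        docs.add(new Doc("28213", "b", "x"));
        docs.add(new Doc("10115", "a", "y"));
        docs.add(new Doc("28213", "a", "z"));
        sortForLocality(docs);
        for (Doc d : docs) System.out.println(d.zip + "/" + d.userGroup);
    }
}
```

The documents are then added to the IndexWriter in this sorted order, since Lucene assigns doc ids in insertion order.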
We've also had the same problem on a 150Mio doc setup (Win 2003, java 1.6).
After monitoring response time distribution over time for a couple of weeks, it
was clear that such long running response times were due to bad warming-up.
There were peaks shortly after index reload (even comprehensive warmi
there is one case where MMAP does not beat RAM, initial warm-up after process
restart. With MMAP it can take a while before you get up to speed. MMAP with
reopen is the best, if you run without restart.
- Original Message
> From: Uwe Schindler
> To: java-user@lucene.apache.org
>
you can store a binary value?
e.g. with:
Field(String name, byte[] value, Field.Store store)
You could store all your fields as byte[], so you get them back as byte[]. How
you index them is another problem, but since you have no problems with speed in
your case, leave it as it is.
try simp
Have you tried NGram SpellChecker + Query expansion? This is quite similar to
your proposal, you have your priority queue in SpellChecker
- Original Message
> From: mark harwood
> To: java-user@lucene.apache.org
> Sent: Wednesday, 18 February, 2009 11:54:18
> Subject: Re: Lucene sear
The simplest sorting would be to sort your collection before indexing, because
Lucene will preserve the order of added documents. I think nutch sorts the
index afterwards somehow, but I do not know how that works
by omitTf() I mean the new feature in the trunk version, see
https://issues.apache.org/ji
hi Cedric,
has nothing to do with SSD... but
>
> All queries involves a Date Range Filter and a Publication Filter.
> We've used WrappingCachingFilters for the Publication Filter for there
> are only a limited number of combinations for this filter. For the
> Date Range Filter we just let it r
no, at the moment you cannot make pure boolean queries. But 1.5 seconds on
10Mio documents sounds a bit too much (we have well under 200ms on a 150Mio
collection). What you can do:
1. use Filter for high frequency terms, e.g. via ConstantScoreQuery as much as
you can, but you have to cache them (C
you could maintain your bloom filter and check only "positives" with an exact
search to see whether they are false positives; if you have a small percentage
of duplicates (unique documents dominate updates), this will help you a lot on
the performance side
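A hand-rolled sketch of the idea in plain Java; the table size, hash seeds, and key format are arbitrary illustration values, not from the thread:

```java
import java.util.BitSet;

public class BloomSketch {
    // Tiny Bloom filter: two hash probes into an m-bit table. A hit may be
    // a false positive and must be confirmed by an exact search; a miss is
    // definitely "never seen" and needs no exact check at all.
    static final int M = 1 << 16;  // illustrative size, not tuned
    final BitSet bits = new BitSet(M);

    int probe(String key, int seed) {
        int h = seed;
        for (int i = 0; i < key.length(); i++) h = h * 31 + key.charAt(i);
        return (h & 0x7fffffff) % M;
    }

    void add(String key) {
        bits.set(probe(key, 17));
        bits.set(probe(key, 131));
    }

    boolean mightContain(String key) {
        return bits.get(probe(key, 17)) && bits.get(probe(key, 131));
    }

    public static void main(String[] args) {
        BloomSketch seen = new BloomSketch();
        seen.add("doc-42");
        // Positive: run the exact duplicate check only for this case.
        System.out.println(seen.mightContain("doc-42"));
        // Unseen key: no exact check needed when this prints false.
        System.out.println(seen.mightContain("doc-43"));
    }
}
```

Only the keys that come back true ever reach the expensive exact search, which is why this pays off when unique documents dominate the update stream.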
- Original Message
> From: markharw00d <[EMA
Analyzer that detects your condition "ALL match something", if possible at
all...
e.g. "800123456 80034543534 80023423423" -> 800
then you put it in an ALL_MATCH field and match this condition against it... if
this prefix needs to be variable, you could extract all matching prefixes to
this field
do not forget that a Filter does not have to be loaded in memory, not any more
since the LUCENE-584 commit! Now a skipping iterator is all you need.
translated, you could use:
ConstantScoreQuery created with a Filter made from TermDocs (you need to
implement only DocIdSet / DocIdSetIterator, thi
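The skipping-iterator contract can be illustrated with a hand-rolled analogue in plain Java. This mimics the shape of Lucene's DocIdSetIterator (nextDoc / advance over sorted doc ids) but is a sketch, not the actual Lucene class:

```java
import java.util.Arrays;

public class SortedDocIdIterator {
    // Analogue of the DocIdSetIterator contract over a sorted doc-id array:
    // nextDoc() steps forward, advance(target) skips to the first doc id
    // >= target, and NO_MORE_DOCS signals exhaustion.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    private final int[] docs;
    private int pos = -1;

    SortedDocIdIterator(int[] sortedDocs) { this.docs = sortedDocs; }

    int nextDoc() {
        return ++pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    int advance(int target) {
        // Binary search instead of a linear scan: this is the "skipping"
        // that lets a filter run without materializing a full bit set.
        int i = Arrays.binarySearch(docs, pos + 1, docs.length, target);
        pos = i >= 0 ? i : -i - 1;
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    public static void main(String[] args) {
        SortedDocIdIterator it = new SortedDocIdIterator(new int[]{2, 7, 40, 99});
        System.out.println(it.nextDoc());   // 2
        System.out.println(it.advance(10)); // 40, skipping past 7
        System.out.println(it.nextDoc());   // 99
        System.out.println(it.nextDoc());   // NO_MORE_DOCS
    }
}
```

A real Lucene implementation would return an object with this contract from DocIdSet.iterator(), backed by TermDocs rather than an int[].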
yes, we have seen this many times. The problem is, especially on Windows, that
some simple commands like copy wreak havoc on the file system cache; as a
matter of fact, we are not sure it is the cache that is making problems,
generally all IO operations start blocking like crazy (we have seen this effe
hmm, if I am not wrong, it looks awfully similar to the Exception we have seen
and concluded was some black magic with a corrupt memory chip or what-not, but
the fact that we are not alone makes me wonder now... The subject of that
thread was "Strange Exception"... we were able to use this very same inde
NGrams will do ok,
depends a lot on what you are up to: if there is a person looking at result
lists making decisions, it will work fine, as the default TF/IDF similarity
will give you an ok order of hits; but if you need to set some cutoff value to
decide automatically whether this is a match or not, then y
the example you have sent is too small for the type of compression implemented
in lucene. The problem is that you have to store the decoding symbol table,
header... *for each* document you compress.
The best you can do here would be to use some compressor with a static
decoding table (some ent
>>Upping the amount of RAM does not help us when the
index is replaced before we pass the 50.000 queries.
have you seen https://issues.apache.org/jira/browse/LUCENE-1035 ? It would be
interesting to see if this one changes HD numbers. You have plenty of free
memory in this setup...
you said that if an Index is optimized, isDeleted() does not present a
performance problem? I think there is still a null check in a synchronized
method; can the jvm optimize this? I doubt it.
- Original Message
From: German Kondolf <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tu
I did not follow this discussion from the start, but I guess you could cleanly
achieve this by implementing
org.apache.lucene.index.FilterIndexReader
have fun. e.
- Original Message
From: 仇寅 <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Sunday, 2 March, 2008 3:05:05 AM
Sub
I would like to try to replace our external storage of documents with Lucene
stored field, so a few questions before we proceed:
Background: We store currently complete documents in a simple binary file and
only keep offsets into this file as a Stored field in Lucene index. Documents
(compre
Otis,
I think it was proposed to have a spell checker that works on multiple tokens
per Document:
where the field to be searched with the SpellChecker looks like "lucene search
library", it does not get tokenized and then fed to the SpellChecker; rather,
you keep it as a "single token" that gets chopped int
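The chopping step can be sketched in plain Java; the helper name and gram size are assumptions:

```java
import java.util.ArrayList;
import java.util.List;

public class NGramChopper {
    // Chop a multi-word field value into character n-grams, treating the
    // whole string as one token (as suggested above), instead of tokenizing
    // first and spell-checking word by word. Cross-word grams (spanning the
    // space) are what make the multi-token matching work.
    static List<String> ngrams(String s, int n) {
        List<String> out = new ArrayList<String>();
        for (int i = 0; i + n <= s.length(); i++) out.add(s.substring(i, i + n));
        return out;
    }

    public static void main(String[] args) {
        List<String> grams = ngrams("lucene search", 3);
        System.out.println(grams.size());
        System.out.println(grams.get(0));
        System.out.println(grams.get(grams.size() - 1));
    }
}
```

Each gram is then indexed as a term of the spell-check field, so a misspelled multi-word query still overlaps most of the grams of the correct phrase.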
300k documents is something I would consider very small. Anything under 10Mio
documents IMHO is small for Lucene (meaning, commodity hardware with 1G RAM
should give you well-under-a-second response times).
The number of words is not all that important, much more important would be the
number of uniqu
sounds easy (I said sounds :),
e.g.
your Statement becomes a Document in Lucene lingo; you make it with 3-4 Lucene
fields:
CONTENT (tokenized, not stored)
OFFSET (not indexed, stored) - offset in file of the first byte of your statement
DOC_LENGTH (not indexed, stored) - if you have no END-OF-Statem
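The field layout above can be sketched like this; the wrapper class, method name, and Lucene 2.9-era Field constants are assumptions, not from the email:

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class StatementDoc {
    // Hypothetical helper: one Statement -> one Lucene Document, with the
    // field roles described above (Lucene 2.9-era Field API assumed).
    static Document toDocument(String statementText, long byteOffset, int length) {
        Document doc = new Document();
        // searchable text, not stored
        doc.add(new Field("CONTENT", statementText,
                Field.Store.NO, Field.Index.ANALYZED));
        // stored only: offset of the statement's first byte in the source file
        doc.add(new Field("OFFSET", Long.toString(byteOffset),
                Field.Store.YES, Field.Index.NO));
        // stored only: statement length, for when there is no end marker
        doc.add(new Field("DOC_LENGTH", Integer.toString(length),
                Field.Store.YES, Field.Index.NO));
        return doc;
    }
}
```

At search time you read OFFSET and DOC_LENGTH back from the hit and seek into the original file to fetch the full statement.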
I've been doing this for the past couple of years, and yes, we use Lucene for
some key parts of the problem.
Basically, the problem you face is how to get extremely high recall without
compromising precision. Hard!
The key problem is performance: imagine you have a DB with 10Mio persons you
need to
: Performance between Filter and HitCollector?
eks dev and others - have you tried using the code from LUCENE-584? Noticed
any performance increase when you disabled scoring? I'd like to look at that
patch soon and commit it if everything is in place and makes sense, so I'm
curious if y
just to complete this fine answer,
there is also Matcher patch (https://issues.apache.org/jira/browse/LUCENE-584)
that could bring the best of both worlds via e.g. ConstantScoringQuery or
another abstraction that enables disabling Scoring (where appropriate)
- Original Message
From: Ch
have a look at LuceneQueryOptimizer.java in nutch
- Original Message
From: Tim Johnson <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, 21 February, 2007 3:34:36 PM
Subject: Stop long running queries
I'm having issues with some queries taking in excess of 500 secs t
I would strongly suggest not storing these fields in lucene; just keep them as
files and store some kind of url to get them later. That will boost your speed
heavily. If you really, really need to store documents in lucene, try some
compression.
Also, so many fields hurt performance; any chance
1- is there someone out there that already wrote an extension to
Lucene so that 'stored' string for each document/field is in fact stored in
a centralized repository? Meaning, only an 'index' is actually stored in the
document and the real data is put somewhere else.
2- If not, how ha
have you considered hadoop's "light" messaging RPC? It should have
significantly smaller latencies than RMI
- Original Message
From: Simon Wistow <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, 4 October, 2006 3:26:38 PM
Subject: Re: Searching documents on big index by u
Paul's Matcher in Jira will almost enable this, indirectly but possible
- Original Message
From: karl wettin <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, 26 September, 2006 11:30:24 PM
Subject: Re: Re[2]: how to enhance speed of sorted search
On 9/26/06, Chris Hoste
d special linguistic tricks are anyhow not
so relevant for most searching situations. A regular stemmer makes much
greater distortion than this.
Must find this code somewhere; I probably left something out in these emails
- Original Message
From: eks dev <[EMAIL PR
Hi Otis,
Depends what you need to do with it: if you need this only as a "kind of
stemming" for searching documents, the solution is not all that complex. If you
need linguistically correct splitting then it gets complicated.
For the first case:
Build a SuffixTree with your dictionary (hope you
I would rather use this:
BitSet bits = new BitSet(reader.maxDoc()); // not sure of the exact method
name, lucene is not on this PC...
instead of an unsized new BitSet()
- Original Message
From: Mark Miller <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, 5 September, 200
I would suggest you have a look at the Egothor stemmer
(http://www.egothor.org/book/bk01ch01s06.html); it can be trained rather easily
(if your only use of "roots" is for searching).
I have only heard of it as a good thing, never tried it
On Aug 4, 2006, at 1:29 PM, Marios Skounakis wrote:
XP Professional / Win 2003 Server; we had this issue on JVMs 1.5/1.6.
It seems this happens "not so often" on 1.6/Win2003, but we have had this in
production for only 2 weeks.
We have a single update machine that builds the index in batch and replicates
to many index readers, so at least customers ar
This is a windows/jvm issue. Have a look at how ant is dealing with it; maybe
we could give it a try with something like that (I have not noticed ant having
problems).
We are not able to reproduce this in our environment systematically, so it
would be great if you could patch your lucene with th
have you tried to collect only doc-ids and see if the speed problem is there,
or maybe to fetch only field values? If you have dense results it can easily be
split() or addSymbolsToHash() that takes the time.
I see 3 possibilities for what could be slow: getting doc-ids, fetching field
values, or do
Did not check it, but solr is using SkippingFilter, which is not yet committed
to Lucene... so this will maybe not work?
By the way, any reason today not to commit SkippingFilter to Lucene? I actually
see nothing to do for this but to commit the existing SkippingFilter. If there
is something I do
try your query like ((ducted^1000 duct~2) +tape)
Or maybe (duct* +tape)
or even better you could try some stemming (a Porter stemmer should get rid of
these -ed suffixes) plus some of the above.
if this does not help, have a look at the lingpipe spellChecker class, as this
looks like exactly what yo
Grab it now, it is worth the money.
- Original Message
From: digby <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, 6 June, 2006 11:59:53 AM
Subject: Lucene in Action
Does everyone recommend getting this book? I'm just starting out with
Lucene and like to have a b
or you could try the n-gram approach with the Spellchecker (you will find it in
the contrib area):
get suggestSimilar() results and form your query, or even better a
ConstantScoreQuery via a Filter. It works OK.
Or if you do not have so many Terms (could spare the memory to load all terms),
you could try TernarySearch
If you can use all that memory for index, I would say RAM. For long running
indexes (to get os cache populated), MMAP will do just as good if you have any
file system worth using.
- Original Message
From: Michael Chan <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Sunday, 28
try:
1. query-string: "hello +area:home" to get Filtering effect
2. to minimize scoring use boosts: "(hello)^HIGH_BOOST +(area:home)^LOW_BOOST"
3. If scoring via boosts does not work well enough for you, or is slow, use the
Filter interface from your code... search this list for Filter
- Or
Just a short one: it rocks in some cases (when the
actual BitSet/IntSet is compressible, long runs of
set or clear bits...). A very good general BitSet
representation.
I have tried it and found no bugs so far (+- 2 months
of using it).
Unfortunately, there is an issue with the licence
(not ASF compatible :(
Hi,
Would it be OK to add one method to the Filter class
that returns the DocNrSkipper interface from Paul's
"Compact sparse Filter" in jira LUCENE-328?
This would be the first step for:
- smooth integration of compact representations of
the underlying BitSet in Filter (VInt and sorted
int[]). They are
Hi Hoss,
Good to hear that, I felt a bit fuzzy trying to grasp
all the possibilities.
I've read the discussion of Doug's proposal for
implementing non-scoring Query features,
ConstantScoreQuery, and Paul's FilteredQuery patch.
In summary, the options to avoid scoring:
1. There is a consensus that
Everything is perfect with your suggestion; scoring
is not needed. I am also going to try the approach
with ChainedFilter, but for this I need to think a
bit more about how to get it right. The Query in the
example is just one variation on the same topic, and
there are a few more cases I need to cover
Thanks Hoss,
I've looked into it and you were absolutely right;
it could not be simpler.
Two quick ones on the same topic (my personal
education-like questions):
- What is the purpose of the hashCode and equals
methods in XxxFilter? (this is a question about
actual usage in Lucene, not elementary java
(currently using HitCollector) and score is not
needed; any way to avoid scoring (would that help at
all?)
Before adding the ZIPS:12* part of the query, Lucene
worked like a charm, well under 1 second on a 25Mio
collection! Now it jumped into the 10 second range.
Trunk is ok for me.
Thanks a lot!
eks dev
Hi,
I need a minimum memory footprint for the index
during search (would like to have it in RAM).
The good thing in this story: similarity calculation
is not necessary; a pure boolean model is OK.
I am sure I have seen somewhere an explanation from
Doug about disabling norms... but cannot f
works like a charm,
thanks!
as a side note, the latest patch with properly
disabled coord helped me a lot as well, made coord
usable.
--- Doug Cutting <[EMAIL PROTECTED]> wrote:
> eks dev wrote:
> > When I reindex with the lucene from the latest svn
> > snapshot, a lot o
When I reindex with the lucene from the latest svn
snapshot, a lot of .tii files that are deletable
appear (checked with luke).
This was not happening with the previous version,
using exactly the same code for indexing.
At the end of indexing, Optimize finished
successfully.
Is this a bug?
WinXP,
Hi,
Is there a way to create an index which does not
store norms (.f* files) and positions (.prx files)?
In the case I need to support, no length
normalisation is needed; the same goes for
positional info.
(Similarity.encodeNorms(float) returns 0; and
Term.SetPositionalIncrement(0) is used)
From the size