Nilesh,
the StandardAnalyzer is full of generally useful special cases, including
email and number detection.
I suppose you have hit one such special case, which presumably has some
justification.
I can't tell you why it behaves this way, but I can tell you it's really hard
to change because others rely on it.
Thanks so much, Brandon Mintern. My mistake, sorry everyone.
On Wed, Mar 28, 2012 at 3:12 AM, Brandon Mintern wrote:
> On Tue, Mar 27, 2012 at 12:21 AM, jianwen lou wrote:
> > I want to store a long value in my index files like the following:
> >
> >NumericField priceField = ne
Dear Lucene users and developers,
sorry for getting back to this old subject, but we are in the position
of re-evaluating our current implementation, which uses a recompiled
version of Lucene 3 with boolean scorers multiplying sub-scores. I was
hoping that "flexible ranking" in Lucene 4 will provid
Or ... move to use a per-segment array. Then you don't need to rely on doc
IDs changing. You will need to build the array from the documents that are
in that segment only.
It's like FieldCache in a way. The array is relevant as long as the segment
exists (i.e. not merged away).
Hope this helps.
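A minimal sketch of the per-segment idea in plain Java, without any Lucene
types. The class name PerSegmentScores and the "segment key" object are
hypothetical, chosen only for illustration; the assumption is that you have
some object that lives exactly as long as the segment does. Lookups use the
segment-local doc ID, so nothing depends on global doc IDs.

```java
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical holder for per-segment score arrays (not a Lucene class). */
public final class PerSegmentScores {
    // One array per segment, keyed by an object that identifies the segment.
    // WeakHashMap lets an entry be dropped once its key is no longer
    // strongly referenced elsewhere, e.g. after the segment is merged away.
    private final Map<Object, float[]> bySegment = new WeakHashMap<>();

    /** Registers the scores built from the documents of one segment only. */
    public synchronized void put(Object segmentKey, float[] scores) {
        bySegment.put(segmentKey, scores);
    }

    /** Looks up a score by segment-local doc ID (0-based within the segment). */
    public synchronized float score(Object segmentKey, int segmentLocalDoc) {
        float[] scores = bySegment.get(segmentKey);
        if (scores == null) {
            throw new IllegalStateException("no scores loaded for this segment");
        }
        return scores[segmentLocalDoc];
    }
}
```

When a new segment appears, only its array has to be built; the arrays of
unchanged segments stay valid.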
On Tue, Mar 27, 2012 at 12:21 AM, jianwen lou wrote:
> I want to store a long value in my index files like the following:
>
> NumericField priceField = new NumericField("price");
> priceField.setDoubleValue(temp.getCurrentprice());
> document.add(pric
While using the pruning package, I realised that ridf is calculated in
RIDFTermPruningPolicy as follows:
Math.log(1 - Math.pow(Math.E, termPositions.freq() / maxDoc)) - df
However, according to the original paper (Blanco et al.) for residual idf,
it should be -log(df/D) + log (1 - e^(*-*tf/D)). T
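For reference, a minimal sketch of the residual idf exactly as quoted from
the paper above, -log(df/D) + log(1 - e^(-tf/D)). The class, method and
parameter names are made up for illustration and are not part of the pruning
package. The arithmetic is done in double so that df/D and tf/D are not
truncated by integer division.

```java
/** Illustrative helper only; not the pruning package's implementation. */
public final class ResidualIdf {
    private ResidualIdf() {}

    /**
     * @param df      number of documents containing the term
     * @param totalTf total number of occurrences of the term in the collection
     * @param numDocs number of documents in the collection (D)
     */
    public static double ridf(long df, long totalTf, long numDocs) {
        // Observed idf: -log(df / D)
        double idf = -Math.log((double) df / numDocs);
        // Second term of the quoted formula: log(1 - e^(-tf / D))
        double poissonTerm = Math.log(1.0 - Math.exp(-(double) totalTf / numDocs));
        return idf + poissonTerm;
    }
}
```

With df held fixed, a larger total tf gives a larger value, which is the
behaviour one would expect from a term that clusters in few documents.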
Maybe you only see CFS files? If this is the case, your index is in compound
file format. In that case (the default), to get the raw files, disable
compound files in the merge policy!
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Or
The code seems OK on quick glance...
Are you closing the writer?
Are you hitting any exceptions?
Mike McCandless
http://blog.mikemccandless.com
On Tue, Mar 27, 2012 at 12:19 PM, Luis Paiva wrote:
> Hey all,
>
> I'm taking my first steps with Lucene.
> I was trying to index some txt files, and my pr
Hi Nilesh,
Which version of Lucene are you using? StandardTokenizer behavior changed in
v3.1.
Steve
-Original Message-
From: Nilesh Vijaywargiay [mailto:nilesh.vi...@gmail.com]
Sent: Tuesday, March 27, 2012 2:04 PM
To: java-user@lucene.apache.org
Subject: Lucene tokenization
I have a
Hey all,
I'm taking my first steps with Lucene.
I was trying to index some txt files, but my program doesn't create the
term vector files, which I need (.tvd, .tvx, .tvf).
I'm attaching my code so that someone can help me.
Thank you all in advance!
Sorry if I'm repeating the question, but
In general, how Lucene assigns docIDs is a volatile implementation
detail: it's free to change from release to release.
E.g., the default merge policy (TieredMergePolicy) merges out-of-order
segments. Another example: at one point, IndexSearcher re-ordered the
segments on init. Another: because Concurre
I'll, of course, defer to Uwe for technical Lucene issues, but it looks
like you've got a copy/paste error. I doubt it's the root of your
problem, but this code reuses priceField where it seems
you intended the second field to use salesField:
NumericField priceField = new NumericField("price");
It seems that the Analyzer I used in my project is the problem. I use
CJKAnalyzer, and I don't fully understand Lucene's analysis and tokenization
process. Is there another way to do this?
I want to store numbers and date/time values in a Lucene field and use that
field to filter and run range searches. Thanks.
Hi,
> I don't fully understand the precisionStep argument. Do I need to pass it?
RTFM: http://goo.gl/PlhhO
> On Tue, Mar 27, 2012 at 3:48 PM, jianwen lou wrote:
>
> > No, there is no multi-threaded index building going on at the same time.
> > I googled and found this result. I use a 64-bit JVM; does that matter?
> >
> >
The bug mentioned in this link was a multithreading bug (which is what I asked
you about). If you reuse Documents and Fields this can happen, otherwise not.
This code is heavily tested, and the code you sent cannot fail. Maybe it's
different from the one you actually use?
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-282
I don't fully understand the precisionStep argument. Do I need to pass it?
On Tue, Mar 27, 2012 at 3:48 PM, jianwen lou wrote:
> No, there is no multi-threaded index building going on at the same time.
> I googled and found this result. I use a 64-bit JVM; does that matter?
>
>
> http://lucene.472066.n3.nabble.com/Lucene-3-
No, there is no multi-threaded index building going on at the same time.
I googled and found this result. I use a 64-bit JVM; does that matter?
http://lucene.472066.n3.nabble.com/Lucene-3-4-shift-bug-in-possibly-invalid-use-of-NumericTokenStream-td3592962.html
F:\Java\open-source\lucene>java -version
java version "1.6.0_25"
Hi all,
I have a search application with 16 million documents that uses custom
scores per document using a ValueSource. These values are updated a lot
(and sometimes all at once), so I can't really write them into the index
for performance reasons. Instead, I simply have a huge array of float
Hi,
Are you sure that you are not reusing the same NumericField instances across
different threads?
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: jianwen lou [mailto:loujan...@gmail.com]
> Sent: Tuesd