The document id at the index level is the offset of the document within the index. It
can change over time for the same document, for example when several segments
are merged. Doc ids are also stored in sorted order in posting lists, which
allows fast posting-list intersection. Some Lucene APIs explicitly state that the
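As an illustration (a simplified sketch, not Lucene's actual implementation), two posting lists that are sorted by doc id can be intersected with a single merge-style walk:

import java.util.ArrayList;
import java.util.List;

// Simplified sketch of sorted posting-list intersection; not Lucene's actual code.
class PostingListIntersection {
    static List<Integer> intersect(int[] postingsA, int[] postingsB) {
        List<Integer> both = new ArrayList<Integer>();
        int i = 0, j = 0;
        while (i < postingsA.length && j < postingsB.length) {
            if (postingsA[i] == postingsB[j]) {       // doc contains both terms
                both.add(postingsA[i]);
                i++;
                j++;
            } else if (postingsA[i] < postingsB[j]) { // advance the list that is behind
                i++;
            } else {
                j++;
            }
        }
        return both;
    }
}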
Thanks Denis. I've been looking at the code in more detail now. I'm
interested in how the new SortingAtomicReader works. Suppose I build an
index and sort the documents using my own sorting function - as shown in
the docs:
AtomicReader sortingReader = new SortingAtomicReader(reader, sorter);
w
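A sketch of one common follow-up, writing the sorted view out as a new index (this assumes Lucene 4.3's IndexWriter.addIndexes(IndexReader...); the analyzer and target directory below are placeholders, not from the original message):

IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_43, analyzer); // 'analyzer' is a placeholder
IndexWriter sortedWriter = new IndexWriter(FSDirectory.open(new File("/path/to/sorted-index")), conf);
sortedWriter.addIndexes(sortingReader); // documents are rewritten in the sorted order
sortedWriter.close();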
I'm not quite sure what you really need, but as far as I understand, you want
to get all document ids for a given term. If so, the following code will work
for you:
Term term = new Term("fieldName", "fieldValue");
TermDocs termDocs = indexReader.termDocs(term);
while (termDocs.next()) {
    int docId = termDocs.doc(); // internal doc id of a document that contains the term
}
termDocs.close();
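If you are on Lucene 4.x, TermDocs no longer exists; a rough equivalent (a sketch, assuming an already opened IndexReader and the same field/value placeholders) is:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.index.DocsEnum;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.Bits;
import org.apache.lucene.util.BytesRef;

// Sketch: collect the ids of all documents containing field:value on Lucene 4.x.
class TermDocIds {
    static List<Integer> docIdsFor(IndexReader indexReader, String field, String value) throws IOException {
        List<Integer> ids = new ArrayList<Integer>();
        Bits liveDocs = MultiFields.getLiveDocs(indexReader);  // skip deleted documents
        DocsEnum docs = MultiFields.getTermDocsEnum(indexReader, liveDocs, field, new BytesRef(value));
        if (docs != null) {                                    // null when the term does not exist
            int doc;
            while ((doc = docs.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
                ids.add(doc);
            }
        }
        return ids;
    }
}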
Can someone point me to the code that traverses the posting lists? I'm
trying to understand how it works.
Thanks,
Sriram
Lovely. Thank you very much! Oliver
-----Original Message-----
From: java-user-return-56101-oliver.xu=aigine@lucene.apache.org
[mailto:java-user-return-56101-oliver.xu=aigine@lucene.apache.org] on behalf of
Koji Sekiguchi
Sent: June 12, 2013 22:47
To: java-user@lucene.apache.org
Subject: [SPAM] Re: A Problem in Customi
On 6/12/2013 7:02 PM, Steven Schlansker wrote:
On Jun 12, 2013, at 3:44 PM, Michael Sokolov wrote:
You may not have noticed that CharFilter extends Reader. The expected pattern
here is that you chain instances together -- your CharFilter should act as
*input* to the Analyzer, I think. Don
You may not have noticed that CharFilter extends Reader. The expected
pattern here is that you chain instances together -- your CharFilter
should act as *input* to the Analyzer, I think. Don't think in terms of
extending these analysis classes (except the base ones designed for it):
compose t
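For illustration, a minimal sketch of that pattern, assuming Lucene 4.3 and HTMLStripCharFilter standing in for your own CharFilter:

import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.charfilter.HTMLStripCharFilter;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.util.Version;

// Sketch: chain a CharFilter in front of the tokenizer instead of extending the analyzer.
public class CharFilteringAnalyzer extends Analyzer {
    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        Tokenizer source = new StandardTokenizer(Version.LUCENE_43, reader);
        return new TokenStreamComponents(source, new LowerCaseFilter(Version.LUCENE_43, source));
    }
    @Override
    protected Reader initReader(String fieldName, Reader reader) {
        // the CharFilter wraps the incoming Reader, i.e. it acts as *input* to the analysis chain
        return new HTMLStripCharFilter(reader);
    }
}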
Hello,
I figured out how to solve this. I just added stopTypes.add("<EMAIL>");
On Wed, Jun 12, 2013 at 8:39 PM, Gucko Gucko wrote:
> Hello all,
>
> is there a filter I can use to remove emails from a TokenStream?
>
> so far I'm using this to remove numbers and URLs, and I would like to remove
> emails
Hello all,
is there a filter I can use to remove emails from a TokenStream?
so far I'm using this to remove numbers and URLs, and I would like to remove
emails too:
Tokenizer tokenizer = new UAX29URLEmailTokenizer(Version.LUCENE_43,
new StringReader(text));
Set stopTypes = new HashSet();
st
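For reference, a small self-contained sketch of that approach; it assumes the Lucene 4.3-era TypeTokenFilter constructor (enablePositionIncrements, input, stopTypes), which changed in later 4.x releases, and the sample text is made up:

import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.TypeTokenFilter;
import org.apache.lucene.analysis.standard.UAX29URLEmailTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class EmailFilterSketch {
    public static void main(String[] args) throws Exception {
        String text = "contact me at foo@example.com or see http://example.com, thanks";
        UAX29URLEmailTokenizer tokenizer =
                new UAX29URLEmailTokenizer(Version.LUCENE_43, new StringReader(text));
        Set<String> stopTypes = new HashSet<String>();
        stopTypes.add("<URL>");    // drop URL tokens
        stopTypes.add("<EMAIL>");  // drop email tokens
        stopTypes.add("<NUM>");    // drop numeric tokens
        // 4.3-era constructor: (enablePositionIncrements, input, stopTypes)
        TokenStream stream = new TypeTokenFilter(true, tokenizer, stopTypes);
        CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
        stream.reset();
        while (stream.incrementToken()) {
            System.out.println(term.toString());
        }
        stream.end();
        stream.close();
    }
}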
Thank you so much. Yes, the problem was that I had a JAR that was using Lucene
1.5.
Best!
On Wed, Jun 12, 2013 at 7:52 PM, Uwe Schindler wrote:
> Hi,
>
> This happens if you have incompatible Lucene versions next to each other
> in your classpath. Please clean up your classpath carefully and make
Hi,
This happens if you have incompatible Lucene versions next to each other in
your classpath. Please clean up your classpath carefully and make sure all JAR
files of Lucene have the same version and no duplicate JARs with different
versions are in it!
Uwe
-
Uwe Schindler
H.-H.-Meier-All
Hello all,
I'm trying the following code (trying to play with Tokenizers in order to
create my own Analyzer) but I'm getting an exception:
public class TokenizerTest {
public static void main(String[] args) throws IOException {
String text = "A #revolution http://hi.com in t...@test.com softwa
Hi Oliver,
> My questions are:
>
> 1. Why are the overridden lengthNorm() (under Lucene410) or
> computeNorm() (under Lucene350) methods not called during a searching
> process?
Regardless of whether you override the method or not, the Lucene framework
calls the method during index time only be
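In other words, the custom Similarity has to be set on the IndexWriterConfig so that the norm is computed and encoded while documents are indexed; setting it only on the searcher will not re-run lengthNorm(). A rough sketch (assuming Lucene 4.x; the RAMDirectory and analyzer are placeholders):

import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.similarities.DefaultSimilarity;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class CustomNormExample {
    public static void main(String[] args) throws IOException {
        DefaultSimilarity mySim = new DefaultSimilarity() {
            @Override
            public float lengthNorm(FieldInvertState state) {
                return 1.0f; // e.g. disable length normalization
            }
        };
        Directory dir = new RAMDirectory();
        IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_43,
                new StandardAnalyzer(Version.LUCENE_43));
        iwc.setSimilarity(mySim);        // index time: norms are computed and encoded here
        IndexWriter writer = new IndexWriter(dir, iwc);
        // ... writer.addDocument(...) calls go here ...
        writer.close();

        IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(dir));
        searcher.setSimilarity(mySim);   // search time: scoring uses the already-encoded norms
    }
}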
Dear,
I built my own scoring class by extending DefaultSimilarity. Three major
methods from DefaultSimilarity were overridden, including:
1. public float lengthNorm(FieldInvertState state)
2. public float tf(float freq)
3. public float idf(long docFreq, long numDocs)
However, with embe