[EMAIL PROTECTED] wrote:
You can have multiple languages in the same index. Just make sure that
your language identification process is consistent.
You might still get some false positives, for example, if there's a
German root that has the same letters as a French root, but means
something dif
Thanks Chris, it works like a champ now. I had thought I looked at
the queries themselves with toString but in any case, the queries
actually work now. I didn't realize that Lucene was customizable on
so many levels - when you create the analyzer, when you create the
index, when you perfo
Hi chaps,
I ran the same search code with lucene-1.4.3.jar and then with
lucene-core-1.9.1.jar
The good news is that there appeared to be a performance improvement with 1.9.1
when searching a single index, in both exact and fuzzy mode.
However, when searching multiple indexes with mul
I bet that if you look at the toString() of the query you get back from
your query parser, you'll see that the non numeric part numbers have been
stemmed.
You took the right steps when you indexed the field as UN_TOKENIZED, but
at query time your query parser doesn't know about that -- take a loo
I am trying to search by a number of fields including an alphanumeric
model id.
This is just the model id that comes from manufacturers. I've tried
to use a StandardAnalyzer and a SnowballAnalyzer to index the data.
Then I search with the associated analyzer using a
MultiFieldQueryParse
Also, this question may be better for one of the Eclipse groups, because
Eclipse already uses Lucene for indexing (of help? code?), so they will be able
to tell you how to integrate Eclipse and Lucene.
Otis
- Original Message
From: Chris Hostetter <[EMAIL PROTECTED]>
To: Lucene Users
Hello,
Can somebody tell me which document parsers are available for use
with CLucene? I know that for Lucene, XML->Text, PDF->Text, DOC->Text,
HTML->Text and RTF->Text parsers are all available. Have all of these been
ported to CLucene?
Thanks,
John
: Problem: while there is a hit, only the timestamp and ip of the very first
: line in the logfile are shown, but not the "matching" ip and timestamp later
: in the logfile. Any suggestions how to get to the "right entries" ?
It sounds like you are creating one Document per logfile, and then usin
First off: I've changed the reply to be the [EMAIL PROTECTED] list ... that
is the appropriate place to ask questions about using the Lucene APIs.
Second: once you have a test index built, and you can do some test
searches to verify it contains what you think it does (take a look at Luke
to be su
On 25 Apr 2006, at 17:54, April06 wrote:
We indexed several logfiles which contain for example a timestamp,
an ip and
additional information (all defined as a field) all in one line.
A logfile itself contains many of these lines.
We used a BooleanQuery (timestamp / ip) to search for an ip betwe
Given a term "myterm", what kind of search algorithm does Lucene use to
get to the postings list (i.e. the term-frequency location in the .frq file)?
From what I understood by looking into the Lucene file format,
it keeps the whole .tii file in memory and does a skipped linear
search o
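A minimal sketch of the lookup being asked about, in plain Java rather than Lucene's own classes. It assumes the commonly described layout: a sorted term dictionary (the role of the .tis file) plus an in-memory sample of every N-th term (the role of the .tii file). The lookup binary-searches the sample, then scans at most one block of the full dictionary. The class name, the method names and the small interval are hypothetical; the returned position only stands in for the pointer a real term entry would hold into the .frq file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Lucene's actual implementation.
class TermIndexSketch {
    // Kept tiny for the demo; a real index samples far more sparsely.
    static final int INDEX_INTERVAL = 4;

    final String[] terms;                            // full sorted term list (.tis stand-in)
    final List<Integer> sampled = new ArrayList<>(); // positions of sampled terms (.tii stand-in)

    TermIndexSketch(String[] sortedTerms) {
        terms = sortedTerms;
        for (int i = 0; i < terms.length; i += INDEX_INTERVAL) {
            sampled.add(i);
        }
    }

    /** Returns the position of the term in the dictionary, or -1 if absent. */
    int lookup(String term) {
        // Step 1: binary search for the last sampled term <= the target.
        int lo = 0, hi = sampled.size() - 1, block = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (terms[sampled.get(mid)].compareTo(term) <= 0) {
                block = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (block < 0) {
            return -1; // target sorts before the first term (or the dictionary is empty)
        }
        // Step 2: linear scan within that one block of the full dictionary.
        int start = sampled.get(block);
        int end = Math.min(start + INDEX_INTERVAL, terms.length);
        for (int i = start; i < end; i++) {
            int cmp = terms[i].compareTo(term);
            if (cmp == 0) return i;
            if (cmp > 0) break; // passed the place where it would have been
        }
        return -1;
    }
}
```

With this layout the in-memory part stays small, and the sequential scan is bounded by the sampling interval rather than by the size of the dictionary.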
We indexed several logfiles which contain for example a timestamp, an ip and
additional information (all defined as a field) all in one line.
A logfile itself contains many of these lines.
We used a BooleanQuery (timestamp / ip) to search for an ip between a defined
range of time.
Problem: while
It is up to you to create a program to do this, but it is relatively
easy. You may want to search the web, chances are someone has posted
code to do this, as a number of people have used Lucene in TREC in the past.
Good luck,
Grant
thanh nguyen wrote:
Hi trupti,
Thanks for your response. I h
On 4/25/06, Oskar Berger <[EMAIL PROTECTED]> wrote:
> What is the most efficient approach to access field values during a
> search?
If you are implementing a hit collector, and want to know a field
value for each document coming in, you can use the FieldCache if the
field is indexed and not tokeni
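The idea behind this advice can be sketched in plain Java, without Lucene on the classpath: load the field values once into an array indexed by document number, then let the collector do a cheap array lookup per hit instead of fetching a stored document. The class and method names here are hypothetical, and the two arrays merely stand in for what a field cache and a hit collector would supply.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; the arrays stand in for a per-document
// field cache and for the document numbers a hit collector receives.
class FieldCountSketch {
    /** Counts how often each cached field value occurs among the matching documents. */
    static Map<String, Integer> countByField(String[] valueByDoc, int[] matchingDocs) {
        Map<String, Integer> counts = new HashMap<>();
        for (int doc : matchingDocs) {
            String value = valueByDoc[doc]; // O(1) lookup, no stored-field access
            if (value != null) {            // documents without the field are skipped
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

The trade-off is memory: the array holds one entry per document in the index, whether or not that document ever matches a query.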
Hi trupti,
Thanks for your response. I have another question.
Can Lucene take a topic file like "
1 abc def " and produce a
result_file which we can use with the trec_eval program
(trec_eval relevant_file result_file, where relevant_file
is the TREC judgement file for these topics)?
Th
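For reference, a minimal sketch of writing ranked hits in the six-column run format that trec_eval reads (topic id, the literal Q0, document id, rank, score, run tag). It is plain Java with hypothetical names, and it assumes the hits for a topic are already sorted by descending score; running the search itself is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; docIds and scores stand in for the ranked hits of one topic.
class TrecRunSketch {
    /** Formats one topic's ranked hits as trec_eval result lines. */
    static List<String> formatRun(String topicId, String[] docIds, double[] scores, String runTag) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < docIds.length; i++) {
            // Columns: topic, literal "Q0", document id, rank, score, run tag.
            lines.add(String.format(Locale.ROOT, "%s Q0 %s %d %.4f %s",
                    topicId, docIds[i], i + 1, scores[i], runTag));
        }
        return lines;
    }
}
```

Writing the returned lines for every topic to one file gives the result_file argument; the relevant_file is the judgement file distributed with the topics.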
Hello masters,
What is the most efficient approach to access field values during a
search?
I am only interested in accessing a couple of fields for counting, and
am thus not in need of storing the values as if sorting on the fields.
Where to look?
Regards,
/oskar