Hi,
On 11/11/05, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Was wondering if someone could help me out with a few things in Korean
> as related to Lucene:
> 1. Which Analyzer do you recommend? From the list, I see that some
> have had success with the StandardAnalyzer. Are there any c
Look at IndexReader.open()
It actually uses a MultiReader if there are multiple segments.
-Yonik
Now hiring -- http://forms.cnet.com/slink?231706
On 11/11/05, Charles Lloyd <[EMAIL PROTECTED]> wrote:
> You should run your own tests, but I found the MultiReader to be slower
> than a regular IndexR
It doesn't seem like a custom Similarity would work. Always returning
1.0 for coord would still rank a doc higher if both current_name and
old_name matched.
-Yonik
On 11/11/05, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> I believe if you create a cu
Thanks for your time.
-- Text I want to highlight is stored in the file system and index
-- I can search and highlight the searched terms in results page ( just
snippets)
-- I have given a download link next to snippets ( which will point to file I
stored in ROOT webapp of tomcat)
I understo
On 11 Nov 2005, at 13:27, Lasse L wrote:
I am indexing persons that have the usual fields: name, address, etc.
I need to keep track of which names and addresses are active now and
which ones are old.
I do that by having two sets of fields, e.g. current_name and
old_name
When I search for a per
On 11 Nov 2005, at 12:54, bib_lucene bib wrote:
My requirement is that I do a search, the results of the search are
displayed. I am displaying results by using getBestFragments and
highlighting the searched text.
So basically the user can search and see what documents matched his
search with s
Hello,
- Original Message -
From: "Grant Ingersoll" <[EMAIL PROTECTED]>
To:
Sent: Friday, November 11, 2005 10:36 PM
Subject: Getting Started with Korean
> Hi,
>
> Was wondering if someone could help me out with a few things in Korean
> as related to Lucene:
> 1. Which Analyzer do y
On Friday 11 November 2005 23:04, Chris Hostetter wrote:
>
> : Wouldn't it make sense to have BooleanFilter,
> : TermFilter, MultiTermFilter, RangeFilter... family to
> : "mirror" xxxQuery world with same idioms and
> : interfaces? Is this the direction already taken in
> : Lucene development (
: Wouldn't it make sense to have BooleanFilter,
: TermFilter, MultiTermFilter, RangeFilter... family to
: "mirror" xxxQuery world with same idioms and
: interfaces? Is this the direction already taken in
: Lucene development (an alternative would be to
: parametrize the existing Query world). How
: There is no way around using a separate Scorer for this.
: You can make (could have made) the scorer by starting from
: DisjunctionSumScorer.java here:
:
http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/src/java/org/apache/lucene/search/
: and rewrite it into a DisjunctionMaxScorer.
Coincid
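To make the difference between summing and taking the maximum concrete, here is a minimal, Lucene-free sketch. FieldScoreCombiner and the per-field score values are hypothetical illustrations, not Lucene classes or real scores; the sketch only shows why a sum-style disjunction ranks a document higher when both current_name and old_name match, while a max-style disjunction does not.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: none of this is Lucene API.
class FieldScoreCombiner {

    // Sum-style combination: every matching field adds to the score,
    // so a document matching in two fields outranks a one-field match.
    static double sumScore(Map<String, Double> perFieldScores) {
        double total = 0.0;
        for (double score : perFieldScores.values()) {
            total += score;
        }
        return total;
    }

    // Max-style combination: only the best-matching field counts.
    static double maxScore(Map<String, Double> perFieldScores) {
        double best = 0.0;
        for (double score : perFieldScores.values()) {
            best = Math.max(best, score);
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up per-field scores for two documents.
        Map<String, Double> bothFields = new LinkedHashMap<>();
        bothFields.put("current_name", 0.8);
        bothFields.put("old_name", 0.5);

        Map<String, Double> currentOnly = new LinkedHashMap<>();
        currentOnly.put("current_name", 0.8);

        System.out.println("sum: " + sumScore(bothFields) + " vs " + sumScore(currentOnly));
        System.out.println("max: " + maxScore(bothFields) + " vs " + maxScore(currentOnly));
    }
}
```

With the max-style combination the two documents tie, which is the behaviour asked for in the thread; a real implementation would have to live in a Scorer, as described above.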
Everything is perfect with your suggestion, scoring is
not needed. I am also going to try the approach with
ChainedFilter, but for this I need to think a bit more
on how to get it right. The Query in the example is
just one variation on the same topic and there are a
few more cases I need to cover
Lasse,
On Friday 11 November 2005 19:27, Lasse L wrote:
> I am indexing persons that have the usual fields: name, address, etc.
> I need to keep track of which names and addresses are active now and
> which ones are old.
> I do that by having two sets of fields, e.g. current_name and old_name
>
> W
Yes, this works.
.
String strQuery = query.toString();
WeightedTerm[] weightedTerm = QueryTermExtractor.getTerms(query);
ArrayList bodyQueryTerms = new ArrayList();
for (int i = 0; i < weightedTerm.length; i++) {
String term = weightedTerm[i].getTe
I am indexing persons that have the usual fields: name, address, etc.
I need to keep track of which names and addresses are active now and
which ones are old.
I do that by having two sets of fields, e.g. current_name and old_name
When I search for a person and I search in just the current fields
ran
>>> This doesn't work, because
Ah, crap. You'll have to drop down another level.
Every line of code in QueryTermExtractor that calls
terms.add(new WeightedTerm(..))
would be the place to test the field name then.
For now you could copy QueryTermExtractor and put an
"if" around these lines whi
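As a rough sketch of the field test described above: the following stand-alone Java keeps only the terms that belong to one field. FieldTerm and termsForField are hypothetical stand-ins invented for this illustration; they are not part of the highlighter package, and the real change would be made inside a copy of QueryTermExtractor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FieldTermFilter {

    // Hypothetical stand-in for a (field, text) query term; not a Lucene class.
    static final class FieldTerm {
        final String field;
        final String text;

        FieldTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    // Collects the text of every term whose field matches the wanted field.
    static List<String> termsForField(List<FieldTerm> terms, String wantedField) {
        List<String> kept = new ArrayList<>();
        for (FieldTerm term : terms) {
            // This is the "if" around the add: skip terms from other fields.
            if (wantedField.equals(term.field)) {
                kept.add(term.text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Terms of the example query: body:home AND title:sale
        List<FieldTerm> queryTerms = Arrays.asList(
                new FieldTerm("body", "home"),
                new FieldTerm("title", "sale"));
        System.out.println(termsForField(queryTerms, "body"));
    }
}
```

Only the terms kept for "body" would then be handed to the highlighter, so "sale" is not highlighted in the body text.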
Cheolgoo Kang wrote:
>Thanks Bialecki,
>
>
Bialecki is my last name, my first name is Andrzej. No problem, it's
similarly confusing for Europeans to decide between the first and last
name in Asian names... :-) Is your first name Kang?
>I'm trying to test your program, thanks a lot!
>
>And also
Hi Erik & All
My requirement is that I do a search, the results of the search are displayed.
I am displaying results by using getBestFragments and highlighting the searched text.
So basically the user can search and see what documents matched his search with
snippets of text shown in the result of
Hi Mark
This doesn't work, because
WeightedTerm[] weightedTerm = QueryTermExtractor.getTerms(query);
returns the query term values, not the field names.
Example:
for "body:mark title:highlight"
it returns [mark, highlight], so I can't compare these values with the "body" field.
Ernesto.
mark harwood
Ah. You're right. Looks like the current highlighter
api doesn't offer you that degree of control.
The way to fix it is probably to tweak the list of
WeightedTerms you give the highlighter:
[pseudo code follows...]
terms=QueryTermExtractor.getTerms(query);
bodyQueryTerms=new ArrayList();
for all
Hello,
I am new to Lucene. I was trying to use Lucene with TREC-6 Data. The
dataset for TREC-6 used in 1997 contains many input files. Each input
file has multiple documents (some files contain over 200 documents) tagged by
and the text is tagged by .The result
given by Lucene to a
You should run your own tests, but I found the MultiReader to be slower
than a regular IndexReader. I was running on a dual-cpu box and two
separate disk drives.
Charles.
I queue up all my index operations. If the app stops the queue gets saved
to disk. When the app restarts the queue is loaded and everything carries
on. I haven't looked at the app failing just yet. I know the JVM has hooks
that can be used to ensure clean up code gets called when the JVM exits
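A minimal sketch of the save-on-exit idea, assuming the queue is just a list of operation strings written one per line to a file. PersistentOpQueue, its methods, and the file format are all assumptions made for this illustration, not the poster's code. The only JVM facility relied on is Runtime.addShutdownHook, which runs on a normal exit but not if the process is killed outright or the JVM crashes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical queue of pending index operations, persisted as one line per operation.
class PersistentOpQueue {
    private final List<String> pending = new ArrayList<>();
    private final Path file;

    // Reloads any operations left over from a previous run.
    PersistentOpQueue(Path file) {
        this.file = file;
        if (Files.exists(file)) {
            try {
                pending.addAll(Files.readAllLines(file, StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    synchronized void add(String operation) {
        pending.add(operation);
    }

    // Returns a copy so callers cannot modify the queue behind its back.
    synchronized List<String> snapshot() {
        return new ArrayList<>(pending);
    }

    // Writes the pending operations to disk, replacing the previous file.
    synchronized void save() {
        try {
            Files.write(file, pending, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Registers a JVM shutdown hook that saves the queue on normal exit.
    void saveOnExit() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::save));
    }
}
```

Because the hook is not guaranteed to run on an abrupt failure, calling save() after each batch as well would narrow the window in which queued operations can be lost.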
Thanks Bialecki,
I'm trying to test your program, thanks a lot!
And also, can you give me the papers you've cited as [1] and [2]? I've
searched Google (the entire web and Google Scholar) for them but found nothing.
On 11/8/05, Andrzej Bialecki <[EMAIL PROTECTED]> wrote:
> KwonNam Son wrote:
>
> >First of all, I re
Thanks for the advice Paul,
I thought about doing two passes.. Delete all and then insert all, but
the problem with that approach is if my program fails somewhere in
between start and end.. I may end up with many deleted records and none
changed. The same could happen with a batch build. How are
Hello,
You really do need to batch up your deletes and inserts, otherwise it will
take a long time. If you can, do all your deletes and then all of your
inserts. I have gone to the trouble of queueing index operations and when a
new operation comes along I reorder the job queue to ensure delet
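The reordering can be sketched as a stable partition of the job queue: all deletes first, then all inserts, each group keeping its original order. DeleteFirstBatch, Op, and Kind are hypothetical names for this illustration and do not come from the poster's code. The assumption behind it is that deletes and inserts go through different objects (a reader and a writer), so grouping them avoids switching back and forth.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DeleteFirstBatch {

    enum Kind { DELETE, INSERT }

    // Hypothetical queued index operation: what to do, and to which record.
    static final class Op {
        final Kind kind;
        final String id;

        Op(Kind kind, String id) {
            this.kind = kind;
            this.id = id;
        }

        @Override
        public String toString() {
            return kind + ":" + id;
        }
    }

    // Returns a new list with every DELETE ahead of every INSERT.
    // Relative order inside each group is preserved (stable partition).
    static List<Op> deletesFirst(List<Op> queue) {
        List<Op> ordered = new ArrayList<>(queue.size());
        for (Op op : queue) {
            if (op.kind == Kind.DELETE) {
                ordered.add(op);
            }
        }
        for (Op op : queue) {
            if (op.kind == Kind.INSERT) {
                ordered.add(op);
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        // An update is queued as a delete followed by an insert of the same id.
        List<Op> queue = Arrays.asList(
                new Op(Kind.DELETE, "a"), new Op(Kind.INSERT, "a"),
                new Op(Kind.DELETE, "b"), new Op(Kind.INSERT, "b"));
        System.out.println(deletesFirst(queue));
    }
}
```

Since an update's delete always precedes its insert in the original queue, moving all deletes to the front never deletes a freshly inserted record.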
The IndexSearcher(MultiReader) will be faster (it's what's used for
indexes with multiple segments too).
-Yonik
On 11/11/05, Mike Streeton <[EMAIL PROTECTED]> wrote:
> I have several indexes I want to search together. What performs better a
> sing
Hi
I'm using highlighter and have this problem:
The query is over two or more fields, like:
*body:home AND title:sale*
I want to highlight only in the body field, and not highlight "sale" even if
"sale" appears in the body.
How can I do this?
When I create a Highlighter instance, the parameter is the query:
*hi
Martijn,
Sorry for the late reply.
I've been on holiday.
I had other more pressing things come up.
The problem I was trying to solve was clustering the indexing and
search.
I am thinking of breaking my application into indexing and search nodes
and keep them coordinated in some fashion. It would
Howdy all,
I am having a problem with inserting/updating records into my
index. I have approximately 1.5M records in the index taking about 2.5G
space when optimized.
If I want to update 1000 records, I delete the old item and insert the
new one. This is taking a LONG time to accomplis
If you are storing the term vector when you index, then you can ask the
IndexReader for the vector using the getTermFreqVector() method, which
will return the TermFreqVector which should have the information you need
[EMAIL PROTECTED] wrote:
I hope that this isn't a newbie's question, but let
I hope that this isn't a newbie's question, but let me
ask the more general question. While IndexReader can
return the documents containing the term t, I need to
do the opposite. Is there a method, given document d,
that will return all of the terms in that document (I
need to calculate the averag
On 11 Nov 2005, at 01:22, bib_lucene bib wrote:
Hi All
I use the following code to display search results
LuceneHitHighlighter highlighter = new LuceneHitHighlighter
(queryStr, "snippet", "body");
for (int i = 0; i < hits.size(); i++) {
Document doc = (D
Hi,
Was wondering if someone could help me out with a few things in Korean
as related to Lucene:
1. Which Analyzer do you recommend? From the list, I see that some
have had success with the StandardAnalyzer. Are there any caveats I
should be aware of if I choose to use it?
2. Could anyone
I have several indexes I want to search together. Which performs better: a
single searcher on a multi reader, or a single multi searcher on multiple
searchers (1 per index)?
Thanks
Mike
: - What is the purpose of hasCode and equals methods in
: XxxFilter? (this is a question about actual usage in
: Lucene, not java elementary :)
You mean hashCode, right? ... those methods are generally important for
hashing, which makes them key for effective caching in most cases.
CachingWrapper
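A small sketch of why equals and hashCode matter for caching, using a plain HashMap. RangeKey and FilterCacheDemo are hypothetical classes written for this illustration; they are not Lucene's filters or its caching wrapper. The point is only that two separately constructed but equivalent filter descriptions must compare equal, or the cache never gets a hit.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class FilterCacheDemo {

    // Hypothetical description of a range filter; not Lucene's RangeFilter.
    static final class RangeKey {
        final String field;
        final String lower;
        final String upper;

        RangeKey(String field, String lower, String upper) {
            this.field = field;
            this.lower = lower;
            this.upper = upper;
        }

        // Two keys are equal when they describe the same field and bounds.
        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof RangeKey)) {
                return false;
            }
            RangeKey that = (RangeKey) other;
            return field.equals(that.field)
                    && lower.equals(that.lower)
                    && upper.equals(that.upper);
        }

        // Must agree with equals so equal keys land in the same hash bucket.
        @Override
        public int hashCode() {
            return Objects.hash(field, lower, upper);
        }
    }

    private final Map<RangeKey, int[]> cache = new HashMap<>();
    int computations = 0;

    // Returns the cached result for an equal key, computing it only once.
    int[] bitsFor(RangeKey key) {
        int[] cached = cache.get(key);
        if (cached == null) {
            computations++;          // stands in for the expensive filter evaluation
            cached = new int[0];
            cache.put(key, cached);
        }
        return cached;
    }
}
```

Without the two overrides, each new RangeKey instance would fall back to identity comparison and every lookup would recompute the result.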