You can adapt the source code of StopAnalyzer.java in the analysis
package, or I suppose you can use the constructor that takes a stop word
list and pass it an empty one (but please check this).
If you don't know "Luke", it is a small tool you can use to display your
index and verify your indexing process.
http://www.getopt
OK, that's fine. Thanks.
Now what if I don't want to stop any words, meaning I want Lucene not to ignore
any word? How do I do this? And will doing this affect performance or
not?
Thanks...
- Original Message
From: Grant Ingersoll <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Thanks all for the valuable suggestions.
The lock issue also got resolved, and all 7 lakh files were indexed in around 85
minutes, which is like wow!
To get around the lock issue I followed the suggestion given in this:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200601.mbox/[EMAIL
: I currently have a boolean query which contains a MultiFieldQuery for all
MultiFieldQuery is not a query class that comes with Lucene ... did you
write it yourself?
It sounds like what you want is a boolean query (or a
DisjunctionMaxQuery) containing a separate phrase query for each field ...
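One way to picture the "separate phrase query for each field" idea is as a query string with one quoted phrase clause per field. The sketch below is plain Java with no Lucene dependency; the class name, method name, and field names are hypothetical. It assumes the classic query parser syntax field:"phrase" and a default OR operator, so it corresponds to the boolean-query variant, not to a DisjunctionMaxQuery.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: builds a query string only.
final class PerFieldPhraseSketch {

    /** Returns one quoted phrase clause per field, e.g. field1:"foo bar". */
    static String build(String phrase, String... fields) {
        // Escape backslashes and double quotes so the phrase stays one clause.
        String escaped = phrase.replace("\\", "\\\\").replace("\"", "\\\"");
        List<String> clauses = new ArrayList<>();
        for (String field : fields) {
            clauses.add(field + ":\"" + escaped + "\"");
        }
        // Space-separated clauses are optional (OR) when the parser's
        // default operator is OR, which is assumed here.
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(build("foo bar", "field1", "field2", "field3"));
    }
}
```

The resulting string can be handed to a query parser, or the same loop can build the clauses programmatically if you prefer to construct the query objects yourself.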
Keywords.setKeyword(String) should be able to accumulate all the
keywords set by the digester. So the setKeyword(String) method should be
written as below, using java.util.List:
public static class KeyWords
{
    private String lineNum;
    private List kw = new LinkedList();
    pub
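Since the listing above is cut off, here is a minimal sketch of the accumulating setter it describes. Only the lineNum and kw fields come from the message; the class name KeyWordsSketch, the accessors, and the use of generics are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

// Hypothetical reconstruction for illustration; not the original class.
final class KeyWordsSketch {
    private String lineNum;
    private final List<String> kw = new LinkedList<>();

    public void setLineNum(String lineNum) {
        this.lineNum = lineNum;
    }

    public String getLineNum() {
        return lineNum;
    }

    // Called once per keyword element: appends instead of overwriting,
    // so earlier keywords are kept.
    public void setKeyword(String keyword) {
        kw.add(keyword);
    }

    // Read-only view of the keywords in the order they were set.
    public List<String> getKeywords() {
        return Collections.unmodifiableList(kw);
    }
}
```

Because the setter appends, a rule that fires once per keyword element leaves every keyword in the list, not just the last one.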
Grant Ingersoll <[EMAIL PROTECTED]> wrote on 19/03/2007 13:10:16:
> So, if I am understanding correctly:
>
> >> "SearchSameRdr" Search > : 5000
>
> means don't collect indiv. stats for SearchSameRdr, but do whatever
> that task does 5000 times, right?
Almost...
It should be btw
{ "SearchSameR
Thanks for the reply, Doron. I knew this email was targeted for you,
but thought it would be good to add to the user record.
On Mar 19, 2007, at 2:30 PM, Doron Cohen wrote:
Grant Ingersoll <[EMAIL PROTECTED]> wrote on 18/03/2007 10:16:14:
I'm using contrib/benchmark to do some tests for my
Grant Ingersoll <[EMAIL PROTECTED]> wrote on 18/03/2007 10:16:14:
> I'm using contrib/benchmark to do some tests for my ApacheCon talk
> and have some questions.
>
> 1. In looking at micro-standard.alg, it seems like not all braces are
> closed. Is a line ending a separator too?
'>' can replace
You'll have a difficult time updating Lucene indexes in place. A lot of
coordination exists within Lucene specifically not to do this: it's the
fact that Lucene does not do this that enables a lot of the lockless
parallelism in Lucene. This applies equally to the data store and the
inverted index p
Hello, I'm having some trouble constructing the correct query. This is my current
situation.
I'm searching for "foo bar" in 3 fields:
In the index I have:
document 1.
field1 contains (boost is 2.0): "bar stuff"
field2 contains: "bar max"
field3 contains: ""
document 2.
field1 contains (boost is 2.0)
Donna,
You are correct, "enum" should be "terms". Could you please modify the FAQ?
You just have to log into the wiki and edit that page (the edit link is at the
bottom).
Thanks,
Otis
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/ - Tag - Search -
I am a very new user of Lucene, and thus far am amazed at its speed and
ease of use.
I have a question about something in the FAQ though. I have a need to get
all terms in a specific section of the document; I want to create a database
of term vs an identifier of the document containing the term
On Mar 19, 2007, at 5:09 AM, Michael McCandless wrote:
I think some of these changes are similar to how KinoSearch builds a
segment.
Yup... sounds familiar. ;)
I'm still working through some lingering issues before I can make a
clean patch,
Well, where is it? Don't keep it a secret!
M
One of the constructors for StandardAnalyzer allows you to set your
stop words. If you use the default constructor, you get the default
set of stop words, which is in StopAnalyzer.ENGLISH_STOP_WORDS.
-Grant
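To see what a stop word list changes, here is a small sketch in plain Java. It does not use Lucene; the class, the method, and the four-word sample list are assumptions for illustration, and the sample is not Lucene's actual default list. It only mimics the effect of dropping stop words from a lowercased, whitespace-split token stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of stop word filtering; not Lucene's analyzer.
final class StopWordSketch {

    /** Lowercases, splits on whitespace, and drops tokens found in stopWords. */
    static List<String> tokens(String text, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative sample list only.
        Set<String> sample = new HashSet<>(Arrays.asList("this", "is", "the", "a"));
        System.out.println(tokens("this is garden", sample));
        // With an empty list every word survives.
        System.out.println(tokens("this is garden", Collections.<String>emptySet()));
    }
}
```

With the sample list only "garden" remains, while an empty list keeps all three words, which is the behaviour asked for in the question quoted below.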
On Mar 19, 2007, at 6:14 AM, aslam bari wrote:
Hello All,
I am using StandarAnalyz
Hi,
I've been looking into improving performance of IndexWriter,
specifically how it makes use of RAM to buffer added documents.
I've created a new class (MultiDocumentWriter) that can build a single
segment from many documents at once, more efficiently than the current
single document segment ap
Hello All,
I am using StandardAnalyzer for indexing documents. Then I make a query to
search for some words with an AND query.
For example, I need to search for a document which contains all of the following words:
"this is garden".
I think when Lucene indexes the document, it ignores some common words like
"
Storing documents (especially large ones) outside the index is better than
storing them in it: every segment merge or optimize copies that data.
Storing them in the index is possible, but it requires 1-4x more space and,
depending on the read/write speed of the file system, merge and optimize take longer.
Karel
On Sun, 200