Re: Clarification on deletion process...

2008-08-11 Thread Chris Hostetter
: When i delete a document from the index ... The answer to all of your questions is yes, however documents marked for deletion are also "removed" from segments whenever they are merged, which can happen on any add. PS... : In-Reply-To: <[EMAIL PROTECTED]> : Subject: Clarification on
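The point about marked documents disappearing at merge time can be pictured with a toy model. This is only an illustrative sketch in plain Java, not Lucene code: the class `ToySegmentMerge`, the method `merge`, and the representation of a segment as a list of document ids are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model only (hypothetical names, no Lucene classes involved).
public class ToySegmentMerge {

    /**
     * Merges several segments into one new segment.
     * Ids listed in deletedIds are only "marked" until this point;
     * the merge copies live documents and silently drops marked ones.
     */
    public static List<String> merge(List<List<String>> segments, Set<String> deletedIds) {
        List<String> merged = new ArrayList<>();
        for (List<String> segment : segments) {
            for (String docId : segment) {
                if (!deletedIds.contains(docId)) {
                    merged.add(docId); // live document survives the merge
                }
            }
        }
        return merged;
    }
}
```

In this model a deleted document still occupies its slot in the old segments and only stops existing once a merge rewrites them, which mirrors the behaviour described in the reply.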

Re: Field sizes: maxFieldLength

2008-08-11 Thread Chris Hostetter
: In-Reply-To: <[EMAIL PROTECTED]> : Subject: Field sizes: maxFieldLength http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you cha

Query to ignore certain phrases

2008-08-11 Thread Jeff French
We're trying to perform a query where if our intended search term/phrase is part of a specific larger phrase, we want to ignore that particular match, but not the entire document (unless of course there are no other hits with our intended term/phrase). For example, a query like: "white house"

Re: Field sizes: maxFieldLength

2008-08-11 Thread Mark Miller
No. It's just a simple limiter on the number of terms taken from the document, used on the assumption that text that long stops adding value, or starts adding noise, or starts accelerating into diminishing returns at that point. No optimizations are used based on it. On Mon, Aug 11, 2008 at 5:20 PM, <[EMAI

Re: Field sizes: maxFieldLength

2008-08-11 Thread Mark Miller
The gist is: it doesn't help. That simply cuts long documents off at the knees on the assumption that it's long enough already, that more won't add much value (and may add noise?). It's not used for any sort of optimizations... it's a straight "just use the first n tokens from a document". [EMAIL P
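The "just use the first n tokens" behaviour described above can be sketched in plain Java. This is not Lucene's implementation: the class `FirstNTokens`, the method `firstTokens`, and the whitespace tokenization are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only (hypothetical names, whitespace tokenization assumed).
public class FirstNTokens {

    /** Returns at most maxTokens whitespace-separated tokens from the start of text. */
    public static List<String> firstTokens(String text, int maxTokens) {
        List<String> kept = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (kept.size() >= maxTokens) {
                break; // everything past the limit is simply ignored
            }
            if (!token.isEmpty()) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

The sketch shows why a limit above the real field size changes nothing: when the text has fewer tokens than the limit, the cut-off never triggers and the same tokens are kept.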

Re: Field sizes: maxFieldLength

2008-08-11 Thread Aravind . Yarram
tx for the response but i think i didn't make my question clear... If i am indexing a field that can at the max contain 1000 fields, does it help in improving performance if i let Lucene know IN ADVANCE about 1000? Mark Miller <[EMAIL PROTECTED]> 08/11/2008 05:13 PM Please respond to java-u

Re: Field sizes: maxFieldLength

2008-08-11 Thread Mark Miller
[EMAIL PROTECTED] wrote: Hi all - I know in advance that each of the fields i index doesn't go more than 1000. Can i gain any performance improvement while writing the index by limiting the maxFieldLength to 200? tx Regards, Aravind R Yarram This message contains information from Equifax I

Field sizes: maxFieldLength

2008-08-11 Thread Aravind . Yarram
Hi all - I know in advance that each of the fields i index doesn't go more than 1000. Can i gain any performance improvement while writing the index by limiting the maxFieldLength to 200? tx Regards, Aravind R Yarram This message contains information from Equifax Inc. which may be confidentia

Clarification on deletion process...

2008-08-11 Thread Aravind . Yarram
The documentation for the delete operation seems to be confusing (i am going thru the book and also posted in the book's forums...), so i would appreciate it if someone can let me know if my below understanding is correct. When i delete a document from the index 1) It is marked for deletion in the BUFFER until

Re: escaping special characters

2008-08-11 Thread Mark Miller
Steven A Rowe wrote: On 08/11/2008 at 2:14 PM, Chris Hostetter wrote: Aravind R Yarram wrote: can i escape built in lucene keywords like OR, AND as well? as of the last time i checked: no, they're baked into the grammar. I have not tested this, but I've read somewhere on t

Re: escaping special characters

2008-08-11 Thread Matthew Hall
You can simply change your input string to lowercase before passing it to the analyzers, which will give you the effect of escaping the boolean operators (i.e., you will now search on "and", "or", and "not"). Remember however that these are extremely common words, and chances are high that you are remo
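The lowercasing trick can be sketched in plain Java. The reply suggests lowercasing the whole input string; the variant below lowercases only the standalone words AND, OR and NOT so the rest of the query keeps its case. The class `OperatorLowercaser` and the method `lowercaseOperators` are hypothetical names, and no Lucene classes are involved.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only (hypothetical names, narrower variant of the suggestion).
public class OperatorLowercaser {

    // Matches the uppercase operator keywords only when they stand as whole words.
    private static final Pattern OPERATORS = Pattern.compile("\\b(AND|OR|NOT)\\b");

    /** Lowercases standalone AND / OR / NOT so they read as ordinary words. */
    public static String lowercaseOperators(String query) {
        Matcher m = OPERATORS.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(out, m.group(1).toLowerCase(Locale.ROOT));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The word-boundary check leaves terms such as ANDROID or ORDER untouched. Whether the lowercased words survive analysis is a separate matter, since very common words are often dropped as stop words.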

RE: escaping special characters

2008-08-11 Thread Steven A Rowe
On 08/11/2008 at 2:14 PM, Chris Hostetter wrote: > Aravind R Yarram wrote: > > can i escape built in lucene keywords like OR, AND as well? > > as of the last time i checked: no, they're baked into the grammar. I have not tested this, but I've read somewhere on this list that enclosing OR and AND

Re: escaping special characters

2008-08-11 Thread Chris Hostetter
: can i escape built in lucene keywords like OR, AND as well? as of the last time i checked: no, they're baked into the grammar. (that may have changed when it switched from a javac to a flex grammar though, so i'm not 100% positive) -Hoss -

Re: Term Based Meta Data

2008-08-11 Thread Mark Miller
If I were feeling adventurous, and I wanted to help out Mark with Lucene-1001, I'd try this: Get the trunk and apply Lucene-1001. Index all of your docs with the highlight coords as payloads. At highlight time, do something like the SpanHighlighter does - I've got a class called something lik

Re: Term Based Meta Data

2008-08-11 Thread Martin Owens
> Following the history of Payloads from its beginnings > (https://issues.apache.org/jira/browse/LUCENE-755, > https://issues.apache.org/jira/browse/LUCENE-761, > https://issues.apache.org/jira/browse/LUCENE-834, > http://wiki.apache.org/lucene-java/Payload_Planning) it looks like > TermP

Re: Re: Highlight huge documents

2008-08-11 Thread jim
it works!! Thanks I believe Highlighter.setMaxDocBytesToAnalyze(int byteCount) should be used for this. On Mon, Aug 11, 2008 at 11:40 AM, <[EMAIL PROTECTED]> wrote: > Hello > > I am using Highlighter to highlight query terms in documents retrieved from a > database and found through a lucene search.

Re: Highlight huge documents

2008-08-11 Thread Doron Cohen
I believe Highlighter.setMaxDocBytesToAnalyze(int byteCount) should be used for this. On Mon, Aug 11, 2008 at 11:40 AM, <[EMAIL PROTECTED]> wrote: > Hello > > I am using Highlighter to highlight query terms in documents retrieved from a > database and found through a lucene search. > My problem is that wh

Highlight huge documents

2008-08-11 Thread jim
Hello I am using Highlighter to highlight query terms in documents retrieved from a database and found through a lucene search. My problem is that when i display the full document, highlighter works fine for most documents, but if the document is huge the highlighter returns only a part of documen