On 8/16/06, Stanislav Jordanov <[EMAIL PROTECTED]> wrote:
I searched the mail list archives for an answer to that question;
The closest (and perhaps the only) thread in this regard that I found is:
http://www.gossamer-threads.com/lists/lucene/java-user/9928
So the answer was "No", but this is
I suspect you'll have to roll your own. I'd use the SynonymAnalyzer from
Lucene in Action as a model, starting around page 129. I really doubt that
there's much you can expect Lucene to do for you for this specialized kind
of tokenizing.
Erick
On 8/16/06, Van Nguyen <[EMAIL PROTECTED]> wrote:
We are indexing the file server via Lucene. We have a 15 MB index file and
have set up a once a day re-index and then switch to a continuous index
update.
So every time a content item is published it is immediately indexed and
deployed. The problem persists in both scenarios.
We are using jconsole
I'm looking for a
cross between a WhitespaceAnalyzer and StandardAnalyzer. If I pass
in:
I-Pity-da-fool who
has a 1" ladder said MR.T
I want it to index
these:
i-pity-da-fool
pity
fool
1"
1
ladder
mr.t
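The token list above can be approximated without any Lucene classes. What follows is only a minimal plain-Java sketch of the expansion rules, not an Analyzer and not anyone's posted solution. The class name HybridTokenSketch, the method tokens, and the stop-word set are all hypothetical, and the rules are assumptions inferred from the one example: split on whitespace, lowercase, keep each whole token, and additionally emit the hyphen-separated parts with surrounding punctuation stripped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: plain Java only, no Lucene dependency.
final class HybridTokenSketch {

    // Assumed stop set, chosen only so the sample sentence yields the list above.
    static final Set<String> SAMPLE_STOP_WORDS =
            new HashSet<>(Arrays.asList("i", "a", "da", "who", "has", "said"));

    static List<String> tokens(String text, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        // WhitespaceAnalyzer-like step: break on runs of whitespace only.
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue;
            }
            String whole = raw.toLowerCase(Locale.ROOT);
            if (!stopWords.contains(whole)) {
                out.add(whole); // keep the untouched token, e.g. i-pity-da-fool or 1"
            }
            // StandardAnalyzer-like step (simplified): hyphen-separated parts,
            // with leading and trailing non-alphanumerics stripped.
            for (String piece : whole.split("-")) {
                String part = piece.replaceAll("^[^\\p{L}\\p{N}]+|[^\\p{L}\\p{N}]+$", "");
                if (!part.isEmpty() && !part.equals(whole) && !stopWords.contains(part)) {
                    out.add(part);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(
                tokens("I-Pity-da-fool who\nhas a 1\" ladder said MR.T", SAMPLE_STOP_WORDS));
    }
}
```

With the assumed stop set, the sample sentence should produce the same seven tokens in the same order. In a real analyzer, this logic would live in a custom TokenFilter, and the extra parts would normally be emitted at the same position as the whole token.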
---
This keeps popping back into my head. A little more info for you. Bear
in mind I have not dealt with the QueryParser before.
Use the approach I gave last time. Pull out the QueryParser and change
either QueryParser.jj or QueryParser.java...you may be able to just
change QueryParser.java and av
The reason has already been posted in response to my initial inquiry.
This problem bugged me last month. I did not know the particulars but I
assumed it was a bug.
I inquired on the mailing list and someone responded with the following
link:
Highligter fails to include non-token at end of st
[EMAIL PROTECTED] told me that the highlighter ALWAYS does this
under certain conditions. In my case, it is when the string ends with
. He knew why but I did not. I just fixed it in my code by
putting things back.
On Aug 16, 2006, at 3:17 AM, Ramesh Salla wrote:
which version of Lucene a
On Fri, Aug 11, 2006 at 02:39:19PM +0300, Eugeny N Dzhurinsky wrote:
> On Fri, Aug 11, 2006 at 01:22:26PM +0200, Simon Willnauer wrote:
> > Sure you can do this.
> > You index your document with the keywords assigned to the document and
> > search with and Boolean Query to get all document having t
Oh rats. Thunderbird ate the indenting. The two examples should be:
multipart/alternative
text/plain
multipart/related
text/html
image/gif
image/gif
application/msword
and
multipart/related
text/html
image/
lude wrote:
You also mentioned indexing each bodypart ("attachment") separately.
Why?
To my mind, there is no use case where it makes sense to search a
particular bodypart
I will give you the use case:
[snip]
3.) The result list would show this:
1. mail-1 'subject'
'Abstract of the messa
I searched the mail list archives for an answer to that question;
The closest (and perhaps the only) thread in this regard that I found is:
http://www.gossamer-threads.com/lists/lucene/java-user/9928
So the answer was "No", but that was back in mid-2004 (2 years ago).
Is there a solution
Hi Johan,
thanks again for the many words and explanations!
You also mentioned indexing each bodypart ("attachment") separately.
Why?
To my mind, there is no use case where it makes sense to search a
particular bodypart
I will give you the use case:
1.) User searches for "abcd"
2.) Luc
Hi all,
Here is the scenario: I'm trying to index a database and I'd like to put
"the counts of all related tables" in the index.
The first option is to count against the DB and then store the data in the
documents, but I don't think that is a real option because of the huge amount of
structured-data doesnt
Hi, my Lucene index update via the file server is eating up a huge
amount of heap memory, and once indexing is complete the memory is not
being returned. RAM drive is enabled.
Does anyone know if this might be a problem with the amount of memory being
allocated to the RAM Directory?
Can y
lude wrote:
Hi John,
thanks for the detailed answer.
You wrote:
If you're indexing a
multipart/alternative bodypart then index all the MIME headers, but only
index the content of the *first* bodypart.
Does this mean you index just the first file-attachment?
What do you advise, if you have to
Hello Sameer,
what about this:
- during indexing, use the StandardAnalyzer without stopwords
- during the search, use 2 different Analyzers - one with and one without
stopwords. To choose between them, first check whether the user
has typed quotes inside her query string.
# If so, look whether there
-Original message-
From: Marcus Falck [mailto:[EMAIL PROTECTED]
Sent: 16 August 2006 10:47
To: java-user@lucene.apache.org
Subject: addIndexes method without the merge
Hi,
In my search engine (built on top of the Lucene 1.4.3 API) I'm using one
RAMDir as a live indexin
Hi Dejan,
how do you query for email- and(!) attachment-documents,
if you just want to present one hit per email (even if the search term
matches in the email- and(!) in the corresponding attachment-document)?
Thanks
lude
On 8/15/06, Dejan Nenov <[EMAIL PROTECTED]> wrote:
The approach we I fi
Hi John,
thanks for the detailed answer.
You wrote:
If you're indexing a
multipart/alternative bodypart then index all the MIME headers, but only
index the content of the *first* bodypart.
Does this mean you index just the first file-attachment?
What do you advise, if you have to index multip
Hi,
In my search engine (built on top of the Lucene 1.4.3 API) I'm using one
RAMDir as a live indexing buffer and one FSDir as the main persisted
index.
When the RAMDir buffer has filled up, I add those documents to
the FSDir and clear the RAMDir.
At first I was iterating through
Go to the mailing list archive and you will find a lot of info there.
I can briefly outline the procedure for now.
Get the Highlighter jar from the lucene-sandbox and see the examples in the downloaded folder.
Get the search results from the Hits and pass this string to the Highlighter class.
if you still
Which version of Lucene and which version of the Highlighter do you use?
I don't see any such issues.
I think I can resolve the issue if you can pass on some info on how you are trying to get the data and highlight things.
On Sat, 2006-08-12 at 00:05 +, Ronnie Kolehmainen wrote:
There is