Indexing and searching with StandardAnalyzer

2006-05-08 Thread Bob Cheung
Using StandardAnalyzer, I was able to index a document containing the string "co_cc" (without quotes) but I couldn't search for it. Using Luke, I was able to see "co_cc" was indexed. Using Luke to search, I was not able to find any hit using StandardAnalyzer. However, if I use KeywordAnalyzer to

Stefan Raspl/Germany/IBM is out of the office.

2006-05-08 Thread Stefan Raspl
I will be out of the office starting 05/09/2006 and will not return until 05/15/2006. I will respond to your message when I return.

Re: Scoring without floating point calculations

2006-05-08 Thread Otis Gospodnetic
Ah, this is pretty disheartening. Regardless, I'm about to dive into this, so if you have any tips or experiences to share, I'm all eyeballs. Otis - Original Message From: Ken Krugler <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Friday, April 28, 2006 7:34:16 PM Subject: R

RE: Encryption

2006-05-08 Thread Colin Young
Rather than concentrating on the specific mechanisms, could you elaborate a bit on the overall goals and features of the system? Who is providing the documents? What are the threats you are guarding against? How is the system going to be used? Colin -Original Message- From: George Washin

spell checker across multiple fields

2006-05-08 Thread lostzen
Hello, I am trying to use the SpellChecker to suggest more popular search terms. It works great when using only one field, but what is the best way to make it work across multiple fields? I've already created my own version of a Dictionary (based on LuceneDictionary) which provides a dictionary o

Re: Subject indexing and searching documents with multiple languages

2006-05-08 Thread karl wettin
On Mon, 2006-05-08 at 16:08 +0200, karl wettin wrote: > On Mon, 2006-05-08 at 08:34 -0400, Grant Ingersoll wrote: > > This seems to be necessary because the IndexWriter takes an analyzer > > as parameter. Thus we can pass the English documents to the > > IndexWriter created with the English analyze

Re: Subject indexing and searching documents with multiple languages

2006-05-08 Thread karl wettin
On Mon, 2006-05-08 at 08:34 -0400, Grant Ingersoll wrote: > This seems to be necessary because the IndexWriter takes an analyzer > as parameter. Thus we can pass the English documents to the > IndexWriter created with the English analyzer and so on. /** * Adds a document to this index, using the

Re: Subject indexing and searching documents with multiple languages

2006-05-08 Thread Grant Ingersoll
We wrote our own MultiSearcher type class that manages this problem. It takes in a query in the user's native language and then feeds it to the searcher for that language, which uses a machine translation component to create a query for that index using that language's Analyzer. -Grant [EMAI

Subject indexing and searching documents with multiple languages

2006-05-08 Thread pbatcoi
Hello, we need to index and search documents in multiple languages. Our current approach is: Determine the language of each document before passing it to Lucene and use a Lucene index for each language. This seems to be necessary because the IndexWriter takes an analyzer as parameter. Thus we c

Re: The best Chinese Analyzer?

2006-05-08 Thread Ray Tsang
Hi Bob, In short, I use a slightly modified ChineseAnalyzer to index Chinese text. They differ mainly in the way they tokenize the text. StandardAnalyzer is intended for use with Latin-based languages, in which each word is composed of multiple characters and each word is separated by special markers such