Thanks Otis, I created a custom analyzer and it's working fine.
Here's my analyzer, for reference:

    import java.io.Reader;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.KeywordTokenizer;
    import org.apache.lucene.analysis.LowerCaseFilter;
    import org.apache.lucene.analysis.TokenStream;

    public class KeywordLowerAnalyzer extends Analyzer {

        public KeywordLowerAnalyzer() {
        }

        public TokenStream tokenStream(String fieldName, Reader reader) {
            // Emit the whole field value as a single token...
            TokenStream result = new KeywordTokenizer(reader);
            // ...and lowercase that one token.
            result = new LowerCaseFilter(result);
            return result;
        }
    }

Cheers

Andre

On Tue, Aug 12, 2008 at 9:22 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> Perhaps you can lowercase the text prior to passing it to Lucene?
> Or perhaps you can have a custom Analyzer that treats the whole input as 1
> Token (see KeywordAnalyzer --
> http://lucene.apache.org/java/2_3_2/api/org/apache/lucene/analysis/KeywordAnalyzer.html
> ), but also includes LowerCaseFilter that's applied to that 1 Token.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
>> From: Andre Rubin <[EMAIL PROTECTED]>
>> To: java-user@lucene.apache.org
>> Sent: Wednesday, August 13, 2008 12:15:25 AM
>> Subject: Re: Searching Tokenized x Un_tokenized
>>
>> Thanks Otis, that was exactly what was happening.
>>
>> 1) According to here:
>> http://wiki.apache.org/lucene-java/LuceneFAQ#head-133cf44dd3dff3680c96c1316a663e881eeac35a
>> wildcard queries are not passed through the Analyzer, but they are
>> always set to lower case.
>>
>> 2) And according to here:
>> http://wiki.apache.org/lucene-java/LuceneFAQ#head-0f374b0fe1483c90fe7d6f2c44472d10961ba63c
>> un_tokenized fields are not passed through the Analyzer either.
>>
>> So by creating an untokenized field and setting
>> parser.setLowercaseExpandedTerms(false), I managed to make my use case
>> work in a case-sensitive manner. That is, 'u*' returns 'usa' and 'U*'
>> returns 'USA'.
>>
>> The thing is, how do I make this case-insensitive? I can make #1 work by
>> setting it to lowercase: parser.setLowercaseExpandedTerms(true). But
>> how do I make #2 work, that is, apply a LowerCaseFilter to an
>> untokenized field?
>>
>> Thanks,
>>
>> Andre
>>
>> On Tue, Aug 12, 2008 at 7:57 PM, Otis Gospodnetic wrote:
>> > Andre,
>> >
>> > Check the Lucene FAQ, there is an entry about wildcards and analysis
>> > (which doesn't take place for wildcard queries). Could that be it?
>> >
>> > Otis
>> > --
>> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> >
>> > ----- Original Message ----
>> >> From: Andre Rubin
>> >> To: java-user@lucene.apache.org
>> >> Sent: Tuesday, August 12, 2008 5:30:47 PM
>> >> Subject: Re: Searching Tokenized x Un_tokenized
>> >>
>> >> My searches on my tokenized String field were working properly. I
>> >> switched the field to un_tokenized, rebuilt the index, and now my
>> >> searches only return strings that match the query string in lower
>> >> case.
>> >>
>> >> For example, searching for 'us*':
>> >>
>> >> The tokenized field version would find 'USA' and 'usa'
>> >>
>> >> The untokenized field version only finds 'usa'
>> >>
>> >> I'm using the StandardAnalyzer in both cases.
>> >>
>> >> Thanks
>> >>
>> >> Andre
>> >>
>> >> On Thu, Aug 7, 2008 at 8:16 PM, Otis Gospodnetic wrote:
>> >> > Hi,
>> >> >
>> >> > Perhaps you can give some examples. Yes, untokenized means "full
>> >> > string" - it requires an "exact match".
>> >> >
>> >> > Otis
>> >> > --
>> >> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> >> >
>> >> > ----- Original Message ----
>> >> >> From: Andre Rubin
>> >> >> To: java-user@lucene.apache.org
>> >> >> Sent: Thursday, August 7, 2008 8:04:04 PM
>> >> >> Subject: Searching Tokenized x Un_tokenized
>> >> >>
>> >> >> Hi all,
>> >> >>
>> >> >> When I switched a String field from tokenized to untokenized, some
>> >> >> searches started not returning some obvious values. Am I missing
>> >> >> something on querying untokenized fields?
>> >> >> Another question: do I need an Analyzer if my search is on an
>> >> >> untokenized field? Wouldn't the search be based on the full
>> >> >> String rather than its tokens?
>> >> >>
>> >> >> Thanks,
>> >> >>
>> >> >> Andre
>> >> >>
>> >> >> ---------------------------------------------------------------------
>> >> >> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> >> >> For additional commands, e-mail: [EMAIL PROTECTED]
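
For anyone reading this thread later, here is a minimal plain-Java sketch of why the fix works. It is only a model of the behaviour described above, not Lucene code: the class name KeywordLowerModel, the method wildcardSearch, and its two boolean flags are hypothetical names made up for illustration. The first flag stands in for indexing through a keyword-plus-lowercase analyzer, the second for parser.setLowercaseExpandedTerms(...), and only trailing-wildcard queries like 'us*' are handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Plain-Java model of the thread's behaviour; hypothetical names, not Lucene API. */
final class KeywordLowerModel {

    /**
     * Models how a field value is indexed as one whole term.
     * lowercaseAtIndexTime = true stands in for KeywordTokenizer + LowerCaseFilter;
     * false stands in for an untokenized field, which is stored verbatim.
     */
    static String indexTerm(String fieldValue, boolean lowercaseAtIndexTime) {
        return lowercaseAtIndexTime ? fieldValue.toLowerCase(Locale.ROOT) : fieldValue;
    }

    /**
     * Models a trailing-wildcard query such as "us*" against the indexed terms.
     * lowercaseExpandedTerms stands in for the query parser setting of the same name:
     * the wildcard term is not analyzed, only (optionally) lowercased.
     */
    static List<String> wildcardSearch(List<String> fieldValues, String query,
                                       boolean lowercaseAtIndexTime,
                                       boolean lowercaseExpandedTerms) {
        // Strip the trailing '*' to get the prefix to match.
        String prefix = query.endsWith("*") ? query.substring(0, query.length() - 1) : query;
        if (lowercaseExpandedTerms) {
            prefix = prefix.toLowerCase(Locale.ROOT);
        }
        List<String> hits = new ArrayList<>();
        for (String value : fieldValues) {
            // The whole value is a single term, so a prefix match is a startsWith check.
            if (indexTerm(value, lowercaseAtIndexTime).startsWith(prefix)) {
                hits.add(value);
            }
        }
        return hits;
    }
}
```

In this model, with the values 'USA' and 'usa' indexed verbatim and the query lowercased, 'us*' can only reach 'usa', which is the original symptom. Lowercasing on both sides (both flags true) makes 'us*', 'US*' and 'Us*' all reach both values, which is what the KeywordLowerAnalyzer above achieves when combined with setLowercaseExpandedTerms(true).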