RE: Too Many Open files Exception

2007-07-05 Thread Van Nguyen
Ok... after spending time looking at the code... I see that a method is not closing a TokenStream in one of the classes (a class that is instantiated quite often) - I would imagine this could quite possibly be the culprit? Van -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTEC

RE: Too Many Open files Exception

2007-07-05 Thread Van Nguyen
: so ... what is your ulimit set to? Issuing a "limit descriptors", I see that I have it set to 1024 : how many files are in your index directory? In the directory that I'm getting this particular error: 3 I have 24 different index directories... I think the most I saw at that particular time i

Too Many Open files Exception

2007-07-03 Thread Van Nguyen
I am getting a "Too Many Open Files" Exception. I've read the FAQ about lowering the merge factor (currently set to 25), issuing a ulimit -n , etc... but I am still getting the "Too Many Open Files" Exception (yes... I'm making sure I close all writer/searchers/reader and I only have one open at a

RE: Question about applications using different versions of Lucene

2007-02-12 Thread Van Nguyen
>Am I correct in understanding that you have two separate applications: one >reading the index and one writing the index and you only upgraded lucene >for the application that writes the index? Yes >If so, this is not a supported compatibility situation, if the wiki were >up right now there is a

Question about applications using different versions of Lucene

2007-02-12 Thread Van Nguyen
I have two applications that share some of the same Lucene indexes. I recently upgraded the Lucene-core.jar from v2.0 to a nightly build (Feb. 04, 2006 - I was looking for the IndexWriter class that allows you to merge indexes w/o optimizing). Now I notice the index is a little different: P

is there a Query for this?

2007-01-17 Thread Van Nguyen
Just wondering if there was a query for this: Let's say I want to query: "white hard hat". Is there a query that will build something like this: (+field:white +field:hard field:hat) (+field:white field:hard +field:hat) (field:white +field:hard +field:hat) In other words... the query ne
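
The clause pattern in the question above can be produced mechanically. The following is only a sketch in plain Java that builds the query string; it assumes the intent is "every term required except one, for each choice of the optional term", which is a guess because the post is cut off. The class name `AllButOneQuery` and the method `build` are hypothetical, and no Lucene API is used.

```java
import java.util.ArrayList;
import java.util.List;

public class AllButOneQuery {

    // Builds one parenthesized group per term. In each group exactly one
    // term is optional (no '+') and every other term is required ('+').
    static String build(String field, String... terms) {
        List<String> groups = new ArrayList<>();
        // Walk from the last term to the first so the group order matches
        // the order written in the post.
        for (int optional = terms.length - 1; optional >= 0; optional--) {
            StringBuilder group = new StringBuilder("(");
            for (int i = 0; i < terms.length; i++) {
                if (i > 0) {
                    group.append(' ');
                }
                if (i != optional) {
                    group.append('+');
                }
                group.append(field).append(':').append(terms[i]);
            }
            groups.add(group.append(')').toString());
        }
        return String.join(" ", groups);
    }

    public static void main(String[] args) {
        System.out.println(build("field", "white", "hard", "hat"));
    }
}
```

The resulting string would still have to be handed to a query parser or translated into query objects by hand; that step is not shown here.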

RE: Modifying StandardAnalyzer

2007-01-12 Thread Van Nguyen
what you need? On 1/11/07, Van Nguyen <[EMAIL PROTECTED]> wrote: > > Hi, > > > > I need to modify the StandardAnalyzer so that it will tokenize zip codes > that look like this: > > > > 92626-2646 > > > > I think the part I need to modify is in h

Modifying StandardAnalyzer

2007-01-11 Thread Van Nguyen
Hi, I need to modify the StandardAnalyzer so that it will tokenize zip codes that look like this: 92626-2646 I think the part I need to modify is in here - specifically: // floating point, serial, model numbers, ip addresses, etc. // every other segment must have at least

RE: JAVA JVM Question

2006-12-21 Thread Van Nguyen
. best regards simon On 12/20/06, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: > Are you using 2.1-dev version of Lucene? Try the latest nightly build, it has a fix for a certain OOM bug (see LUCENE-754). > > Otis > > - Original Message > From: Van Nguyen <[EMAIL

JAVA JVM Question

2006-12-20 Thread Van Nguyen
I have an index that's approximately 875MB. I'm using JBoss Application Server 4.04 w/ Apache HTTP Server 2.2. My min/max JVM size is: 128MB/512MB. On initial startup, everything works fine. I'm able to search (although it takes a while doing the first search because it's loading the index into

Filter question

2006-12-08 Thread Van Nguyen
I have a query that uses a filter... looking something like this: BooleanQuery filterQuery = new BooleanQuery(); // add criteria QueryFilter qf = new QueryFilter(filterQuery); CachingWrapperFilter cwf = new CachingWrapperFilter(qf);

any ides on this type of analyzer?

2006-11-30 Thread Van Nguyen
I've been trying to brainstorm on this but could not figure out a way to go about this. Let's say I'm searching for "batman". I want results that include: batman bat man bat-man etc. or if I search screwdriver, I would want results to include: screwdriver screw drivers etc.

StandardAnalyzer question

2006-09-29 Thread Van Nguyen
I have a field in my index that is being tokenized using the StandardAnalyzer. Let’s say that field was: TOOLS FOR TRAILER The word “FOR” is a stop word so it is not being indexed (based on the StandardAnalyzer). When someone types in TOOLS FOR TRAILER, I have a BooleanQuery s

RE: Caused by: java.io.IOException: The handle is invalid

2006-09-26 Thread Van Nguyen
where during program. Using the client version, it builds w/o any errors. Van -Original Message- From: Michael McCandless [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 26, 2006 6:38 AM To: java-user@lucene.apache.org Subject: Re: Caused by: java.io.IOException: The handle is invalid

Caused by: java.io.IOException: The handle is invalid

2006-09-25 Thread Van Nguyen
I’m getting an error while trying to build my index:   Caused by: java.io.IOException: The handle is invalid   at java.io.RandomAccessFile.close0(Native Method)   at java.io.RandomAccessFile.close(RandomAccessFile.java:532)   at org.apache.lucene.store.FSIndexOu

RE: is there such an analyzer?

2006-08-17 Thread Van Nguyen
I'd use the SynonymAnalyzer from Lucene in Action as a model, starting around page 129. I really doubt that there's much you can expect Lucene to do for you for this specialized kind of tokenizing. Erick On 8/16/06, Van Nguyen <[EMAIL PROTECTED]> wrote: > > I'

is there such an analyzer?

2006-08-16 Thread Van Nguyen
I'm looking for a cross between a WhitespaceAnalyzer and StandardAnalyzer.  If I pass in:   I-Pity-da-fool who has a 1" ladder said MR.T   I want it to index these:   i-pity-da-fool pity fool 1" 1 ladder mr.t United Rentals Consider it done.™ 800-UR-RENTS unitedrentals.com ---

RE: 7GB index taking forever to return hits

2006-08-15 Thread Van Nguyen
lin <[EMAIL PROTECTED]> wrote: > > To avoid "TooManyClauses", you can try Filter instead of Query. But > that will be slower. > From what I see, there are so many keys that match your query, > it will be tough for Lucene. > > On 8/14/06, Van Nguye

RE: 7GB index taking forever to return hits

2006-08-15 Thread Van Nguyen
-user@lucene.apache.org Subject: RE: 7GB index taking forever to return hits Sounds like you want to tokenise CONTENTS, if you are not already doing so. Then you could simply have: +CONTENTS:white +CONTENTS:hard +CONTENTS:hat -Original Message- From: Van Nguyen [mailto:[EMAIL PROTECTED] Sent: 15 A

RE: 7GB index taking forever to return hits

2006-08-14 Thread Van Nguyen
org Subject: Re: 7GB index taking forever to return hits 2GB limitation only exists when you want to put them to memory in 32bits box. Our index size is larger than 13 giga bytes, and it works fine. I think it must be something error in your design. You can use Luke to see what happened in you

7GB index taking forever to return hits

2006-08-14 Thread Van Nguyen
Hi,   I have a 7GB index (about 45 fields per document X roughly 5.5 million docs) running on a Windows 2003 32bit machine (dual proc, 2GB memory).  The index is optimized.  Performing a search on this index will just “hang” when performing the search (wild card query with a sort).  At fi

Question regarding URL encoding

2006-07-17 Thread Van Nguyen
I'm trying to search my index using this search phrase: 1" That returns zero search results and throws a ParseException: Lexical error at line... I can see that 1" is part of that particular document by searching that same document using a different search term. How should the Lucene in

how to find out if two fields are identical?

2006-07-12 Thread Van Nguyen
Is there a way to compare the values of two fields to see if they are the same? Let's say we have an index with these fields: ID:2 childID: 7 parentID: 0 ID:3 childID: 6 parentID: 5

RE: RangeQuery question?

2006-07-12 Thread Van Nguyen
Exactly what I was looking for. Thanks! -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 12, 2006 12:47 AM To: java-user@lucene.apache.org Subject: Re: RangeQuery question? 1) RangeQuery is the devil, don't use it. If I weren't so lazy I would c

RangeQuery question?

2006-07-11 Thread Van Nguyen
Is there a RangeQuery equivalent that can query date range on two different fields? Term startTerm = new Term("startDate", "20060710"); Term endTerm = new Term("endDate", "20060711"); RangeQuery q = new RangeQuery(startTerm, endTerm, true);

question regarding Field.Index.UN_TOKENZED

2006-07-10 Thread Van Nguyen
I'm storing a field in an index with that option (Field.Index.UN_TOKENIZED). The String that is being stored is: NORTH SAFETY PRODUCT (all uppercase) When I try a wildcard query against that field, it only produces results if the query term is capitalized. I'm using the StandardAnalyz

RE: BooleanQuery question

2006-07-10 Thread Van Nguyen
That worked... thanks! -Original Message- From: Michael D. Curtin [mailto:[EMAIL PROTECTED] Sent: Thursday, July 06, 2006 1:04 PM To: java-user@lucene.apache.org Subject: Re: BooleanQuery question Van Nguyen wrote: > I just want results that have: > > ID: 1234 OR 234

BooleanQuery question

2006-07-06 Thread Van Nguyen
I have a BooleanQuery that looks like this: BooleanQuery query = new BooleanQuery(); TermQuery term1 = new TermQuery(new Term(ID, "1234")); TermQuery term2 = new TermQuery(new Term(ID, "2344")); TermQuery term3 = new TermQuery(new Term(ID, "2323")); TermQuery termLocation = new TermQuery

RE: Use one or more indexes?

2006-06-14 Thread Van Nguyen
I have a question in regards to the same topic: If I have three different database queries, should I just create a separate index for each query? Or should I just add all the results I get back from each of the queries into one big index? Will there be any issues with documents having different nu

updating index - web application

2006-06-12 Thread Van Nguyen
I've been playing around with Lucene for a while now. I'm pretty comfortable with creating an index and searching against it. Up until now, I've been using the LuceneIndexAccessor package contributed by Maik Schreiber and that's working well for me. Now the next obstacle is to figure out wh

RE: question with spellchecker

2006-06-12 Thread Van Nguyen
I'll experiment with both. Thanks... -Original Message- From: mark harwood [mailto:[EMAIL PROTECTED] Sent: Wednesday, June 07, 2006 2:16 AM To: java-user@lucene.apache.org Subject: Re: question with spellchecker I think the problem in your particular example is the suggestion software h

question with spellchecker

2006-06-06 Thread Van Nguyen
I'm implementing a spellchecker in my search and have a question. After creating the index and spellchecker index, I pass in the word "ducted tape" to search (I am expecting "duct tape" back). I've played around with boosting the prefixes and suffixes, setting the accuracy, passing in an Inde

fuzzyquery question

2006-05-31 Thread Van Nguyen
I have a question regarding the results I get back from a fuzzyquery. If I were to do a fuzzy search on: Classic series Should it come back with a result like: Standard Series Non Vented Hat - Class E&G If I do a search on: Clssic Series it will return the same results I get from a non-

RE: sorting issues

2006-05-23 Thread Van Nguyen
I was expecting it to be sorted alphabetically by a field I think I may have figured out my own question. I was tokenizing the field I wanted to sort. Changed it so that it's not tokenizing that field and I'm getting the results that I was expecting. Thanks, Van Nguyen Wynne Sy

sorting issues

2006-05-23 Thread Van Nguyen
Does anyone have any sorting issues in lucene? When lucene is returning results from my query, I get results similar to this: E.D. BULLARD E.D. BULLARD MINE SAFETY APPL MSA NORTH SAFETY PRODUCT NORTH SAFETY PRODUCT MINE SAFETY APPL MSA MINE SAFETY APPL MSA NORTH SAFETY PRODUCT ... Van This co

incremental updates

2006-05-22 Thread Van Nguyen
I'm pretty new to lucene and was wondering if there are any resources on how to do incremental updates in lucene. Thanks! Van Nguyen Wynne Systems, Inc. 19800 MacArthur Blvd., Suite 900 Irvine, CA 92612-2421 949.224.6300 ext 223 949.225.6540 (fax) 866.901.9284 (toll-free) www.wynnesystem

using MultiFieldQueryParser and WildcardQuery?

2006-05-18 Thread Van Nguyen
= queryParser.parse(line); Is there a way to create a query that does this: (+description_short:white* +description_short:hard* +description_short:hat*) (+description_long:white* +description_long:hard* +description_long:hat*) Van Nguyen Wynne Systems, Inc. 19800 MacArthur Blvd., Suite 9
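
One way to get the per-field wildcard groups asked about above is to assemble the query string before parsing it. This is only a sketch in plain Java under that assumption; the class `PerFieldWildcardQuery` and the method `build` are hypothetical names, and it does not use MultiFieldQueryParser or any other Lucene class.

```java
public class PerFieldWildcardQuery {

    // For each field, emits one parenthesized group in which every term is
    // required ('+') and carries a trailing wildcard ('*').
    static String build(String[] fields, String[] terms) {
        StringBuilder query = new StringBuilder();
        for (int f = 0; f < fields.length; f++) {
            if (f > 0) {
                query.append(' ');
            }
            query.append('(');
            for (int t = 0; t < terms.length; t++) {
                if (t > 0) {
                    query.append(' ');
                }
                query.append('+')
                     .append(fields[f])
                     .append(':')
                     .append(terms[t])
                     .append('*');
            }
            query.append(')');
        }
        return query.toString();
    }

    public static void main(String[] args) {
        String[] fields = {"description_short", "description_long"};
        // Splitting the user's input on whitespace is an assumption.
        String[] terms = "white hard hat".split("\\s+");
        System.out.println(build(fields, terms));
    }
}
```

Terms containing query-syntax characters would need escaping before being placed in the string, which this sketch leaves out.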