date:20051017

StopWords -- File Suffix

2005-10-17 Thread tcorbet

I thought that by using a StandardAnalyzer with a StopWord list that is a merge of the ENGLISH_STOP_WORDS and a handful of additions that I have provided -- additions which include the most common file suffixes [.txt, .xml, .doc, etc.] -- ought to eliminate any occurrence of those terms in the resu

Re: need help for generating Query String

2005-10-17 Thread Koji Sekiguchi

Hi, In a program I have indexed 10 files. When I do a search using the query "contents:java", it returns 2 documents. But when I give "-contents:java", it returns an empty result set. Does anyone know what the right query string is for this? I.e., to retrieve all documents that do no

need help for generating Query String

2005-10-17 Thread jibu mathew

Hi all, I need urgent help for the following issues. What is the query string to retrieve all the documents indexed (something similar to *.*)? In a program I have indexed 10 files. When I do a search using the query "contents:java", it will return 2 documents. But when I give "-contents:java
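The empty result for "-contents:java" fits a simple set model: a prohibited clause only removes documents from a set that some other clause has already selected, so a query made of nothing but a prohibited clause has nothing to subtract from. Below is a minimal sketch of that idea in plain Java, assuming a toy in-memory index. The class `NegativeQuerySketch` and its methods are hypothetical names for illustration and are not Lucene API.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Toy inverted index (term -> document ids). Hypothetical, not Lucene API. */
final class NegativeQuerySketch {
    private final int docCount;
    private final Map<String, BitSet> postings = new HashMap<>();

    NegativeQuerySketch(int docCount) {
        this.docCount = docCount;
    }

    /** Records that the given document contains the given term. */
    void add(int docId, String term) {
        postings.computeIfAbsent(term, t -> new BitSet(docCount)).set(docId);
    }

    /** Documents that contain the term (a copy, so callers may modify it). */
    BitSet matching(String term) {
        BitSet hits = postings.get(term);
        return hits == null ? new BitSet(docCount) : (BitSet) hits.clone();
    }

    /** Every indexed document: the "match all" set the question asks about. */
    BitSet all() {
        BitSet everything = new BitSet(docCount);
        everything.set(0, docCount);
        return everything;
    }

    /** Documents that do NOT contain the term: all documents minus the matches. */
    BitSet notMatching(String term) {
        BitSet result = all();
        result.andNot(matching(term));
        return result;
    }

    public static void main(String[] args) {
        NegativeQuerySketch index = new NegativeQuerySketch(10);
        index.add(3, "java");
        index.add(7, "java");
        System.out.println(index.matching("java").cardinality());    // prints 2
        System.out.println(index.notMatching("java").cardinality()); // prints 8
    }
}
```

In other words, the negative query needs an explicit "all documents" set to start from. How that set is expressed in a real query string depends on the Lucene version in use.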

Re: Too many clauses

2005-10-17 Thread Chris Hostetter

: To circumvent it, here are a few options that I have thought of:
: 1. Chunk it up:
:    a. Create a filter based on a query that has a maximum of 1024.
:    b. Get its bits.
:    c. Get the next 1024 blocked skus and create a filter out of it and get its bits.
:    d. AND the two BitSets.
:
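The chunking steps quoted above can be sketched in plain Java with `java.util.BitSet`. This is only an illustration under stated assumptions: each document carries a single sku, the hypothetical `allowedBits` method stands in for running one bounded query and reading its filter bits, and none of the names below are Lucene API.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of "chunk the blocked skus, then AND the bit sets". Hypothetical names. */
final class ChunkedExclusionSketch {

    /** Stand-in for one bounded query: sets a bit for each doc whose sku is NOT in the chunk. */
    static BitSet allowedBits(String[] docSkus, Set<String> blockedChunk) {
        BitSet bits = new BitSet(docSkus.length);
        for (int doc = 0; doc < docSkus.length; doc++) {
            if (!blockedChunk.contains(docSkus[doc])) {
                bits.set(doc);
            }
        }
        return bits;
    }

    /** Processes the blocked skus chunk by chunk and ANDs the per-chunk results. */
    static BitSet allowedDocs(String[] docSkus, List<String> blocked, int chunkSize) {
        BitSet result = new BitSet(docSkus.length);
        result.set(0, docSkus.length); // start with every document allowed
        for (int start = 0; start < blocked.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, blocked.size());
            Set<String> chunk = new HashSet<>(blocked.subList(start, end));
            result.and(allowedBits(docSkus, chunk));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] docSkus = {"a", "b", "c", "d", "e"};
        List<String> blocked = Arrays.asList("b", "d", "e");
        // A chunk size of 2 forces two passes; a real limit would be 1024.
        System.out.println(allowedDocs(docSkus, blocked, 2)); // prints {0, 2}
    }
}
```

ANDing works here because each chunk yields the documents it still allows. A document survives only if no chunk blocks it.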

RE: Too many clauses

2005-10-17 Thread Sharma, Siddharth

I thought of that but I had that listed as a last fallback option because I was not sure what it meant in terms of performance since I am a newbie to Lucene. So if I bump up my heap (I assume that's what you are referring to when you say java pool) it'll be ok? Are there metrics around this? At x

RE: Too many clauses

2005-10-17 Thread Aigner, Thomas

Another way around it is to increase the max clause count. // Setting the clause count: BooleanQuery.setMaxClauseCount(int); You can use maxint or some smaller number. When I set this high, I have had to set the java pool higher for memory as well. Tom -Original Message- From: Sharma, Siddh

Re: Clustering with Lucene

2005-10-17 Thread Stanislaw Osinski

Hi Joe, I'm one of the Carrot2 developers and I have good news for you :) The example of using Carrot2 with Lucene is in the Carrot2 repository on SourceForge.net (http://sourceforge.net/projects/carrot2). Please check out the "carrot2" module (http://cvs.sourceforge.net/viewcvs.py/carrot2/carrot2/)

Too many clauses

2005-10-17 Thread Sharma, Siddharth

Query: caught a class org.apache.lucene.queryParser.ParseException with message: Too many boolean clauses I realize why this is happening (the 1024 clauses limit for BooleanQuery). My question is more design related. During customer registration, the customer defines a set of skus/products that

Re: Lucene in Action : example code -> document-parsing framework ...

2005-10-17 Thread Patricio Galeas

Hello, first, thank you for your help! I have replaced the JAR file in the "Java Build Path" of Eclipse with the latest version (PDFBox-0.7.2.jar), but I still receive the same error message: Indexing E:\Galeas\lucene\data\pdfs\Beginning Java Server Pages.pdf Exception in thread "mai

Error with incremental load

2005-10-17 Thread Aigner, Thomas

Hi all, Was just wondering if anyone has come across this or if I'm doing something wrong here. On initial load of my index, I can close the writer and delete an entry and then update an entry, then open the writer again and go on to the next entry etc. Then while searching, everything t

RE: Lucene in Action : example code -> document-parsing framework ...

2005-10-17 Thread Ben Litchfield

In addition, the latest version (0.7.2) of PDFBox does not require log4j, so you could also upgrade to that version. Ben On Mon, 17 Oct 2005 [EMAIL PROTECTED] wrote: > Exception in thread "main" java.lang.NoClassDefFoundError: > org/apache/log4j/Logger > at org.pdfbox.pdfparser.BaseParser.(Ba

RE: Lucene in Action : example code -> document-parsing framework ...

2005-10-17 Thread n.bulthuis

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/Logger at org.pdfbox.pdfparser.BaseParser.(BaseParser.java:70) PDFBox cannot find Log4J. You can add Log4J to your classpath to fix this. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] S
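One way to confirm the diagnosis before changing the build path is to ask the JVM whether the class named in the error can be found at all. The sketch below uses only the standard `Class.forName` call. The class name `ClasspathCheck` and the method `isOnClasspath` are hypothetical helpers, not part of PDFBox or Log4J.

```java
/** Small diagnostic: reports whether a class is visible on the current classpath. */
final class ClasspathCheck {

    /** Returns true if the named class can be located without initializing it. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate and load the class, do not run static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class the stack trace above reports as missing.
        String name = "org.apache.log4j.Logger";
        System.out.println(name + " on classpath: " + isOnClasspath(name));
    }
}
```

If this reports false when run with the same classpath as the indexer, the Log4J JAR is missing from that classpath. Adding it, or moving to a PDFBox version that does not need it, are the two fixes suggested in this thread.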

Clustering with Lucene

2005-10-17 Thread msftblows

Hi All- I have seen an example using carrot2 for clustering, but have not really played with it that much. Does anyone have a good example of using clustering with Lucene...has anyone attempted to do it with carrot2 or something else? I was initially going to do a faceted search...which would

Re: Lucene in Action : example code -> document-parsing framework ...

2005-10-17 Thread msftblows

Do you have the log4j.properties file in the classpath? -Original Message- From: Patricio Galeas <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Mon, 17 Oct 2005 15:50:46 +0200 Subject: Lucene in Action : example code -> document-parsing framework ... Hi ALL, I try to run the

Lucene in Action : example code -> document-parsing framework ...

2005-10-17 Thread Patricio Galeas

Hi ALL, I am trying to run an example from the "Lucene in Action" book: Chapter 7: Parsing Common Document Formats: lia.handlingtypes.framework.FileIndexer I have downloaded all the source code from www.manning.com/hatcher2 and created a java project in Lucene 3.1. I get the following error mess

15 matches
