I thought that using a StandardAnalyzer
with a StopWord list that is a merge of the
ENGLISH_STOP_WORDS and a handful
of additions that I have provided -- additions
which include the most common file
suffixes [.txt, .xml, .doc, etc.] -- ought
to eliminate any occurrence of those
terms in the resu
Hi,
In a program I have indexed 10 files. When I do a search using the query
"contents:java", it will return 2 documents. But when I give
"-contents:java", then it will return an empty result set. Does anyone
know what the right query string is for this? I.e., to retrieve all
documents that do no
Hi all,
I need urgent help for the following issues.
What is the query string to retrieve all the documents indexed
(something similar to *.*)?
In a program I have indexed 10 files. When I do a search using the query
"contents:java", it will return 2 documents. But when I give
"-contents:java
:
: To circumvent it, here are a few options that I have thought of:
: 1. Chunk it up:
: a. Create a filter based on a query that has a maximum of 1024.
: b. Get its bits.
: c. Get the next 1024 blocked skus and create a filter out of it and get
: its bits.
: d. AND the two BitSets.
:
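The chunking steps quoted above can be sketched in plain Java with
java.util.BitSet. This is only an illustration under assumptions, not the
poster's code and not the Lucene Filter API: the class and method names are
hypothetical, documents are modelled as an array of SKU strings indexed by
doc id, and each per-chunk "filter" is assumed to mark the documents whose
SKU is NOT in that chunk of blocked SKUs, which is what makes AND-ing the
chunk results correct.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ChunkedSkuFilter {
    // Assumed chunk size: the 1024-clause BooleanQuery limit discussed in the thread.
    static final int MAX_CLAUSES = 1024;

    // Split the blocked SKUs into consecutive chunks of at most chunkSize items.
    static List<List<String>> chunk(List<String> skus, int chunkSize) {
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < skus.size(); i += chunkSize) {
            chunks.add(skus.subList(i, Math.min(i + chunkSize, skus.size())));
        }
        return chunks;
    }

    // Hypothetical stand-in for "create a filter and get its bits":
    // sets a bit for every doc whose SKU is not in this blocked chunk.
    static BitSet allowedBits(String[] docSkus, List<String> blockedChunk) {
        Set<String> blocked = new HashSet<>(blockedChunk);
        BitSet bits = new BitSet(docSkus.length);
        for (int doc = 0; doc < docSkus.length; doc++) {
            if (!blocked.contains(docSkus[doc])) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // AND the per-chunk bit sets: a doc survives only if no chunk blocks it.
    static BitSet allowedDocs(String[] docSkus, List<String> blockedSkus, int chunkSize) {
        BitSet result = new BitSet(docSkus.length);
        result.set(0, docSkus.length); // start with every document allowed
        for (List<String> blockedChunk : chunk(blockedSkus, chunkSize)) {
            result.and(allowedBits(docSkus, blockedChunk));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] docSkus = {"A", "B", "C", "D", "E"};
        List<String> blocked = Arrays.asList("B", "D", "E");
        // A chunk size of 2 is used here only to force more than one chunk.
        System.out.println(allowedDocs(docSkus, blocked, 2)); // prints {0, 2}
    }
}
```

The cost in this sketch is one pass over the documents per chunk, so the
number of chunks is the figure to watch when the blocked list grows.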
I thought of that but I had that listed as a last fallback option because I
was not sure what it meant in terms of performance since I am a newbie to
Lucene.
So if I bump up my heap (I assume that's what you are referring to when you
say java pool) it'll be ok?
Are there metrics around this?
At x
Another way around it is to increase the max clause count.
// Set the maximum clause count
BooleanQuery.setMaxClauseCount(int);
You can use Integer.MAX_VALUE or some smaller number. When I set this high, I have had
to set the java pool higher for memory as well.
Tom
-Original Message-
From: Sharma, Siddh
Hi Joe,
I'm one of the Carrot2 developers and I have good news for you :) The example of
using Carrot2 with Lucene is in the Carrot2 repository on SourceForge.net (
http://sourceforge.net/projects/carrot2). Please check out the "carrot2"
module (http://cvs.sourceforge.net/viewcvs.py/carrot2/carrot2/)
Query: caught a class org.apache.lucene.queryParser.ParseException
with message: Too many boolean clauses
I realize why this is happening (the 1024 clauses limit for BooleanQuery).
My question is more design related.
During customer registration, the customer defines a set of skus/products
that
Hello,
First, thank you for your help!
I have replaced the JAR file in the "Java Build Path" of Eclipse with
the latest version (PDFBox-0.7.2.jar), but I still receive the same
error message:
Indexing E:\Galeas\lucene\data\pdfs\Beginning Java Server Pages.pdf
Exception in thread "mai
Hi all,
Was just wondering if anyone has come across this or if I'm doing
something wrong here. On initial load of my index, I can close the
writer and delete an entry and then update an entry, then open the
writer again and go on to the next entry etc. Then while searching,
everything t
In addition, the latest version (0.7.2) of PDFBox does not require log4j,
so you could also upgrade to that version.
Ben
On Mon, 17 Oct 2005 [EMAIL PROTECTED] wrote:
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/log4j/Logger
> at org.pdfbox.pdfparser.BaseParser.(Ba
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/log4j/Logger
at org.pdfbox.pdfparser.BaseParser.(BaseParser.java:70)
PDFBox cannot find Log4J. You can add Log4J to your classpath to fix
this.
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
S
Hi All-
I have seen an example using carrot2 for clustering, but have not really played
with it that much. Does anyone have a good example of using clustering with
Lucene...has anyone attempted to do it with carrot2 or something else?
I was initially going to do a faceted search...which would
Do you have the log4j.properties file in the classpath?
-Original Message-
From: Patricio Galeas <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Mon, 17 Oct 2005 15:50:46 +0200
Subject: Lucene in Action : example code -> document-parsing framework ...
Hi ALL,
I try to run the
Hi ALL,
I am trying to run an example from the "Lucene in Action" book:
Chapter 7: Parsing Common Document Formats:
lia.handlingtypes.framework.FileIndexer
I have downloaded all the source code from www.manning.com/hatcher2
and created a Java project in Lucene 3.1.
I get the following error mess