StackTrace
java.io.IOException: read past EOF
at org.apache.lucene.store.InputStream.refill(InputStream.java:154)
at org.apache.lucene.store.InputStream.readByte(InputStream.java:43)
at org.apache.lucene.store.InputStream.readBytes(InputStream.java:57)
at org.apache.l
Is there a way to preload portions of the other files, particularly .tis,
.frq, and .prx, into memory? My total index size is roughly 4 GB and we have 2 GB
of memory in the machine... the .tii file is tiny (about 1.5 MB). Basically,
before my server starts accepting and handling queries, I'd like to loa
: where ??
:
: Please send me url..
I'm not sure if I understand your question, but have you had a chance to
take a look at all of the pages linked to from here yet?
http://lucene.apache.org/java/docs/gettingstarted.html
: -Original Message-
: From: Otis Gospodnetic [mailto:[EM
Hi Otis
Here are my specifications
--Query: simple queries (one to ten parameters) and range queries (three parameters)
--Index size: 15-20 GB
--Hardware: 2 GB RAM, 2-CPU Intel Xeon 2.4 GHz machine
--Latency: should be within a second
If you can provide me with some numerical figures on the
: entry for "apple" ? Basically I'd like to make sure that the entire
: inverted index (or as much as possible) is preloaded into memory, so if I
if you've got enough ram, and you really want everything loaded into
memory, you can always use a RAMDirectory.
even if you want your index stored
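(A minimal sketch of the RAMDirectory approach against the Lucene 1.9/2.0-era API; the index path is made up:)

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;

public class RamLoadedSearch {
    public static void main(String[] args) throws Exception {
        // copy every file of the on-disk index into memory once, at startup
        FSDirectory onDisk = FSDirectory.getDirectory("/path/to/index", false);
        RAMDirectory inMemory = new RAMDirectory(onDisk);
        IndexSearcher searcher = new IndexSearcher(inMemory);
        // ... serve queries from 'searcher' as usual ...
        searcher.close();
    }
}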
where ??
Please send me url..
amaresh
-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 01, 2006 10:04 AM
To: java-user@lucene.apache.org
Subject: Re: how to create index with particular ID
Here's an example that will work with the query parser:
Look in your index directory for a .tii file. That file is read into
RAM (if there is enough of it; if there is not, you will see an OOM error). What
Monsur was talking about is related to sorting and the warming up of FieldCache
instances. If you don't sort your results by criteria other than the
Here's an example that will work with the query parser:
title:FAQ
Otis
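(For reference, a minimal sketch of running such a fielded query through QueryParser, Lucene 1.9-era API; the default field "contents" and the analyzer choice are illustrative:)

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

public class TitleSearch {
    public static void main(String[] args) throws Exception {
        // the "title:" prefix restricts that clause to the title field,
        // regardless of the parser's default field
        QueryParser parser = new QueryParser("contents", new StandardAnalyzer());
        Query q = parser.parse("title:FAQ");
        System.out.println(q);
    }
}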
- Original Message
From: Amaresh Kumar Yadav <[EMAIL PROTECTED]>
To: "java-user@lucene.apache.org"
Sent: Wednesday, May 31, 2006 11:56:19 PM
Subject: RE: how to create index with particular ID
i want to search fo
I want to search for text in the "title" field only.
How should I specify it?
Regards..
Amaresh
-Original Message-
From: Alexey Sorokin [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 31, 2006 4:21 PM
To: java-user@lucene.apache.org
Subject: Re: how to create index with particular ID
Actu
Thanks for the advice, guys... I'm still not entirely clear on what a search
causes Lucene to do with respect to warming up/caching portions of the index
in memory.
If I warm up Lucene using a search for "apple", does Lucene load the entire
inverted index into memory, or just the part of the inde
Check this out.
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200512.mbox/[EMAIL
PROTECTED]
On 6/1/06, Monsur Hossain <[EMAIL PROTECTED]> wrote:
When Lucene first issues a query, it caches a hash of sort values (one
value per document, plus a bit more if you are sorting on strings
When Lucene first issues a query, it caches a hash of sort values (one
value per document, plus a bit more if you are sorting on strings),
which takes a while. Therefore, when our application first starts up,
we issue one query per sort type. As I understand, it doesn't matter
what the query is
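(A rough sketch of that warm-up step, Lucene 1.9-era API; the index path and the sort field names "date" and "price" are made up:)

import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.TermQuery;

public class SortWarmup {
    public static void main(String[] args) throws Exception {
        IndexSearcher searcher = new IndexSearcher("/path/to/index");
        // any query will do; what matters is the Sort, which forces the
        // FieldCache for each sort field to be built up front
        TermQuery any = new TermQuery(new Term("contents", "warmup"));
        String[] sortFields = { "date", "price" };
        for (int i = 0; i < sortFields.length; i++) {
            searcher.search(any, new Sort(sortFields[i]));
        }
        // keep 'searcher' open and reuse it for the real user queries
    }
}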
Is there a way to preload the index into memory when the process starts?
Basically I want to warm up the index before processing user queries. What
are some recommended ways to do this? Thanks.
I have a question regarding the results I get back from a fuzzy query.
If I were to do a fuzzy search on:
Classic series
Should it come back with a result like:
Standard Series Non Vented Hat - Class E&G
If I do a search on:
Clssic Series
it will return the same results I get from a non-
Hi experts,
Does Lucene do any caching of Document fields during a search?
If I perform a search and retrieve some fields in the Document hits,
then I repeat the same search, are those fields cached in memory? It
doesn't seem to be -- I'm performing several thousand unique searches
and retrievals
On Wednesday, 31 May 2006 10:12, Simon Courtenage wrote:
> I'm new to Lucene but trying it out. I've successfully installed the
> luceneweb.war for the indexHTML
> web demo, but am getting an error when tomcat tries to compile the
> results.jsp part of the demo as it
> tries to answer a search quer
When attempting to re-create my index, I receive the following error:
java.io.IOException: Cannot delete C:\CatalogCollections\Live\_2s.cfs
at org.apache.lucene.store.FSDirectory.create(FSDirectory.java:198)
at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:144)
at org.apach
See QueryParser.setFuzzyPrefixLength()
This will apply to all fields parsed by the parser and is probably
generally advisable anyway to avoid server CPU overload.
Many production apps disable fuzzy searching completely in the search
syntax for this reason.
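(A sketch of what that looks like, assuming your Lucene version has QueryParser.setFuzzyPrefixLength(); the field name and analyzer are illustrative:)

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

public class FuzzyPrefixExample {
    public static void main(String[] args) throws Exception {
        QueryParser parser = new QueryParser("cityName", new StandardAnalyzer());
        // require the first 2 characters to match exactly, so only terms
        // starting with "lo" get an edit-distance comparison
        parser.setFuzzyPrefixLength(2);
        Query q = parser.parse("london~0.8");
        System.out.println(q);
    }
}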
Hi,
thanks for the tip !! Yes, basically, I would like to reduce the number
of comparisons. Using this prefix length seems doable for my problem..
(even though I'm not 100% sure it is appropriate, this has to be
investigated)
Is there a way to use this prefix length (or something similar) on som
In our application we noticed that any time there was more than one segment
(as in, not optimized) in the index, there was a big drop in
performance. After thinking about this for a long time it didn't add up:
even if you optimize an index and then add just one job, the big drop occurs.
I tracked
Or you could try the n-gram approach with the SpellChecker (you will find it in the contrib
area).
Get the suggestSimilar() results and form your query, or even better a ConstantScoreQuery
via a Filter. It works OK.
Or if you do not have too many terms (and can spare the memory to load all terms),
you could try TernarySearch
I tried the cityName:city~0.8, and it is still not fast enough..
something around 2 seconds... to return only 2 results...
OK, so we trimmed down the search terms we actually used in the query, but I suspect what you are
seeing is the effect of having to perform edit-distance comparisons on ALL
Hi,
thanks for the tip.. However, my slowness issues do not seem to be
caused by the number of search results returned, since cityName:XX~0.8
took 2 seconds to return 2 results.
So the problem seems to be more related to scanning the index...
Thanks,
Sami Dalouche
On Tuesday, 30 May 2006 at 16:
Hi,
Compass offers me all the control Lucene does. It gives access to
the low-level Lucene API if you want it, so if you have a nice way of
optimizing things, I can have Compass adapt to that.
I tried the cityName:city~0.8, and it is still not fast enough..
something around 2 seconds... to ret
Thanks for the quick reply, Erik; subSearcher did the trick.
--Mike
On 5/26/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:
I'm running out the door, so only a quick reply... yes you can. Look
at the subSearcher(?) method - that'll give you the index. Your
application will need to keep track of
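(For anyone following along, a rough sketch of the subSearcher idea with a MultiSearcher; the two index paths and the query are made up:)

import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MultiSearcher;
import org.apache.lucene.search.Searchable;
import org.apache.lucene.search.TermQuery;

public class WhichIndex {
    public static void main(String[] args) throws Exception {
        Searchable[] parts = {
            new IndexSearcher("/path/to/indexA"),
            new IndexSearcher("/path/to/indexB")
        };
        MultiSearcher multi = new MultiSearcher(parts);
        Hits hits = multi.search(new TermQuery(new Term("contents", "lucene")));
        if (hits.length() > 0) {
            // subSearcher() maps a merged document number back to the position,
            // in the 'parts' array, of the index it came from
            int which = multi.subSearcher(hits.id(0));
            System.out.println("hit 0 came from index #" + which);
        }
    }
}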
: Question is twofold. One, here is the layout I was thinking:
my rule of thumb: if a field is going to contain less than a few dozen
bytes (i.e., a date, an email address, etc.) you might as well store it ...
it will make your life easier when looking at your results.
another important thing you
Hello,
I will try this again.
I am working on a system that will index emails and their attachments.
I have all the pieces working that parse the documents, and I am now
working on the actual indexing part. I would like to have synonym
searching as well.
The question is twofold. One, here
: public CompassHits find(String query) throws CompassException {
: return createQueryBuilder().queryString(query).toQuery().hits();
: }
: And all of these objects are pure wrappers around lucene equivalents,
: nothing more.
: 2) What I am timing is only the find call :
: -- start tim
It depends on:
- query complexity/type
- size of the index
- hardware (CPU, disk, RAM)
- acceptable latency
...
Otis
- Original Message
From: "Kinnar Kumar Sen, Noida" <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, May 31, 2006 8:22:48 AM
Subject: RE: My first quest
Mile,
Any Analyzer that uses a Tokenizer that throws out non-characters will do.
For example, take a look at SimpleAnalyzer. It uses LowerCaseTokenizer. If
you read the javadoc for LowerCaseTokenizer, I think you will see it suits you.
Otis
- Original Message
From: Mile Rosu <[EMAIL
Hello!
I am currently trying to index Latin-language documents, in which
missing letters are appended to words by using square brackets, like
this: "[divinit]atis".
Could you tell me please what would be the best practice for removing the
brackets before adding the text to the Lucene index? (In the exam
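(One simple option, in addition to the analyzer-based suggestion above, is plain string pre-processing before the text reaches Lucene at all; this is only a sketch and the field name is made up:)

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class BracketStripper {
    public static void main(String[] args) {
        String raw = "[divinit]atis";
        // drop the editorial brackets so the restored word is indexed whole
        String cleaned = raw.replaceAll("[\\[\\]]", "");  // -> "divinitatis"
        Document doc = new Document();
        doc.add(new Field("contents", cleaned, Field.Store.NO, Field.Index.TOKENIZED));
        System.out.println(cleaned);
    }
}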
Hi All
Can anyone tell me how many simultaneous queries a Lucene index can
support?
Regards and Thanks
Kinnar
Hello,
Or you can purchase the book "Lucene in Action".
You will find the table of contents and some sample chapters here: http://lucenebook.com
Have a nice day
--- Sven
On Wednesday, 31 May 2006 at 13:32:45, you wrote:
AKY> http://lucene.apache.org/java/docs/gettingstarted.html
AKY> Regards,
AKY> Amar
http://lucene.apache.org/java/docs/gettingstarted.html
Regards,
Amaresh
-Original Message-
From: saikrishna [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 31, 2006 4:52 PM
To: java-user@lucene.apache.org
Subject: Re : good link to start working on Lucene
Hi,
I am completely ne
http://lucene.apache.org/java/docs/api/overview-summary.html#overview_description
2006/5/31, saikrishna <[EMAIL PROTECTED]>:
Hi,
I am completely new to Lucene, and unfortunately I couldn't find a good
site to start working with Lucene.
Can someone suggest a good link to work
Hi,
I am completely new to Lucene, and unfortunately I couldn't find a good
site to start working with Lucene.
Can someone suggest a good link for working with Lucene (relevant
to indexing and searching)?
regards,
Saikrishna.
Actually you don't need to create a text file. Get the data from the DB and
create a Document to put in the index. At a minimum you must store the ID of the row
in the Document, or you may store doctitle and docpath too.
For each row you should do something like this:
import org.apache.lucene.document.Document;
import org.apa
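(A fuller sketch of that per-row loop, using the Lucene 1.9-era Field API; the JDBC URL, table and column names are all made up:)

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class DbIndexer {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@host:1521:SID", "user", "pass");
        IndexWriter writer = new IndexWriter("/path/to/index", new StandardAnalyzer(), true);
        Statement stmt = conn.createStatement();
        ResultSet rs = stmt.executeQuery("SELECT id, doctitle, docpath FROM documents");
        while (rs.next()) {
            Document doc = new Document();
            // store the row id so a hit can be traced back to the database row
            doc.add(new Field("id", rs.getString("id"), Field.Store.YES, Field.Index.UN_TOKENIZED));
            // the title is what gets searched, so it is tokenized
            doc.add(new Field("doctitle", rs.getString("doctitle"), Field.Store.YES, Field.Index.TOKENIZED));
            // the path is only needed for display, not for searching
            doc.add(new Field("docpath", rs.getString("docpath"), Field.Store.YES, Field.Index.NO));
            writer.addDocument(doc);
        }
        rs.close();
        stmt.close();
        conn.close();
        writer.optimize();
        writer.close();
    }
}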
Hi,
> From: Amaresh Kumar Yadav [mailto:[EMAIL PROTECTED]
>
> First create a text file which contains data(that is retrived
> by oracle query) which is stored in table.
You don't have to create a text file for indexing. You can index your data
directly:
open IndexWriter
execute
Hi All,
In fact I want to search data which is stored in a table in Oracle...
My table contains two fields, "doctitle" and "docpath": the first field (doctitle)
represents the location of a document and the second field (docpath) represents
the document's title.
I want to apply the Lucene search to doctitle.
What i
>>Actually, I am not using Lucene directly, but a
wrapper called compass
I don't know what controls it offers you then.
One option which could offer a speed up is to raise
the minimum quality match threshold above the default
of 0.5 and use a query string like this:
cityName:London~0.8
This
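(The same query built programmatically, as a sketch; the prefix length of 2 is an illustrative extra, assuming your Lucene version's FuzzyQuery accepts it:)

import org.apache.lucene.index.Term;
import org.apache.lucene.search.FuzzyQuery;

public class CityFuzzy {
    public static void main(String[] args) {
        // minimum similarity 0.8, and only compare terms sharing the first 2 characters
        FuzzyQuery q = new FuzzyQuery(new Term("cityName", "london"), 0.8f, 2);
        System.out.println(q);
    }
}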
Hi,
I'm new to Lucene but trying it out. I've successfully installed the
luceneweb.war for the indexHTML
web demo, but am getting an error when tomcat tries to compile the
results.jsp part of the demo as it
tries to answer a search query. The error message (see below) says that
the QueryPars
Hi,
1) Actually, I am not using Lucene directly, but a wrapper called
Compass. I am using the find() method of the CompassSession, whose code
is:
public CompassHits find(String query) throws CompassException {
return createQueryBuilder().queryString(query).toQuery().hits();
}
And all