I seem to say this a lot :), but, assuming your OS has a decent
filesystem cache, try reducing your JVM heapsize, using an FSDirectory
instead of RAMDirectory, and see if your filesystem cache does ok. If
you have 12GB, then you should have enough RAM to hold both the old
and new indexes during th
Hi,
I want to know how Lucene normalizes the score. I see that the Hits class has
a function to get each document's score, but I don't know how Lucene
calculates the normalized score, and "Lucene in Action" only says that it is
the normalized score of the nth top-scoring document.
--
Regards
Jiang Xing
>> Since I didn't find anything in the log from log4j I did a "kill
>> -3" on
>> > the process and found two very interesting things:
>>
>> Almost all multisearcher threads were in this state:
>>
>> "MultiSearcher thread #1" daemon prio=10 tid=0x01900960
>> nid=0x81442c waiting for moni
Ray,
The 135 qps rate was using the standard FSDirectory in 1.9.
Peter
On 1/26/06, Ray Tsang <[EMAIL PROTECTED]> wrote:
>
> Paul,
>
> Thanks for the advice! But for the 100+ queries/sec on a 32-bit
> platform, did you end up applying other patches, or use different
> FSDirectory implementations?
Paul,
Thanks for the advice! But for the 100+ queries/sec on a 32-bit
platform, did you end up applying other patches, or use different
FSDirectory implementations?
Thanks!
ray,
On 1/27/06, Peter Keegan <[EMAIL PROTECTED]> wrote:
> Ray,
>
> The short answer is that you can make Lucene blazingly
Hello,
I have a couple of instances of Lucene. I just altered one implementation and now
it's not keeping a segments file. While indexing occurs, there is a segments
file, but once it's done, there isn't. All the other indexes have one.
The problem comes when I try to update a document, it
Ray,
The short answer is that you can make Lucene blazingly fast by using advice
and design principles mentioned in this forum and of course reading 'Lucene
in Action'. For example, use a 'content' field for searching all fields (vs
multi-field search), put all your stored data in one field, under
Thanks for the info :) One last related question.
If I delete documents using an IndexReader, can I assume that the
internal document numbers of other undeleted documents (obtained using
the same IndexReader instance) will not change until I call
IndexReader.close()?
Peter,
Wow, the speedup is impressive! But may I ask what you did to
achieve 135 queries/sec prior to the JVM switch?
ray,
On 1/27/06, Peter Keegan <[EMAIL PROTECTED]> wrote:
> Correction: make that 285 qps :)
>
> On 1/26/06, Peter Keegan <[EMAIL PROTECTED]> wrote:
> >
> > I tried the AMD64-b
There is no difference in bytecode... the whole difference is just in
the underlying JVM.
-Yonik
On 1/26/06, Peter Keegan <[EMAIL PROTECTED]> wrote:
> Dumb question: does the 64-bit compiler (javac) generate different code than
> the 32-bit version, or is it just the jvm that matters? My reported
Dumb question: does the 64-bit compiler (javac) generate different code than
the 32-bit version, or is it just the jvm that matters? My reported speedups
were solely from using the 64-bit jvm with jar files from the 32-bit
compiler.
Peter
On 1/26/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>
> Ni
Nice speedup! The extra registers in 64 bit mode may have helped a little too.
-Yonik
On 1/26/06, Peter Keegan <[EMAIL PROTECTED]> wrote:
> Correction: make that 285 qps :)
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additi
Correction: make that 285 qps :)
On 1/26/06, Peter Keegan <[EMAIL PROTECTED]> wrote:
>
> I tried the AMD64-bit JVM from Sun and with MMapDirectory and I'm now
> getting 250 queries/sec and excellent cpu utilization (equal concurrency on
> all cpus)!! Yonik, thanks for the pointer to the 64-bit jvm
I tried the AMD64-bit JVM from Sun and with MMapDirectory and I'm now
getting 250 queries/sec and excellent cpu utilization (equal concurrency on
all cpus)!! Yonik, thanks for the pointer to the 64-bit jvm. I wasn't aware
of it.
Thanks all very much.
Peter
On 1/26/06, Doug Cutting <[EMAIL PROTEC
Doug Cutting wrote:
A 64-bit JVM with NioDirectory would really be optimal for this.
Oops. I meant MMapDirectory, not NioDirectory.
Doug
Peter Keegan wrote:
The throughput is worse with NioFSDirectory than with the FSDirectory
(patched and unpatched). The bottleneck still seems to be synchronization,
this time in NioFile.getChannel (7 of the 8 threads were blocked there
during one snapshot). I tried this with 4 and 8 channels.
Hello,
On Jan 26, 2006, at 12:01, John Haxby wrote:
I have a perl script here that I used to generate a downgrading table
for a C program. I can let you have the perl script as is, but if
there's enough interest(*) I'll use it to generate, say,
CompoundAsciiFilter since it converts compound cha
On Thursday 26 January 2006 19:44, Chris Hostetter wrote:
>
> : > The document number is the variable i in this case.
> : If the document number is the variable i (enumerated from numDocs()),
> : what's the difference between numDocs() and maxDoc() in this case? I
> : was previously under the impr
On Thursday 26 January 2006 09:47, Chun Wei Ho wrote:
> Hi,
>
> Thanks for the help, just a few more questions:
>
> On 1/26/06, Paul Elschot <[EMAIL PROTECTED]> wrote:
> > On Thursday 26 January 2006 09:15, Chun Wei Ho wrote:
> > > I am attempting to prune an index by getting each document in tur
: > The document number is the variable i in this case.
: If the document number is the variable i (enumerated from numDocs()),
: what's the difference between numDocs() and maxDoc() in this case? I
: was previously under the impression that the internal docNum might be
: different to the counter.
BEA Jrockit supports both AMD64 and Intel's EM64T (basically renamed AMD64)
http://www.bea.com/framework.jsp?CNT=index.htm&FP=/content/products/jrockit/
and Sun's Java 1.5 for "Windows AMD64 Platform"
They advertise AMD64, presumably because that's what their servers
use, but it should work on Int
I'd love to try this, but I'm not aware of any 64-bit jvms for Windows on
Intel. If you know of any, please let me know. Linux may be an option, too.
btw, I'm getting a sustained rate of 135 queries/sec with 4 threads, which
is pretty impressive. Another way around the concurrency limit is to run
arnaudbuffet wrote:
if I try to index a text file encoded in Western 1252, for example with the Turkish text
"düzenlediğimiz kampanyamıza" the lucene index will contain re encoded data with
�k��
ISOLatin1AccentFilter.removeAccents() converts that string to
"duzenlediğimiz kampanyamıza"
On Jan 26, 2006, at 7:26 PM, arnaudbuffet wrote:
I cannot find the ISOLatin1AccentFilter class in my lucene jar, but
I found one via Google, attached to this mail; could you tell me
whether it is the right one?
This used to be in contrib/analyzers but has been moved into the core
(Subversion only fo
Hmmm, can you run the 64 bit version of Windows (and hence a 64 bit JVM?)
We're running with heap sizes up to 8GB (RH Linux 64 bit, Opterons,
Sun Java 1.5)
-Yonik
On 1/26/06, Peter Keegan <[EMAIL PROTECTED]> wrote:
> Paul,
>
> I tried this but it ran out of memory trying to read the 500Mb .fdt fi
Ray,
The throughput is worse with NioFSDirectory than with the FSDirectory
(patched and unpatched). The bottleneck still seems to be synchronization,
this time in NioFile.getChannel (7 of the 8 threads were blocked there
during one snapshot). I tried this with 4 and 8 channels.
The throughput wi
Paul,
I tried this but it ran out of memory trying to read the 500Mb .fdt file. I
tried various values for MAX_BBUF, but it still ran out of memory (I'm using
-Xmx1600M, which is the jvm's maximum value (v1.5)). I'll give
NioFSDirectory a try.
Thanks,
Peter
On 1/26/06, Paul Elschot <[EMAIL PROT
Hello and thanks for your answer.
I cannot find the ISOLatin1AccentFilter class in my lucene jar, but I found one
via Google, attached to this mail; could you tell me whether it is the right one?
I do not see anything in this class which can help me. This program will
replace some accented characters but my
Hi,
Got more questions regarding Lucene and this time it's about performance
;-)
We currently are using RAMDirectories to read our Indexes. This has now
become a problem since our index has grown to approx. 5GB of RAM and the
machine we are running on only has 12GB of RAM and every time we refr
You can also look at Phonetix which has many implementations of this...
-Original Message-
From: Erik Hatcher <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wed, 18 Jan 2006 05:41:30 -0500
Subject: Re: SoundEx
On Jan 18, 2006, at 4:20 AM, Christian Reuschling wrote:
> yes,
Yes, that is correct...you need to rewrite the query. I was actually the main
developer for the 1.5 .NET port, so if you come across any issues, please email
me at my hotmail address which I check more often than this one...
-Joe Langley
-Original Message-
From: Gwyn Carwardine <[EMAI
For the recent questions about this, here are a couple of methods for
encoding/decoding long values so that they sort into order for a range
query:
public static String encodeLong(long num) {
String hex = Long.toHexString(num < 0 ? Long.MAX_VALUE -
(0xL ^ num) : num);
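The code in the snippet above is cut off in this archive view, so the following is not the poster's method. It is a minimal sketch of one common way to get the same effect, assuming the goal is a fixed-width string whose lexicographic order matches the numeric order of the longs (which is what a term-based range query compares). The class name `SortableLongCodec` and its method names are hypothetical, chosen only for illustration.

```java
/**
 * Hypothetical helper (not from the original post): encodes longs as
 * fixed-width hex strings whose String ordering matches numeric ordering.
 */
final class SortableLongCodec {

    private SortableLongCodec() {
        // static helpers only
    }

    /** Encodes num as 16 lowercase hex digits that sort like the number itself. */
    static String encode(long num) {
        // Flipping the sign bit maps Long.MIN_VALUE..Long.MAX_VALUE onto
        // 0x0000000000000000..0xFFFFFFFFFFFFFFFF (read as unsigned), so
        // negative values come before positive ones.
        long shifted = num ^ Long.MIN_VALUE;
        String hex = Long.toHexString(shifted);

        // Left-pad with zeros so every encoded value has the same width;
        // without padding, "a" would sort after "10".
        StringBuilder sb = new StringBuilder(16);
        for (int i = hex.length(); i < 16; i++) {
            sb.append('0');
        }
        return sb.append(hex).toString();
    }

    /** Reverses encode(): parses the hex digits and flips the sign bit back. */
    static long decode(String encoded) {
        // parseUnsignedLong (Java 8+) accepts the full 64-bit unsigned range.
        return Long.parseUnsignedLong(encoded, 16) ^ Long.MIN_VALUE;
    }
}
```

With this scheme a value would be indexed as an untokenized keyword-style field containing `SortableLongCodec.encode(value)`, and the lower and upper bounds of the range query would be encoded the same way. The fixed 16-character width is the design choice that matters here: it keeps plain string comparison consistent with numeric comparison.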
arnaudbuffet wrote:
For text files, the data could be in different languages and therefore in
different encodings. If the data are in Turkish, for example, the special
characters and accents are not recognized in my Lucene index. Is there a way
to resolve this problem? How do I work with the encoding?
I've been looking
Hello,
I have a problem with data I am trying to index with Lucene. I browse a
directory and index text from different types of files through parsers.
For text files, the data could be in different languages and therefore in
different encodings. If the data are in Turkish, for example, the special
characters and accents are no
Hi,
Thanks for the help, just a few more questions:
On 1/26/06, Paul Elschot <[EMAIL PROTECTED]> wrote:
> On Thursday 26 January 2006 09:15, Chun Wei Ho wrote:
> > I am attempting to prune an index by getting each document in turn and
> > then checking/deleting it:
> >
> > IndexReader ir = IndexR
Speaking of NioFSDirectory, I thought there was one posted a while
ago, is this something that can be used?
http://issues.apache.org/jira/browse/LUCENE-414
ray,
On 11/22/05, Doug Cutting <[EMAIL PROTECTED]> wrote:
> Jay Booth wrote:
> > I had a similar problem with threading, the problem turned o
On Thursday 26 January 2006 09:15, Chun Wei Ho wrote:
> I am attempting to prune an index by getting each document in turn and
> then checking/deleting it:
>
> IndexReader ir = IndexReader.open(path);
> for(int i=0;i<ir.numDocs();i++) {
>   Document doc = ir.document(i);
> if(thisDocShouldBeDeleted(doc)) {
>
On Wednesday 25 January 2006 22:24, Chris Hostetter wrote:
>
> : for this site, but would you cache all manufacturers and intersect all with
> : the initial query in one page load? Seems like that would be a lot.
>
> Yep it is a lot, but if you've got the RAM, it's not that time intensive.
> At CNE
I am attempting to prune an index by getting each document in turn and
then checking/deleting it:
IndexReader ir = IndexReader.open(path);
for(int i=0;i
On Wednesday 25 January 2006 20:51, Peter Keegan wrote:
> The index is non-compound format and optimized. Yes, I did try
> MMapDirectory, but the index is too big - 3.5 GB (1.3GB is term vectors)
>
> Peter
>
You could also give this a try:
http://issues.apache.org/jira/browse/LUCENE-283
Regards