I was referring to *RAMDirectory*.
On Wed, Jul 18, 2012 at 11:04 PM, Lance Norskog wrote:
>> You do not want to store 30 G of data in the JVM heap, no matter what
>> library does this.
> MMapDirectory does not store data in the JVM heap. It lets the
> operating system manage the disk buffer space. E
Thanks for the input.
I am not using Solr.
Also, my index has a fixed size, I am not going to update it.
-Original Message-
From: googoo [mailto:liu...@gmail.com]
Sent: 18 July 2012 15:21
To: java-user@lucene.apache.org
Subject: Re: In memory Lucene configuration
Doron,
To verify actual
> Why anyone buys computers without SSD's is a mystery to me. Use SSDs for
On topic and highly recommended:
http://www.youtube.com/watch?v=H7PJ1oeEyGg
Dawid
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For a
I had a threading issue in the client code calling Lucene, really nothing that
has anything to do with this list :)
-Original Message-
From: Simon Willnauer [mailto:simon.willna...@gmail.com]
Sent: 18 July 2012 21:48
To: java-user@lucene.apache.org
Subject: Re: In memory Lucene configura
> You do not want to store 30 G of data in the JVM heap, no matter what library
> does this.
MMapDirectory does not store data in the JVM heap. It lets the
operating system manage the disk buffer space. Even if the JVM says "I
have 30G of memory space", it really does not. It only has address
spac
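A minimal sketch of the point above, using only java.nio rather than Lucene: a memory-mapped file is address space backed by the OS page cache, not an allocation on the Java heap. The class name MmapSketch and the temp file are made up for illustration; that MMapDirectory relies on the same JDK mapping mechanism internally is my understanding, not something shown here.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapSketch {

    /** Maps the whole file read-only. The returned buffer is not heap memory. */
    static MappedByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for an index file.
        Path tmp = Files.createTempFile("mmap-sketch", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));

        MappedByteBuffer buf = mapReadOnly(tmp);
        // A mapped buffer is a direct buffer: its contents live outside the heap,
        // and the OS decides which pages are actually resident in RAM.
        System.out.println("direct buffer: " + buf.isDirect());
        System.out.println("first byte: " + (char) buf.get(0));
    }
}
```

Because the pages are paged in on demand, mapping a 30 GB file this way does not need a 30 GB -Xmx setting; it needs a 64-bit address space and enough free RAM for the OS to cache the hot parts.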
On Thu, Jul 19, 2012 at 1:53 AM, Bernd Fehling
wrote:
> ...
> Robert Muir added a comment - 12/Apr/12 16:24
>
> We can save 10MB with this patch, which nukes the 'index'.
> I guarantee you nobody will miss it. Just click this thing and see how
> useless it is (since its every method etc in all of
...
Robert Muir added a comment - 12/Apr/12 16:24
We can save 10MB with this patch, which nukes the 'index'.
I guarantee you nobody will miss it. Just click this thing and see how
useless it is (since its every method etc in all of lucene).
...
Yeah, "nobody will miss it" and "see how useless it i
On Wed, 2012-07-18 at 17:50 +0200, Dragon Fly wrote:
> If I want to improve performance, which of the following is better and why?
>
> 1. Buy a machine with a lot of RAM and use a RAMDirectory for the index.
As others have pointed out, MMapDirectory should work better than
RAMDirectory. I am sure
On Tue, Jul 17, 2012 at 12:44 PM, Roman Chyla wrote:
> Hi,
>
> Tests show that TermEnum.docFreq() returns sum of all docs, including
> the deleted ones. Which seems to (indirectly) contradict the javadoc
That's right; fixing it to reflect deleted documents would be
prohibitively costly.
Hmm whic
Hi,
just to clarify:
> In addition, I don't think loading the whole index into memory is a good idea,
> since the index size will always increase.
> For me, I changed the Lucene code to disable MMapDirectory, since the index
> keeps getting bigger and bigger.
> And MMapDirectory will call something like c++ share me
> Rum is an essential ingredient in all software systems :-)
You probably meant "social systems".
D.
On Wed, Jul 18, 2012 at 9:05 PM, Tim Eck wrote:
> Rum is an essential ingredient in all software systems :-)
Absolutely! :)
simon
>
> -Original Message-
> From: Simon Willnauer [mailto:simon.willna...@gmail.com]
> Sent: Wednesday, July 18, 2012 11:49 AM
> To: java-user@lucene.apache.org
>
Rum is an essential ingredient in all software systems :-)
-Original Message-
From: Simon Willnauer [mailto:simon.willna...@gmail.com]
Sent: Wednesday, July 18, 2012 11:49 AM
To: java-user@lucene.apache.org
Subject: Re: RAM or SSD...
1. use mmap directory
2. buy rum
3. get an SSD
simon
1. use mmap directory
2. buy rum
3. get an SSD
simon :)
On Wed, Jul 18, 2012 at 8:36 PM, Vitaly Funstein wrote:
> You do not want to store 30 G of data in the JVM heap, no matter what
> library does this.
>
> On Wed, Jul 18, 2012 at 10:44 AM, Paul Jakubik wrote:
>> If only 30GB, go with RAM and
doron, enlighten me please!
On Wed, Jul 18, 2012 at 1:32 PM, Doron Yaacoby
wrote:
> Glad to announce the problem was on my side, and had nothing to do with
> Lucene. Indeed, it looks like MMapDirectory is the best choice for me.
>
> Thanks again.
>
> -Original Message-
> From: Doron Ya
You do not want to store 30 G of data in the JVM heap, no matter what
library does this.
On Wed, Jul 18, 2012 at 10:44 AM, Paul Jakubik wrote:
> If only 30GB, go with RAM and MMapDirectory (as long as you have the budget
> for that hardware).
>
> My understanding is that RAMDirectory is intended
: What is the sense of removing the "Index" from the API Javadoc for Lucene and
: Solr?
It was heavily bloating the size of the releases...
https://issues.apache.org/jira/browse/LUCENE-3977
It's pretty easy to turn this back on and rebuild the docs locally. Feel
free to open a jira and submit
If only 30GB, go with RAM and MMapDirectory (as long as you have the budget
for that hardware).
My understanding is that RAMDirectory is intended for unit tests, not for
production indexes.
On Wed, Jul 18, 2012 at 10:50 AM, Dragon Fly wrote:
>
> Hi,
>
> If I want to improve performance, which of
Lucene certainly supports multiple sort criteria, see
IndexSearcher.search, any one that takes a Sort
object. The Sort object can contain a list of fields where
any ties in the first N field(s) are decided by looking
at field N+1.
But, Ganesh, be a little careful about resolving by internal
Lucene
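To make the tie-breaking rule above concrete, here is a small sketch in plain Java. It is not Lucene's API: Hit is a hypothetical stand-in for a search hit, and the chained comparator only mimics what a Sort with two fields does, where the second field is consulted only when the first one ties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MultiSortSketch {

    /** Hypothetical stand-in for a hit with two sortable fields; not a Lucene class. */
    static final class Hit {
        final long time;
        final int id;

        Hit(long time, int id) {
            this.time = time;
            this.id = id;
        }
    }

    /** Sorts by time first; id is only consulted when two hits have the same time. */
    static List<Hit> sortByTimeThenId(List<Hit> hits) {
        List<Hit> out = new ArrayList<>(hits);
        out.sort(Comparator.comparingLong((Hit h) -> h.time)
                           .thenComparingInt(h -> h.id));
        return out;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(new Hit(2, 1), new Hit(1, 9), new Hit(1, 3));
        for (Hit h : sortByTimeThenId(hits)) {
            System.out.println("time=" + h.time + " id=" + h.id);
        }
    }
}
```

The two hits with time=1 come out ordered by id (3 before 9), and the hit with time=2 comes last regardless of its id. In Lucene itself the equivalent would be a Sort object built from several SortField entries and passed to IndexSearcher.search.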
Optimize will release disk space if you have lots of deletes. (A merge will do
the same thing.)
I think optimize will also speed up search a little bit.
Which JRE are you using? On Windows, if you are using a 64-bit JRE, then
Lucene tries to map the index into memory.
that will use lots of memory and also involve lot
It always adds one more search condition.
For example, if you search by subject:hello,
the back end will search
subject:hello AND account:齐保元
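A rough sketch of that idea, assuming the account is stored in an indexed field called "account" (the field name and the helper scopeToAccount are made up for illustration). It only builds the query string; in real code you would more likely combine the parsed user query with a TermQuery on the account field inside a BooleanQuery, so that nothing the user types can alter the filter.

```java
public class AccountScopedQuery {

    /**
     * Wraps the user's query in parentheses and ANDs it with an account clause.
     * The parentheses keep an OR inside the user query from bypassing the filter.
     */
    static String scopeToAccount(String userQuery, String account) {
        // Escape backslashes and quotes so the account value stays inside its phrase.
        String escaped = account.replace("\\", "\\\\").replace("\"", "\\\"");
        return "(" + userQuery + ") AND account:\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        System.out.println(scopeToAccount("subject:hello", "alice"));
        System.out.println(scopeToAccount("subject:hello OR subject:bye", "alice"));
    }
}
```

Without the parentheses, the second query would parse as subject:hello OR (subject:bye AND account:...), which would leak other accounts' mail.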
--
View this message in context:
http://lucene.472066.n3.nabble.com/how-to-implement-a-search-engine-like-gmail-tp3995675p3995700.html
Sent from the Lucene - Java Us
I don't think Lucene supports multi-field sorting.
If you look into org.apache.lucene.search.TopScoreDocCollector you may get
a feel for it.
It uses a max heap to sort the documents, and the score is calculated only once;
it does not first sort by time and then sort again by id.
When Lucene sorts the documents below:
Doron,
To verify the actual query speed, I think you may need to:
1) not run the index job
2) in solrconfig.xml, set the filterCache and queryResultCache values to 0
3) restart Solr
4) run the query and check the QTime result
That may give you some idea of the actual query time.
To break down query time, yo
What metrics are you measuring performance by? Also, what is your current
setup? You might be able to speed up your current setup by tweaking
configuration settings without needing more hardware.
On Wed, Jul 18, 2012 at 11:50 AM, Dragon Fly wrote:
>
> Hi,
>
> If I want to improve performance, whi
Thank you Robert,
Thank you!
It solves my problems!
> From: rcm...@gmail.com
> Date: Wed, 18 Jul 2012 10:40:08 -0400
> Subject: Re: Indexed BytesRef
> To: java-user@lucene.apache.org
>
> Here's a test indexing some binary terms
>
> http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/c
Hi,
If I want to improve performance, which of the following is better and why?
1. Buy a machine with a lot of RAM and use a RAMDirectory for the index.
2. Put the index on a solid state drive.
By the way, my index is about 30 GB. Thank you.
That is one option. See recent thread (yesterday?) about possible
problems with that approach, and an alternative or two.
I've no idea how Google do it.
And I've no idea what you mean by problem with different subjects.
--
Ian.
On Wed, Jul 18, 2012 at 4:27 PM, 许超前 wrote:
> Maybe everyone ha
Maybe everyone has his/her own index.
2012/7/18 齐保元
> Hi buddy,
> In Gmail there are many accounts. How does Google manage to
> search an individual's email without the risk of searching other accounts' email? If
> there are a *huge* number of accounts, small indexes may knock down the server. Any good
> idea? and
Here's a test indexing some binary terms
http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/core/src/test/org/apache/lucene/index/TestBinaryTerms.java
It uses BinaryTokenStream
(http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/core/src/test/org/apache/lucene/index/BinaryTokenStream.ja
Hi,
I'm using Lucene 4.0.
I would like to index Strings,
but since my system has to handle high volume I always need to reuse the
same memory, so using String objects is out of the question.
My process receives bytes and I can transform them into a BytesRef (representing a
String).
At the moment, it seems that when I use fiel
It would be great if someone could show how to do it.
For my application, I do not care about any score (a plain boolean
search is sufficient).
In the meantime, I experimented with a workaround and would like to
share the findings:
Problem details:
On a collection of 10 million documents, I wa
This is possible, using the ScorerVisitor (3.6) / getChildren (4.0).
You need a custom collector that when it collects a competitive hit,
visits the sub-scorers of your BooleanQuery and saves away which ones
matched the current doc.
But this is very expert and there are real challenges (eg not all
Dear developers,
while upgrading from 3.6.x to 4.x I have to rewrite some of my code and
search for the new methods and/or classes. In 3.6.x and older versions
the API Javadoc interface had an "Index" which made it easy to find the
appropriate methods. The button to call the "Index" was located in
Glad to announce the problem was on my side, and had nothing to do with Lucene.
Indeed, it looks like MMapDirectory is the best choice for me.
Thanks again.
-Original Message-
From: Doron Yaacoby [mailto:dor...@gingersoftware.com]
Sent: 16 July 2012 09:43
To: java-user@lucene.apache.
I'd forgotten about IndexUpgrader, but I'd still go for 3.6. I
wouldn't want the complexity of shipping two versions of lucene and
having to get customers to run an upgrade script. And probably
wouldn't want to ship the first stable version of 4.0, even though
lucene is very stable and reliable.
The tool docs can be found here: http://goo.gl/TbbxC
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: Uwe Schindler [mailto:u...@thetaphi.de]
> Sent: Wednesday, July 18, 2012 11:13 AM
> To: java-user@luce
Hi,
To migrate from 2.x to 4.0 you first have to "convert" your indexes with
version 3.x. This can be done with the new tool called "IndexUpgrader"
(available since Lucene 3.2 or thereabouts). You can call it from the command
line; it will upgrade all index segments to the latest version you are using
t
The release notice for 4.0-alpha sent to this list says "file format
backwards compatibility is provided for indexes from the 3.0 series"
so you won't be able to go straight from 2.x to 4.0. I'm sure that
will remain true for all 4.x releases. The comments about waiting for
a stable release of 4.
> Any thoughts on this?
Patience ...
> Is it good to use multiple sort fields?
Absolutely, if that's what you need. On the other hand, if you don't
need it then it's a bad idea.
> Using sort on docid will consume any memory?
Don't know. Certainly won't use less than not sorting this way.
>