Re: Help with huge index

2018-03-04 Thread Michael Sokolov
I wonder if you might not get better performance in a case like this if you were OK with taking your index offline, disabling merges, performing the deletions, and only then re-enabling merges? This could be done on a copy of the index if updates can be turned off or held in a queue, so that queries could stil

Re: Help with huge index

2018-02-28 Thread Stuart Goldberg
Thanks so much. I actually found that my purging routine finished after about 35 minutes which is really acceptable given that this routine is supposed to run during the overnight period. On Feb 28, 2018 8:34 PM, "Adrien Grand" wrote: > Thanks. Deleting lots of documents can indeed trigger a lot
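
For a rough sense of scale, the figures quoted in this thread (about 650 million docs, about one third of them to purge, about 35 minutes for the run) can be turned into an approximate average deletion rate. This is a minimal sketch in plain Java; the class and method names are made up for illustration, and the result is only an average, since deletions and the merging they trigger do not proceed at a constant rate.

```java
// Back-of-the-envelope estimate from the numbers quoted in the thread.
// PurgeEstimate and its methods are hypothetical names, not Lucene API.
final class PurgeEstimate {

    // "about 1/3 of these need to be purged" (integer division, rounds down)
    static long docsToPurge(long totalDocs) {
        return totalDocs / 3;
    }

    // Average number of docs removed per second over the whole run.
    static long docsPerSecond(long purgedDocs, long minutes) {
        return purgedDocs / (minutes * 60);
    }

    public static void main(String[] args) {
        long totalDocs = 650_000_000L; // "about 650 million docs"
        long minutes = 35;             // "after about 35 minutes"

        long purged = docsToPurge(totalDocs);
        System.out.println("docs to purge:   " + purged);
        System.out.println("docs remaining:  " + (totalDocs - purged));
        System.out.println("avg docs/second: " + docsPerSecond(purged, minutes));
    }
}
```

Because all three inputs are approximate, the output is best read as an order of magnitude rather than a benchmark.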

Re: Help with huge index

2018-02-28 Thread Adrien Grand
Thanks. Deleting lots of documents can indeed trigger a lot of work on the Lucene side. First, Lucene likely needs to rewrite the live docs of all your segments, and then this might trigger significant merging activity, because Lucene tries to keep the number of deleted docs reasonable so

Re: Help with huge index

2018-02-28 Thread Stuart Goldberg
I call deleteDocuments. On Feb 28, 2018 8:16 PM, "Adrien Grand" wrote: > What do you mean by purging? What methods do you call? > > On Wed, Feb 28, 2018 at 19:34, Stuart Goldberg wrote: > > > I have a huge Lucene index. On disk it's about 24 GB. > > > > I have a purging routine that is

Re: Help with huge index

2018-02-28 Thread Adrien Grand
What do you mean by purging? What methods do you call? On Wed, Feb 28, 2018 at 19:34, Stuart Goldberg wrote: > I have a huge Lucene index. On disk it's about 24 GB. > > I have a purging routine that is supposed to run and purge old docs. > > There are about 650 million docs in there and

Help with huge index

2018-02-28 Thread Stuart Goldberg
I have a huge Lucene index. On disk it's about 24 GB. I have a purging routine that is supposed to run and purge old docs. There are about 650 million docs in there, and through testing I have determined that about 1/3 of these need to be purged. During the purge, every so often it's appare

Re: Lucene not merging frequently causing huge index size

2015-04-27 Thread Anand Bhagwat
rio: > > I have an application with frequent document updates and it uses NRT > > reader. I also have a background thread which calls commit on the > > IndexWriter on a regular interval. There is one instance of IndexWriter > > which is created and it is never closed. >

Re: Lucene not merging frequently causing huge index size

2015-04-27 Thread Michael McCandless
ead which calls commit on the > IndexWriter on a regular interval. There is one instance of IndexWriter > which is created and it is never closed. > > The problem is that Lucene is creating a lot of files and a huge index, which is > far larger than the size of the documents. This size is not r

Lucene not merging frequently causing huge index size

2015-04-27 Thread Anand Bhagwat
that Lucene is creating a lot of files and a huge index, which is far larger than the size of the documents. This size is not reduced despite the Lucene merge running in the background. (I am using the default TieredMergePolicy and ConcurrentMergeScheduler). I also tried running a background thread

RE: howto run CheckIndex on huge index size

2012-08-15 Thread Uwe Schindler
o: java-user@lucene.apache.org > Subject: RE: howto run CheckIndex on huge index size > > I hope the problem is fixed now; this mail is just to check! It was hard to > unsubscribe because of the strange eMail. Have no idea at all... > > Uwe > > - > Uwe Schindler > H

RE: howto run CheckIndex on huge index size

2012-08-15 Thread Uwe Schindler
m: Uwe Schindler [mailto:u...@thetaphi.de] > Sent: Wednesday, August 15, 2012 3:13 PM > To: java-user@lucene.apache.org > Subject: RE: howto run CheckIndex on huge index size > > I got it, too. As a moderator of this list, I will look into finding the root cause > and forcefully

RE: howto run CheckIndex on huge index size

2012-08-15 Thread Uwe Schindler
ehling [mailto:bernd.fehl...@uni-bielefeld.de] > Sent: Wednesday, August 15, 2012 3:04 PM > To: java-user@lucene.apache.org > Subject: Re: howto run CheckIndex on huge index size > > > I guess that ulimit could be a default setting of XenServer from when it was first > set up. >

Re: howto run CheckIndex on huge index size

2012-08-15 Thread Bernd Fehling
om: Bernd Fehling [mailto:bernd.fehl...@uni-bielefeld.de] >> Sent: Wednesday, August 15, 2012 2:07 PM >> To: java-user@lucene.apache.org >> Subject: Re: howto run CheckIndex on huge index size >> >> Hi Uwe, >> >> index size is: >> -rw-r--r-- 1 solr users 82

RE: howto run CheckIndex on huge index size

2012-08-15 Thread Uwe Schindler
bject: Re: howto run CheckIndex on huge index size > > Hi Uwe, > > index size is: > -rw-r--r-- 1 solr users 82G 15. Aug 07:50 _2rhe.fdt > -rw-r--r-- 1 solr users 303M 15. Aug 07:50 _2rhe.fdx > -rw-r--r-- 1 solr users 1,2k 15. Aug 07:36 _2rhe.fnm > -rw-r--r-- 1 solr users 39G

Re: howto run CheckIndex on huge index size

2012-08-15 Thread Bernd Fehling
> Sent: Wednesday, August 15, 2012 1:25 PM >> To: java-user@lucene.apache.org >> Subject: howto run CheckIndex on huge index size >> >> >> I'm trying to run CheckIndex as a separate tool on a large index to get nice infos >> about number of terms, number of tok

RE: howto run CheckIndex on huge index size

2012-08-15 Thread Uwe Schindler
feld.de] > Sent: Wednesday, August 15, 2012 1:25 PM > To: java-user@lucene.apache.org > Subject: howto run CheckIndex on huge index size > > > I'm trying to run CheckIndex as a separate tool on a large index to get nice infos > about number of terms, number of tokens, ... but a

howto run CheckIndex on huge index size

2012-08-15 Thread Bernd Fehling
I'm trying to run CheckIndex as a separate tool on a large index to get nice infos about number of terms, number of tokens, ... but I always get an OOM exception. I already have JAVA_OPTS -d64 -Xmx25g -Xms25g -Xmn6g. Any idea how to use CheckIndex on a huge index? Opening index @ /srv/www

Re: How to avoid huge index files

2009-09-10 Thread Ted Stockwell
009 2:18:35 PM > Subject: Re: How to avoid huge index files > > > Is it possible to upload an already existing index to GAE? My index is data I've been > collecting for a long time, and I prefer not to give it up. > >

Re: How to avoid huge index files

2009-09-10 Thread Dvora
in Google App Engine >> > (http://code.google.com/appengine/), which limits files length to be >> > smaller than 10MB.

Re: How to avoid huge index files

2009-09-10 Thread Ted Stockwell
Another alternative is storing the indexes in the Google Datastore; I think Compass already supports that (though I have not used it). Also, I have successfully run Lucene on GAE using GaeVFS (http://code.google.com/p/gaevfs/) to store the index in the Datastore. (I developed a Lucene Directory

RE: How to avoid huge index files

2009-09-10 Thread Dvora
28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > >> From: Dvora [mailto:barak.ya...@gmail.com] >> Sent: Thursday, September 10, 2009 1:23 PM >> To: java-user@lucene.apache.org >> Subject: Re: How to avoid huge index files >> >> >> Hi aga

RE: How to avoid huge index files

2009-09-10 Thread Uwe Schindler
earches. > >>>>> I'm intending to deploy the application in Google App Engine > >>>>> (http://code.google.com/appengine/), which limits files length to be > >>>>> smaller than 10MB. I've read about the various policies supported by >
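
To see which index files actually run into the 10MB limit discussed in this thread, a small check over the index directory can help. The following is a minimal sketch in plain Java with no Lucene dependency; the class and method names are invented for illustration, and it assumes "10MB" means 10 * 1024 * 1024 bytes, which the thread does not spell out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical helper: lists files in a directory that do not fit under a size limit.
final class OversizedIndexFiles {

    // Assumption: the limit is 10 MiB; files must be strictly smaller than this.
    static final long LIMIT_BYTES = 10L * 1024 * 1024;

    // Returns the names (sorted) of all files whose size is not below the limit.
    static List<String> oversized(Map<String, Long> sizes, long limitBytes) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : new TreeMap<>(sizes).entrySet()) {
            if (e.getValue() >= limitBytes) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // Collects file name -> size in bytes for the regular files directly inside dir.
    static Map<String, Long> fileSizes(Path dir) {
        Map<String, Long> sizes = new TreeMap<>();
        try (Stream<Path> files = Files.list(dir)) {
            files.filter(Files::isRegularFile).forEach(p -> {
                try {
                    sizes.put(p.getFileName().toString(), Files.size(p));
                } catch (IOException ex) {
                    throw new UncheckedIOException(ex);
                }
            });
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // The index directory is passed as the first argument; defaults to ".".
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (String name : oversized(fileSizes(dir), LIMIT_BYTES)) {
            System.out.println(name);
        }
    }
}
```

Run against an index directory, this prints one file name per line for every file that would be rejected under the assumed limit. It only reports the problem; it does not change how Lucene writes its files.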

Re: How to avoid huge index files

2009-09-10 Thread Dvora
> which parameters, the index files still grew to be a lot more than 10MB. >>>>> Looking at the code, I've managed to limit the cfs files (predicting the >>>>> file size in CompoundFileWriter before closing the file) - I guess that >

Re: How to avoid huge index files

2009-09-10 Thread Michael McCandless
zes, but no matter which policy I used and >>>> which parameters, the index files still grew to be a lot more than 10MB. >>>> Looking at the code, I've managed to limit the cfs files (predicting the >>>> file size in CompoundFileWriter before closing the file)

Re: How to avoid huge index files

2009-09-10 Thread Dvora
've managed to limit the cfs files (predicting the >>> file size in CompoundFileWriter before closing the file) - I guess that >>> will degrade performance, but it's OK for now. But now the FDT files are >>> becoming huge (about 60MB) and I can't identify a

Re: How to avoid huge index files

2009-09-10 Thread Michael McCandless
ance, but it's OK for now. But now the FDT files are >> becoming huge (about 60MB) and I can't identify a way to limit those >> files. >> >> Is there some built-in and correct way to limit the length of these files? If not, >> can someone please tell me how I should twea

Re: How to avoid huge index files

2009-09-09 Thread Dvora
huge (about 60MB) and I can't identify a way to limit those > files. > > Is there some built-in and correct way to limit the length of these files? If not, > can someone please tell me how I should tweak the source code to achieve > that? > > Thanks for any help. > -- View t

How to avoid huge index files

2009-09-08 Thread Dvora
eak the source code to achieve that? Thanks for any help.

Re: Huge Index

2007-01-11 Thread Otis Gospodnetic
AIL PROTECTED]> To: java-user@lucene.apache.org Sent: Thursday, January 11, 2007 12:36:52 PM Subject: RE: Huge Index Given the ram directory inside of the IndexWriter in 2.0, is there still any reason to stage this manually? - T

RE: Huge Index

2007-01-11 Thread Benson Margulies
Given the ram directory inside of the IndexWriter in 2.0, is there still any reason to stage this manually?

Re: Huge Index

2007-01-11 Thread Otis Gospodnetic
.org Sent: Thursday, January 11, 2007 12:16:51 PM Subject: RE: Huge Index I never managed to index all the data; it is too slow. I got 3 million in 2.5 hours. As suggested in Lucene in Action, I use ramDir and after I write 5000 documents I merge them to the fsDir. The merge factor is now 100. I tr

RE: Huge Index

2007-01-11 Thread Alice
-Original Message- From: Ruslan Sivak [mailto:[EMAIL PROTECTED] Sent: Thursday, January 11, 2007 15:12 To: java-user@lucene.apache.org Subject: Re: Huge Index Alice, If you have a computer that crashes once you put a lot of load on it, I'd say you have bigger problems t

RE: Huge Index

2007-01-11 Thread Alice
ginal Message- From: Grant Ingersoll [mailto:[EMAIL PROTECTED] Sent: Thursday, January 11, 2007 15:07 To: java-user@lucene.apache.org Subject: Re: Huge Index Hi Alice, Can you define slow (hours, days, months and on what hardware)? Have you done any profiling, etc. to see wher

Re: Huge Index

2007-01-11 Thread Ruslan Sivak
e the server crashes. -Original Message- From: Russ [mailto:[EMAIL PROTECTED] Sent: Thursday, January 11, 2007 14:33 To: java-user@lucene.apache.org Subject: Re: Huge Index Can you use multiple threads/machines to index the data into separate indexes, and then combine them?

Re: Huge Index

2007-01-11 Thread Grant Ingersoll
Hi Alice, Can you define slow (hours, days, months and on what hardware)? Have you done any profiling, etc. to see where the bottlenecks are? What size documents are you talking about here? What are your merge factors, etc.? Thanks, Grant On Jan 11, 2007, at 10:47 AM, Alice wrote: H

Re: Huge Index

2007-01-11 Thread James Rhodes
Unfortunately I can't use multiple machines. > > And I cannot start lots of threads because the server crashes. > > -Original Message- > From: Russ [mailto:[EMAIL PROTECTED] > Sent: Thursday, January 11, 2007 14:33 > To: java-user@lucene.apache.org

Re: Huge Index

2007-01-11 Thread Rangarirayi Muvavarirwa
ause the server crashes. -Original Message- From: Russ [mailto:[EMAIL PROTECTED] Sent: Thursday, January 11, 2007 14:33 To: java-user@lucene.apache.org Subject: Re: Huge Index Can you use multiple threads/machines to index the data into separate indexes, and then combine t

RE: Huge Index

2007-01-11 Thread Alice
Unfortunately I can't use multiple machines. And I cannot start lots of threads because the server crashes. -Original Message- From: Russ [mailto:[EMAIL PROTECTED] Sent: Thursday, January 11, 2007 14:33 To: java-user@lucene.apache.org Subject: Re: Huge Index Can yo

Re: Huge Index

2007-01-11 Thread Russ
Can you use multiple threads/machines to index the data into separate indexes, and then combine them? Russ Sent wirelessly via BlackBerry from T-Mobile. -Original Message- From: "Alice" <[EMAIL PROTECTED]> Date: Thu, 11 Jan 2007 13:47:36 To: Subject: Huge Index Hell