Hi Otis, this is 3GB of heap (-Xmx). I am running on a multicore 32-bit machine and I am concerned about the 4GB limit. CPU is not a problem; however, I am wondering about memory requirements as I scale up. I mostly use term queries on multiple fields (about 30 fields), so no fuzzy queries or sorting, and wildcards only from time to time.
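A rough sketch of why the scaling question matters, assuming norms are enabled on all 30 fields and that each field costs one byte per document on the heap (the usual layout in Lucene 2.x; the real index may differ). `NormsEstimate` and `normsBytes` are illustrative names, not Lucene APIs:

```java
// Back-of-the-envelope norms estimate.
// Assumption: one byte per document per field with norms enabled,
// all of it held on the heap while the index is open for searching.
class NormsEstimate {

    /** Estimated heap bytes for norms: one byte per document per field. */
    static long normsBytes(long docs, int fieldsWithNorms) {
        return docs * fieldsWithNorms;
    }

    public static void main(String[] args) {
        int fields = 30; // from the message above; assumes norms on every field
        long[] docCounts = {21_000_000L, 40_000_000L, 60_000_000L};
        for (long docs : docCounts) {
            long mb = normsBytes(docs, fields) / 1_000_000L;
            System.out.println(docs / 1_000_000L + "M docs x " + fields
                    + " fields ~ " + mb + " MB of norms");
        }
    }
}
```

On these assumptions norms alone would come to roughly 630 MB at 21M docs and 1.8 GB at 60M docs, before the term index or any per-search allocations are counted. Omitting norms on fields that do not need length normalization or boosts would shrink this figure.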
vincent

Otis Gospodnetic wrote:
>
> Hello,
>
> Comments inlined.
>
> ----- Original Message ----
>> To: java-user@lucene.apache.org
>> Sent: Fri, November 13, 2009 11:32:02 AM
>> Subject: Re: OutofMemory in large index
>>
>> Hi, I am jumping into the thread because I have got a similar issue.
>> My index is 30Gb large and contains 21M docs.
>> I was able to stay with 1Gb of RAM on the server for a while. Recently I
>
> Is that 1GB heap or 1GB RAM?
>
>> started to simulate parallel searches. Just 2 parallel searches would get
>> the server to crash with out of memory errors. I upgraded the server to 3Gb
>> of RAM and I was able to run happily 10 parallel full text searches on my
>> documents.
>> My questions:
>> - is 3Gb a relatively normal amount of memory for a server doing lucene
>> searches?
>
> These days 3GB of RAM is very little even for a laptop. :)
>
>> - when is that going to stop? I am planning to have at least 40M docs in my
>> index. will I need to go from 2.5 to 5Gb of RAM? what about 60M docs? what
>> about 20 concurrent searches?
>
> The more you hit the machine, the more resources it needs. The more
> resource intensive the queries (e.g. sorting? fuzzy? wildcard?), the
> more resources they'll need.
>
> One instance of Lucene/Solr I looked at today has an index with < 5M not
> very large documents, but high query rates and relatively expensive
> queries hitting a 20GB index. Each of 10 servers has 8 cores that were
> only about 30% idle. This is just an example. Each case is different.
>
>> - are there any safety mechanisms that would get a search to abort rather
>> than make the server crash with out of memory?
>
> I don't think so. When an app hits OOM, I think it doesn't have much
> control over its destiny.
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>> Simon Willnauer wrote:
>> >
>> > On Fri, Nov 13, 2009 at 11:17 AM, Ian Lea wrote:
>> >>> I got OutOfMemoryError at
>> >>> org.apache.lucene.search.Searcher.search(Searcher.java:183)
>> >>> My index is 43G bytes. Is that too big for Lucene ?
>> >>> Luke can see the index has over 1800M docs, but the search is also out
>> >>> of memory.
>> >>> I use -Xmx1024M to specify 1G java heap space.
>> >>
>> >> 43Gb is not too big for lucene, but it certainly isn't small and that
>> >> is a lot of docs. Just give it more memory.
>> > I would strongly recommend to give it more memory, what version of
>> > lucene do you use? Depending on your setup you could run into a JVM
>> > bug if you use a lucene version < 2.9. Your index is big enough
>> > (document wise) that you norms file grows > 100MB, depending on your
>> > Xmx settings this could trigger a false OOM during index open. So if
>> > you are using < 2.9 check out this issue
>> > https://issues.apache.org/jira/browse/LUCENE-1566
>> >
>> >>
>> >>> One abnormal thing is that I broke a running optimize of this index.
>> >>> Is that can be a problem ?
>> >>
>> >> Possibly ...
>> > In general, this should not be a problem. The optimize will not
>> > destroy the index you are optimizing as segments are write once.
>> >>
>> >>> If so, how can I fix an index after optimize process is broken.
>> >>
>> >> Probably depends on what you mean by broken. Start with running
>> >> org.apache.lucene.index.CheckIndex. That can also fix some things -
>> >> but see the warning in the javadocs.
>> > 100% recommended to make sure nothing is wrong! :)
>> >>
>> >> --
>> >> Ian.
>> >
>>
>> --
>> View this message in context:
>> http://old.nabble.com/OutofMemory-in-large-index-tp26332397p26339388.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
--
View this message in context: http://old.nabble.com/OutofMemory-in-large-index-tp26332397p26364116.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org