Hi Guys,

Thank you very much for your answers.

I will do some profiling on memory usage, but is there any documentation
on how Lucene uses/allocates memory?

Best wishes,
Rui Wang

On 6 Dec 2011, at 06:11, KARTHIK SHIVAKUMAR wrote:

> hi
>
>>> would the memory usage go through the roof?
>
> Yup ....
>
> My past experience got me pickles in there...
>
> with regards
> karthik
>
> On Mon, Dec 5, 2011 at 11:28 PM, Rui Wang <rw...@ebi.ac.uk> wrote:
>
>> Hi All,
>>
>> We are planning to use Lucene in our project, but are not entirely sure
>> about some of the design decisions we have made. Below are the details;
>> any comments/suggestions are more than welcome.
>>
>> The requirements of the project are below:
>>
>> 1. We have tens of thousands of files, their size ranging from 500M to a
>> few terabytes, and the majority of the contents in these files will not
>> be accessed frequently.
>>
>> 2. We are planning to keep less accessed contents outside of our
>> database and store them on the file system.
>>
>> 3. We also have code to get the binary position of these contents in the
>> files. Using these binary positions, we can quickly retrieve the contents
>> and convert them into our domain objects.
>>
>> We think Lucene provides a scalable solution for storing and indexing
>> these binary positions, so the idea is that each piece of the content in
>> the files will be a document, and each document will have at least an ID
>> field to identify the content and a binary position field containing the
>> start and stop positions of the content. Having done some performance
>> testing, it seems to us that Lucene is well capable of doing this.
>>
>> At the moment, we are planning to create one Lucene index per file, so if
>> we have new files to be added to the system, we can simply generate a new
>> index. The problem is to do with searching: this approach means that we
>> need to create a new IndexSearcher every time a file is accessed through
>> our web service. We know that it is rather expensive to open a new
>> IndexSearcher, and are thinking of using some kind of pooling mechanism.
>> Our questions are:
>>
>> 1. Is this one index per file approach a viable solution? What do you
>> think about pooling IndexSearcher?
>>
>> 2. If we have many IndexSearchers opened at the same time, would the
>> memory usage go through the roof? I couldn't find any documentation on
>> how Lucene uses/allocates memory.
>>
>> Thank you very much for your help.
>>
>> Many thanks,
>> Rui Wang
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>
> --
> *N.S.KARTHIK
> R.M.S.COLONY
> BEHIND BANK OF INDIA
> R.M.V 2ND STAGE
> BANGALORE
> 560094*
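
P.S. For reference, the "pooling mechanism" mentioned in the original question could look roughly like the sketch below. It is only an illustration, not something taken from Lucene: the class name SearcherPool and the opener function are hypothetical, and the pool is generic over any Closeable so that it makes no assumptions about the Lucene API. It keeps at most maxOpen resources open (one per file) and closes the least recently used one when that limit is exceeded, which puts an upper bound on how many searchers hold memory at once.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical bounded LRU pool. Keeps at most maxOpen resources open and
 * closes the least recently used one when a new resource pushes it out.
 */
class SearcherPool<T extends Closeable> {
    private final Function<String, T> opener;   // assumed: opens the resource for a file ID
    private final LinkedHashMap<String, T> open;

    SearcherPool(int maxOpen, Function<String, T> opener) {
        this.opener = opener;
        // accessOrder = true: entries are ordered least recently used first
        this.open = new LinkedHashMap<String, T>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, T> eldest) {
                if (size() <= maxOpen) {
                    return false;
                }
                try {
                    eldest.getValue().close();   // release the evicted resource
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                return true;
            }
        };
    }

    /** Returns the pooled resource for fileId, opening it on first use. */
    synchronized T get(String fileId) {
        T resource = open.get(fileId);           // get() also marks the entry as recently used
        if (resource == null) {
            resource = opener.apply(fileId);
            open.put(fileId, resource);          // may evict and close the eldest entry
        }
        return resource;
    }

    synchronized int openCount() {
        return open.size();
    }
}
```

In our case the opener would be whatever code opens the index for a given file and wraps it in a Closeable. One caveat with this sketch: a resource that is evicted while another request is still using it would be closed underneath that request, so real use would need reference counting or something similar on top.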