Re: Lucene Index File Format

2012-09-30 Thread Selvakumar
Hi Pranab Kumar, I'm not looking to read the documents through IndexReader. I just want to know how Lucene persists its data in the index. I just want to learn about the metadata and the meta-objects of the Lucene index. On 10/1/2012 10:44 AM, parnab kumar wrote: Hi, U

Re: Lucene Index File Format

2012-09-30 Thread parnab kumar
Hi, Use IndexReader instead. You can loop through the index and read one document at a time. Thanks, Parnab On Mon, Oct 1, 2012 at 10:33 AM, Selvakumar wrote: > Hi, > > I'm new to Lucene and I'm reading the docs on Lucene. > > > I read through the Lucene Index File Format, so to e

Lucene Index File Format

2012-09-30 Thread Selvakumar
Hi, I'm new to Lucene and I'm reading the docs on Lucene. I read through the Lucene Index File Format, so as an exercise I tried to open the Lucene index in a text editor. The editor showed what looked like encrypted text. If I open the index directory with Luke the entire index is opened but

Re: Index size doubles every time when I synchronize the RAM-based index with the FD-based index

2012-09-30 Thread Cheng
Yes. I build RAM indexes from disk and update the RAM indexes when new docs come in (Step 1). When the number of new docs gets to 10,000, I persist the RAM indexes to disk (Step 2). The bigger concern, however, is the update. I don't know how much RAM is eaten up, but I suppose whenever do t

Re: Index size doubles every time when I synchronize the RAM-based index with the FD-based index

2012-09-30 Thread Ian Lea
Are you loading it from disk, adding loads of docs, then writing it back to disk? That would do it. How many docs in the memory index? How many on disk? What version of Lucene? -- Ian. On Fri, Sep 28, 2012 at 1:56 AM, Cheng wrote: > Hi, > > I have a RAM-based index which occasionally needs t

Re: Variable term weighting while indexing

2012-09-30 Thread parnab kumar
Hi Erick, Can you please share your thoughts on the following: Since Lucene by default does vector space scoring, the weight component for a term from the document is nothing but its term frequency. Now if I have an associated payload weight for that term, when the fi