Hi,
      Nothing needs to change in the indexing process. All that is required
is a little pre-processing.
      If the structure of your XML file is the same as what I described
earlier, then split the 35MB file into small files and make sure that each
new small file is well-formed XML.
      Now index the small files (more than one) instead of the one large
file.

      Could you describe the structure of your XML file and what you are
trying to index?
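
A rough sketch of processing the small files one at a time, so that only one
small tree is in memory at any moment. It uses the JDK's built-in DOM and XPath
classes in place of JDOM so that it runs without extra libraries; the class
name SmallFileProcessor and its methods are made up for illustration, and the
XPath query is whatever your existing code already uses.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical helper: parses small XML files one after another. */
class SmallFileProcessor {

    /** Parses one small file and returns the text of every node matching the query. */
    static List<String> extract(File xmlFile, String xpathQuery) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(xmlFile);
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(xpathQuery, doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    /**
     * Processes the files sequentially and returns the number of matched nodes.
     * The tree built for one file is unreachable once its iteration ends,
     * so the garbage collector can reclaim it before the next file is parsed.
     */
    static int processAll(List<File> smallFiles, String xpathQuery) throws Exception {
        int total = 0;
        for (File file : smallFiles) {
            List<String> values = extract(file, xpathQuery);
            // Hand the extracted values to the existing indexing code here.
            total += values.size();
        }
        return total;
    }
}
```

The same loop shape should carry over to JDOM: build one Document per small
file inside the loop instead of one Document for the whole 35MB file.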

On 1/22/07, aslam bari <[EMAIL PROTECTED]> wrote:

Hi Saikrishna,
Thanks for the reply,
but I don't know how to go about this. Here is my code sample; let me
know where to change it.

SAXBuilder builder = new SAXBuilder();

// CONTENT here is a ByteArrayInputStream; I know I can also give a file URL
// here. Let me know which is best.
Document doc = builder.build(CONTENT);

loop(---)
{
    doc.selectNodes(xpathquery);
}

Thanks...
----- Original Message ----
From: saikrishna venkata pendyala <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Monday, 22 January, 2007 10:07:27 AM
Subject: Re: Big size xml file indexing


Hi,
       I have indexed a 6.2GB XML file using Lucene. What I did was:
        1.  I split the 6.2GB file into small files, each 10MB in size.
        2.  Then I wrote a Python script to quantize the number of
documents in each file.

        Structure of my xml file is """
       <document>
        -----
        -----
        </document>
        <document>
        -----
        -----
        </document> """
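
For a file with the structure above, the splitting step could be sketched in
Java roughly as follows. This is not the Python script mentioned in step 2,
only an illustration under some assumptions: every closing </document> tag sits
on its own line, the big file has no XML declaration or root element of its
own, and the text is UTF-8. The class name DocumentSplitter and the wrapping
<documents> root element are invented for the example; the wrapper is there so
that each small file is well-formed on its own.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits a run of <document> blocks into small files. */
class DocumentSplitter {

    /**
     * Copies the input line by line and starts a new chunk file after every
     * docsPerChunk closing </document> tags. Only one line is held in memory
     * at a time, so the size of the input file does not matter.
     */
    static List<Path> split(Path input, Path outDir, int docsPerChunk) throws IOException {
        List<Path> chunks = new ArrayList<>();
        Files.createDirectories(outDir);
        BufferedWriter out = null;
        int docsInChunk = 0;
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (out == null) {
                    if (line.trim().isEmpty()) {
                        continue; // ignore blank lines between chunks
                    }
                    Path chunk = outDir.resolve("chunk-" + chunks.size() + ".xml");
                    chunks.add(chunk);
                    out = Files.newBufferedWriter(chunk, StandardCharsets.UTF_8);
                    out.write("<documents>"); // assumed wrapper root element
                    out.newLine();
                }
                out.write(line);
                out.newLine();
                if (line.trim().equals("</document>")) {
                    docsInChunk++;
                    if (docsInChunk == docsPerChunk) {
                        out.write("</documents>");
                        out.newLine();
                        out.close();
                        out = null;
                        docsInChunk = 0;
                    }
                }
            }
            if (out != null) { // last chunk holds fewer documents than the others
                out.write("</documents>");
                out.newLine();
            }
        } finally {
            if (out != null) {
                out.close();
            }
        }
        return chunks;
    }
}
```

Splitting by document count rather than by byte count keeps every <document>
block whole, and the count can be tuned until the chunks come out near the
size you want.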

Since you cannot go beyond 500MB of heap, this technique might help you,
provided of course that your file structure is the same.

On 1/22/07, aslam bari <[EMAIL PROTECTED]> wrote:
>
> Dear all,
> I am using Lucene to index XML files. For parsing I am using JDOM to get
> XPath nodes, do some manipulation on them, and index them. All things
> work well, but when the file size is very big, about 35 - 50 MB, it goes
> out of memory or takes a lot of time. How can I set some parameters to
> speed it up and take less memory to parse the file? The problem is that I
> cannot increase the heap size much, so I have to limit the heap size to
> 300 - 500 MB. Does anybody have a solution for this?
>
> Thanks...


