lengthNorm accessible?

2007-03-13 Thread maureen tanuwidjaja
correct me if I'm wrong. Thanks, Xiaocheng maureen tanuwidjaja wrote: Ya... I think I will store it in the database so that later it could be used in scoring/ranking for retrieval :) Another thing I would like to see is whether the precision or recall will be much affected by this...

Re: Urgent : How much actually the disk space needed to optimize the index?

2007-03-13 Thread maureen tanuwidjaja
te: One side-effect of turning off the norms may be that the scoring/ranking will be different. Do you need to search by each of these many fields? If not, you probably don't have to index these fields (but store them for retrieval?). Just a thought. Xiaocheng Michael McCandless wrot

Re: How to disable lucene norm factor?

2007-03-13 Thread maureen tanuwidjaja
OK Mike, I'll try it and see whether it could work :) Then I will proceed to optimize the index. Well then I guess it's fine to use the default value for maxMergeDocs, which is Integer.MAX_VALUE? Thanks a lot Regards, Maureen Michael McCandless <[EMAIL PROTECTED]> w

How to disable lucene norm factor?

2007-03-13 Thread maureen tanuwidjaja
Hi all, How to disable lucene norm factor? Thanks, Maureen
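For the Lucene 2.x versions current on this thread, norms are disabled per field at indexing time. A minimal sketch, assuming a local path and field names chosen for illustration; note that `Field.Index.NO_NORMS` also skips tokenizing the value:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

// Sketch for Lucene 2.x: index a field without length norms.
// Norms must be omitted at indexing time; they cannot be turned
// off at search time for an already-built index.
public class NoNormsExample {
    public static void main(String[] args) throws Exception {
        IndexWriter writer =
            new IndexWriter("/tmp/index", new StandardAnalyzer(), true);
        Document doc = new Document();
        // NO_NORMS = indexed, un-tokenized, and no norm byte stored,
        // which saves one byte per document per indexed field.
        doc.add(new Field("tag", "article",
                          Field.Store.YES, Field.Index.NO_NORMS));
        writer.addDocument(doc);
        writer.close();
    }
}
```

Omitting norms makes all field lengths look equal to the scorer, so ranking changes slightly, as Xiaocheng notes in the thread above.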

Re: Urgent : How much actually the disk space needed to optimize the index?

2007-03-13 Thread maureen tanuwidjaja
Hi Mike, How do I disable/turn off the norms? Is it done while indexing? Thanks, Maureen

Re: Urgent : How much actually the disk space needed to optimize the index?

2007-03-13 Thread maureen tanuwidjaja
Oops, sorry, I mistyped... I get search results in 30 SECONDS to 3 minutes, which is actually quite unacceptable for the "search engine" I am building... Is there any recommendation on how searching could be made faster? maureen tanuwidjaja <[EMAIL PROTECTED]> wrote: Hi

Re: Urgent : How much actually the disk space needed to optimize the index?

2007-03-13 Thread maureen tanuwidjaja
the unoptimized one? I get search results in 30 to 3 minutes, which is actually quite unacceptable for the "search engine" I am building... Is there any recommendation on how searching could be made faster? Thanks, Maureen Michael McCandless <[EMAIL PROTECTED]>

Re: Urgent : How much actually the disk space needed to optimize the index?

2007-03-13 Thread maureen tanuwidjaja
d to 3 minutes in searching inside this unoptimized index. How about the memory consumption? Will it take a greater amount of memory if using the optimized one? Thanks a lot Regards, Maureen Michael McCandless <[EMAIL PROTECTED]> wrote: "mau

Urgent : How much actually the disk space needed to optimize the index?

2007-03-13 Thread maureen tanuwidjaja
Dear All, How much disk space is actually needed to optimize the index? The explanation given in the documentation seems to be very different from the practical situation. I have an index file of size 18.6 GB and I am going to optimize it. I keep this index on a mobile hard disk with a capacit

Re: Optimizing Index

2007-02-22 Thread maureen tanuwidjaja
t;[EMAIL PROTECTED]> wrote: "maureen tanuwidjaja" wrote: > I had an existing index file with the size 20.6 GB... I haven't done any > optimization on this index yet. Now I have an HDD of 100 GB, but apparently > when I create a program to optimize (which simply calls writer.opti

Searching eats lots of memory?

2007-02-21 Thread maureen tanuwidjaja
I also would like to know whether searching in the index file eats lots of memory... I always run out of memory when doing searching, i.e. it gives the exception Java heap space (although I have put -Xmx768 in the VM arguments)... Is there any way to solve it?
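One likely culprit in the message above: `-Xmx768` has no unit suffix, and the JVM reads a bare number as bytes, so the intended 768 MB heap is never granted. A sketch of the intended invocation (the class name is hypothetical):

```shell
# -Xms sets the initial heap, -Xmx the maximum heap.
# "768m" means 768 megabytes; a bare "-Xmx768" is parsed as
# 768 *bytes* and the JVM refuses to start.
java -Xms256m -Xmx768m com.example.Searcher
```

Beyond the heap size itself, each open IndexSearcher holds norms and term-index data in memory, so sharing one searcher across queries rather than opening a new one per search also reduces pressure.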

Optimizing Index

2007-02-21 Thread maureen tanuwidjaja
Hi, I have an existing index file with the size 20.6 GB... I haven't done any optimization on this index yet. Now I have an HDD of 100 GB, but apparently when I create a program to optimize (which simply calls writer.optimize() on this index file), it gives the error that there is not enough space on

about merge factor

2007-02-11 Thread maureen tanuwidjaja
Hi all, I am just wondering whether it is sensible and possible, if I have 660,000 documents to be indexed, to set the merge factor to 660,000 instead of the default value 10 (...and this means no merging while indexing), and later, after closing the index, use the IndexWriter to optimize/merge t
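The tuning asked about above can be sketched as follows for Lucene 2.x (path and numbers are illustrative). An extreme mergeFactor does defer all merging to the final optimize(), but every unmerged segment keeps several files open, so the OS file-handle limit is usually hit long before 660,000 segments:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

// Sketch: postpone all merging until a single optimize() at the end.
public class MergeTuning {
    public static void main(String[] args) throws Exception {
        IndexWriter writer =
            new IndexWriter("/tmp/index", new StandardAnalyzer(), true);
        // With mergeFactor this large no merges happen during indexing,
        // but each flushed segment stays on disk with its own files.
        writer.setMergeFactor(660000);
        // ... addDocument() calls for all 660,000 documents ...
        writer.optimize(); // one big merge pass at the end
        writer.close();
    }
}
```

In practice a moderate mergeFactor (e.g. 10 to 100) tends to give a better trade-off between indexing speed and open-file count than disabling merges entirely.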

Is there any way to optimize existing unoptimized index?

2007-02-07 Thread maureen tanuwidjaja
Hi, May I also ask whether there is a way to use writer.optimize() without indexing the files from the beginning? It took me about 17 hrs to finish building an unoptimized index (finished when I call IndexWriter.close()). I just wonder whether this existing index could be optimized...
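Yes: in Lucene 2.x an IndexWriter opened with `create=false` attaches to the existing index rather than wiping it, so optimize() can run without redoing the 17-hour build. A minimal sketch (the path is a placeholder):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

// Sketch: optimize an already-built index in place.
public class OptimizeExisting {
    public static void main(String[] args) throws Exception {
        // create=false opens the existing index instead of
        // creating a new (empty) one over it.
        IndexWriter writer = new IndexWriter("/path/to/existing/index",
                                             new StandardAnalyzer(), false);
        writer.optimize(); // merges all segments into one
        writer.close();
    }
}
```

Note the disk-space caveat raised elsewhere in this digest: optimize temporarily needs roughly the index size again (more for a compound-format index) as free space while the merged copy is written.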

exception is hit while optimizing index

2007-02-07 Thread maureen tanuwidjaja
Hi, I would like to know about optimizing an index... An exception was hit due to the disk being full while optimizing the index, and hence the index has not been closed yet. Is the unclosed index dangerous? Can I perform searching on such an index correctly? Is the index still robust? than

RE: Building lucene index using 100 Gb Mobile HardDisk

2007-02-01 Thread maureen tanuwidjaja
e- From: maureen tanuwidjaja [mailto:[EMAIL PROTECTED] Sent: 01 February 2007 14:22 To: java-user@lucene.apache.org Subject: Building lucene index using 100 Gb Mobile HardDisk Dear All, I was indexing 660,000 XML documents.The unoptimized index file was successfully built in about 17 hrs..

Building lucene index using 100 Gb Mobile HardDisk

2007-02-01 Thread maureen tanuwidjaja
Dear All, I was indexing 660,000 XML documents. The unoptimized index file was successfully built in about 17 hrs... This index file resides on my D drive, which has 38 GB of free space. This space is insufficient for optimizing the index file --> I read the Lucene documentation said about its

Re: printout of the stack trace while failing to index the 190,000th document

2007-01-28 Thread maureen tanuwidjaja
I think so... By the way, may I ask your opinion: would it be useful to optimize, say, every 50,000-60,000 documents? I have a total of 660,000 docs... Erik Hatcher <[EMAIL PROTECTED]> wrote: On Jan 28, 2007, at 9:15 PM, maureen tanuwidjaja wrote: > OK, this is the printout of the stack tr

printout of the stack trace while failing to index the 190,000th document

2007-01-28 Thread maureen tanuwidjaja
OK, this is the printout of the stack trace while failing to index the 190,000th document: Indexing C:\sweetpea\wikipedia_xmlfiles\part-18\491886.xml Indexing C:\sweetpea\wikipedia_xmlfiles\part-18\491887.xml Indexing C:\sweetpea\wikipedia_xmlfiles\part-18\491891.xml Indexin

Sorry, it is the 190,000th document

2007-01-28 Thread maureen tanuwidjaja
te: > > > did you try triggering a thread dump to see what it was doing at that > point? > > depending on your merge factors and other IndexWriter settings it could > just be doing a really big merge. > > : Date: Sat, 27 Jan 2007 09:40:47 -0800 (PST) > : From: maureen tan

Indexwriter can't add the 10000th document to the index

2007-01-28 Thread maureen tanuwidjaja
point? depending on your merge factors and other IndexWriter settings it could just be doing a really big merge. : Date: Sat, 27 Jan 2007 09:40:47 -0800 (PST) : From: maureen tanuwidjaja : Reply-To: java-user@lucene.apache.org : To: java-user@lucene.apache.org : Subject: My program stops indexi

My program stops indexing after the 10,000th document is indexed

2007-01-27 Thread maureen tanuwidjaja
Hi all, Is there any limitation on the number of files that Lucene can handle? I indexed a total of 3 XML documents; however, it stops at the 10,000th document. No warning, no error, no exception either. Indexing C:\sweetpea\wikipedia_xmlfiles\part-18\491876.xml Indexing C:\sweetp

Re: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

2007-01-26 Thread maureen tanuwidjaja
Oh, thanks then :) Mikhail Pustovalov <[EMAIL PROTECTED]> wrote: in your java command line, of course :) Example: java -Xms128m -Xmx1024m -server -Djava.awt.headless=true -XX:MaxPermSize=128m protei.Starter On Fri, 26 Jan 2007 19:39:13 +0300, maureen tanuwidjaja

Re: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

2007-01-26 Thread maureen tanuwidjaja
E... where shall I put that "-XX:MaxPermSize=128m"? Thanks, Pustovalov Regards, Maureen Mikhail Pustovalov <[EMAIL PROTECTED]> wrote: try this: -XX:MaxPermSize=128m On Fri, 26 Jan 2007 19:32:45 +0300, maureen tanuwidjaja wrote

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

2007-01-26 Thread maureen tanuwidjaja
Hi Mike and Erick and all, I have fixed my code, and yes, indexing is much faster than previously, when I did such "hammering" with IndexWriter. However, I am now encountering an error while indexing: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space This error n
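For heap exhaustion during a long indexing run, one knob in Lucene 2.x bounds how many documents are buffered in RAM before a flush to disk. A hedged sketch, with the path and the value of 100 purely illustrative:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

// Sketch: cap the in-memory document buffer so a multi-hour
// indexing run does not accumulate unbounded heap.
public class BoundedBuffering {
    public static void main(String[] args) throws Exception {
        IndexWriter writer =
            new IndexWriter("/tmp/index", new StandardAnalyzer(), true);
        // In Lucene 2.x, buffered documents are flushed to a new
        // segment every maxBufferedDocs additions; a smaller value
        // trades some indexing speed for a lower memory ceiling.
        writer.setMaxBufferedDocs(100);
        // ... addDocument() loop over the XML files ...
        writer.close();
    }
}
```

If the OutOfMemoryError persists with a bounded buffer, the leak is more likely in the application's own document-building code (e.g. holding parsed XML trees) than in the writer.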

Re: Building Lucene index for XML document

2007-01-25 Thread maureen tanuwidjaja
Thanks Doron =) Regards, Maureen Doron Cohen <[EMAIL PROTECTED]> wrote: Hi Maureen, Some relevant info in the file formats doc - http://lucene.apache.org/java/docs/fileformats.html Regards, Doron maureen tanuwidjaja wrote on 25/01/2007 01:31:25: > btw Daniel,can please gi

Re: Lock obtain timed out SimpleFSLock

2007-01-25 Thread maureen tanuwidjaja
and clean up > after it > someplace else. In your code snippet, opening the IndexWriter in your > DocumentIndexer then having to remember to close it in main is a recipe for > disaster. Trust me on this one, I've spent way more time than I'd like > to admit debugging th

Re: Lock obtain timed out SimpleFSLock

2007-01-25 Thread maureen tanuwidjaja
have deleted the directory where the index file exists and tried to index from the beginning... I don't know whether 7 hrs later it will raise the same "Lock obtain timed out" problem 4. I use the latest version of Lucene (nightly build) Thanks and Regards, Maureen Michael McCandless

Re: Building Lucene index for XML document

2007-01-25 Thread maureen tanuwidjaja
Best regards ^^ Maureen maureen tanuwidjaja <[EMAIL PROTECTED]> wrote: Thanks a lot Daniel :) Regards, Maureen Daniel Noll wrote: maureen tanuwidjaja wrote: > Before implementing this search engine,I have designed to build the > index in such a way that every XML tag

Lock obtain timed out SimpleFSLock

2007-01-25 Thread maureen tanuwidjaja
Hi, I am indexing thousands of XML documents, and it stops after indexing for about 7 hrs... Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37003.xml Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37004.xml Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37008.xml Indexing C:\swee

Re: Building Lucene index for XML document

2007-01-25 Thread maureen tanuwidjaja
Thanks a lot Daniel :) Regards, Maureen Daniel Noll <[EMAIL PROTECTED]> wrote: maureen tanuwidjaja wrote: > Before implementing this search engine,I have designed to build the > index in such a way that every XML tag is converted using binary > value,in order to

Building Lucene index for XML document

2007-01-24 Thread maureen tanuwidjaja
Hi... I am a final-year undergrad. My final-year project is a search engine for XML documents. I am currently building this system using Lucene. An example of an XML element from an XML document: -- This is my