correct me if I'm wrong.
Thanks,
Xiaocheng
maureen tanuwidjaja wrote: Ya... I think I will store it in the database so
that later it could be used in scoring/ranking for retrieval... :)
Another thing I would like to see is whether the precision or recall will be
much affected by this...
One side-effect of turning off the
norms may be that the scoring/ranking will be different? Do you need to search
by each of these many fields? If not, you probably don't have to index these
fields (but store them for retrieval?).
Just a thought.
Xiaocheng
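To make the norms suggestion above concrete: in the Lucene API of that era (2.x), norms could be omitted per field at indexing time. A minimal sketch, assuming the 2.x Field API; the field names ("tag", "content") are hypothetical, not from this thread:

```java
// Sketch: omit norms on a field you don't need length-normalized
// scoring for, while keeping a normal tokenized field alongside it.
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class NoNormsExample {
    public static Document makeDoc(String tag, String content) {
        Document doc = new Document();
        // Indexed without norms: no length normalization or index-time
        // boost is stored, saving one byte per field per document.
        doc.add(new Field("tag", tag,
                Field.Store.YES, Field.Index.NO_NORMS));
        // A regular tokenized field keeps its norms for scoring.
        doc.add(new Field("content", content,
                Field.Store.NO, Field.Index.TOKENIZED));
        return doc;
    }
}
```

As Xiaocheng notes, omitting norms changes ranking, since length normalization no longer applies to that field.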
Michael McCandless wrote:
OK Mike, I'll try it and see whether it could work :) Then I will proceed to
optimize the index.
Well then I guess it's fine to use the default value for maxMergeDocs, which
is Integer.MAX_VALUE?
Thanks a lot
Regards,
Maureen
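For what it's worth, maxMergeDocs did default to Integer.MAX_VALUE (no cap on merged segment size) in the IndexWriter of that era. A hedged sketch of the relevant setters, assuming the 2.x-era API; the index path is hypothetical:

```java
// Sketch: the IndexWriter tuning knobs discussed in this thread,
// shown with their default values made explicit.
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class WriterSettings {
    public static void main(String[] args) throws Exception {
        // create=true starts a fresh index at this (hypothetical) path
        IndexWriter writer =
                new IndexWriter("/path/to/index", new StandardAnalyzer(), true);
        writer.setMaxMergeDocs(Integer.MAX_VALUE); // the default: no cap
        writer.setMergeFactor(10);                 // the default merge factor
        // ... add documents here ...
        writer.close();
    }
}
```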
Michael McCandless <[EMAIL PROTECTED]> wrote:
Hi all,
How do I disable the Lucene norm factor?
Thanks,
Maureen
Hi Mike,
How do I disable/turn off the norms? Is it done while indexing?
Thanks,
Maureen
Oops, sorry, mistyping...
I get the search results in 30 SECONDS to 3 minutes, which is actually
quite unacceptable for the "search engine" I'm building... Is there any
recommendation on how searching could be made faster?
maureen tanuwidjaja <[EMAIL PROTECTED]> wrote: Hi
the unoptimized one?
I have the searching result in 30 to 3 minutes, which is actually quite
unacceptable for the "search engine" I build...Is there any recommendation on
how faster searching could be done?
Thanks,
Maureen
Michael McCandless <[EMAIL PROTECTED]>
d to 3 minutes in searching inside
this unoptimized index
How about the memory consumption? Will it take a greater amount of memory
if using the optimized one?
Thanks a lot
Regards,
Maureen
Michael McCandless <[EMAIL PROTECTED]> wrote:
"mau
Dear All
How much disk space is actually needed to optimize the index? The
explanation given in the documentation seems to be very different from the
practical situation.
I have an index file of size 18.6 GB and I am going to optimize it. I keep
this index on a mobile hard disk with capacit
<[EMAIL PROTECTED]> wrote:
"maureen tanuwidjaja" wrote:
> I had an existing index file with the size 20.6 GB... I haven't done any
> optimization on this index yet. Now I have a HDD of 100 GB, but apparently
> when I create a program to optimize (which simply calls writer.opti
I also would like to know whether searching in the index file eats lots of
memory... I always run out of memory when doing searching, i.e. it gives the
exception "java heap space" (although I have put -Xmx768 in the VM argument)... Is
there any way to solve it?
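Not from the thread, but a common cause of heap exhaustion during searching is opening a fresh IndexSearcher for every query instead of reusing one. A minimal sketch of sharing a single searcher, assuming the 2.x-era API; the index path is hypothetical:

```java
// Sketch: one IndexSearcher opened lazily and shared by all queries,
// so each search does not re-read the index's term data into the heap.
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

public class SharedSearcher {
    private static IndexSearcher searcher; // opened once, reused

    public static synchronized IndexSearcher get() throws Exception {
        if (searcher == null) {
            searcher = new IndexSearcher("/path/to/index");
        }
        return searcher;
    }

    public static Hits search(Query q) throws Exception {
        return get().search(q);
    }
}
```

Separately, note that `-Xmx768` without a unit suffix may not mean what is intended; the usual spelling is `-Xmx768m`.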
Hi,
I had an existing index file with the size 20.6 GB... I haven't done any
optimization on this index yet. Now I have a HDD of 100 GB, but apparently when I
create a program to optimize (which simply calls writer.optimize() on this
index file), it gives the error that there is not enough space on
Hi all,
I was just wondering whether it is sensible and possible, if I have 660,000
documents to be indexed, to set the merge factor to 660,000 instead of the
default value 10 (...and this means no merging while indexing) and later, after
closing the index, to use the IndexWriter to optimize/merge t
Hi,
May I also ask whether there is a way to use writer.optimize() without
indexing the files from the beginning?
It took me about 17 hrs to finish building an unoptimized index (finished when
I call IndexWriter.close()). I just wonder whether this existing index could be
optimized...
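Yes: optimize() can be run on an already-built index with no re-indexing. A minimal sketch, assuming the 2.x-era API; the index path is hypothetical:

```java
// Sketch: open the EXISTING index (create=false) and merge all of its
// segments into one, without touching the source documents.
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class OptimizeExisting {
    public static void main(String[] args) throws Exception {
        // create=false opens the existing index instead of overwriting it
        IndexWriter writer =
                new IndexWriter("/path/to/index", new StandardAnalyzer(), false);
        writer.optimize(); // needs substantial transient free disk space
        writer.close();
    }
}
```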
Hi ,
I would like to know about optimizing index...
The exception was hit because the disk filled up while optimizing the index, and
hence the index has not been closed yet.
Is the unclosed index dangerous? Can I perform searching on such an index
correctly? Is the index robust?
than
-----Original Message-----
From: maureen tanuwidjaja [mailto:[EMAIL PROTECTED]
Sent: 01 February 2007 14:22
To: java-user@lucene.apache.org
Subject: Building lucene index using 100 Gb Mobile HardDisk
Dear All,
I was indexing 660,000 XML documents. The unoptimized index file was
successfully built in about 17 hrs.
Dear All,
I was indexing 660,000 XML documents. The unoptimized index file was
successfully built in about 17 hrs... This index file resides in my D drive,
which has 38 GB of free space. This space is insufficient for optimizing the
index file --> I read the Lucene documentation said about its
I think so... Btw, may I ask your opinion: will it be useful to optimize, let's
say, every 50,000-60,000 documents? I have a total of 660,000 docs...
Erik Hatcher <[EMAIL PROTECTED]> wrote:
On Jan 28, 2007, at 9:15 PM, maureen tanuwidjaja wrote:
> OK,This is the printout of the stack tr
OK, this is the printout of the stack trace while failing to index the
190,000th document:
Indexing C:\sweetpea\wikipedia_xmlfiles\part-18\491886.xml
Indexing C:\sweetpea\wikipedia_xmlfiles\part-18\491887.xml
Indexing C:\sweetpea\wikipedia_xmlfiles\part-18\491891.xml
Indexin
wrote:
>
>
> did you try triggering a thread dump to see what it was doing at that
> point?
>
> depending on your merge factors and other IndexWriter settings it could
> just be doing a really big merge.
>
> : Date: Sat, 27 Jan 2007 09:40:47 -0800 (PST)
> : From: maureen tan
point?
depending on your merge factors and other IndexWriter settings it could
just be doing a really big merge.
: Date: Sat, 27 Jan 2007 09:40:47 -0800 (PST)
: From: maureen tanuwidjaja
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: My program stops indexi
Hi all,
Is there any limitation on the number of files that Lucene can handle?
I indexed a total of 3 XML documents; however, it stops at the 1th
document.
No warning,no error ,no exception as well.
Indexing C:\sweetpea\wikipedia_xmlfiles\part-18\491876.xml
Indexing C:\sweetp
oh thanks then:)
Mikhail Pustovalov <[EMAIL PROTECTED]> wrote: in your java
command line, of course :)
Example : java -Xms128m -Xmx1024m -server -Djava.awt.headless=true
-XX:MaxPermSize=128m protei.Starter
On Fri, 26 Jan 2007 19:39:13 +0300, maureen tanuwidjaja
Er... where shall I put that "-XX:MaxPermSize=128m"?
Thanks Pustovalov
Regards,
Maureen
Mikhail Pustovalov <[EMAIL PROTECTED]> wrote: try this:
-XX:MaxPermSize=128m
On Fri, 26 Jan 2007 19:32:45 +0300, maureen tanuwidjaja
wrote
Hi Mike and Erick and all,
I have fixed my code and yes, indexing is much faster than previously, when I
was doing such "hammering" with the IndexWriter.
However, I am now encountering an error while indexing:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
This error n
Thanks Doron =)
Regards,
Maureen
Doron Cohen <[EMAIL PROTECTED]> wrote: Hi Maureen,
Some relevant info in the file formats doc -
http://lucene.apache.org/java/docs/fileformats.html
Regards,
Doron
maureen tanuwidjaja wrote on 25/01/2007
01:31:25:
> btw Daniel,can please gi
> and clean up
> after it
> someplace else. In your code snippet, opening the IndexWriter in your
> DocumentIndexer then having to remember to close it in main is a recipe for
> disaster. Trust me on this one, I've spent way more time than I'd like
> to admit debugging th
have deleted the directory where the index file exists and tried to index
from the beginning... I don't know whether 7 hrs later it will raise the same
problem, "Lock obtain timed out".
4. I use the latest version of Lucene (nightly build).
Thanks and Regards,
Maureen
Michael McCandless
Best regards ^^
Maureen
maureen tanuwidjaja <[EMAIL PROTECTED]> wrote:
Thanks a lot Daniel :)
Regards,
Maureen
Daniel Noll wrote:
maureen tanuwidjaja wrote:
> Before implementing this search engine,I have designed to build the
> index in such a way that every XML tag
Hi,
I am indexing thousands of XML documents, then it stops after indexing for
about 7 hrs.
...
Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37003.xml
Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37004.xml
Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37008.xml
Indexing C:\swee
Thanks a lot Daniel :)
Regards,
Maureen
Daniel Noll <[EMAIL PROTECTED]> wrote:
maureen tanuwidjaja wrote:
> Before implementing this search engine,I have designed to build the
> index in such a way that every XML tag is converted using a binary
> value, in order to
Hi...
I am a final-year undergrad. My final year project is about a search engine for
XML documents... I am currently building this system using Lucene.
The example of XML element from an XML document :
--
This is my