Re: why would Explanation be *ALWAYS* null?

2009-01-21 Thread Chris Hostetter
: even mention that possibility. When I debug through the call, I find the : "explanation" in this code inside class MarkupContainsQuery (which is : the code that gets called): ... : // TODO SY - implement : > return null; : } that

Re: Lucene app to run as daemon service in windows and linux

2009-01-21 Thread Ganesh
Thanks. http://wrapper.tanukisoftware.org/doc/english/download.jsp is free to use in open source projects. It requires a license for use in commercial products. I am looking for something that is free or costs less. Regards Ganesh - Original Message - From: joseph.syj...@ph.lawson.com To: jav

Re: Indexing and Searching Web Application

2009-01-21 Thread Amin Mohammed-Coleman
Hi Please ignore my last email. I have managed to work out how to fix the problem. Sent reply without morning coffee! Thanks Amin Sent from my iPhone On 21 Jan 2009, at 22:32, Erick Erickson wrote: NOTE: you're iterating over 'searchers' and adding to indexSearchers. Is that a typo

Re: Indexing and Searching Web Application

2009-01-21 Thread Amin Mohammed-Coleman
Hi Thanks for your reply. You are right, it looks as if the original list is the problem. The list I loop over is Spring-configured to return a list of index searchers. Each index searcher looks at a different index. I would like to inject the list of index searchers as we may have a requirement to

Re: Lucene app to run as daemon service in windows and linux

2009-01-21 Thread Joseph.Syjuco
Hi, You can try this one http://wrapper.tanukisoftware.org/doc/english/download.jsp "XP is making a bet. It is betting that it is better to do a simple thing today and pay a little more tomorrow to change it if it needs it, than to do a more complicated thing today that may never be used anyw

Lucene app to run as daemon service in windows and linux

2009-01-21 Thread Ganesh
Hello all, This question is not related to Lucene, but I hope some folks have faced this issue. My Java app uses RMI and I want to run it as a service on both Windows and Linux. Are there any open source projects available to do that? Could Tomcat host an RMI app? Please advise, if

Re: Jdbc persitence Index

2009-01-21 Thread Otis Gospodnetic
Haroldo, Have you looked at Solr? http://lucene.apache.org/solr Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Haroldo Nascimento > To: lucene-user lucene-user > Sent: Wednesday, January 21, 2009 5:19:52 PM > Subject: Jdbc persitence In

Re: contrib Benchmark enwiki problem

2009-01-21 Thread Jason Rutherglen
The XML file temp/enwiki-20070527-pages-articles.xml was downloaded by "ant get-enwiki expand-enwiki". The docs.file in extractWikipedia.alg and wikipedia.alg points to it. The error message is regarding work/enwiki.txt. Is there a how-to on this stuff? What is an alg? On Wed, Jan 21, 2009 at

Re: contrib Benchmark enwiki problem

2009-01-21 Thread Michael McCandless
You should download Wikipedia's XML file manually yourself, uncompress it, and then edit docs.file in that alg to point to it. Mike Jason Rutherglen wrote: I downloaded trunk via SVN. Went to trunk/contrib/benchmark. Executed ant enwiki. I'm not sure what else needs to be done. Receiv

contrib Benchmark enwiki problem

2009-01-21 Thread Jason Rutherglen
I downloaded trunk via SVN. Went to trunk/contrib/benchmark. Executed ant enwiki. I'm not sure what else needs to be done. Received this error: enwiki: [echo] Working Directory: /Users/jrutherg/dev/lucenetrunk/trunk/contrib/benchmark/work [java] Running algorithm from: /Users/jrutherg

Re: Nightly source builds of Lucene ..

2009-01-21 Thread Chris Hostetter
: maybe try: : : http://hudson.zones.apache.org/hudson/view/Solr/job/Solr-trunk/ I believe Kay was asking about Lucene-Java nightly builds. : > I am trying to access the nightly lucene builds here at - : > http://people.apache.org/builds/lucene/java/nightly/ . It does not seem to : > be avail

Re: Words that need protection from stemming, i.e., protwords.txt

2009-01-21 Thread Chris Hostetter
: Subject: Words that need protection from stemming, i.e., protwords.txt : References: <49710068.1090...@gmail.com> : <3994e409-bff0-4348-9d84-4c762b150...@gmail.com> : <497111f8.7020...@stimulussoft.com> : In-Reply-To: <497111f8.7020...@stimulussoft.com> http://people.apache.org/~hossman/#thre

Re: Search Across All Fields

2009-01-21 Thread Chris Hostetter
: Subject: Search Across All Fields : References: <49710068.1090...@gmail.com> : <3994e409-bff0-4348-9d84-4c762b150...@gmail.com> : In-Reply-To: <3994e409-bff0-4348-9d84-4c762b150...@gmail.com> http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a

Re: Indexing and Searching Web Application

2009-01-21 Thread Erick Erickson
NOTE: you're iterating over 'searchers' and adding to indexSearchers. Is that a typo? Assuming that it's not and your 'searchers' is the copy you talk about (so you can freely add?) you never delete from the underlying indexSearchers. But you do close elements because you're closing a reference to

Jdbc persitence Index

2009-01-21 Thread Haroldo Nascimento
Hi, I need to replicate my index across many nodes, but I need to run the indexing process only once. I thought of persisting the index in a DB and later loading the index from the DB onto all the nodes. Is this a good solution? I searched and discovered that Compass

Lucene Index Monitor

2009-01-21 Thread Haroldo Nascimento
Hi, I need to monitor my searches and index. I know that there is LIMO, the "Lucene Index Monitor", but the latest version, 0.6, works with version 2.0 of Lucene. Is there another application that monitors the index and searches? Thanks

Re: indexing database

2009-01-21 Thread Chris Lu
This is not a Lucene question, but a JDBC question. The code is not releasing the JDBC connection, statement, and result set, and what's worse, the code is creating new connections when paginating the results. -- Chris Lu - Instant Scalable Full-Text Search On Any Database/
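The point about releasing JDBC resources can be sketched as follows. This is only a minimal illustration, not the poster's code: the class `PagedTableReader`, the `RowHandler` callback, and the table and column names are hypothetical, and the `LIMIT ? OFFSET ?` syntax assumes MySQL (the original question uses a `MysqlDataSource`). One connection and one prepared statement are reused for every page, and try-with-resources closes each result set and the statement even if an exception is thrown.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class PagedTableReader {
    /** Hypothetical callback; in an indexing job this would build and add one document per row. */
    interface RowHandler {
        void handle(ResultSet row) throws SQLException;
    }

    /**
     * Builds the paging query. Table and column names cannot be bound as
     * parameters, so they must be trusted constants, never user input.
     */
    static String pageSql(String table, String orderColumn) {
        return "SELECT * FROM " + table + " ORDER BY " + orderColumn + " LIMIT ? OFFSET ?";
    }

    /** Reads the whole table page by page on a single, caller-owned connection. */
    static int readAll(Connection conn, String table, String orderColumn,
                       int pageSize, RowHandler handler) throws SQLException {
        int total = 0;
        // One statement for all pages; closed automatically when the block exits.
        try (PreparedStatement ps = conn.prepareStatement(pageSql(table, orderColumn))) {
            for (int offset = 0; ; offset += pageSize) {
                ps.setInt(1, pageSize);
                ps.setInt(2, offset);
                int rowsInPage = 0;
                // Each page's result set is closed before the next page is fetched.
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        handler.handle(rs);
                        rowsInPage++;
                    }
                }
                total += rowsInPage;
                if (rowsInPage < pageSize) {
                    break; // a short page means the table is exhausted
                }
            }
        }
        return total;
    }
}
```

The caller would open the connection once, for example `try (Connection conn = ds.getConnection()) { PagedTableReader.readAll(conn, "docs", "id", 1000, handler); }`, so that it is also released when indexing finishes.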

indexing database

2009-01-21 Thread cemsoft
Hi, I am trying to index a database table with approx. 1.5 million rows. First I got an OutOfMemory exception; then I used offsets, and it now works up to 5 rows. Lucene is new to me. Should I set something else? Below is the code that I use: private static void indexData(MysqlDataSource ds) throws SQL

Re: Indexing and Searching Web Application

2009-01-21 Thread Amin Mohammed-Coleman
Hi I am trying to get an understanding of what the best practice is. I am not saying that I am right; it may well be that my code is wrong, which is why I am posting this. The original loop that I am iterating over is a Spring-injected dependency. I don't reuse that in the multisearcher

Re: Indexing and Searching Web Application

2009-01-21 Thread Ian Lea
Oh well, it's your code so I guess you know what it does. But I still think you're wrong. If your list contains 3 searchers at the top of the loop and all 3 need to be reopened then the list will contain 6 searchers at the end of the loop, and the first 3 will be for readers that you've just clos
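The scenario described here (3 searchers becoming 6, with the first 3 closed) can be reproduced with a small sketch. It assumes the refreshed searchers are added to the same list that held the old ones; `FakeSearcher` is a hypothetical stand-in and not a Lucene class, so this only illustrates the list handling, not the actual reopen call.

```java
import java.util.ArrayList;
import java.util.List;

class SearcherListRefresh {
    /** Hypothetical stand-in for a searcher over a reopenable reader; not a Lucene class. */
    static final class FakeSearcher {
        final int generation;
        boolean closed = false;

        FakeSearcher(int generation) {
            this.generation = generation;
        }

        /** Mimics "reopen, close the old reader, wrap the new one in a new searcher". */
        FakeSearcher refresh() {
            closed = true;
            return new FakeSearcher(generation + 1);
        }
    }

    /** Appends the refreshed searchers, so the closed originals stay in the list. */
    static void refreshByAppending(List<FakeSearcher> searchers) {
        // Iterate over a copy so the list is not modified while it is being iterated.
        List<FakeSearcher> snapshot = new ArrayList<>(searchers);
        for (FakeSearcher old : snapshot) {
            searchers.add(old.refresh());
        }
    }

    /** Replaces each slot in place, so the list keeps its size and holds only open searchers. */
    static void refreshByReplacing(List<FakeSearcher> searchers) {
        for (int i = 0; i < searchers.size(); i++) {
            searchers.set(i, searchers.get(i).refresh());
        }
    }

    static List<FakeSearcher> threeSearchers() {
        List<FakeSearcher> list = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            list.add(new FakeSearcher(0));
        }
        return list;
    }

    static int countClosed(List<FakeSearcher> searchers) {
        int closed = 0;
        for (FakeSearcher s : searchers) {
            if (s.closed) {
                closed++;
            }
        }
        return closed;
    }

    public static void main(String[] args) {
        List<FakeSearcher> appended = threeSearchers();
        refreshByAppending(appended);
        // prints "6 searchers, 3 closed"
        System.out.println(appended.size() + " searchers, " + countClosed(appended) + " closed");

        List<FakeSearcher> replaced = threeSearchers();
        refreshByReplacing(replaced);
        // prints "3 searchers, 0 closed"
        System.out.println(replaced.size() + " searchers, " + countClosed(replaced) + " closed");
    }
}
```

Building a fresh list on every refresh, as described elsewhere in the thread, avoids the problem in the same way as replacing in place: no closed searcher is ever handed on.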

Re: why would Explanation be *ALWAYS* null?

2009-01-21 Thread Ian Lea
Have you tried it without the rewrite, marked as "expert" in the javadocs? I'm not a lucene expert but I am an experienced user and in several years of usage I don't think I've ever had cause to call Query.rewrite(). -- Ian. On Wed, Jan 21, 2009 at 7:44 PM, wrote: > R2.4 > > There is much abo

Re: Indexing and Searching Web Application

2009-01-21 Thread Amin Mohammed-Coleman
Hi, That is what I am doing with the line: indexSearchers.add(indexSearch); indexSearchers is an ArrayList that is constructed before the for loop: List indexSearchers = new ArrayList(); I then pass the indexSearchers to : multiSearcher = new MultiSearcher(indexSearchers.toArray(new

Re: Indexing and Searching Web Application

2009-01-21 Thread Ian Lea
I haven't been following this thread, but shouldn't you be replacing the old searcher in your list of searchers rather than just adding the new one on the end? Could be wrong - I find the names in your code snippet rather confusing. -- Ian. On Wed, Jan 21, 2009 at 6:59 PM, Amin Mohammed-Coleman

why would Explanation be *ALWAYS* null?

2009-01-21 Thread rolarenfan
R2.4 There is much about Lucene that I do not understand, so it may be that there is some simple or obvious mistake I am making. I build an index, get hits (documents) back from it, with various non-zero scores. Now I call this code: Explanation expl = _searcher.explain(rewrite, docIn

Re: first time using lucene

2009-01-21 Thread Michael Wechner
nitin gopi wrote: Hello, I have recently downloaded Lucene. My project is to add LSI (Latent Semantic Indexing) to Lucene's indexing method, to improve the indexing of documents. I am totally new to this field. Please help me in this matter and guide me on how to proceed in the

RE: first time using lucene

2009-01-21 Thread Steven A Rowe
Hi Nitin, Lucene in Action 2nd edition is a good place to start. If you want free stuff, check out the Lucene wiki Resources page: . Also, some basic code on the wiki: .

Re: Indexing and Searching Web Application

2009-01-21 Thread Amin Mohammed-Coleman
Hi I did the following according to the javadocs: for (IndexSearcher indexSearcher: searchers) { IndexReader reader = indexSearcher.getIndexReader(); IndexReader newReader = reader.reopen(); if (newReader != reader) { reader.close(); } reader = newReader; IndexSearcher indexSearch = n

first time using lucene

2009-01-21 Thread nitin gopi
Hello, I have recently downloaded Lucene. My project is to add LSI (Latent Semantic Indexing) to Lucene's indexing method, to improve the indexing of documents. I am totally new to this field. Please help me in this matter and guide me on how to proceed in the right direction. I fir

Re: Lucene Performance issue

2009-01-21 Thread Anshul jain
@Erick: Yes, I changed the default field; it is "bagofwords" now. @Ian: Yes, both indexes were optimized, and I didn't do any deletions. Version 2.4.0. I'll repeat the experiment, just to be sure. Meanwhile, do you have any documentation on Lucene fields? What I need to know is how Lucene is storing field

Re: Lucene Performance issue

2009-01-21 Thread Ian Lea
> ... > I can for sure say that multiple copies are not indexed. But the number of > fields in which text is divided are many. Can that be a reason? Not for that amount of difference. You may be sure that you are not indexing multiple copies, but I'm not. Convince me - create 2 new indexes via the

Re: Lucene Performance issue

2009-01-21 Thread Erick Erickson
Note that your two queries are different unless you've changed the default operator. Also, your bagOfWords query is searching across your default field for the second two terms. Your bagOfWords query is really something like bagOfWords:Alexander OR [default field]:history OR [default field]:Macedon. Best Erick On Wed, Jan 21, 20

Re: Lucene Performance issue

2009-01-21 Thread Erick Erickson
I agree with Ian that these times sound way too high. I'd also ask whether you fire a few warmup searches at your server before measuring the increased time; you might just be seeing the cache being populated. Best Erick On Wed, Jan 21, 2009 at 10:42 AM, Ian Lea wrote: > Hi > > > Space: 700Mb v

Re: Lucene Performance issue

2009-01-21 Thread Anshul jain
Hi, thanks for the reply. For the document in my last mail: Multi-field query: name: Alexander AND domain: history AND first_sentence: Macedon. Single-field query: bagOfWords: Alexander history Macedon. I can say for sure that multiple copies are not indexed. But the number of fields in which text

Re: Lucene Performance issue

2009-01-21 Thread Ian Lea
Hi Space: 700Mb vs 4.5Gb sounds way too big a difference. Are you sure you aren't loading multiple copies of the data or something like that? Queries: a 20 times slowdown for a multi field query also sounds way too big. What do the simple and multi field queries look like? -- Ian. On Wed,

Re: Indexing and Searching Web Application

2009-01-21 Thread Amin Mohammed-Coleman
Hi Will give that a go. Thanks Sent from my iPhone On 21 Jan 2009, at 12:26, "Ganesh" wrote: I am closing the old reader and it is working fine for me. Refer to the IndexReader.reopen javadoc. // Below is the code snippet from the IndexReader.reopen javadoc: IndexReader reader = ... ... IndexRead

Lucene Performance issue

2009-01-21 Thread Anshul jain
Hi, I've indexed around half a million XML documents. Here is the document sample: cogito:Name Alexander the Great cogito:domain ancient history cogito:first_sentence Alexander the Great (Greek: or Megas Alexandros; July 20 356 BC June 10 323 BC), also known as Alexander III

Re: Indexing and Searching Web Application

2009-01-21 Thread Ganesh
I am closing the old reader and it is working fine for me. Refer to the IndexReader.reopen javadoc. // Below is the code snippet from the IndexReader.reopen javadoc: IndexReader reader = ... ... IndexReader newReader = reader.reopen(); if (newReader != reader) { ... // reader was reopened reader.close(); //Old

Re: Lucene Indexing and Search Policy

2009-01-21 Thread Anshum
It's about building a custom Similarity class that scores using your normalization factors etc. This might help in that case: http://www.gossamer-threads.com/lists/lucene/java-user/69553 -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opini

Re: Lucene Indexing and Search Policy

2009-01-21 Thread M Seetha Ramaiah
Hi Anshum, Even that document says that a higher frequency implies a higher score. My question is: if the score is based only on the frequency, won't it be inappropriate for Internet-based search? For example, if Google did the same thing, when I search for "Microsoft", there is a chance that Google

Re: Lucene Indexing and Search Policy

2009-01-21 Thread Anshum
Hi msr, Perhaps this could be useful for you. In short, Lucene implements a modified vector space model. http://jayant7k.blogspot.com/2006/07/document-scoringcalculating-relevance_08.html -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the op
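For reference, the "modified vector space model" can be written out. The following is a sketch of the scoring function as documented in the Similarity javadoc of the Lucene 2.x line, using the default implementation (DefaultSimilarity) for the tf and idf terms. It is given here only to show which factors besides raw term frequency enter the score.

```latex
\mathrm{score}(q,d) \;=\; \mathrm{coord}(q,d)\cdot \mathrm{queryNorm}(q)\cdot
  \sum_{t \in q} \Bigl( \mathrm{tf}(t \in d)\cdot \mathrm{idf}(t)^{2}\cdot
  \mathrm{boost}(t)\cdot \mathrm{norm}(t,d) \Bigr)

% DefaultSimilarity:
\mathrm{tf}(t \in d) = \sqrt{\mathrm{freq}(t,d)}, \qquad
\mathrm{idf}(t) = 1 + \ln\!\left(\frac{\mathrm{numDocs}}{\mathrm{docFreq}(t)+1}\right)
```

Here coord rewards documents that match more of the query terms, idf down-weights terms that occur in many documents, and norm folds together the index-time document boost, field boost, and field length normalization. Link-based signals such as a reference count are not part of this formula; an application that has such a signal would have to supply it itself, for instance through an index-time document boost.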

Lucene Indexing and Search Policy

2009-01-21 Thread MSR
Hi, Does Lucene take into consideration anything other than the frequency of the query words in a document? If it does, what are the other considerations? If it is purely based on word frequency, is it appropriate for Internet-based search (where we need to consider reference count also)? Th

check if document is deleted using indexwriter

2009-01-21 Thread Marc Sturlese
Hey there, I would like to know how to check whether a document has been deleted when I am using an IndexWriter and the functions deleteDocument or updateDocument. I have seen that deleteDocument from IndexReader returns an integer, but in the IndexWriter's case it's a void. Any advice? Thanks in advance