Boosting numerical field

2012-05-18 Thread Meeraj Kunnumpurath
Hi, is there any way, in a query, to boost the relevance of a hit based on the value of a numerical field in the index? I.e., the higher the value of the field, the more relevant the hit is. Kind regards Meeraj

Unable to run LookupBenchmarkTest

2012-05-18 Thread Sudarshan Gaikaiwari
I am trying to run the LookupBenchmarkTest using the following command ant -v test -Dtestcase=LookupBenchmarkTest -Dtests.seed=24BC5D3301BB6D9 -Dargs="-Dfile.encoding=UTF-8" I see the following error -

Re: old fashioned....."Too many open files"!

2012-05-18 Thread Michel Blase
IndexSearcher Lucene 3.6 API: public void close() throws IOException Note that the underlying IndexReader is not closed, if IndexSearcher was constructed with IndexSearcher(IndexReader r). If t

Re: old fashioned....."Too many open files"!

2012-05-18 Thread Michel Blase
Ian was right! I didn't notice that before each insert the code was performing a search! but I'm not sure how to solve the problem! This is how I changed the code, after each search I'm closing the IndexSearcher but still I get too many open files! private IndexSearcher getSearcher() thro

RE: old fashioned....."Too many open files"!

2012-05-18 Thread Edward W. Rouse
I don't know. I do it as a matter of course. But if it fixes the problem, then at least you know why you are getting the error and can work on a scheme (using counters maybe), to do regular commits after every 10/20/100 documents. But you can't fix it until you know why it happens and this would c
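The counter scheme suggested in this reply could be sketched roughly as follows. This is a minimal illustration, not code from the thread: the class name BatchedCommitter and its methods are hypothetical, and the commit action is passed in as a plain Runnable so the sketch runs without Lucene on the classpath. With a real writer, the Runnable would presumably wrap a call such as im.commit() and convert its IOException into an unchecked exception.

```java
/**
 * Hypothetical helper: runs a commit action once every batchSize added documents
 * instead of after every single insert.
 */
final class BatchedCommitter {
    private final Runnable commitAction; // assumption: wraps the writer's commit call
    private final int batchSize;
    private int pending = 0;             // documents added since the last commit

    BatchedCommitter(Runnable commitAction, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.commitAction = commitAction;
        this.batchSize = batchSize;
    }

    /** Call once after each document has been added to the index. */
    void documentAdded() {
        pending++;
        if (pending >= batchSize) {
            flush();
        }
    }

    /** Commits any documents still pending, e.g. at shutdown. */
    void flush() {
        if (pending > 0) {
            commitAction.run();
            pending = 0;
        }
    }

    public static void main(String[] args) {
        int[] commits = {0};
        BatchedCommitter committer = new BatchedCommitter(() -> commits[0]++, 20);
        for (int i = 0; i < 50; i++) {
            committer.documentAdded();
        }
        committer.flush(); // pick up the remainder that did not fill a batch
        System.out.println("commits issued: " + commits[0]);
    }
}
```

Whether a batch size of 10, 20, or 100 is appropriate depends on how much uncommitted work the application can afford to lose; the sketch only shows the counting logic.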

Re: old fashioned....."Too many open files"!

2012-05-18 Thread Michel Blase
but commit after each insert should be really expensive and unnecessary! no? On Fri, May 18, 2012 at 10:31 AM, Edward W. Rouse wrote: > Have you tried adding im.commit() after adding a document? Could be all of > the uncommitted documents are leaving files open. > > > -Original Message- >

RE: old fashioned....."Too many open files"!

2012-05-18 Thread Edward W. Rouse
Have you tried adding im.commit() after adding a document? Could be all of the uncommitted documents are leaving files open. > -Original Message- > From: Michel Blase [mailto:mblas...@gmail.com] > Sent: Friday, May 18, 2012 1:24 PM > To: java-user@lucene.apache.org > Subject: Re: old fashi

Re: old fashioned....."Too many open files"!

2012-05-18 Thread Michel Blase
also... my problem is indexing! Preparation: private void SetUpWriters() throws Exception { Set set = IndexesPaths.entrySet(); Iterator i = set.iterator(); while(i.hasNext()){ Map.Entry index = (Map.Entry)i.next(); int id = (Integer)index.getKey()

Re: old fashioned....."Too many open files"!

2012-05-18 Thread Chris Hostetter
: the point is that I keep the readers open to share them across search. Is : this wrong? your goal is fine, but where in your code do you think you are doing that? I don't see any readers ever being shared. You open new ones (which are never closed) in every call to getSearcher() : > >
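The point made in this reply, that a reader is only shared if it is opened once and then reused by every call, could be sketched as below. This is an illustrative assumption rather than the poster's code: SharedResource is a hypothetical generic holder, and the opener and closer callbacks stand in for opening and closing an IndexReader so that the example runs without Lucene. Lucene 3.x also ships a SearcherManager class intended for this kind of sharing; its API is not reproduced here.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/**
 * Hypothetical holder that opens a resource lazily, hands the same instance
 * to every caller, and closes it exactly once when asked to.
 */
final class SharedResource<T> {
    private final Supplier<T> opener; // assumption: opens the underlying reader
    private final Consumer<T> closer; // assumption: closes the underlying reader
    private T instance;               // null until first use or after close()

    SharedResource(Supplier<T> opener, Consumer<T> closer) {
        this.opener = opener;
        this.closer = closer;
    }

    /** Returns the shared instance, opening it only if none is currently open. */
    synchronized T get() {
        if (instance == null) {
            instance = opener.get();
        }
        return instance;
    }

    /** Closes the shared instance if one is open; a later get() reopens it. */
    synchronized void close() {
        if (instance != null) {
            closer.accept(instance);
            instance = null;
        }
    }

    public static void main(String[] args) {
        int[] opens = {0};
        SharedResource<Object> shared = new SharedResource<>(
                () -> { opens[0]++; return new Object(); },
                o -> { /* a real closer would release file handles here */ });
        for (int i = 0; i < 1000; i++) {
            shared.get(); // every search reuses the same instance
        }
        shared.close();
        System.out.println("resources opened: " + opens[0]);
    }
}
```

The contrast with the code discussed in the thread is that get() never opens a second instance while one is still open, so the number of open file handles stays bounded regardless of how many searches run.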

Re: old fashioned....."Too many open files"!

2012-05-18 Thread Michel Blase
Thanks Ian, the point is that I keep the readers open to share them across search. Is this wrong? On Fri, May 18, 2012 at 9:58 AM, Ian Lea wrote: > You may need to cut it down to something simpler, but I can't see any > reader.close() calls. > > > -- > Ian. > > > On Fri, May 18, 2012 at 5:47 P

Re: old fashioned....."Too many open files"!

2012-05-18 Thread Ian Lea
You may need to cut it down to something simpler, but I can't see any reader.close() calls. -- Ian. On Fri, May 18, 2012 at 5:47 PM, Michel Blase wrote: > This is the code in charge of managing the Lucene index. Thanks for your > help! > > > > package luz.aurora.lucene; > > import java.io.File

Re: old fashioned....."Too many open files"!

2012-05-18 Thread Michel Blase
This is the code in charge of managing the Lucene index. Thanks for your help! package luz.aurora.lucene; import java.io.File; import java.io.IOException; import java.util.*; import luz.aurora.search.ExtendedQueryParser; import org.apache.lucene.analysis.Analyzer; import org.apache.lucene.analy

Re: NullPointerException using IndexReader.termDocs when there are no matches

2012-05-18 Thread Michael McCandless
OK I committed an improvement to the 3.6.x javadocs (in case we do a 3.6.1). Thanks! Mike McCandless http://blog.mikemccandless.com On Fri, May 18, 2012 at 9:37 AM, Edward W. Rouse wrote: > Thanks, I missed that. And the API doc fails to mention it, though it is > pretty standard for a next()

Re: Better Way of calculating Cosine Similarity between documents

2012-05-18 Thread nemeskey . david
Hi, can you provide a minimal example (no. of sentences max 5)? 1 -> 0.85 seems a rather big decrease in score to me, so unless you removed the longest sentence with the rarest words in the collection, I smell some bug, e.g. you forgot to remove it from the denominator as well, etc. It wo

Re: Performance of storing data in Lucene vs other (No)SQL Databases

2012-05-18 Thread Glen Newton
Storing content in large indexes can significantly add to index time. The model of indexing fields only in Lucene and storing just a key, and then storing the content in some other container (DBMS, NoSql, etc) with the key as lookup is almost a necessity for this use case unless you have a complet

Performance of storing data in Lucene vs other (No)SQL Databases

2012-05-18 Thread Konstantyn Smirnov
Hi all, apologies, if this question was already asked before. If I need to store a lot of data (say, millions of documents), what would perform better (in terms of reads/writes/scalability etc.): Lucene with stored fields (Field.Store.YES) or another NoSql DB like Mongo or Couch? Does it make se

RE: NullPointerException using IndexReader.termDocs when there are no matches

2012-05-18 Thread Edward W. Rouse
Thanks, I missed that. And the API doc fails to mention it, though it is pretty standard for a next() method. > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Thursday, May 17, 2012 6:20 PM > To: java-user@lucene.apache.org > Subject: Re: NullPoint

Re: Better Way of calculating Cosine Similarity between documents

2012-05-18 Thread Akos Tajti
Thanks! On Fri, May 18, 2012 at 11:19 AM, Kasun Perera wrote: > Hi all > > I’m indexing collection of documents using Lucene specifying TermVector at > the indexing time. Then I retrieve terms and their term frequencies by > reading the index and calculate TF-IDF scores vector for each docum

Better Way of calculating Cosine Similarity between documents

2012-05-18 Thread Kasun Perera
Hi all, I’m indexing a collection of documents using Lucene, specifying TermVector at indexing time. Then I retrieve terms and their term frequencies by reading the index and calculate a TF-IDF score vector for each document. Then, using the TF-IDF vectors, I calculate pairwise cosine similarity betwee
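The pairwise step described in this post, cosine similarity between two TF-IDF vectors, could look roughly like the sketch below. It assumes each document's vector has already been extracted into a sparse map from term to TF-IDF weight; the class name CosineSimilarity and the sample terms and weights are hypothetical, and how the weights are read from the index is not shown.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: cosine similarity between two sparse term-weight vectors. */
final class CosineSimilarity {

    /** Returns dot(a, b) / (|a| * |b|), or 0.0 if either vector has zero length. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        // Iterate over the smaller map; only terms present in both contribute.
        Map<String, Double> small = a.size() <= b.size() ? a : b;
        Map<String, Double> large = (small == a) ? b : a;
        double dot = 0.0;
        for (Map.Entry<String, Double> e : small.entrySet()) {
            Double other = large.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // avoid dividing by zero for empty or all-zero vectors
        }
        return dot / (normA * normB);
    }

    /** Euclidean length of a sparse vector. */
    static double norm(Map<String, Double> v) {
        double sumOfSquares = 0.0;
        for (double w : v.values()) {
            sumOfSquares += w * w;
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        Map<String, Double> doc1 = new HashMap<>();
        doc1.put("lucene", 3.0);
        doc1.put("index", 4.0);
        Map<String, Double> doc2 = new HashMap<>();
        doc2.put("lucene", 3.0);
        System.out.println("similarity: " + cosine(doc1, doc2));
    }
}
```

Because both norms are computed over the full vectors, removing a term from a document has to be reflected in that document's map before calling cosine, otherwise the denominator no longer matches the numerator.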

Re: Store a query in a database for later use

2012-05-18 Thread Ahmet Arslan
> 2. toString() doesn't always generate a query that the > QueryParser can parse. I remember similar discussion, I think Xml-Query-Parser is more suitable for this use case. http://www.lucidimagination.com/blog/2009/02/22/exploring-query-parsers/ --