RE: Merge Exception in Lucene 2.4

2009-08-26 Thread Sumanta Bhowmik
Hi, I ran a long-running test and got this exception: Exception in thread "Lucene Merge Thread #39" org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException: read past EOF at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.

Re: Extending Sort/FieldCache

2009-08-26 Thread Shai Erera
Thanks a lot for the response! I wanted to avoid two things: * Writing the logic that invokes cache-refresh upon IndexReader reload. * Writing my own TopFieldCollector which uses this cache. I guess I don't have any other choice but to write both of them, or try to make TFC more "customizable" such

Re: Lucene in Action Rev2

2009-08-26 Thread Erik Hatcher
I've pinged Manning to get this corrected. Thanks for the heads-up. Erik On Aug 26, 2009, at 5:58 PM, tsuraan wrote: In the free first chapter of the new Lucene in Action book, it states that it's targeting Lucene 3.0, but on the Manning page for the book, it says the code in the boo

Re: Seattle / NW Hadoop, HBase Lucene, etc. Meetup , Wed August 26th, 6:45pm

2009-08-26 Thread Bradford Stephens
Hello, My apologies, but there was a mix-up reserving our meeting location, and we don't have access to it. I'm very sorry, and beer is on me next month. Promise :) Sent from my Internets On Aug 25, 2009, at 4:21 PM, Bradford Stephens wrote: Hey there, Apologies for this not going out

Re: Lucene in Action Rev2

2009-08-26 Thread Michael McCandless
Thanks for spotting that! We'll work with Manning to get it fixed... Yes, we're targeting 3.0 not 2.3 :) Mike On Wed, Aug 26, 2009 at 5:58 PM, tsuraan wrote: > In the free first chapter of the new Lucene in Action book, it states > that it's targeting Lucene 3.0, but on the Manning page for th

Re: Is there a way to check for field "uniqueness" when indexing?

2009-08-26 Thread Jason Rutherglen
Daniel, You may want to look at SOLR-1375, which enables ID checking using a BloomFilter (with a specified error rate of false positives). Otherwise, for what you're trying to do, you'd need to create a hash map? -J On Thu, Aug 13, 2009 at 7:33 AM, Daniel Shane wrote: > Hi all! > > I'm currently ru
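To make the Bloom filter idea concrete, here is a minimal, self-contained sketch of ID checking with a configurable false-positive rate. It is not the SOLR-1375 code: the class name IdBloomFilter, its methods, and the hashing scheme are assumptions made for illustration, and the sizing uses the standard textbook formulas for a Bloom filter.

```java
import java.util.BitSet;

// Hypothetical illustration only; not the SOLR-1375 implementation.
final class IdBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    public IdBloomFilter(int expectedIds, double falsePositiveRate) {
        if (expectedIds <= 0 || falsePositiveRate <= 0.0 || falsePositiveRate >= 1.0) {
            throw new IllegalArgumentException("need expectedIds > 0 and 0 < rate < 1");
        }
        // Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        double ln2 = Math.log(2);
        numBits = Math.max(1, (int) Math.ceil(-expectedIds * Math.log(falsePositiveRate) / (ln2 * ln2)));
        numHashes = Math.max(1, (int) Math.round((double) numBits / expectedIds * ln2));
        bits = new BitSet(numBits);
    }

    // Double hashing: the i-th probe is h1 + i * h2, reduced into [0, numBits).
    private int index(String id, int i) {
        int h1 = id.hashCode();
        int h2 = 0x811C9DC5; // FNV-1a style second hash over the characters
        for (int c = 0; c < id.length(); c++) {
            h2 ^= id.charAt(c);
            h2 *= 16777619;
        }
        h2 |= 1; // keep the step odd so probes do not all collapse onto one bit
        return Math.floorMod(h1 + i * h2, numBits);
    }

    // Returns true if the ID was definitely not seen before, false if it may be a duplicate.
    public boolean addIfAbsent(String id) {
        boolean wasNew = false;
        for (int i = 0; i < numHashes; i++) {
            int idx = index(id, i);
            if (!bits.get(idx)) {
                wasNew = true;
                bits.set(idx);
            }
        }
        return wasNew;
    }

    // May return true for an ID never added (false positive), never false for an added ID.
    public boolean mightContain(String id) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(id, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        IdBloomFilter seen = new IdBloomFilter(1000, 0.01);
        System.out.println(seen.addIfAbsent("doc-1")); // first time: reported as new
        System.out.println(seen.addIfAbsent("doc-1")); // second time: reported as possible duplicate
    }
}
```

A filter like this can only say "definitely new" or "possibly seen", so a "possibly seen" answer would still need a lookup against the index (or a hash map, as suggested above) to confirm. It also cannot forget an ID, which matters when documents are deleted.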

Lucene in Action Rev2

2009-08-26 Thread tsuraan
In the free first chapter of the new Lucene in Action book, it states that it's targeting Lucene 3.0, but on the Manning page for the book, it says the code in the book is written for 2.3. I'm guessing that the book is the authority on what the book covers, but could somebody maybe change the Man

Lucene Search Performance Analysis Workshop

2009-08-26 Thread Andrzej Bialecki
Hi all, I am giving a free talk/workshop next week on how to analyze and improve Lucene search performance for native Lucene apps. If you've ever been challenged to get your Java Lucene search apps running faster, I think you might find the talk of interest. Free online workshop: Thursday,

Re: Is there a way to check for field "uniqueness" when indexing?

2009-08-26 Thread Yonik Seeley
On Wed, Aug 26, 2009 at 12:47 PM, Daniel Shane wrote: > Humm... there is something I don't get.. > > When you open up an index writer, you batch up adds and deletes. Now if you > create a signature for the document, as long as you add it works, but what > happens if you delete stuff from the index

Re: How to give a score for all documents?

2009-08-26 Thread Fabrício Raphael
Can you help me? How can I give customized scores to all documents? 2009/8/25 Fabrício Raphael > I am continuing work on wavelets in IR. In the article below you will > find an example. > > > http://www.ieeexplore.ieee.org/search/srchabstract.jsp?arnumber=4740460&isnumber=4740405&punumber=

Re: Is there a way to check for field "uniqueness" when indexing?

2009-08-26 Thread Daniel Shane
Humm... there is something I don't get.. When you open up an index writer, you batch up adds and deletes. Now if you create a signature for the document, as long as you add it works, but what happens if you delete stuff from the index using a query as well as adding? Does Solr also remember

Re: Lucene release 2.9

2009-08-26 Thread Mark Miller
Looks like we are going to pull back a day and start the freeze sometime tomorrow (Thursday, August 27, 2009). There is still a lot of documentation to catch up on - it wouldn't make sense to have anyone look over what we know is still wrong. Thanks, -- - Mark http://www.lucidimagination.com Mark M

Re: score from spans

2009-08-26 Thread Eran Sevi
I've done some work and would like to post it to the list in order to get some opinions and try to reach something that is satisfactory for everyone. One problem is that I'm actually using Lucene.Net and have written the code in C#. Another problem is that I'm using version 2.3.2 which might be a b

Re: Lucene gobbling file descriptors

2009-08-26 Thread Michael McCandless
This is not normal. As long as you are certain you close every IndexReader/Searcher that you opened, the number of file descriptors should stay "contained". Though: how many files are there in your index directory? Mike On Wed, Aug 26, 2009 at 9:18 AM, Chris Bamford wrote: > Hi there, > > I won
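The "close every reader you open" discipline can be sketched without Lucene itself. In the example below, TrackedReader is a hypothetical stand-in for anything that holds file descriptors (such as an IndexReader or Searcher), and the counter exists only to show that nothing stays open; none of these names come from the Lucene API.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; TrackedReader is not a Lucene class.
final class ReaderLeakDemo {

    // Stand-in for a resource that holds file descriptors while open.
    static final class TrackedReader implements Closeable {
        static final AtomicInteger OPEN = new AtomicInteger();
        private boolean closed;

        TrackedReader() {
            OPEN.incrementAndGet();
        }

        int search(String query) {
            if (closed) {
                throw new IllegalStateException("reader is closed");
            }
            return query.length(); // placeholder for a real hit count
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                OPEN.decrementAndGet();
            }
        }
    }

    // try-with-resources closes the reader even if search() throws.
    static int searchOnce(String query) {
        try (TrackedReader reader = new TrackedReader()) {
            return reader.search(query);
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            searchOnce("lucene");
        }
        System.out.println("open readers after 1000 searches: " + TrackedReader.OPEN.get());
    }
}
```

On Java versions without try-with-resources, the same guarantee comes from calling close() in a finally block. Sharing one long-lived reader across searches, rather than opening one per search, is another way to keep the descriptor count bounded.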

Re: Lucene gobbling file descriptors

2009-08-26 Thread Shai Erera
That's strange ... how do you execute your searches - does each search open up an IndexReader? Do you make sure to close them? Maybe those are file descriptors of files you index? Forgive the silly questions, but I've never seen Lucene run out of file handles ... Shai On Wed, Aug 26, 2009 at 4

Lucene gobbling file descriptors

2009-08-26 Thread Chris Bamford
Hi there, I wonder if someone can help? We have a successful Lucene app deployed on Tomcat which works well. As far as we can tell, our developers have observed all the guidelines in the Lucene FAQ, but on some of our installations, Tomcat eventually runs out of file descriptors and needs a r

Re: how to down-weight synonyms

2009-08-26 Thread abhay kumar
Hi, The first answer by Sven is more efficient and generally used. Abhay @Sven If you add the synonyms at query time you can assign a boost factor to the added synonyms that would boost the matches to a particular term down. -> something in the interval [0,1] On Wed, Aug 26, 2009 at 3:40 PM, Simon

Re: how to down-weight synonyms

2009-08-26 Thread Simon Willnauer
Hi Sven, While I have no idea about the example in LiA I can give you some quick pointers. If you add the synonyms at query time you can assign a boost factor to the added synonyms that would boost the matches to a particular term down. -> something in the interval [0,1] If you add the synonyms at
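A minimal sketch of the query-time approach: expand the user's term into a query string in which each synonym carries a boost from the interval [0,1], using the `^` boost operator of Lucene's classic query parser syntax. The class SynonymBoost, the method expand, and the sample synonyms are assumptions for illustration; this is not the Lucene in Action example, and the terms are assumed to need no escaping.

```java
import java.util.List;

// Hypothetical illustration only; builds a query string, does not call Lucene.
final class SynonymBoost {

    // Produces e.g. "(fast OR quick^0.5 OR rapid^0.5)": the original term keeps
    // the default boost of 1, each synonym is weighted down by synonymBoost.
    public static String expand(String term, List<String> synonyms, float synonymBoost) {
        if (synonymBoost < 0f || synonymBoost > 1f) {
            throw new IllegalArgumentException("synonym boost must be in [0,1]");
        }
        StringBuilder query = new StringBuilder("(").append(term);
        for (String synonym : synonyms) {
            query.append(" OR ").append(synonym).append('^').append(synonymBoost);
        }
        return query.append(')').toString();
    }

    public static void main(String[] args) {
        // prints (fast OR quick^0.5 OR rapid^0.5)
        System.out.println(expand("fast", List.of("quick", "rapid"), 0.5f));
    }
}
```

The resulting string would then be handed to the query parser. Building the query programmatically and setting the boost on each synonym clause is the equivalent route when no query string is involved.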

how to down-weight synonyms

2009-08-26 Thread Sven Fischer
Hi, I implemented a synonym search using the example from chapter 4.6 of the Lucene in Action book. Now I want to extend the example so that synonyms are boosted less than the original word the user searched for. Is there a way to do it? If so, I would like to get any help on how

Re: "Read timed out" behind firewall - Ports closed? Loopback?

2009-08-26 Thread Simon Willnauer
David, Lucene is a framework that offers full-text indexing and search capabilities with very limited support for client/server communication. The only remote communication mechanism included in Lucene (that I know about - but I'm very confident I did not miss anything related to that) is the RemoteSea

Re: "Read timed out" behind firewall - Ports closed? Loopback?

2009-08-26 Thread David de la Torre
Dear Simon, Firstly, thank you very much for your answer. I've been trying to debug this problem for a while and I am a bit at a loss. I am using Lucene as the search engine included in a document management system called KnowledgeTree (http://wiki.knowledgetree.com/Troubleshooting_the_Document_I

Re: "Read timed out" behind firewall - Ports closed? Loopback?

2009-08-26 Thread Simon Willnauer
David, I cannot follow you. What kind of Lucene application are you talking about? AFAIK Lucene does not use XML-RPC anywhere and we do not have any dependency on it (am I missing something?). There is a RemoteSearcher / RemoteSearchable in core (until 2.4.1) and now in contrib/remote which uses RMI

"Read timed out" behind firewall - Ports closed? Loopback?

2009-08-26 Thread David de la Torre
When running Lucene on a machine with a firewall, I got the following error message, which I think must be related to the firewall. In fact, when I shut down the firewall, the error disappears. It must be something relating to the ports I have open. Lucene says it is running on port 8875. Is th