Re: Lock obtain timed exception

2007-05-23 Thread Laxmilal Menaria
Lock obtain timed out: java.io.IOException: Lock obtain timed out: [EMAIL PROTECTED]:\WINDOWS\TEMP\lucene-22e0ad3c019e26a6e2991b0e6ed97e1c-commit.lock

Lock obtain timed exception

2007-05-23 Thread Laxmilal Menaria
Hello everyone, I am getting a "Lock obtain timed out" exception while searching an index. My steps: I created a Lucene index in the first week of May 2007 and have changed nothing in the index folder since then; I am only searching. The searcher code uses only MultiSearcher. But now I am getting "Lock obtain t

Re: Integrate Lucene search facilities with existing databases

2007-05-23 Thread Doron Cohen
Huajing Li wrote: > I am working on an application that must deal with ranking on highly dynamic > metadata. For example, suppose I want to provide ranking based on the number > of downloads of hit documents. A user may log-in to the system and send a > query, which will be answered by Lucene in

Re: WhitespaceAnalyzer [was: Re: regaridng Reader.terms()]

2007-05-23 Thread Mohammad Norouzi
Sorry Steven, that change is in WhitespaceTokenizer, not WhitespaceAnalyzer, but in the Analyzer I had to call the tokenizer. On 5/24/07, Mohammad Norouzi <[EMAIL PROTECTED]> wrote: Hi Steven, Thank you so much for your thorough comments about Analyzer. I wrote that class a couple of months ago; now I

Re: WhitespaceAnalyzer [was: Re: regaridng Reader.terms()]

2007-05-23 Thread Mohammad Norouzi
Hi Steven, Thank you so much for your thorough comments about Analyzer. I wrote that class a couple of months ago. Looking at my customized Analyzer now, the only change I made is as follows. The original class has this method: protected boolean isTokenChar(char c) { return !Character.isW

Integrate Lucene search facilities with existing databases

2007-05-23 Thread Huajing Li
Hi all, I am working on an application that must deal with ranking on highly dynamic metadata. For example, suppose I want to provide ranking based on the number of downloads of hit documents. A user may log-in to the system and send a query, which will be answered by Lucene in a traditional wa

WITH_POSITIONS_OFFSETS versus WITH_OFFSETS

2007-05-23 Thread Michael Mitiaguin
What is the practical use of WITH_POSITIONS_OFFSETS? Isn't WITH_OFFSETS sufficient, given that iterating getStartOffset effectively gives the values of the array elements from getTermPositions?

HitCollector or Hits

2007-05-23 Thread Carlos Pita
Hi folks, I need to collect some global information from my first 1000 search results in order to build up some search refining components containing only relevant values (those which correspond to at least one of the first 1000 hits). For example, the results are products and there is a store fi

Highlighting fast and highlighting all text

2007-05-23 Thread Michael Mitiaguin
I browsed this list and the contributions and find it difficult to determine whether there is anything that can be used straightforwardly to highlight all hits (no fragmenting) in a large chunk of text. Probably my query should be sent as 3 separate ones: 1. The fastest possible fragment highligh

Re: How to filter fields with hits from result set

2007-05-23 Thread Erick Erickson
Two things to watch... 1> Think about indexing the special page-end token with an increment gap of 0 (see SynonymAnalyzer in Lucene In Action). That preserves the sense of phrases across page breaks. 2> Assembling the span query is tricky. Search the mail archive for SpanQuery to see an exchange
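
The first point above (a page-end token indexed with an increment of 0) can be illustrated without Lucene. The following is only a sketch of the position arithmetic, assuming each ordinary token advances the position by 1 and the marker by 0; the marker string `$$pageend$$` and the class name are hypothetical and do not come from the thread.

```java
import java.util.Arrays;
import java.util.List;

class PageBreakPositions {
    // Hypothetical marker token; the thread does not name the actual token used.
    static final String PAGE_END = "$$pageend$$";

    // Assigns a position to every token. Ordinary tokens advance the position
    // by 1; the page-end marker advances it by 0, so it shares the position of
    // the last word on the page and leaves no gap before the next page.
    static int[] positions(List<String> tokens) {
        int[] result = new int[tokens.size()];
        int position = -1; // the first ordinary token lands on position 0
        for (int i = 0; i < tokens.size(); i++) {
            int increment = tokens.get(i).equals(PAGE_END) ? 0 : 1;
            position = Math.max(0, position + increment);
            result[i] = position;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("quick", "brown", PAGE_END, "fox");
        System.out.println(Arrays.toString(positions(tokens)));
    }
}
```

Because "brown" and "fox" end up on adjacent positions, a phrase such as "brown fox" still matches even though a page break sits between the two words.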

RE: How to filter fields with hits from result set

2007-05-23 Thread Andreas Guther
Erick, Thank you very much for your response. That sounds very interesting. Let me do some experimenting to see if I fully understood your solution. Otherwise I have to come back to you with more questions. Andreas -Original Message- From: Erick Erickson [mailto:[EMAIL PROTECTED] Se

Who has sample code of remote multiple servers multiple indexes searching?

2007-05-23 Thread Su.Cheng
Hi, I studied "5.6 Searching across multiple Lucene indexes 178" in <>. I have 2 remote search computers (SearchServer) that work as index servers and receive search requests from a search client (SearchClient, the 3rd computer). An error message, "Exception in thread "main" java.rmi.UnmarshalException", was t

Re: How to filter fields with hits from result set

2007-05-23 Thread Erick Erickson
As luck would have it, I've done something very similar. What I had to do is index a special token at the end of each page. Then I could get the term offsets for each page. Then I used one of the SpanQuery.getSpans to get all of the offsets of the hits throughout all of the pages. Now I have
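
The message is cut off before it says how the two lists are combined, so what follows is only one plausible sketch, not the poster's code. It assumes the page-end marker positions are available as a strictly increasing array (one entry per page) and the hit positions as a second array, and it uses a binary search to map each hit to its page. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

class PageLocator {
    // pageEndPositions: strictly increasing positions of the page-end marker,
    // one per page. Returns the 0-based page containing hitPosition, or -1 if
    // the hit lies after the last marker.
    static int pageOf(int[] pageEndPositions, int hitPosition) {
        int idx = Arrays.binarySearch(pageEndPositions, hitPosition);
        if (idx < 0) {
            idx = -idx - 1; // insertion point: first marker greater than the hit
        }
        return idx < pageEndPositions.length ? idx : -1;
    }

    // Collects the distinct pages that contain at least one hit.
    static SortedSet<Integer> pagesWithHits(int[] pageEndPositions, int[] hitPositions) {
        SortedSet<Integer> pages = new TreeSet<>();
        for (int hit : hitPositions) {
            int page = pageOf(pageEndPositions, hit);
            if (page >= 0) {
                pages.add(page);
            }
        }
        return pages;
    }

    public static void main(String[] args) {
        int[] pageEnds = {10, 25, 40};
        int[] hits = {3, 11, 12, 41};
        System.out.println(pagesWithHits(pageEnds, hits));
    }
}
```

A hit whose position equals a marker position is assigned to the page that marker closes, which matches the case where the marker shares a position with the last word of the page.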

How to filter fields with hits from result set

2007-05-23 Thread Andreas Guther
Hi, If a search returns a document that has multiple fields with the same name, is there a way to filter only those fields that contain hits? Background: I am indexing documents and we store all content in our index for display reasons. We want to show only those pages containing hits. My fir

WhitespaceAnalyzer [was: Re: regaridng Reader.terms()]

2007-05-23 Thread Steven Rowe
Hi Mohammad, WhitespaceAnalyzer uses Java's Character.isWhitespace(char) method to determine whether or not a character should be part of a token. As far as I know, this method is problematic only for characters outside of the Basic Multilingual Plane (BMP). I think Lucene should switch to using
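
The message is truncated before it names what Lucene should switch to. As an illustration only, the sketch below assumes the code-point overload `Character.isWhitespace(int)` (available since Java 5) and walks the string by code point rather than by `char`, so a supplementary character is handled as one unit instead of two surrogates. The class name is hypothetical and this is not Lucene's tokenizer.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointWhitespaceSplitter {
    // Splits text on whitespace, iterating by Unicode code point so that
    // characters outside the BMP are never examined as separate surrogates.
    static List<String> split(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (Character.isWhitespace(codePoint)) {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.appendCodePoint(codePoint);
            }
            i += Character.charCount(codePoint); // 1 for BMP, 2 for supplementary
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // U+10400 is written as the surrogate pair \uD801\uDC00.
        List<String> tokens = split("a\uD801\uDC00b  c");
        System.out.println(tokens.size());
    }
}
```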

Re: How to avoid score calculation completely?

2007-05-23 Thread Yonik Seeley
On 5/23/07, Zhang, Lisheng <[EMAIL PROTECTED]> wrote: We have been using Lucene for years and it serves us well. Sometimes when we issue a query, we only want to know how many hits it produces, without getting any docs back. Is it possible to completely avoid score calculation to get total count back? I

How to avoid score calculation completely?

2007-05-23 Thread Zhang, Lisheng
Hi, We have been using Lucene for years and it serves us well. Sometimes when we issue a query, we only want to know how many hits it produces, without getting any docs back. Is it possible to completely avoid score calculation to get total count back? I understand score calculation needs a loop for all m

Re: CAD files, Images

2007-05-23 Thread jim shirreffs
Thank you for the reply. I knew the answer but was compelled to ask anyway. CAD files like AutoCad/ProE/CaTia do contain some useful text and it is possible to get at that and index it. But mostly it's vectors, and there is not much a text engine can do with vectors. Thanks again. jim s --

Re: MoreLikeThis?

2007-05-23 Thread Donna L Gresh
Thank you-- Donna L. Gresh Services Research, Mathematical Sciences Department IBM T.J. Watson Research Center (914) 945-2472 http://www.research.ibm.com/people/g/donnagresh [EMAIL PROTECTED] Otis Gospodnetic <[EMAIL PROTECTED]> 05/22/2007 05:33 PM Please respond to java-user@lucene.apache.or

Re: regaridng Reader.terms()

2007-05-23 Thread Mohammad Norouzi
Wow, very nice comments Thank you so much Erick. You really showed me the way -- Regards, Mohammad -- see my blog: http://brainable.blogspot.com/

Re: CAD files, Images

2007-05-23 Thread Erick Erickson
No, you can only index text. It's the same thing as indexing HTML documents or XML documents. You index "important stuff" from the guts of the doc. So, if you can get some text out of these docs, you can index *that*, possibly along with page information that allows you to, say, display the relev

Re: regaridng Reader.terms()

2007-05-23 Thread Erick Erickson
You may have to index things twice, once for searching and once UN_TOKENIZED for display. Say you have a bunch of service names you want to display: "service one", "service two", "service three". If you use WhitespaceAnalyzer, TOKENIZED, you index the tokens: service (note, there are three of these), one, two
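
The difference between the two indexing modes can be seen in a small sketch that does not use Lucene at all. It assumes whitespace splitting via `String.split("\\s+")` as an approximation of WhitespaceAnalyzer, and simply counts the terms each mode would produce; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

class TokenizedVsUntokenized {
    // Approximates TOKENIZED with a whitespace analyzer: each value is split
    // into words, and each word becomes a separate term.
    static Map<String, Integer> tokenized(String[] values) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : values) {
            for (String token : value.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Approximates UN_TOKENIZED: the whole value is kept as a single term,
    // which is what a display list of service names needs.
    static Map<String, Integer> untokenized(String[] values) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : values) {
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] services = {"service one", "service two", "service three"};
        System.out.println(tokenized(services));
        System.out.println(untokenized(services));
    }
}
```

Enumerating terms from the tokenized form yields individual words, with "service" occurring three times, while the untokenized form keeps each full service name intact.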

CAD files, Images

2007-05-23 Thread jim shirreffs
Is it possible to index CAD formats such as AutoCad or CGM? I know some commercial products (excalaber) claim to be able to do that. If so, what about TIFF? thanks jim s