Re: Approaches/semantics for arbitrarily combining boolean and proximity search operators?

2012-05-25 Thread Trejkaz
On Sat, May 26, 2012 at 12:07 PM, Chris Harris wrote: > > Alternatively, if you insist that query > > merger w/5 (medical and agreement) > > should match document "medical x x x merger x x x agreement" > > then you can propagate 2x the parent's slop value down to child queries. This is in fact ex
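To make the arithmetic behind the 2x suggestion concrete, here is a minimal sketch. It assumes a deliberately simplified notion of proximity (the absolute difference between first-occurrence token positions); it is not Lucene's span matching, and the class and method names are hypothetical. In the example document both children sit within 5 positions of "merger", yet they are 8 positions apart from each other, so a child query over "medical" and "agreement" only matches once its slop is raised to 2 x 5 = 10.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
final class ProximitySlopSketch {

    /**
     * True if both terms occur in the token list and their first
     * occurrences are at most `slop` positions apart.
     * Assumption: distance = absolute difference of token positions.
     */
    static boolean within(List<String> tokens, String a, String b, int slop) {
        int pa = tokens.indexOf(a);
        int pb = tokens.indexOf(b);
        return pa >= 0 && pb >= 0 && Math.abs(pa - pb) <= slop;
    }

    public static void main(String[] args) {
        // The document from the message: medical x x x merger x x x agreement
        List<String> doc = Arrays.asList(
                "medical", "x", "x", "x", "merger", "x", "x", "x", "agreement");

        int parentSlop = 5;
        // Each child term is within the parent's slop of "merger".
        System.out.println(within(doc, "merger", "medical", parentSlop));
        System.out.println(within(doc, "merger", "agreement", parentSlop));
        // The two child terms are farther apart than the parent's slop...
        System.out.println(within(doc, "medical", "agreement", parentSlop));
        // ...but within twice the parent's slop.
        System.out.println(within(doc, "medical", "agreement", 2 * parentSlop));
    }
}
```

The factor of two follows from the worst case: the two child terms can sit on opposite sides of the anchor term, each at the full parent distance.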

Re: Approaches/semantics for arbitrarily combining boolean and proximity search operators?

2012-05-25 Thread Chris Harris
In case it's of interest, I have a new approach I'm considering. For the basic intuition, a colleague who works with some of the users formulating these complicated queries proposed that (merger and agreement) w/5 (medical and companion) is approximately the same as (merger w/5 agreement) w/5 (

Re: lucene (search) performance tuning

2012-05-25 Thread Yang
I tested with more threads/processes. Indeed, this is completely CPU-bound, since running 1 thread gives the same latency as 4 threads (my box has 4 cores). Given this, is there any way to simplify the scoring computation (I'm only using Lucene as a first-level "rough" search, so the search quali

Re: lucene (search) performance tuning

2012-05-25 Thread Yang
thanks a lot guys On Tue, May 22, 2012 at 1:34 AM, Ian Lea wrote: > Lots of good tips in > http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, linked from > the FAQ. > > > -- > Ian. > > > On Tue, May 22, 2012 at 2:08 AM, Li Li wrote: > > something wrong when writing in my android client.

Re: Bizarre Search order request

2012-05-25 Thread Chris Hostetter
: For example, if I display 20 results, I might want to limit it to a : maximum of 10 "mail", 10 "blog" and 10 "website" documents. Which ones : get displayed and how they were ordered would depend on the normal : relevancy ranking, but, for example, once I had 10 "mail" objects to : displ
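For illustration, the capping rule quoted above can be sketched as a post-processing pass over an already relevance-ranked hit list. This is only a sketch under assumptions: the `Hit` class, `capPerType` and its parameters are hypothetical names, not a Lucene or Solr API, and it presumes the caller has fetched enough ranked hits to fill the page after skipping.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not a Lucene or Solr feature.
final class PerTypeCapSketch {

    /** A search hit reduced to the two fields the rule needs. */
    static final class Hit {
        final String id;
        final String type;

        Hit(String id, String type) {
            this.id = id;
            this.type = type;
        }
    }

    /**
     * Walks hits in relevance order and keeps at most maxPerType hits
     * of each type, stopping once pageSize hits have been collected.
     */
    static List<Hit> capPerType(List<Hit> ranked, int pageSize, int maxPerType) {
        Map<String, Integer> seen = new HashMap<>();
        List<Hit> page = new ArrayList<>();
        for (Hit hit : ranked) {
            if (page.size() >= pageSize) {
                break; // page is full
            }
            int count = seen.getOrDefault(hit.type, 0);
            if (count >= maxPerType) {
                continue; // this type has used up its quota
            }
            seen.put(hit.type, count + 1);
            page.add(hit);
        }
        return page;
    }
}
```

With the numbers from the message this would be called as `capPerType(ranked, 20, 10)`. The relative order of the surviving hits is unchanged, so normal relevance ranking still decides which documents of each type appear.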

Re: Bizarre Search order request

2012-05-25 Thread Chris Lu
Nothing like this yet. But you don't need to do everything in one search request. You can send one search request to learn the match distribution for each document type, and then send three requests, one for each document type. -- Chris Lu - Instant Scalable Full-Text Searc

RE: IndexReader.deleteDocument in Lucene 3.6

2012-05-25 Thread Edward W. Rouse
To ensure deletion I use a while loop with a counter (to prevent an endless loop if there's a problem): Term term = this.createIdTerm(id); int count = 0; while(readDocument(indexName, id) != null) { count++; log.debug("deleting document " + id + " from index " + indexN
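The general pattern described here, retrying a delete until the document can no longer be read but giving up after a fixed number of attempts, might look like the sketch below. It does not reconstruct the truncated code above: the `DocStore` interface, `deleteWithRetry` and the in-memory `FlakyStore` are hypothetical stand-ins, so that the example runs without Lucene.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; DocStore stands in for the real index access.
final class BoundedDeleteSketch {

    /** Minimal view of an index: can a document be read, and delete it. */
    interface DocStore {
        boolean exists(String id);
        void delete(String id);
    }

    /**
     * Deletes the document, re-checking after each attempt.
     * Returns true once the document is gone, or false if it is still
     * readable after maxAttempts delete calls (prevents an endless loop).
     */
    static boolean deleteWithRetry(DocStore store, String id, int maxAttempts) {
        int attempts = 0;
        while (store.exists(id)) {
            if (attempts >= maxAttempts) {
                return false; // give up instead of looping forever
            }
            attempts++;
            store.delete(id);
        }
        return true;
    }

    /** In-memory store whose first `failures` delete calls silently do nothing. */
    static final class FlakyStore implements DocStore {
        private final Set<String> ids = new HashSet<>();
        private int failuresLeft;

        FlakyStore(int failures, String... initialIds) {
            this.failuresLeft = failures;
            this.ids.addAll(Arrays.asList(initialIds));
        }

        @Override
        public boolean exists(String id) {
            return ids.contains(id);
        }

        @Override
        public void delete(String id) {
            if (failuresLeft > 0) {
                failuresLeft--; // simulate a delete that did not take effect
                return;
            }
            ids.remove(id);
        }
    }
}
```

Returning a boolean rather than throwing leaves it to the caller to decide whether a document that could not be removed is an error.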

Bizarre Search order request

2012-05-25 Thread Scott Smith
I really need this on Solr, but thought I would start here as I suspect that, if it's possible, it's some kind of custom relevancy ranking that would need to be done in lucene and then used in SOLR. I will simplify the actual problem somewhat, but I think it will have the gist of what I want to

Re: IndexReader.deleteDocument in Lucene 3.6

2012-05-25 Thread Yonik Seeley
On Fri, May 25, 2012 at 5:23 AM, Nikolay Zamosenchuk wrote: > IndexWriter.deleteDocument(..) is not final, > but doesn't return any result. Deleted terms are buffered for good performance, so at the time of IndexWriter.deleteDocument(Term) we don't know how many documents match the term. > Can a

IndexReader.deleteDocument in Lucene 3.6

2012-05-25 Thread Nikolay Zamosenchuk
Hi everyone. We are using IndexReader.deleteDocument(Term) method to delete documents, since it returns the number of deleted documents. This is used to be sure that some docs were removed. We must know for sure if documents were deleted. But in lucene 3.6 this method is final and can't be overridd

Re: Lucene Grouping problem

2012-05-25 Thread Martijn v Groningen
If the time span or website (I assume you mean domain name) is a field in your index then you can use result grouping. Result grouping has impact on your query time and if you have a lot of data you need to divide your data across multiple indices and use distributed result grouping. Martijn On 2

Re: ToParentBlockJoinQuery$BlockJoinWeight cannot explain match on parent document

2012-05-25 Thread Martijn v Groningen
Hi Christoph, You can open an issue for this. I think we can use the child score as an explanation of why a parent doc is scored the way it is. Martijn On 25 May 2012 13:20, Christoph Kaser wrote: > Hello all, > > I try to calculate score explanations for a query that contains a > ToParentBlock

RE: IndexReader.deleteDocument(Term) in Lucene 3.6/4.0

2012-05-25 Thread Uwe Schindler
To change the behaviour of IndexReaders, use FilterIndexReader; don't subclass IndexReader directly. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Nikolay Zamosenchuk [mailto:nikolaz...@gmail.com] > S

Re: IndexReader.deleteDocument(Term) in Lucene 3.6/4.0

2012-05-25 Thread Simon Willnauer
hey, On Fri, May 25, 2012 at 2:45 PM, Nikolay Zamosenchuk wrote: > Hi everyone. We are using IndexReader.deleteDocument(Term) method to > delete documents, since it returns the number of deleted documents. > This is used to be sure that some docs were removed. We must know for > sure if documents

IndexReader.deleteDocument(Term) in Lucene 3.6/4.0

2012-05-25 Thread Nikolay Zamosenchuk
Hi everyone. We are using IndexReader.deleteDocument(Term) method to delete documents, since it returns the number of deleted documents. This is used to be sure that some docs were removed. We must know for sure if documents were deleted. But in lucene 3.6 this method is final and can't be overridd

Re: Taking backup of Lucene DB

2012-05-25 Thread Michael McCandless
The simplest way is to stop all index writing (close the IndexWriter), do the copy, then start your IndexWriter again. If that's a problem (usually it is!) then use SnapshotDeletionPolicy to protect the commit point (ie prevent any of the files it uses from being deleted) while you are making the

ToParentBlockJoinQuery$BlockJoinWeight cannot explain match on parent document

2012-05-25 Thread Christoph Kaser
Hello all, I try to calculate score explanations for a query that contains a ToParentBlockJoinQuery and get the following exception: java.lang.UnsupportedOperationException: org.apache.lucene.search.join.ToParentBlockJoinQuery$BlockJoinWeight cannot explain match on parent document at org.a

Taking backup of Lucene DB

2012-05-25 Thread Ganesh
Hello all, We want to place our search db on a SAN drive and want to replicate the data to another SAN. We have replication software to do this. My question is whether we could do that for the search db? While indexing (for testing purposes), I just copied the db files and dropped in anothe

Re: ToParentBlockJoinQuery and grand-children

2012-05-25 Thread Christoph Kaser
Hi Mike, unfortunately, you were right about only getting the last child's grandchildren. Furthermore, the groups have wrong groupValues: They are the document id of the previous parent, not the child. I have opened an issue: https://issues.apache.org/jira/browse/LUCENE-4076 I also created