RE: Merging database index with fulltext index

2009-02-28 Thread spring
> Yes. DBSight helps to flatten database objects into Lucene's
> documents.
OK, thx for the advice. But back to my original question. When I have to merge both resultsets, what is the best approach to do this?

Re: Merging database index with fulltext index

2009-02-28 Thread Chris Lu
Yes. DBSight helps to flatten database objects into Lucene's documents. It's more like Lucene-On-Rails. A custom crawler is supported via a Java API to crawl outside the database. DBSight query syntax and Lucene query syntax are both supported, in addition to customizable analyzer, similarity, ranking, etc

RE: Merging database index with fulltext index

2009-02-28 Thread spring
> Actually you can use DBSight (disclaimer: I work on it) to
> collect the data
> and keep them in sync.
Hm... it fulltext-indexes a database? Does it support document content outside the database (custom crawler)? What query syntax does it support?

Re: queryNorm affect on score

2009-02-28 Thread Yonik Seeley
On Sat, Feb 28, 2009 at 3:02 PM, Peter Keegan wrote:
>> in situations where you deal with simple query types, and matching query
>> structures, the queryNorm
>> *can* be used to make scores semi-comparable.
>
> Hmm. My example used matching query structures. The only difference was a
> single term

Re: Confidence scores at search time

2009-02-28 Thread Grant Ingersoll
Personally, I have my doubts about this actually working and I think others do too. It's in there in Lucene, but I don't know if it makes sense. Logically speaking, I just don't see how it makes sense to compare different queries' results, but maybe I'm just short-sighted. I'd certainly w

Re: Merging database index with fulltext index

2009-02-28 Thread Chris Lu
Actually you can use DBSight (disclaimer: I work on it) to collect the data and keep them in sync. The free version has most of the features and has no size limit. -- Chris Lu - Instant Scalable Full-Text Search On Any Database/Application site: http://www.dbsight.net dem

RE: Merging database index with fulltext index

2009-02-28 Thread spring
> Contrariwise, look for anything by Marcelo Ochoa on the user list
> about embedding Lucene in Oracle (which I confess I haven't looked
> into at all, but seems interesting).
I know this lucene-oracle text cartridge. But my solution has to work with any of the big databases (MS, IBM, Oracle).

RE: Merging database index with fulltext index

2009-02-28 Thread spring
> I feel this may not be a good example.
It was a very simple example. The real database query is very complex and joins several tables. It would be an absolute nightmare to copy all these tables into Lucene and keep both in sync.

Re: Merging database index with fulltext index

2009-02-28 Thread Erick Erickson
I'll second Chris's comment and ask whether you've considered denormalizing your data into Lucene and sticking exclusively with Lucene? Contrariwise, look for anything by Marcelo Ochoa on the user list about embedding Lucene in Oracle (which I confess I haven't looked into at all, but seems intere

Re: Merging database index with fulltext index

2009-02-28 Thread Chris Lu
I feel this may not be a good example, since you can easily index fields c, a, and d and let Lucene handle the filter "c = 'foo'" and the order-by clause "order by a desc, d".

Re: queryNorm affect on score

2009-02-28 Thread Peter Keegan
> in situations where you deal with simple query types, and matching query structures, the queryNorm
> *can* be used to make scores semi-comparable.
Hmm. My example used matching query structures. The only difference was a single term in a field with zero weight that didn't exist in the matching

Merging database index with fulltext index

2009-02-28 Thread spring
Hi,
what is the best approach to merge a database index with a Lucene fulltext index? Both databases store a unique ID per doc. This is the join criterion.
requirements:
* both resultsets may be very big (100,000 and much more)
* the merged resultset must be sorted by database index and/or releva
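To make the question concrete, here is a minimal sketch of the join on the shared unique ID. It is not an approach proposed in the thread: it assumes both resultsets fit in memory, that the database rows arrive as a list of IDs already in SQL sort order, and that the fulltext hits have been collected into an ID-to-score map. The names `ResultMerger`, `MergedHit`, `mergeByDatabaseOrder`, and `mergeByRelevance` are hypothetical, and nothing here calls the Lucene API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene or any database driver.
final class ResultMerger {

    /** One merged row: the shared document ID plus its fulltext score. */
    static final class MergedHit {
        final String id;
        final float score;

        MergedHit(String id, float score) {
            this.id = id;
            this.score = score;
        }
    }

    /**
     * Keeps only IDs present in both resultsets, in database order.
     * dbOrderedIds: IDs in the order the SQL query returned them (assumption).
     * fullTextScores: ID -> relevance score gathered from the fulltext search (assumption).
     */
    static List<MergedHit> mergeByDatabaseOrder(List<String> dbOrderedIds,
                                                Map<String, Float> fullTextScores) {
        List<MergedHit> merged = new ArrayList<>();
        for (String id : dbOrderedIds) {
            Float score = fullTextScores.get(id);
            if (score != null) {
                merged.add(new MergedHit(id, score));
            }
        }
        return merged;
    }

    /**
     * Same intersection, sorted by descending score. List.sort is stable,
     * so hits with equal scores stay in database order.
     */
    static List<MergedHit> mergeByRelevance(List<String> dbOrderedIds,
                                            Map<String, Float> fullTextScores) {
        List<MergedHit> merged = mergeByDatabaseOrder(dbOrderedIds, fullTextScores);
        merged.sort((a, b) -> Float.compare(b.score, a.score));
        return merged;
    }
}
```

With resultsets of 100,000 rows and more on each side, the map lookup keeps the join linear in the number of database rows, but whether holding both sides in memory is acceptable depends on the application.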

Re: queryNorm affect on score

2009-02-28 Thread Chris Hostetter
: I guess I don't really understand this comment in the similarity java doc
: then:
:
: http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html#formula_queryNorm
:
: *queryNorm(q)* is a normalizing factor used to make scores between queries
: comparable.
that comment
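For readers following the javadoc link quoted above, a small sketch of the arithmetic may help. It assumes the default formula as I understand it from that javadoc page: queryNorm is 1 / sqrt(sumOfSquaredWeights), where sumOfSquaredWeights is the squared query boost times the sum over query terms of (idf * term boost) squared. The class `QueryNormSketch` and the idf values used with it are made up for illustration; this is not Lucene's own code.

```java
// Hypothetical stand-alone arithmetic, not a Lucene class.
final class QueryNormSketch {

    /** queryBoost^2 * sum over terms of (idf(t) * boost(t))^2. */
    static float sumOfSquaredWeights(float queryBoost, float[] idfs, float[] termBoosts) {
        float sum = 0.0f;
        for (int i = 0; i < idfs.length; i++) {
            float weight = idfs[i] * termBoosts[i];
            sum += weight * weight;
        }
        return queryBoost * queryBoost * sum;
    }

    /** 1 / sqrt(sumOfSquaredWeights), the default normalization as assumed above. */
    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }
}
```

Under this formula the factor depends only on the query's terms and boosts, so two queries that differ by a single term get different queryNorm values even when they match the same documents.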

Re: TopDocCollector

2009-02-28 Thread Yonik Seeley
On Sat, Feb 28, 2009 at 7:51 AM, wrote:
>> Solr has always allowed all scores through w/o screening out <=0
>
> Why?
Partially historical... due to some limitations in Lucene back when Solr was first written (like undesired score normalization), Solr interfaces with Lucene search at the hit coll

RE: TopDocCollector

2009-02-28 Thread spring
> > * How can a hit have a score of <=0?
>
> A function query, or a negative boost would do it.
Ah ok.
> Solr has always allowed all scores through w/o screening out <=0
Why?

RE: TopDocCollector

2009-02-28 Thread spring
> That works fine, because hq.size() is still less than numHits. So
> no matter what, the first numHits hits will be added to the queue.
>
> > public void collect(int doc, float score) {
> > if (score > 0.0f) {
> > if (hq.size() < numHits || score >= minScore) {
Oh damned... it'
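The two quoted conditions can be tried out in isolation. The following is a simplified stand-in for the collector logic under discussion, not Lucene's actual TopDocCollector source: it uses `java.util.PriorityQueue` in place of Lucene's own hit queue, and the names `TopNCollector`, `Hit`, `size()`, `totalHits()`, and `minScore()` are hypothetical.

```java
import java.util.PriorityQueue;

// Simplified stand-in for a top-N hit collector; not Lucene's implementation.
final class TopNCollector {

    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    private final int numHits;
    // Min-heap on score: the weakest retained hit sits at the head.
    private final PriorityQueue<Hit> hq =
            new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));
    private float minScore = 0.0f;
    private int totalHits = 0;

    TopNCollector(int numHits) {
        if (numHits <= 0) {
            throw new IllegalArgumentException("numHits must be positive");
        }
        this.numHits = numHits;
    }

    void collect(int doc, float score) {
        if (score > 0.0f) {                      // hits with score <= 0 are dropped
            totalHits++;
            if (hq.size() < numHits || score >= minScore) {
                hq.add(new Hit(doc, score));
                if (hq.size() > numHits) {
                    hq.poll();                   // evict the lowest-scoring hit
                }
                minScore = hq.peek().score;      // lowest score still in the queue
            }
        }
    }

    int size() { return hq.size(); }

    int totalHits() { return totalHits; }

    float minScore() { return minScore; }
}
```

While the queue holds fewer than numHits entries the first condition is always true, so the first numHits positive-scoring hits are added regardless of minScore, which is the point made in the quoted reply.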

Re: How build Lucene in Action examples

2009-02-28 Thread Erik Hatcher
Please post questions/issues related to Lucene in Action to Manning's Author Online forum at:
Thanks,
Erik
On Feb 27, 2009, at 6:33 PM, tolkienGR wrote:
Hi !!! I'm new in Lucene. I started reading Lucene in action (first

Re: queryNorm affect on score

2009-02-28 Thread Michael Stoppelman
I guess I don't really understand this comment in the similarity java doc then:
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html#formula_queryNorm
*queryNorm(q)* is a normalizing factor used to make scores between queries comparable.
:/.
M
On Fri, Feb 27, 2009

Re: Confidence scores at search time

2009-02-28 Thread Michael Stoppelman
I was just reading the Similarity javadocs (http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html#formula_queryNorm) and I thought this might be relevant to your issue. From the javadoc: *queryNorm(q)* is a normalizing factor used to make scores between queries compar