injecting fields looked up from DB at the runtime - Solr/Lucene question

2006-11-04 Thread Vladimir Olenin
Hi, I wonder if the below is the correct way of doing things... - when the Hits objects are returned from IndexSearcher (as a result of some search), 'inject' 'info' fields into the 'Hit' objects at runtime by looking the values up in the DB. The main purpose is to avoid storing 'info' fields

RE: Re: lucene and web services?

2006-11-04 Thread Graham Stead
Jim, IMHO, Solr is excellent and perhaps your best bet. There's a new-ish Lucene WS project here: http://lucene-ws.net/. I have not tried it, but it looks fine if you want to interface vanilla Lucene to, for example, Opensearch. There's an older Lucene WS project here: http://lucene-ws.sourcefor

RE: Re: lucene and web services?

2006-11-04 Thread Vladimir Olenin
You might want to check out: - Solr (WS & RESTish access to the Lucene engine, both search & index) - DWR (AJAX remote access library. Not really a WS, since the communication protocol is not generic at this point, but it works very well if all you need is access to POJOs from JavaScript; it's more or less

Re: Re: lucene and web services?

2006-11-04 Thread James Rhodes
Yeah, I've considered that but I thought it would be pretty nice if I was able to build something that was transparent to Lucene. I'd like to use the Hits object itself. I may be complicating something, but the Hits implementation, at least in concept, seems pretty efficient and worth keeping. I'm

Re: lucene and web services?

2006-11-04 Thread Chris Lu
Why not render your search results in XML format? You can use a templating engine such as Velocity. -- Chris Lu - Instant Full-Text Search On Any Database/Application site: http://www.dbsight.net demo: http://search.dbsight.com

Re: 2.0 and Tokenized versus UN_TOKENIZED

2006-11-04 Thread James Rhodes
Thanks. That helps, but I've tried a lot of combinations and I forget now. I'm using StandardAnalyzer for the index and query. I can't say for sure if I've tried other cases. The specific combination is lastname:rhodes AND city:"EAGLE RIVER" AND state:AK, Before TOKENIZED no match after TOKENIZED m

Re: Query question

2006-11-04 Thread jeff . richley
Thought I attached the code :) package com.infinity.naxx.sandbox; import java.io.IOException; import java.util.Iterator; import org.apache.lucene.analysis.Analyzer; import org.apache.lucene.analysis.KeywordAnalyzer; import org.apache.lucene.document.Document; import org.apache.lucene.document.Fi

Minor issues with sample Web app

2006-11-04 Thread David Halsted
I ran into a couple of minor issues with the sample Web app included in the Lucene 2.0.0 download in results.jsp. Not sure whether this list is the right place to bring this up but thought I'd try. *) At line 81, there's a line that assumes a static parse(String,String,Analyzer) in QueryParser,

Re: Query question

2006-11-04 Thread jeff . richley
I know I am getting very close on this one but can't seem to get the score above .306. My guess is that I need to do something different in my query. If at all possible, could you take a quick look at my test code and point me in the correct direction? I know everyone is very busy, so any help w

Re: 2.0 and Tokenized versus UN_TOKENIZED

2006-11-04 Thread Erick Erickson
Two questions come to mind... 1> what analyzer are you using for the *query*? Is it possible that when you query for city you're using a tokenizer that breaks up your city code? 2> what about case? I'll assume that you have tried to search one-word cities, so how the stream is tokenized won't br

lucene and web services?

2006-11-04 Thread James Rhodes
Has anyone successfully implemented a web services front end to remotely search a Lucene index? I've tried to do it with the XFire stuff in MyEclipse, but their default Aegis XML mapping stuff doesn't support the Lucene Hits object. I'd like to avoid searching a remote index via RMI, but for now i

JGuruMultisearcher

2006-11-04 Thread Mark Miller
The JGuruMultiSearcher from LIA appears to me to be outdated. Am I wrong, or has anyone updated the code and is willing to share it? I would attempt it myself but I have not played with this stuff before and I would like something without bugs :_ A quick look at when the search(Query query, Filter

2.0 and Tokenized versus UN_TOKENIZED

2006-11-04 Thread James Rhodes
I'm using the 2.0 branch and I've had issues with searching indexes where the fields aren't tokenized. For instance, my index consists of count, lastname, city, state and I used the following code to index it (the data is in a SQL Server DB): if (count != 0) { doc.add(new Field("count", NumberU

RemoteSearchable Term Freq

2006-11-04 Thread Amit Kumar
I am trying to figure out a way to get term frequencies from RemoteSearchables. My index is partitioned into 3 sets on remote servers; the RemoteSearchable API has docFreq* methods but no getTermFreq* methods. So my question is: 1. Is there a class that I am overlooking that would provide

Re: How to improve document retrieval speed.

2006-11-04 Thread eks dev
I would strongly suggest not storing these fields in Lucene; just keep them as files and store some kind of URL to get them later. That will boost your speed heavily. If you really, really need to store documents in Lucene, try some compression. Also, so many fields hurt performance; any chance

Re: How to improve document retrieval speed.

2006-11-04 Thread Sunil Kumar PK
Hi, I am using Lucene's Remote Parallel Multisearcher with 10 nodes in my search cluster having 200+ distributed index fragments (1 Index fragment = 4GB). I have 30+ fields in my index, and I am storing a master XML file (contains 5 to 30 pages of information) in one field. I also have two web s

Re: How to improve document retrieval speed.

2006-11-04 Thread Grant Ingersoll
You probably can skip the QueryParser part and just construct a TermQuery with your term and field. That will save you a few ticks. I'm betting you have just included the code below for example, so this may not apply, however, you want to make sure you aren't creating the IndexSearcher ev

How to improve document retrieval speed.

2006-11-04 Thread Sunil Kumar PK
Hi, in my index there is a unique field, "MY_DOCNO". To get a document from the index with MY_DOCNO=1000, I am using the following code: IndexSearcher isearcher = new IndexSearcher("myindex1"); QueryParser qp = new QueryParser("MY_DOCNO", new StandardAnalyzer()); Query query = qp.parse("MY

Re: How to get Term Weights (document term matrix)?

2006-11-04 Thread Soeren Pekrul
Chris Hostetter wrote: You really, *REALLY* don't want to be doing this using the "Hits" class like in your example ... 1) this will re-execute your search behind the scenes many many times 2) the scores returned by "Hits" are pseudo-normalized ... they will be meaningless for any sort