Re: simple (?) question about scoring

2006-11-02 Thread Chris Hostetter
: I have a question: is the score for a document different if I have : only that document in my index, or if I have N documents? : If the answer is yes, I will put all N documents together, otherwise I : will evaluate them one by one. as i said before, yes it does... >> For most of the various t

Re: search within search

2006-11-02 Thread Doron Cohen
This code adds the same query twice to a boolean query: Query query = parser.parse(searchString); bq1.add(query, BooleanClause.Occur.MUST); bq1.add(new BooleanClause(query, BooleanClause.O

Re: Query question

2006-11-02 Thread Chris Hostetter
: 1.) I have data like name="Jeff" lastname="Richley" age="33" and I need to : be able to query by any combination such as name="Jeff" age="33". But if : I query with name="Jeffrey" there is no match. : : 2.) The name value pairs are not really controlled until the end user is : inserting informa

Re: Any experience with spring's lucene support?

2006-11-02 Thread lude
Is nobody here using spring-modules? On 11/1/06, lude <[EMAIL PROTECTED]> wrote: Hello, while starting a new project we are thinking about using the spring-modules for working with Lucene. See: https://springmodules.dev.java.net/ Does anybody have experience with this higher level lucene

Re: search within search

2006-11-02 Thread spinergywmy
Hi, I have looked at the examples in the Lucene source and tried them out myself, but it doesn't work. Perhaps you can point out where I went wrong. Below is the code that I developed: public String search(String searchString) throws IOException, Exception { //System.out.println

Re: simple (?) question about scoring

2006-11-02 Thread Michele Amoretti
I have a question: is the score for a document different if I have only that document in my index, or if I have N documents? If the answer is yes, I will put all N documents together, otherwise I will evaluate them one by one. Btw, I will ask the ws developer about how queries are interpreted by

Re: Modelling relational data in Lucene Index?

2006-11-02 Thread Chris Lu
Hi, Rajesh, You can use space as , by using WhitespaceAnalyzer. By detached mode, I mean the search function and your Java system should be kind of logically separated. From the technical side, a separate search server will be more scalable. From the business side, searching is more like an add-o

Re: Possible documentation error?

2006-11-02 Thread Doron Cohen
"Johan Stuyts" <[EMAIL PROTECTED]> wrote on 26/10/2006 07:40:21: > Hi, > > On the page about the file formats I think there might be a > documentation error below 'frequencies'. The example is '15, 22, 3', but > if I read the paragraph starting with 'DocDelta determines both the > document number a

Re: Modelling relational data in Lucene Index?

2006-11-02 Thread Rajesh parab
Thanks for the feedback, Chris. I agree with you. The data set should be flattened out to be stored inside a Lucene index. The Folder-File was just an example. As you know, in a relational database, we can have more complex relationships. I understand that this model may not work for deeper relationships.

Re: Query question

2006-11-02 Thread jeff . richley
Ah good question. The data that I am needing to query on is not a set definition of tables or columns like a database is. Let me give two examples: 1.) I have data like name="Jeff" lastname="Richley" age="33" and I need to be able to query by any combination such as name="Jeff" age="33". But if

Re: Modelling relational data in Lucene Index?

2006-11-02 Thread Chris Lu
For this specific question, you can create an index on files, search for files of type image, and from the matched files, find the unique directories (this can be done in Lucene or you can do it in Java). Of course this does not scale to deeper relationships. Usually you do need to flatten the database obj

Re: Modelling relational data in Lucene Index?

2006-11-02 Thread Rajesh parab
Thanks Mark. Can you please tell me more about the Lucene add-on you are talking about? Are you talking about Compass? Regards, Rajesh - Original Message From: Mark Miller <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Thursday, November 2, 2006 7:29:10 PM Subject: Re: Model

Re: Modelling relational data in Lucene Index?

2006-11-02 Thread Mark Miller
Lucene is probably not the solution if you are looking for a relational model. You should be using a database for that. If you want to combine Lucene with a relational model, check out Hibernate and the new EJB annotations that it supports...there is a cool little Lucene add-on that lets you de

Modelling relational data in Lucene Index?

2006-11-02 Thread Rajesh parab
Hi, As I understand, Lucene has a flat structure where you can define multiple fields inside the document. There is no relationship between any field. I would like to enable index based search for some of the components inside a relational database. For example, let's say "Folder" Object. The Folde

Re: Query question

2006-11-02 Thread Erick Erickson
An example (simplified, to be sure) would help a lot. What does a 100% match mean? Why do you care? What problem are you trying to solve? Why wouldn't a database serve you better? Best Erick On 11/2/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: I am wanting to be able to put sets of data i

Query question

2006-11-02 Thread jeff . richley
I want to be able to put sets of data in a very structured way and query Lucene for only 100% matches. Is there a way to do this? I seem to be getting back at best 0.30685282. I appreciate any help and insight. Jeff Richley, Vice President Southeast Virginia Java Users Group [EMAIL PROTEC
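A note on the 0.30685282 figure: it happens to equal 1 + ln(1/2), which is what the classic Lucene idf formula, 1 + ln(numDocs / (docFreq + 1)), yields for a term that occurs in the only document of a one-document index. The sketch below is plain Java, not the Lucene API; it assumes the default similarity was in use, and whether that is really where Jeff's number came from is a guess. The class and method names are made up for illustration.

```java
/** Sketch of the classic Lucene idf formula: 1 + ln(numDocs / (docFreq + 1)). */
final class ClassicIdf {

    /**
     * @param docFreq number of documents containing the term
     * @param numDocs total number of documents in the index
     */
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    public static void main(String[] args) {
        // One document in the index, and it contains the term: roughly 0.3069.
        System.out.println(idf(1, 1));
        // Same term frequency in a 1000-document index: roughly 7.21.
        System.out.println(idf(1, 1000));
    }
}
```

Because idf depends on the number of documents in the index, the same document and query can score differently depending on what else is indexed, so a raw score is not a percentage and "100% match" has no direct meaning in these terms.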

Re: simple (?) question about scoring

2006-11-02 Thread Chris Hostetter
: the list is not ordered (I do not know the details of the search : engine, I only have its result for a query) : : then I have this list of documents, which represents a subset of the corpus : : I have to rank the documents of the list, using your scoring algorithm In other words, out of a large

Re: reindex all files

2006-11-02 Thread spinergywmy
Hi, Thanks Erik. I will take a look at it. regards, Wooi Meng

Re: simple (?) question about scoring

2006-11-02 Thread Michele Amoretti
The whole problem I have to face is the following: I have a web service which searches a corpus of documents and returns a list of documents which match the query. The list is not ordered (I do not know the details of the search engine, I only have its result for a query). Then I have this list of

Re: simple (?) question about scoring

2006-11-02 Thread Doron Cohen
michele.amoretti wrote: > Ok I am trying the MemoryIndex, but when compiling I have the > following error message: > > package org.apache.lucene.index.memory does not exist > > Is it not included in the lucene .jar? > > I currently have the latest lucene binaries. Yes this is not part of core Lucen

Re: simple (?) question about scoring

2006-11-02 Thread Michele Amoretti
Ok I am trying the MemoryIndex, but when compiling I have the following error message: package org.apache.lucene.index.memory does not exist Is it not included in the lucene .jar? I currently have the latest lucene binaries. Moreover: parse(java.lang.String) in org.apache.lucene.queryParser.Qu

Re: simple (?) question about scoring

2006-11-02 Thread Chris Hostetter
: > .. Btw, I do not have an index, I have 1 Document, and 1 Query. : Lucene scoring - http://lucene.apache.org/java/docs/scoring.html - uses : pre-computed statistics, location info, and the number of documents in the : index (1 in your case). So some preparation is required before a : (stand-al

Re: simple (?) question about scoring

2006-11-02 Thread Doron Cohen
[EMAIL PROTECTED] wrote on 02/11/2006 06:36:48: > .. the following operation: > given a Query and a Document, return the score > .. I would like a method which returns the score directly. > .. Btw, I do not have an index, I have 1 Document, and 1 Query. Lucene scoring - http://lucene.apache.org/ja

Re: simple (?) question about scoring

2006-11-02 Thread Michele Amoretti
I looked at the test section in the source code before posting, but it all seemed too complicated. Currently I am trying to use an implementation of the Scorer abstract class. Michele On 11/2/06, Erick Erickson <[EMAIL PROTECTED]> wrote: BTW, I highly recommend "Lucene in Action" for examples on g

Re: Warming up a Searcher

2006-11-02 Thread Yonik Seeley
SolrCore.getSearcher() and registerSearcher() work together to do warming. If you want to try and rip that out of Solr, remove any calls having to do with autowarming (warming new caches from most-recently-used items in old caches), and replace the static warming queries (defined in an external

Re: Warming up a Searcher

2006-11-02 Thread Simon Willnauer
I can't point out the actual position in Solr, but I bet if you spend 10 min looking into the Solr source you will find a nice example of how to warm up a searcher. regards simon On 11/2/06, Aigner, Thomas <[EMAIL PROTECTED]> wrote: I have seen numerous posts on warming up a searcher, but was wond

Re: simple (?) question about scoring

2006-11-02 Thread Erick Erickson
BTW, I highly recommend "Lucene in Action" for examples on getting started. Another good place to see examples is in the unit tests that come along with the Lucene source code. See http://www.eng.lsu.edu/mirrors/apache/lucene/java/ Best Erick On 11/2/06, Michele Amoretti <[EMAIL PROTECTED]> wrot

Re: simple (?) question about scoring

2006-11-02 Thread Grant Ingersoll
Also, from the javadocs, check out the explain method on Searcher: http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Searcher.html#explain(org.apache.lucene.search.Query,%20int) As for the state of the documentation, if you have concerns about the javadocs, please write up an

Re: Indexing floating point number

2006-11-02 Thread Yonik Seeley
On 11/2/06, Nadav Har'El <[EMAIL PROTECTED]> wrote: On Wed, Nov 01, 2006, Yonik Seeley wrote about "Re: Indexing floating point number": > > longer strings than Solr's NumberTools. Moving to base 100 or even 256 > > (as I suggest in the comments) can eliminate this difference. > > Or higher,

Re: reindex all files

2006-11-02 Thread Erick Erickson
I really, really, really recommend a copy of "Lucene in Action". Another rich source of examples is the unit test code that comes along with the source for Lucene. You can download a copy of all the source and unit tests from http://www.eng.lsu.edu/mirrors/apache/lucene/java/ Best Erick On 11/

Re: search within search

2006-11-02 Thread Erick Erickson
I *strongly* recommend that you get a copy of "Lucene in Action". It has many examples and I found it extremely helpful when I started out. Best Erick On 11/2/06, spinergywmy <[EMAIL PROTECTED]> wrote: Hi, Thanks Erik. Is there any example that I can refer to because I am actually qui

Re: simple (?) question about scoring

2006-11-02 Thread Erick Erickson
Well, the simplest way is to look at the Hits object returned from a search. Something like Hits hits = searcher.search(new TermQuery(new Term("field", "value"))); for (int idx = 0; idx < hits.length(); ++idx) { float score = hits.score(idx); } Look at the warnings in the javadoc for why using t

Warming up a Searcher

2006-11-02 Thread Aigner, Thomas
I have seen numerous posts on warming up a searcher, but was wondering if someone could post their code that would spin off another thread to warm up a searcher, then switch to the new one when it is warmed up?
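One shape the pattern asked about here might take is sketched below: a holder keeps the live searcher in an AtomicReference, a background thread opens and warms a replacement, and the reference is swapped only once warming has finished. This is plain Java with no Lucene or Solr calls; WarmingHolder, replaceAsync, and the opener/warmer callbacks are hypothetical names, and the searcher type is left generic.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Holds the live searcher and swaps in a replacement only after it is warmed. */
final class WarmingHolder<S> {
    private final AtomicReference<S> current;
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    WarmingHolder(S initial) {
        current = new AtomicReference<>(initial);
    }

    /** Searches always go through whatever instance is live right now. */
    S get() {
        return current.get();
    }

    /** Opens and warms a new instance off-thread, then makes it the live one. */
    Future<?> replaceAsync(Supplier<S> opener, Consumer<S> warmer) {
        return executor.submit(() -> {
            S fresh = opener.get();   // e.g. open a searcher on the updated index
            warmer.accept(fresh);     // e.g. run a few representative queries
            current.set(fresh);       // callers see it only after warming
        });
    }

    void shutdown() {
        executor.shutdown();
    }

    /** Tiny demonstration using strings in place of real searchers. */
    static String demo() {
        WarmingHolder<String> holder = new WarmingHolder<>("old");
        String before = holder.get();
        try {
            holder.replaceAsync(() -> "new", s -> { /* warming queries go here */ }).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            holder.shutdown();
        }
        return before + "->" + holder.get();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The sketch deliberately leaves out closing the old searcher; real code would need to do that only after in-flight searches on it have completed.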

Re: reindex all files

2006-11-02 Thread spinergywmy
Hi, Thanks Erik. I hope you can provide me with a complete example, because I am actually quite new to Lucene search. Sorry for the inconvenience. Thanks. regards, Wooi Meng

Re: search within search

2006-11-02 Thread spinergywmy
Hi, Thanks Erik. Is there any example that I can refer to? I am actually quite new to Apache Lucene, and although it is sometimes quite straightforward, I hope you understand that it would be better for me to have a guide. Thanks regards, Wooi Meng

simple (?) question about scoring

2006-11-02 Thread Michele Amoretti
Hello, I am completely new to Lucene. I browsed the web site and the source code, searching for an example which illustrates the following operation: given a Query and a Document, return the score. To me, this is a very basic operation, but I cannot find a class which easily provides this function

Re: reindex all files

2006-11-02 Thread Erick Erickson
There isn't enough of an explanation here to give you any meaningful answer about performance. Do you have any evidence that the optimization will take a long time? Have you run it before? Do you have any count of how many files you're indexing? Do they have a complex structure? What are the perfo

Re: experiences with lingpipe

2006-11-02 Thread Martin Braun
Hi Breck, I have tried your tutorial and built (hopefully) a successful SpellCheck.model file of 49M. My Lucene index directory is 2,4G. When I try to read the model with the readmodel function, I get an "Exception in thread "main" java.lang.OutOfMemoryError: Java heap space", though I started j

Re: search within search

2006-11-02 Thread Erik Hatcher
Take the original search, and stick it inside a BooleanQuery as a MUST clause along with the new search criteria as another MUST clause. That is effectively ANDing the two queries together. Erik On Nov 2, 2006, at 3:39 AM, spinergywmy wrote: Hi, I want to perform a search wit
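The effect Erik describes can be pictured with document-id sets: if each query is thought of as the set of documents it matches, two MUST clauses keep only the documents present in both sets. The sketch below is plain Java and models just that intersection; it does not use the Lucene API, and the class name and document ids are invented for illustration.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Models "search within search" as the intersection of two result sets. */
final class SearchWithinSearch {

    /** Documents matched by both queries, i.e. what two MUST clauses would keep. */
    static Set<Integer> and(Set<Integer> first, Set<Integer> second) {
        Set<Integer> both = new TreeSet<>(first);
        both.retainAll(second);
        return both;
    }

    public static void main(String[] args) {
        Set<Integer> originalSearch = new TreeSet<>(Arrays.asList(1, 2, 3, 5, 8));
        Set<Integer> refinement = new TreeSet<>(Arrays.asList(2, 3, 4, 8, 9));
        System.out.println(and(originalSearch, refinement)); // prints [2, 3, 8]
    }
}
```

In practice the combined query is simply run again against the index, so the first result set never has to be kept around.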

How to get Term Weights (document term matrix)?

2006-11-02 Thread Soeren Pekrul
Hello, I would like to extract and store the document term matrix externally. I iterate the terms and the documents for each term: TermEnum terms=IndexReader.terms(); while(terms.next()) { TermDocs docs=IndexReader.termDocs(terms.term()); while(docs.next()) { //s
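The structure being extracted here can be sketched without an index: a map from each term to the documents that contain it, with a raw term frequency per document. The example below is plain Java over an array of strings, mirroring the nested terms-then-documents loop in spirit only; the whitespace tokenization, the class name, and the use of raw counts as the "weight" are all assumptions, and a weight such as tf-idf would have to be derived from these counts separately.

```java
import java.util.Map;
import java.util.TreeMap;

/** Builds a sparse term-document matrix: term -> (docId -> term frequency). */
final class TermDocMatrix {

    static Map<String, Map<Integer, Integer>> build(String[] docs) {
        Map<String, Map<Integer, Integer>> matrix = new TreeMap<>();
        for (int docId = 0; docId < docs.length; docId++) {
            // Naive tokenization: lower-case and split on whitespace.
            for (String token : docs[docId].toLowerCase().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                matrix.computeIfAbsent(token, t -> new TreeMap<>())
                      .merge(docId, 1, Integer::sum);
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        String[] docs = { "lucene scoring lucene", "scoring a query" };
        // prints {a={1=1}, lucene={0=2}, query={1=1}, scoring={0=1, 1=1}}
        System.out.println(build(docs));
    }
}
```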

Re: Indexing floating point number

2006-11-02 Thread Nadav Har'El
On Wed, Nov 01, 2006, Yonik Seeley wrote about "Re: Indexing floating point number": > > longer strings than Solr's NumberTools. Moving to base 100 or even 256 > > (as I suggest in the comments) can eliminate this difference. > > Or higher, depending on what you are optimizing for. > If you a

search within search

2006-11-02 Thread spinergywmy
Hi, I want to implement a search-within-search feature in my application, but I am stuck at this point. I am able to retrieve the search index from my first search, but I am having a problem searching within the results that I retrieved. I have gone through some of the mailing list arch