need scoring help

2009-03-20 Thread m.harig
Hello all, I have a search application running on lucene-2.3.0. Say, for example, I am indexing 10 URLs as input: when I search, I do not get the expected results at the top of the ranking, i.e., unrelated hits come up ahead of related hits. I've been working this for a w

Re: Getting started with Lucene

2009-03-20 Thread nitin gopi
Hi, you should add both the Lucene demo jar file and the Lucene core jar file to the classpath. Then run the command to build the index; the final step is to run the command to search files. nitin On Fri, Mar 20, 2009 at 6:05 PM, Uwe Schindler wrote: > As far as I know, the demos are not compiled by default in the rel

Random sorting results

2009-03-20 Thread rawmans...@gmail.com
Hi, in the search application I'm working on, I would like to prevent the user from always getting the same search results for a certain query, but without affecting result quality too much. To do so, I'm processing the hits in smaller chunks and doing some random shuffle inside the
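
The chunked shuffle described in this post can be sketched as follows. This is a minimal illustration in plain Java, not the poster's code: the class name `ChunkShuffle`, the method `shuffleInChunks`, and the `chunkSize` parameter are all hypothetical, and the hits are assumed to be available as a list already sorted by rank.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Lucene.
class ChunkShuffle {
    /**
     * Returns a copy of rankedHits in which each consecutive block of
     * chunkSize items is shuffled independently. An item can move only
     * within its own block, so it shifts by at most chunkSize - 1 positions.
     */
    static <T> List<T> shuffleInChunks(List<T> rankedHits, int chunkSize, Random rnd) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        List<T> result = new ArrayList<T>(rankedHits);
        for (int start = 0; start < result.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, result.size());
            // subList is a view, so shuffling it reorders result in place.
            Collections.shuffle(result.subList(start, end), rnd);
        }
        return result;
    }
}
```

A smaller `chunkSize` keeps the order closer to the original ranking; a larger one gives more variety at the cost of ranking quality.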

Re: froze

2009-03-20 Thread Michael McCandless
Did you hit enter after that [input]... line? It's asking you to confirm where you want the index written. BTW you should send future questions about Lucene in Action 2 to the book's forum, here: http://www.manning-sandbox.com/forum.jspa?forumID=451 Mike nga pham wrote: Hi All,

froze

2009-03-20 Thread nga pham
Hi All, I recently purchased Lucene in Action, 2nd edition. When I ran the suggested command "ant indexer" (located in Chapter 1: Running Indexer, page 21) it was able to run for a while until it froze. The last thing that was printed on my screen was "[input] Directory for new Lucene index: [in

job announcement - Machine Learning Specialist

2009-03-20 Thread Michael Osofsky
Machine Learning Specialist Location: Mountain View, CA NetBase, a well-funded, fast growing company with an impressive roster of top-tier Fortune 500 companies, seeks to grow its team with a machine learning specialist. NetBase delivers Content Intelligence solutions that harness value and insi

job announcement - Software Architect

2009-03-20 Thread Michael Osofsky
Software Architect Location: Mountain View, CA NetBase, a well-funded, fast growing company with an impressive roster of top-tier Fortune 500 companies, seeks to grow its team with a hands-on software architect. NetBase delivers Content Intelligence solutions that harness value and insight from

job announcement - Senior Technical Operations Leader

2009-03-20 Thread Michael Osofsky
Senior Technical Operations Leader Location: Mountain View, CA NetBase, a well-funded, fast growing company with an impressive roster of top-tier Fortune 500 companies, seeks to grow its team with a hands-on senior technical operations leader. NetBase delivers Content Intelligence solutions that

RE: Performance tips on searching

2009-03-20 Thread Uwe Schindler
Sorry, I had not read your email, and a lot of people did not understand the first one (I have read it now). Everything is rather simple: it makes no difference whether you use MultiSearcher or IndexSearcher; you can really do everything with both. Just use them as if they were equal (in your declaration just

Re: Performance tips on searching

2009-03-20 Thread Amin Mohammed-Coleman
Hi, I wrote last week about the best way to paginate. I will reply back with that email if that's OK. This isn't my thread and I don't want to deviate from the original topic. Cheers Amin On 20 Mar 2009, at 17:50, "Uwe Schindler" wrote: No, the MultiSearcher also exposes all methods, Ind

RE: Performance tips on searching

2009-03-20 Thread Uwe Schindler
No, the MultiSearcher also exposes all the methods that IndexSearcher/Searcher exposes (it inherits them from the superclass Searcher). And a call to the collector is never sortable, because the sorting is done *inside* the hit collector. Where is your problem with pagination? Normally you choose n to
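
The reply above is cut off, so the following is only one common way to paginate on top of a top-n search, not necessarily what the poster goes on to describe. The sketch assumes the search is asked for the first (page + 1) * pageSize hits and the caller then keeps only the last page of that list. The class `PageSlice` and its methods are hypothetical names in plain Java, with no Lucene dependency.

```java
import java.util.List;

// Hypothetical helper, not part of Lucene.
class PageSlice {
    /** Number of top hits to request so that the given zero-based page is covered. */
    static int hitsNeeded(int page, int pageSize) {
        if (page < 0 || pageSize < 1) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize >= 1");
        }
        return (page + 1) * pageSize;
    }

    /** Returns the hits of one zero-based page from a list already sorted by rank. */
    static <T> List<T> page(List<T> topHits, int page, int pageSize) {
        if (page < 0 || pageSize < 1) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize >= 1");
        }
        // Clamp both bounds so a page past the end yields an empty list.
        int from = Math.min(page * pageSize, topHits.size());
        int to = Math.min(from + pageSize, topHits.size());
        return topHits.subList(from, to);
    }
}
```

With this approach the value from `hitsNeeded` would be passed as the `n` argument of the search call, and sorting stays inside the searcher rather than in the paging code.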

Re: Performance tips on searching

2009-03-20 Thread Amin Mohammed-Coleman
Hi, how do you expose pagination without a customized hit collector? The MultiSearcher does not expose a method for hit collector and sort. Maybe this is not an issue for people ... Cheers Amin On 20 Mar 2009, at 17:25, "Uwe Schindler" wrote: Why not use a MultiSearcher an all single

RE: Performance tips on searching

2009-03-20 Thread Uwe Schindler
Why not use a MultiSearcher on all single searchers? Or a Searcher on a MultiReader consisting of all IndexReaders? With that you do not need to merge the results. By the way: instead of creating a TopDocCollector, you could also directly call Searcher.search(Query query, Filter filter, int n, S

Performance tips on searching

2009-03-20 Thread Paul Taylor
Hi, my code receives a search query from the web. There are 5 different indexes that can be searched, and each index is searched with a single IndexSearcher referenced in a map. It parses the query, then performs the search and returns the best 10 results, with scores readjusted over the results s

Re: robust inverse of query parser?

2009-03-20 Thread Marvin Humphrey
On Fri, Mar 20, 2009 at 05:03:49PM +0100, Paul Libbrecht wrote: > query.toString() does a fair job at being reparsed by QueryParser but > is there a safe way to do so? Probably not. That's certainly not tested or guaranteed. Pathological input would break it. > I have a lucene query object an

Re: Similarity and Lucene

2009-03-20 Thread Amin Mohammed-Coleman
Although I could be wrong, I'm wondering if lengthNorm is the correct one I should be overriding. I'm interested in the number of times a term occurs in a document (the more occurrences, the higher the score), which I believe is coord. I may well be barking up the wrong tree. Cheers

Similarity and Lucene

2009-03-20 Thread Amin Mohammed-Coleman
Hi, if I choose to subclass the default similarity, do I need to apply the same subclassed Similarity to IndexReader, IndexWriter and IndexSearcher? I am interested in doing the below: Similarity sim = new DefaultSimilarity() { public float lengthNorm(String field, int numTerms) { if(field

robust inverse of query parser?

2009-03-20 Thread Paul Libbrecht
Hello luceners, query.toString() does a fair job at being reparsed by QueryParser but is there a safe way to do so? I have a lucene query object and want a string that QueryParser will reparse fairly exactly. thanks in advance paul

Re: LUCENE-1453 not fixed?

2009-03-20 Thread Michael McCandless
It's an easy mistake :) Now that you're on 2.4.1 you can try switching back to using a String path... that should work fine (since 2.4.1 fixes LUCENE-1453). Mike Chris Salem wrote: oops, the lucene 2.4.0 jar was in the jre/lib/ext directory (I don't remember putting it there). when i u

Re: LUCENE-1453 not fixed?

2009-03-20 Thread Chris Salem
oops, the lucene 2.4.0 jar was in the jre/lib/ext directory (I don't remember putting it there). when i updated to lucene 2.4.1 i put the jar in the tomcat/lib directory (which also had the lucene 2.4.0 jar). i deleted the old lucene 2.4.0 jar. changing the code to use FSDirectory instead of

Re: sloppyFreq question

2009-03-20 Thread Peter Keegan
Sorry, here's the example I meant to show. Doc 1 and doc 2 both contain the terms "hey look, the quick brown fox jumped very high", but in Doc 1 all the terms are indexed at the same position. In doc 2, the terms are indexed in adjacent positions (normal way). For the query "the quick brown fox", d

Re: [ANN] Luke 0.9.2 release

2009-03-20 Thread Marcelo Ochoa
Hi Andrzej: > > If you tried to access this url during the last couple of hours the site was down. > It should be up again - apparently I went over the allocated bandwidth and > the hosting company disabled the site without any warning or even > notification. It's time to look for a better home for Luke

RE: Getting started with Lucene

2009-03-20 Thread Uwe Schindler
As far as I know, the demos are not compiled by default in the release (because they show how you use Lucene and are therefore included as .java source files in the binary distribution). You have to build the demos using ANT. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de e

Getting started with Lucene

2009-03-20 Thread nga pham
Hi, I have a project that involves Lucene. Currently I have 1) downloaded Lucene-2.4.1 onto my CentOS 4.7 box, 2) successfully downloaded Java, version 6, and 3) successfully completed setting CLASSPATH. When I ran the command: java org.apache.lucene.demo.SearchFiles I get an error saying: Exception in t

Re: [ANN] Luke 0.9.2 release

2009-03-20 Thread Andrzej Bialecki
Andrzej Bialecki wrote: (sorry for cross-posting) Hi all, I'm happy to announce a new release of Luke, the Lucene Index Toolbox. As usual, you can obtain it from here: http://www.getopt.org/luke If you tried to access this url during the last couple of hours the site was down. It should be