Re: Re: Sort by count?

2009-03-08 Thread hyj
> : Lucene query results are sorted by tf*idf. What can I do to sort the
> : results only by the count of matching words?
>
> Customize your Similarity implementation to eliminate all but the tf()
> (using constant values for the other functions).
>
> -Hoss

Re: Questions about analyzer

2009-03-08 Thread Ganesh
Mike, in one of his replies to the thread "Faceted search using Lucene", gave the following code review comment:
* You are creating a new Analyzer & QueryParser every time, also creating unnecessary garbage; instead, they should be created once & reused.
This made me ask the below questi

Re: Sort by count?

2009-03-08 Thread Chris Hostetter
: Lucene query results are sorted by tf*idf. What can I do to sort the
: results only by the count of matching words?
Customize your Similarity implementation to eliminate all but the tf() (using constant values for the other functions).
-Hoss
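The effect of that advice can be sketched without Lucene on the classpath. The class below is a library-free model of the arithmetic only, not Lucene's Similarity API: the name TfOnlyScoreSketch and all of its methods are hypothetical, and tf() is assumed to return the raw occurrence count. With every other factor pinned to 1, a document's score reduces to the sum of its per-term frequencies, so ranking follows the match count.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Library-free model of a tf-only score; NOT Lucene's Similarity API. */
final class TfOnlyScoreSketch {

    /** tf factor: assumed here to be the raw occurrence count of a term. */
    static float tf(int freq) {
        return freq;
    }

    // Every other factor is pinned to a constant so it cannot affect ranking.
    static float idf() { return 1.0f; }
    static float lengthNorm() { return 1.0f; }
    static float queryNorm() { return 1.0f; }
    static float coord() { return 1.0f; }

    /** Score of one document, given the frequency of each query term in it. */
    static float score(int[] termFreqs) {
        float sum = 0.0f;
        for (int freq : termFreqs) {
            sum += tf(freq) * idf() * idf() * lengthNorm();
        }
        return coord() * queryNorm() * sum;
    }

    /** Returns document indexes ordered by descending score (ties keep input order). */
    static Integer[] rank(int[][] termFreqsPerDoc) {
        Integer[] order = new Integer[termFreqsPerDoc.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order,
                Comparator.comparingDouble((Integer d) -> -score(termFreqsPerDoc[d])));
        return order;
    }
}
```

In a real index the same idea would be expressed by overriding the corresponding methods of a Similarity subclass and installing it on the searcher; the exact method signatures depend on the Lucene version in use.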

Sort by count?

2009-03-08 Thread hyj
Hi, Lucene query results are sorted by tf*idf. What can I do to sort the results only by the count of matching words? Thanks, hyj hongyin...@163.com 2009-03-09

Re: Problem building Lucene 2_4 with Ant/Eclipse

2009-03-08 Thread Chris Hostetter
: ok, thanks. Thought the top build.xml was going to build everything
: underneath.
It can, but the default target only builds the core. Try "ant package" or "ant dist".
-Hoss

Re: Lucene: MultiSearcher

2009-03-08 Thread Daniel Noll
Michael McCandless wrote:
> You could look at the docID of each hit, and compare to the .maxDoc() of each underlying reader.
There is also MultiSearcher#subSearcher(int), which also works as you add more, without having to do the maths yourself.
Daniel
-- Daniel Noll
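For reference, "the maths" can be sketched in plain Java. This is a hedged illustration, assuming merged doc IDs are assigned by concatenating the sub-indexes in order, each offset by the sum of the preceding maxDoc() values. The class and method names are hypothetical and nothing here calls Lucene.

```java
/** Maps a merged doc ID back to its sub-index; a model, not Lucene code. */
final class SubSearcherMathSketch {

    /** starts[i] is the first merged doc ID of sub-index i; the last entry is the total. */
    static int[] starts(int[] maxDocs) {
        int[] starts = new int[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
        return starts;
    }

    /** Index of the sub-index that contains the merged doc ID. */
    static int subIndex(int docId, int[] starts) {
        if (docId < 0 || docId >= starts[starts.length - 1]) {
            throw new IllegalArgumentException("docId out of range: " + docId);
        }
        int i = 0;
        while (docId >= starts[i + 1]) {
            i++; // skip sub-indexes that end at or before docId
        }
        return i;
    }

    /** Doc ID relative to its own sub-index. */
    static int localDocId(int docId, int[] starts) {
        return docId - starts[subIndex(docId, starts)];
    }
}
```

Calling subSearcher(int) on the MultiSearcher avoids maintaining this table by hand, which is the point of Daniel's reply.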

Re: IndexSearcher

2009-03-08 Thread liat oren
Yes, I changed it to TOKENIZED and it's working now, thanks! About Luke, what do you mean by saying that the analyzer is in the classpath? It exists in a package on my computer, along with its filter and other classes. How can it be used in Luke?
2009/3/8 Andrzej Bialecki > liat oren wrote: >

Re: IndexSearcher

2009-03-08 Thread Andrzej Bialecki
liat oren wrote:
> Ok, thanks. I will have to edit the code of Luke in order to add another analyzer, right?
No - if your analyzer is already on the classpath, then it's enough to type in the fully qualified class name in the drop down box (it's editable).
-- Best regards, Andrzej Bialecki

Re: IndexSearcher

2009-03-08 Thread Erick Erickson
What Shashi said.
On Sun, Mar 8, 2009 at 10:22 AM, Shashi Kant wrote:
> Liat, i think what Erick suggested was to use the TOKENIZED setting instead
> of UN_TOKENIZED. For example your code should read something like:
>
> Document doc = new Document();
> doc.add(new Field(WordIndex.FIELD_WORL

Re: Problem building Lucene 2_4 with Ant/Eclipse

2009-03-08 Thread Raymond Balmès
ok, thanks. Thought the top build.xml was going to build everything underneath.
On Sun, Mar 8, 2009 at 4:09 PM, Uwe Schindler wrote:
> That's normal. To build contrib or demos, you have to execute ant with the
> corresponding targets. The error message about SVN can be ignored.
>
> -
> Uwe S

RE: Problem building Lucene 2_4 with Ant/Eclipse

2009-03-08 Thread Uwe Schindler
That's normal. To build contrib or demos, you have to execute ant with the corresponding targets. The error message about SVN can be ignored.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: Raymond Balmès

Problem building Lucene 2_4 with Ant/Eclipse

2009-03-08 Thread Raymond Balmès
I get the problem below when trying to build Lucene 2_4. I'm using Eclipse and just run Ant on the top build.xml. It is pretty weird because the core is indeed built, but for some reason the build stops there and I don't get any of the demos built etc... Any idea what this "svnversion" program is?

Re: IndexSearcher

2009-03-08 Thread Shashi Kant
Liat, I think what Erick suggested was to use the TOKENIZED setting instead of UN_TOKENIZED. For example, your code should read something like:
    Document doc = new Document();
    doc.add(new Field(WordIndex.FIELD_WORLDS, "111 222 333", Field.Store.YES, Field.Index.TOKENIZED));
Unless I am missing s
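Why the setting matters can be shown with a small library-free model. This is only a sketch under stated assumptions: splitting on whitespace stands in for whatever analyzer is actually configured, and the class and method names are hypothetical rather than part of Lucene. An untokenized field is indexed as one term holding the whole value, so a search for a single word inside it finds nothing, while a tokenized field is indexed as separate terms.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Models which terms end up in the index for a field value; not Lucene code. */
final class TokenizedVsUntokenizedSketch {

    /** Untokenized: the entire value is indexed as a single term. */
    static List<String> untokenizedTerms(String value) {
        return Collections.singletonList(value);
    }

    /** Tokenized: whitespace splitting stands in for a real analyzer. */
    static List<String> tokenizedTerms(String value) {
        return Arrays.asList(value.trim().split("\\s+"));
    }

    /** A term query matches only if the exact term was indexed. */
    static boolean matchesTerm(List<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }
}
```

With the value "111 222 333" from the message above, a query for 222 matches the tokenized terms but not the untokenized one, which only matches the full string.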

Re: IndexSearcher

2009-03-08 Thread Erick Erickson
In Luke, you can select from among most of the analyzers provided by Lucene; there's a drop-down on the search tab. And I'm sure you can add your own custom ones, but I confess I've never tried.
Best
Erick
On Sun, Mar 8, 2009 at 4:14 AM, liat oren wrote:
> Ok, thanks.
>
> I will have to edit th

Re: Search while indexing

2009-03-08 Thread Erick Erickson
Lucene is an *indexing* engine, it isn't, and shouldn't be, in the business of domain-specific knowledge, in this case web crawling. You might look at Nutch, which is a web-crawling and indexing application that uses Lucene as its underlying search engine, and there are other web crawlers out ther

Re: Scores between words. Boosting?

2009-03-08 Thread liat oren
Hi Grant, No, you can only have two words - the score is between two words. "cat dog" and "dog cat" are equivalent; it will actually always be "cat dog" - going by alphabetic order. About the boosting, I read a bit about it - but couldn't find how it can help me, unless I change every appearance
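The alphabetical-order convention described above can be sketched as a small helper. This is an illustration only: the class name and the single-space separator are assumptions, not something taken from the poster's code.

```java
/** Builds one canonical key per unordered word pair; a hypothetical helper. */
final class WordPairKeySketch {

    /** Returns the two words in alphabetical order, joined by a single space. */
    static String key(String a, String b) {
        return a.compareTo(b) <= 0 ? a + " " + b : b + " " + a;
    }
}
```

Applying the same helper at index time and at query time means "dog cat" and "cat dog" both resolve to the key "cat dog".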

Re: ZipFile directory implementation

2009-03-08 Thread Grant Ingersoll
Hi, Sounds interesting. Can you tell us a bit more about the use case for it? Is it basically that you are in a situation where you can't unzip the index? Also, have you looked at how it performs?
-Grant
On Mar 6, 2009, at 5:02 PM, tsuraan wrote:
> I wrote a really basic read-only Directory

Re: Scores between words. Boosting?

2009-03-08 Thread Grant Ingersoll
Hi Liat, Some questions inline below.
On Mar 8, 2009, at 5:49 AM, liat oren wrote:
> Hi, I have scores between words, for example - dog and animal have a score of 0.5 (and not 0), dog and cat have a score of 0.2, etc. These scores are stored in an index:
> Doc1:
>   field words: dog animal

Scores between words. Boosting?

2009-03-08 Thread liat oren
Hi, I have scores between words, for example - dog and animal have a score of 0.5 (and not 0), dog and cat have a score of 0.2, etc. These scores are stored in an index:
Doc1:
  field words: dog animal
  field score: 0.5
Doc2:
  field words: dog cat
  field score: 0.2
If the user searc

Re: IndexSearcher

2009-03-08 Thread liat oren
Ok, thanks. I will have to edit the code of Luke in order to add another analyzer, right? If I need to query with a long text - for example, to search for articles that are textually close in their content - I need to parse the text of the article into the query. Then I get an error saying that it is too long.