Paging with Lucene

2011-01-19 Thread Clemens Wyss
(Thanks for the many answers to my initial Lucene question "Best practices for multiple languages?") We are confronted with the following problem: due to the very dynamic access rules on our content, we will not be able to formulate these in/as Filter(s). Hence we need to first search and

Phrase query on multiple fields

2011-01-19 Thread amg qas
Hi, I have two questions regarding phrase queries: 1) How can I execute a phrase query over multiple fields? I can only get PhraseQuery to work over a single field, e.g. something like this: PhraseQuery query = new PhraseQuery(); query.setSlop(1

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread N. Hira
Where do you get your Lucene/Solr downloads from? [X] ASF Mirrors (linked in our release announcements or via the Lucene website) [] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) [] I/we build them from source via an SVN/Git checkout. [] Other (someone in your company mirrors

Re: search on a field that is NOT_ANALYZED

2011-01-19 Thread Yuhan Zhang
Oh, TermQuery works! It was my own mistake. Thanks for the help! Yuhan On Wed, Jan 19, 2011 at 5:17 PM, Yuhan Zhang wrote: > Hi Paul and Earl, > > thanks. > > I tried TermQuery. it gave me back zero document my termDocs is also > empty... having a feeling that > no term was gathered if a field

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread Yuhan Zhang
On Tue, Jan 18, 2011 at 1:04 PM, Grant Ingersoll wrote: > As devs of Lucene/Solr, due to the way ASF mirrors, etc. works, we really > don't have a good sense of how people get Lucene and Solr for use in their > application. Because of this, there has been some talk of dropping Maven > support for

Re: search on a field that is NOT_ANALYZED

2011-01-19 Thread Yuhan Zhang
Hi Paul and Earl, thanks. I tried TermQuery. It gave me back zero documents, and my termDocs is also empty... I have a feeling that no term was gathered if a field was not analyzed. Yuhan On Wed, Jan 19, 2011 at 1:15 PM, Earl Hood wrote: > On Wed, Jan 19, 2011 at 2:11 PM, Paul Libbrecht wrote: > >

Re: AW: Best practices for multiple languages?

2011-01-19 Thread Bill Janssen
Paul Libbrecht wrote: > I did several changes of this sort and the precision and recall > measures went better in particular in presence of language-indication > failure which happened to be very common in our authoring environment. There are two kinds of failures: no language, or wrong languag

Re: AW: Best practices for multiple languages?

2011-01-19 Thread Trejkaz
On Thu, Jan 20, 2011 at 9:08 AM, Paul Libbrecht wrote: >>> Wouldn't it be better to prefer precise matches (a field that is >>> analyzed with StandardAnalyzer for example) but also allow matches are >>> stemmed. >> >> StandardAnalyzer isn't quite precise, is it?  StandardFilter does some >> kind o

Re: AW: Best practices for multiple languages?

2011-01-19 Thread Paul Libbrecht
On 19 Jan 2011 at 20:56, Bill Janssen wrote: > Paul Libbrecht wrote: > >> So you are only indexing "analyzed" and querying "analyzed". Is that correct? > > Yes, that's correct. I fall back to StandardAnalyzer if no > language-specific analyzer is available. > >> Wouldn't it be better to

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread viruslviv
Where do you get your Lucene/Solr downloads from? [] ASF Mirrors (linked in our release announcements or via the Lucene website) [X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) [] I/we build them from source via an SVN/Git checkout. [] Other (someone in your company mirrors

Re: search on a field that is NOT_ANALYZED

2011-01-19 Thread Earl Hood
On Wed, Jan 19, 2011 at 2:11 PM, Paul Libbrecht wrote: > I think you should use a TermQuery. How about IndexReader.termDocs()? >> I am trying to use >> *IndexSearcher

Re: search on a field that is NOT_ANALYZED

2011-01-19 Thread Paul Libbrecht
I think you should use a TermQuery. paul On 19 Jan 2011 at 20:03, Yuhan Zhang wrote: > Hi all, > > I am trying to use > *IndexSearcher > * to retri

Re: AW: Best practices for multiple languages?

2011-01-19 Thread Bill Janssen
Paul Libbrecht wrote: > So you are only indexing "analyzed" and querying "analyzed". Is that correct? Yes, that's correct. I fall back to StandardAnalyzer if no language-specific analyzer is available. > Wouldn't it be better to prefer precise matches (a field that is > analyzed with StandardA
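The fallback described in this message (use a language-specific analyzer when one exists, otherwise StandardAnalyzer) can be sketched in plain Java without the Lucene dependency. This is only an illustration: the class `AnalyzerChooserSketch`, its methods, and the use of analyzer names as plain strings are assumptions made here, not code from the thread or from UpLib.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps a detected language code to an analyzer name,
// falling back to "StandardAnalyzer" when nothing is registered.
class AnalyzerChooserSketch {
    static final String FALLBACK = "StandardAnalyzer";

    private final Map<String, String> byLanguage = new HashMap<>();

    // Register an analyzer name for a language code such as "fr" or "de".
    void register(String languageCode, String analyzerName) {
        byLanguage.put(languageCode.toLowerCase(Locale.ROOT), analyzerName);
    }

    // Return the registered analyzer name, or the fallback if the language
    // is unknown or could not be detected (null).
    String choose(String languageCode) {
        if (languageCode == null) {
            return FALLBACK;
        }
        return byLanguage.getOrDefault(languageCode.toLowerCase(Locale.ROOT), FALLBACK);
    }
}
```

In a real setup the map values would be analyzer instances rather than names, and the same choice would have to be made at both index time and query time so that the two sides stay consistent.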

search on a field that is NOT_ANALYZED

2011-01-19 Thread Yuhan Zhang
Hi all, I am trying to use IndexSearcher to retrieve a doc from an existing index by reading a field that is NOT_ANALYZED. I am trying to use that field

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread cn.mingyuan
[X] ASF Mirrors (linked in our release announcements or via the Lucene website) [] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) [] I/we build them from source via an SVN/Git checkout. [] Other (someone in your company mirrors them internally or via a downstream project)

Re: AW: Best practices for multiple languages?

2011-01-19 Thread Paul Libbrecht
So you are only indexing "analyzed" and querying "analyzed". Is that correct? Wouldn't it be better to prefer precise matches (a field that is analyzed with StandardAnalyzer, for example) but also allow matches that are stemmed? paul On 19 Jan 2011 at 19:21, Bill Janssen wrote: > Clemens Wyss w

Re: AW: Best practices for multiple languages?

2011-01-19 Thread Bill Janssen
Clemens Wyss wrote: > > 1) Docs in different languages -- every document is one language > > 2) Each document has fields in different languages > We mainly have 1)-models I've recently done this for UpLib. I run a language-guesser over the document to identify the primary language when the docu

Re: Best practices for multiple languages?

2011-01-19 Thread Paul Libbrecht
Because it does not find "junks" when you search for "junk", or "chevaux" when you search for "cheval". paul On 19 Jan 2011 at 18:59, Luca Rondanini wrote: > why not just using the StandardAnalyzer? it works pretty well even with > Asian languages! > > > > On Wed, Jan 19, 2011 at 12:23 AM, Shai E
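The point about "junk"/"junks" and "cheval"/"chevaux" can be illustrated with a toy comparison of exact token matching against matching on normalized tokens. This is a deliberately crude sketch: `toyStem` and its two suffix rules are invented here for the two examples in the message and are not how Lucene's language-specific analyzers or any real stemmer work.

```java
import java.util.List;
import java.util.Locale;

// Toy illustration only: two hard-coded suffix rules, chosen to cover the
// examples from the message. A real stemmer is far more careful.
class ToyStemmingSketch {
    // Hypothetical normalizer: "chevaux" -> "cheval", "junks" -> "junk".
    static String toyStem(String token) {
        String t = token.toLowerCase(Locale.ROOT);
        if (t.endsWith("aux") && t.length() > 4) {
            return t.substring(0, t.length() - 3) + "al";
        }
        if (t.endsWith("s") && t.length() > 3) {
            return t.substring(0, t.length() - 1);
        }
        return t;
    }

    // Match only when the query token equals a document token (ignoring case).
    static boolean exactMatch(String query, List<String> docTokens) {
        String q = query.toLowerCase(Locale.ROOT);
        for (String token : docTokens) {
            if (token.toLowerCase(Locale.ROOT).equals(q)) {
                return true;
            }
        }
        return false;
    }

    // Match when query and document tokens agree after normalization.
    static boolean stemmedMatch(String query, List<String> docTokens) {
        String q = toyStem(query);
        for (String token : docTokens) {
            if (toyStem(token).equals(q)) {
                return true;
            }
        }
        return false;
    }
}
```

With a document containing only the token "junks", `exactMatch("junk", ...)` fails while `stemmedMatch("junk", ...)` succeeds, which is the recall gap the message describes for an analyzer that does no stemming.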

Re: Best practices for multiple languages?

2011-01-19 Thread Luca Rondanini
Why not just use the StandardAnalyzer? It works pretty well even with Asian languages! On Wed, Jan 19, 2011 at 12:23 AM, Shai Erera wrote: > If you index documents, each in a different language, but all its fields > are > of the same language, then what you can do is the following: > > Creat

Re: recurrent IO/CPU peaks

2011-01-19 Thread Michael McCandless
This is normal behavior, unfortunately. The default LogByteSizeMergePolicy does mergeFactor (default 10) small merges in a row, then must do a 10X bigger merge. Every 100 small merges it does a 100X bigger merge, etc. You can try using the BalancedSegmentMergePolicy (in contrib/misc) -- it tries
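The merge cascade described in this reply can be made concrete with a small simulation. The sketch below is a simplified model written for illustration, not Lucene's actual LogByteSizeMergePolicy: it assumes every flush produces a segment of size 1 and that exactly `mergeFactor` equal-sized segments are merged as soon as they accumulate. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified model of a logarithmic merge cascade (assumption: unit-sized
// flushes, merges triggered exactly when mergeFactor segments share a level).
class MergeCascadeSketch {
    // Returns, for each merged-segment size, how many merges produced it.
    static Map<Long, Integer> simulate(int flushes, int mergeFactor) {
        List<Integer> segmentsPerLevel = new ArrayList<>();
        Map<Long, Integer> mergesBySize = new TreeMap<>();
        for (int f = 0; f < flushes; f++) {
            int level = 0;
            increment(segmentsPerLevel, level); // a new flushed segment
            long mergedSize = 1;
            // Cascade: a full level is merged into one segment on the next level.
            while (segmentsPerLevel.get(level) == mergeFactor) {
                segmentsPerLevel.set(level, 0);
                mergedSize *= mergeFactor;
                mergesBySize.merge(mergedSize, 1, Integer::sum);
                level++;
                increment(segmentsPerLevel, level);
            }
        }
        return mergesBySize;
    }

    private static void increment(List<Integer> counts, int level) {
        while (counts.size() <= level) {
            counts.add(0);
        }
        counts.set(level, counts.get(level) + 1);
    }
}
```

Under these assumptions, `simulate(1000, 10)` records 100 merges producing size-10 segments, 10 merges producing size-100 segments, and 1 merge producing a size-1000 segment, which matches the pattern in the message: ten small merges, then one 10X bigger merge, and a 100X bigger merge after every hundred small ones. The large, infrequent merges are where the recurrent IO/CPU peaks would come from.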

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread Rafael Rossini
Where do you get your Lucene/Solr downloads from? > > [] ASF Mirrors (linked in our release announcements or via the Lucene > website) > > [X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) > > [] I/we build them from source via an SVN/Git checkout. > > [] Other (someone in your c

RE: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread Pierre GOSSE
Where do you get your Lucene/Solr downloads from? [] ASF Mirrors (linked in our release announcements or via the Lucene website) [X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) [X] I/we build them from source via an SVN/Git checkout. [] Other (someone in your company mirrors

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread Karl Wettin
On Jan 18, 2011, at 10:04 PM, Grant Ingersoll wrote: > As devs of Lucene/Solr, due to the way ASF mirrors, etc. works, we really > don't have a good sense of how people get Lucene and Solr for use in their > application. Because of this, there has been some talk of dropping Maven > support fo

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?(RefNo:115224)

2011-01-19 Thread Developer PAV(Panagiotis Vlissidis)
[X] ASF Mirrors (linked in our release announcements or via the Lucene website) [] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) [] I/we build them from source via an SVN/Git checkout. [] Other (someone in your company mirrors them internally or via a downstream project)

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread Dawn Zoë Raison
On 18/01/2011 21:04, Grant Ingersoll wrote: [X] ASF Mirrors (linked in our release announcements or via the Lucene website) [] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) [] I/we build them from source via an SVN/Git checkout. [] Other (someone in your company mirrors them

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread heikki
> > > > > Where do you get your Lucene/Solr downloads from? > > > > [] ASF Mirrors (linked in our release announcements or via the Lucene > website) > > > > [X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) > > > > [] I/we build them from source via an SVN/Git checkout. > > > > []

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread Daan de Wit
> > Where do you get your Lucene/Solr downloads from? > [] ASF Mirrors (linked in our release announcements or via the Lucene website) > [X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) > [] I/we build them from source via an SVN/Git checkout. > [] Other (someone in your comp

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread Ivan Vasilev
On 18.1.2011 г. 23:04, Grant Ingersoll wrote: [x] ASF Mirrors (linked in our release announcements or via the Lucene website) [] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) [x] I/we build them from source via an SVN/Git checkout. [] Other (someone in your company mirrors th

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread Paul Libbrecht
Grant Ingersoll wrote: > Where do you get your Lucene/Solr downloads from? > > [x] ASF Mirrors (linked in our release announcements or via the Lucene > website) > > [X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) > > [X] I/we build them from source via an SVN/Git checkout.

Re: Best practices for multiple languages?

2011-01-19 Thread Shai Erera
If you index documents, each in a different language, but all its fields are of the same language, then what you can do is the following: Create separate indexes per language --- This will work and is not too hard to set up. Requires some mainten

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread Péter Király
> [x] ASF Mirrors (linked in our release announcements or via the Lucene > website) > [] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) > [x] I/we build them from source via an SVN/Git checkout. > I rarely build, only if I would like to try an interesting patch. > [] Other (someon

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread Jan Engler
Where do you get your Lucene/Solr downloads from? [X] ASF Mirrors (linked in our release announcements or via the Lucene website) [X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) [] I/we build them from source via an SVN/Git checkout. [] Other (someone in your company

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-19 Thread Wouter Heijke
> > Where do you get your Lucene/Solr downloads from? > > [] ASF Mirrors (linked in our release announcements or via the Lucene > website) > > [X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) > > [] I/we build them from source via an SVN/Git checkout. > > [] Other (someone in you