Re: Grouping on multiple shards possible in lucene?

2012-11-19 Thread Ravikumar Govindarajan
Thanks Mike. Actually, I think I can eliminate sort-by-time, if I am able to iterate postings in reverse doc-id order. Is this possible in lucene? Also, for a TopN query sorted by doc-id will the query terminate early? -- Ravi On Fri, Nov 16, 2012 at 9:40 PM, Michael McCandless < luc...@mikemccan

Re: ANN: UweSays Query Operator

2012-11-19 Thread Uwe Schindler
Lol! Many thanks for this support! Uwes Otis Gospodnetic wrote: >Hi, > >Quick announcement for Uwe & Friends. > >UweSays is now a super-duper-special query operator over on >http://search-lucene.com/ . Now whenever you want to know what Uwe >says >about something just start the query with

Re: TokenStreamComponents in Lucene 4.0

2012-11-19 Thread Carsten Schnober
On 19.11.2012 17:44, Carsten Schnober wrote: Hi again, just a little update: > However, after switching to Lucene 4 and TokenStreamComponents, I'm > getting a strange behaviour: only the first document in the collection > is tokenized properly. The others do appear in the index, but > un-tokeni

TokenStreamComponents in Lucene 4.0

2012-11-19 Thread Carsten Schnober
Hi, I have recently updated to Lucene 4.0, but I am having problems with my custom Analyzer/Tokenizer. In the days of Lucene 3.6, it would work like this: 0. define constants lucene_version and indexdir 1. create an Analyzer: analyzer = new KoraAnalyzer() (our custom Analyzer) 2. create an IndexWriter

Re: Question about ordering rule of SpanNearQuery

2012-11-19 Thread Jack Krupansky
Unfortunately, there doesn't appear to be any Javadoc that discusses what factors are used to score spans. For example, how to relate the number of times a span matches in a document vs. the exactness of each span match. -- Jack Krupansky -Original Message- From: 杨光 Sent: Monday, Nov

Re: what is the offsets and payload in DocsAndPositionsEnum for ??

2012-11-19 Thread Michael McCandless
A new postings format would be tricky because you have new attributes you want to index. The DocsAndPositionsEnum does have an attributes source, but this is not well explored, and there are known problems (they can't be easily merged in the composite reader case). So that's why I suggested packi

Question about ordering rule of SpanNearQuery

2012-11-19 Thread 杨光
Hi all, We are currently developing a platform with Lucene. The ordering rule we specified is that the document with the shortest distance between query terms ranks first. But SpanNearQuery may behave a little differently. It returns all the documents within the qualifying distance. So I am con

Re: German 'ue' -> 'u' conversion

2012-11-19 Thread Wouter Heijke
Hi, We use a solution where we have our own implementation similar to ASCIIFoldingFilter for German language specific characters (and also French and Dutch). Wouter > Hello, > > I have two questions regarding handling German umlauts in Lucene: > > 1. I'm trying to find a way to convert German Umlau

RE: German 'ue' -> 'u' conversion

2012-11-19 Thread Dyga, Adam
Yes, that would solve my question 2. I can convert all umlauts to 'ue', 'ae', etc. form before the tokens get to other filters and it should work fine. Thanks, Adam -Original Message- From: Igal @ getRailo.org [mailto:i...@getrailo.org] Sent: 19 November 2012 11:15 To: java-user@lucene

Re: German 'ue' -> 'u' conversion

2012-11-19 Thread Igal @ getRailo.org
if your needs are so specific -- you can always build a NormalizeCharMap and use MappingCharFilter Igal On 11/19/2012 2:11 AM, Dyga, Adam wrote: I did, but none of them can do it (at least in default configuration). Regards, AD -Original Message- From: Igal @ getRailo.org [mailto:i

RE: German 'ue' -> 'u' conversion

2012-11-19 Thread Dyga, Adam
I did, but none of them can do it (at least in default configuration). Regards, AD -Original Message- From: Igal @ getRailo.org [mailto:i...@getrailo.org] Sent: 19 November 2012 11:10 To: java-user@lucene.apache.org Subject: Re: German 'ue' -> 'u' conversion look for filters that use t

Re: German 'ue' -> 'u' conversion

2012-11-19 Thread Igal @ getRailo.org
look for filters that use the ICU4J library On 11/19/2012 2:08 AM, Lutz Fechner wrote: Hi, we use a modified ISOLatin1AccentFilter bit to replace German accents by ae, oe, ue and so on for that purpose. In the code you will see a switch for the characters. You need to change it from case

RE: German 'ue' -> 'u' conversion

2012-11-19 Thread Lutz Fechner
Hi, we use a modified ISOLatin1AccentFilter bit to replace German accents by ae, oe, ue and so on for that purpose. In the code you will see a switch for the characters. You need to change it from case '\u00E4' : // small ä output[outputPos++] = 'a'; output[outputPos++] =
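
The switch-based replacement described in this message can be sketched in plain Java, independent of Lucene. This is only an illustration: the class name GermanUmlautExpander and the method expand are hypothetical, and the code is neither the poster's modified filter nor the actual ISOLatin1AccentFilter source. It only shows the idea of emitting two output characters for one accented input character.

```java
// Hypothetical helper, not part of Lucene: expands German umlauts and
// sharp s into their two-letter transliterations (ä -> ae, ö -> oe, ...).
final class GermanUmlautExpander {
    private GermanUmlautExpander() {}

    static String expand(String input) {
        // The output can be longer than the input, so leave some headroom.
        StringBuilder out = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '\u00E4': out.append("ae"); break; // small ä
                case '\u00F6': out.append("oe"); break; // small ö
                case '\u00FC': out.append("ue"); break; // small ü
                case '\u00C4': out.append("Ae"); break; // capital Ä
                case '\u00D6': out.append("Oe"); break; // capital Ö
                case '\u00DC': out.append("Ue"); break; // capital Ü
                case '\u00DF': out.append("ss"); break; // sharp s (ß)
                default:       out.append(c);           // everything else unchanged
            }
        }
        return out.toString();
    }
}
```

Inside a real token filter the same switch would write into the filter's output buffer instead of a StringBuilder; that wiring depends on the Lucene version and is left out here.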

German 'ue' -> 'u' conversion

2012-11-19 Thread Dyga, Adam
Hello, I have two questions regarding handling German umlauts in Lucene: 1. I'm trying to find a way to convert German Umlauts written as 'ue', 'ae', etc. to folded form 'u', 'a' and so on. This is done by GermanAnalyzer (and German2StemFilter used by it), but unfortunately it also does stemming w