Re: What is the best way to aggregate scores for sets of documents?

2013-11-07 Thread Alan Burlison
anted groupings with higher numbers of matching documents to score higher, so simple addition of the scores worked well. -- Alan Burlison -- - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional command

Re: What is the best way to aggregate scores for sets of documents?

2013-11-07 Thread Alan Burlison
query results, grouping by the upper-level construct and adding up all the scores for the sub-documents, then sorting by aggregated score. Crude, but gives good relevancy in the results. -- Alan Burlison
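The aggregation described in this message can be sketched roughly as follows. This is a minimal illustration in plain Java, not Lucene API code: the names ScoreAggregator, Hit and groupId are hypothetical, and it assumes each hit has already been reduced to the identifier of its upper-level grouping plus its relevance score.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: sums per-document scores by group.
final class ScoreAggregator {

    // One search hit: the id of the upper-level grouping it belongs to
    // (an assumed field) and the score the search engine gave it.
    static final class Hit {
        final String groupId;
        final float score;

        Hit(String groupId, float score) {
            this.groupId = groupId;
            this.score = score;
        }
    }

    // Adds up the scores of all hits sharing a groupId, then returns the
    // groups ordered by total score, highest first. Ties keep the order in
    // which the groups were first seen, because the sort is stable.
    static List<Map.Entry<String, Float>> aggregate(List<Hit> hits) {
        Map<String, Float> totals = new LinkedHashMap<>();
        for (Hit hit : hits) {
            totals.merge(hit.groupId, hit.score, Float::sum);
        }
        List<Map.Entry<String, Float>> ranked = new ArrayList<>(totals.entrySet());
        ranked.sort((a, b) -> Float.compare(b.getValue(), a.getValue()));
        return ranked;
    }
}
```

Because the totals are plain sums, a grouping with many matching sub-documents can outrank one with a single strong match, which is the behaviour the poster says he wanted.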

Re: queries with "&&" doesn't work but "AND" does

2013-10-10 Thread Alan Burlison
dard analyzer. Could someone please help me resolve this? Please let me know if you need more details of the implementation. The most likely cause is that the analyzer is discarding non-alphanumeric tokens. Use toString on the query returned by queryparser.parse() to

Writing Lucene analyzers - in Scala

2013-09-17 Thread Alan Burlison
def createComponents(fieldName: String, reader: Reader): TokenStreamComponents = { new TokenStreamComponents(new KeywordTokenizer(reader)) } override def getPositionIncrementGap(fieldName: String) = 100 } Hopefully this might help someone in the future who googles to see if it'

Re: Multiple field instances and Field.Store.NO

2013-09-17 Thread Alan Burlison
hether that one document you're looking at has the field ... That would make sense, yes. Thanks. -- Alan Burlison

Re: Multiple field instances and Field.Store.NO

2013-09-17 Thread Alan Burlison
On 16/09/2013 19:04, Alan Burlison wrote: Is Luke showing you stored fields? If so, this makes no sense ... Field.Store.NO (single or multiple calls) should have resulted in no stored fields. It shows the field but shows the content as I think perhaps what I'm seeing is an artefact o

Re: Multiple field instances and Field.Store.NO

2013-09-16 Thread Alan Burlison
> Is Luke showing you stored fields? If so, this makes no sense ... > Field.Store.NO (single or multiple calls) should have resulted in no > stored fields. It shows the field but shows the content as -- Alan Burlison

Re: Multiple field instances and Field.Store.NO

2013-09-16 Thread Alan Burlison
On 16 September 2013 12:40, Michael McCandless wrote: > If you use Field.Store.NO for all fields for a given document then no > field should have been stored. Can you boil this down to a small test > case? repeated calls to doc.add(new TextField("content", c, Field.Store.NO)) result in a sin

Re: Multiple field instances and Field.Store.NO

2013-09-16 Thread Alan Burlison
fields end up with multiple instances, unstored ones don't. -- Alan Burlison

Multiple field instances and Field.Store.NO

2013-09-16 Thread Alan Burlison
that expected or am I doing something dumb? Thanks, -- Alan Burlison

Re: Position increment clarification?

2013-09-15 Thread Alan Burlison
to display the position value of each instance of a duplicated field so I wasn't quite sure if what I was doing was actually working. -- Alan Burlison

Re: Position increment clarification?

2013-09-15 Thread Alan Burlison
nclusion I came to. It's easy enough to do, I'm using JavaMail to recursively traverse the mail file so I can separate out each mail and also deal with multipart mails as well as attachments, which I'm then feeding into Tika. Thank you for the inf

Position increment clarification?

2013-09-15 Thread Alan Burlison
it seem that manipulating the inter-token position increment isn't particularly useful. The second mechanism - overriding Analyzer.getPositionIncrementGap - does seem to work, but that obviously means putting each segment of the mbox file into a new field instance. Is that the preferr