Re: Blåbærsyltetøy v.s. Räksmörgås

2013-05-22 Thread Petite Abeille
On May 22, 2013, at 7:08 PM, Karl Wettin wrote:
>> * Use a filter after ASCIIFoldingFilter that discriminates against all use of ae, oe, oo, and other combinations of double vowels, just keeping the first one.
>
> I ended up with that solution.
>
> https://issues.apache.org/jira/browse/LUCENE-5013

Re: Blåbærsyltetøy v.s. Räksmörgås

2013-05-22 Thread Karl Wettin
On 22 May 2013 at 14:37, Karl Wettin wrote:
> * Use a filter after ASCIIFoldingFilter that discriminates against all use of ae, oe, oo, and other combinations of double vowels, just keeping the first one.

I ended up with that solution.

https://issues.apache.org/jira/browse/LUCENE-5013
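
The folding idea quoted above can be sketched in plain Python. This is only an illustration of the approach, not the code attached to LUCENE-5013: the function name `fold_scandinavian`, the character table, and the exact list of double-vowel pairs (aa, ae, ao, oe, oo) are assumptions made for the example.

```python
import re

# Assumed mapping: Swedish and Danish/Norwegian vowels fold to the same ASCII letter.
_FOLD = str.maketrans({"å": "a", "ä": "a", "æ": "a", "ö": "o", "ø": "o"})

# Assumed set of double-vowel transliterations; only the first vowel is kept.
_DOUBLE_VOWEL = re.compile(r"a[aeo]|o[eo]")


def fold_scandinavian(token: str) -> str:
    """Fold a token so native and transliterated spellings compare equal."""
    folded = token.lower().translate(_FOLD)
    # Replace each matched pair (e.g. "ae", "oe") with its first vowel only.
    return _DOUBLE_VOWEL.sub(lambda m: m.group(0)[0], folded)


if __name__ == "__main__":
    for word in ("Göteborg", "Gøteborg", "Goeteborg"):
        print(word, "->", fold_scandinavian(word))
```

The folding is deliberately lossy: any token containing one of these pairs is shortened, so with this sketch an unrelated word such as "book" also becomes "bok".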

Re: Ordering of terms in TermsEnum

2013-05-22 Thread Michael McCandless
On Wed, May 22, 2013 at 11:28 AM, Brendan Grainger wrote:
> Hi All,
>
> Sorry if this is a stupid question, but I'm still catching up with some of the new APIs and I want to make sure my assumptions are correct.
>
> Anyway, I'm using the Solr PathHierarchyTokenizer to create a number of paths, e.g. f

Ordering of terms in TermsEnum

2013-05-22 Thread Brendan Grainger
Hi All, Sorry if this is a stupid question, but I'm still catching up with some of the new APIs and I want to make sure my assumptions are correct. Anyway, I'm using the Solr PathHierarchyTokenizer to create a number of paths, e.g. for a book object say with a category field of /compsci/search/lucene th
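
As a rough illustration of the setup described here, the sketch below expands a category path into its ancestor paths and then sorts the resulting terms by their UTF-8 bytes. It is plain Python, not Lucene or Solr code: the helper names `path_prefixes` and `in_term_order` are hypothetical, and both the one-token-per-ancestor expansion and the byte-wise term ordering are assumptions about how the tokenizer and TermsEnum behave, not something the message confirms.

```python
from typing import Iterable, List


def path_prefixes(path: str, delimiter: str = "/") -> List[str]:
    """Return every ancestor path of `path`, shortest first."""
    parts = [p for p in path.split(delimiter) if p]
    return [delimiter + delimiter.join(parts[: i + 1]) for i in range(len(parts))]


def in_term_order(terms: Iterable[str]) -> List[str]:
    """De-duplicate terms and sort them by UTF-8 bytes (assumed term order)."""
    return sorted(set(terms), key=lambda t: t.encode("utf-8"))


if __name__ == "__main__":
    terms = path_prefixes("/compsci/search/lucene") + path_prefixes("/compsci/ai")
    for term in in_term_order(terms):
        print(term)
```

Under these assumptions a parent path always sorts directly before its descendants, because it is a strict byte prefix of each of them.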

Re: Query with phrases, wildcards and fuzziness

2013-05-22 Thread Jack Krupansky
Using BooleanQuery with SHOULD clauses is the way to go. There are some nuances, but you may not run into them. Sometimes the issue is the query parser syntax rather than the Lucene BooleanQuery (BQ) itself. For example, with a string of AND and OR, they all get parsed into a single BQ, which is clearly

Getting position increments directly from the index

2013-05-22 Thread Igor Shalyminov
Hello! I'm storing sentence bounds in the index as position increments of 1000. I want to get the total number of sentences in the index, i.e. the number of "1000" increment values. Can I do that in some way other than loading each document and extracting position increments with a cus
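
To make the quantity being asked for concrete, here is a minimal Python sketch of the per-document counting the poster wants to avoid. It assumes each document's token positions are already available as a sorted list of integers; how those positions are read from the index is not shown, and the function names are hypothetical.

```python
from typing import Iterable, Sequence

SENTENCE_GAP = 1000  # increment value the poster uses to mark a sentence bound


def count_sentence_gaps(positions: Sequence[int], gap: int = SENTENCE_GAP) -> int:
    """Count consecutive position pairs whose difference equals `gap`.

    The increment of the very first token is ignored, since it has no predecessor.
    """
    return sum(1 for prev, cur in zip(positions, positions[1:]) if cur - prev == gap)


def total_sentence_gaps(docs: Iterable[Sequence[int]], gap: int = SENTENCE_GAP) -> int:
    """Sum the per-document gap counts over a collection of position lists."""
    return sum(count_sentence_gaps(positions, gap) for positions in docs)


if __name__ == "__main__":
    doc = [0, 1, 2, 1002, 1003, 2003]
    print(count_sentence_gaps(doc))
```

This walks every position of every document, which is exactly the cost the question hopes to avoid by reading the counts from the index directly.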

Blåbærsyltetøy v.s. Räksmörgås

2013-05-22 Thread Karl Wettin
This is a question (or perhaps a line of thought) regarding the mutually intelligible Scandinavian languages Danish, Norwegian and Swedish. The Swedish letters åäö are in fact the same letters as the Danish/Norwegian åæø. A Norwegian writing about the Swedish city of Göteborg writes Gøteborg and

Re: Query with phrases, wildcards and fuzziness

2013-05-22 Thread Ross Simpson
One further question: If I wanted to construct my query using Query implementations instead of a QueryParser (e.g. TermQuery, WildcardQuery, etc.), what's the right way to duplicate the "OR" functionality I wrote about below? As I mentioned, I've read that wrapping query objects in a BooleanQ