from:"Manuel Le Normand"

Re: Too many unique terms

2013-04-27 Thread Manuel Le Normand

Hi, many thanks for the previous reply. For now I'm not able to separate out these useless words, whether they contain words or digits. I liked the idea of iterating with TermsEnum. Will it also delete the occurrences of these terms in the other file formats (termVectors etc.)? As I un

Re: Too many unique terms

2013-04-29 Thread Manuel Le Normand

On Mon, Apr 29, 2013 at 1:22 PM, Adrien Grand wrote: > On Sat, Apr 27, 2013 at 8:41 PM, Manuel Le Normand > wrote: > > Hi, real thanks for the previous reply. > > For now i'm not able to make a separation between these useless words, > > whether they contain wor

Profiling Solr Lucene for query

2013-09-08 Thread Manuel Le Normand

Hello all, Looking at the slowest 10% of queries, I get very bad performance (~60 sec per query). These queries have lots of conditions on my main field (more than a hundred), including phrase queries and rows=1000. I do return only ids though. I can quite firmly say that this bad performance is due

Expunge deleting using excessive transient disk space

2013-09-08 Thread Manuel Le Normand

Hi again, In order to delete part of my index I run a delete by query that intends to erase 15% of the docs. I added these params to the solrconfig.xml 2 2 5000.0 10.0 15.0 The extra params were added in order to promote merging of old segments but with a restriction on the transient d

Re: Expunge deleting using excessive transient disk space

2013-09-08 Thread Manuel Le Normand

you have when you > try the merge? > > Is this a typo? > > 2 > Note name=| > > Best > Erick > > > On Sun, Sep 8, 2013 at 7:26 AM, Manuel Le Normand < > manuel.lenorm...@gmail.com> wrote: > > > Hi again, > > In order to delete part of my i

Re: Profiling Solr Lucene for query

2013-09-08 Thread Manuel Le Normand

al boxes? How much memory per JVM? How > many JVMs? How much physical memory per box? > > 'Cause this seems excessive time-wise for loading the info. > > Best > Erick > > > On Sun, Sep 8, 2013 at 7:03 AM, Manuel Le Normand < > manuel.lenorm...@gmail.com>

Understanding FST Prefix & CheckIndex output

2013-09-22 Thread Manuel Le Normand

Hi there, I am trying to dive deep into the inner LucenePostingFormat to see what I might do to improve query performance. I'm curious about the termBlock stats that I get from checkIndex -verbose. What does the following mean: index FST bytes - the FST size, which is the field's partition of the .

segment corruption - ArrayIndexOutOfBoundsException

2013-10-22 Thread Manuel Le Normand

Hello, My Lucene index contains 46 segments with a total of 4M docs. Lately, while running queries I started getting occasional exceptions from this index: java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.lucene41.ForUtil.readBlock(ForUtil.java:196) at org.apache.lucene.codecs.lu

Indexing useful N-grams (phrases & entities) and adding payloads

2014-03-12 Thread Manuel Le Normand

Hi, I posted this question on the Solr mailing list but it has more to do with Lucene. I have a performance and scoring problem for phrase queries: 1. Performance - phrase queries involving frequent terms are very slow due to reading large position posting lists. 2. Scoring - I wan

Re: Indexing useful N-grams (phrases & entities) and adding payloads

2014-03-12 Thread Manuel Le Normand

e surrounding context / document? > > Mike McCandless > > http://blog.mikemccandless.com > > > On Wed, Mar 12, 2014 at 5:27 AM, Manuel Le Normand > wrote: > > Hi, > > I posted this question on the Solr mailing list but it has more to do > with > > Lucene. &

Re: Question about Payloads in Lucene 4.5

2014-03-23 Thread Manuel Le Normand

Hello Rohit, We had a similar query-time bottleneck when attempting to map Lucene's internal ids to the uniqueKey, especially since we generally return only the uniqueKey to the user and had no other use for the stored field. As you noted, every internal id --> uniqueKey id requires a disk seek and as

Controlling FuzzyQuery edit type

2014-09-28 Thread Manuel Le Normand

Hello, In FuzzyQuery I see it is possible to control the char transposition option with a boolean (which, by the way, seems hardcoded and not configurable). Is it possible to control the other edit types (char insertion, deletion or substitution) that are allowed somewhere in the code? Thanks, Manuel

Re: Controlling FuzzyQuery edit type

2014-09-29 Thread Manuel Le Normand

Never mind, I just wrote a custom function that outputs the edit type for each word. Thanks
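
The poster's function is not shown in the thread. As a rough illustration only, a function of that kind might look like the Python sketch below. It assumes at most one edit between the two words and uses the four edit types named in the question (insertion, deletion, substitution, transposition). The names `edit_type` and `_is_one_insertion` are hypothetical and are not part of Lucene's FuzzyQuery API.

```python
def _is_one_insertion(shorter: str, longer: str) -> bool:
    """Return True if `longer` equals `shorter` plus exactly one extra character."""
    i = 0
    # Skip the common prefix of the two words.
    while i < len(shorter) and shorter[i] == longer[i]:
        i += 1
    # After dropping one character from `longer`, the remainders must match.
    return shorter[i:] == longer[i + 1:]


def edit_type(source: str, target: str) -> str:
    """Classify the single edit that turns `source` into `target`.

    Hypothetical helper: returns "none", "substitution", "transposition",
    "insertion", "deletion", or "multiple" when more than one edit is needed.
    """
    if source == target:
        return "none"
    if len(source) == len(target):
        # Positions where the two words disagree.
        diffs = [i for i in range(len(source)) if source[i] != target[i]]
        if len(diffs) == 1:
            return "substitution"
        if (
            len(diffs) == 2
            and diffs[1] == diffs[0] + 1
            and source[diffs[0]] == target[diffs[1]]
            and source[diffs[1]] == target[diffs[0]]
        ):
            # Two adjacent characters were swapped.
            return "transposition"
        return "multiple"
    if len(target) == len(source) + 1 and _is_one_insertion(source, target):
        return "insertion"
    if len(source) == len(target) + 1 and _is_one_insertion(target, source):
        return "deletion"
    return "multiple"
```

For example, `edit_type("solr", "solar")` classifies the change as an insertion, and `edit_type("lucene", "luecne")` as a transposition. Pairs that need more than one edit are reported as "multiple" rather than broken down further.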

13 matches
