Re: SOLR/LUCENE 5.2.1: Solution of CharTermAtt, StartOffset, EndOffset, Position

2015-08-07 Thread Shai Erera
I think you can just write a TokenFilter which sets the PositionIncrementAttribute of every other token to 0. Then you can use StandardTokenizer and wrap it with that filter. Shai
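A minimal sketch of the filter Shai describes, assuming Lucene 5.x APIs (the class name is made up, and the "every other token" rule is taken literally — in practice you would decide which tokens to stack based on your own logic):

```java
import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

// Hypothetical filter: gives every second token a position increment of 0,
// so adjacent tokens (e.g. "wi" and "fi") end up sharing a position.
public final class EveryOtherZeroPosIncFilter extends TokenFilter {
  private final PositionIncrementAttribute posIncAtt =
      addAttribute(PositionIncrementAttribute.class);
  private boolean zeroNext = false;

  public EveryOtherZeroPosIncFilter(TokenStream input) {
    super(input);
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (!input.incrementToken()) {
      return false;
    }
    if (zeroNext) {
      // Stack this token on the previous one: same position, no increment.
      posIncAtt.setPositionIncrement(0);
    }
    zeroNext = !zeroNext;
    return true;
  }

  @Override
  public void reset() throws IOException {
    super.reset();
    zeroNext = false;
  }
}
```

Wrapping a StandardTokenizer (or any Tokenizer) with this filter inside a custom Analyzer's createComponents is then all that is needed.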

SOLR/LUCENE 5.2.1: Solution of CharTermAtt, StartOffset, EndOffset, Position

2015-08-07 Thread Văn Châu
Hi, I'm looking for a solution for the following format in the solr/lucene 5.2.1 version. Example text: "fast wi fi network is down". Using solr.StandardTokenizerFactory, the positions displayed are: fast (1) -> wi (2) -> fi (3) -> network (4) -> is (5) -> down (6)

PerFieldAnalyzerWrapper does not seem to allow use of a custom analyzer

2015-08-07 Thread Bauer, Herbert S. (Scott)
I can't seem to detect any issues with the final custom analyzer declared in this code snippet (the one that attempts to use a PatternMatchingTokenizer and is initialized as sa), but it doesn't seem to be hit when I run my indexing code despite being in the map. It is indexed finally, but I assu
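For reference, wiring a custom analyzer into PerFieldAnalyzerWrapper looks roughly like this under Lucene 5.x (the field names and fallback analyzer here are hypothetical). A common pitfall is building the wrapper but then passing a different analyzer to IndexWriterConfig, in which case the per-field map is never consulted:

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;

public class PerFieldExample {
  public static Analyzer buildAnalyzer() {
    Map<String, Analyzer> perField = new HashMap<>();
    // Hypothetical field: treat "id" as a single untokenized term.
    perField.put("id", new KeywordAnalyzer());
    // Fields not in the map fall through to the default analyzer given first.
    return new PerFieldAnalyzerWrapper(new StandardAnalyzer(), perField);
  }

  public static void main(String[] args) {
    // The wrapper itself must reach the IndexWriter, or it is never used.
    IndexWriterConfig iwc = new IndexWriterConfig(buildAnalyzer());
  }
}
```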

Prioritizing BooleanQueries to Improve Performance

2015-08-07 Thread markh
If I have a BooleanQuery which has two subqueries, one fast, one slow (fastQuery) && (slowQuery) Is there a way to tell Lucene to execute the fastQuery first so it can potentially skip the slowQuery if there are no results from the fastQuery? I don't think making the slow query a filter (Occur.
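Two points are worth noting here. First, Lucene's conjunction scoring already tries to lead with the cheapest clause: each Scorer reports a cost(), and the conjunction iterates the lowest-cost iterator first, advancing the others only to confirm matches. Second, since 5.0 a non-scoring clause can be expressed with Occur.FILTER. A self-contained sketch (field names and documents are hypothetical):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.RAMDirectory;

public class FilterClauseExample {
  public static void main(String[] args) throws Exception {
    RAMDirectory dir = new RAMDirectory();
    IndexWriter w = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));
    Document d = new Document();
    d.add(new StringField("color", "red", Field.Store.NO));
    d.add(new StringField("size", "large", Field.Store.NO));
    w.addDocument(d);
    w.close();

    // In 5.2.x BooleanQuery is still mutable; 5.3+ uses BooleanQuery.Builder.
    BooleanQuery bq = new BooleanQuery();
    bq.add(new TermQuery(new Term("color", "red")), BooleanClause.Occur.MUST);
    // FILTER matches like MUST but does not contribute to the score.
    bq.add(new TermQuery(new Term("size", "large")), BooleanClause.Occur.FILTER);

    DirectoryReader r = DirectoryReader.open(dir);
    TopDocs hits = new IndexSearcher(r).search(bq, 10);
    System.out.println(hits.totalHits);
    r.close();
  }
}
```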

Re: new to Lucene

2015-08-07 Thread Erick Erickson
2. Is the "Index" saved as a file or loaded into the memory? Adding to Modassar's comments: Almost all "real" implementations save the index to disk and read selected portions back into memory as needed, otherwise the data isn't permanent. In the Lucene world, I'd start with NRTCachingDirectory.
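Erick's suggestion can be sketched as follows: NRTCachingDirectory wraps an on-disk FSDirectory, caching small, freshly flushed segments in RAM while large merges still go straight to disk. The directory location and size thresholds here are assumptions:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.NRTCachingDirectory;

public class NrtDirExample {
  public static void main(String[] args) throws Exception {
    Path path = Files.createTempDirectory("idx"); // hypothetical index location
    // Segments under 5 MB from merges, up to 60 MB total, are cached in RAM.
    NRTCachingDirectory dir =
        new NRTCachingDirectory(FSDirectory.open(path), 5.0, 60.0);
    IndexWriter w = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));
    Document d = new Document();
    d.add(new TextField("contents", "hello lucene", Field.Store.NO));
    w.addDocument(d);
    // Near-real-time reader: sees the new doc without a full commit.
    DirectoryReader r = DirectoryReader.open(w, true);
    System.out.println(r.numDocs());
    r.close();
    w.close();
    dir.close();
  }
}
```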

Re: new to Lucene

2015-08-07 Thread Modassar Ather
Please see my comments in-line. 1. For the indexing of these chapters, how many fields need to be declared? Can I just declare only one field for the contents? This depends on what you need to search with. E.g. if only the plain content (chapters) is to be searched, then one indexed field is requ

new to Lucene

2015-08-07 Thread Nantha Kumar Subramaniam
Good day. I am new to Lucene and have started to explore it. I have questions. I have a book in which all the chapters are in PDF. I plan to index all these individual chapters in Lucene, using Tika for the text extraction. 1. For the indexing of these chapters, how many fields need to b
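A common layout for this chapter use case is one analyzed field for the extracted text plus a stored metadata field to identify the chapter; Tika's facade API handles the extraction in a single call. A sketch (field names and the file path are hypothetical):

```java
import java.io.File;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.tika.Tika;

public class ChapterDoc {
  public static Document build(String chapterId, String extractedText) {
    Document doc = new Document();
    // Stored, not tokenized: lets you retrieve and identify the chapter.
    doc.add(new StringField("chapter", chapterId, Field.Store.YES));
    // Indexed and tokenized: the single field you actually search on.
    doc.add(new TextField("contents", extractedText, Field.Store.NO));
    return doc;
  }

  public static void main(String[] args) throws Exception {
    // Tika's facade extracts plain text from a PDF in one call.
    String text = new Tika().parseToString(new File("chapter01.pdf")); // hypothetical file
    Document doc = build("chapter01", text);
  }
}
```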

Re: Mapping doc values back to doc ID (in decent time)

2015-08-07 Thread Adrien Grand
On Fri, Aug 7, 2015 at 8:30 AM, Trejkaz wrote:
> for (int ourId = 0; ourId < count; ourId++) {
>     builder.clear();
>     NumericUtils.longToPrefixCoded(ourId, 0, builder);
>     termsEnum.seekExact(builder.get());
>     postingsEnum = termsEnum.postings(null, postingsE
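A completed version of the lookup loop from this thread might look like the following under Lucene 5.x: seek the full-precision numeric term (shift 0) for each application-level id, then read the matching Lucene doc id from the postings. The field name and surrounding setup are assumptions:

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.PostingsEnum;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.BytesRefBuilder;
import org.apache.lucene.util.NumericUtils;

public class IdLookup {
  // Map each application-level long id (indexed as a numeric field) back to
  // its Lucene doc id; -1 means the id was not found.
  public static int[] lookup(IndexReader reader, String field, int count) throws Exception {
    int[] docIds = new int[count];
    Terms terms = MultiFields.getTerms(reader, field);
    TermsEnum termsEnum = terms.iterator();
    BytesRefBuilder builder = new BytesRefBuilder();
    PostingsEnum postingsEnum = null;
    for (int ourId = 0; ourId < count; ourId++) {
      builder.clear();
      // Shift 0 encodes the full-precision term for this value.
      NumericUtils.longToPrefixCoded(ourId, 0, builder);
      if (termsEnum.seekExact(builder.get())) {
        postingsEnum = termsEnum.postings(null, postingsEnum);
        int doc = postingsEnum.nextDoc();
        docIds[ourId] = (doc == DocIdSetIterator.NO_MORE_DOCS) ? -1 : doc;
      } else {
        docIds[ourId] = -1;
      }
    }
    return docIds;
  }
}
```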