Splitting word tokens - other languages

2011-02-17 Thread CassUser CassUser
Hey all, I'm somewhat new to Lucene, meaning I used it some time ago for a parser we wrote to tokenize a document into word grams. The approach I took was simple, as follows: 1. Extended the Lucene Analyzer. 2. In the tokenStream method, used ShingleMatrixFilter. Passed in the standard tokenizer,

Re: fields : stored and indexed

2011-02-17 Thread Ian Lea
In the second case I guess that searching might be faster by the odd fraction of a millisecond, but any effect will likely be dwarfed by most of the stuff on the Wiki page mentioned before. -- Ian. On Thu, Feb 17, 2011 at 12:56 PM, suman.holani wrote: > Hi , > > Thanks its useful. > > One thing

RE: fields : stored and indexed

2011-02-17 Thread suman.holani
Hi, Thanks, it's useful. One thing: if I reduce the number of indexed fields by setting them to INDEX.NO, will it affect searching speed? But yes, I will be storing those fields in the index. Let's say there are 5 fields A, B, C, D, E ---with the first index having all fields configured as STORE.YES and INDEX.ANALYZED ---a

Re: fields : stored and indexed

2011-02-17 Thread Ian Lea
http://lucene.apache.org/java/3_0_3/fileformats.html will tell you all you need to know about what is stored where and how. In general, the speed of searching, i.e. finding matching docs, will not be affected by the number of stored fields, but retrieving data from lots of stored fields will certainl

Re: Boost value is always 1

2011-02-17 Thread Ian Lea
Try different boost values with larger gaps between low and high. If that doesn't help, post a tiny but complete self-contained example that demonstrates the problem. And you should always say what version of Lucene you are using. -- Ian. On Thu, Feb 17, 2011 at 7:56 AM, Akos Tajti wrote: >

fields : stored and indexed

2011-02-17 Thread suman.holani
Hello, I am a little confused about the stored and indexed parts of Lucene: how does it actually store the indexed fields and the stored fields? Is it that for every field indexed, all the stored fields are added? I mean, do we create different indexes for every indexed field, replicating the stored fields in each of th