Re: Field.Store.YES vs Field.Store.NO

2013-01-10 Thread Denis Bazhenov
That's right. For example, a file is uploaded with the keyword "picture" and you search for "keyword:picture keyword:music" (music OR picture). This file will be returned by the search, but you will not be able to tell whether it's music or picture. The Lucene index itself is an inverted index, so all fields

Re: Field.Store.YES vs Field.Store.NO

2013-01-10 Thread Igal @ getRailo.org
Say you index information about a book with the title "Lucene in Action", along with an ID and other information. Searching for "Lucene" will find the book and give you the book's ID. Now if you used Store.YES, then Lucene can also give you the full title, i.e. "Lucene in Action", but if you

Re: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-10 Thread saisantoshi
Thanks for all the responses. Apart from the API changes, are there any major functionality changes from 2.4.0 to 4.x? I know we need to update our code to the latest API, but I'm curious whether there are any functional changes we should be aware of so that we can test more thoroughly. Thanks, Sai.

Re: Field.Store.YES vs Field.Store.NO

2013-01-10 Thread saisantoshi
Not sure what the following means: >> using Field.Store.NO the field itself is definitely searchable. You will not be able to retrieve the field value itself. For example, if we have a file that we upload using some keywords and if the keyword (is of type Field.Store.NO but is analyzed)

Re: getting the token position

2013-01-10 Thread Igal @ getRailo.org
Hi Denis, thanks for your reply. OffsetAttribute gives the character position, whereas I was looking for the token position. I ended up adding the attached PositionAttribute/PositionAttributeImpl/PositionFilter. As it turned out, though, I didn't need that attribute, as there was an easier way

Re: Field.Store.YES vs Field.Store.NO

2013-01-10 Thread Denis Bazhenov
If you are using Field.Store.NO the field itself is definitely searchable. You will not be able to retrieve the field value itself, though. But the consequences of using Field.Store.YES depend on context. If you have a lot of documents in the index and the stored field is relatively large, this coul
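
The distinction discussed in this thread (searchable versus retrievable) can be illustrated with a toy model. This is only a sketch in plain Java and not Lucene's API: the class `StoreSketch` and its methods `add`, `search`, and `retrieve` are hypothetical names, and the boolean `store` flag merely stands in for Field.Store.YES / Field.Store.NO. It assumes whitespace tokenization and lowercasing in place of a real analyzer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model only, NOT Lucene's API. It mimics the split between the
// inverted index (term -> doc IDs) and the stored values (doc ID -> text).
class StoreSketch {
    private final Map<String, SortedSet<Integer>> inverted = new HashMap<>();
    private final Map<Integer, String> stored = new HashMap<>();
    private int nextDocId = 0;

    // Always index the terms; keep the original value only when store is true.
    int add(String text, boolean store) {
        int docId = nextDocId++;
        for (String term : text.toLowerCase().split("\\s+")) {
            inverted.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        if (store) {
            stored.put(docId, text);
        }
        return docId;
    }

    // Searching consults only the inverted index, so it works either way.
    SortedSet<Integer> search(String term) {
        return inverted.getOrDefault(term.toLowerCase(), new TreeSet<>());
    }

    // Retrieval consults only the stored values; null means "not stored".
    String retrieve(int docId) {
        return stored.get(docId);
    }
}
```

In this model, a document added with `store` set to false is still found by `search`, but `retrieve` returns null for it, which mirrors the behaviour described above for Field.Store.NO.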

Re: getting the token position

2013-01-10 Thread Denis Bazhenov
What you are looking for is OffsetAttribute. Also consider the possibility of using ShingleFilter with position increment > 1 and then filtering tokens containing "_" (underscore). This will be easier, I guess. On Jan 11, 2013, at 7:14 AM, Igal @ getRailo.org wrote: > hi all, > > how can I ge

Re: Field.Store.YES vs Field.Store.NO

2013-01-10 Thread Igal @ getRailo.org
I'm no expert, but my understanding is that it is searchable, but you cannot retrieve the information if, for example, you want to show excerpts etc. The index size will be smaller, of course. Igal On 1/10/2013 3:16 PM, saisantoshi wrote: I am new to lucene and am trying to understand what

Field.Store.YES vs Field.Store.NO

2013-01-10 Thread saisantoshi
I am new to Lucene and am trying to understand the impact on search of using Field.Store.NO vs Field.Store.YES. I know the former does not store the value in the index and the latter stores it in the index. Would that mean that a field that uses Field.Store.NO is not searchable? new Field

getting the token position

2013-01-10 Thread Igal @ getRailo.org
Hi all, how can I get the token's position from the TokenStream / Tokenizer / Analyzer? I know that there's a TokenPositionIncrement Attribute and a TokenPositionLength Attribute, but is there an easy way to get the token position or do I need to implement my own attribute by adding one of t
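
One common way to obtain an absolute token position is to keep a running sum of the position increments while consuming the stream. The sketch below shows only that arithmetic in plain Java: the class `PositionSketch` and the method `absolutePositions` are hypothetical, and an `int[]` stands in for the increments that, in Lucene, would be read from each token's PositionIncrementAttribute via `getPositionIncrement()`. It is not necessarily the "easier way" referred to elsewhere in this thread.

```java
// Sketch only: a plain array stands in for a token stream.
class PositionSketch {
    // Returns the absolute position of each token, given its position increment.
    static int[] absolutePositions(int[] increments) {
        int[] positions = new int[increments.length];
        int position = -1; // so that a first increment of 1 yields position 0
        for (int i = 0; i < increments.length; i++) {
            position += increments[i];
            positions[i] = position;
        }
        return positions;
    }
}
```

For example, the increments 1, 1, 2, 0, 1 give the positions 0, 1, 3, 3, 4: an increment of 2 leaves a gap (such as a removed stop word), and an increment of 0 stacks a token on the previous position (such as a synonym).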

StandardAnalyzer: Support for Japanese

2013-01-10 Thread saisantoshi
We are using StandardAnalyzer to index some Japanese keywords. It works fine so far, but I wanted to confirm whether StandardAnalyzer fully supports Japanese (I have read somewhere in the Lucene in Action book that StandardAnalyzer does support CJK). Just want to confirm if my understanding is corr

Re: recover corrupted index

2013-01-10 Thread Rafał Kuć
Hello! Just one thing - back up your index first, just in case. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > Hello! > Try using CheckIndex - > http://lucene.apache.org/core/old_versioned_docs/versions/3_0_0/api/all/org/apache/lucene/index/CheckIndex.html

Re: recover corrupted index

2013-01-10 Thread Rafał Kuć
Hello! Try using CheckIndex - http://lucene.apache.org/core/old_versioned_docs/versions/3_0_0/api/all/org/apache/lucene/index/CheckIndex.html -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > Hi, > I have an index, for which I am missing at least 1 file after

recover corrupted index

2013-01-10 Thread v . sevel
Hi, I have an index for which I am missing at least one file after hitting a disk-full situation. Is there any way I could bypass the error I get when trying to open the index, to salvage as many docs as I can from the other files? Thanks, vince java.io.FileNotFoundException: D:\_2c9kgw.cfs (T

Re: how much blocksize is set in lucene.

2013-01-10 Thread Ian Lea
Not a clue, and I don't recall ever seeing anything on this list about disk block sizes etc. General advice on disks for Lucene is: the faster the better. SSDs are reputedly excellent and recommended. Other than that I stick with my original advice: A good general principle is to start with the de

Re: Indexing your documents with Lucene!

2013-01-10 Thread Ian Lea
rm -rf works well for number 4. For the others use your favourite search engine with queries like "lucene tutorial" or "lucene getting started". Or start with these: http://lucene.apache.org/core/quickstart.html http://www.lucenetutorial.com/lucene-in-5-minutes.html Good luck. -- Ian. On

Re: CustomScoreQuery + Collector + Scoring

2013-01-10 Thread Yann-Erwan Perio
On Thu, Jan 10, 2013 at 11:22 AM, Uwe Schindler wrote: Hi Uwe, > The best way to do this is to wrap the standard Lucene > TopScoreDocCollector by your own collector (passing all > calls to the collector also down to the top-docs collector). > Then you don't have to take care of sorting the resul
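
The wrapping idea quoted above is a plain delegation pattern, sketched here in self-contained Java. Everything in it is hypothetical and simplified: `HitCollector`, `TopNCollector`, and `StatsCollector` are made-up names, and the `collect(int doc, float score)` signature is an assumption for illustration. Lucene's real Collector has a different `collect` signature and further methods that a wrapper would also need to pass down.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical minimal interface, NOT Lucene's Collector.
interface HitCollector {
    void collect(int doc, float score);
}

class Hit {
    final int doc;
    final float score;
    Hit(int doc, float score) { this.doc = doc; this.score = score; }
}

// Stand-in for a top-docs collector: keeps the n best-scoring hits.
class TopNCollector implements HitCollector {
    private final int n;
    // Min-heap on score, so the weakest retained hit sits at the head.
    private final PriorityQueue<Hit> heap =
        new PriorityQueue<>(Comparator.comparingDouble((Hit h) -> h.score));

    TopNCollector(int n) { this.n = n; }

    public void collect(int doc, float score) {
        heap.add(new Hit(doc, score));
        if (heap.size() > n) {
            heap.poll(); // drop the current weakest hit
        }
    }

    // Doc IDs of the retained hits, best score first.
    List<Integer> topDocs() {
        List<Hit> hits = new ArrayList<>(heap);
        hits.sort((a, b) -> Float.compare(b.score, a.score));
        List<Integer> docs = new ArrayList<>();
        for (Hit h : hits) {
            docs.add(h.doc);
        }
        return docs;
    }
}

// Wrapper: gathers statistics, then passes every call down to the delegate.
class StatsCollector implements HitCollector {
    private final HitCollector delegate;
    private int count;
    private double scoreSum;

    StatsCollector(HitCollector delegate) { this.delegate = delegate; }

    public void collect(int doc, float score) {
        count++;
        scoreSum += score;
        delegate.collect(doc, score);
    }

    int count() { return count; }
    double meanScore() { return count == 0 ? 0.0 : scoreSum / count; }
}
```

Because the wrapper forwards every hit unchanged, the wrapped collector still does the ranking, while the wrapper sees every hit (not only the top n) for its aggregation.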

RE: CustomScoreQuery + Collector + Scoring

2013-01-10 Thread Uwe Schindler
> I am using Lucene 4.0.0, trying to put together a CustomQuery and a > Collector, and have a problem with the calculation of scores. > > My context is as follows. I have a big BooleanQuery which works fine, but I > also want to calculate some statistics during the search (i.e. > perform aggregati

CustomScoreQuery + Collector + Scoring

2013-01-10 Thread Yann-Erwan Perio
Hello, I am using Lucene 4.0.0, trying to put together a CustomQuery and a Collector, and have a problem with the calculation of scores. My context is as follows. I have a big BooleanQuery which works fine, but I also want to calculate some statistics during the search (i.e. perform aggregation o