Re: external file stored field codec

2013-10-11 Thread Michael Sokolov
On 10/11/2013 03:19 PM, Michael Sokolov wrote: On 10/11/2013 03:04 PM, Adrien Grand wrote: On Fri, Oct 11, 2013 at 7:03 PM, Michael Sokolov wrote: I've been running some tests comparing storing large fields (documents, say 100K .. 10M) as files vs. storing them in Lucene as stored fields. In

Re: external file stored field codec

2013-10-11 Thread Michael Sokolov
On 10/11/2013 03:04 PM, Adrien Grand wrote: On Fri, Oct 11, 2013 at 7:03 PM, Michael Sokolov wrote: I've been running some tests comparing storing large fields (documents, say 100K .. 10M) as files vs. storing them in Lucene as stored fields. Initial results seem to indicate storing them exter

Re: external file stored field codec

2013-10-11 Thread Adrien Grand
On Fri, Oct 11, 2013 at 7:03 PM, Michael Sokolov wrote: > I've been running some tests comparing storing large fields (documents, say > 100K .. 10M) as files vs. storing them in Lucene as stored fields. Initial > results seem to indicate storing them externally is a win (at least for > binary doc

external file stored field codec

2013-10-11 Thread Michael Sokolov
I've been running some tests comparing storing large fields (documents, say 100K .. 10M) as files vs. storing them in Lucene as stored fields. Initial results seem to indicate storing them externally is a win (at least for binary docs which don't compress, and presumably we can compress the ex

Re: Performance/scoring impacts with multiple occurrences of a field

2013-10-11 Thread Ian Lea
With multiple fields of the same name vs a single field I doubt you'd be able to tell the difference in performance or matching or scoring in normal use. There may be some matching/ranking effect if you are looking at, say, span queries across the multiple fields. Try it out and see what happens.

Re: Optimizing Filters

2013-10-11 Thread Ian Lea
Are you going to be caching and reusing the filters e.g. by CachingWrapperFilter? The main benefit of filters is in reuse. It takes time to build them in the first place, likely roughly equivalent to running the underlying query although with variations as you describe. Or are you saying that qu
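
To illustrate the point about reuse, here is a minimal sketch of a cached filter. It is plain Python, not Lucene's API: the class name `CachingFilter`, the `segment_key` argument, and the `builds` counter are all hypothetical and exist only for this illustration. The idea is that the set of matching document IDs is computed once per index segment and then served from a cache on later searches.

```python
class CachingFilter:
    """Hypothetical illustration: remembers which doc IDs pass a predicate."""

    def __init__(self, predicate):
        self._predicate = predicate
        self._cache = {}   # segment key -> frozenset of matching doc IDs
        self.builds = 0    # how many times the filter was actually computed

    def doc_ids(self, segment_key, docs):
        # Build the doc ID set only on the first request for this segment.
        if segment_key not in self._cache:
            self.builds += 1
            self._cache[segment_key] = frozenset(
                i for i, doc in enumerate(docs) if self._predicate(doc)
            )
        return self._cache[segment_key]


docs = [{"lang": "en"}, {"lang": "fr"}, {"lang": "en"}]
english_only = CachingFilter(lambda doc: doc["lang"] == "en")

first = english_only.doc_ids("segment-0", docs)   # pays the build cost
second = english_only.doc_ids("segment-0", docs)  # served from the cache
```

If a filter is built once and then thrown away, the cache buys nothing, which is why the cost of building it only pays off when the same filter is applied to many searches.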

Finding which document is deleted when someone Manually delete the Index

2013-10-11 Thread VIGNESH S
Hi, if someone removes some of the segments in my Lucene index from the file system, how can I find out which documents were deleted? -- Thanks and Regards, Vignesh Srinivasan

Re: Multiple Keywords - Regular and Any Order Search

2013-10-11 Thread Ian Lea
Looks like you can achieve most of what you want by using AND rather than OR. I think that all the should/should not examples you give will work if you use AND on your content field. For ordering, I suggest you look at SpanNearQuery. That can consider order and slop, the distance between the sea
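
A rough sketch of what order and slop mean for a proximity match, in plain Python rather than Lucene's API. The function `near_match` and its parameters are hypothetical, and the slop rule used here (number of unmatched positions inside the matched span) is a simplification that should be checked against the SpanNearQuery documentation before relying on it.

```python
from itertools import product


def near_match(tokens, terms, slop, in_order):
    """Return True if every term occurs within `slop` extra positions of the others."""
    if not terms:
        return False
    # Positions at which each term occurs in the token stream.
    positions = [[i for i, tok in enumerate(tokens) if tok == term] for term in terms]
    for combo in product(*positions):
        if len(set(combo)) != len(combo):
            continue  # the same position cannot serve two terms
        if in_order and list(combo) != sorted(combo):
            continue  # terms must appear in the order given
        # Positions inside the span that are not occupied by a matched term.
        gap = (max(combo) - min(combo) + 1) - len(combo)
        if gap <= slop:
            return True
    return False


tokens = "the quick brown fox jumps".split()
ordered_hit = near_match(tokens, ["quick", "fox"], slop=1, in_order=True)
too_tight = near_match(tokens, ["quick", "fox"], slop=0, in_order=True)
wrong_order = near_match(tokens, ["fox", "quick"], slop=1, in_order=True)
any_order = near_match(tokens, ["fox", "quick"], slop=1, in_order=False)
```

Here "quick" and "fox" are separated by one word, so they match with a slop of 1 but not 0, and the reversed term order matches only when ordering is not required.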