Re: CompressingStoredFieldsFormat doesn't show improvement

2013-01-30 Thread arun k
Hi, Please find the snapshots here. http://picpaste.com/Lucene3.0.2-G00Z5FfX.png http://picpaste.com/Lucene4.1-LsxpcQk0.png Arun On Wed, Jan 30, 2013 at 5:30 PM, Uwe Schindler wrote: > Hi, > > > > there is nothing attached to your mail; maybe the mailing list software > removed it. Can you pl

Re: How to find related words ?

2013-01-30 Thread wgggfiy
Hmm, it seems nice, but I'm puzzled by you and Andrew Gilmartin above: what's the difference between your suggestions? I'm also reading the reference about how to *extract relevant terms from the top document(s)*. Anyway, thanks. - -- Email: wuqiu.m...@qq.com --

Re: List of files that Lucene 4.0 generates during indexing

2013-01-30 Thread saisantoshi
The following files are originally created files (upon an initial indexing): _0.fdt _0.fdx _0.fnm _0.si _0_Lucene40_0.frq _0_Lucene40_0.prx _0_Lucene40_0.tim _0_Lucene40_0.tip _0_nrm.cfe _0_nrm.cfs index.v0008
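As a rough aid for reading the listing above, the sketch below maps each file extension to its role. The descriptions are summarized from the Lucene 4.0 file-format documentation; the `describe` helper and the `DESCRIPTIONS` table are hypothetical names introduced only for illustration, and any extension not in the table (such as `index.v0008`) is reported as unknown rather than guessed.

```python
import os

# Roles summarized from the Lucene 4.0 file-format documentation.
# This table is an illustrative assumption, not an exhaustive list.
DESCRIPTIONS = {
    ".fdt": "stored field data",
    ".fdx": "stored field index",
    ".fnm": "field infos",
    ".si": "segment info",
    ".frq": "term frequencies (postings)",
    ".prx": "term positions",
    ".tim": "term dictionary",
    ".tip": "term index",
    ".cfs": "compound file data",
    ".cfe": "compound file entry table",
}


def describe(filename):
    """Return a short description for an index file, based on its extension."""
    ext = os.path.splitext(filename)[1]
    return DESCRIPTIONS.get(ext, "unknown")


if __name__ == "__main__":
    files = [
        "_0.fdt", "_0.fdx", "_0.fnm", "_0.si",
        "_0_Lucene40_0.frq", "_0_Lucene40_0.prx",
        "_0_Lucene40_0.tim", "_0_Lucene40_0.tip",
        "_0_nrm.cfe", "_0_nrm.cfs", "index.v0008",
    ]
    for name in files:
        print(f"{name:20s} {describe(name)}")
```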

Re: Pulsing40PostingsFormat in lucene 4.1

2013-01-30 Thread Michael McCandless
On Tue, Jan 29, 2013 at 7:39 PM, Sean Bridges wrote: > How do we avoid similar situations in the future? Is Pulsing41PostingsFormat > going to be maintained in future versions of Lucene? What are the > safe PostingsFormat/Codecs > to use? Every PostingsFormat/Codec is @deprecated or @experimenta

Re: Questions about FuzzyQuery in Lucene 4.x

2013-01-30 Thread Michael McCandless
On Tue, Jan 29, 2013 at 2:43 PM, George Kelvin wrote: > Hi Jack, > > The problematic query is "scar"+"wads". > > There are several (more than 10) documents in the data with the content > "star wars", so I think that query should be able to find all these > documents. > > I was trying to provide a
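To make the "scar"/"wads" example concrete, here is a minimal sketch of plain Levenshtein edit distance. It is not Lucene's implementation (FuzzyQuery in 4.x is automaton-based), and the `levenshtein` function is a hypothetical helper written for this illustration. It only shows that each query term is a single substitution away from the indexed term, which is within the maximum of 2 edits that FuzzyQuery supports in Lucene 4.x.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    for query_term, indexed_term in [("scar", "star"), ("wads", "wars")]:
        print(query_term, indexed_term, levenshtein(query_term, indexed_term))
```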

Re: How to find related words ?

2013-01-30 Thread Jack Krupansky
Take a look at MoreLikeThisQuery: http://lucene.apache.org/core/4_1_0/queries/org/apache/lucene/queries/mlt/MoreLikeThisQuery.html And MoreLikeThis itself: http://lucene.apache.org/core/4_1_0/queries/org/apache/lucene/queries/mlt/MoreLikeThis.html So, the idea is search for documents using your

Re: How to find related words ?

2013-01-30 Thread Andrew Gilmartin
wgggfiy wrote: In short, you put in a term like "Lucene", and the ideal output would be "solr", "index", "full-text search", and so on. How can I find such related words? Thanks. My idea is to use FuzzyQuery or MoreLikeThis, or to calculate a score for all the terms and then sort. Any ideas? T

How to find related words ?

2013-01-30 Thread wgggfiy
In short, you put in a term like "Lucene", and the ideal output would be "solr", "index", "full-text search", and so on. How can I find such related words? Thanks. My idea is to use FuzzyQuery or MoreLikeThis, or to calculate a score for all the terms and then sort. Any ideas? - -

Re: ANTLR and Custom Query Syntax/Parser

2013-01-30 Thread Carsten Schnober
On 29.01.2013 00:24, Trejkaz wrote: > On Tue, Jan 29, 2013 at 3:42 AM, Andrew Gilmartin > wrote: >> When I first started using Lucene, Lucene's Query classes were not suitable >> for use with the Visitor pattern and so I created my own query class >> equivalents and other more specialized ones.

Re: IndexWriter deleteDocuments

2013-01-30 Thread Bernd Müller
sorry, user error... please discard the question ;-) 2013/1/30 Bernd Müller : > Hello, > > In my index, I have documents identified by a field with their unique > identifier. Now, I tried to delete documents having such a unique > identifier using deleteDocuments(Term t). If I test the IndexWriter

Re: IndexWriter deleteDocuments

2013-01-30 Thread Michael McCandless
Documents are still only "marked" as deleted. expungeDeletes was renamed to forceMergeDeletes, but it's a horribly, horribly costly operation. Normal merging will collapse the deletes anyway, and the default merge policy favors segments with more deletions ... so you shouldn't have to force merge
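The behavior described above can be pictured with a toy model. This is not Lucene code: `ToySegment` and `merge` are hypothetical names, and the sketch assumes only what the message states, namely that a delete marks a document while leaving it in the segment, and that a later merge writes out only the live documents.

```python
class ToySegment:
    """Hypothetical stand-in for an index segment (not a Lucene class)."""

    def __init__(self, docs):
        self.docs = dict(docs)  # doc id -> content, never edited in place
        self.deleted = set()    # ids that are only marked as deleted

    def delete(self, doc_id):
        # Deleting records a mark; the document data stays in the segment.
        if doc_id in self.docs:
            self.deleted.add(doc_id)

    def has_deletions(self):
        return bool(self.deleted)

    def live_docs(self):
        return {k: v for k, v in self.docs.items() if k not in self.deleted}


def merge(segments):
    """Build a new segment from live documents only, dropping marked ones."""
    merged = {}
    for segment in segments:
        merged.update(segment.live_docs())
    return ToySegment(merged)


if __name__ == "__main__":
    seg = ToySegment({"id-1": "first doc", "id-2": "second doc"})
    seg.delete("id-1")
    print(seg.has_deletions(), len(seg.docs))        # still marked, still stored
    merged = merge([seg])
    print(merged.has_deletions(), len(merged.docs))  # mark gone after the merge
```

In this model, a hasDeletions-style check stays true after a commit because the mark persists until some merge rewrites the segment.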

Re: IndexWriter deleteDocuments

2013-01-30 Thread wgggfiy
It seems that a doc is not really deleted until the next index merge or something. I'm not sure. - -- Email: wuqiu.m...@qq.com -- -- View this message in context: http://lucene.472066.n3.nabble.com/IndexWriter-deleteDocuments-tp4037365p4037377.html Se

IndexWriter deleteDocuments

2013-01-30 Thread Bernd Müller
Hello, In my index, I have documents identified by a field with their unique identifier. Now, I tried to delete documents having such a unique identifier using deleteDocuments(Term t). If I test the IndexWriter for deletions with hasDeletions(), it tells me true. Even if I commit and close the ind

RE: CompressingStoredFieldsFormat doesn't show improvement

2013-01-30 Thread Uwe Schindler
Hi, there is nothing attached to your mail; maybe the mailing list software removed it. Can you place it somewhere on the web (e.g. pastebin,…)? Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de

Re: CompressingStoredFieldsFormat doesn't show improvement

2013-01-30 Thread arun k
Adrien, Please find the attached profiler report. On Wed, Jan 30, 2013 at 3:35 PM, Adrien Grand wrote: > On Wed, Jan 30, 2013 at 8:08 AM, arun k wrote: > > Adrien, > > > > I have created an index of size 370M of 1 million docs of 40 fields of 40 > > chars and did the profiling. > > I see that

[OT] San Fran. Lucene/Solr Hack Night

2013-01-30 Thread Grant Ingersoll
If you are in the San Francisco area next Wednesday, Feb. 06, LucidWorks and I will be hosting a Lucene/Solr hack night. To reserve a spot or learn more, see http://www.meetup.com/SFBay-Lucene-Solr-Meetup/ Bring your laptop, your code, etc. and we'll hack on Lucene/Solr for a few hours. Cheers,

Re: CompressingStoredFieldsFormat doesn't show improvement

2013-01-30 Thread Adrien Grand
On Wed, Jan 30, 2013 at 8:08 AM, arun k wrote: > Adrien, > > I have created an index of size 370M of 1 million docs of 40 fields of 40 > chars and did the profiling. > I see that the indexing and in particular the addDocument & > ConcurrentMergeScheduler in 4.1 takes double the time compared to 3.

Re: FacetRequest include residue

2013-01-30 Thread Nicola Buso
Hi Shai, your solution sounds good to me: an accumulator that can add some "exception" to the counting. Nicola. On Wed, 2013-01-30 at 08:13 +0200, Shai Erera wrote: > Hi Nicola, > > > There might be a way to do what you want, with some coding on your > part. If you're interested in counting th

Re: Migration to Lucene 4.1

2013-01-30 Thread Ian Lea
Have you read the changes and migration docs that come with 4.1? You may also need to look at 3.[123456] javadocs to see deprecations and alternatives for stuff that was present in 3.0 but gone in 4.1. -- Ian. On Tue, Jan 29, 2013 at 7:30 PM, Paul Sitowitz wrote: > Hello, > > I currently have