Single filter instance with different searchers

2010-11-03 Thread Samarendra Pratap
Hi. We have a large index (~28 GB) which is distributed in three different directories, each representing a country. Each of these country-wise indexes is further divided, on the basis of last update date, into 21 smaller indexes. This index is updated once a day. A user can search into any

Re: Single filter instance with different searchers

2010-11-08 Thread Samarendra Pratap
control including a restrictive clause in the query > to do the same thing your filter is doing. Or construct the filter new > for comparison If the numbers continue to be the same, I clearly > don't understand something! > > Best > Erick > > On Wed, Nov 3, 2010

Re: Single filter instance with different searchers

2010-11-08 Thread Samarendra Pratap
** > where lid: ### is the Lucene doc ID returned in scoreDocs > *** > > dumping ram1 > lid: 0, content: common 0.11100571422470962 > lid: 1, content: common 0.31555863707233567 > dumping ram2 > lid: 0, content: common 0.

Re: Single filter instance with different searchers

2010-11-09 Thread Samarendra Pratap
That said, I'd #guess# that you'll be OK because I'd #guess# that > filters are maintained on a per-reader basis and the results > are synthesized when combined in a MultiSearcher. > > But that's all a guess > > Best > Erick > > On Tue, Nov 9, 2010 a
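Erick's per-reader guess matches how Lucene's filter caching behaves in practice: cached doc-id sets are keyed on the reader instance, so each sub-reader searched under a MultiSearcher gets its own cached bits. Below is a minimal sketch of that caching pattern in plain Java, with no Lucene dependency; the class and method names are mine, and a `BitSet` stands in for a Lucene DocIdSet.

```java
import java.util.BitSet;
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

public class PerReaderFilterCache {
    // One cached BitSet per reader instance; weakly held so entries
    // disappear once a closed reader becomes unreachable.
    private final Map<Object, BitSet> cache =
        Collections.synchronizedMap(new WeakHashMap<>());

    BitSet getDocIdSet(Object reader, Function<Object, BitSet> compute) {
        // Compute the filter bits only on the first call for this reader.
        return cache.computeIfAbsent(reader, compute);
    }

    public static void main(String[] args) {
        PerReaderFilterCache filter = new PerReaderFilterCache();
        Object reader = new Object(); // stand-in for an IndexReader
        int[] calls = {0};
        Function<Object, BitSet> compute = r -> { calls[0]++; return new BitSet(8); };
        filter.getDocIdSet(reader, compute);
        filter.getDocIdSet(reader, compute); // second call served from cache
        if (calls[0] != 1) throw new AssertionError("computed " + calls[0] + " times");
        System.out.println("filter computed once, cached per reader thereafter");
    }
}
```

The weak keys matter: once a reader is closed and dropped, its cached bits can be garbage collected rather than leaking.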

Re: asking about index verification tools

2010-11-16 Thread Samarendra Pratap
It is not guaranteed that every term will be indexed. There is a limit on the maximum number of terms per field (as of Lucene 3.0, and maybe earlier too). Check out this http://lucene.apache.org/java/3_0_2/api/all/org/apache/lucene/index/IndexWriter.html#setMaxFieldLength(int) On Tue, Nov 16, 2010 at
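The setter linked above caps how many tokens of a field actually get indexed; anything past the cap is silently dropped and can never match a query. A self-contained sketch of that truncation effect in plain Java (no Lucene; `indexTerms` is a hypothetical stand-in for analysis plus indexing):

```java
import java.util.HashSet;
import java.util.Set;

public class MaxFieldLength {
    // Index only the first maxFieldLength tokens of the text,
    // mimicking what IndexWriter.setMaxFieldLength caused.
    static Set<String> indexTerms(String text, int maxFieldLength) {
        Set<String> terms = new HashSet<>();
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int i = 0; i < Math.min(tokens.length, maxFieldLength); i++) {
            terms.add(tokens[i]);
        }
        return terms;
    }

    public static void main(String[] args) {
        String doc = "alpha beta gamma delta epsilon";
        Set<String> terms = indexTerms(doc, 3); // cap at 3 tokens
        if (!terms.contains("gamma")) throw new AssertionError(terms);
        if (terms.contains("delta")) // past the cap: never searchable
            throw new AssertionError(terms);
        System.out.println(terms);
    }
}
```

This is why a term occurring only late in a very long document may "verify" as missing from the index even though indexing reported no error.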

Sharding Techniques

2011-05-09 Thread Samarendra Pratap
Hi list, We have an index directory of 30 GB which is divided into 3 subdirectories (idx1, idx2, idx3), which are again divided into 21 sub-subdirectories (idx1-1, idx1-2, …, idx2-1, …, idx3-1, …, idx3-21). We are running with Java 1.6, Lucene 2.9 (going to upgrade to 3.1 very soon), linu
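A setup like this, with queries fanned out over 63 subindexes, ultimately needs a merge step that combines each shard's top hits into one globally ranked list. Below is a minimal sketch of that score-ordered merge, assuming per-shard results arrive as (docId, score) pairs; the names are illustrative, not from the thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class ShardMerge {
    record Hit(String docId, double score) {}

    // Merge per-shard top hits into a single global top-k, highest score first.
    static List<Hit> mergeTopK(List<List<Hit>> shardHits, int k) {
        PriorityQueue<Hit> heap = new PriorityQueue<>(
            Comparator.<Hit>comparingDouble(Hit::score).reversed()); // max-heap
        for (List<Hit> hits : shardHits) heap.addAll(hits);
        List<Hit> out = new ArrayList<>();
        for (int i = 0; i < k && !heap.isEmpty(); i++) out.add(heap.poll());
        return out;
    }

    public static void main(String[] args) {
        List<Hit> shard1 = List.of(new Hit("a", 0.9), new Hit("b", 0.4));
        List<Hit> shard2 = List.of(new Hit("c", 0.7));
        List<Hit> top2 = mergeTopK(List.of(shard1, shard2), 2);
        if (!top2.get(0).docId().equals("a") || !top2.get(1).docId().equals("c"))
            throw new AssertionError(top2);
        System.out.println(top2);
    }
}
```

Note this assumes scores from different shards are comparable; with per-shard IDF that is only approximately true, which is one of the classic caveats of sharded scoring.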

Re: Sharding Techniques

2011-05-09 Thread Samarendra Pratap
gination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr > looks well worth a read. > > > -- > Ian. > > On Mon, May 9, 2011 at 12:56 PM, Samarendra Pratap > wrote: > > Hi list, > > We have an index directory of 30 GB which is divided into

Re: Sharding Techniques

2011-05-10 Thread Samarendra Pratap
end of > file. > > Regards > Ganesh > > > > - Original Message - > From: "Samarendra Pratap" > To: > Sent: Monday, May 09, 2011 5:26 PM > Subject: Sharding Techniques > > > > Hi list, > > We have an index directory of 30 GB whic

Re: Sharding Techniques

2011-05-10 Thread Samarendra Pratap
Thanks to Johannes - I am looking into Katta; it seems promising. To Toke - great explanation, that's what I was looking for. I'll come back and share my experience. Thank you very much. On Tue, May 10, 2011 at 1:31 PM, Toke Eskildsen wrote: > On Mon, 2011-05-09 at 13:56 +020

Re: Sharding Techniques

2011-05-10 Thread Samarendra Pratap
Hi Mike, *"I think the usual approach is to create multiple mirrored copies (slaves) rather than sharding"* This is where my eyes got stuck. We do have mirrors, and in fact a good number of them. 6 servers are being used for serving regular queries (2 are for specific queries that do take time) and e

Re: Sharding Techniques

2011-05-11 Thread Samarendra Pratap
Hi Tom, the more I am getting responses in this thread, the more I feel that our application needs optimization. 350 GB and less than 2 seconds!!! That's much more than my expectation :-) (in the current scenario). *Characteristics of slow queries:* there are a few reasons for greater search time

Re: Sharding Techniques

2011-05-13 Thread Samarendra Pratap
Hi Tom, Thanks for pointing me to something important (phrase queries) which I wasn't thinking of. We are using synonyms which get expanded at run time. I'll have to give it a thought. We are not using synonyms at indexing time due to the lack of flexibility in changing the list. We are not using
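Query-time synonym expansion of the kind described here turns each matching token into an OR-group, which is also why it interacts badly with phrase queries: every synonym multiplies the number of phrase variants to score. A hedged sketch of the expansion step itself (the helper name and query syntax are illustrative):

```java
import java.util.List;
import java.util.Map;

public class SynonymExpand {
    // Expand each query token into an OR-group using a synonym map.
    static String expand(String query, Map<String, List<String>> synonyms) {
        StringBuilder sb = new StringBuilder();
        for (String tok : query.split("\\s+")) {
            if (sb.length() > 0) sb.append(' ');
            List<String> syns = synonyms.get(tok);
            if (syns == null || syns.isEmpty()) {
                sb.append(tok); // no synonyms: pass the token through
            } else {
                sb.append('(').append(tok);
                for (String s : syns) sb.append(" OR ").append(s);
                sb.append(')');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> syn = Map.of("tv", List.of("television"));
        String q = expand("sony tv", syn);
        if (!q.equals("sony (tv OR television)")) throw new AssertionError(q);
        System.out.println(q);
    }
}
```

Doing this at query time keeps the synonym list easy to change, at the cost of larger queries; index-time expansion trades that flexibility for cheaper queries.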

Re: Rewriting an index without losing 'hidden' data

2011-05-17 Thread Samarendra Pratap
Hi, I know it is too late to answer a question (sorry Chris) but I thought it could be useful to share things (even late). I was just going through the mails and found that we'd done this a few months back. *Objective: to add a new field to an existing index without re-writing the whole index.* We

Re: about analyzer for searching location

2010-04-16 Thread Samarendra Pratap
Hi. I don't think you need a different analyzer. Read about PhraseQuery. If you are using the parse() method of QueryParser, enclose the searched string in extra double quotes, which must obviously be escaped. Quer
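The quoting advice above can be sketched as a tiny helper: wrapping the location string in (escaped) double quotes is what makes a query parser build a phrase query for "united states" instead of ORing the two terms. The helper name is mine, not a Lucene API:

```java
public class PhraseQuoting {
    // Wrap user input in double quotes so a query parser treats it as a phrase.
    static String asPhrase(String userInput) {
        // Escape any embedded quotes first, then wrap the whole string.
        return "\"" + userInput.replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        String q = asPhrase("united states");
        if (!q.equals("\"united states\"")) throw new AssertionError(q);
        System.out.println(q); // the string you would hand to QueryParser.parse()
    }
}
```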

Re: about analyzer for searching location

2010-04-19 Thread Samarendra Pratap
rstanding is, united and states are separately stored in index, but > not as "united states". So, if I build a query like Query q = > qp.parse("\"united states\""); It would not return any result. Am I right? > > Ian > > ---

Reopening a Searcher for each request

2010-04-22 Thread Samarendra Pratap
Greetings to all. I have read in many places that we should not open a Searcher for each request, for the sake of performance, but I have always wondered whether it is actually the Searcher or the Reader? I have a group of indexes amounting to 23 GB which actually consists of different index directo

Re: Reopening a Searcher for each request

2010-04-22 Thread Samarendra Pratap
; > The Searchers do very little on construction so re-creating per query > should be OK. > > Mike > > On Thu, Apr 22, 2010 at 6:38 AM, Samarendra Pratap > wrote: > > Greetings to all. > > I have read at so many places that we should not open a Searcher for > each &g

Re: Reopening a Searcher for each request

2010-04-24 Thread Samarendra Pratap
earcher.close(); > > > indexSearcher = new IndexSearcher(newIndexReader); >} > } > return indexSearcher; > } catch (CorruptIndexException e) { > log.error(e.getMessage(),e); > return null; > } catch (IOException e) { > log.error(e.getMessage(),e); >
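The snippet above is reopen-on-change logic, and the pattern this thread converges on is: keep one long-lived reader (the expensive object), reopen it only when the index has actually changed, and create a cheap new searcher per request. A self-contained mock of that lifecycle, with MockReader/MockSearcher standing in for Lucene's classes and a plain version number standing in for the index version check:

```java
public class SearcherManagerSketch {
    // Minimal stand-ins for IndexReader/IndexSearcher to show the lifecycle.
    static class MockReader {
        final long version;
        boolean closed;
        MockReader(long version) { this.version = version; }
        void close() { closed = true; }
    }
    static class MockSearcher {
        final MockReader reader;
        MockSearcher(MockReader r) { reader = r; }
    }

    private MockReader current;

    SearcherManagerSketch(MockReader initial) { current = initial; }

    // Reopen the reader only when the on-disk version has moved on;
    // a Searcher is cheap, so a fresh one per request is fine.
    synchronized MockSearcher acquire(long latestIndexVersion) {
        if (latestIndexVersion != current.version) {
            MockReader old = current;
            current = new MockReader(latestIndexVersion); // "reopen"
            old.close();
        }
        return new MockSearcher(current);
    }

    public static void main(String[] args) {
        SearcherManagerSketch mgr = new SearcherManagerSketch(new MockReader(1));
        MockSearcher s1 = mgr.acquire(1);
        MockSearcher s2 = mgr.acquire(1); // index unchanged: same reader reused
        if (s1.reader != s2.reader) throw new AssertionError();
        MockSearcher s3 = mgr.acquire(2); // index changed: reader reopened
        if (s3.reader == s1.reader || !s1.reader.closed) throw new AssertionError();
        System.out.println("reader reused until index version changed");
    }
}
```

One caveat this mock glosses over: real code must not close the old reader while searches are still in flight on it; Lucene addresses that with reference counting rather than the immediate close() shown here.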

Right memory for search application

2010-04-27 Thread Samarendra Pratap
Hi. I am looking for some guidance on the right memory options for my search server application. How much memory should a Lucene-based application be given? Until a few days ago I was running my search server on Java 1.4 with the memory option "-Xmx3600m", which was running quite fine. After upgrading

Re: Right memory for search application

2010-04-27 Thread Samarendra Pratap
ltiple ones open, or > failing to close old ones, will use more memory. Does memory usage > grow then stabilize or keep on growing? > > A memory profiler/heap dump could tell you what is really using all the > space. > > > -- > Ian. > > On Tue, Apr 27, 2010 at 1:51

Re: Right memory for search application

2010-04-27 Thread Samarendra Pratap
n, and sorts with far less > space than Strings. > > Also you get something called a date facet, which lets you bucketize > facet searches by time block. > > On Tue, Apr 27, 2010 at 1:02 PM, Toke Eskildsen > wrote: > > Samarendra Pratap [samarz...@gmail.com] wrote: >

Re: Right memory for search application

2010-04-28 Thread Samarendra Pratap
> And you can extend this ad nauseum. For instance, you could use 6 > fields, yy, mm, dd, HH, MM, SS and have a very small number of > unique values in each using really tiny amounts of memory to sort down > to the second in this case. > > Best > Erick > > On Wed, Apr 28
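Erick's suggestion works because sort caches grow with the number of unique values per field: six tiny fields (yy, mm, dd, HH, MM, SS) each have at most a handful of distinct values, versus one unique-per-document timestamp string. A sketch of sorting by those components as a composite key (plain Java; the record and comparator are illustrative, not a Lucene API):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class DateFieldSort {
    // Six low-cardinality components instead of one unique-per-doc string.
    record DateParts(int yy, int mm, int dd, int hh, int mi, int ss) {}

    // Compare component by component, most significant first.
    static final Comparator<DateParts> BY_PARTS = Comparator
        .<DateParts>comparingInt(DateParts::yy)
        .thenComparingInt(DateParts::mm)
        .thenComparingInt(DateParts::dd)
        .thenComparingInt(DateParts::hh)
        .thenComparingInt(DateParts::mi)
        .thenComparingInt(DateParts::ss);

    public static void main(String[] args) {
        List<DateParts> docs = new ArrayList<>(List.of(
            new DateParts(10, 4, 28, 9, 30, 0),
            new DateParts(10, 4, 27, 23, 59, 59)));
        docs.sort(BY_PARTS);
        if (docs.get(0).dd() != 27) throw new AssertionError(docs);
        System.out.println(docs);
    }
}
```

Each field needs only as many cached entries as it has distinct values (at most 60 for seconds, 12 for months, and so on), which is the memory saving being described.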

Grouping results on the basis of a field

2005-11-21 Thread Samarendra Pratap
Hi, I am using Lucene 1.4.3. The basic functionality of the search is simple: put in the keyword “java” and it will display all the books having the java keyword. Now I have to add a feature which also shows the names of the top authors (let's say the top 5 authors) with the number of books,
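The "top authors with book counts" feature described here is facet counting: tally the author field across the matching documents and keep the largest counts. A self-contained sketch of that grouping step (plain Java, not the Lucene 1.4.3 API; names are illustrative):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TopAuthors {
    // Count hits per author and return the top-n authors with their counts.
    static List<Map.Entry<String, Integer>> topAuthors(List<String> hitAuthors, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String a : hitAuthors) counts.merge(a, 1, Integer::sum);
        List<Map.Entry<String, Integer>> out = new ArrayList<>(counts.entrySet());
        out.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return out.subList(0, Math.min(n, out.size()));
    }

    public static void main(String[] args) {
        // hitAuthors would come from the author field of each search hit.
        List<String> hits = List.of("Bloch", "Gosling", "Bloch", "Eckel", "Bloch");
        List<Map.Entry<String, Integer>> top = topAuthors(hits, 2);
        if (!top.get(0).getKey().equals("Bloch") || top.get(0).getValue() != 3)
            throw new AssertionError(top);
        System.out.println(top);
    }
}
```

For large result sets, reading a stored author field per hit is slow; the usual trick (in any Lucene version) is to keep the author values in an in-memory array indexed by document number and count from that.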