Re: Rewriting an index without losing 'hidden' data

2011-05-17 Thread Samarendra Pratap
Hi, I know it is too late to answer this question (sorry Chris), but I thought it could still be useful to share what we did. I was just going through the mails and found that we did this a few months back. *Objective: To add a new field to an existing index without re-writing the whole index.* We

Re: Sharding Techniques

2011-05-13 Thread Samarendra Pratap
Hi Tom, Thanks for pointing me to something important (phrase queries) which I wasn't thinking of. We are using synonyms which get expanded at run time. I'll have to give it some thought. We are not using synonyms at indexing time because that would make it hard to change the list. We are not using

Re: Sharding Techniques

2011-05-11 Thread Samarendra Pratap
Hi Tom, The more responses I get in this thread, the more I feel that our application needs optimization. 350 GB and less than 2 seconds! That's much more than I expected :-) (in the current scenario). *Characteristics of slow queries:* there are a few reasons for greater search time

Re: Sharding Techniques

2011-05-10 Thread Samarendra Pratap
Hi Mike, *"I think the usual approach is to create multiple mirrored copies (slaves) rather than sharding"* This is what caught my eye. We do have mirrors, and in fact a good number of them. 6 servers are being used for serving regular queries (2 are for specific queries that do take time) and e

Re: Sharding Techniques

2011-05-10 Thread Samarendra Pratap
Thanks. To Johannes: I am looking into Katta. It seems promising. To Toke: great explanation. That's what I was looking for. I'll come back and share my experience. Thank you very much. On Tue, May 10, 2011 at 1:31 PM, Toke Eskildsen wrote: > On Mon, 2011-05-09 at 13:56 +020

Re: Sharding Techniques

2011-05-10 Thread Samarendra Pratap
end of > file. > > Regards > Ganesh > > > > - Original Message - > From: "Samarendra Pratap" > To: > Sent: Monday, May 09, 2011 5:26 PM > Subject: Sharding Techniques > > > > Hi list, > > We have an index directory of 30 GB whic

Re: Sharding Techniques

2011-05-09 Thread Samarendra Pratap
gination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr > looks well worth a read. > > > -- > Ian. > > On Mon, May 9, 2011 at 12:56 PM, Samarendra Pratap > wrote: > > Hi list, > > We have an index directory of 30 GB which is divided into

Sharding Techniques

2011-05-09 Thread Samarendra Pratap
Hi list, We have an index directory of 30 GB which is divided into 3 subdirectories (idx1, idx2, idx3), which are in turn divided into 21 sub-subdirectories each (idx1-1, idx1-2, ..., idx2-1, ..., idx3-1, ..., idx3-21). We are running with Java 1.6, Lucene 2.9 (going to upgrade to 3.1 very soon), linu

Re: asking about index verification tools

2010-11-16 Thread Samarendra Pratap
It is not guaranteed that every term will be indexed. There is a limit on the maximum number of terms per field (as of Lucene 3.0, and maybe earlier too). Check this out: http://lucene.apache.org/java/3_0_2/api/all/org/apache/lucene/index/IndexWriter.html#setMaxFieldLength(int) On Tue, Nov 16, 2010 at

Re: Single filter instance with different searchers

2010-11-09 Thread Samarendra Pratap
That said, I'd #guess# that you'll be OK because I'd #guess# that > filters are maintained on a per-reader basis and the results > are synthesized when combined in a MultiSearcher. > > But that's all a guess > > Best > Erick > > On Tue, Nov 9, 2010 a

Re: Single filter instance with different searchers

2010-11-08 Thread Samarendra Pratap
** > where lid: ### is the Lucene doc ID returned in scoreDocs > *** > > dumping ram1 > lid: 0, content: common 0.11100571422470962 > lid: 1, content: common 0.31555863707233567 > dumping ram2 > lid: 0, content: common 0.

Re: Single filter instance with different searchers

2010-11-08 Thread Samarendra Pratap
control including a restrictive clause in the query > to do the same thing your filter is doing. Or construct the filter new > for comparison If the numbers continue to be the same, I clearly > don't understand something! > > Best > Erick > > On Wed, Nov 3, 2010

Single filter instance with different searchers

2010-11-03 Thread Samarendra Pratap
Hi. We have a large index (~28 GB) which is distributed across three different directories, each representing a country. Each of these country-wise indexes is further split, on the basis of last update date, into 21 smaller indexes. This index is updated once a day. A user can search into any

Re: Right memory for search application

2010-04-28 Thread Samarendra Pratap
> And you can extend this ad nauseum. For instance, you could use 6 > fields, yy, mm, dd, HH, MM, SS and have a very small number of > unique values in each using really tiny amounts of memory to sort down > to the second in this case. > > Best > Erick > > On Wed, Apr 28
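
The six-field idea quoted above could look roughly like the sketch below. This is not code from the thread: the class name `DateFieldSplitter`, the method `split`, and the zero-padded string encoding are assumptions made for illustration; only the field names yy, mm, dd, HH, MM and SS come from the message. It also uses `java.time`, which is newer than the Java versions discussed here.

```java
import java.time.LocalDateTime;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: splits one timestamp into six low-cardinality fields.
final class DateFieldSplitter {

    // Returns field name -> zero-padded value, in the order yy, mm, dd, HH, MM, SS.
    static Map<String, String> split(LocalDateTime t) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("yy", String.format(Locale.ROOT, "%04d", t.getYear()));
        fields.put("mm", String.format(Locale.ROOT, "%02d", t.getMonthValue()));  // at most 12 values
        fields.put("dd", String.format(Locale.ROOT, "%02d", t.getDayOfMonth()));  // at most 31 values
        fields.put("HH", String.format(Locale.ROOT, "%02d", t.getHour()));        // at most 24 values
        fields.put("MM", String.format(Locale.ROOT, "%02d", t.getMinute()));      // at most 60 values
        fields.put("SS", String.format(Locale.ROOT, "%02d", t.getSecond()));      // at most 60 values
        return fields;
    }

    public static void main(String[] args) {
        // prints {yy=2010, mm=04, dd=28, HH=13, MM=05, SS=09}
        System.out.println(split(LocalDateTime.of(2010, 4, 28, 13, 5, 9)));
    }
}
```

Each value would then be indexed as its own untokenized field, and a sort down to the second would list the six fields in order. Zero padding keeps string order equal to numeric order.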

Re: Right memory for search application

2010-04-27 Thread Samarendra Pratap
n, and sorts with far less > space than Strings. > > Also you get something called a date facet, which lets you bucketize > facet searches by time block. > > On Tue, Apr 27, 2010 at 1:02 PM, Toke Eskildsen > wrote: > > Samarendra Pratap [samarz...@gmail.com] wrote: >

Re: Right memory for search application

2010-04-27 Thread Samarendra Pratap
ltiple ones open, or > failing to close old ones, will use more memory. Does memory usage > grow then stabilize or keep on growing? > > A memory profiler/heap dump could tell you what is really using all the > space. > > > -- > Ian. > > On Tue, Apr 27, 2010 at 1:51

Right memory for search application

2010-04-27 Thread Samarendra Pratap
Hi. I am looking for some guidance on the right memory options for my Search Server application. How much memory should a Lucene-based application be given? Until a few days back I was running my search server on Java 1.4 with memory options "-Xmx3600m", which was running quite fine. After upgrading

Re: Reopening a Searcher for each request

2010-04-24 Thread Samarendra Pratap
earcher.close(); > > > indexSearcher = new IndexSearcher(newIndexReader); >} > } > return indexSearcher; > } catch (CorruptIndexException e) { > log.error(e.getMessage(),e); > return null; > } catch (IOException e) { > log.error(e.getMessage(),e); >

Re: Reopening a Searcher for each request

2010-04-22 Thread Samarendra Pratap
; > The Searchers do very little on construction so re-creating per query > should be OK. > > Mike > > On Thu, Apr 22, 2010 at 6:38 AM, Samarendra Pratap > wrote: > > Greetings to all. > > I have read at so many places that we should not open a Searcher for > each &g

Reopening a Searcher for each request

2010-04-22 Thread Samarendra Pratap
Greetings to all. I have read in so many places that we should not open a Searcher for each request, for the sake of performance, but I have always wondered whether that actually applies to the Searcher or to the Reader. I have a group of indexes amounting to 23 GB which actually consists of different index directo

Re: about analyzer for searching location

2010-04-19 Thread Samarendra Pratap
rstanding is, united and states are separately stored in index, but > not as "united states". So, if I build a query like Query q = > qp.parse("\"united states\""); It would not return any result. Am I right? > > Ian > > ---

Re: about analyzer for searching location

2010-04-16 Thread Samarendra Pratap
Hi. I don't think you need a different analyzer. Read about PhraseQuery. If you are using the parse() method of QueryParser, enclose the search string in extra double quotes, which must of course be escaped. Quer
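
As a rough illustration of the quoting advice above, the sketch below builds the string that would be handed to QueryParser's parse() method. It does not call Lucene: the class name `PhraseQuoting` and the helper `asPhrase` are hypothetical, and escaping embedded quotes and backslashes with a backslash assumes the query syntax of the Lucene version in use accepts it.

```java
// Hypothetical helper: wraps user text in double quotes so that a query
// parser would treat it as a single phrase rather than separate terms.
final class PhraseQuoting {

    static String asPhrase(String text) {
        // Escape backslashes first, then embedded double quotes.
        String escaped = text.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        // In Java source the literal is written "\"united states\"";
        // the parser itself sees the 15-character string "united states"
        // including the surrounding quotes.
        System.out.println(asPhrase("united states")); // prints "united states"
    }
}
```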

Grouping results on the basis of a field

2005-11-21 Thread Samarendra Pratap
Hi, I am using Lucene 1.4.3. The basic functionality of the search is simple: enter a keyword such as “java” and it will display all the books containing that keyword. Now I have to add a feature which also shows the names of the top authors (let's say the top 5 authors) with the number of books,
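
The counting part of this requirement can be sketched independently of Lucene. The example below is only one possible approach and not something proposed in the thread: it assumes the author field of every matching book has already been read into a list, the names `TopAuthors` and `top` are hypothetical, and it uses stream APIs that are much newer than Lucene 1.4.3.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: counts books per author over the hits of one search
// and keeps the n authors with the most books.
final class TopAuthors {

    static List<Map.Entry<String, Long>> top(List<String> authorsOfHits, int n) {
        // Group identical author names and count how often each occurs.
        Map<String, Long> counts = authorsOfHits.stream()
                .collect(Collectors.groupingBy(a -> a, Collectors.counting()));
        // Highest count first; ties are broken alphabetically by author name.
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up author names standing in for the author field of each hit.
        List<String> authors = Arrays.asList("A", "B", "A", "C", "B", "A");
        System.out.println(top(authors, 2)); // prints [A=3, B=2]
    }
}
```

Reading the author field for every hit can be expensive on large result sets, so this only shows the grouping logic, not how to collect the values efficiently.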