Getting most occurring words in lucene

2015-02-22 Thread Maisnam Ns
Hi, I am trying to get the top occurring words by building an in-memory index with Lucene using the code below, but I am not getting the desired results. The text contains 'freedom' three times but it reports only 1. Where am I making a mistake? Is there a way out? Please help. RAMDirectory idx = n

Re: Lucene querying with words ending with 'ing'

2015-02-16 Thread Maisnam Ns
Thanks Erick On Tue, Feb 17, 2015 at 11:33 AM, Erick Erickson wrote: > See ReversedWildcardFilterFactory. The trick is to index the tokens > backwards, so leading wildcards become trailing ones. > > Best, > Erick > > On Mon, Feb 16, 2015 at 9:16 PM, Maisnam Ns wrote: >
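
The reversed-token trick from the reply above can be illustrated without Lucene or Solr. The following is only a minimal sketch in plain Java: the class ReversedTokenIndex and its methods are hypothetical names invented for this illustration, and a sorted set stands in for the term dictionary. It shows why a leading wildcard such as '*ing' becomes a cheap prefix scan once tokens are stored backwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Lucene or Solr.
final class ReversedTokenIndex {
    // Tokens are stored reversed and sorted, so "ends with X" becomes
    // "starts with reverse(X)", which is a contiguous range in sorted order.
    private final TreeSet<String> reversedTokens = new TreeSet<>();

    private static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    void add(String token) {
        reversedTokens.add(reverse(token.toLowerCase(Locale.ROOT)));
    }

    // Returns every indexed token that ends with the given suffix.
    List<String> endingWith(String suffix) {
        String prefix = reverse(suffix.toLowerCase(Locale.ROOT));
        List<String> matches = new ArrayList<>();
        for (String candidate : reversedTokens.tailSet(prefix, true)) {
            if (!candidate.startsWith(prefix)) {
                break; // sorted order: no later entry can match
            }
            matches.add(reverse(candidate)); // restore the original token
        }
        return matches;
    }

    public static void main(String[] args) {
        ReversedTokenIndex index = new ReversedTokenIndex();
        for (String t : new String[] {"Flying", "shooting", "fighting", "looking", "Japan"}) {
            index.add(t);
        }
        System.out.println(index.endingWith("ing"));
    }
}
```

In a real index the reversal would happen in the analysis chain at index time, and the query side would have to reverse the pattern in the same way; the sketch only demonstrates the idea.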

Lucene querying with words ending with 'ing'

2015-02-16 Thread Maisnam Ns
Hi, Can someone help me with querying terms ending with 'ing' in Lucene? I tried searching with '*ing', but it says the query string cannot start with *. I would like to get all words ending with 'ing'. How can I accomplish this with Lucene? Regards NS

Re: Top 10 words

2015-02-15 Thread Maisnam Ns
Hi Jigar, The link you shared, http://search.carrot2.org, is really nice; a lot of its features match my requirements. Thanks for the share. On Mon, Feb 16, 2015 at 9:20 AM, Maisnam Ns wrote: > Hi Denis, > > Looks good and thanks for the

Re: Top 10 words

2015-02-15 Thread Maisnam Ns
fact (see https://github.com/addthis/stream-lib). > > > On Feb 14, 2015, at 04:34, Maisnam Ns wrote: > > > > Hi Jigar, > > > > Thanks for the clustering algorithm will see if it can be applied. > > > > These are not known fields as these documents are co

Re: occurrence of two terms with the highest frequency

2015-02-13 Thread Maisnam Ns
> etc. > > or BooleanQuery equivalents with MUST clauses. Use > o.a.l.search.TotalHitCountCollector and it should be blazingly fast, > even if you have more than 100 docs. > > > -- > Ian. > > > On Thu, Feb 12, 2015 at 5:42 PM, Maisnam Ns wrote: > > Hi,
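
The approach suggested in the reply above amounts to running one "both terms required" query per pair of keywords and comparing the hit counts. The following is only a rough sketch of that counting logic in plain Java, with no Lucene dependency: each document is assumed to be represented as a set of its (already lowercased) terms, and the class PairCounts with its methods is a hypothetical name made up for this illustration. In Lucene itself, each pair would instead be a BooleanQuery with two MUST clauses whose hits are counted by a TotalHitCountCollector, as the reply describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; stands in for per-pair hit counting.
final class PairCounts {
    // For every unordered pair of terms, counts the documents containing both.
    // Matching is exact and case-sensitive, so terms must be normalized first.
    static Map<String, Integer> count(List<Set<String>> docs, List<String> terms) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i < terms.size(); i++) {
            for (int j = i + 1; j < terms.size(); j++) {
                String a = terms.get(i);
                String b = terms.get(j);
                int hits = 0;
                for (Set<String> doc : docs) {
                    if (doc.contains(a) && doc.contains(b)) {
                        hits++;
                    }
                }
                counts.put(a + " AND " + b, hits);
            }
        }
        return counts;
    }

    // Returns the pair with the highest count (the first one seen on ties),
    // or null if there are no pairs.
    static String best(Map<String, Integer> counts) {
        String bestPair = null;
        int bestCount = -1;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                bestCount = e.getValue();
                bestPair = e.getKey();
            }
        }
        return bestPair;
    }

    public static void main(String[] args) {
        List<Set<String>> docs = new ArrayList<>();
        docs.add(new HashSet<>(Arrays.asList("flying", "shooting")));
        docs.add(new HashSet<>(Arrays.asList("flying", "shooting", "looking")));
        docs.add(new HashSet<>(Arrays.asList("flying", "fighting")));
        List<String> terms = Arrays.asList("flying", "shooting", "fighting", "looking");
        Map<String, Integer> counts = count(docs, terms);
        System.out.println(counts);
        System.out.println(best(counts));
    }
}
```

With 4 keywords there are only 6 pairs, so issuing one count per pair stays cheap; the number of pairs grows quadratically with the number of keywords.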

Re: Top 10 words

2015-02-13 Thread Maisnam Ns
you can refer http://search.carrot2.org > > > > > On Fri, Feb 13, 2015 at 10:13 PM, Maisnam Ns wrote: > > > Hi, > > > > Can someone help me with this use case: > > > > 1. I have to search a string and let's say the search engine(it is not > >

Top 10 words

2015-02-13 Thread Maisnam Ns
Hi, Can someone help me with this use case: 1. I have to search a string and let's say the search engine (it is not Lucene) found this string in 100,000 documents. I need to find the top 10 words occurring in these 10 documents. As the document size is large how to further index these documents
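
For a result set small enough to process in memory, the "top N words" part of this use case can be sketched without any index at all. The following is a minimal illustration in plain Java, assuming the matching documents are available as strings; the class TopWords, the method top, and the simple letter-based tokenizer are assumptions made for this sketch and do not reflect how Lucene analyzes text. It counts every occurrence of a word (not the number of documents containing it), so a word that appears three times contributes 3.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration class; a naive in-memory word counter.
final class TopWords {
    // Returns up to n (word, count) entries, highest count first;
    // ties are broken alphabetically so the result is deterministic.
    static List<Map.Entry<String, Integer>> top(List<String> texts, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String text : texts) {
            // Naive tokenizer: lowercase, then split on anything that is not a letter.
            for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((x, y) -> {
            int byCount = Integer.compare(y.getValue(), x.getValue());
            return byCount != 0 ? byCount : x.getKey().compareTo(y.getKey());
        });
        return new ArrayList<>(entries.subList(0, Math.min(n, entries.size())));
    }

    public static void main(String[] args) {
        List<String> texts = new ArrayList<>();
        texts.add("Freedom is freedom");
        texts.add("no freedom without law, no law without order");
        System.out.println(top(texts, 2));
    }
}
```

A real solution would normally also drop stop words, and for very large inputs an approximate counter (such as those in the stream-lib library mentioned elsewhere in this thread) may be preferable to an exact map.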

Re: Proximity query

2015-02-12 Thread Maisnam Ns
, Maisnam Ns wrote: > Hi Allison and Sujit, > > Thanks so much for your links I am so happy I am looking at exactly the > links that almost covers my use case. > > Allison, sure will get back to you if I have some more questions. > > Regards > NS > > > > >

occurrence of two terms with the highest frequency

2015-02-12 Thread Maisnam Ns
Hi, Can someone help me with this use case. Use case: Say there are 4 keywords 'Flying', 'Shooting', 'fighting' and 'looking' to search for in 100 documents. Consider that 'Flying' and 'Shooting' co-occur in 70 documents, whereas 'Flying' and 'fighting' co-occur in 14 documents 'Flyin

Re: Proximity query

2015-02-12 Thread Maisnam Ns
> written against Lucene 3.x so you may have to upgrade it if you are using > Lucene 4.x): > > > http://sujitpal.blogspot.com/2011/08/implementing-concordance-with-lucene.html > > -sujit > > > On Thu, Feb 12, 2015 at 8:57 AM, Maisnam Ns wrote: > > > Hi Shah, >

Re: Proximity query

2015-02-12 Thread Maisnam Ns
On Thu, Feb 12, 2015 at 10:10 PM, Maisnam Ns wrote: > > > Hi, > > > > Can someone help me if this use case is possible or not with lucene > > > > Use case: I have a string say 'Japan' appearing in 10 documents and I > want > > to get back ,

Proximity query

2015-02-12 Thread Maisnam Ns
Hi, Can someone help me determine whether this use case is possible with Lucene? Use case: I have a string, say 'Japan', appearing in 10 documents, and I want to get back results which contain two words before 'Japan' and two words after 'Japan', maybe something like this: ' Economy of Japan is gro
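
The output described in this use case (a keyword with a fixed number of words on either side) is usually called a concordance. The following is only a minimal sketch of the windowing step in plain Java, assuming the text of each matching document is already at hand; the class Concordance, the method windows, and the simple tokenizer are hypothetical and chosen for this illustration. It does not use Lucene, where the same effect would require access to the stored text or term positions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; extracts keyword-in-context windows.
final class Concordance {
    // Returns one window per occurrence of keyword, each holding up to
    // `radius` words before and after it (fewer near the ends of the text).
    static List<String> windows(String text, String keyword, int radius) {
        // Naive tokenizer: split on anything that is not a letter or digit.
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        List<String> result = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (tokens.get(i).equalsIgnoreCase(keyword)) {
                int from = Math.max(0, i - radius);
                int to = Math.min(tokens.size(), i + radius + 1); // exclusive
                result.add(String.join(" ", tokens.subList(from, to)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "The economy of Japan is growing fast";
        // prints [economy of Japan is growing]
        System.out.println(windows(text, "Japan", 2));
    }
}
```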