Re: Regarding Clustering Support in Lucene

2025-05-14 Thread Arun Kumar Kalakanti
Dear all, My bad, KMeans is in 10.2 too. Are there any other clustering algos like DBSCAN (or HDBSCAN) or Agglomerative planned in future? Regards, Arun Kumar K On Tue, 6 May 2025 at 17:11, Arun Kumar Kalakanti wrote: > Dear all, > > Lucene 10.1 introduced "experimental"

Regarding Clustering Support in Lucene

2025-05-06 Thread Arun Kumar Kalakanti
alternatives in place to support it? Regards, Arun Kumar K

Re: Handling Nested Vector Field in the Filter Criteria

2025-04-08 Thread Arun Kumar Kalakanti
Examples of the Nested Query: A AND (B OR C) AND D, A AND (!B OR (C and D)), etc. Any field(s) represented by A, B, C, and D can be a Vector or Regular Field too. On Tue, Apr 8, 2025 at 12:33 PM Arun Kumar Kalakanti < arun.kalaka...@gmail.com> wrote: > Hi all, > > I’m working with

Handling Nested Vector Field in the Filter Criteria

2025-04-08 Thread Arun Kumar Kalakanti
Hi all, I’m working with vector queries using KnnVectorQuery, which, as I understand it, requires a vector field with the target vector for vector search and a separate filter query for other fields based filters. This setup has worked well for me so far. However, I have two queries or use-cases:

Re: How to retain % sign next to number during tokenization

2023-09-21 Thread Amitesh Kumar
handle stuff like product numbers correctly. There > you can possibly make sure that "%" survives. > > Uwe > > On 20.09.2023 at 22:42, Amitesh Kumar wrote: > > Thanks Mikhail! > > > > I have tried all other tokenizers from Lucene 4.4. In case of > > Whi

Re: How to retain % sign next to number during tokenization

2023-09-20 Thread Amitesh Kumar
Thanks Mikhail! I have tried all other tokenizers from Lucene 4.4. In case of WhitespaceTokenizer, it loses romanizing of special chars like - etc On Wed, Sep 20, 2023 at 16:39 Mikhail Khludnev wrote: > Hello, > Check the whitespace tokenizer. > > On Wed, Sep 20, 2023 at 7:46 PM A

How to retain % sign next to number during tokenization

2023-09-20 Thread Amitesh Kumar
Hi, I am facing a requirement change to get % sign retained in searches. e.g. Sample search docs: 1. Number of boys 50 2. My score was 50% 3. 40-50% for pass score Search query: 50% Expected results: Doc-2, Doc-3 i.e. My score was 1. 50% 2. 40-50% for pass score Actual result: All 3 documents (

Re: How to retain % sign next to number during tokenization

2023-07-18 Thread Amitesh Kumar
Sorry for duplicating the question. On Tue, Jul 18, 2023 at 19:09 Amitesh Kumar wrote: > I am facing a requirement change to get % sign retained in searches. e.g. > > Sample search docs: > 1. Number of boys 50 > 2. My score was 50% > 3. 40-50% for pass score > > Sear

How to retain % sign next to number during tokenization

2023-07-18 Thread Amitesh Kumar
I am facing a requirement change to get % sign retained in searches. e.g. Sample search docs: 1. Number of boys 50 2. My score was 50% 3. 40-50% for pass score Search query: 50% Expected results: Doc-2, Doc-3 i.e. My score was 1. 50% 2. 40-50% for pass score Actual result: All 3 documents (becau

Fwd: How to retain % sign against numbers in lucene indexing/ search

2023-07-13 Thread Amitesh Kumar
*Warm Regards,* *Amitesh K* -- Forwarded message - From: Amitesh Kumar Date: Wed, Jul 12, 2023 at 7:03 AM Subject: How to retain % sign against numbers in lucene indexing/ search To: Hi Group, I am facing a requirement change to get % sign retained in searches. e.g Sample

Info about the Lucene 4.10.4 version.

2021-06-22 Thread Arvind Kumar Sahu
Hi Team, Currently we are using Lucene 4.10.4 version. We are getting the below error: "Document contains at least one immense term in field (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of

Incremental Lucene Index

2019-06-24 Thread Sukhendu Kumar Biswal
Hi Team, Does Lucene support incremental indexing, or do we need to index the complete repository for every update in the database? We have a scenario where millions of records are stored in an RDBMS and are updated once a day. Details: The first time, we created indexes for millions of records, if some

Near real time search in Lucene 7.2.0

2018-03-06 Thread Kumar, Santosh
Hi All, I am new to the Lucene API and need help with the issues below: * How to achieve near real time search in Lucene v 7.2.0? I have seen examples of having one indexWriter open for the entire application life cycle and invoking indexWriter.getReader() and reader.reopen(). But, these no longer

Re: Storing and retrieving Java objects in Lucene

2018-02-19 Thread Kumar, Santosh
need to retrieve one field and you can easily convert back to object. Regards Ganesh On 20-02-2018 08:34, Kumar, Santosh wrote: > Hi, > > I have a requirement to store a Java object with multiple fields into the Lucene index. Basically, at the ap

Storing and retrieving Java objects in Lucene

2018-02-19 Thread Kumar, Santosh
Hi, I have a requirement to store a Java object with multiple fields into the Lucene index. Basically, at the application startup I run a select query on entities ( there are 5 of them as of now and may increase in future) and then create an index for each of these entities (5) i.e. five diffe

Re: Lucene with Database

2017-12-28 Thread Kumar, Santosh
2017-12-28 6:35 GMT+01:00 Kumar, Santosh : > > While looking up for examples of fuzzy search with Lucene, I came across > examples that demonstrate Lucene with file system predominantly, so was > wondering if there are any samples on ‘How to use Lucene with DB’

Re: Lucene with Database

2017-12-27 Thread Kumar, Santosh
Hi Trejkaz, Evert, Riccardo, Thank you for your inputs. We have an application which we plan to migrate to Cloudfoundry and are yet to make a decision on DataBase with the contenders being PostgreSQL, MySQL, HANA DB, MongoDB. In the current setup, we use HANA DB which already has a fuzzy search

Lucene with Database

2017-12-21 Thread Kumar, Santosh
Hi, I’m currently working on project which has the following scenario: 1. I have entities in DB on which I would like to prevent duplicates by same name or near match, for example, SalesOrder or SlsOrd or SalesOrd etc…are all considered same. For this, I would like to use fuzzy search and r

how do i improve Indexing and Searching performance of 2 billion documents over SolrCloud

2017-02-13 Thread yeshwanth kumar
Hi, we have 4 solr instances running we are using solr cloud for indexing hbase table column names. each column in hbase will end up as a document in solr, which resulted in over 2 billion documents in solr. primary goal is to search the column names. we have 4 shards for the collection, queries a

Re: How to build a Lucene BooleanQuery?

2016-09-21 Thread Chaitanya Kumar Ch
4297102p4297106.html > Sent from the Lucene - Java Users mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h.

Re: How to build a Lucene BooleanQuery?

2016-09-21 Thread Chaitanya Kumar Ch
lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > -- Thank You, Chaitanya Kumar Ch, +91 9550837582

Re: Issue while searching text with special characters like @,#

2016-09-07 Thread Chaitanya Kumar Ch
hs.get(PATH))); > IndexSearcher searcher = new IndexSearcher(reader); > QueryParser parser = new QueryParser( FIELD_NAME, analyzer); > Query query = parser.parse("+text:iker#"); > ScoreDoc[] docs = searcher.search(query, 2).scoreDocs; > for( ScoreDoc d : docs ){ > Sy

Re: Issue while searching text with special characters like @,#

2016-09-06 Thread Chaitanya Kumar Ch
.ietf.org/html/rfc3986 see section 2.2 so you would have to > URL encode them > > My 2 cents > > 2016-09-06 10:20 GMT-04:00 Chaitanya Kumar Ch : > > > Thanks for the reply. > > I have tried that but didn't work. > > Also please note that *@,# are not part of c

Re: Issue while searching text with special characters like @,#

2016-09-06 Thread Chaitanya Kumar Ch
ersyntax. > html#Escaping%20Special%20Characters > > 2016-09-06 10:02 GMT-04:00 Chaitanya Kumar Ch : > > > Hi All! > > > > I am facing issue while trying to match a fields content with some > keywords > > which contains symbols like @,# > > > > I have a

Issue while searching text with special characters like @,#

2016-09-06 Thread Chaitanya Kumar Ch
Bridge() to the field but I am not getting results. Below query is generated If i am remove ignoreFieldBridge() +(+body:johndaly +body:baby) Stack overflow link <http://stackoverflow.com/questions/39350676/hibernate-search-lucene-search-text-with-special-characters-like> -- Thank You, Chaitanya Kumar Ch, +91 9550837582

Re: lucene index reader performance

2016-07-07 Thread Tarun Kumar
Any suggestions pls? On Mon, Jul 4, 2016 at 3:37 PM, Tarun Kumar wrote: > Hey Michael, > > docIds from multiple indices (from multiple machines) need to be > aggregated, sorted and first few thousand new to be queried. These few > thousand docs can be distributed among multiple

Re: lucene index reader performance

2016-07-04 Thread Tarun Kumar
llions of documents. > > Maybe you could make a custom collector, and use doc values, to do your > own custom aggregation. > > Mike McCandless > > http://blog.mikemccandless.com > > On Mon, Jul 4, 2016 at 1:39 AM, Tarun Kumar wrote: > >> Thanks for reply Michael! I

Re: lucene index reader performance

2016-07-03 Thread Tarun Kumar
be used to load just a few hits, like a > page worth or ~ 10 documents, per search. > > Mike McCandless > > http://blog.mikemccandless.com > > On Tue, Jun 28, 2016 at 7:05 AM, Tarun Kumar wrote: > >> I am running lucene 4.6.1. I am trying to get documents correspond

lucene index reader performance

2016-06-28 Thread Tarun Kumar
I am running lucene 4.6.1. I am trying to get documents corresponding to docIds. All threads get stuck (don't get stuck exactly but spend a LOT of time in) at: java.lang.Thread.State: RUNNABLE at sun.nio.ch.FileDispatcherImpl.pread0(Native Method) at sun.nio.ch.FileDispatcherImpl.p

suggester Error

2015-10-26 Thread Rajesh Kumar
I have configured suggester for each Entity in application. and whenever Entity is being created and updated in application i am building the suggester manually. but since i have migrated the Solr from 4.7 to 5.3 i am getting the java.lang.IllegalStateException: suggester was not built Exception

Re: Lucene 5 : any merge performance metrics compared to 4.x?

2015-09-14 Thread Selva Kumar
understand. No plan to turn this off. * Is running checkIntegrity prior to index merge better than running post merge? On Mon, Sep 14, 2015 at 12:24 PM, Selva Kumar wrote: > We observe some merge slowness after we migrated from 4.10 to 5.2. > Is this expected? Any new tunable merge paramet

Lucene 5 : any merge performance metrics compared to 4.x?

2015-09-14 Thread Selva Kumar
We observe some merge slowness after we migrated from 4.10 to 5.2. Is this expected? Any new tunable merge parameters in Lucene 5 ? -Selva

Re: Lucene 5: Mutable/Immutable interface of BitSet

2015-09-13 Thread Selva Kumar
, Yonik Seeley wrote: > On Sun, Sep 13, 2015 at 4:23 PM, Selva Kumar > wrote: > > Mutable, "Immutable" interface of BitSet seems to be defined based on > > specific things like live docs and documents with DocValue etc. Any plan > to > > add general purpose

Lucene 5: Mutable/Immutable interface of BitSet

2015-09-13 Thread Selva Kumar
Mutable, "Immutable" interface of BitSet seems to be defined based on specific things like live docs and documents with DocValue etc. Any plan to add general purpose readonly interface to BitSet? -Selva

Re: Lucene 5 : are FixedBitSet and SparseFixedBitSet thread-safe?

2015-09-13 Thread Selva Kumar
. On Sun, Sep 13, 2015 at 11:39 AM, Toke Eskildsen wrote: > Selva Kumar wrote: > > Subject: Lucene 5 : are FixedBitSet and SparseFixedBitSet thread-safe? > > > Short answer: No. > > Longer answer: Reading values and calling methods that does not modify the > structure

Lucene 5 : are FixedBitSet and SparseFixedBitSet thread-safe?

2015-09-13 Thread Selva Kumar

new to Lucene

2015-08-07 Thread Nantha Kumar Subramaniam
egards, Assoc Prof Dr Nantha Kumar Subramaniam *Head of E-Learning* Open University Malaysia (OUM)

Lucene 5: Wrapping Collector

2015-06-27 Thread Selva Kumar
With wrapping collector scenarios, wrapping LeafCollector needs access to wrapped LeafCollector. If wrapping LeafCollector has access to LeafReaderContext, it seems one can use getLeafCollector "getter" anytime to get the wrapped leaf collector. if collect(int doc) method retrieves LeafCollector

Re: Lucene 5: Query and Filter merger

2015-06-27 Thread Selva Kumar
ou have custom filters, I would encourage you to rewrite them > using the query API instead of Filter. > > > On Sat, Jun 27, 2015 at 2:38 AM, Selva Kumar > wrote: > > With Query/Filter merger, it appears default Filter hashcode() varies by > > class, not object instance. D

Lucene 5: Query and Filter merger

2015-06-26 Thread Selva Kumar
With Query/Filter merger, it appears default Filter hashcode() varies by class, not object instance. Different object instances could have same value. Am I missing something?

Need Help To understand feasibility

2015-06-16 Thread suraj kumar
know if this is possible . you will be my life saver. -- Thanks and Regards Suraj kumar "The most powerful weapon on earth is the human soul on fire. "

Re: Boolean Search Query is not workng

2015-01-23 Thread parnab kumar
Hi, While indexing , a norm value is calculated for each field and injected in the index. This norm value is used as field level boosting which is also multiplied with other factors like tf-idf and query level boost which you specify with setBoost. so you see setting boosting is one of the s

Re: How best to compare tow sentences

2014-12-04 Thread parnab kumar
Hi, If you are comparing two song titles, which are usually very short, you are better off using a custom set of several features rather than just one of cosine, Levenshtein, or Jaccard. You may use a combination of the following: 1. cosine sim score 2. Jaccard overlap coeff 3. how many words in th
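The Jaccard overlap coefficient mentioned in the reply above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class name `TitleSimilarity`, the whitespace tokenization, and the lowercasing are hypothetical choices, not something the message specifies, and no Lucene API is used.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: word-level Jaccard overlap of two titles.
final class TitleSimilarity {

    /** Lowercases a title and splits it on whitespace into a set of distinct words. */
    static Set<String> words(String title) {
        String trimmed = title.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return new HashSet<>();
        }
        return new HashSet<>(Arrays.asList(trimmed.split("\\s+")));
    }

    /** Size of the intersection divided by size of the union; 0 when both titles are empty. */
    static double jaccard(String a, String b) {
        Set<String> wordsA = words(a);
        Set<String> wordsB = words(b);

        Set<String> union = new HashSet<>(wordsA);
        union.addAll(wordsB);
        if (union.isEmpty()) {
            return 0.0;
        }

        Set<String> intersection = new HashSet<>(wordsA);
        intersection.retainAll(wordsB);

        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // Three shared words out of four distinct words overall.
        System.out.println(jaccard("Let It Be", "let it be me"));
    }
}
```

For very short strings such as song titles, a score like this would typically be one feature among several, as the reply suggests, rather than the sole ranking signal.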

Re: Document Term matrix

2014-11-11 Thread parnab kumar
Hi, While indexing the documents, store the Term Vectors for the content field. Now for each document you will have an array of terms and their corresponding frequency in the document. Using the Index Reader you can retrieve these term vectors. Similarity between two documents can be computed as
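The snippet above is cut off before it names a similarity measure. As a minimal sketch, assuming cosine similarity over raw term frequencies (an assumption, not necessarily what the author went on to recommend), the comparison of two term-frequency maps might look like this. The class `TermVectorCosine` and its whitespace tokenizer are hypothetical stand-ins for term vectors read from an index; no Lucene API is used.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: cosine similarity of two term-frequency maps.
final class TermVectorCosine {

    /** Builds a term -> frequency map from whitespace-separated, lowercased text. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> frequencies = new HashMap<>();
        for (String term : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty()) {
                frequencies.merge(term, 1, Integer::sum);
            }
        }
        return frequencies;
    }

    /** Euclidean length of a term-frequency vector. */
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int frequency : vector.values()) {
            sumOfSquares += (double) frequency * frequency;
        }
        return Math.sqrt(sumOfSquares);
    }

    /** Dot product divided by the product of the norms; 0 if either vector is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            Integer other = b.get(entry.getKey());
            if (other != null) {
                dot += (double) entry.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    public static void main(String[] args) {
        Map<String, Integer> first = termFrequencies("lucene index search");
        Map<String, Integer> second = termFrequencies("lucene search speed");
        System.out.println(cosine(first, second));
    }
}
```

In a real index the maps would be filled from the stored term vectors of each document rather than from re-tokenized text.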

TermQuery or PhraseQuery not working after switching from Lucene 3.6 to 4.6

2014-11-03 Thread Hemant Kumar
Hi, I'm currently migrating from the really old version 3.6 to 4.6. I deleted my old index and re-indexed everything after migrating the whole code base. Now everything works fine except one thing: My search doesn't return any results as soon as I add some TermQuery or PhraseQuery to an analy

free text suggester

2014-08-22 Thread parnab kumar
Hi, I am using lucene 4.8. I already have an index. I want to use the Free text suggester feature when a user queries the index. I am not sure how to start with this. A sample code snippet or a pointer to one would be really helpful. Thanks, Parnab

Re: Lucene newbie in need of a hint

2014-08-14 Thread parnab kumar
Have a look at this article if you have not already gone through it. http://blog.mikemccandless.com/2011/06/lucenes-near-real-time-search-is-fast.html On Thu, Aug 14, 2014 at 11:16 PM, Michael Jennings < mike.c.jenni...@gmail.com> wrote: > Hi everyone, > > I'm a bit of a Lucene newb, but a fairl

Re: bigram problem

2014-07-02 Thread parnab kumar
TF is straightforward: you can simply count the number of occurrences in the doc by simple string matching. For IDF you need to know the total number of docs in the collection and the number of docs containing the bigram. reader.maxDoc() will give you the total number of docs in the collection. To calculate the number o
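The counting described above can be sketched in plain Java over an in-memory list of documents. This is an illustration only: the class `BigramTfIdf` is hypothetical, the list size stands in for `reader.maxDoc()`, and the formula `log(N / df)` is one common IDF variant chosen as an assumption (the message is truncated before it gives one, and Lucene's own similarity classes use different formulas).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: TF and IDF of a bigram by plain string matching.
final class BigramTfIdf {

    /** Counts non-overlapping occurrences of the bigram in one document. */
    static int termFrequency(String doc, String bigram) {
        if (bigram.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = 0;
        // Naive substring matching: it ignores word boundaries, so "new york"
        // would also match inside "renew yorkshire".
        while ((from = doc.indexOf(bigram, from)) != -1) {
            count++;
            from += bigram.length();
        }
        return count;
    }

    /** Number of documents that contain the bigram at least once. */
    static int documentFrequency(List<String> docs, String bigram) {
        int matches = 0;
        for (String doc : docs) {
            if (termFrequency(doc, bigram) > 0) {
                matches++;
            }
        }
        return matches;
    }

    /** Assumed IDF variant: natural log of (total docs / docs with the bigram); 0 if none match. */
    static double idf(int totalDocs, int docFreq) {
        if (docFreq == 0) {
            return 0.0;
        }
        return Math.log((double) totalDocs / docFreq);
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "new york new york city",
                "a trip to new york",
                "boston and chicago",
                "york minster");
        String bigram = "new york";
        int tf = termFrequency(docs.get(0), bigram);
        int df = documentFrequency(docs, bigram);
        System.out.println("tf=" + tf + " df=" + df + " tf-idf=" + tf * idf(docs.size(), df));
    }
}
```

Against a real index, the document frequency would come from the index itself rather than from scanning every document's text.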

Re: Batch wise Indexing Structured Documents

2014-06-26 Thread parnab kumar
download lucene source code... and check the demo source files that are shipped with it ... you should find a sample indexing file... On Thu, Jun 26, 2014 at 9:27 PM, Venkata krishna wrote: > Hi, > > I have to index millions of files, that's why i am thinking batch wise > indexing is good. > >

Webinar On Hadoop!

2014-03-19 Thread Vivek Kumar
/54180991637732354 *Discussion Topics? * *What is Big Data ? *Challenges in Big Data *What is Hadoop ? *Opportunities in Hadoop / Big Data Best Regards, *Kumar Vivek * M +91-7675824584| si...@soapt.com www.soapttrainings.com <http://soapttrainings.com/index.php?actio

Re: please help me

2013-09-30 Thread parnab kumar
Just add the lucene jar files in the build path of the project. On Sat, Sep 28, 2013 at 5:04 PM, sajad naderi wrote: > hi > i want run code sample of "lucene in action"book by eclipse > please tell me how configure eclipse to run those code >

Re: DocIDBitSets & Grouping

2013-06-24 Thread Arun Kumar K
Thanks Uwe ! For part (1) of my query are there any smart ways ? Arun On Mon, Jun 24, 2013 at 4:29 PM, Uwe Schindler wrote: > Hi, > > > > With prior warming i find that (a) & (b) take almost same time. I knew > that > > only when we reuse the Filter we get its benefits. > > (c) takes around 30

DocIDBitSets & Grouping

2013-06-24 Thread Arun Kumar K
Hi Guys, I am using Lucene 4.2. 1> For my use case i am doing a search say name:xyz* and then i have a need to do a grouping with (from query same as name:xyz* + Filter + GroupSort) may be in same/different thread. From my understanding the second internal search will be faster but i have good

Re: FieldCache & DocValues Filter

2013-06-06 Thread Arun Kumar K
Hi, Thanks Robert ! This info is exactly what i need. Just for getting myself clear. If the field is a DocValue field the FieldCacheTermsFilter will use the existing DocValues Field. For Normal Fields the filter will create a DocValues for that field using FieldCache. Arun On Thu, Jun 6, 2013

FieldCache & DocValues Filter

2013-06-06 Thread Arun Kumar K
Hi Guys, I was trying to better the filtering mechanism for my use case. When i use the existing filters like FieldCacheTermsFilter, TermsFilter i see that the first filtering take up enough time may be for building the FieldCache. Subsequent filters are fast enough. Currently, I am using CachingW

Re: Lucene 4.2 Doc Vals

2013-06-04 Thread Arun Kumar K
inefficient for random lookup. You should do a binary > search to find the right leaf. CompositeReader and ReaderUtil have utility > methods to do this. > > Uwe > > > > Arun Kumar K wrote: > >Hi Guys, > > > >I am trying to get hands on Lucene 4.2 Doc Va

Lucene 4.2 Doc Vals

2013-06-04 Thread Arun Kumar K
Hi Guys, I am trying to get hands on Lucene 4.2 Doc Values (RAM Based Which is by default). I have a 1GB index with 54 documents. When retrieving the DocVals for matched docs i am able to retrieve vals only upto some limit around 45000 docvals only. for (AtomicReaderContext context : reader

Re: Lucene 4.2 DocValues

2013-05-28 Thread Arun Kumar K
Adrien, Thanks for spending time to explain things clearly to me. I have understood them correctly now. Thanks, Arun On 29-May-2013, at 2:13 AM, Adrien Grand wrote: > On Tue, May 28, 2013 at 8:55 PM, Arun Kumar K wrote: >> Thanks for clarifying the things. >> I have some d

Re: Lucene 4.2 DocValues

2013-05-28 Thread Arun Kumar K
i right here? Thanks, Arun On 28-May-2013, at 8:31 PM, Adrien Grand wrote: > On Tue, May 28, 2013 at 4:48 PM, Arun Kumar K wrote: >> Hi Guys, > > Hi, > >> I have been trying to understand DocValues and get some hands on and have >> observed few things. >

Lucene 4.2 DocValues

2013-05-28 Thread Arun Kumar K
Hi Guys, I have been trying to understand DocValues and get some hands on and have observed few things. I have added LongDocValuesField to the documents like: doc.add(new LongDocValuesField("id",1)); 1> In 4.0 i saw that there are two versions for docvalues, RAM Resident(using Sources.getSO

Re: WildCardQuery: TooManyClauses Exception

2013-04-18 Thread Arun Kumar K
> - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > > -----Original Message- > > From: Arun Kumar K [mailto:arunk...@gmail.com] > > Sent: Thursday, April 18, 2013 12:41 PM > > To: java-

WildCardQuery: TooManyClauses Exception

2013-04-18 Thread Arun Kumar K
Hi Guys, I am using following queries: 1>WildCardQuery 2>BooleanQuery having a WildCardQuery and TermQuery. WildCardQuery is field:* or say field:ab* From Lucene FAQs and earlier discussions about TooManyClausesException i see that WildCardQuery gets expanded before doing search. For that i was

Re: "4.1 consuming more memory than 3.0.2 while Indexing"

2013-04-01 Thread Arun Kumar K
ram you are setting on the > index writer config? > > also how many threads are you using for indexing? > > simon > > On Mon, Apr 1, 2013 at 2:21 PM, Arun Kumar K wrote: > > Hi Adrien, > > > > I have seen memory usage using linux command top for RES memory &

Re: "4.1 consuming more memory than 3.0.2 while Indexing"

2013-04-01 Thread Arun Kumar K
M, Adrien Grand wrote: > On Mon, Apr 1, 2013 at 1:56 PM, Arun Kumar K wrote: > > Hi Guys, > > Hi, > > > I have been finding out the heap space requirement for indexing and > > searching with 3.0.2 vs 4.1 (with BlockPostings Format). > > > > I have a 2GB inde

"4.1 consuming more memory than 3.0.2 while Indexing"

2013-04-01 Thread Arun Kumar K
Hi Guys, I have been finding out the heap space requirement for indexing and searching with 3.0.2 vs 4.1 (with BlockPostings Format). I have a 2GB index with 1 million docs with around 42 fields with 40 fields being random strings. I have seen that memory for search has reduced by 5X with 4.1 (w

Re: Wild Card Query Performance

2013-03-29 Thread Arun Kumar K
ngeQuery(20130101, 20130131)? Another approach for improving > prefix queries is indexing additional terms: If you are always searching > for a 2-char prefix for "ab*", then simply index an additional term in a > separate field with 2 chars (e.g., "ab") in your documents

Re: Wild Card Query Performance

2013-03-29 Thread Arun Kumar K
dler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > > -Original Message- > > From: Arun Kumar K [mailto:arunk...@gmail.com] > > Sent: Friday, March 29, 2013 10:38 AM > > To: java-user > > Subject: Wi

Wild Card Query Performance

2013-03-29 Thread Arun Kumar K
Hi Guys, I have been testing the search time improvement in Lucene 4.0 from Lucene 3.0.2 version for Wildcard Queries (with atleast say 2 chars Eg.ar*). For a 2GB size index with 400 docs, the following observations were made: Around 3X improvement with and without STRING sort on a sortable

Searching for keywords .net,c#,...

2013-02-24 Thread kumar
e sure how to use it and have found scant references to it Any help is appreciated Thanks kumar

Re: Delete documents base on more than one condition?

2012-12-06 Thread parnab kumar
Hi Rajashekhar, yes, it is possible. You can form a Boolean Query which will match the documents as per your required conditions. Then you can delete by the respective document ids by instantiating an IndexReader. You can refer to the book Lucene in Action, 2nd Edition, for more details. Thanks, Parn

Re: Help for multi-language support

2012-12-04 Thread parnab kumar
Hi Deepak , Lucene already has multi-language support . For any language you just need to write the custom Analyzer for that language .While indexing you can configure the indexer to use the custom analyzer as and when needed . During searching also, the same applies .You just need to provide the

Re: Variable term weighting while indexing

2012-10-01 Thread parnab kumar
t > Erick > > On Sun, Sep 30, 2012 at 8:02 AM, parnab kumar > wrote: > > Hi Erick, > > Can you please share your thoughts on the following : > > Since lucene by default does vector space scoring , the > > weight component for a term from

Re: Lucene Index File Format

2012-09-30 Thread parnab kumar
Hi, Use IndexReader instead . You can loop through the index and read one document at a time . Thanks, Parnab On Mon, Oct 1, 2012 at 10:33 AM, Selvakumar wrote: > Hi, > > I'm new to Lucene and I reading the docs on Lucene. > > > I read through the Lucene Index File Format, so to e

Re: Variable term weighting while indexing

2012-09-30 Thread parnab kumar
are > indistinguishable. > > Best > Erick > > On Sat, Sep 29, 2012 at 12:23 PM, parnab kumar > wrote: > > Hi All, > > > >I have an algorithm by which i measure the importance of a > term > > in a document . While indexing i want to store weig

IndexUpgrader

2012-08-14 Thread sunil Kumar Verma
We have recently moved to 3.6 from lucene 2.2 and have seen that the way tokens get indexed are not the same. Although we are open to reindexing the data which was initially indexed with 2.2, I would like to know if there is a way I can avoid indexing? I am using IndexUpgrader tool to update the

RE: Storing same field twice (analyzed+not-analyzed), sorting

2012-04-27 Thread Vinaya Kumar Thimmappa
Why don't you store keywords related data in keywords field which can be analyzed and other field in as it is now. So all fields for which keywords is needed, move it to keywords section -v -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Friday, April 27,

analyzer per document

2012-02-09 Thread Vinaya Kumar Thimmappa
Hello All, I have a requirement of using different analyzer per document. How can we do this? My analyzer would be locale specific. I have a file with 10 lines, each with different language. Document would be one line and I want analyzer to be changed based on the locale of the line. Is this po

RE: [OT] editing index with Luke question

2011-12-14 Thread Vinaya Kumar Thimmappa
Hope you have write permission on this index file Vinaya -Original Message- From: Michael Südkamp [mailto:michael.suedk...@docware.de] Sent: Wednesday, December 14, 2011 3:16 PM To: java-user@lucene.apache.org Subject: [OT] editing index with Luke question Hi, I know the Luke tool for

Lucene bangalore chapter

2011-12-05 Thread Vinaya Kumar Thimmappa
is there a lucene Bangalore chapter ? -Vinaya - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org

RE: need help

2011-06-21 Thread Vinaya Kumar Thimmappa
Hello Cheta, Check this site : http://www.lucidimagination.com/blog/2009/03/09/nutch-solr/ Vinaya -Original Message- From: Marlen [mailto:zmach...@facinf.uho.edu.cu] Sent: Tuesday, June 21, 2011 7:19 PM To: java-user@lucene.apache.org Subject: need help I need to create a search engi

Boolean Search In Lucene

2011-06-03 Thread Ranjit Kumar
rger of x or y. But this is not happening. Please help how should I proceed to implement this? Any help or suggestion will be appreciated!!! Thanks & Regards, Ranjit Kumar === Private, Confidentia

Re: lucene 3.0.3 | QueryParser | MultiFieldQueryParser

2011-04-27 Thread Ranjit Kumar
.net I need to use MultiFieldQueryParser to get correct result(document). Then Parser stripping off # but do not dot(.) so query became c AND .net Also, I have made changes for c#.net, vb.net, .net all these work properly with MultiFieldQueryParser except c# Thanks & Regards, Ranjit K

lucene 3.0.3 | QueryParser | MultiFieldQueryParser

2011-04-26 Thread Ranjit Kumar
MultiFieldQueryParser but it also do the same. Any help or suggestion will be appreciated!!! Thanks & Regards, Ranjit Kumar === Private, Confidential and Privileged. This e-mail and any files

lucene 3.0.3 | searching problem with *.docx file

2011-04-12 Thread Ranjit Kumar
Hi, I am creating an index with the help of StandardAnalyzer for *.docx files, and that works fine. But at search time it does not give any results for these *.docx files. Any help or suggestion will be appreciated!!! Thanks & Regards, Ran

Japanese/Chinese language support

2011-03-28 Thread Vinaya Kumar Thimmappa
Hello All, I am looking for a Japanese/Chinese stemmer. Does one exist? Do we require it? (Analyzers are already present in Lucene.) I did a Google search and did not find any conclusive answer. Thanks in advance vinaya - To unsub

Re: lucene3.0.3 | Special character indexing

2011-03-14 Thread Vinaya Kumar Thimmappa
Hello Ranjit, Can you use the latest luke tool ? It has analyzer section which helps in deciding which analyzer to use based on the input. Hope this helps -vinaya On Monday 14 March 2011 07:18 PM, Ranjit Kumar wrote: Hi, I am creating index using Lucene 3.0.3 *StandardAnalyzer*. when

lucene3.0.3 | Special character indexing

2011-03-14 Thread Ranjit Kumar
existing analyzer will resolve this issue? Any suggestion will be appreciated!!! Thanks & Regards, Ranjit Kumar Associate Software Engineer [cid:image002.jpg@01CB7089.C0069B40] US: +1 408.540.0001 UK: +44 208.099.1660 India: +91 124.474.8100 | +91 124.410.1350 FAX: +1 408.51

Re: Indexing of multilingual labels

2011-03-14 Thread Vinaya Kumar Thimmappa
Hello Stephane, I think a better way is to have resource file with different language and store pointer in the index to get to correct resource file ( Something like I18N and L10N approach). Store the internationalised string in index and all related localised string in resource file . Thi

Re: lucene3.0.3 | get correct document in case of multiple Boolean query in search criteria

2011-02-22 Thread Ranjit Kumar
eanClause.Occur.MUST}; Query query = MultiFieldQueryParser.parse(Version.LUCENE_CURRENT, queree, fields, flags, analyzer); TopDocs docs = searcher.search(query, null, n); System.out.println("Total document matched: "+docs.totalHits);

Re: lucene3.0.3 | get correct document in case of multiple Boolean query in search criteria

2011-02-21 Thread Ranjit Kumar
string? If not? Please suggest me how to get correct hit(document). Your suggestion will be very helpful for me. Thanks & Regards, Ranjit Kumar === Private, Confidential and Privileged. This e-

lucene3.0.3 | get correct document in case of multiple Boolean query in search criteria

2011-02-18 Thread Ranjit Kumar
suggest me feasible solution. how could get correct document in case of multiple Boolean query in search criteria? Thanks & Regards, Ranjit Kumar === Private, Confidential and Privileged. This e-mail and

lucene 3.0.3 | phrase query problem

2011-02-10 Thread Ranjit Kumar
with this condition. Thanks & Regards, Ranjit Kumar === Private, Confidential and Privileged. This e-mail and any files and attachments transmitted with it are confidential and/or privileged. They are

lucene 3.0.3 | phrase query problem

2011-02-10 Thread Ranjit Kumar
s); It gives incorrect result or document. In case of using SpanQuery and SpanNearQuery it gives same result. Is this a bug? Can you give some hint about luke? Any suggestion will be appreciated. Thanks & Regards, Ranjit Kumar Associate Software Engineer [cid:image002.jpg@01CB7089.C006

lucene 3.0.3 | phrase query problem

2011-02-09 Thread Ranjit Kumar
l Server". It gives result for (sql. server) which is not correct. I am using StandardAnalyzer on both side while creating index or searching on files. While creating index Field.Index.NOT_ANALYZED is used. Please, give your suggestion !!! Thanks & Regards, Ranjit Kumar Associate S

lucene-core-3.0.3.jar | creating index too slow

2011-01-31 Thread Ranjit Kumar
Hi; While I am creating an index using lucene-core-3.0.3.jar, it takes more time as compared to lucene-core-3.0.2.jar. Please give your suggestion. Thanks & Regards, Ranjit K

Re: Best practices for multiple languages?

2011-01-18 Thread Vinaya Kumar Thimmappa
I think we should be using Lucene with the Snowball jars, which means one index for all languages (of course, the size of the index is always a matter of concern). Hope this helps. -vinaya On Tuesday 18 January 2011 11:23 PM, Clemens Wyss wrote: What is the "best practice" to support multiple languages, i

Lucene: how to get frequency of Boolean query

2010-12-25 Thread Ranjit Kumar
totalFreq = (Integer) lsta1.get(lsta.indexOf(docId)); } w.write(contId+"\t"+ID+"\t"+totalFreq+"\t"+reader.document(docId).get("path")+"\n"); } Thanks & Regards, Ran

FW: Re: lucene3.0.2: getting incorrect no. of occurrence in file

2010-12-08 Thread Ranjit Kumar
String path = document.get("path"); System.out.println("path>>" + path); totalFreq = termDocs.freq(); System.out.println("totalFreq >>" + totalFreq)

Ui Framework for lucene

2010-12-07 Thread Vinaya Kumar Thimmappa
Hello All, is there any ui framework that exists for lucene framework. i am not looking for luke kind of tool. But more like application ready to use. "Add a config file and ui is ready to use for adding data and also search data". Thanks and Regards Vinaya
