Re: Trace only exactly matching terms!

2010-05-07 Thread Anshum
Hi Manjula, Yes lucene by default would only tackle exact term matches unless you use a custom analyzer to expand the index/query. -- Anshum Gupta http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw On Fri

Re: Storing The content

2010-05-17 Thread Anshum
Hi Saurabh, I don't think there's a way to do that? Why not use other constructs? -- Anshum Gupta http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw On Mon, May 17, 2010 at 8:04 PM, Saurabh Aga

Re: Scaling Lucene to 1bln docs

2010-08-10 Thread Anshum
ch in case reading the source takes time in your case, though, the indexwriter would have to be shared among all threads. -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Aug 10, 2010 at 12:24 PM, Shelly_Singh wrote: > Hi, > > I am developing an application which uses Lucene for ind

Re: Scaling Lucene to 1bln docs

2010-08-10 Thread Anshum
that period. This would make the data manageable and searchable within reasonable time. -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Aug 10, 2010 at 5:49 PM, Shelly_Singh wrote: > No sort. I will need relevance based on TF. If I shard, I will have to > search in al indices. > &

Re: Scaling Lucene to 1bln docs

2010-08-10 Thread Anshum
So, you didn't really use the setRamBuffer.. ? Any reasons for that? -- Anshum Gupta http://ai-cafe.blogspot.com On Wed, Aug 11, 2010 at 10:28 AM, Shelly_Singh wrote: > My final settings are: > 1. 1.5 gig RAM to the jvm out of 2GB available for my desktop > 2. 100GB d

Re: about RAMDirectory based B/S plantform problem

2010-08-16 Thread Anshum
for your application? -- Anshum Gupta http://ai-cafe.blogspot.com 2010/8/17 xiaoyan Zheng > the question is like this: > > when one user is using IndexWirter.addDocument(doc), and another user has > already finished adding part and have closed IndexWirter, then, the first > u

Re: Sorting a Lucene index

2010-08-18 Thread Anshum
comfortably. btw, are you facing any issues in sort time or is it a presumption? -- Anshum Gupta http://ai-cafe.blogspot.com On Wed, Aug 18, 2010 at 5:12 PM, Shelly_Singh wrote: > Hi, > > I have a Lucene index that contains a numeric field along with certain > other fields. The order

Re: slow search threads during a disk copy

2010-08-23 Thread Anshum
Seems like a case of I/O issues. You may be reading content off the index while performing searches while the I/O for copy is also happening. -- Anshum Gupta http://ai-cafe.blogspot.com On Mon, Aug 23, 2010 at 1:12 PM, wrote: > > Hi all, > > > We're observing search

Re: slow search threads during a disk copy

2010-08-23 Thread Anshum
There is bound to be IO contention. I'm sure iostat will give you a much better picture on it. -- Anshum Gupta http://ai-cafe.blogspot.com On Mon, Aug 23, 2010 at 3:13 PM, wrote: > Yes, all version directories are on the same disk. iostat output should be > useful. Using rsync is

Re: Wanting batch update to avoid high disk usage

2010-08-23 Thread Anshum
ngedeletes(). -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Aug 24, 2010 at 4:38 AM, Justin wrote: > In an attempt to avoid doubling disk usage when adding new fields to all > existing documents, I added a call to IndexWriter::expungeDeletes. Then my > colleague pointed out that Luce

Re: Wanting batch update to avoid high disk usage

2010-08-23 Thread Anshum
reclaiming lost disc space. -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Aug 24, 2010 at 9:22 AM, Justin wrote: > My actual code did not call expungeDeletes every time through the loop; > however, > calling expungeDeletes or optimize after the loop means that the index has > dou

Re: how to get the first term from index?

2010-09-30 Thread Anshum
this is what you intended! -- Anshum Gupta http://ai-cafe.blogspot.com On Thu, Sep 30, 2010 at 11:54 PM, Sahin Buyrukbilen < sahin.buyrukbi...@gmail.com> wrote: > Hi all, > > I need to get the first term in my index and iterate it. Can anybody help > me? > > Best. >

Re: Update lucene index

2010-10-11 Thread Anshum
on for you wanting to do so? is it that you only index data coming from a stream and you don't have access to the original source at a later time? -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Oct 12, 2010 at 11:35 AM, Nilesh Vijaywargiay < nilesh.vi...@gmail.com> wrote: > Hi

Re: Update lucene index

2010-10-11 Thread Anshum
ParallelReader sounds useful in theory, but I'm not sure how much overhead maintaining and synchronizing the document ids would add. I haven't used it so far; perhaps someone who's used ParallelReader for such a purpose in a production environment/at scale can help you. -- An

Re: Indexing is hung or doesn't complete

2010-10-12 Thread Anshum
Version? Machine and JVM (32/64 bit)? This most probably seems like a code level issue rather than lucene, but I may be wrong. -- Anshum Gupta http://ai-cafe.blogspot.com On Wed, Oct 13, 2010 at 8:08 AM, Ching wrote: > Hi All, > > Can anyone help with this issue? I have about 2000 pdf fil

Re: How to make a search log

2010-10-12 Thread Anshum
to begin, you may look at SOLR, which provides an out of the box engine. -- Anshum Gupta http://ai-cafe.blogspot.com On Wed, Oct 13, 2010 at 8:57 AM, Hyun Joo Noh wrote: > Hi, how would you make Lucene leave a search log of > who searched what, when, etc (i.e. cookie, query, timestamp, etc

Re: Best implementation for address searching

2010-10-20 Thread Anshum
cord 1. Also while searching you may tokenize on a comma or whatever set of chars you find appropriate. -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Oct 19, 2010 at 8:59 PM, Jasper de Barbanson < lucene-mailingl...@de-barbanson.com> wrote: > I'm currently working on buil

Re: assign a id to document?

2010-10-20 Thread Anshum
Hi Nilesh, No you can't do that. Though you may store your own id as a separate field for whatever purpose you want. I don't see any reason why you'd essentially want to override the lucene document id with your own. Let me know in case there's something I didn't get.

Re: asking about index verification tools

2010-11-15 Thread Anshum
ndex. This would also give you a fair idea of the index state. -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Nov 16, 2010 at 11:36 AM, Yakob wrote: > hello all, > I would like to ask about lucene index. I mean I created a simple > program that created lucene indexes and stored it

Re: asking about index verification tools

2010-11-17 Thread Anshum
index and the source. -- Anshum Gupta http://ai-cafe.blogspot.com On Wed, Nov 17, 2010 at 1:36 PM, Lance Norskog wrote: > The Lucene CheckIndex program does this. It is a class somewhere in Lucene > with a main() method. > > > Samarendra Pratap wrote: > >> It is not gu

Re: lucene anchor-distance based search

2010-11-17 Thread Anshum
wiki.apache.org/lucene-java/SpatialSearch For your understanding, you could have a look at the bounding box approach. -- Anshum Gupta http://ai-cafe.blogspot.com On Thu, Nov 18, 2010 at 7:38 AM, yang Yang wrote: > We are using the hibernate search which is based on lucene as the search > e
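To illustrate the bounding box approach mentioned above, here is a minimal sketch in plain Java. It is not Lucene's spatial API; the class `GeoBox` and its methods are hypothetical names made up for this example. The idea is to turn "within N km of a point" into two cheap range checks on latitude and longitude, and to run an exact distance check only on the documents that fall inside the box. The longitude span is an approximation, and the sketch ignores the poles and the 180° meridian.

```java
/** Hypothetical helper, not part of Lucene: a lat/lon box around a point. */
class GeoBox {
    static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius

    final double minLat, maxLat, minLon, maxLon;

    GeoBox(double minLat, double maxLat, double minLon, double maxLon) {
        this.minLat = minLat;
        this.maxLat = maxLat;
        this.minLon = minLon;
        this.maxLon = maxLon;
    }

    /** Builds an approximate box that covers a circle of radiusKm around (lat, lon). */
    static GeoBox around(double lat, double lon, double radiusKm) {
        // Angular radius converted to degrees of latitude.
        double dLat = Math.toDegrees(radiusKm / EARTH_RADIUS_KM);
        // A degree of longitude shrinks with cos(latitude), so the span grows.
        double dLon = Math.toDegrees(
                radiusKm / (EARTH_RADIUS_KM * Math.cos(Math.toRadians(lat))));
        return new GeoBox(lat - dLat, lat + dLat, lon - dLon, lon + dLon);
    }

    /** Cheap pre-filter: true if the point lies inside the box. */
    boolean contains(double lat, double lon) {
        return lat >= minLat && lat <= maxLat && lon >= minLon && lon <= maxLon;
    }
}
```

In an index, the four box limits would typically become range constraints on two numeric fields, which is why this approach is usually cheap compared with computing a distance per document.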

Re: What is the difference between the "AND" and "+" operator?

2010-11-29 Thread Anshum
eanQuery.html#setMinimumNumberShouldMatch(int)> Finally, it all depends on the case at hand and what you think is the expected behavior of search. Hope this helps. -- Anshum Gupta http://ai-cafe.blogspot.com On Mon, Nov 29, 2010 at 1:31 PM, yang Yang wrote: > What is the difference between the "

Re: What is the difference between the "AND" and "+" operator?

2010-11-30 Thread Anshum
with a single '=' :) -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Nov 30, 2010 at 3:03 PM, maven apache wrote: > 2010/11/30 Chris Hostetter > > > > > : Subject: What is the difference between the "AND" and "+" operator? > > > &

Re: field cross search in lucene

2010-11-30 Thread Anshum
You could change Occur.SHOULD to Occur.MUST for both fields. This should work for you if what I understood is what you wanted. -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Nov 30, 2010 at 5:12 PM, maven apache wrote: > Hi: I have two documents: > > title

Re: Editing StopWordList

2010-12-20 Thread Anshum
. -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Dec 21, 2010 at 9:20 AM, manjula wijewickrema wrote: > Hi, > > 1) In my application, I need to add more words to the stop word list. > Therefore, is it possible to add more words into the default lucene stop > word list?

Re: Editing StopWordList

2010-12-21 Thread Anshum
ase 2 below). -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Dec 21, 2010 at 3:54 PM, manjula wijewickrema wrote: > Hi Gupta, > > Thanx a lot for your reply. But I could not understand whether I could > modify (adding more words) to the default stop word list or should I have

Re: Using Lucene to search live, being-edited documents

2010-12-28 Thread Anshum
type. -- Anshum Gupta http://ai-cafe.blogspot.com On Wed, Dec 29, 2010 at 3:36 AM, software visualization < softwarevisualizat...@gmail.com> wrote: > This has probably been asked before but I couldn't find it, so... > > Is it possible / advisable / practical to use Lucene

Re: Using Lucene to search live, being-edited documents

2010-12-28 Thread Anshum
Hi Umesh, I'm not really confident that Zoie or anything built on the current version of Lucene would be able to handle search as you type kind of a setup. -- Anshum Gupta http://ai-cafe.blogspot.com On Wed, Dec 29, 2010 at 10:39 AM, Umesh Prasad wrote: > You can also look at Zoie an

Re: Lucene index

2010-12-29 Thread Anshum
page, starting at http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/index/IndexWriter.html#DEFAULT_RAM_BUFFER_SIZE_MB -- Anshum Gupta http://ai-cafe.blogspot.com On We

Re: Can lucene index survives a machine crash during the merge or optimize operation?

2010-12-29 Thread Anshum
. -- Anshum Gupta http://ai-cafe.blogspot.com On Wed, Dec 29, 2010 at 5:36 PM, Jiang mingyuan < mailtojiangmingy...@gmail.com> wrote: > Can lucene index survives a machine crash during the merge or optimize > operation? > > or can I stop the running index program during the

Re: Creating an index with multiple values for a single field

2011-01-07 Thread Anshum
Hi Ryan, You should try the synonym filter. That should help you with this kinda problem. You could also look at turning off norms for the name field, or turning off tf or idf. -- Anshum Gupta http://ai-cafe.blogspot.com On Sat, Jan 8, 2011 at 6:03 AM, Ryan Aylward wrote: > Our business ha

Re: Result ordering

2011-01-16 Thread Anshum
current query seems like you'd need more understanding on lucene and getting a copy of "Lucene In Action 2nd Ed<http://www.manning.com/hatcher3/>." would be a good idea for you and everyone in your position. Hope that helps. -- Anshum Gupta http://ai-cafe.blogspot.com On

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-18 Thread Anshum
mirrors them internally or via a downstream project) -- Anshum Gupta http://ai-cafe.blogspot.com

Re: Please Help

2011-01-20 Thread Anshum
erm). Something like an n-gram, and then treat those phrases as terms. Doing it at runtime would not be a feasible option. -- Anshum Gupta http://ai-cafe.blogspot.com On Thu, Jan 20, 2011 at 3:30 PM, Ashish Pancholi wrote: > > Using Lucene_3.0.3. we would like to implement following: > The

Re: Scale out design patterns

2011-01-20 Thread Anshum
imple mod of some numeric (auto increment) userid. This works well under normal cases unless your partitioning is not predictable. -- Anshum Gupta http://ai-cafe.blogspot.com On Fri, Jan 21, 2011 at 10:52 AM, Ganesh wrote: > Hello all, > > Could you any one guide me what all the various
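As a rough illustration of the "simple mod of a numeric userid" partitioning described above, the sketch below maps a user id to one of a fixed number of index shards. The class name `ShardRouter` and its method are assumptions made up for this example; nothing here comes from Lucene itself.

```java
/** Hypothetical helper: routes a user's documents to one of N index shards. */
class ShardRouter {
    private final int numShards;

    ShardRouter(int numShards) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        this.numShards = numShards;
    }

    /** Simple mod of a numeric (auto-increment) user id. */
    int shardFor(long userId) {
        // floorMod keeps the result in [0, numShards) even for negative ids.
        return (int) Math.floorMod(userId, (long) numShards);
    }
}
```

With sequential ids the shards fill evenly, and a query for one user only has to touch the single shard returned by `shardFor`. Changing the shard count later moves most users to a different shard, so the count is worth fixing up front.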

Re: lucene 3.0.3 | phrase query problem

2011-02-10 Thread Anshum
Hi Ranjit, That would be because all stop words (space, comma, the stop word set, etc.) would be treated in a similar fashion and skipped while indexing, depending on the analyzer you use to index your content. Hope that explains the issue. -- Anshum Gupta http://ai-cafe.blogspot.com On Thu, Feb

Re: where can i download a sample index

2011-02-13 Thread Anshum
Why don't you generate your own index from some sample docs or a dataset? That would give you a lot more flexibility to play around; otherwise, even if you get an index, you wouldn't know which analyzer etc. was used while indexing. -- Anshum Gupta http://ai-cafe.blogspot.com On Sun, Fe

Re: Multi Index Search Query

2011-02-14 Thread Anshum
If you actually intend to get the intersection of 2 results from a 'union' of 2 indexes, you could use the filter and query approach. You could use a MultiSearcher or a ParallelMultiSearcher to perform the search in this case. -- Anshum Gupta http://ai-cafe.blogspot.com On M

Re: Search in multiple indexes which have differnt field name

2011-02-14 Thread Anshum
Hi Liat, You could open a MultiSearcher/ParallelMultiSearcher on the indexes that you have and then construct an OR query, e.g. (contents:A OR text:A). I am assuming that the field names do not overlap. If that is not the case then you'd need another solution. -- Anshum Gupta http
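A small sketch of how the OR query from the reply above could be assembled when the same term has to be looked up under different field names. It only builds the query string in the classic query-parser syntax; the helper name `CrossFieldQuery.anyField` is hypothetical, and the term is assumed to be a single token with no characters that need escaping.

```java
/** Hypothetical helper: builds "(f1:term OR f2:term ...)" for a set of fields. */
class CrossFieldQuery {
    static String anyField(String term, String... fields) {
        if (fields.length == 0) {
            throw new IllegalArgumentException("at least one field is required");
        }
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(" OR "); // any one field matching is enough
            }
            // Assumption: term is a single, already-escaped token.
            sb.append(fields[i]).append(':').append(term);
        }
        return sb.append(')').toString();
    }
}
```

For the two indexes in the example, `CrossFieldQuery.anyField("A", "contents", "text")` yields `(contents:A OR text:A)`, which can then be parsed and run against the combined searcher.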

Re: construct a field without analyzer?

2011-02-14 Thread Anshum
KeywordAnalyzer()); In the above snip, I instantiate an analyzer which by default would use the StandardAnalyzer but for 'anotherfield' would use KeywordAnalyzer. Hope this helps you. -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Feb 15, 2011 at 2:19 AM, Yuhan Zhang wrote: >

Re: finding the length of a field

2011-02-28 Thread Anshum
Hi Lahiru, A few questions here. Why would you need that? Is the field stored? -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Mar 1, 2011 at 11:04 AM, Lahiru Samarakoon wrote: > Hi all, > > Is there a way to find the length of a field of a lucene index document? > > Thanks, > Lahiru >

Re: document object

2011-03-10 Thread Anshum
should help you. Otherwise, if you're using a very selective field that can be served through a FieldCache, it'd be a nice thing to do. Hope that helps. -- Anshum Gupta http://ai-cafe.blogspot.com On Thu, Mar 10, 2011 at 3:01 PM, suman.holani wrote: > > > Hi, > > >

Re: document object

2011-03-10 Thread Anshum
Depends on your data. I know that's a vague answer but that's the point. What you could do is use FieldCache if memory and data let you do so. Would it? -- Anshum Gupta http://ai-cafe.blogspot.com On Thu, Mar 10, 2011 at 3:12 PM, suman.holani wrote: > Hi Anshum, > > Than

Re: lucid gaze

2011-03-15 Thread Anshum
Hi Suman, I tried it a while ago. Found it nice and useful. You could get some hints on using it at http://ai-cafe.blogspot.com/2009/09/lucid-gaze-tough-nut.html (in case you need some ! :) ) -- Anshum Gupta http://ai-cafe.blogspot.com On Wed, Mar 16, 2011 at 11:37 AM, suman.holani wrote

Re: Is it possible to update only selected fields in a document ?

2011-03-22 Thread Anshum
Hi, No as of now, there's no way to do so. -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Mar 22, 2011 at 12:29 PM, shrinath.m wrote: > I am asking for partial update in Lucene, > where I want to update only a selected field of all fields in the document. > Does Lucene prov

Re: Is it possible to update only selected fields in a document ?

2011-03-22 Thread Anshum
Also, is there a particular reason why you wouldn't want to index that, considering you'd want to 'update' documents? It's good practice to index the unique field, especially if you have one. It has generally helped more often than not. -- Anshum Gupta http://ai-cafe.blogspot.

Re: Is it possible to update only selected fields in a document ?

2011-03-22 Thread Anshum
Yes, that's how it's generally done. Also, you should just handle data/fields aptly rather than trying to avoid them in the first place. You could safely add these, use these internally and never return these or use these for an end user search. -- Anshum Gupta http://ai-cafe.blogspot.com O

Re: how to get all documents in the results ?

2011-03-22 Thread Anshum
Hi Patrick, You may have a look at this, perhaps this will help you with it. Let me know if you're still stuck. http://stackoverflow.com/questions/3300265/lucene-3-iterating-over-all-hits -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Mar 22, 2011 at 4:10 PM, wrote: > Not s

Re: how to get all documents in the results ?

2011-03-22 Thread Anshum
So, a few things: 1. Are you looking to get 'all' documents or only docs matching your query? 2. If it's about fetching all docs, why not use the MatchAllDocsQuery? 3. Did you try using a Collector instead of TopDocs? -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Mar 22, 2011

Re: how to get all documents in the results ?

2011-03-22 Thread Anshum
u are trying to achieve. You may have a completely different option that you haven't read about, which someone could advise on if they know the exact intent. Hope this helps. -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Mar 22, 2011 at 4:59 PM, Patrick Diviacco < patrick.divia...@gmail.com

Re: how to get all documents in the results ?

2011-03-23 Thread Anshum
need to specify anything there. The below would work and get you all the docs in the index as the result (provided you specify a limit high enough for the numDocs to match param) *Query query = new MatchAllDocsQuery();* *searcher.search(query.);* Hope this clarifies your doubt. -- Anshum

Re: how to get all documents in the results ?

2011-03-23 Thread Anshum
So functionally I am assuming you've achieved what you'd been aiming for. About the scores, the matchalldocs does score docs based on norm factors etc. therefore the score wouldn't be 0. -- Anshum Gupta http://ai-cafe.blogspot.com On Wed, Mar 23, 2011 at 1:38 PM, Patrick Diviacco

Re: Update Document based on Query instead of Term

2011-04-13 Thread Anshum
So an update is basically nothing but a delete and an add (of a fresh doc). You could just go ahead and use the deleteDocuments(Query query) function and then add the new document. That is the general approach for such cases and it works just fine. -- Anshum Gupta http://ai-cafe.blogspot.com On

Re: Choosing boosting in Lucene

2011-04-18 Thread Anshum
ts you the best. Relevance, or apt boost values, can again be figured out by varying the boost via trial and error. That is pretty much general practice. Hope this helps you figure out a reasonable solution and boost values. -- Anshum Gupta http://ai-cafe.blogspot.com O

Re: Calculate document lucene score after the search

2011-04-18 Thread Anshum
Hi Madhu, You could use IndexSearcher.explain(..) to explain the result and get the detailed breakup of the score. That should probably help you with understanding the boost and score as calculated by lucene for your app. -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Apr 19, 2011 at 2:32

Re: best practice for reusing documents with multi-valued fields

2011-04-18 Thread Anshum
; ScoreDoc[] sd = is.search(query, 10).scoreDocs; for(ScoreDoc scoreDoc:sd){ System.out.println(ir.document(scoreDoc.doc)); } is.close(); ir.close(); iw.close(); *--Snip--* -- Anshum Gupta http://ai-cafe.blogspot.com On Fri, Apr 15,

Re: Lucene: Indexsearcher: java.lang.UnsupportedOperationException

2011-04-19 Thread Anshum
Could you also print and send the entire stack-trace? Also, the query.toString() -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Apr 19, 2011 at 7:40 PM, Patrick Diviacco < patrick.divia...@gmail.com> wrote: > I get the following error message: java.lang.UnsupportedOperation

Re: Lucene Indexing

2011-06-06 Thread Anshum
ency. Even the updateDocument function as of now would internally delete the document and add the new supplied document. Hope this answer helps. -- Anshum Gupta http://ai-cafe.blogspot.com On Mon, Jun 6, 2011 at 11:59 AM, Pranav goyal wrote: > Hi all, > > I am a newbie to lucene. >

Re: Lucene Document No

2011-06-06 Thread Anshum
achieve/target. -- Anshum Gupta http://ai-cafe.blogspot.com On Mon, Jun 6, 2011 at 4:41 PM, Pranav goyal wrote: > Hi all, > > Is there any way to change my lucene document no? > Like if I can change my lucene document no's with con_key. > > I am a newbie and don't k

Re: Lucene Indexing

2011-06-06 Thread Anshum
Yes, You'd need to delete the document and then re-add a newly created document object. You may use the key and delete the doc using the Term(key, value). -- Anshum Gupta http://ai-cafe.blogspot.com On Mon, Jun 6, 2011 at 4:45 PM, Pranav goyal wrote: > Hi Anshum, > > Thanks fo

Re: Lucene Result

2011-06-07 Thread Anshum
field or any other field from the 'search' method. Also, I'd suggest you to grab a copy of Lucene in Action 2nd Edition as it'd help you a lot in understanding the way Lucene works/is used. -- Anshum Gupta http://ai-cafe.blogspot.com On Wed, Jun 8, 2011 at 11:00 AM, Pranav goya

Re: Is There a Way To Split The Lucene Index Segments To Samller Size Less Than 1 GB

2011-07-27 Thread Anshum
other hand, why do you want to split a 9G index? Is there a reason? performance issue? It'd be good if you could share the reason as the problem could be completely different. -- Anshum Gupta http://ai-cafe.blogspot.com 2011/7/27 Gudi, Ravi Sankar > Hi Lucene Team, > > If you know

Re: Finding match term positions in the document

2011-10-28 Thread Anshum
Hi Vidya, Perhaps this could help you: http://hrycan.com/2009/10/25/lucene-highlighter-howto/ -- Anshum Gupta http://ai-cafe.blogspot.com On Fri, Oct 28, 2011 at 2:18 PM, Vidya Kanigiluppai Sivasubramanian < vidya...@hcl.com> wrote: > Hi, > > I am using lucene 2.4.1 in my proje

Re: How to search

2008-08-25 Thread Anshum
Hi, You could use wildcard queries in that case (in case I got you right). Though because of the way the indexed terms are stored, a *word style query would not be advisable, whereas a word* style query would be doable in a real-world environment. Hope this answers your question. -- Anshum Gupta

Re: How to search

2008-08-25 Thread Anshum
. Else if you are so bent on wanting that, you may as well write an intelligent indexing wrapper to meet your needs (which would again require a lot of time and effort) -- Anshum On Mon, Aug 25, 2008 at 5:35 PM, Venkata Subbarayudu <[EMAIL PROTECTED]>wrote: > > Hi Anshum Gupta, >

Re: Concurrent search

2008-10-03 Thread Anshum
Hi Camello, Opening index searchers for the same index directory is pretty straightforward. In other words, just open multiple searchers on the same index location and it would work fine. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the

Re: Document larger than setRAMBufferSizeMB()

2008-10-03 Thread Anshum
after doing whatever you wish to do using an exception handling block. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw On Fri, Oct 3, 2008 at 1:56 PM, Aditi Goyal <[EMAIL P

Re: Single searcher vs Multi Searcher

2008-10-03 Thread Anshum
to be document based (which I guess it should be as otherwise you would have to build a complete distributed system) 3. Do you plan to put your indexes in RAM or on (physically) separate HDDs? Though all said and done, sharded indexes are a good approach, if done the right way. -- Anshum Gupta

Re: Single searcher vs Multi Searcher

2008-10-06 Thread Anshum
hassles like the one mentioned above about moving data between indexes/DB). -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw On Mon, Oct 6, 2008 at 10:06 AM, Ganesh <[

Re: Single searcher vs Multi Searcher

2008-10-07 Thread Anshum
point out). I guess you should try it as speed of search is not really all that important to you as compared to running it on a single box within the memory limitation. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The d

Re: Single searcher vs Multi Searcher

2008-10-09 Thread Anshum
ns (though they are not that good) you could have a look at them. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw On Wed, Oct 8, 2008 at 5:39 PM, Ganesh <[EMAIL PROTECTED

Re: distinct field values

2008-10-14 Thread Anshum
Hi, You could try changing (or extending) TopFieldDocCollector and do your processing there (that is what I tried... and it worked fine). But that would mean changing lucene code a little bit. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody

Re: distinct field values

2008-10-14 Thread Anshum
You could go through this implementation. Have been using this (improvised) for a while now. There might be better ways to do so too. so you could check! http://www.gossamer-threads.com/lists/lucene/java-user/35704?search_string=categorycounts;#35704 -- Anshum Gupta Naukri Labs! http://ai

Re: Searching sets of documents

2008-10-14 Thread Anshum
. In case you only want to search for docs in folder A you could run a search only on the indexed 'Folder' field. i.e. Folder:A Hope this solves the problem. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The

Re: IndexSearcher update

2008-10-16 Thread Anshum
Yes you can! :) Very normally. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw On Thu, Oct 16, 2008 at 3:43 PM, mahdi yari <[EMAIL PROTECTED]> wrote: > hi dears >

Re: IndexSearcher update

2008-10-16 Thread Anshum
Yes you may do that as well... no updates are noted by the searcher until it (the searcher) is updated :) -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw On Thu, Oct 16, 2008

Re: Lucene Index taking a lot to time

2008-10-28 Thread Anshum
picks up older lucene, thereby taking more time as compared to the older thing(which had no conflict in the jar) -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw On Wed, Oct 29,

Re: Querying wildcard

2008-10-29 Thread Anshum
ecause the index terms are lexically sorted when stored, so the seek/fetch is efficient in normal cases (and not in the case of a prefix wildcard, i.e. a wildcard at the start of the term). -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction i
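The point about lexically sorted terms can be sketched with a plain `TreeSet` standing in for the term dictionary. This is only an analogy under that assumption, not Lucene's actual on-disk structure, and the class `SortedTermDictionary` is a made-up name. A trailing wildcard can seek straight to the prefix and stop as soon as terms no longer match, while a leading wildcard has no starting point and must inspect every term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical stand-in for a sorted term dictionary. */
class SortedTermDictionary {
    private final TreeSet<String> terms = new TreeSet<>();

    void add(String term) {
        terms.add(term);
    }

    /** word* : seek to the prefix, read until terms stop matching. */
    List<String> trailingWildcard(String prefix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order guarantees no later term can match
            }
            hits.add(term);
        }
        return hits;
    }

    /** *word : no seek is possible, so every term is inspected. */
    List<String> leadingWildcard(String suffix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (term.endsWith(suffix)) {
                hits.add(term);
            }
        }
        return hits;
    }
}
```

The first method touches only the matching terms plus one; the second always walks the whole dictionary, which is where the cost of a leading wildcard comes from.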

Re: Querying wildcard

2008-10-30 Thread Anshum
indexing) and search for reverse string while searching. Eric might have a better solution though ! :) Let me know if that solves the issue. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw
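The reversed-string idea above can be sketched as follows: index each term a second time in reversed form, then rewrite a leading-wildcard pattern into a trailing-wildcard pattern against that reversed field. The class `ReversedTerms` and its method names are hypothetical, and the sketch assumes a pattern with exactly one wildcard, at the start.

```java
/** Hypothetical helper for the reversed-field trick. */
class ReversedTerms {
    /** Applied at index time to produce the value for the reversed field. */
    static String reverse(String term) {
        return new StringBuilder(term).reverse().toString();
    }

    /** Rewrites "*ing" into "gni*", to be run against the reversed field. */
    static String toReversedPattern(String leadingWildcard) {
        if (leadingWildcard.length() < 2 || leadingWildcard.charAt(0) != '*') {
            throw new IllegalArgumentException("expected a pattern like *word");
        }
        // Drop the leading '*', reverse the rest, and put the '*' at the end.
        return reverse(leadingWildcard.substring(1)) + "*";
    }
}
```

A term such as `indexing` would be stored as `gnixedni` in the reversed field, so the rewritten pattern `gni*` matches it with an ordinary prefix seek. The price is a second field and a larger index.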

Re: possible score value

2008-11-06 Thread Anshum
scorer). Hope this clarifies things! :) -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw On Thu, Nov 6, 2008 at 4:20 PM, Francisco Borges <[EMAIL PROTECTED] > wrote: &

Re: Storing part of the field

2008-11-14 Thread Anshum
Hi Ravi, In that case, you could have 2 fields. One of them would be indexed (i.e. "foo bar") and you could use the other only to store as per your logic. Hope this solves your purpose. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to ever

Re: Reg two versions of lucene on the same machine

2008-11-18 Thread Anshum
t directory, in which case you would have issues placing the 2 jars. It would be better if you completely removed the lucene jars from the implicitly included library dir and placed them in a different folder (and included that in your classpath). Hope that solves at least a bit of your doubt! -- Ansh

Re: Modify index contents

2008-11-20 Thread Anshum
Hi Chris, It's still the same model as far as I know (always better to add the disclaimer :), though I'm pretty sure) -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw

Re: can I set Boost to the term while indexing?

2008-11-20 Thread Anshum
eld in a document by a particular boostfactor. By default all boosts are set to 1.0 in lucene. The field.setBoost would multiply the score of all matching docs by this factor while calculating relevance. Hope this solves your issue. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts e

Re: [ot] a reverse lucene

2008-11-22 Thread Anshum
Hi Ian, I guess that could be achieved if you write code to read the queries and query for each document (using lucene). Assuming that I got the question right! :) -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The

Re: lucene search options

2008-12-06 Thread Anshum
myword" AND NOT filed1:jakarta This is just one of the solution, though I still would not understand if there's a logical reason for fetching such results.:) -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. Th

Re: IDF scoring issue

2008-12-16 Thread Anshum
Hi Rajiv, If 'm interpreting your problem correctly, I'd suggest you to try using a phraseQuery with an appropriate slop value. Though again it depends on what is it that you exactly are trying to fetch. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here

Re: Lucene search question

2009-01-19 Thread Anshum
slop value (using ~) in that case while forming the query, also you could form an 'all words' query i.e. Boolean 'AND' or (MUST clause in case of lucene). so that your search results include all documents containing ALL the searched tokens. Hope this clears it up! -- Anshum Gup

Re: Lucene Indexing and Search Policy

2009-01-21 Thread Anshum
Hi msr, Perhaps this could be useful for you. Lucene implements a modified vector space model in short. http://jayant7k.blogspot.com/2006/07/document-scoringcalculating-relevance_08.html -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the

Re: Lucene Indexing and Search Policy

2009-01-21 Thread Anshum
It's about building a custom similarity class that scores using your normalization factors etc. This might help in that case, http://www.gossamer-threads.com/lists/lucene/java-user/69553 -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the

Re: indexing binary files?

2009-01-29 Thread Anshum
Hi Paul, Lucene is a 'text only' search lib, i.e. as long as you feed in anything as a string, you'd be able to use lucene; otherwise I don't think there's a way. How do you even intend to search in those binary files? As in... what would be the keyword/phrase? Asking out of cu

Re: Using Lucene for user query parsing

2009-03-05 Thread Anshum
other). As per my knowledge lucene should be a better solution than anything else that 'I know of' for such a thing, but there'd be a few things that you would have to build yourself as well. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to

Re: Can you create a RAM index from a file index

2009-03-24 Thread Anshum
e same amount of 'free' ram to copy the index onto the RAM. You could then open your reader in the regular fashion straight off the RAM based tmpfs. You could also go through the archives for suggestions. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here

Re: Can you create a RAM index from a file index

2009-03-24 Thread Anshum
setup). -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw On Wed, Mar 25, 2009 at 11:37 AM, Ganesh wrote: > FileSystem index reader loads the data to RAM, I have tried with more tha

Re: ebook resources - including lucene in action

2009-04-21 Thread Anshum
such websites here, but not to pick pirated versions of these amazing works but for people here to jointly take action against such a thing. Let us rightfully support open source. -- Anshum Gupta Naukri Labs! while(1) { source="open"; } http://ai-cafe.blogspot.com The facts expressed here

Re: [ no subject ]

2009-04-30 Thread Anshum
creates a volatile index, runs a query, returns the similarity and clears the index (which would happen implicitly in case of a ramdir approach). -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is you

Re: interpreting scores

2009-05-06 Thread Anshum
ing which could use score as a metric to absolutely cluster 'good' and 'not good' matches. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw On Thu, May 7, 2009

Re: How can i restrict search to only some documents.

2009-05-13 Thread Anshum
doc. Hope this helps. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw On Wed, May 13, 2009 at 2:15 PM, Velaboy V wrote: > Hi, > > I am new to Lucene. > > I h

Re: Help with phrase indexing

2009-05-13 Thread Anshum
a synonym analyzer so that *cold icecream* is also indexed (and so is searchable) as *chilled icecream*. This is completely doable using Lucene. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is you

Re: Help with phrase indexing

2009-05-17 Thread Anshum
Hi, Actually I'd really suggest that you 'buy' a copy of Lucene In Action - 2nd Edition. It's currently available as MEAP and it's amazing. Perhaps the prices are also down 40% or something, though I'm not really sure about it. -- Anshum Gupta Naukri Labs! http://ai-cafe.blo
