Please submit your talks here -
https://communityovercode.org/call-for-presentations/
We hope to see many of you talk about Search in Denver!
--
Anshum Gupta
e you all there.
--
Anshum Gupta
stions or ideas! Hope to
see you all there.
--
Anshum Gupta
e audience.
Good luck!
-Anshum
On Wed, Mar 30, 2022 at 5:47 AM Michael Wechner
wrote:
> Hi Together
>
> I would be interested to submit a proposal/presentation re Lucene's
> vector search, but would like to ask first whether somebody else wants
> to do this as well or
website - https://www.apachecon.com/acah2021/index.html
Registration - https://hopin.com/events/apachecon-2021-home
Slack - http://s.apache.org/apachecon-slack
Search Track - https://www.apachecon.com/acah2021/tracks/search.html
See you all at ApacheCon 2021!
-Anshum
unsubscribe, e-mail: announce-unsubscr...@apachecon.com
For additional commands, e-mail: announce-h...@apachecon.com
--
Anshum Gupta
can continue to expect critical bug fixes for releases
previously made under the Apache Lucene project.
We will send another update as the mailing lists and website are set up for
the Solr project.
-Anshum
On behalf of the Apache Lucene and Solr PMC
://www.apachecon.com/acah2020/tracks/search.html
See you at ApacheCon.
--
Anshum Gupta
> > https://lucene.apache.org/theme/images/lucene/lucene_logo_green_300.png
> >
> > Please vote for one of the above choices. This vote will close about one
> > week from today, Mon, Sept 7, 2020 at 11:59PM.
> >
> > Thanks!
> >
> > [jira-issue] https://issues.apache.org/jira/browse/LUCENE-9221
> > [first-vote]
> >
> http://mail-archives.apache.org/mod_mbox/lucene-dev/202006.mbox/%3cCA+DiXd74Mz4H6o9SmUNLUuHQc6Q1-9mzUR7xfxR03ntGwo=d...@mail.gmail.com%3e
> > [second-vote]
> >
> http://mail-archives.apache.org/mod_mbox/lucene-dev/202009.mbox/%3cCA+DiXd7eBrQu5+aJQ3jKaUtUTJUqaG2U6o+kUZfNe-m=smn...@mail.gmail.com%3e
> > [rank-choice-voting] https://en.wikipedia.org/wiki/Instant-runoff_voting
> >
>
--
Anshum Gupta
mirroring network for
distributing releases. It is possible that the mirror you are using may not
have replicated the release yet. If that is the case, please try another
mirror. This also applies to Maven access.
ReleaseNote70 (last edited 2017-09-20 10:27:30 by AnshumGupta
<https://wiki.apache.org/lucene-java/AnshumGupta>)
Anshum Gupta
try another mirror. This also goes for Maven access.
-Anshum Gupta
replicated the release yet. If that is the case, please
try another mirror. This also goes for Maven access.
--
Anshum Gupta
Apache Software Foundation uses an extensive mirroring network
for distributing releases. It is possible that the mirror you are using
may not have replicated the release yet. If that is the case, please try
another mirror. This also goes for Maven access.
--
Anshum Gupta
hindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -Original Message-
> > From: Anshum Gupta [mailto:ans...@anshumgupta.net]
> > Sent: Friday, February 20, 2015 9:55 PM
> > To: d...@lucene.apache.org; ge
list of new features and notes on upgrading.
Please report any feedback to the mailing lists (
http://lucene.apache.org/core/discussion.html)
--
Anshum Gupta
http://about.me/anshumgupta
Just an update: the meetup has been rescheduled due to some venue-availability issues and is now on the 8th of June.
On Tue, May 21, 2013 at 11:58 AM, Anshum Gupta wrote:
> Hi folks,
>
> We just created a new meetup group for all Lucene/Solr enthusiasts in and
> around Bang
the first meetup event:
http://www.meetup.com/Bangalore-Apache-Solr-Lucene-Group/events/113806762/ .
--
Anshum Gupta
http://www.anshumgupta.net
Hi Vidya,
Perhaps this could help you:
http://hrycan.com/2009/10/25/lucene-highlighter-howto/
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Fri, Oct 28, 2011 at 2:18 PM, Vidya Kanigiluppai Sivasubramanian <
vidya...@hcl.com> wrote:
> Hi,
>
> I am using lucene 2.4.1 in my proje
other hand, why do you want to split a 9G index? Is there a reason? A performance issue? It'd be good if you could share the reason, as the problem could be completely different.
--
Anshum Gupta
http://ai-cafe.blogspot.com
2011/7/27 Gudi, Ravi Sankar
> Hi Lucene Team,
>
> If you know
field or any other field from the
'search' method.
Also, I'd suggest you grab a copy of Lucene in Action, 2nd Edition, as it'd help you a lot in understanding how Lucene works and is used.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Jun 8, 2011 at 11:00 AM, Pranav goya
Yes,
You'd need to delete the document and then re-add a newly created document
object. You may use the key and delete the doc using the Term(key, value).
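A minimal sketch of that delete-then-re-add flow (assuming the Lucene 3.x-era API discussed on this thread; the field names "con_key" and "body" are hypothetical):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class ReplaceDoc {
    // Delete the old document by its unique key, then add a fresh one.
    static void replace(IndexWriter writer, String key, String newBody)
            throws Exception {
        // Removes every document whose key field matches the value.
        writer.deleteDocuments(new Term("con_key", key));

        // Re-add a newly created Document carrying the same key.
        Document doc = new Document();
        doc.add(new Field("con_key", key,
                Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("body", newBody,
                Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc);
        writer.commit();
    }
}
```

IndexWriter.updateDocument(Term, Document) wraps essentially this same delete-and-add sequence.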
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Jun 6, 2011 at 4:45 PM, Pranav goyal wrote:
> Hi Anshum,
>
> Thanks fo
achieve/target.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Jun 6, 2011 at 4:41 PM, Pranav goyal wrote:
> Hi all,
>
> Is there any way to change my lucene document no?
> Like if I can change my lucene document no's with con_key.
>
> I am a newbie and don't k
ency.
Even the updateDocument method, as of now, internally deletes the document and adds the newly supplied one.
Hope this answer helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Jun 6, 2011 at 11:59 AM, Pranav goyal wrote:
> Hi all,
>
> I am a newbie to lucene.
&g
Could you also print and send the entire stack trace?
Also, the output of query.toString()?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Apr 19, 2011 at 7:40 PM, Patrick Diviacco <
patrick.divia...@gmail.com> wrote:
> I get the following error message: java.lang.UnsupportedOperation
;
ScoreDoc[] sd = is.search(query, 10).scoreDocs;
for (ScoreDoc scoreDoc : sd) {
    System.out.println(ir.document(scoreDoc.doc));
}
is.close();
ir.close();
iw.close();
*--Snip--*
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Fri, Apr 15,
Hi Madhu,
You could use IndexSearcher.explain(..) to explain the result and get the
detailed breakup of the score. That should probably help you with
understanding the boost and score as calculated by lucene for your app.
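As a sketch of that (assuming an already-open IndexSearcher and a prepared Query against the Lucene API of this era; the 10-hit limit is arbitrary):

```java
import org.apache.lucene.search.Explanation;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;

public class ExplainScores {
    // Print the scoring breakdown (tf, idf, boosts, norms) for each hit.
    static void explainTopHits(IndexSearcher searcher, Query query)
            throws Exception {
        for (ScoreDoc hit : searcher.search(query, 10).scoreDocs) {
            Explanation exp = searcher.explain(query, hit.doc);
            System.out.println("doc " + hit.doc + ":\n" + exp.toString());
        }
    }
}
```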
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Apr 19, 2011 at 2:32
ts you the best.
Relevance, or an apt set of boost values, can again be figured out by varying the boost via *trial and error*; that is pretty much standard practice.
Hope this helps you figure out a reasonable solution and boost values.
--
Anshum Gupta
http://ai-cafe.blogspot.com
O
So an update is basically nothing but a delete and an add (of a fresh doc). You could just go ahead and use the deleteDocuments(Query query) method and then add the new document. That is the general approach for such cases, and it works just fine.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On
So functionally, I am assuming you've achieved what you'd been aiming for.
About the scores: MatchAllDocsQuery does score docs based on norm factors
etc., therefore the score wouldn't be 0.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Mar 23, 2011 at 1:38 PM, Patrick Diviacco
need to specify anything there.
The snippet below would work and get you all the docs in the index as the result (provided you specify a limit high enough to match numDocs):
*Query query = new MatchAllDocsQuery();*
*TopDocs results = searcher.search(query, numDocs);*
Hope this clarifies your doubt.
--
Anshum
u are
trying to achieve. You may have a completely different option that you haven't read about, which someone could advise on if they knew the exact intent.
Hope this helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 22, 2011 at 4:59 PM, Patrick Diviacco <
patrick.divia...@gmail.com
So, a few things:
1. Are you looking to get 'all' documents, or only docs matching your query?
2. If it's about fetching all docs, why not use MatchAllDocsQuery?
3. Did you try using a Collector instead of TopDocs?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 22, 2011
Hi Patrick,
You may have a look at this; perhaps it will help you with it. Let me know
if you're still stuck.
http://stackoverflow.com/questions/3300265/lucene-3-iterating-over-all-hits
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 22, 2011 at 4:10 PM, wrote:
> Not s
Yes, that's how it's generally done. Also, you should just handle such data/fields aptly rather than trying to avoid them in the first place. You could safely add these, use them internally, and never return them or use them for an end-user search.
--
Anshum Gupta
http://ai-cafe.blogspot.com
O
Also, is there a particular reason why you wouldn't want to index that, considering you'd want to 'update' documents? It's good practice to index the unique field, especially if you have one. It has generally helped more often than not.
--
Anshum Gupta
http://ai-cafe.blogspot.
Hi,
No, as of now there's no way to do so.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 22, 2011 at 12:29 PM, shrinath.m wrote:
> I am asking for partial update in Lucene,
> where I want to update only a selected field of all fields in the document.
> Does Lucene prov
Hi Suman,
I tried it a while ago. Found it nice and useful.
You could get some hints on using it at
http://ai-cafe.blogspot.com/2009/09/lucid-gaze-tough-nut.html (in case you
need some ! :) )
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Mar 16, 2011 at 11:37 AM, suman.holani wrote
Depends on your data. I know that's a vague answer, but that's the point.
What you could do is use a FieldCache, if memory and data let you do so. Would
they?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Mar 10, 2011 at 3:12 PM, suman.holani wrote:
> Hi Anshum,
>
> Than
should help you. Also, if you're otherwise using a very selective field
that can be served through a FieldCache, that'd be a nice thing to do.
Hope that helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Mar 10, 2011 at 3:01 PM, suman.holani wrote:
>
>
> Hi,
>
>
>
Hi Lahiru,
A few questions here.
Why would you need that? Is the field stored?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 1, 2011 at 11:04 AM, Lahiru Samarakoon wrote:
> Hi all,
>
> Is there a way to find the length of a field of a lucene index document?
>
> Thanks,
> Lahiru
>
KeywordAnalyzer());
In the above snippet, I instantiate an analyzer which by default would use the
StandardAnalyzer, but for 'anotherfield' would use the KeywordAnalyzer.
Hope this helps you.
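The full construction probably looked something like this (a sketch against the Lucene 3.x analyzer API; 'anotherfield' is the example field name from above):

```java
import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.util.Version;

public class PerFieldSetup {
    static PerFieldAnalyzerWrapper build() {
        // StandardAnalyzer is the default for every field...
        PerFieldAnalyzerWrapper analyzer =
            new PerFieldAnalyzerWrapper(new StandardAnalyzer(Version.LUCENE_30));
        // ...except 'anotherfield', which is kept as a single token.
        analyzer.addAnalyzer("anotherfield", new KeywordAnalyzer());
        return analyzer;
    }
}
```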
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Feb 15, 2011 at 2:19 AM, Yuhan Zhang wrote:
>
Hi Liat,
You could open a MultiSearcher/ParallelMultiSearcher on the indexes that you
have and then construct an OR query, e.g. (contents:A OR text:A).
I am assuming that the field names do not overlap. If that is not the case,
then you'd need another solution.
--
Anshum Gupta
http
If you actually intend to get the intersection of two results from a
'union' of two indexes, you could use the filter-and-query approach. You could
use a MultiSearcher or a ParallelMultiSearcher to perform the search in
this case.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On M
Why don't you generate your own index from some sample docs or a dataset? That
would give you a lot more flexibility to play around with; otherwise, even if
you get an index, you wouldn't have info on the analyzer used, etc., while indexing.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Sun, Fe
Hi Ranjit,
That would be because all stop words (space, comma, the stop-word set, etc.)
would be treated in a similar fashion and stripped while indexing, subject to
the analyzer you use while indexing your content.
Hope that explains the issue.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Feb
imple mod of
some numeric (auto increment) userid.
This works well in normal cases, as long as your partitioning is
predictable.
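The mod-based routing described above can be sketched in plain Java (the shard count of 4 is an assumed example):

```java
public class ShardRouter {
    // Route a document to one of N index shards by a simple mod of a
    // numeric auto-increment userid.
    static int shardFor(long userId, int numShards) {
        return (int) (userId % numShards);
    }

    public static void main(String[] args) {
        System.out.println(shardFor(1001L, 4)); // prints 1
        System.out.println(shardFor(8L, 4));    // prints 0
    }
}
```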
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Fri, Jan 21, 2011 at 10:52 AM, Ganesh wrote:
> Hello all,
>
> Could you any one guide me what all the various
erm). Something of an
ngram, and then treat those phrases as terms.
Doing it at runtime would not be a feasible option.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Jan 20, 2011 at 3:30 PM, Ashish Pancholi wrote:
>
> Using Lucene_3.0.3. we would like to implement following:
> The
mirrors them internally or via a
downstream project)
--
Anshum Gupta
http://ai-cafe.blogspot.com
current query, it seems you'd need more understanding of Lucene, and
getting a copy of "Lucene in Action, 2nd Ed <http://www.manning.com/hatcher3/>"
would be a good idea for you and everyone in your position.
Hope that helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On
Hi Ryan,
You should try the synonym filter. That should help you with this kinda
problem.
You could also look at turning off norms for the name field, or turning off
tf or idf.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Sat, Jan 8, 2011 at 6:03 AM, Ryan Aylward wrote:
> Our business ha
.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Dec 29, 2010 at 5:36 PM, Jiang mingyuan <
mailtojiangmingy...@gmail.com> wrote:
> Can lucene index survives a machine crash during the merge or optimize
> operation?
>
> or can I stop the running index program during the
page, starting at
http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/index/IndexWriter.html#DEFAULT_RAM_BUFFER_SIZE_MB
<http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/index/IndexWriter.html#DEFAULT_RAM_BUFFER_SIZE_MB>
--
Anshum Gupta
http://ai-cafe.blogspot.com
On We
Hi Umesh,
I'm not really confident that Zoie, or anything built on the current version
of Lucene, would be able to handle a search-as-you-type kind of setup.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Dec 29, 2010 at 10:39 AM, Umesh Prasad wrote:
> You can also look at Zoie an
type.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Dec 29, 2010 at 3:36 AM, software visualization <
softwarevisualizat...@gmail.com> wrote:
> This has probably been asked before but I couldn't find it, so...
>
> Is it possible / advisable / practical to use Lucene
ase 2
below).
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Dec 21, 2010 at 3:54 PM, manjula wijewickrema
wrote:
> Hi Gupta,
>
> Thanx a lot for your reply. But I could not understand whether I could
> modify (adding more words) to the default stop word list or should I have
.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Dec 21, 2010 at 9:20 AM, manjula wijewickrema
wrote:
> Hi,
>
> 1) In my application, I need to add more words to the stop word list.
> Therefore, is it possible to add more words into the default lucene stop
> word list?
You could change Occur.SHOULD to Occur.MUST for both fields.
This should work for you if what I understood is what you wanted.
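A sketch of that two-required-clause query (the field and term values here are made up):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;

public class BothFieldsRequired {
    static BooleanQuery build() {
        BooleanQuery q = new BooleanQuery();
        // MUST on both clauses: a doc has to match the title term AND the
        // content term (SHOULD would make either clause sufficient).
        q.add(new TermQuery(new Term("title", "lucene")), Occur.MUST);
        q.add(new TermQuery(new Term("content", "search")), Occur.MUST);
        return q;
    }
}
```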
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Nov 30, 2010 at 5:12 PM, maven apache wrote:
> Hi: I have two documents:
>
> title
with a single '=' :)
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Nov 30, 2010 at 3:03 PM, maven apache wrote:
> 2010/11/30 Chris Hostetter
>
> >
> > : Subject: What is the difference between the "AND" and "+" operator?
> >
> &
eanQuery.html#setMinimumNumberShouldMatch(int)>
Finally, all would depend on the case at hand and what you think is the
expected behavior of search.
Hope this helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Nov 29, 2010 at 1:31 PM, yang Yang wrote:
> What is the difference between the &qu
wiki.apache.org/lucene-java/SpatialSearch
For your understanding, you could have a look at the bounding box approach.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Nov 18, 2010 at 7:38 AM, yang Yang wrote:
> We are using the hibernate search which is based on lucene as the search
> e
index and
the source.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Nov 17, 2010 at 1:36 PM, Lance Norskog wrote:
> The Lucene CheckIndex program does this. It is a class somewhere in Lucene
> with a main() method.
>
>
> Samarendra Pratap wrote:
>
>> It is not gu
ndex. This would also give you a fair idea
of the index state.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Nov 16, 2010 at 11:36 AM, Yakob wrote:
> hello all,
> I would like to ask about lucene index. I mean I created a simple
> program that created lucene indexes and stored it
Hi Nilesh,
No, you can't do that, though you may store your own id as a separate field
for whatever purpose you want. I don't see any reason why you'd essentially
want to override the Lucene document id with your own. Let me know in case
there's something I didn't get.
cord 1.
Also, while searching, you may tokenize on a comma or whatever set of chars
you find appropriate.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Oct 19, 2010 at 8:59 PM, Jasper de Barbanson <
lucene-mailingl...@de-barbanson.com> wrote:
> I'm currently working on buil
to begin, you may look at SOLR, which
provides an out of the box engine.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Oct 13, 2010 at 8:57 AM, Hyun Joo Noh wrote:
> Hi, how would you make Lucene leave a search log of
> who searched what, when, etc (i.e. cookie, query, timestamp, etc
Version? Machine and JVM (32/64-bit)?
This most probably is a code-level issue rather than a Lucene one, but I
may be wrong.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Oct 13, 2010 at 8:08 AM, Ching wrote:
> Hi All,
>
> Can anyone help with this issue? I have about 2000 pdf fil
ParallelReader, though it sounds useful in theory; I doubt the overhead of
maintaining and synchronizing the document ids would be worth it. I
haven't used it so far; perhaps someone who has used the ParallelReader for
such a purpose at production scale may help you.
--
An
on for you wanting to do so? is it that you
only index data coming from a stream and you don't have access to the
original source at a later time?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Oct 12, 2010 at 11:35 AM, Nilesh Vijaywargiay <
nilesh.vi...@gmail.com> wrote:
> Hi
this is what you intended!
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Sep 30, 2010 at 11:54 PM, Sahin Buyrukbilen <
sahin.buyrukbi...@gmail.com> wrote:
> Hi all,
>
> I need to get the first term in my index and iterate it. Can anybody help
> me?
>
> Best.
>
reclaiming lost disc
space.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Aug 24, 2010 at 9:22 AM, Justin wrote:
> My actual code did not call expungeDeletes every time through the loop;
> however,
> calling expungeDeletes or optimize after the loop means that the index has
> dou
ngedeletes().
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Aug 24, 2010 at 4:38 AM, Justin wrote:
> In an attempt to avoid doubling disk usage when adding new fields to all
> existing documents, I added a call to IndexWriter::expungeDeletes. Then my
> colleague pointed out that Luce
There is bound to be I/O contention. I'm sure iostat will give you a much
better picture of it.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Aug 23, 2010 at 3:13 PM, wrote:
> Yes, all version directories are on the same disk. iostat output should be
> useful. Using rsync is
Seems like a case of I/O contention. You may be reading content off the index
while performing searches, while the I/O for the copy is also happening.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Aug 23, 2010 at 1:12 PM, wrote:
>
> Hi all,
>
>
> We're observing search
comfortably. Btw,
are you facing any issues with sort time, or is it a presumption?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Aug 18, 2010 at 5:12 PM, Shelly_Singh wrote:
> Hi,
>
> I have a Lucene index that contains a numeric field along with certain
> other fields. The order
for your application?
--
Anshum Gupta
http://ai-cafe.blogspot.com
2010/8/17 xiaoyan Zheng
> the question is like this:
>
> when one user is using IndexWirter.addDocument(doc), and another user has
> already finished adding part and have closed IndexWirter, then, the first
> u
So, you didn't really use the setRamBuffer.. ?
Any reasons for that?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Aug 11, 2010 at 10:28 AM, Shelly_Singh wrote:
> My final settings are:
> 1. 1.5 gig RAM to the jvm out of 2GB available for my desktop
> 2. 100GB d
that period.
This would make the data manageable and searchable within reasonable time.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Aug 10, 2010 at 5:49 PM, Shelly_Singh wrote:
> No sort. I will need relevance based on TF. If I shard, I will have to
> search in al indices.
>
&
ch in case reading the source takes
time in your case; the IndexWriter, though, would have to be shared among all
threads.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Aug 10, 2010 at 12:24 PM, Shelly_Singh wrote:
> Hi,
>
> I am developing an application which uses Lucene for ind
Hi Saurabh,
I don't think there's a way to do that. Why not use other constructs?
--
Anshum Gupta
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Mon, May 17, 2010 at 8:04 PM, Saurabh Aga
Hi Manjula,
Yes, Lucene by default would only match exact terms unless you use a
custom analyzer to expand the index/query.
--
Anshum Gupta
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Fri
Hi Clara,
Any particular reason why you'd need the score? Perhaps this would be of
help
http://lucene.apache.org/java/2_9_1/scoring.html
http://lucene.apache.org/java/2_3_2/scoring.pdf
Hope this explains whatever you were looking for.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspo
There are a few things you could do,
1. Run the JVM in server mode [-server]
2. Assign more RAM (in case you're running a 64 bit architecture) (both
initial and max limit)
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions
could try using something like a synonym analyzer while conducting
search in this case.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Fri, Apr 23, 2010 at 2:39 AM, Wei Yi wro
Hi Ravi,
Adding to what Erick said, you could index the numbers as numeric fields
instead of strings. This should improve things for you by a considerable
amount.
P.S: I'm talking with my knowledge on Java Lucene.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expr
Reposting as the first post didn't get many hits!
Apologies for all who consider this spam!
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Wed, Feb 17, 2010 at 3:
Index time is a much better approach; the only negative is the increase in
index size. I've used it for a considerably sized dataset, and even the
indexing time doesn't seem to go up much.
Searching multiple terms is generally unoptimized when you can do it with
one.
--
An
affecting the Disk copy in
any manner though)
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Tue, Mar 23, 2010 at 3:51 PM, wrote:
> Hello,
>
>
> I am trying f
u could combine the fields
at run time.
As far as the relational nature is concerned, I'd say Lucene's model is pretty
different from what you're taking it to be. Lucene documents are just a
collection of field/value pairs.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The
e, it gets tokenized/processed prior
to getting indexed. The way the processing would happen depends on your
analyzer (which here is StopAnalyzer). So point 1. If you analyze a field
with value *'My name is anshum' *it would get broken down into tokens, e.g.
[my] [name] [is] [anshum] where ea
Hi,
How about indexing a dummy token for empty docs? That way you may pick up
all docs that are actually null/empty by querying for the dummy token.
Make sure the dummy token is never part of any actual document (token
stream).
Perhaps this should work!
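A plain-Java sketch of that sentinel idea (the token "__EMPTY__" is a made-up value; pick anything that can never occur in real content):

```java
public class EmptyFieldSentinel {
    // Sentinel indexed in place of empty/null field values so that
    // empty docs can later be found by querying for it.
    static final String EMPTY_TOKEN = "__EMPTY__";

    static String valueToIndex(String raw) {
        return (raw == null || raw.trim().isEmpty()) ? EMPTY_TOKEN : raw;
    }

    public static void main(String[] args) {
        System.out.println(valueToIndex(""));          // prints __EMPTY__
        System.out.println(valueToIndex("some text")); // prints some text
    }
}
```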
--
Anshum Gupta
Naukri Labs!
http
ument level using a mechanism created and maintained by you.
There of course are implementation schemes that you might want to try, so as
to split the index and query it using the appropriate searcher, but this
logic has to be handled by you.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
ore 1 doc as 1 doc having multiple genres instead of duplicate
entries.
I'm still not sure if I've gotten the problem correctly, but hope this is of
help!
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to m
http://groups.google.com/group/luceneindia* to join and share!
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
Hi Mike,
Not really through queries, but you may do this by writing a custom
collector. You'd need some supporting data structure to mark/hash the
occurrence of a domain in your result set.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody
ld be:
Index flipped terms (using an appropriate analyzer) i.e. cat is also indexed
as tac. You may then query on ta* instead of at*.
Does that solve your issues/concern?
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me
a growth in the index
size should be anticipated and handled.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Fri, Dec 11, 2009 at 10:50 PM, Rob Staveley (Tom)
wrote:
>
How about getting the original token stream and then converting c++ to
cplusplus, or any other such transform? Or perhaps you might look at
using/extending (in the non-Java sense) some other tokenizer!
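A sketch of that pre-indexing transform in plain Java (the c++ → cplusplus mapping is the example from above; the lowercasing mimics what a typical analyzer does):

```java
public class TermNormalizer {
    // Rewrite analyzer-hostile terms into safe tokens before indexing;
    // the single mapping below is a hypothetical example.
    static String normalize(String text) {
        return text.toLowerCase().replace("c++", "cplusplus");
    }

    public static void main(String[] args) {
        System.out.println(normalize("I know C++ and Java"));
        // prints: i know cplusplus and java
    }
}
```

The same normalization must be applied to query terms, or "c++" searches will miss the rewritten tokens.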
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to
te an indexer from scratch, you'd have to write
a Java file along the same lines as the (modified) demo and include it.
Does that help?
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw...
e (in the wrapper code).
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Thu, Dec 3, 2009 at 8:02 AM, blazingwolf7 wrote:
>
> Hi,
>
> As per title...is it