Please submit your talks here -
https://communityovercode.org/call-for-presentations/
We hope to see many of you talk about Search in Denver!
--
Anshum Gupta
e you all there.
--
Anshum Gupta
stions or ideas! Hope to
see you all there.
--
Anshum Gupta
e audience.
Good luck!
-Anshum
On Wed, Mar 30, 2022 at 5:47 AM Michael Wechner
wrote:
> Hi Together
>
> I would be interested to submit a proposal/presentation re Lucene's
> vector search, but would like to ask first whether somebody else wants
> to do this as well or
website - https://www.apachecon.com/acah2021/index.html
Registration - https://hopin.com/events/apachecon-2021-home
Slack - http://s.apache.org/apachecon-slack
Search Track - https://www.apachecon.com/acah2021/tracks/search.html
See you all at ApacheCon 2021!
-Anshum
unsubscribe, e-mail: announce-unsubscr...@apachecon.com
For additional commands, e-mail: announce-h...@apachecon.com
--
Anshum Gupta
can continue to expect critical bug fixes for releases
previously made under the Apache Lucene project.
We will send another update as the mailing lists and website are set up for
the Solr project.
-Anshum
On behalf of the Apache Lucene and Solr PMC
://www.apachecon.com/acah2020/tracks/search.html
See you at ApacheCon.
--
Anshum Gupta
> > https://lucene.apache.org/theme/images/lucene/lucene_logo_green_300.png
> >
> > Please vote for one of the above choices. This vote will close about one
> > week from today, Mon, Sept 7, 2020 at 11:59PM.
> >
> > Thanks!
> >
> > [jira-issue] https://issues.apache.org/jira/browse/LUCENE-9221
> > [first-vote]
> >
> http://mail-archives.apache.org/mod_mbox/lucene-dev/202006.mbox/%3cCA+DiXd74Mz4H6o9SmUNLUuHQc6Q1-9mzUR7xfxR03ntGwo=d...@mail.gmail.com%3e
> > [second-vote]
> >
> http://mail-archives.apache.org/mod_mbox/lucene-dev/202009.mbox/%3cCA+DiXd7eBrQu5+aJQ3jKaUtUTJUqaG2U6o+kUZfNe-m=smn...@mail.gmail.com%3e
> > [rank-choice-voting] https://en.wikipedia.org/wiki/Instant-runoff_voting
> >
>
--
Anshum Gupta
mirroring network for
distributing releases. It is possible that the mirror you are using may not
have replicated the release yet. If that is the case, please try another
mirror. This also applies to Maven access.
ReleaseNote70 (last edited 2017-09-20 10:27:30 by AnshumGupta
<https://wiki.apache.org/lucene-java/AnshumGupta>)
Anshum Gupta
try another mirror. This also goes for Maven access.
-Anshum Gupta
replicated the release yet. If that is the case, please
try another mirror. This also goes for Maven access.
--
Anshum Gupta
Apache Software Foundation uses an extensive mirroring network
for distributing releases. It is possible that the mirror you are using
may not have replicated the release yet. If that is the case, please try
another mirror. This also goes for Maven access.
--
Anshum Gupta
hindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -Original Message-
> > From: Anshum Gupta [mailto:ans...@anshumgupta.net]
> > Sent: Friday, February 20, 2015 9:55 PM
> > To: d...@lucene.apache.org; ge
list of new features and notes on upgrading.
Please report any feedback to the mailing lists (
http://lucene.apache.org/core/discussion.html)
--
Anshum Gupta
http://about.me/anshumgupta
Just an update: the meetup has been rescheduled due to some venue-availability issues and is now on the 8th of June.
On Tue, May 21, 2013 at 11:58 AM, Anshum Gupta wrote:
> Hi folks,
>
> We just created a new meetup group for all Lucene/Solr enthusiasts in and
> around Bang
the first meetup event:
http://www.meetup.com/Bangalore-Apache-Solr-Lucene-Group/events/113806762/ .
--
Anshum Gupta
http://www.anshumgupta.net
Hi Vidya,
Perhaps this could help you:
http://hrycan.com/2009/10/25/lucene-highlighter-howto/
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Fri, Oct 28, 2011 at 2:18 PM, Vidya Kanigiluppai Sivasubramanian <
vidya...@hcl.com> wrote:
> Hi,
>
> I am using lucene 2.4.1 in my proje
other hand, why do you want to split a 9G index? Is there a reason? A performance issue? It'd be good if you could share the reason, as the problem could be completely different.
--
Anshum Gupta
http://ai-cafe.blogspot.com
2011/7/27 Gudi, Ravi Sankar
> Hi Lucene Team,
>
> If you know
field or any other field from the
'search' method.
Also, I'd suggest you grab a copy of Lucene in Action, 2nd Edition, as it'd help you a lot in understanding how Lucene works and is used.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Jun 8, 2011 at 11:00 AM, Pranav goya
Yes,
You'd need to delete the document and then re-add a newly created document
object. You may use the key and delete the doc using the Term(key, value).
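A minimal sketch of that delete-then-re-add flow (assuming the Lucene 3.x-era API discussed on this thread; the field names "con_key" and "body" are hypothetical):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class ReplaceDoc {
    // Delete the old document by its unique key, then add a fresh one.
    static void replace(IndexWriter writer, String key, String newBody)
            throws Exception {
        // Removes every document whose key field matches the value.
        writer.deleteDocuments(new Term("con_key", key));

        // Re-add a newly created Document carrying the same key.
        Document doc = new Document();
        doc.add(new Field("con_key", key,
                Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("body", newBody,
                Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc);
        writer.commit();
    }
}
```

IndexWriter.updateDocument(Term, Document) wraps essentially this same delete-and-add sequence.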
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Jun 6, 2011 at 4:45 PM, Pranav goyal wrote:
> Hi Anshum,
>
> Thanks fo
achieve/target.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Jun 6, 2011 at 4:41 PM, Pranav goyal wrote:
> Hi all,
>
> Is there any way to change my lucene document no?
> Like if I can change my lucene document no's with con_key.
>
> I am a newbie and don't k
ency.
Even the updateDocument method, as of now, internally deletes the document and adds the newly supplied one.
Hope this answer helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Jun 6, 2011 at 11:59 AM, Pranav goyal wrote:
> Hi all,
>
> I am a newbie to lucene.
&g
Could you also print and send the entire stack trace?
Also, the output of query.toString()?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Apr 19, 2011 at 7:40 PM, Patrick Diviacco <
patrick.divia...@gmail.com> wrote:
> I get the following error message: java.lang.UnsupportedOperation
;
ScoreDoc[] sd = is.search(query, 10).scoreDocs;
for (ScoreDoc scoreDoc : sd) {
    System.out.println(ir.document(scoreDoc.doc));
}
is.close();
ir.close();
iw.close();
*--Snip--*
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Fri, Apr 15,
Hi Madhu,
You could use IndexSearcher.explain(..) to explain the result and get the
detailed breakup of the score. That should probably help you with
understanding the boost and score as calculated by lucene for your app.
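As a sketch of that (assuming an already-open IndexSearcher and a prepared Query against the Lucene API of this era; the 10-hit limit is arbitrary):

```java
import org.apache.lucene.search.Explanation;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;

public class ExplainScores {
    // Print the scoring breakdown (tf, idf, boosts, norms) for each hit.
    static void explainTopHits(IndexSearcher searcher, Query query)
            throws Exception {
        for (ScoreDoc hit : searcher.search(query, 10).scoreDocs) {
            Explanation exp = searcher.explain(query, hit.doc);
            System.out.println("doc " + hit.doc + ":\n" + exp.toString());
        }
    }
}
```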
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Apr 19, 2011 at 2:32
ts you the best.
Relevance, or an apt set of boost values, can again be figured out by varying the boost via *trial and error*; that is pretty much standard practice.
Hope this helps you figure out a reasonable solution and boost values.
--
Anshum Gupta
http://ai-cafe.blogspot.com
O
So an update is basically nothing but a delete and an add (of a fresh doc). You could just go ahead and use the deleteDocuments(Query query) method and then add the new document. That is the general approach for such cases, and it works just fine.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On
So functionally, I am assuming you've achieved what you'd been aiming for.
About the scores: MatchAllDocsQuery does score docs based on norm factors
etc., therefore the score wouldn't be 0.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Mar 23, 2011 at 1:38 PM, Patrick Diviacco
need to specify anything there.
The snippet below would work and get you all the docs in the index as the result (provided you specify a limit high enough to match numDocs):
*Query query = new MatchAllDocsQuery();*
*TopDocs results = searcher.search(query, numDocs);*
Hope this clarifies your doubt.
--
Anshum
u are
trying to achieve. You may have a completely different option that you haven't read about, which someone could advise on if they knew the exact intent.
Hope this helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 22, 2011 at 4:59 PM, Patrick Diviacco <
patrick.divia...@gmail.com
So, a few things:
1. Are you looking to get 'all' documents, or only docs matching your query?
2. If it's about fetching all docs, why not use MatchAllDocsQuery?
3. Did you try using a Collector instead of TopDocs?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 22, 2011
Hi Patrick,
You may have a look at this; perhaps it will help you with it. Let me know
if you're still stuck.
http://stackoverflow.com/questions/3300265/lucene-3-iterating-over-all-hits
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 22, 2011 at 4:10 PM, wrote:
> Not s
Yes, that's how it's generally done. Also, you should just handle such data/fields aptly rather than trying to avoid them in the first place. You could safely add these, use them internally, and never return them or use them for an end-user search.
--
Anshum Gupta
http://ai-cafe.blogspot.com
O
Also, is there a particular reason why you wouldn't want to index that, considering you'd want to 'update' documents? It's good practice to index the unique field, especially if you have one. It has generally helped more often than not.
--
Anshum Gupta
http://ai-cafe.blogspot.
Hi,
No, as of now there's no way to do so.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 22, 2011 at 12:29 PM, shrinath.m wrote:
> I am asking for partial update in Lucene,
> where I want to update only a selected field of all fields in the document.
> Does Lucene prov
Hi Suman,
I tried it a while ago. Found it nice and useful.
You could get some hints on using it at
http://ai-cafe.blogspot.com/2009/09/lucid-gaze-tough-nut.html (in case you
need some ! :) )
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Mar 16, 2011 at 11:37 AM, suman.holani wrote
Depends on your data. I know that's a vague answer, but that's the point.
What you could do is use a FieldCache, if memory and data let you do so. Would
they?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Mar 10, 2011 at 3:12 PM, suman.holani wrote:
> Hi Anshum,
>
> Than
should help you. Also, if you're otherwise using a very selective field
that can be served through a FieldCache, that'd be a nice thing to do.
Hope that helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Mar 10, 2011 at 3:01 PM, suman.holani wrote:
>
>
> Hi,
>
>
>
Hi Lahiru,
A few questions here.
Why would you need that? Is the field stored?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 1, 2011 at 11:04 AM, Lahiru Samarakoon wrote:
> Hi all,
>
> Is there a way to find the length of a field of a lucene index document?
>
> Thanks,
> Lahiru
>
KeywordAnalyzer());
In the above snippet, I instantiate an analyzer which by default would use the
StandardAnalyzer, but for 'anotherfield' would use the KeywordAnalyzer.
Hope this helps you.
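The full construction probably looked something like this (a sketch against the Lucene 3.x analyzer API; 'anotherfield' is the example field name from above):

```java
import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.util.Version;

public class PerFieldSetup {
    static PerFieldAnalyzerWrapper build() {
        // StandardAnalyzer is the default for every field...
        PerFieldAnalyzerWrapper analyzer =
            new PerFieldAnalyzerWrapper(new StandardAnalyzer(Version.LUCENE_30));
        // ...except 'anotherfield', which is kept as a single token.
        analyzer.addAnalyzer("anotherfield", new KeywordAnalyzer());
        return analyzer;
    }
}
```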
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Feb 15, 2011 at 2:19 AM, Yuhan Zhang wrote:
>
Hi Liat,
You could open a MultiSearcher/ParallelMultiSearcher on the indexes that you
have and then construct an OR query, e.g. (contents:A OR text:A).
I am assuming that the field names do not overlap. If that is not the case,
then you'd need another solution.
--
Anshum Gupta
http
If you actually intend to get the intersection of two results from a
'union' of two indexes, you could use the filter-and-query approach. You could
use a MultiSearcher or a ParallelMultiSearcher to perform the search in
this case.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On M
Why don't you generate your own index from some sample docs or a dataset? That
would give you a lot more flexibility to play around with; otherwise, even if
you get an index, you wouldn't have info on the analyzer used, etc., while indexing.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Sun, Fe
Hi Ranjit,
That would be because all stop words (space, comma, the stop-word set, etc.)
would be treated in a similar fashion and stripped while indexing, subject to
the analyzer you use while indexing your content.
Hope that explains the issue.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Feb
imple mod of
some numeric (auto increment) userid.
This works well in normal cases, as long as your partitioning is
predictable.
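The mod-based routing described above can be sketched in plain Java (the shard count of 4 is an assumed example):

```java
public class ShardRouter {
    // Route a document to one of N index shards by a simple mod of a
    // numeric auto-increment userid.
    static int shardFor(long userId, int numShards) {
        return (int) (userId % numShards);
    }

    public static void main(String[] args) {
        System.out.println(shardFor(1001L, 4)); // prints 1
        System.out.println(shardFor(8L, 4));    // prints 0
    }
}
```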
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Fri, Jan 21, 2011 at 10:52 AM, Ganesh wrote:
> Hello all,
>
> Could you any one guide me what all the various
erm). Something of an
ngram, and then treat those phrases as terms.
Doing it at runtime would not be a feasible option.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Jan 20, 2011 at 3:30 PM, Ashish Pancholi wrote:
>
> Using Lucene_3.0.3. we would like to implement following:
> The
mirrors them internally or via a
downstream project)
--
Anshum Gupta
http://ai-cafe.blogspot.com
current query, it seems you'd need more understanding of Lucene, and
getting a copy of "Lucene in Action, 2nd Ed <http://www.manning.com/hatcher3/>"
would be a good idea for you and everyone in your position.
Hope that helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On
Hi Ryan,
You should try the synonym filter. That should help you with this kinda
problem.
You could also look at turning off norms for the name field, or turning off
tf or idf.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Sat, Jan 8, 2011 at 6:03 AM, Ryan Aylward wrote:
> Our business ha
.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Dec 29, 2010 at 5:36 PM, Jiang mingyuan <
mailtojiangmingy...@gmail.com> wrote:
> Can lucene index survives a machine crash during the merge or optimize
> operation?
>
> or can I stop the running index program during the
page, starting at
http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/index/IndexWriter.html#DEFAULT_RAM_BUFFER_SIZE_MB
<http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/index/IndexWriter.html#DEFAULT_RAM_BUFFER_SIZE_MB>
--
Anshum Gupta
http://ai-cafe.blogspot.com
On We
Hi Umesh,
I'm not really confident that Zoie, or anything built on the current version
of Lucene, would be able to handle a search-as-you-type kind of setup.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Dec 29, 2010 at 10:39 AM, Umesh Prasad wrote:
> You can also look at Zoie an
type.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Dec 29, 2010 at 3:36 AM, software visualization <
softwarevisualizat...@gmail.com> wrote:
> This has probably been asked before but I couldn't find it, so...
>
> Is it possible / advisable / practical to use Lucene
ase 2
below).
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Dec 21, 2010 at 3:54 PM, manjula wijewickrema
wrote:
> Hi Gupta,
>
> Thanx a lot for your reply. But I could not understand whether I could
> modify (adding more words) to the default stop word list or should I have
.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Dec 21, 2010 at 9:20 AM, manjula wijewickrema
wrote:
> Hi,
>
> 1) In my application, I need to add more words to the stop word list.
> Therefore, is it possible to add more words into the default lucene stop
> word list?
You could change Occur.SHOULD to Occur.MUST for both fields.
This should work for you if what I understood is what you wanted.
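A sketch of that two-required-clause query (the field and term values here are made up):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;

public class BothFieldsRequired {
    static BooleanQuery build() {
        BooleanQuery q = new BooleanQuery();
        // MUST on both clauses: a doc has to match the title term AND the
        // content term (SHOULD would make either clause sufficient).
        q.add(new TermQuery(new Term("title", "lucene")), Occur.MUST);
        q.add(new TermQuery(new Term("content", "search")), Occur.MUST);
        return q;
    }
}
```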
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Nov 30, 2010 at 5:12 PM, maven apache wrote:
> Hi: I have two documents:
>
> title
with a single '=' :)
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Nov 30, 2010 at 3:03 PM, maven apache wrote:
> 2010/11/30 Chris Hostetter
>
> >
> > : Subject: What is the difference between the "AND" and "+" operator?
> >
> &
eanQuery.html#setMinimumNumberShouldMatch(int)>
Finally, all would depend on the case at hand and what you think is the
expected behavior of search.
Hope this helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Nov 29, 2010 at 1:31 PM, yang Yang wrote:
> What is the difference between the &qu
wiki.apache.org/lucene-java/SpatialSearch
For your understanding, you could have a look at the bounding box approach.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Nov 18, 2010 at 7:38 AM, yang Yang wrote:
> We are using the hibernate search which is based on lucene as the search
> e
index and
the source.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Nov 17, 2010 at 1:36 PM, Lance Norskog wrote:
> The Lucene CheckIndex program does this. It is a class somewhere in Lucene
> with a main() method.
>
>
> Samarendra Pratap wrote:
>
>> It is not gu
ndex. This would also give you a fair idea
of the index state.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Nov 16, 2010 at 11:36 AM, Yakob wrote:
> hello all,
> I would like to ask about lucene index. I mean I created a simple
> program that created lucene indexes and stored it
Hi Nilesh,
No, you can't do that, though you may store your own id as a separate field
for whatever purpose you want. I don't see any reason why you'd essentially
want to override the Lucene document id with your own. Let me know in case
there's something I didn't get.
cord 1.
Also, while searching, you may tokenize on a comma or whatever set of chars
you find appropriate.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Oct 19, 2010 at 8:59 PM, Jasper de Barbanson <
lucene-mailingl...@de-barbanson.com> wrote:
> I'm currently working on buil
to begin, you may look at SOLR, which
provides an out of the box engine.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Oct 13, 2010 at 8:57 AM, Hyun Joo Noh wrote:
> Hi, how would you make Lucene leave a search log of
> who searched what, when, etc (i.e. cookie, query, timestamp, etc
Version? Machine and JVM (32/64-bit)?
This most probably is a code-level issue rather than a Lucene one, but I
may be wrong.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Oct 13, 2010 at 8:08 AM, Ching wrote:
> Hi All,
>
> Can anyone help with this issue? I have about 2000 pdf fil
ParallelReader, though it sounds useful in theory; I doubt the overhead of
maintaining and synchronizing the document ids would be worth it. I
haven't used it so far; perhaps someone who has used the ParallelReader for
such a purpose at production scale may help you.
--
An
on for you wanting to do so? is it that you
only index data coming from a stream and you don't have access to the
original source at a later time?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Oct 12, 2010 at 11:35 AM, Nilesh Vijaywargiay <
nilesh.vi...@gmail.com> wrote:
> Hi
this is what you intended!
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Sep 30, 2010 at 11:54 PM, Sahin Buyrukbilen <
sahin.buyrukbi...@gmail.com> wrote:
> Hi all,
>
> I need to get the first term in my index and iterate it. Can anybody help
> me?
>
> Best.
>
reclaiming lost disc
space.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Aug 24, 2010 at 9:22 AM, Justin wrote:
> My actual code did not call expungeDeletes every time through the loop;
> however,
> calling expungeDeletes or optimize after the loop means that the index has
> dou
ngedeletes().
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Aug 24, 2010 at 4:38 AM, Justin wrote:
> In an attempt to avoid doubling disk usage when adding new fields to all
> existing documents, I added a call to IndexWriter::expungeDeletes. Then my
> colleague pointed out that Luce
There is bound to be I/O contention. I'm sure iostat will give you a much
better picture of it.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Aug 23, 2010 at 3:13 PM, wrote:
> Yes, all version directories are on the same disk. iostat output should be
> useful. Using rsync is
Seems like a case of I/O contention. You may be reading content off the index
while performing searches, while the I/O for the copy is also happening.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Aug 23, 2010 at 1:12 PM, wrote:
>
> Hi all,
>
>
> We're observing search
comfortably. Btw,
are you facing any issues with sort time, or is it a presumption?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Aug 18, 2010 at 5:12 PM, Shelly_Singh wrote:
> Hi,
>
> I have a Lucene index that contains a numeric field along with certain
> other fields. The order
for your application?
--
Anshum Gupta
http://ai-cafe.blogspot.com
2010/8/17 xiaoyan Zheng
> the question is like this:
>
> when one user is using IndexWirter.addDocument(doc), and another user has
> already finished adding part and have closed IndexWirter, then, the first
> u
So, you didn't really use the setRamBuffer.. ?
Any reasons for that?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Aug 11, 2010 at 10:28 AM, Shelly_Singh wrote:
> My final settings are:
> 1. 1.5 gig RAM to the jvm out of 2GB available for my desktop
> 2. 100GB d
that period.
This would make the data manageable and searchable within reasonable time.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Aug 10, 2010 at 5:49 PM, Shelly_Singh wrote:
> No sort. I will need relevance based on TF. If I shard, I will have to
> search in al indices.
>
&
ch in case reading the source takes
time in your case; the IndexWriter, though, would have to be shared among all
threads.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Aug 10, 2010 at 12:24 PM, Shelly_Singh wrote:
> Hi,
>
> I am developing an application which uses Lucene for ind
Hi Saurabh,
I don't think there's a way to do that. Why not use other constructs?
--
Anshum Gupta
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Mon, May 17, 2010 at 8:04 PM, Saurabh Aga
Hi Manjula,
Yes, Lucene by default would only match exact terms unless you use a
custom analyzer to expand the index/query.
--
Anshum Gupta
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Fri
Hi Clara,
Any particular reason why you'd need the score? Perhaps this would be of
help
http://lucene.apache.org/java/2_9_1/scoring.html
http://lucene.apache.org/java/2_3_2/scoring.pdf
Hope this explains whatever you were looking for.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspo
There are a few things you could do,
1. Run the JVM in server mode [-server]
2. Assign more RAM (in case you're running a 64 bit architecture) (both
initial and max limit)
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions
could try using something like a synonym analyzer while conducting
search in this case.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Fri, Apr 23, 2010 at 2:39 AM, Wei Yi wro
Hi Ravi,
Adding to what Erick said, you could index the numbers as numeric fields
instead of strings. This should improve things for you by a considerable
amount.
P.S: I'm talking with my knowledge on Java Lucene.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expr
Reposting as the first post didn't get many hits!
Apologies for all who consider this spam!
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Wed, Feb 17, 2010 at 3:
Index time is a much better approach; the only negative is the increase in
index size. I've used it for a considerably sized dataset, and even the
indexing time doesn't seem to go up much.
Searching multiple terms is generally unoptimized when you can do it with
one.
--
An
affecting the Disk copy in
any manner though)
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Tue, Mar 23, 2010 at 3:51 PM, wrote:
> Hello,
>
>
> I am trying f
u could combine the fields
at run time.
As far as the relational nature is concerned, I'd say Lucene's model is pretty
different from what you're taking it to be. Lucene documents are just a
collection of field/value pairs.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The
e, it gets tokenized/processed prior
to getting indexed. The way the processing would happen depends on your
analyzer (which here is StopAnalyzer). So point 1. If you analyze a field
with value *'My name is anshum' *it would get broken down into tokens, e.g.
[my] [name] [is] [anshum] where ea
Hi,
How about indexing a dummy token for empty docs? That way you may pick up
all docs that are actually null/empty by querying for the dummy token.
Make sure the dummy token is never part of any actual document (token
stream).
Perhaps this should work!
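A plain-Java sketch of that sentinel idea (the token "__EMPTY__" is a made-up value; pick anything that can never occur in real content):

```java
public class EmptyFieldSentinel {
    // Sentinel indexed in place of empty/null field values so that
    // empty docs can later be found by querying for it.
    static final String EMPTY_TOKEN = "__EMPTY__";

    static String valueToIndex(String raw) {
        return (raw == null || raw.trim().isEmpty()) ? EMPTY_TOKEN : raw;
    }

    public static void main(String[] args) {
        System.out.println(valueToIndex(""));          // prints __EMPTY__
        System.out.println(valueToIndex("some text")); // prints some text
    }
}
```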
--
Anshum Gupta
Naukri Labs!
http
ument level using a mechanism created and maintained by you.
There of course are implementation schemes that you might want to try, so as
to split the index and query it using the appropriate searcher, but this
logic has to be handled by you.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
ore 1 doc as 1 doc having multiple genres instead of duplicate
entries.
I'm still not sure if I've gotten the problem correctly, but hope this is of
help!
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to m
http://groups.google.com/group/luceneindia* to join and share!
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
Hi Mike,
Not really through queries, but you may do this by writing a custom
collector. You'd need some supporting data structure to mark/hash the
occurrence of a domain in your result set.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody
ld be:
Index flipped terms (using an appropriate analyzer) i.e. cat is also indexed
as tac. You may then query on ta* instead of at*.
Does that solve your issues/concern?
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me
a growth in the index
size should be anticipated and handled.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Fri, Dec 11, 2009 at 10:50 PM, Rob Staveley (Tom)
wrote:
>
How about getting the original token stream and then converting c++ to
cplusplus, or any other such transform? Or perhaps you might look at
using/extending (in the non-Java sense) some other tokenizer!
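A sketch of that pre-indexing transform in plain Java (the c++ → cplusplus mapping is the example from above; the lowercasing mimics what a typical analyzer does):

```java
public class TermNormalizer {
    // Rewrite analyzer-hostile terms into safe tokens before indexing;
    // the single mapping below is a hypothetical example.
    static String normalize(String text) {
        return text.toLowerCase().replace("c++", "cplusplus");
    }

    public static void main(String[] args) {
        System.out.println(normalize("I know C++ and Java"));
        // prints: i know cplusplus and java
    }
}
```

The same normalization must be applied to query terms, or "c++" searches will miss the rewritten tokens.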
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to
te an indexer from scratch, you'd have to write
a Java file along the same lines as the (modified) demo and include it.
Does that help?
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw...
e (in the wrapper code).
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Thu, Dec 3, 2009 at 8:02 AM, blazingwolf7 wrote:
>
> Hi,
>
> As per title...is it