I would love to come but I'm afraid I'm stuck in rainy old England :(
Amin
On 18 Apr 2009, at 01:08, Bradford Stephens
wrote:
OK, we've got 3 people... that's enough for a party? :)
Surely there must be dozens more of you guys out there... c'mon,
accelerate your knowledge! Join us in Seat
Perfect explanation, I think I have the idea now. Thanks so much! I would
also like to test the update with a term that has no matches, to see if it
will do an insert, as that would make the code much simpler and more
efficient. From the documentation, an update is a delete followed by
What you're missing is that the example has no unique ID; it wasn't created
with update in mind.
There's no hidden magic by which Lucene knows *what* document you want
updated; you have to provide that yourself, and it should be unique.
Imagine a parts catalog, or an index of a directory tree.
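The delete-then-add behaviour described above can be sketched without Lucene at all. The following JDK-only toy index (the class and method names are my own invention, not Lucene API) shows why the unique term matters: the update removes every document matching the term before adding the new one, and an update on a term with no matches degenerates into a plain insert.

```java
import java.util.*;

// Toy in-memory "index" illustrating updateDocument semantics:
// an update is a delete (of every doc matching the term) followed by an add.
class ToyIndex {
    // each document is just a field -> value map
    private final List<Map<String, String>> docs = new ArrayList<>();

    void add(Map<String, String> doc) {
        docs.add(doc);
    }

    // delete all docs whose `field` equals `value`, then add the new doc
    void update(String field, String value, Map<String, String> newDoc) {
        docs.removeIf(d -> value.equals(d.get(field)));
        docs.add(newDoc);
    }

    int count(String field, String value) {
        return (int) docs.stream().filter(d -> value.equals(d.get(field))).count();
    }

    int size() {
        return docs.size();
    }

    public static void main(String[] args) {
        ToyIndex idx = new ToyIndex();
        idx.add(Map.of("partNo", "A-42", "desc", "widget"));
        // update on a term with no match behaves like a plain insert
        idx.update("partNo", "B-7", Map.of("partNo", "B-7", "desc", "sprocket"));
        // update on an existing unique term replaces the old doc
        idx.update("partNo", "A-42", Map.of("partNo", "A-42", "desc", "improved widget"));
        System.out.println(idx.size());                  // 2
        System.out.println(idx.count("partNo", "A-42")); // 1
    }
}
```

If "partNo" were not unique, the update would silently delete several documents at once, which is exactly the pitfall of picking a poor key.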
OK, I am still confused.
Looking at the examples to index a document I would do something like the
following:
Document document = new Document();
document.add(Field.UnStored("article", article));
document.add(Field.Text("comments", comments));
Analyzer analyzer = n
OK, we've got 3 people... that's enough for a party? :)
Surely there must be dozens more of you guys out there... c'mon,
accelerate your knowledge! Join us in Seattle!
On Thu, Apr 16, 2009 at 3:27 PM, Bradford Stephens
wrote:
> Greetings,
>
> Would anybody be willing to join a PNW Hadoop and/o
On Fri, Apr 17, 2009 at 7:27 PM, Newman, Billy wrote:
> I am looking for info on how to use the IndexWriter.update method. A short
> example of how to add a document and then later update would
> be very helpful. I get lost because I can add a document with just the
> document, but I need a do
CustomScoreQuery only allows the secondary queries to be of type
ValueSourceQuery instead of allowing them to be any type of Query. Why
is that? Is there something that makes it hard to implement for
arbitrary queries?
Steve
P.S. I played around with this briefly, and simply replacing all
ValueSo
From what little I know about GSA, there isn't a distributed solution (old
information, not sure if it is still the case), so it is not very easy to
scale your search system. That is something you can achieve rather easily
with a Lucene/Solr implementation.
There are other benefits of using an open source solution s
I am looking for info on how to use the IndexWriter.update method. A short
example of how to add a document and then later update would be very helpful.
I get lost because I can add a document with just the document, but I need a
document and a Term. I am not really sure what a Term is since
I have a BooleanQuery with several clauses. After running a search, in
addition to seeing the overall score of each document, I need to see the
sub-score produced by each clause. When all clauses match, this is
relatively easy to get back by ".explain(...)", which gives me something
like this:
0.3
Hi Radha,
On 4/17/2009 at 6:19 AM, Radhalakshmi Sreedharan wrote:
> What I need is the following :
> If my document field is ( ab,bc,cd,ef) and Search tokens are
> (ab,bc,cd).
>
> Given the following :
> I should get a hit even if all of the search tokens aren't present
> If the tokens are f
On 4/17/2009 at 10:33 AM, Radhalakshmi Sreedharan wrote:
> > > I have a question related to SpanNearQuery.
> > >
> > > As of now, the SpanNearQuery has the constraint that all the
> > > terms need to present in the document.
[...]
> > > But [...] I need a hit even if there are 2/3 terms found with
Erm, I likely should have mentioned that this technique requires the use
of a MultiFieldQueryParser.
Matt
Matthew Hall wrote:
If you can build an analyzer that tokenizes the second field so that
it filters out the words you don't want, you can then take advantage
of more intelligent queries a
If you can build an analyzer that tokenizes the second field so that it
filters out the words you don't want, you can then take advantage of
more intelligent queries as well.
So for the example that pjaol wrote, the query would become something
like this:
Query= body:(game OR redskins) keyw
On Friday 17 April 2009 16:33:27 Radhalakshmi Sreedharan wrote:
> Thanks Paul. Is there any alternative way of implementing this requirement?
Start from scratch perhaps? Anyway, spans can be really tricky, so in
case you're writing code for this, I have only four pieces of advice: test,
test, test and test
On Apr 16, 2009, at 10:22 AM, Vasudevan Comandur wrote:
Hi,
The question that I am posting in this group may be inappropriate, and I
want to apologize for that.
I wouldn't say it's inappropriate, but I don't know if anyone here
could say with certainty b/c the last time I checked GSA w
Ah, interesting... I didn't think of that! I will try it and report back
pjaol wrote:
>
> Why not put the keywords into the same document as another field? and
> search
> both fields
> at once, you can then use lucene syntax to give a boosting to the keyword
> fields.
> e.g.
> body:A good game
Thx, it works. :)
Daniel Susanto
http://susantodaniel.wordpress.com
--- On Fri, 4/17/09, Uwe Schindler wrote:
From: Uwe Schindler
Subject: RE: Is it possible to add new document into existing lucene index?
To: java-user@lucene.apache.org
Date: Friday, April 17, 2009, 9:18 PM
Hi Daniel,
Just
Why not put the keywords into the same document as another field and search
both fields at once? You can then use Lucene syntax to give a boost to the
keyword field.
e.g.
body:A good game last night by the redskins
keyword: redskins
Query= body:(game OR redskins) keyword:(game OR redskins)^10
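The duplicated-clause pattern above can be generated programmatically. Here's a minimal JDK-only sketch (the helper class is my own illustration; in real code you would more likely hand the job to a query parser with per-field boosts) that assembles exactly that query string from a term list, the two field names, and the keyword boost:

```java
import java.util.*;

// Build a query string that searches the same terms in both fields,
// boosting the keyword field, in the style suggested above.
class BoostedQueryBuilder {
    static String build(List<String> terms, String bodyField,
                        String keywordField, int keywordBoost) {
        // same disjunction is repeated for each field
        String disjunction = "(" + String.join(" OR ", terms) + ")";
        return bodyField + ":" + disjunction + " "
             + keywordField + ":" + disjunction + "^" + keywordBoost;
    }

    public static void main(String[] args) {
        String q = build(List.of("game", "redskins"), "body", "keyword", 10);
        System.out.println(q);
        // body:(game OR redskins) keyword:(game OR redskins)^10
    }
}
```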
*Edit: each indexed text document contains a related field for identification
purposes, so I would be able to identify the scores for both indexes through
this field*
theDude_2 wrote:
>
> I appreciate your response, and read the wiki article concerning the
> Federated search
> and
>
> I'm not
I'm sorry if this question touches on too many things at once, but I'm
having problems putting some ideas together - hopefully someone can
help!
I have a set of indexes, each index contains a month's worth of
Articles. I need to be able to search the index (sorting by date) and
then apply access-
I appreciate your response, and read the wiki article concerning the
Federated search
and
I'm not sure that my project falls into the "Federated Search" bucket...
What I've done is create 2 indexes from the same documents.
One index contains the full documents - great for pure relevanc
I'd start by doing some research on the question rather than asking for a
solution...
What you're asking for can be considered 'Federated Search':
http://en.wikipedia.org/wiki/Federated_search
And it can be conceived in as many ways as you have document types. Any
answer will probably end up
(bump) - any thoughts?
theDude_2 wrote:
>
> hi!
>
> I am trying to do something a little unique...
>
> I have a 90k text documents that I am trying to search
> Search A: indexes and searches the documents using regular relevancy
> search
> Search B: indexes and searches the documents us
Thanks Paul. Is there any alternative way of implementing this requirement?
As a side note, will the ShingleFilter help me get all possible
combinations of the input tokens?
-Original Message-
From: Paul Elschot [mailto:paul.elsc...@xs4all.nl]
Sent: Friday, April 17, 2009 8:00 PM
To
To avoid passing all combinations to a NearSpansQuery
some non trivial changes would be needed in the spans package.
NearSpansUnOrdered (and maybe also NearSpansOrdered)
would have to be extended to provide matching Spans when
(the Spans of) not all terms/subqueries match.
Also, quite likely, it
Hi Daniel,
Just open the IndexWriter on the same Lucene Directory and set the
boolean ctor parameter "create" to false.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: daniel susanto [mailto:daniel_s
Hi all,
I'm a newbie to Lucene.
Is it possible to add a new document to an existing index? I think this is
important so that we don't need to re-index all the files in a folder when just
one or two files need to be added to the index. Thx.
Daniel Susanto
http://susantodaniel.wordpress.com
Well, let's see the results of toString and/or Explain *from your code*.
Otherwise, you haven't given us much to go on.
Best
Erick
On Fri, Apr 17, 2009 at 1:07 AM, liat oren wrote:
> Thanks for the answer.
>
> In Luke, I used the WhiteSpaceAnalyzer as well. The scores AND the explain
> method w
To make the question simple,
What I need is the following :
If my document field is ( ab,bc,cd,ef) and Search tokens are (ab,bc,cd).
Given the following :
I should get a hit even if all of the search tokens aren't present
If the tokens are found they should be found within a distance x of ea
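Since SpanNearQuery insists that all terms be present, the requirement above can only be approximated. Here's a JDK-only sketch (the class and parameter names are my own, not Lucene API) of the intended semantics: slide a window of `maxDistance + 1` positions over the field's tokens and report a hit when at least `minMatch` distinct search tokens fall inside one window.

```java
import java.util.*;

// Does some window of `maxDistance + 1` consecutive positions in the
// field contain at least `minMatch` distinct search tokens?
class PartialNearMatcher {
    static boolean matches(List<String> fieldTokens, Set<String> searchTokens,
                           int maxDistance, int minMatch) {
        int window = maxDistance + 1;
        for (int start = 0; start < fieldTokens.size(); start++) {
            Set<String> found = new HashSet<>();
            int end = Math.min(start + window, fieldTokens.size());
            for (int i = start; i < end; i++) {
                if (searchTokens.contains(fieldTokens.get(i))) {
                    found.add(fieldTokens.get(i));
                }
            }
            if (found.size() >= minMatch) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> field = List.of("ab", "bc", "cd", "ef");
        Set<String> search = Set.of("ab", "bc", "cd");
        // a hit even though only 2 of the 3 search tokens are required
        System.out.println(matches(field, search, 2, 2));               // true
        System.out.println(matches(field, Set.of("xx", "yy"), 2, 2));   // false
    }
}
```

A real implementation inside the spans package would of course work on term positions from the index rather than a token list, but the windowed minimum-match logic is the same idea.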
Just a reminder that this London meet-up is on Monday the 27th. Please
sign up or otherwise let me know so I can make sure there's enough
space booked.
Rich
2009/4/6 Richard Marr :
> Hi all,
>
> Just to let everyone know... I'm organising (if you can call it that)
> an informal London meet-up in
On Fri, Apr 17, 2009 at 5:05 AM, MakMak wrote:
> I am not retrieving many docs, the problem is that the whole file is stored
> in the doc. I need the file content for highlighter to work. But the files
> are normal-sized text files which in any case should not exceed 10-15mb.
> Retrieving 25 of t
Actually, HitCollector itself isn't a performance killer (eg, at the
end of the day, all searches inside Lucene are using some HitCollector
to gather results).
What is a performance killer is if you do something overly substantial
(eg, calling IndexReader.document(...)) with every hit passed to th
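The pattern described above, collect only cheap (docId, score) pairs during the search and load stored fields afterwards for just the final page, can be sketched without Lucene. The collector class below is my own illustration, not Lucene's HitCollector API; it keeps a bounded min-heap so the per-hit work stays tiny:

```java
import java.util.*;

// Keep only the top N (docId, score) pairs in a bounded priority queue;
// stored fields would be loaded afterwards, only for these N hits.
class TopNCollector {
    private final int n;
    // min-heap ordered by score, so the weakest hit is evicted first
    private final PriorityQueue<double[]> heap =
        new PriorityQueue<>(Comparator.comparingDouble((double[] a) -> a[1]));

    TopNCollector(int n) {
        this.n = n;
    }

    void collect(int docId, double score) {
        heap.offer(new double[] { docId, score });
        if (heap.size() > n) {
            heap.poll(); // cheap: no document loading per hit
        }
    }

    // docIds of the surviving top-N hits, best score first
    List<Integer> topDocIds() {
        List<double[]> hits = new ArrayList<>(heap);
        hits.sort((a, b) -> Double.compare(b[1], a[1]));
        List<Integer> ids = new ArrayList<>();
        for (double[] h : hits) {
            ids.add((int) h[0]);
        }
        return ids;
    }

    public static void main(String[] args) {
        TopNCollector c = new TopNCollector(2);
        c.collect(1, 0.3);
        c.collect(2, 0.9);
        c.collect(3, 0.5);
        System.out.println(c.topDocIds()); // [2, 3]
    }
}
```

Fetching the stored fields for only the handful of ids returned by `topDocIds()` is what keeps memory bounded, instead of loading every matching document during collection.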
I am not retrieving many docs; the problem is that the whole file is stored
in the doc. I need the file content for the highlighter to work. But the files
are normal-sized text files which in any case should not exceed 10-15 MB.
Retrieving 25 of them (page size), worst case scenario, will take 250 MB of
Hi Alex,
As far as I know, HitCollector is useful when you need to deal with data from
ALL the docs in the index, but when you need just the top of them,
HitCollector is said to be a performance killer. Then it is better to use
Hits with the old API and TopDocs with the current one.
Ivan
AlexElba wrote:
Why
Can you describe your app a bit? How many documents are you
retrieving for each search?
It seems like Weblogic noticed a single HTTP request took more than
600 seconds and then dumped out all stack traces? In which case,
maybe the threads were not actually "stuck", but were doing something
that
Hi Steven,
Thanks for your reply.
I tried out your approach and the problem was solved to an extent, but it
still remains.
The problem is that the score still drops quite a bit, as bc is not found in
the combinations
(bc,cd), (bc,ef) and (ab,bc,cd,ef) etc.
The boosting in fact has a negative
Please do not mind these additional traces:
--
ExecuteThread: '30' for queue: 'weblogic.kernel.Default (self-tuning)' has
been busy for "647" seconds working on the request "Http Request:
/search_results.jsp