Have your Collector ensure that any docs
> > were in the Filter.
> >
> > FWIW
> > Erick
> >
> >
> >
> > On Mon, Nov 23, 2009 at 11:01 AM, Eran Sevi wrote:
> >
> >> I've taken TermsFilter from contrib which does exactly that
IDs using TermDocs.seek(Term) to see how long assembling
> the filter would take. Using the Filter in a query doesn't cost
> much at all
>
> Best
> Erick
>
>
> On Mon, Nov 23, 2009 at 8:12 AM, Eran Sevi wrote:
>
> > Erick,
> >
> > Maybe I didn
one you send to your query...
>
> If I'm off base here, could you post a reasonable extract of your filter
> construction code, and how you use them to search? Because I don't
> think we're all talking about the same thing here.
>
> HTH
> er...@
or loop, and see if there's *any* noticeable
> difference in speed. That'll tell you whether your problems
> arise from the filter construction/search or what you're doing
> in the collector.
>
> Best
> Erick
>
> On Sun, Nov 22, 2009 at 11:41 AM, Eran Sevi
e is being spent? That'd
> be a big help in suggesting alternatives. If I'm on the right track,
> I'd expect the time to be spent assembling the filters.
>
> Not much help here, but I'm having trouble wrapping my head
> around this...
>
> Best
> Erick
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -----Original Message-----
> > From: Eran Sevi [mailto:erans...@gmail.com]
> > Sent: Sunday, November 22, 2009 3:49
Hi,
I have a need to filter my queries using a rather large subset of terms (can
be 10K or even 50K).
All these terms are sure to exist in the index, so the number of results can
be about the same as the number of terms in the filter.
The terms are numbers, but they are not consecutive and are from a large set o
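A minimal sketch of the TermsFilter approach discussed above (Lucene 2.9-era
contrib API; the "id" field name, the ids collection, and the hit count are
assumptions, not from the thread):

    import java.io.IOException;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermsFilter; // contrib/queries
    import org.apache.lucene.search.TopDocs;

    static TopDocs searchWithIdFilter(IndexSearcher searcher, Query query,
                                      Iterable<String> ids) throws IOException {
        // The cost is in assembling the filter (one term per id, 10K-50K of
        // them); applying it during the search is cheap.
        TermsFilter filter = new TermsFilter();
        for (String id : ids) {
            filter.addTerm(new Term("id", id));
        }
        return searcher.search(query, filter, 1000);
    }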
Is there a recording of the Webinars for anyone who's missed it?
On Sat, Sep 19, 2009 at 12:03 AM, wrote:
> Free Webinar: Apache Lucene 2.9: Discover the Powerful New Features
et any work going, don't be shy to start posting code there, and
> perhaps you can get some additional eyes/help as you go.
>
> I think in the end, it might have to be an optional mode, if we get the
> code produced.
>
> --
> - Mark
>
> http://www.lucidimaginat
bit complicated, b/c actually getting the Spans
> is separate from doing the query. I agree there could be tighter
> integration. However, what you could do is use Spans.skipTo to move to the
> document you are examining in the search results.
>
> -Grant
>
>
> On Aug
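A small sketch of the Spans.skipTo suggestion above (Lucene 2.x spans API; the
SpanQuery, reader, and docId are assumed to come from your own search code):

    import java.io.IOException;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.spans.SpanQuery;
    import org.apache.lucene.search.spans.Spans;

    static void printSpansForDoc(SpanQuery sq, IndexReader reader, int docId)
            throws IOException {
        Spans spans = sq.getSpans(reader);
        // Jump straight to the document you are examining in the results.
        if (spans.skipTo(docId)) {
            while (spans.doc() == docId) {
                System.out.println("span at " + spans.start() + ".." + spans.end());
                if (!spans.next()) break;
            }
        }
    }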
Hi,
Does anyone know how to retrieve such a score for any kind of span query
(especially SpanNearQuery)?
Thanks,
Eran.
Hi,
How can I get the score of a span that is the result of SpanQuery.getSpans()
? The score can be the same for each document, but if it's unique per
span, that's even better.
I tried looking for a way to expose this functionality through the Spans
class but it looks too complicated.
I'm no
>
> For the synonyms with the weights, I tried the following code:
> BooleanQuery bq = new BooleanQuery();
> TermQuery tq = new TermQuery(new Term(WordIndex.FIELD_WORLDS, "3"));
> tq.setBoost(1.0f);
Hi,
You might want to take a look at Payloads. If you know the frequency of the
words in each world in advance, then during tokenization for each world you
could save the frequency as the payload.
During searches you could use BoostingTermQuery to take the frequency into
account.
Eran.
On Tue, Ap
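A hedged sketch of that payload idea (Lucene 2.4-era API; FrequencyPayloadFilter
and frequencyFor are hypothetical names, not from the thread):

    import java.io.IOException;
    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.TokenFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.index.Payload;

    class FrequencyPayloadFilter extends TokenFilter {
        FrequencyPayloadFilter(TokenStream in) { super(in); }

        public Token next(Token reusable) throws IOException {
            Token t = input.next(reusable);
            if (t != null) {
                // Store the known per-world frequency as a one-byte payload.
                t.setPayload(new Payload(new byte[] { frequencyFor(t.term()) }));
            }
            return t;
        }

        private byte frequencyFor(String term) {
            return 1; // placeholder: look up the precomputed frequency here
        }
    }

At search time, BoostingTermQuery feeds each payload into
Similarity.scorePayload, so the stored frequency affects the score, e.g.
new BoostingTermQuery(new Term(WordIndex.FIELD_WORLDS, "3")).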
est
> patch
> (see the case here: https://issues.apache.org/jira/browse/LUCENE-1465)
> since
> it fixed some important bugs I had come across.
>
> I hope this made sense, I haven't finished my morning coffee yet so I can't
> be too sure : ) Let me know if you h
Hi,
Can you please shed some light on what your final architecture looks like?
Do you manually use the PayloadSpanUtil for each document separately?
How did you solve the problem with phrase results?
Thanks in advance for your time,
Eran.
On Tue, Nov 25, 2008 at 10:30 PM, Greg Shackles <[EMAIL PROTE
If you don't have a lot of entries for each invoice, you can duplicate the
invoice for each entry - you'll have some field duplication (and a bigger
index size) between the different invoices, but it'll be easy to find exactly
what you want.
If you have too many different values, I built a solution s
Hi,
I have the same need - to obtain "attributes" for terms stored in some
field. I also need all the results and can't take just the first few docs.
I'm using an older version of Lucene, and the method I'm using right now is
this:
1. Store the words as usual in some field.
2. Store the attributes
imized index. That will delete the old files.
>
> On other OSs, which usually implement "delete on last close", the disk
> space should be automatically freed up once you close the old reader.
>
> Mike
>
>
> Eran Sevi wrote:
>
> Hi,
>>
>> I have the
Hi,
I have the following scenario using Lucene 2.1:
1. Open a reader on the index to perform some searches.
2. Use the reader to check if the index is optimized.
3. Open a writer and run optimize().
4. Close the old reader and open a new reader for further searches.
I expected that after closing the old reader, the
Hi Chris,
I asked exactly the same question a little while ago and got a pretty good
answer from Paul Elschot.
Try searching the archives for 'Filtering a SpanQuery'. It was around
13/5/08.
Hope it helps,
Eran.
On Mon, Aug 25, 2008 at 8:18 PM, Christopher M Collins
<[EMAIL PROTECTED]>wrote:
This is from the "IndexWriter.addIndexes(Directory[])" documentation:
> >
> > "This method is transactional in how Exceptions are handled: it does not
> > commit a new segments_N file until all indexes are added. This means if
> an
> > Exception occurs (for example
are fires and floods and earthquakes to consider
>
> Best
> Erick
>
> On Thu, Jun 26, 2008 at 10:28 AM, Eran Sevi <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > I'm looking for the correct way to create an index given the following
> > restrictions:
Hi,
I'm looking for the correct way to create an index given the following
restrictions:
> > 1. The documents are received in batches of variable sizes (no more than
100 docs in a batch).
2. The batch insertion must be transactional - either the whole batch is
added to the index (exists physically o
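A sketch of the transactional pattern the addIndexes documentation quoted above
suggests: stage each batch in its own RAMDirectory, then merge it into the main
index in a single call, which either commits the whole batch or nothing. The
analyzer choice here is an assumption.

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.RAMDirectory;

    static void addBatch(IndexWriter mainWriter, Iterable<Document> batch)
            throws Exception {
        // Stage the batch in a throwaway in-memory index first.
        Directory ramDir = new RAMDirectory();
        IndexWriter ramWriter = new IndexWriter(ramDir, new StandardAnalyzer(), true);
        for (Document doc : batch) {
            ramWriter.addDocument(doc); // up to ~100 docs per batch
        }
        ramWriter.close();
        // addIndexes is transactional: no new segments_N file is committed
        // until the whole batch has been added.
        mainWriter.addIndexes(new Directory[] { ramDir });
    }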
Hi,
I'm running a SpanQuery and get the Spans result, which tells me the documents
and positions of what I searched for.
I would now like to get the payloads in those documents and positions
without having to iterate on TermPositions since I don't have a term but I
do have the document and position.
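For what it's worth, in later Lucene versions (2.9-era) the Spans returned by
SpanQuery.getSpans() can hand back payloads directly, which avoids the
TermPositions detour; a minimal sketch:

    import java.io.IOException;
    import java.util.Collection;
    import java.util.Iterator;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.spans.SpanQuery;
    import org.apache.lucene.search.spans.Spans;

    static void dumpPayloads(SpanQuery sq, IndexReader reader) throws IOException {
        Spans spans = sq.getSpans(reader);
        while (spans.next()) {
            if (spans.isPayloadAvailable()) {
                Collection payloads = spans.getPayload(); // Collection of byte[]
                for (Iterator it = payloads.iterator(); it.hasNext();) {
                    byte[] payload = (byte[]) it.next();
                    System.out.println(spans.doc() + ":" + spans.start()
                            + " -> " + payload.length + " payload bytes");
                }
            }
        }
    }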
On Wednesday 07 May 2008 10:18:38, Eran Sevi wrote:
> > Thanks Paul for your reply,
> >
> > Since my index contains a couple of million documents and the filter
> > is supposed to limit the search space to a few thousand, I was hoping
> > I wouldn't have to do the filtering m
:
> > Eran,
> >
> > On Tuesday 06 May 2008 10:15:10, Eran Sevi wrote:
> > > Hi,
> > >
> > > I am looking for a way to filter a SpanQuery according to some
> > > other query (on another field from the one used for the SpanQuery).
> > >
Hi,
I am looking for a way to filter a SpanQuery according to some other query
(on another field from the one used for the SpanQuery). I need to get access
to the spans themselves of course.
I don't care about the scoring of the filter results and just need the
positions of hits found in the docu
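Paul's original answer isn't shown in this digest, but one simple way to sketch
the idea (Lucene 2.x Filter API; spanQuery and otherQuery are placeholders):
wrap the other query in a QueryWrapperFilter, then walk the spans and keep only
the docs that pass the filter.

    import java.io.IOException;
    import java.util.BitSet;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.QueryWrapperFilter;
    import org.apache.lucene.search.spans.SpanQuery;
    import org.apache.lucene.search.spans.Spans;

    static void filteredSpans(SpanQuery spanQuery, Query otherQuery,
                              IndexReader reader) throws IOException {
        // Materialize the other query as a doc-id bit set (2.x Filter API).
        BitSet bits = new QueryWrapperFilter(otherQuery).bits(reader);
        Spans spans = spanQuery.getSpans(reader);
        while (spans.next()) {
            if (bits.get(spans.doc())) {
                // spans.start()/spans.end() for a doc that matches both queries
            }
        }
    }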
If you read the payloads in sequence, they're not arranged by their original
position, whereas when you use a stored field you get the terms in the
correct order.
If you need to sort the values anyway, it doesn't matter, of course.
On Fri, Apr 25, 2008 at 5:42 PM, Nadav Har'El <[EMAIL PROTECTED]>
wrote:
> On
e would be good).
The code you're executing when you get the error.
Imagine you're trying to advise someone else and think about what you'd find
useful and try to provide that, please.
Best
Erick
On Wed, Mar 19, 2008 at 9:54 AM, Eran Sevi <[EMAIL PROTECTED]> wro
Hi,
I'm trying to write to a specific index from several different processes and
encounter problems with locked files (deletable for example).
I don't perform any specific locking because, as I understand it, there should
be a file-specific locking mechanism used by the Lucene API. This doesn't seem
to be
Indeed it seems like a problematic way.
I would also have a problem searching for documents with more than one
value. If the query is something simple like "value1 AND value2", I would
expect to get all XML docs with both values, but if I use the doc=element
method, I won't get any result because
Hi,
What's the best way to query Lucene for a "bigger than" term, for example
"value > 10"?
I know there's a range query where I can use a large upper bound, but maybe
there's something more efficient (instead of Lucene transforming the query
into thousands of OR clauses).
Thanks,
Eran.
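One common sketch of an answer (Lucene 2.x-era API; the field name "value" and
the padding width are assumptions): index the numbers zero-padded so
lexicographic order matches numeric order, then use a constant-score range
query with an open upper bound instead of expanding into OR clauses.

    import org.apache.lucene.search.ConstantScoreRangeQuery;
    import org.apache.lucene.search.Query;

    // "value > 10": lower bound exclusive, no upper bound. The numbers must
    // be indexed zero-padded (e.g. "0000000010") so string order agrees
    // with numeric order.
    Query q = new ConstantScoreRangeQuery("value", "0000000010", null, false, true);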
?
Thanks in advance.
On Tue, Mar 11, 2008 at 5:48 PM, Steven A Rowe <[EMAIL PROTECTED]> wrote:
> Hi Eran, see my comments below inline:
>
> On 03/11/2008 at 9:23 AM, Eran Sevi wrote:
> > I would like to ask for suggestions of the best design for
> > the following scen
Hi,
I would like to ask for suggestions of the best design for the following
scenario:
I have a very large number of XML files (around 1M).
Each file contains several sections. Each section contains many elements
(about 1000-5000).
Each element has a value and some attributes describing the value
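To make the question concrete, one possible mapping is one Lucene Document per
element (the field names here are made up, and the doc=element drawback
discussed earlier in this digest still applies):

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;

    static Document elementToDoc(String fileName, String sectionId,
                                 String value, String attributeText) {
        Document doc = new Document();
        doc.add(new Field("file", fileName, Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.add(new Field("section", sectionId, Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.add(new Field("value", value, Field.Store.YES, Field.Index.TOKENIZED));
        doc.add(new Field("attrs", attributeText, Field.Store.YES, Field.Index.TOKENIZED));
        return doc;
    }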