: sloppyFreq(distance). hyperbolicTf() only comes into play if you
: override the tf method in your own subclass to call it instead of the
: baselineTf which it normally calls. I also didn't get what it was
: trying to do.
Correct, as documented...
http://lucene.apache.org/core/old_versioned
My IndexWriter is created only once and cached in memory.
I restarted Tomcat this morning, and the index became 94M... But I restarted
it several times yesterday, and it was still too big...
My deletion policy is in the reply above; it only compares the timestamp and
does not actually delete commits.
--
View this message in cont
I'd love to hear what you find out. I have been working with this also.
I only changed the sweet spot to a slightly larger range than the one in the
original paper (but kept the same steepness) and I tweaked the sloppy freq to
not score multiple occurrences of a phrase as strongly as they are i
Thanks Eric,
Yes, the limitations you pointed out confirm my first feeling on it. Even
if it is doable with Solr or Lucene, I would have to go deep inside of
it to get the most out of it.
About my RDBMS issues... there are 2 reasons:
First, I'm interested in this whole cloud craziness. I love to work
Actually, you might well have your index be larger than your source, assuming
you're going to be both storing and indexing everything.
There's also the "deep paging" issue, see:
https://issues.apache.org/jira/browse/SOLR-1726
which comes into play if you expect to return a lot of rows.
Solr really
> : Basically for queries such as field1:foo AND field2:*bar, I think it
> : would be highly beneficial to restrict evaluation of the second field on
> : the result of the first to avoid scanning the index in its entirety due
> : to the leading wildcard.
>
> This is exactly how the BooleanQuery cl
: Basically for queries such as field1:foo AND field2:*bar, I think it
: would be highly beneficial to restrict evaluation of the second field on
: the result of the first to avoid scanning the index in its entirety due
: to the leading wildcard.
This is exactly how the BooleanQuery class in Luce
Hi guys,
I hope I'm sending this to the right place.
I have this possible idea in mind (still fuzzy, but enough to describe
this), and I was wondering if Lucene or Solr could help in this. I've
implemented a Lucene index on custom enterprise data before and have
it running on Azure as well, so I
Hi
I have a noobie question. I am trying to use the SweetSpotSimilarity (SSS)
class.
http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/api/contrib-misc/org/apache/lucene/misc/SweetSpotSimilarity.html
I understand the scoring behavior of Lucene
http://lucene.apache.org/core/old_ve
Hi again,
I just have to remind you that sorting on multi-valued fields is not supported by
Lucene! This has nothing to do with numeric, it just does not work and may
throw other exceptions depending on the version you use.
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.th
Uwe, thank you very much. This sounds like pretty much the best solution!
2012/2/15 Uwe Schindler :
> Hi,
>
> Thanks for the explanation. I almost expected that it has to do with "stored
> fields". It's easy to fix:
>
>> ah ok, I know what you mean. We have to read out the stored field values
>> later.
Hi,
Thanks for the explanation. I almost expected that it has to do with "stored
fields". It's easy to fix:
> ah ok, I know what you mean. We have to read out the stored field values
> later.
> A field can have multiple (stored) values (several
> document.add(fieldable) invocations for one field).
ah ok, I know what you mean. We have to read out the stored field
values later. A field can have multiple (stored) values (several
document.add(fieldable) invocations for one field). Further, we have
the problem that some field values are logically related to each
other. Since Lucene has no possibi
Hi,
This looks like an XY problem
(http://www.perlmonks.org/index.pl?node_id=542341). Maybe you should first
explain to us why you need that. In Lucene, fields have no "equal length" or
anything like that; numeric fields in particular are tokenized and consist of
several tokens separately indexe
Hi,
I've been looking for a short circuit AND operator in Lucene or a way to
do subquerying.
Basically for queries such as field1:foo AND field2:*bar, I think it
would be highly beneficial to restrict evaluation of the second field on
the result of the first to avoid scanning the index in its
Is your deletion policy actually deleting commits?
Mike McCandless
http://blog.mikemccandless.com
On Wed, Feb 15, 2012 at 5:21 AM, superruiye wrote:
> http://lucene.472066.n3.nabble.com/file/n3746464/index.jpg
>
> The index files are the same size, and the index increased to 7.5G in one day, but
> it
http://lucene.472066.n3.nabble.com/file/n3746464/index.jpg
The index files are the same size, and the index increased to 7.5G in one day, but
it should only be 90-100M...
--
View this message in context:
http://lucene.472066.n3.nabble.com/Why-read-past-EOF-tp3639401p3746464.html
Sent from the Lucene - J
For now, Lucene doesn't provide anything like this.
Maybe you can diff each version before adding it to the index, so that it only
indexes and stores the differences for each newer version.
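A rough sketch of the diff idea, not tied to any Lucene API: it assumes each version's text has already been extracted from the PDF or Word file, and it uses Python's standard difflib. The function name added_lines and the sample strings are made up for illustration only. Only the lines that are new or changed in the later version would then be handed to the indexer.

```python
import difflib


def added_lines(old_text, new_text):
    """Return the lines of new_text that are new or changed relative to old_text.

    Hypothetical helper: the returned lines are what one might index and
    store for the newer version instead of the full text.
    """
    old = old_text.splitlines()
    new = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" both contribute lines from the new version;
        # "equal" and "delete" contribute nothing new to index.
        if tag in ("replace", "insert"):
            added.extend(new[j1:j2])
    return added


# Illustrative data only.
v1 = "alpha\nbeta\ngamma"
v2 = "alpha\nbeta revised\ngamma\ndelta"
delta = added_lines(v1, v2)
```

Note that deletions are dropped in this sketch, so a search on the delta alone cannot tell that a line was removed; whether that matters depends on what the versions are used for.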
On Wed, Feb 15, 2012 at 4:25 PM, Jamie wrote:
> Greetings All.
>
> I'd like to index data corresponding to different versions
Greetings All.
I'd like to index data corresponding to different versions of the same
file. These files consist of PDF documents, Word documents, and the
like. So as to ensure that no information is lost, I'd like to create a
new Lucene document for every version (or change) in a file. Each