Hi Benson:
This is the case with n-gramming (though you have a more complicated start
chooser than most, I imagine). Does that help get your ideas unblocked?
Will
-Original Message-
From: Benson Margulies [mailto:bimargul...@gmail.com]
Sent: Friday, October 24, 2014 4:43 PM
To: java-us
lemma2 PI 0
lemmaN PI 0
comp0-1 PI 0
comp1-1 PI 0
comp0-N
compM-N
That is, group all the first-components, and all the second-components.
But now the bits and pieces of the compounds are interspersed. Maybe that's OK.
On Fri, Oct 2
Hi:
Would you mind doing a web search and cataloging the relevant pages into a
primer?
Thx,
Will
-Original Message-
From: 王建军 [mailto:jianjun200...@163.com]
Sent: Tuesday, September 22, 2015 4:02 AM
To: java-user@lucene.apache.org
Subject: hello,I have a problem about lucene,please help me t
http://opensourceconnections.com/blog/2014/07/13/reindexing-collections-with-solrs-cursor-support/
-Original Message-
From: Ajinkya Kale [mailto:kaleajin...@gmail.com]
Sent: Monday, September 28, 2015 2:46 PM
To: solr-u...@lucene.apache.org; java-user@lucene.apache.org
Subject: Solr jav
So, if it's new, it adds to pre-existing time? So it is a cost that needs to be
understood, I think.
And, I'm really curious, what happens to the result of the post-merge
checkIntegrity IFF (if and only if) there was corruption pre-merge: I mean if
you let it merge anyway could you get a false
o
we implemented a check step once the index is in its final state to ensure
that it is OK.
So, since we want to do the check post-merge, is there a way to disable the
check during merge so we don't have to do two checks?
Thanks!
Jim
____
From: will mar
rom the runtime system.
The file system is EMC Isilon via NFS.
Jim
____
From: will martin
Sent: 29 September 2015 14:29
To: java-user@lucene.apache.org
Subject: RE: Lucene 5 : any merge performance metrics compared to 4.x?
This sounds robust. Is the index
call IndexReader.checkIntegrity.
Mike McCandless
http://blog.mikemccandless.com
On Tue, Sep 29, 2015 at 9:00 PM, will martin wrote:
> Ok So I'm a little confused:
>
> The 4.10 JavaDoc for LiveIndexWriterConfig supports volatile access on
> a flag to setCheckIntegrityAtMerge ..
Hi Rob:
Doesn’t this look like the known Java SE issue JDK-4724038, discussed by Peter
Levart and Uwe Schindler on a lucene-dev thread on 9/9/2015?
MappedByteBuffer ... What OS are you on, Rob? What JVM?
http://bugs.java.com/view_bug.do?bug_id=4724038
http://mail-archives.apache.org/mod_mbox/lucene-dev/
expand your due diligence beyond wikipedia:
i.e.
http://ciir.cs.umass.edu/pubfiles/ir-464.pdf
> On Dec 13, 2015, at 8:30 AM, Shay Hummel wrote:
>
> LMDiricletbut its feasibilit
g'luck
> On Dec 13, 2015, at 10:55 AM, Shay Hummel wrote:
>
> Hi
>
> I am sorry but I didn't understand your answer. Can you please elaborate?
>
> Shay
>
> On Sun, Dec 13, 2015 at 3:41 PM will martin wrote:
>
>> expand your due d
cool list. Thanks Uwe.
Opportunities to gain competitive advantage in selected domains.
> On Dec 14, 2015, at 6:02 PM, Uwe Schindler wrote:
>
> Hi,
>
> Next to BM25 and TF-IDF, Lucene also provides many more similarity
> implementations:
>
> https://lucene.apache.org/core/5_4_0/core/org/apac
Yonghui:
Do you mean sort, rank or score?
Thanks,
Will
> On Dec 22, 2015, at 4:02 AM, Yonghui Zhao wrote:
>
> Hi,
>
> Is there any query that can sort docs by Hamming distance if the field
> values are the same length?
>
> It seems fuzzy query only works on edit distance.
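
To make the question concrete: Hamming distance between two equal-length values is the number of positions at which they differ. The sketch below is plain Java, not a Lucene query; it assumes the candidate field values have already been fetched and only need re-ordering in memory. The class and method names (HammingSort, hamming, sortByHamming) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: in-memory ordering by Hamming distance.
// This does not use or imitate any Lucene API.
class HammingSort {

    // Counts the positions at which two equal-length strings differ.
    static int hamming(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("values must have the same length");
        }
        int distance = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                distance++;
            }
        }
        return distance;
    }

    // Returns a copy of values ordered by ascending distance to query.
    // List.sort is stable, so ties keep their original relative order.
    static List<String> sortByHamming(String query, List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.comparingInt((String v) -> hamming(query, v)));
        return sorted;
    }
}
```

Whether this counts as sorting, ranking or scoring depends on where the distance is computed; here it is a pure post-processing sort.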
---
Todd:
"This trick just converts the multi term queries like PrefixQuery or RangeQuery
to boolean query by expanding the terms using index reader."
http://stackoverflow.com/questions/7662829/lucene-net-range-queries-highlighting
beware cost. (my comment)
g’luck
will
> On Dec 23, 2015, at 4:49
m distance 0 to 3.
>
> 2015-12-22 21:42 GMT+08:00 will martin :
>
>> Yonghui:
>>
>> Do you mean sort, rank or score?
>>
>> Thanks,
>> Will
>>
>>
>>
>>> On Dec 22, 2015, at 4:02 AM, Yonghui Zhao wrote:
>>>
>&
Please read the javadoc for System.nanoTime(). I won’t bore you with the
details about how computer clocks work.
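
For readers who have not looked at that javadoc: the value returned by System.nanoTime() has an arbitrary origin, so it is only meaningful as a difference between two readings taken in the same JVM, and the documentation recommends comparing readings by subtraction because of possible numeric overflow. A minimal sketch of that usage, with a hypothetical class name (ElapsedTimer):

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper illustrating the usage pattern described in the
// System.nanoTime() javadoc; not taken from Solr or Lucene.
class ElapsedTimer {

    // A single reading means nothing on its own; keep it only to subtract later.
    private final long start = System.nanoTime();

    // Elapsed time is the difference of two readings.
    long elapsedNanos() {
        return System.nanoTime() - start;
    }

    long elapsedMillis() {
        return TimeUnit.NANOSECONDS.toMillis(elapsedNanos());
    }

    // Compare via subtraction (now - deadline >= 0), not now >= deadline,
    // so the test stays correct if the counter wraps around.
    static boolean deadlinePassed(long deadlineNanos) {
        return System.nanoTime() - deadlineNanos >= 0;
    }
}
```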
> On Jan 8, 2016, at 4:14 AM, Vishnu Mishra wrote:
>
> I am using Solr 5.3.1 and we are facing OutOfMemory exception while doing
> some complex wildcard and proximity query (even fo
Hi Dancer:
Found this thread with good info that may be irrelevant to your scenario, but
this in particular struck me:
writer.waitForMerges();
writer.commit();
replicator.replicate(new IndexRevision(writer));
writer.close();
—
even though writer.close() can
hi
aren’t we waltzing terribly close to the use of a bit vector in your field
caches?
there’s no reason to not filter longword operations on a cache if alignment is
consistent across multiple caches
just be sure to abstract your operations away from individual bits….imo
-will
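
A rough illustration of the longword idea, assuming each cache is exposed as a long[] in which bit i of the vector lives in word i / 64 and all caches share that alignment. The class and method names (WordFilter, and, cardinality, get) are hypothetical and not part of any Lucene field cache API.

```java
// Hypothetical sketch: filtering two aligned bit vectors 64 bits at a time.
class WordFilter {

    // Intersects two caches word by word; callers never touch single bits.
    static long[] and(long[] a, long[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("caches must be aligned to the same length");
        }
        long[] out = new long[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] & b[i];
        }
        return out;
    }

    // Number of set bits across all words.
    static int cardinality(long[] words) {
        int count = 0;
        for (long w : words) {
            count += Long.bitCount(w);
        }
        return count;
    }

    // Single-bit read kept behind a method so the layout stays an internal detail.
    // Java masks the shift distance of a long to its low 6 bits, so 1L << index
    // selects bit (index % 64) of the chosen word.
    static boolean get(long[] words, int index) {
        return (words[index >>> 6] & (1L << index)) != 0;
    }
}
```

java.util.BitSet offers the same operations if a hand-rolled long[] is not needed.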
> On Aug 27, 2
Are you familiar with pivoted normalized document length practice or
theory? Or Croft's recent work on relevance algorithms accounting for
structured field presence?
On 11/17/2016 5:20 PM, Nicolás Lichtmaier wrote:
That depends on what you want. In this case I want to use a
discrimination po
In this work, we aim to improve the field weighting for structured document
retrieval. We first introduce the notion of field relevance as the
generalization of field weights, and discuss how it can be estimated using
relevant documents, which effectively implements relevance feedback for
f
https://doi.org/10.3115/981574.981579
On 12/20/2016 12:21 PM, Dwaipayan Roy wrote:
Hello,
Can anyone help me understand the scoring function in the
LMJelinekMercerSimilarity class?
The scoring function in LMJelinekMercerSimilarity is shown below:
-
From the javadoc for DocMaker:
* *doc.stored* - specifies whether fields should be stored (default
*false*).
* *doc.body.stored* - specifies whether the body field should be
stored (default = *doc.stored*).
So out of the box you won't get content stored. Does this help?
regards
-will
On 1/22/2