from:"will martin"

RE: A really hairy token graph case

2014-10-24 Thread Will Martin

Hi Benson: This is the case with n-gramming (though you have a more complicated start chooser than most I imagine). Does that help get your ideas unblocked? Will -Original Message- From: Benson Margulies [mailto:bimargul...@gmail.com] Sent: Friday, October 24, 2014 4:43 PM To: java-us

RE: A really hairy token graph case

2014-10-24 Thread Will Martin

lemma2 PI 0 lemmaN PI 0 comp0-1 PI 0 comp1-1 PI 0 comp0-N compM-N That is, group all the first-components, and all the second-components. But now the bits and pieces of the compounds are interspersed. Maybe that's OK. On Fri, Oct 2

RE: hello,I have a problem about lucene,please help me to explain ,thank you

2015-09-22 Thread will martin

Hi: Would you mind doing a web search and cataloging the relevant pages into a primer? Thx, Will -Original Message- From: 王建军 [mailto:jianjun200...@163.com] Sent: Tuesday, September 22, 2015 4:02 AM To: java-user@lucene.apache.org Subject: hello,I have a problem about lucene,please help me t

RE: Solr java.lang.OutOfMemoryError: Java heap space

2015-09-28 Thread will martin

http://opensourceconnections.com/blog/2014/07/13/reindexing-collections-with-solrs-cursor-support/ -Original Message- From: Ajinkya Kale [mailto:kaleajin...@gmail.com] Sent: Monday, September 28, 2015 2:46 PM To: solr-u...@lucene.apache.org; java-user@lucene.apache.org Subject: Solr jav

RE: Lucene 5 : any merge performance metrics compared to 4.x?

2015-09-29 Thread will martin

So, if it's new, it adds to pre-existing time? So it is a cost that needs to be understood, I think. And, I'm really curious, what happens to the result of the post-merge checkIntegrity IFF (if and only if) there was corruption pre-merge: I mean if you let it merge anyway could you get a false

RE: Lucene 5 : any merge performance metrics compared to 4.x?

2015-09-29 Thread will martin

o we implemented a check step once the index is in its final state to ensure that it is OK. So, since we want to do the check post-merge, is there a way to disable the check during merge so we don't have to do two checks? Thanks! Jim ____ From: will mar

RE: Lucene 5 : any merge performance metrics compared to 4.x?

2015-09-29 Thread will martin

rom the runtime system. The file system is EMC Isilon via NFS. Jim ____ From: will martin Sent: 29 September 2015 14:29 To: java-user@lucene.apache.org Subject: RE: Lucene 5 : any merge performance metrics compared to 4.x? This sounds robust. Is the index

RE: Lucene 5 : any merge performance metrics compared to 4.x?

2015-09-30 Thread will martin

call IndexReader.checkIntegrity. Mike McCandless http://blog.mikemccandless.com On Tue, Sep 29, 2015 at 9:00 PM, will martin wrote: > Ok So I'm a little confused: > > The 4.10 JavaDoc for LiveIndexWriterConfig supports volatile access on > a flag to setCheckIntegrityAtMerge ..

Re: debugging growing index size

2015-11-13 Thread will martin

Hi Rob: Doesn’t this look like known SE issue JDK-4724038 and discussed by Peter Levart and Uwe Schindler on a lucene-dev thread 9/9/2015? MappedByteBuffer …. what OS are you on Rob? What JVM? http://bugs.java.com/view_bug.do?bug_id=4724038 http://mail-archives.apache.org/mod_mbox/lucene-dev/

Re: Jensen–Shannon divergence

2015-12-13 Thread will martin

expand your due diligence beyond wikipedia: i.e. http://ciir.cs.umass.edu/pubfiles/ir-464.pdf > On Dec 13, 2015, at 8:30 AM, Shay Hummel wrote: > > LMDiricletbut its feasibilit

Re: Jensen–Shannon divergence

2015-12-13 Thread will martin

g'luck > On Dec 13, 2015, at 10:55 AM, Shay Hummel wrote: > > Hi > > I am sorry but I didn't understand your answer. Can you please elaborate? > > Shay > > On Sun, Dec 13, 2015 at 3:41 PM will martin wrote: > >> expand your due d

Re: Jensen–Shannon divergence

2015-12-14 Thread will martin

cool list. Thanks Uwe. Opportunities to gain competitive advantage in selected domains. > On Dec 14, 2015, at 6:02 PM, Uwe Schindler wrote: > > Hi, > > Next to BM25 and TF-IDF, Lucene also provides many more similarity > implementations: > > https://lucene.apache.org/core/5_4_0/core/org/apac

Re: Any lucene query sorts docs by Hamming distance?

2015-12-22 Thread will martin

Yonghui: Do you mean sort, rank or score? Thanks, Will > On Dec 22, 2015, at 4:02 AM, Yonghui Zhao wrote: > > Hi, > > Is there any query can sort docs by hamming distance if field values are > same length, > > Seems fuzzy query only works on edit distance. ---
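For readers skimming this thread: the distinction between sorting and scoring aside, the distance itself is cheap to compute. The following is a minimal sketch in plain Java, not a Lucene query or API. It assumes each field value fits into a 64-bit `long` of the same width, and the class and method names (`HammingSortSketch`, `hamming`, `sortByHamming`) are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class HammingSortSketch {

    /** Hamming distance between two equal-width bit patterns packed into longs. */
    static int hamming(long a, long b) {
        // XOR leaves a 1 wherever the two values differ; bitCount counts those 1s.
        return Long.bitCount(a ^ b);
    }

    /** Returns a new list ordered by ascending Hamming distance to the query value. */
    static List<Long> sortByHamming(List<Long> values, long query) {
        List<Long> sorted = new ArrayList<>(values);
        // List.sort is stable, so values at equal distance keep their input order.
        sorted.sort(Comparator.comparingInt((Long v) -> hamming(v, query)));
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical field values standing in for per-document bit signatures.
        List<Long> docs = Arrays.asList(0b1111L, 0b1000L, 0b1010L);
        System.out.println(sortByHamming(docs, 0b1010L));
    }
}
```

Whether this ordering is applied as a sort, a rank, or folded into a score inside the search engine is exactly the question asked above; the sketch only shows the distance computation and a client-side ordering.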

Re: range query highlighting

2015-12-23 Thread will martin

Todd: "This trick just converts the multi term queries like PrefixQuery or RangeQuery to boolean query by expanding the terms using index reader." http://stackoverflow.com/questions/7662829/lucene-net-range-queries-highlighting beware cost. (my comment) g’luck will > On Dec 23, 2015, at 4:49

Re: Any lucene query sorts docs by Hamming distance?

2015-12-24 Thread will martin

m distance 0 to 3. > > 2015-12-22 21:42 GMT+08:00 will martin : > >> Yonghui: >> >> Do you mean sort, rank or score? >> >> Thanks, >> Will >> >> >> >>> On Dec 22, 2015, at 4:02 AM, Yonghui Zhao wrote: >>> >&

Re: SolrIndexSearcher throws Misleading Error Message When timeAllowed is Specified.

2016-01-08 Thread will martin

Please read the javadoc for System.nanoTime(). I won’t bore you with the details about how computer clocks work. > On Jan 8, 2016, at 4:14 AM, Vishnu Mishra wrote: > > I am using Solr 5.3.1 and we are facing OutOfMemory exception while doing > some complex wildcard and proximity query (even fo

Re: how to backup index files with Replicator

2016-01-23 Thread will martin

Hi Dancer: Found this thread with good info that may be irrelevant to your scenario, but this in particular struck me: writer.waitForMerges(); writer.commit(); replicator.replicate(new IndexRevision(writer)); writer.close(); even though writer.close() can

Re: Searching in a bitMask

2016-08-27 Thread will martin

Hi, aren’t we waltzing terribly close to the use of a bit vector in your field caches? There’s no reason not to filter longword operations on a cache if alignment is consistent across multiple caches. Just be sure to abstract your operations away from individual bits, imo. -will > On Aug 27, 2

Re: Multi-field IDF

2016-11-17 Thread Will Martin

are you familiar with pivoted normalized document length practice or theory? or croft's recent work on relevance algorithms accounting for structured field presence? On 11/17/2016 5:20 PM, Nicolás Lichtmaier wrote: That depends on what you want. In this case I want to use a discrimination po

Re: Multi-field IDF

2016-11-18 Thread Will Martin

In this work, we aim to improve the field weighting for structured document retrieval. We first introduce the notion of field relevance as the generalization of field weights, and discuss how it can be estimated using relevant documents, which effectively implements relevance feedback for f

Re: Explain Scoring function in LMJelinekMercerSimilarity Class

2016-12-20 Thread Will Martin

https://doi.org/10.3115/981574.981579 On 12/20/2016 12:21 PM, Dwaipayan Roy wrote: Hello, Can anyone help me understand the scoring function in the LMJelinekMercerSimilarity class? The scoring function in LMJelinekMercerSimilarity is shown below: -

Re: Format of Wikipedia Index

2018-01-22 Thread Will Martin

From the javadoc for DocMaker: * *doc.stored* - specifies whether fields should be stored (default *false*). * *doc.body.stored* - specifies whether the body field should be stored (default = *doc.stored*). So ootb you won't get content stored. Does this help? regards -will On 1/22/2

22 matches
