How to index & search arrays of double?

2015-08-06 Thread Estanislao Oubel
Hello everybody, I'm currently investigating methods for content-based image retrieval. In this context, I would like to index documents containing arrays of doubles and then perform an approximate search based on these arrays. For example, I would like to insert in the index three documents (d1,d

Re: How to index & search arrays of double?

2015-08-06 Thread Phaneendra N
Hello Stan, Great question. I came across one such implementation based on Lucene. It's called LIRE. It is an open source project: http://www.lire-project.net/ You might get some ideas there. Please let me know if you find answers to your specific questions there. I'm curious. Thanks Phan

Re: How to index & search arrays of double?

2015-08-06 Thread Estanislao Oubel
Thanks Phaneendra for responding. I know LIRE and have been playing around with this library, but I don't understand what the added value is. To be more specific, LIRE allows computing several image features and the similarity between them. No problem so far. My main concern is that the index used by L

Re: How to index & search arrays of double?

2015-08-06 Thread McKinley, James T
Hi Stan, I played around with LIRE a couple of years ago. I don't know exactly how it works, but from what I remember it doesn't just use Lucene; it has its own classes built around Lucene to perform the image search. There used to be a PDF of a paper on the site, but I couldn't find a link when

Standard highlighter returns whole document as a fragment

2015-08-06 Thread Robert Alexander
Hey everyone, I ran into an issue with the standard highlighter in 4.10.4 and was hoping that someone could help. I'm attempting to fragment a result based on a SpanNearQuery. If the words in the query are next to each other, the fragmenter will often return one large result containing the entire

Why do the Japanese analyser FST files change every release?

2015-08-06 Thread Trejkaz
I have recently done updates from Lucene 3.6 to 4.x and 4.x to 5.2. During this process, I noticed that the FST used by the Japanese analyser (AKA Kuromoji) was changing between releases. As I fear breakages in backwards compatibility, I worried that the dictionary had changed, so I wrote a little

bug of highlighter/SimpleSpanFragmenter, returned longer fragment than expected?

2015-08-06 Thread Duke DAI
Hi experts, I'm trying to reproduce a bug on the Lucene side and found something. In the latest codeline, 5.2.1, I slightly modified the test case HighlighterTest.testSimpleQueryTermScorerHighlighter as shown below, mainly to use SimpleSpanFragmenter to get only one fragment with length 64. public void testS

Re: Why do the Japanese analyser FST files change every release?

2015-08-06 Thread Dawid Weiss
It is (b). D. On Fri, Aug 7, 2015 at 3:05 AM, Trejkaz wrote: > I have recently done updates from Lucene 3.6 to 4.x and 4.x to 5.2. > > During this process, I noticed that the FST used by the Japanese > analyser (AKA Kuromoji) was changing between releases. As I fear > breakages in backwards comp

Mapping doc values back to doc ID (in decent time)

2015-08-06 Thread Trejkaz
Hi all. It's that time again. I'm trying to kill off our long-standing reliance on stable doc IDs. To that end, I am adding an additional field which contains the ID. But we use these IDs a lot and for all kinds of purposes, and in some of these purposes, many lookups are done at once, so perform