Hi Heshan,
I think you can achieve what you are looking for. You may want to read
"Lucene in Action, 2nd edition" on Lucene's scoring system and FuzzyQuery.
Hope this helps. Maybe someone can suggest a much better approach.
On Wed, Apr 1, 2015 at 8:14 AM, hesh jay wrote:
> hi,
> I am second year under
Hi All,
I have successfully set up merged indices, and drill-down and the usual
search operations work perfectly.
But I have a side question. If I select RAMDirectory as the destination
index when merging, the JVM can probably run out of memory if the merged
indices are too big. Is there a way I can ha
Hi Gimantha,
Why do you use a RAMDirectory? If your merged index fits into RAM completely, a
MMapDirectory should offer almost the same performance. And if not, it is
definitely the better choice.
Regards
Christoph
On 02.04.2015 at 12:38, Gimantha Bandara wrote:
Hi All,
I have successfully
In some cases, MMapDirectory offers even better performance, since the JVM
doesn't need to manage that RAM when it's doing GC.
Also, using only RAMDirectory is not safe: if the JVM crashes, your
index is lost.
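
To illustrate the memory-mapping idea, here is a minimal sketch that uses
only the JDK's java.nio classes, not Lucene's own API. The class name
MmapSketch and the helper readViaMmap are hypothetical, made up for this
example. It maps a file read-only and copies its bytes out; the mapping
itself lives outside the Java heap, and only the copied array is on the heap.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapSketch {

    // Hypothetical helper: reads a whole file through a read-only memory mapping.
    // The mapped region is managed by the OS, not by the garbage collector.
    static byte[] readViaMmap(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // Copy into a heap array only so the caller gets a plain byte[].
            byte[] out = new byte[mapped.remaining()];
            mapped.get(out);
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mmap-sketch", ".bin");
        file.toFile().deleteOnExit();
        // The data is on disk, so it survives a JVM crash, unlike a pure in-heap copy.
        Files.write(file, "hello index".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(readViaMmap(file), StandardCharsets.UTF_8));
    }
}
```

A single mapping like this is limited to files under 2 GB, so this is only a
picture of the OS feature, not of how a real directory implementation handles
large index files.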
On Thu, Apr 2, 2015 at 12:54 PM, Christoph Kaser
wrote:
> Hi Gimantha,
>
> why
Hi Christoph and Shai,
Thanks for the quick response!
Indices are stored in a relational database (using a custom Directory
implementation). The problem arises because the indices are sharded (both
taxonomy indices and normal doc indices): when a user wants to drill down, I
have to merge all the in
Btw, I was using a RAMDirectory just for testing purposes.
On Thu, Apr 2, 2015 at 5:16 PM, Gimantha Bandara wrote:
> Hi Christoph and Shai,
>
> Thanks for the quick response!.
> Indices are stored in a relational database ( using a custom Directory
> implementation ). The Problem comes since the
MMapDirectory uses memory-mapped files. This is an operating system level
feature, where even though the file resides on disk, the OS can memory-map
it and access it more efficiently. It is loaded into memory outside the JVM
heap, and usually on a properly configured server you should not worry
abo
Hi Heshan
One approach could be something like this:
1- Vectorize each ngram of each sentence. One vectorization strategy is to
use word2vec (the deep learning package). I believe someone has ported
word2vec (originally in C) to Lucene; a Google search should turn it up.
2- Aggregate each word vector (i.e. some clu
Hi Shai
Currently I am using a DB, but the platform we are developing needs to
support RDBMS, HBase and other datasource types for storing indices.
So the user should be able to use whatever underlying filesystem he
wants to use. I am not sure if Solr can support multiple datasource types
I can't get the suggested way to work (either the child scorer or creating a
query wrapper), so I may end up doing a query on each field; I'm just not sure
how expensive that will end up being...
Additional thoughts?
-Todd
-Original Message-
From: Sanne Grinovero [mailto:sanne.grinov...@gmai