Hi David, thanks for your answer, it really helped a lot! So, you have an
index with more than 2 billion segments. This is pretty much the answer I
was looking for: Lucene alone is able to manage such a big index.
What kind of problems did you have with the parallel searchers? I'm going to
buil
Actually, I've been bitten by a still-unresolved issue with the parallel
searchers and recommend a MultiReader instead.
We have a couple billion docs in our archives as well. Breaking them up by day
worked well for us, but you'll need to do something.
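A minimal sketch of the MultiReader approach recommended above, using the Lucene 3.x-era API that was current when this thread was written (class and path names are illustrative, not from the thread): open one IndexReader per daily shard and search them all through a single MultiReader.

```java
import java.io.File;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.FSDirectory;

public class ShardedSearch {
    // Open one reader per daily shard directory and wrap them in a MultiReader.
    public static IndexSearcher openSearcher(File... shardDirs) throws Exception {
        IndexReader[] readers = new IndexReader[shardDirs.length];
        for (int i = 0; i < shardDirs.length; i++) {
            readers[i] = IndexReader.open(FSDirectory.open(shardDirs[i]));
        }
        // MultiReader presents the shards as one logical index; searches run
        // over the sub-readers in a single thread, sidestepping the issue
        // reported above with the parallel searchers.
        return new IndexSearcher(new MultiReader(readers));
    }
}
```

Note that this needs the Lucene jars on the classpath; on modern Lucene versions the equivalent would be DirectoryReader.open per shard rather than IndexReader.open.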
-Original Message-
From: Luca Rondanini
Thank you both!
Johannes, Katta seems interesting, but I will need to solve the problem of
"hot" updates to the index.
Yonik, I see your point - so your suggestion would be to build an
architecture based on ParallelMultiSearcher?
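If you go with the day-based shards suggested earlier in the thread, one way to tame the "hot" update problem is to route every incoming document to the current day's index, so only that one shard is ever written to and older shards stay read-only. A minimal standard-library sketch (the class and directory-naming scheme are made up for illustration):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class ShardRouter {
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);

    // Map a document timestamp (epoch millis) to the name of the daily shard
    // that should receive it. All writes for a given day land in one index.
    public static String shardFor(long epochMillis) {
        return "idx-" + DAY.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        // 2010-11-21T15:48:00Z, the date on this thread
        System.out.println(shardFor(1290354480000L));
    }
}
```

With this scheme, the searcher side only ever needs to reopen today's shard to pick up new documents; yesterday's and older indexes never change.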
On Sun, Nov 21, 2010 at 3:48 PM, Yonik Seeley wrote:
On Sun, Nov 21, 2010 at 6:33 PM, Luca Rondanini wrote:
> Hi everybody,
>
> I really need some good advice! I need to index in Lucene something like 1.4
> billion documents. I have experience with Lucene, but I've never worked with
> such a big number of documents. Also this is just the number of docs
Hi Luca,
Katta is an open-source project that integrates Lucene with Hadoop
http://katta.sourceforge.net
Johannes
2010/11/21 Luca Rondanini
> Hi everybody,
>
> I really need some good advice! I need to index in Lucene something like 1.4
> billion documents. I have experience with Lucene, but I'
Hi everybody,
I really need some good advice! I need to index in Lucene something like 1.4
billion documents. I have experience with Lucene, but I've never worked with
such a big number of documents. Also, this is just the number of docs at
"start-up": they are going to grow, and fast.
I don't have to
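A note on scale that motivates splitting the corpus across indexes in the first place: Lucene identifies documents within a single index by a Java int, so one index cannot hold more than Integer.MAX_VALUE (about 2.1 billion) documents. Starting at 1.4 billion and "growing fast" leaves limited headroom:

```java
public class Headroom {
    public static void main(String[] args) {
        long docs = 1_400_000_000L;             // corpus size at "start-up"
        long perIndexLimit = Integer.MAX_VALUE; // Lucene doc IDs are Java ints
        // Remaining capacity before a single index runs out of doc IDs.
        System.out.println(perIndexLimit - docs);
    }
}
```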