Cassandra Database using too much space
Hi Jack,
Thanks for replying.
What I meant by 1.5M words is not 1.5M distinct words; it is the count of
all words we added to the corpus (total word instances). In the word_frequency
and word_ordered_frequency CFs, we have a row for each distinct word
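
As a minimal illustration of the difference between total word instances and distinct words (a sketch only; `corpus_counts` and the sample tokens are hypothetical and not taken from the actual corpus-loading code):

```python
from collections import Counter


def corpus_counts(tokens):
    """Return (total word instances, number of distinct words, per-word frequency)."""
    freq = Counter(tokens)
    total_instances = sum(freq.values())  # every occurrence counts
    distinct_words = len(freq)            # one entry per distinct word
    return total_instances, distinct_words, freq


# Hypothetical sample text standing in for the corpus.
tokens = "the cat sat on the mat the end".split()
total, distinct, freq = corpus_counts(tokens)
print(total, distinct, freq["the"])
```

Here the "1.5M words" figure corresponds to `total`, while the number of rows in a per-word frequency table would correspond to `distinct`.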
ach row once after the full corpus has
> been read?
>
> Also, what is the corpus size – total word instances, both for the full
> corpus and for the subset containing your 1.5 million words?
>
> -- Jack Krupansky
>
> *From:* Chamila Wijayarathna
> *Sent:* Sunday, Dece
instances, both for the full corpus
and for the subset containing your 1.5 million words?
-- Jack Krupansky
From: Chamila Wijayarathna
Sent: Sunday, December 14, 2014 7:01 AM
To: user@cassandra.apache.org
Subject: Cassandra Database using too much space
Hello all,
We are trying to develop a
Hi Ryan,
Thank you very much. This helps a lot.
On Sun, Dec 14, 2014 at 9:14 PM, Ryan Svihla wrote:
>
> Well your data model looks fine at a glance, a lot of tables, but they
> appear to be mapping to logically obvious query paths. This denormalization
> will make your queries fast but eat up mo
Well, your data model looks fine at a glance: a lot of tables, but they
appear to map to logically obvious query paths. This denormalization
will make your queries fast but eat up more disk, and if disk is really a
pain point, I'd suggest looking at your economics a bit, and look at your
trade
Hello all,
We are trying to develop a language corpus by using Cassandra as its
storage medium.
https://gist.github.com/cdwijayarathna/7550176443ad2229fae0 shows the types
of information we need to extract from corpus interface.
So we designed the schema at
https://gist.github.com/cdwijayarathna/6491