Hi Ram.
1) As an Operations DBA, I consider all versions of Cassandra to be alpha.
So whether you pick 2.0.10 or 2.1.0 doesn't really matter since you
will have to do your own acceptance testing.
2) Data modelling is everything when it comes to a distributed database
like Cassandra. You can read
Thanks DuyHai,
I think the trouble with building a bloom filter on all row keys & column
names is memory usage. However, if a CF has only hundreds of columns per
row, the total number of columns will be much smaller, so a bloom filter
would be feasible in that case, right? Is there a good way to adjust bloom
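For what it's worth, the memory/accuracy trade-off of the row-key bloom filter is tunable per table via the `bloom_filter_fp_chance` property; a minimal CQL sketch, with `my_ks.my_cf` as hypothetical names:

```sql
-- Lower values mean a larger, more accurate filter (more memory);
-- higher values mean a smaller filter but more needless SSTable reads.
ALTER TABLE my_ks.my_cf
  WITH bloom_filter_fp_chance = 0.01;
```

The new target only applies to SSTables written afterwards, so it takes effect as tables are compacted or rewritten (e.g. via `nodetool upgradesstables`).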
On Sun, Sep 14, 2014 at 11:22 AM, Philo Yang wrote:
> After reading some docs, I find that the bloom filter is built on row keys,
> not on column keys. Can anyone tell me the reasoning for not building a
> bloom filter on column keys? Is it a good idea to offer a table property
> option between row
On Sat, Sep 13, 2014 at 3:49 PM, Ram N wrote:
> Is 2.1 a production ready release?
>
https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
> Datastax Java driver - I get too confused with CQL and the underlying
> storage model. I am also not clear on the indexing stru
On Sat, Sep 13, 2014 at 11:48 PM, Paulo Ricardo Motta Gomes <
paulo.mo...@chaordicsystems.com> wrote:
> Apparently Apple is using Cassandra as a massive multi-DC cache, as per
> their announcement during the summit, but probably DSE with in-memory
> enabled option. Would love to hear about similar
Nice catch Rob
On Mon, Sep 15, 2014 at 8:04 PM, Robert Coli wrote:
> On Sun, Sep 14, 2014 at 11:22 AM, Philo Yang wrote:
>
>> After reading some docs, I find that bloom filter is built on row keys,
>> not on column key. Can anyone tell me what is considered for not building
>> bloom filter on c
If you’re indexing and querying on that many columns (dozens, or more than a
handful), consider DSE/Solr, especially if you need to query on multiple
columns in the same query.
-- Jack Krupansky
From: Robert Coli
Sent: Monday, September 15, 2014 11:07 AM
To: user@cassandra.apache.org
Jack,
Using Solr or an external search/indexing service is an option, but it
increases the complexity of managing different systems. I am curious to
understand the impact of having wide rows in a separate CF for inverted
index purposes, which, if I understand correctly, is what Rob's response
suggests: having a se
We are trying to add a new data center in us-east. Servers in each DC are
running inside a VPC. We currently have a cluster in us-west and all servers
are running 2.0.7. The two DCs talk via VPN. listen_address and
broadcast_address are set to private IPs. Our endpoint_snitch is
GossipingPropertyFileSn
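With GossipingPropertyFileSnitch, each node advertises its DC and rack from `cassandra-rackdc.properties`; a minimal sketch for a node in the new DC (the `rack1` name is an assumption):

```
# cassandra-rackdc.properties on a us-east node
dc=us-east
rack=rack1
```

Once the new nodes have joined, keyspaces using NetworkTopologyStrategy need their replication options updated to include us-east, and `nodetool rebuild us-west` on each new node streams the existing data over from the old DC.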
Hello.
http://stackoverflow.com/questions/19969329/why-not-enable-virtual-node-in-an-hadoop-node/19974621#19974621
Based on this stackoverflow question, vnodes affect the number of mappers
Hadoop needs to spawn, which in turn affects performance.
With the spark connector for cassandra would the s
Sorry. Trigger finger on the send.
Would vnodes affect performance for spark in a similar fashion for spark.
On Monday, September 15, 2014, Eric Plowe wrote:
> Hello.
>
>
> http://stackoverflow.com/questions/19969329/why-not-enable-virtual-node-in-an-hadoop-node/19974621#19974621
>
> Based on t
Ram,
The reason secondary indexes are not recommended is that, since they
can't use the partition key, the values have to be fetched from
all nodes. So you get higher latency, and likely timeouts.
The C* solutions are:
a) use a denormalized ("materialized") table
b) use a clustered index if all
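Option (a) typically means a second table whose partition key is the value you want to look up, maintained by the application; a CQL sketch with hypothetical names:

```sql
-- Base table (hypothetical schema)
CREATE TABLE users (
  user_id uuid PRIMARY KEY,
  email   text,
  name    text
);

-- Denormalized lookup table: the indexed value is the partition key,
-- so a query by email reads one partition instead of touching all nodes.
CREATE TABLE users_by_email (
  email   text,
  user_id uuid,
  PRIMARY KEY (email, user_id)
);
```

The application writes to both tables on every update; in the 2.x line there are no server-side materialized views, so keeping the two in sync is the client's job.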
As hadoop* again sorry..
On Monday, September 15, 2014, Eric Plowe wrote:
> Sorry. Trigger finger on the send.
>
> Would vnodes affect performance for spark in a similar fashion for spark.
>
> On Monday, September 15, 2014, Eric Plowe wrote:
>
>> Hello.
>>
>>
>> http://stackoverflow.com/quest
On 09/12/2014 04:34 PM, Michael Shuler wrote:
I'll have 2.0.10 deb/rpm packages in the repos on Monday, barring any
issues.
Just a quick update - I had a few issues with the Windows 2.0.10
release, which finally succeeded a few minutes ago. I'll push to the
repositories tomorrow, so we can
Hi there,
I just encountered an error which left a log '/hs_err_pid3013.log'. So is
there a way to solve this?
#
> # There is insufficient memory for the Java Runtime Environment to
> continue.
> # Native memory allocation (malloc) failed to allocate 12288 bytes for
> committing reserved memory.
On Mon, Sep 15, 2014 at 5:55 PM, Yatong Zhang wrote:
> I just encountered an error which left a log '/hs_err_pid3013.log'. So is
> there a way to solve this?
>
> # There is insufficient memory for the Java Runtime Environment to
>> continue.
>> # Native memory allocation (malloc) failed to alloca
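Since the failure here is a native malloc (not a Java heap OutOfMemoryError), the usual suspects are overall memory pressure or too low an mmap limit, as Cassandra maps many SSTable segments. Beyond checking free RAM and swap with `free -m`, one commonly recommended setting is raising `vm.max_map_count` (an assumption that your distro still has the 65530 default; verify before applying):

```
# /etc/sysctl.conf - commonly recommended for Cassandra on Linux
vm.max_map_count = 1048575
```

Apply with `sysctl -p`; reducing the JVM heap in cassandra-env.sh also leaves more headroom for native allocations.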
On Mon, Sep 15, 2014 at 4:57 PM, Eric Plowe wrote:
> Based on this stackoverflow question, vnodes affect the number of mappers
> Hadoop needs to spawn, which in turn affects performance.
>
> With the spark connector for cassandra would the same situation happen?
> Would vnodes affect performance i
On Mon, Sep 15, 2014 at 1:34 PM, Ram N wrote:
> It would be great to understand the design decision to go with the
> present secondary index implementation when the alternative is better.
> Looking at the JIRAs, it is still confusing to work out the why :)
>
http://mail-archives.apache.org/mod_mbox/incu
It's during startup. I tried to upgrade Cassandra from 2.0.7 to 2.0.10,
but it looks like Cassandra could not start again. I also found the
following log in '/var/log/messages':
Sep 16 09:06:59 storage6 kernel: INFO: task java:4971 blocked for more than
> 120 seconds.
> Sep 16 09:06:59 storage6 k
Interesting. The way I understand the Spark connector is that it's
basically a client executing a CQL query and filling a Spark RDD. Spark
will then handle the partitioning of data. Again, this is my understanding,
and it may be incorrect.
On Monday, September 15, 2014, Robert Coli wrote:
> On Mo
I understand that Cassandra uses ParNew GC for the new generation and CMS
for the old generation (tenured). I'm trying to work out from the logs when
a full GC happens and what kind of full GC is used. The logs never say
"Full GC" or anything like that.
But I see that whenever there's a line like
2014-09-15T18:04:1
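If the default log lines are ambiguous, detailed HotSpot GC logging makes full collections explicit; a sketch of flags to add in `cassandra-env.sh` (the log path is an assumption):

```
# cassandra-env.sh - standard HotSpot GC logging flags
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
```

With these on, stop-the-world full collections print as "Full GC" (typically after a promotion failure or concurrent mode failure), while healthy CMS cycles show up as CMS-initial-mark / CMS-remark phases.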
Look into the source code of the Spark connector. CassandraRDD tries to
find all token ranges (even when using vnodes) for each node (endpoint) and
creates RDD partitions to match this distribution of token ranges. Thus
data locality is guaranteed.
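To see that behavior from the API side, a minimal Scala sketch against the spark-cassandra-connector (the contact point, keyspace, and table names are hypothetical, and this requires a running Spark + Cassandra setup):

```scala
import com.datastax.spark.connector._ // adds cassandraTable to SparkContext
import org.apache.spark.{SparkConf, SparkContext}

object LocalityDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("locality-demo")
      .set("spark.cassandra.connection.host", "10.0.0.1") // assumption: any contact point

    val sc = new SparkContext(conf)

    // cassandraTable builds a CassandraRDD; the connector groups token
    // ranges (vnode-aware) into RDD partitions and reports each partition's
    // preferred location as a replica node, which is where locality comes from.
    val rdd = sc.cassandraTable("my_ks", "my_table")
    println(rdd.partitions.length) // roughly one partition per token-range group
    sc.stop()
  }
}
```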
On Tue, Sep 16, 2014 at 4:39 AM, Eric Plowe wrote:
>