I'm using the Cassandra Java driver to access a small Cassandra cluster:
* The cluster has 3 nodes in DC1 and 3 nodes in DC2
* The keyspace was originally created in DC1 only with RF=2
* The client had good read latency, about 40 ms at the 99th percentile under 100
requests/sec (measured at the client side)
*
Never mind. I found the root cause. This has nothing to do with Cassandra
and repair. Some web services called by the client caused the problem.
On Fri, Aug 19, 2016 at 11:53 AM, Benyi Wang wrote:
> I'm using cassandra java driver to access a small cassandra cluster
>
> * The
* I have a keyspace with RF=2;
* The client reads the table using LOCAL_ONE;
* A batch job loads data into the tables using ALL.
I want to change RF to 3 and have both the client and the batch job use
LOCAL_QUORUM.
My question is "Will the client still read the correct data when the repair
i
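The RF change being discussed could be expressed roughly like this (a sketch; the keyspace name and the exact replication options are assumptions, since the original message does not show them):

```sql
-- Hypothetical keyspace name; adjust the DC name and RF to the real topology.
ALTER KEYSPACE my_keyspace
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};

-- After raising RF, the new replicas hold no data until repair runs:
--   nodetool repair my_keyspace
```

Until that repair completes, the third replica of each partition is empty, which is exactly why the question about reading correct data during repair matters.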
LL in your case (RF=2) if you have just one DC, so you can
> change the batch CL later.
>
> Cheers,
> Hannu
>
> > On 8 Sep 2016, at 14:42, Benyi Wang wrote:
> >
> > * I have a keyspace with RF=2;
> > * The client read the table using LOCAL_ONE;
> > * There
ote:
> Yep, you can fix it by running repair or even faster by changing the
> consistency level to local_quorum and deploying the new version of the app.
>
> Hannu
>
> On 8 Sep 2016, at 17:51, Benyi Wang wrote:
>
> Thanks Hannu,
>
> Unfortunately, we started chang
afe to do.
>
> When the repair has run you can start with the plan I suggested and run
> repairs afterwards.
>
> Hannu
>
> On 8 Sep 2016, at 18:01, Benyi Wang wrote:
>
> Thanks. What about this situation:
>
> * Change RF 2 => 3
> * Start repair
> * Roll back RF 3
I need to batch load a lot of data every day into a keyspace spanning two DCs;
one DC is on the west coast and the other is on the east coast.
I assume that the network delay between the two DCs will
cause a lot of dropped mutation messages if I write too fast in the local DC
using LOCAL_QUORUM.
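The reasoning above can be checked with simple quorum arithmetic. A minimal sketch (pure math, no driver involved; the RF values are illustrative assumptions):

```python
def quorum(rf):
    # Cassandra's quorum: floor(rf / 2) + 1 replicas must acknowledge.
    return rf // 2 + 1

# With NetworkTopologyStrategy and RF=3 in the local DC, LOCAL_QUORUM
# needs only 2 local acks, so no cross-country round trip is required.
local_rf = 3
print(quorum(local_rf))  # 2

# A plain QUORUM over both DCs (6 replicas total) needs 4 acks,
# which can force the coordinator to wait on remote-DC responses.
total_rf = 6
print(quorum(total_rf))  # 4
```

This is why LOCAL_QUORUM keeps write latency bounded by the local DC even when replicas also exist remotely.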
I have a small cluster with 3 nodes and installed Cassandra 2.1.2 from the
DataStax YUM repository. I know 2.1.2 is not recommended for production.
The problem I observed is:
- When I use vnodes with num_tokens=256, the read latency is about 20 ms
at the 50th percentile.
- If I disable vnodes, the re
I have a Cassandra cluster that provides data to a web service, and there is a
daily batch load writing data into the cluster.
- Without the batch loading, the service's 99th-percentile latency is
3 ms, but during the load it jumps to 90 ms.
- I checked cassandra keyspace’s ReadLatency.99thPerce
- Write to Cassandra.
I know using CQLBulkOutputFormat would be better, but it doesn't support
DELETE.
On Thu, Sep 24, 2015 at 1:27 PM, Gerard Maas wrote:
> How are you loading the data? I mean, what insert method are you using?
>
> On Thu, Sep 24, 2015 at 9:58 PM, Benyi Wang wrote:
mething like foreach partition).
>
> Also you can easily tune up and down the size of those tasks and therefore
> batches to minimize harm on the prod system.
>
> On Sep 24, 2015, at 5:37 PM, Benyi Wang wrote:
>
> I use Spark and spark-cassandra-connector with a customized Ca
Is there a page explaining what happens on the server side when using
SSTableLoader?
I'm seeking answers to the following questions:
1. What about the existing data in the table? From my test, the data
in the SSTable files is merged with the existing data. Am I right?
- The new data
We have one ring and two virtual data centers in our Cassandra cluster: one
is for real-time and the other is for analytics. My questions are:
1. Are there memtables in the analytics data center? To my understanding,
there are.
2. Is it possible to flush the memtables, if they exist, in the Analytics Data
CQLSSTableWriter only accepts INSERT or UPDATE statements. I'm wondering
whether it could be made to accept DELETE statements.
I need to update my Cassandra table with a lot of data every day.
* I may need to delete a row (given the partition key)
* I may need to delete some columns. For example, there are 2
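The two kinds of deletes described above look roughly like this in CQL (a sketch; the table and column names are hypothetical, since the message is truncated):

```sql
-- Delete a whole row given its partition key:
DELETE FROM my_table WHERE part_key = 'abc';

-- Delete only specific columns of a row:
DELETE col1, col2 FROM my_table WHERE part_key = 'abc';
```

Both forms produce tombstones rather than removing data in place, which matters for the repair and read-latency questions elsewhere in these threads.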
I set up two virtual data centers, one for analytics and one for a REST
service. The analytics data center sits on top of the Hadoop cluster. I want to
bulk load my ETL results into the analytics data center so that the REST
service won't take the heavy load. I'm using CQLTableInputFormat in my
Spark Applic
ia thrift
> or
> > cql writes between data centers go out as 1 copy, then that node will
> > forward on to the other replicas. This means intra data center traffic in
> > this case would be 3x more with the bulk loader than with using a
> > traditional cql or thrift based c
On Fri, Jan 9, 2015 at 3:55 PM, Robert Coli wrote:
> On Fri, Jan 9, 2015 at 11:38 AM, Benyi Wang wrote:
>
>>
>>- Is it possible to modify SSTableLoader to allow it access one data
>>center?
>>
>> Even if you only write to nodes in DC A, if you replica
In C* 2.1.2, is there a way you can delete without specifying the row key?
create table a_table (
    guid text,
    key1 text,
    key2 text,
    data int,
    primary key (guid, key1, key2)
);
delete from a_table where key1='' and key2='';
I'm trying to avoid doing something like this:
* query the table to get guids (32 b
Create table tomb_test (
    guid text,
    content text,
    range text,
    rank int,
    id text,
    cnt int,
    primary key (guid, content, range, rank)
)
Sometimes I delete rows with the Cassandra Java driver using this query:
DELETE FROM tomb_test WHERE guid=? and content=? and range=?
in Batch s
w many nodes do you have in the cluster and what is the replication
> factor for the keyspace?
>
> On Mon, Mar 30, 2015 at 7:41 PM, Benyi Wang wrote:
>
>> Create table tomb_test (
>>guid text,
>>content text,
>>range text,
>>rank int,
>
dra are you running? Are you by any chance running
> repairs on your data?
>
> On Mon, Mar 30, 2015 at 5:39 PM, Benyi Wang wrote:
>
>> Thanks for replying.
>>
>> In cqlsh, if I change to QUORUM (CONSISTENCY QUORUM), sometimes the select
>> returns the deleted r
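The deleted-data-reappearing symptom can be reasoned about with the overlap rule: a read is only guaranteed to see the latest write (or tombstone) when the read and write replica counts overlap, i.e. writes + reads > RF. A small illustrative check (the counts are assumptions matching the RF=2 setup discussed earlier):

```python
def is_strongly_consistent(write_replicas, read_replicas, rf):
    # The read is guaranteed to hit at least one replica that has the
    # latest write/tombstone only when the two replica sets must overlap.
    return write_replicas + read_replicas > rf

rf = 2
# Writes at ALL (2 of 2) plus reads at LOCAL_ONE (1): overlap guaranteed.
print(is_strongly_consistent(2, 1, rf))  # True

# If a delete only reached 1 replica and RF was raised to 3, a QUORUM
# read (2 replicas) can miss the tombstone entirely:
print(is_strongly_consistent(1, 2, 3))  # False
```

When the overlap condition fails, only repair (or read repair) brings the tombstone to the other replicas, which is why deleted rows can resurface until repair completes.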
r understand what's going on (especially if
> you have an example of the wrong data being returned) is to do a
> sstable2json on all your tables and simply grep for an example key.
>
> On Mon, Mar 30, 2015 at 4:39 PM, Benyi Wang wrote:
>
>> Thanks for replying.
>>
>>
I have read the documentation several times, but I'm still not quite sure how
to run repair and compaction.
To my understanding,
- I need to run compaction on each node;
- To repair a table (column family), I only need to run repair on any one of
the nodes.
Am I right?
Thanks.
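The two operations above are typically invoked like this (a sketch of the nodetool commands; the keyspace and table names are placeholders):

```shell
# Major compaction is node-local: run it on each node, one at a time.
nodetool compact my_keyspace my_table

# Repair coordinates across replicas; with -pr ("primary range") run on
# every node in turn, each token range is repaired exactly once.
nodetool repair -pr my_keyspace my_table
```

Note that a repair run on a single node only covers the ranges that node replicates, which is the nuance behind the -pr discussion in the reply below.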
ausing a
validation compaction.*
Does it sound like -pr runs on one node?
I still don't understand "the first range returned by the partitioner for
a node".
On Mon, Apr 13, 2015 at 1:40 PM, Robert Coli wrote:
> On Mon, Apr 13, 2015 at 1:36 PM, Benyi Wang wrote:
>
>
I ran "nodetool repair -- keyspace table" for a table, and it is still
running after 4 days. I know there is an issue with repair and vnodes
(https://issues.apache.org/jira/browse/CASSANDRA-5220). My question is: how
can I kill this sequential repair?
I killed the process which I ran the repair comma
It didn't work. I ran the command on all nodes, but I still can see the
repair activities.
On Wed, Apr 15, 2015 at 3:20 PM, Sebastian Estevez <
sebastian.este...@datastax.com> wrote:
> nodetool stop *VALIDATION*
> On Apr 15, 2015 5:16 PM, "Benyi Wang" wrote:
Using JMX worked. Thanks a lot.
On Wed, Apr 15, 2015 at 3:57 PM, Robert Coli wrote:
> On Wed, Apr 15, 2015 at 3:30 PM, Benyi Wang wrote:
>
>> It didn't work. I ran the command on all nodes, but I still can see the
>> repair activities.
>>
>
> Your input