You can fill in var-args with SomeColumns(myColumns:_*) style syntax.
See for example:
https://stackoverflow.com/questions/1832061/scala-pass-seq-to-var-args-functions
On Fri, Mar 30, 2018 at 6:15 PM Guillermo Ortiz
wrote:
>
> The problem is that I have a method from Spark to insert into Cassand
What is the value returned from Charset.defaultCharset() on both systems?
On Wed, May 16, 2018 at 5:00 AM rami dabbah wrote:
> Hi,
>
> I am trying to query a text field from Cassandra using the Java driver; see code
> below. In Windows it is working fine but in Linux I am getting ??
> instead of Ch
I believe that key estimates won't immediately respond to expired TTLs, not
until after compaction has completely dropped the records (which will
involve subtle logic related to gc_grace and partitions with data in
multiple SSTables).
On Wed, May 23, 2018 at 6:18 AM Rahul Singh
wrote:
> If the TT
I know this is the official recommendation, and has been for a while. I
highly recommend testing it for yourself though, as our own testing has
shown that for _most_ versions of Cassandra (not all), unlogged batch
meaningfully outperforms parallel execution of individual statements,
especially at
We are engaging in both strategies at the same time:
1) We call it functional sharding - we write to clusters targeted according
to the type of data being written. Because different data types often have
different workloads this has the nice side effect of being able to tune
each cluster accordin
RE #2, some have had good results from having coordinator-only nodes:
https://www.slideshare.net/DataStax/optimizing-your-cluster-with-coordinator-nodes-eric-lubow-simplereach-cassandra-summit-2016
Assuming finite resources, it might be better to be certain you have good
token awareness in your ap
Depending on the use case, creating separate prepared statements for each
combination of set / unset values in large INSERT/UPDATE statements may be
prohibitive.
Instead, you can look into driver level support for UNSET values. Requires
Cassandra 2.2 or later IIRC.
See:
Java Driver:
https://docs
This will depend on what driver you're using at the client. The Java
driver, for example, has ways to configure each of the things you
mentioned, with a variety of implementations you can choose from. There
are also ways to provide your own custom implementation if you don't like
the options avai
It sounds like you're trying to build a queue in Cassandra, which is one of
the classic anti-pattern use cases for Cassandra.
You may be able to do something clever with triggers, but I highly
recommend you look at purpose-built queuing software such as Kafka to solve
this instead.
On Tue, May 24
Large numbers of tables are generally recommended against. Each table has a
fixed on-heap memory overhead, and by your description it sounds like you
might have as many as 12,000 total tables when you start running into
trouble.
With such a small heap to begin with, you've probably used up most of
I'm not familiar with Titan's usage patterns for Cassandra, but I wonder if
this is because of the consistency level it's querying Cassandra at - i.e.
if CL isn't LOCAL_[something], then this might just be lots of little
checksums required to satisfy consistency requirements.
On Mon, May 23, 2016
the version. Given that it is a single node cluster for the time
> being, would your remarks apply to that particular setup?
>
>
> Thanks again!
> Ralf
>
>
> On 24.05.2016, at 19:18, Eric Stevens wrote:
>
> I'm not familiar with Titan's usage patterns for Ca
think this also is an anti-pattern.
>
> Regards,
> Aaditya
>
> On Tue, May 24, 2016 at 12:45 PM, Mark Reddy
> wrote:
>
>> +1 to what Eric said, a queue is a classic C* anti-pattern. Something
>> like Kafka or RabbitMQ might fit your use case better.
>>
>>
use except for very
small databases.
On Tue, May 24, 2016 at 11:54 AM Justin Lin
wrote:
> so i guess i have to 1) increase the heap size or 2) reduce the number of
> keyspaces/column families.
>
> Thanks for you confirmation.
>
> On Tue, May 24, 2016 at 10:08 AM, Eric Stevens
If you aren't removing elements from the map, you should instead be able to
use an UPDATE statement and append the map. It will have the same effect as
overwriting it, because all the new keys will take precedence over the
existing keys. But it'll happen without generating a tombstone first.
If yo
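A rough sketch of the difference, assuming a hypothetical table with a
map<text, text> column:
-- append: no tombstone is generated
UPDATE user_profiles SET attributes = attributes + {'city': 'Denver'} WHERE user_id = 42;
-- overwrite: a range tombstone is written ahead of the new map
UPDATE user_profiles SET attributes = {'city': 'Denver'} WHERE user_id = 42;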
Those are rough guidelines, actual effective node size is going to depend
on your read/write workload and the compaction strategy you choose. The
biggest reason data density per node usually needs to be limited is due to
data grooming overhead introduced by compaction. Data at rest essentially
be
?
>
> On 27 May 2016 at 15:20, Eric Stevens wrote:
>
>> If you aren't removing elements from the map, you should instead be able
>> to use an UPDATE statement and append the map. It will have the same effect
>> as overwriting it, because all the new keys will take prec
sert on a row with a map will always
> create tombstones :-(
>
>
>
> 2016-06-02 2:02 GMT+02:00 Eric Stevens :
>
>> From that perspective, you could also use a frozen collection which takes
>> away the ability to append, but for which overwrites shouldn't gener
This is better kept to the User groups.
What are your JVM memory settings for Cassandra, and have you seen big GC's
in your logs?
The reason I ask is because that's a large number of column families, which
produces memory pressure, and at first blush that strikes me as a likely
cause.
On Wed, Ju
As a side note, if you're inserting records quickly enough that you're
potentially doing multiple in the same millisecond, it seems likely to me
that your partition size is going to be too large at a day level unless
your writes are super bursty: ((appkey, pub_date), pub_timestamp). You
might need
There's an effort to improve the docs, but while that's catching up, 3.0
has the latest version of the document you're looking for:
https://cassandra.apache.org/doc/cql3/CQL-3.0.html#createKeyspaceStmt
On Wed, Jun 15, 2016 at 5:28 AM Steve Anderson
wrote:
> Couple of Cqlsh questions:
>
> 1) Why
re is a fixed memory cost per CF). Reconsider
your data model, usually this many column families suggests dynamically
creating CF's (eg to solve multi-tenancy). If your CF count will grow
steadily over time at any appreciable rate, that's an anti-pattern.
On Thu, Jun 16, 2016 at 2:40 AM V
If a given partition only ever contains one set of those columns, it
probably makes no practical difference, though it suggests an unintuitive
data model, so you might break it up just because it no longer seems to
make sense to keep them together.
If you really don't ever overlap your columns dur
Those ciphers are not available on Java 6, on the off chance that you're
trying to run Cassandra on that (you'll run into other troubles).
The more likely problem is that I think those ciphers are only available if
you install the Unlimited Strength JCE policy files in your JVM on each
node. Doub
Tombstones will not get removed even after gc_grace if bloom filters
indicate that there is overlapping data with the tombstone's partition in a
different sstable. This is because compaction can't be certain that the
tombstone doesn't overlap data in that other SSTable. If you're writing to
one end
though.
>
> This still sounds very weird to me but I am glad you solved your issue
> (temporary at least).
>
> C*heers,
> ---
> Alain Rodriguez - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thela
eading at all the whole partition (let's limit the example to a
> single SSTable) ?
>
> On Fri, Jul 29, 2016 at 7:00 PM, Eric Stevens wrote:
>
>> > Sai was describing a timeout, not a failure due to the 100 K tombstone
>> limit from cassandra.yaml. But I still might b
say nothing about iterating all cells in a single partition
> if having a partition tombstone, I need to dig further
>
>
>
>
> On Sat, Jul 30, 2016 at 2:03 AM, Eric Stevens wrote:
>
>> I haven't tested that specifically, but I haven't bumped into any
>> p
When you say merge cells, do you mean re-aggregating the data into coarser
time buckets?
On Thu, Aug 4, 2016 at 5:59 AM Michael Burman wrote:
> Hi,
>
> Considering the following example structure:
>
> CREATE TABLE data (
> metric text,
> value double,
> time timestamp,
> PRIMARY KEY((metric), ti
These sound like driver-side questions that might be better addressed to
your specific driver's mailing list. But from the terminology I'd guess
you're using a DataStax driver, possibly the Java one.
If so, you can look at WhiteListPolicy if you want to target specific
node(s). However aside fro
to Java driver.
> I used DCAwareRoundRobin, TokenAware Policy for application flow.
> Would ask question1 on driver mailing list, If someone could help with
> question 2.
>
>
>
>
>
>
>
>
>
>
> On Fri, Sep 2, 2016 at 6:59 PM, Eric Stevens wrote:
>
>> Thes
I might be inclined to include a generation ID in the partition keys. Keep
a separate table where you upgrade the generation ID when your processing
is complete. You can even use CAS operations in case you goofed up and
generated two generations at the same time (or your processing time exceeds
y
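A sketch of that bookkeeping table and the CAS guard, with hypothetical names:
-- bookkeeping table holding the current generation per dataset
CREATE TABLE current_generation (
    dataset text PRIMARY KEY,
    generation int
);
-- only one of two concurrent rebuilds can win this conditional update
UPDATE current_generation SET generation = 8 WHERE dataset = 'events' IF generation = 7;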
It's important to note that this answer differs quite significantly
depending on whether you're talking about Cassandra < 3.0 or >= 3.0
DataStax has a good article on < 3.0:
http://docs.datastax.com/en/cassandra/2.0/cassandra/architecture/architecturePlanningUserData_t.html
The Last Pickle has a g
Using keyspaces to support multi-tenancy is very close to an anti-pattern
unless there is a finite and reasonable upper bound on how many tenants
you'll support overall. Large numbers of tables come with cluster overhead
and operational complexity you will come to regret eventually.
> and because
What is your replication factor in this DC?
On Fri, Sep 30, 2016 at 8:08 AM techpyaasa . wrote:
> Hi ,
>
> We have c*-2.0.17 with 3 data centers . Each data center has 9 nodes. vnodes
> enabled in all nodes.
>
> When I ran -local repair(./nodetool -local repair keyspace_name1
> columnfamily_1
It sounds like you're trying to avoid the latency of waiting for a write
confirmation to a remote data center?
App ==> DC1 ==high-latency==> DC2
If you need the write to be confirmed before you consider the write
successful in your application (definitely recommended unless you're ok
with losing
You would have to perform a SELECT on the row in the trigger code in order
to determine if there was underlying data. Cassandra is in essence an
append-only data store; when an INSERT or UPDATE is executed, it has no
idea if there is already a row underlying it, and for write performance
reasons i
If you happen to be using Scala, we recently released some tooling we wrote
around using CCM for integration testing:
https://github.com/protectwise/cassandra-util
You define clusters and nodes in configuration, then ask the service to go:
https://github.com/protectwise/cassandra-util/blob/master/
You're able to set the timestamp of the write in the client application.
If you have a table which is especially sensitive to out of order writes
and want to deal with the repeated second correctly, you could do slewing
at your client application layer and be explicit with the timestamp for
those s
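In plain CQL the write timestamp (microseconds since the epoch) can be set
explicitly like this; the table and values below are made up, and drivers
generally expose an equivalent per-statement option:
INSERT INTO sensor_readings (id, value) VALUES (42, 7.5) USING TIMESTAMP 1469469554000000;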
You probably want to look at change data capture rather than triggers:
http://cassandra.apache.org/doc/latest/operating/cdc.html
Be aware that one of your criteria regarding operation order is going to be
very difficult to guarantee due to eventual consistency.
On Fri, Dec 16, 2016, 2:43 AM Matij
The purpose of timestamps is to guarantee out-of-order conflicting writes
are resolved as last-write-wins. Cassandra doesn't really expect you to be
writing timestamps with wide variations from record to record. Indeed, if
you're doing this, it'll violate some of the assumptions in places such as
This question is probably better suited for the user@ group.
It doesn't sound to me like you've uncovered a bug, but rather you're
engaging in highly contentious paxos, which is rarely going to have a
favorable outcome. Likely you're overwhelming your cluster (or more
specifically the replicas fo
> We’ve actually had several customers where we’ve done the opposite -
split large clusters apart to separate uses cases
We do something similar but for a single application. We're functionally
sharding data to different clusters from a single application. We can have
different server classes fo
Those future tombstones are going to continue to cause problems on those
partitions. If you're still writing to those partitions, you might be
losing data in the mean time. It's going to be hard to get the tombstone
out of the way so that new writes can begin to happen there (newly written
data w
> I side-tracked some punctual benchmarks and stumbled on the observations
of unlogged inserts being *A LOT* faster than the async counterparts.
My own testing agrees very strongly with this. When this topic came up on
this list before, there was a concern that batch coordination produces GC
pres
tra hops, if you eliminate the
> extra hops by token awareness then it just comes down to write size
> optimization.
>
> On Sep 24, 2015, at 5:18 PM, Eric Stevens wrote:
>
> > I side-tracked some punctual benchmarks and stumbled on the
> observations of unlogged inserts be
r your cluster is the factor to focus on and however you
> get there is fantastic. more replies inline:
>
> On Sep 25, 2015, at 1:24 PM, Eric Stevens wrote:
>
> > compaction usually is the limiter for most clusters, so the difference
> between async versus unlogged batch e
Since you have most of your reads hitting 5-8 SSTables, it's probably
related to that increasing your latency. That makes this look like your
write workload is either overwrite-heavy or append-heavy. Data for a
single partition key is being written to repeatedly over long time periods,
and this w
Can you give us an example of the duplicate records that come back? How
reliable is it (i.e. is it every record, is it one record per read, etc)?
By any chance is it just the `data` field that duplicates while the other
fields change per row?
> I don’t see duplicates in cqlsh.
I've never seen t
Basically your client just needs a route to talk to the IP being broadcast
by each node. We do plenty in EC2 and we use the instance private IP in
the broadcast address. If you are doing multi-datacenter in EC2 it gets a
little hairier, where you need to use the public IP (but not necessarily
ela
If you're at 1 node (N=1) and RF=1 now, and you want to go N=3 RF=3, you
ought to be able to increase RF to 3 before bootstrapping your new nodes,
with no downtime and no loss of data (even temporary). Effective RF is
min-bounded by N, so temporarily having RF > N ought to behave as RF = N.
If yo
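A sketch of the RF bump, assuming a keyspace named my_ks on SimpleStrategy;
run repair once the new nodes have joined:
ALTER KEYSPACE my_ks
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};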
The DataStax Java driver is based on Netty and is non-blocking; if you do
any CQL work you should look into it. At ProtectWise we use it with high
write volumes from Scala/Akka with great success.
We have a thin Scala wrapper around the Java driver that makes it act more
Scalaish (eg, Scala futur
You can't just keep throwing bigger and bigger disks at a cluster that is
accumulating data as it fills up. This is due to data grooming tasks that
increase in cost as your data density per node increases (for example,
compaction), as well as other factors that are impacted by data density
(such as cache
If the columns are not dynamically named (as in "actionId" and "code") you
should be able to add them to your CQL table definition with ALTER TABLE,
and those columns should be available in the query results.
If the columns *are* dynamically named, and you can't reasonably add every
option to the
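For example, assuming a table named events and those two statically named
columns (quoting preserves the camelCase name):
ALTER TABLE events ADD "actionId" text;
ALTER TABLE events ADD code text;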
You probably could, but if I were you, I'd consider a tool built for that
purpose, such as Zookeeper. It'd open up access to a lot of other great
cluster coordination features.
On Thu, Oct 15, 2015 at 8:47 AM Jan Algermissen
wrote:
> Hi,
>
> suppose I have two data centers and want to coordinat
seems a little overkill for just 1 feature though. LOCAL_SERIAL is
> fine if all you want to do is keep a handful of keys up to date.
>
> There’s a massive cost in adding something new to your infrastructure, and
> imo, very little gain in this case.
>
> On Oct 15, 2015, at 8:29 A
It seems to me that as long as cleanup hasn't happened, if you
*decommission* the newly joined nodes, they'll stream whatever writes they
took back to the original replicas. Presumably that should be pretty quick
as they won't have nearly as much data as the original nodes (as they only
hold data
Serial consistency gets invoked at the protocol level when doing
lightweight transactions such as CAS operations. If you're expecting that
your topology is RF=2, N=2, it seems like some keyspace has RF=3, and so
there aren't enough nodes available to satisfy serial consistency.
See
http://docs.da
increased as well; you
>>> may need to increase your internal node-to-node timeouts .
>>>
>>> On Mon, Nov 2, 2015 at 8:01 PM, Ajay Garg
>>> wrote:
>>>
>>>> Hi Eric,
>>>>
>>>> I am sorry, but I don't understand.
In short: Yes, but it's not a good idea.
To do it, you want to look into WhiteListPolicy for your load balancing
policy; if your WhiteListPolicy contains only the same host(s) that you
added as contact points, then the client will only connect to those hosts.
However it's probably not a good idea f
The server is binding to the IPv4 "all addresses" reserved address
(0.0.0.0), but binding it as IPv4 over IPv6 (:::0.0.0.0), which does
not have the same meaning as the IPv6 all addresses reserved IP (being ::,
aka 0:0:0:0:0:0:0:0).
My guess is you have an IPv4 address of 0.0.0.0 in rpc_addres
If you switch reads to CL=LOCAL_ALL, you should be able to increase RF,
then run repair, and after repair is complete, go back to your old
consistency level. However, while you're operating at ALL consistency, you
have no tolerance for a node failure (but at RF=1 you already have no
tolerance for
If you're talking about logged batches, these absolutely have an impact on
performance of about 30%. The whole batch will succeed or fail as a unit,
but throughput will go down and load will go up. Keep in mind that logged
batches are atomic but are not isolated - i.e. it's totally possible to ge
> 512G memory , 128core cpu
This seems dramatically oversized for a Cassandra node. You'd do *much* better
to have a much larger cluster of much smaller nodes.
On Thu, Nov 5, 2015 at 8:25 AM Jack Krupansky
wrote:
> I don't know what current numbers are, but last year the idea of getting 1
> m
Check nodetool status to see if the replacement node is fully joined (UN
status). If it is and it didn't stream any data, then either
auto_bootstrap was false, or the node was in its own seeds list. If you
lost a node, then replace_address as Jonny mentioned would probably be a
good idea.
On Mon
> 3) check the system.schema_columns if these column_name(s) exist in the
table
> 4) If the column don't exist in the table "ALTER table tablename add new
column_name text"
Unless you have some external control on this so that you know two
processors will never attempt the same operation within a
Inconsistent reads are most often the result of inconsistent data between
nodes. Inconsistent data during tests like this is quite often the result
of having loaded data fast enough that you dropped mutations (writing even
at quorum means that you could still be dropping data on some nodes and not
Generally speaking (both for Cassandra as well as for many other projects),
timestamps don't carry a timezone directly. A single point in time has a
consistent value for timestamp regardless of the timezone, and when you
convert a timestamp to a human-friendly value, you can attach a timezone to
s
It seems like this exact problem pops up every few weeks on this list. I
think the documentation does a dangerously bad job of describing the
limitations of CREATE TABLE...IF NOT EXISTS.
CREATE TABLE...IF NOT EXISTS is a dangerous construct because it seems to
advertise atomicity and isolation, n
There's still a race condition there, because two clients could SELECT at
the same time as each other, then both INSERT.
You'd be better served with a CAS operation, and let Paxos guarantee
at-most-once execution.
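A minimal sketch of the CAS form (table and values are hypothetical):
-- at most one of the racing clients sees [applied] = true
INSERT INTO jobs (job_id, owner) VALUES (123, 'worker-a') IF NOT EXISTS;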
On Tue, Jan 26, 2016 at 9:06 AM Francisco Reyes wrote:
> On 01/22/2016 10:29 PM,
It's definitely not true for every use case of a large number of tables,
but for many uses where you'd be tempted to do that, adding whatever would
have driven your table naming instead as a column in your partition key on
a smaller number of tables will meet your needs. This is especially true
if
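As a sketch of that approach (schema purely illustrative), the tenant becomes
part of the partition key instead of the table name:
CREATE TABLE documents (
    tenant_id uuid,
    doc_id timeuuid,
    body text,
    PRIMARY KEY ((tenant_id), doc_id)
);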
We have been working on filtering compaction for a month or so (though we
call it deleting compaction; it is implemented as a filtering
compaction strategy). The feature is nearing completion, and we have used
it successfully in a limited production capacity against DSE 4.8 series.
Our use ca
In addition to writes of null values acting as tombstones, INSERTing a
collection (or an UPDATE where you set the collection rather than appending
to it) is also an operation that will create tombstones.
On Wed, Mar 23, 2016 at 12:09 PM Robert Coli wrote:
> On Wed, Mar 23, 2016 at 9:50 AM, Ralf Ste
> Local quorum works in the same data center as the coordinator node,
but when an app server execute the write query, how is the coordinator
node chosen?
It typically depends on the driver, and decent drivers offer you several
options for this, usually called load balancing strategy. You indicate
IIRC in DSE 4.6 using vnodes is basically always a bad idea in your Solr
datacenter. The overhead was more than you could reasonably want to pay
unless your vnode count was low enough that you lost all the advantage.
Around 4.7 there were significant performance improvements for vnodes in
DSE Sol
if so, how?) or manually specified
> somewhere?
> * Whether local_xxx consistencies always fail when a partition is not
> replicated in the local DC, as specified in its replication strategy.
>
> Perhaps I should ask the node.js client authors about this.
>
>
> On Monday, March
Append-only workloads are a good candidate for Date Tiered or, better, Time
Windowed compaction. Effectively, depending on how you set it up, data in
older SSTables will eventually come to rest and never be compacted again.
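A sketch of switching an existing table to TWCS (available in newer Cassandra
releases; the table name and window settings are illustrative):
ALTER TABLE metrics WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': 1
};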
On Fri, Apr 8, 2016 at 7:42 AM Robert Wille wrote:
> You still need compac
Maybe include nodetool status here? Are the four nodes serving reads in
one DC (local to your driver's config) while the others are in another?
On Tue, Apr 12, 2016, 1:01 AM Anishek Agarwal wrote:
> hello,
>
> we have 8 nodes in one cluster and attached is the traffic patterns across
> the node
wrote:
> We have two DC one with the above 8 nodes and other with 3 nodes.
>
>
>
> On Tue, Apr 12, 2016 at 8:06 PM, Eric Stevens wrote:
>
>> Maybe include nodetool status here? Are the four nodes serving reads in
>> one DC (local to your driver's config) while the
> UN 10.124.114.98 165.34 GB 256 37.6%
> cdc69c7d-b9d6-4abd-9388-1cdcd35d946c RAC1
>
> UN 10.124.114.113 145.22 GB 256 35.7%
> 1557af04-e658-4751-b984-8e0cdc41376e RAC1
>
> UN 10.125.138.59 162.65 GB 256 38.6%
> 9ba1b7b6-5655-456e-b1a1-6f429750fc96
>> host ip: 10.125.138.59
>>
>> host DC : WDC
>>
>> distance of host: LOCAL
>>
>> host is up: true
>>
>> cassandra version : 2.0.17
>>
>> host ip: 10.124.114.97
>>
>> host DC : WDC
>>
>> distance of host: LOCAL
Assuming an even distribution of data in your cluster, and an even
distribution across those keys by your readers, you would not need to
increase RF with cluster size to increase read performance. If you have 3
nodes with RF=3, and do 3 million reads, with good distribution, each node
has served 1
23, 2017 at 11:43 PM Alain Rastoul
wrote:
On 24/03/2017 01:00, Eric Stevens wrote:
> Assuming an even distribution of data in your cluster, and an even
> distribution across those keys by your readers, you would not need to
> increase RF with cluster size to increase read performance. If yo
Jim's basic model is similar to how we've solved this exact kind of problem
many times. From my own experience, I strongly recommend that you make a
`bucket` field in the partition key, and a `time` field in the clustering
key. Make both of these of data type `timestamp`. Then use application
logic
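Something along these lines, with made-up names; the application computes
bucket by truncating time to the chosen interval:
CREATE TABLE events_by_bucket (
    series text,
    bucket timestamp,
    time timestamp,
    payload text,
    PRIMARY KEY ((series, bucket), time)
) WITH CLUSTERING ORDER BY (time DESC);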
Just curious if you've looked at materialized views. Something like:
CREATE MATERIALIZED VIEW users_by_mod_date AS
SELECT dept_id,mod_date,user_id,user_name FROM users
WHERE mod_date IS NOT NULL
PRIMARY KEY (dept_id,mod_date,user_id)
WITH CLUSTERING ORDER BY (mod_date
> If using the SimpleStrategy replication class, it appears that
> replication_factor is the only option, which applies to the entire
> cluster, so only one node in both datacenters would have the data.
This runs counter to my understanding, or else I'm not reading your
statement correctly. When
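For multi-datacenter keyspaces, per-DC replication is usually expressed with
NetworkTopologyStrategy; a sketch with placeholder datacenter names:
CREATE KEYSPACE my_ks WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'dc1': 3,
    'dc2': 3
};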
We've seen an unusually high instance failure rate with i3's (underlying
hardware degradation). Especially with the nodes that have been around
longer (recently provisioned nodes have a more typical failure rate). I
wonder if your underlying hardware is degraded and EC2 just hasn't noticed
yet.
You should be able to fairly efficiently iterate all the partition keys
like:
select id, token(id) from table where token(id) >= -9204925292781066255
limit 1000;
id | system.token(id)
+--
...
The original timestamp is bigger than the timestamp you're using in your
batch. Cassandra uses timestamps for conflict resolution, so the batch
write will lose.
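A tiny illustration with a hypothetical table; the write carrying the larger
timestamp wins regardless of arrival order:
INSERT INTO kv (id, val) VALUES (1, 'newer') USING TIMESTAMP 1505327000000000;
INSERT INTO kv (id, val) VALUES (1, 'older') USING TIMESTAMP 1505326000000000;
-- a read of id = 1 returns 'newer', even though 'older' arrived second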
On Wed, Sep 13, 2017 at 11:59 AM Deepak Panda
wrote:
> Hi All,
>
> Am in the process of learning batch operations. Here is what I trie
To hop on what Jon said, if your concern is automatic application of schema
migrations, you want to be very careful with this. I'd consider it an
unsolved problem in Cassandra for some methods of schema application.
The failed ALTER is not what you have to worry about; it's two successful
ALTERs
Hi,
Just in case you haven't seen it, I gave a talk last year at the summit. In
the first part of the talk I speak for a while about the lifecycle of a
tombstone, and how they don't always get cleaned up when you might expect.
https://youtu.be/BhGkSnBZgJA
It looks like you're deleting cluster ke
Someone can correct me if I'm wrong, but I believe if you do a large IN()
on a single partition's cluster keys, all the reads are going to be served
from a single replica. Compared to many concurrent individual equal
statements you can get the performance gain of leaning on several replicas
for pa
We've been commissioning some new nodes on a 2.0.10 community edition
cluster, and we're seeing streams that look like they're shipping way more
data than they ought for individual files during bootstrap.
/var/lib/cassandra/data/x/y/x-y-jb-11748-Data.db
3756423/3715409 bytes(101%)
7878> which is fixed
> in 2.0.11 / 2.1.1
>
>
> Mark
>
> On 1 November 2014 14:08, Eric Stevens wrote:
>
>> We've been commissioning some new nodes on a 2.0.10 community edition
>> cluster, and we're seeing streams that look like they're shipping
If this is just for doing tests to make sure you get back the data you
expect, I would recommend looking at some sort of eventually construct in your
testing. We use Specs2 as our testing framework, and our write-then-read
tests look something like this:
someDAO.write(someObject)
eventually {
s
> They do not use RAID10 on the node, and they don't use dual power either,
because it's not cheap in a cluster of many nodes
I think the point here is that money spent on traditional failure avoidance
models is better spent in a Cassandra cluster by instead having more nodes
of less expensive hardware
Wouldn't it be a better idea to issue removenode on the crashed node, wipe
the whole data directory (including system) and let it bootstrap cleanly so
that it's not part of the cluster while it gets back up to speed?
On Tue, Nov 11, 2014, 12:32 PM Robert Coli wrote:
> On Tue, Nov 11, 2014 at 10:
You may be able to do something with conditional updates, however trying to
use Cassandra for this kind of coordination smells to me a lot like typical
antipatterns (eg write then read or read then write). You probably would
do better if you need one writer to consistently win a race condition ove
I'm not aware of a way to query TTL or writetime on collections from CQL
yet. You can access this information from Thrift though.
On Sat Nov 15 2014 at 12:51:55 AM DuyHai Doan wrote:
> Why don't you use map to store write time as value and data as key?
> Le 15 nov. 2014 00:24, "Kevin Burton" a
> load average on DC1 nodes are around 3-5 and on DC2 around 7-10
Anecdotally I can say that loads in the 7-10 range have been dangerously
high. When we had a cluster running in this range, the cluster was falling
behind on important tasks such as compaction, and we really struggled to
successful