Re: Is replication possible with already existing data?

2015-10-09 Thread anuja jain
Hi Ajay,


On Fri, Oct 9, 2015 at 9:00 AM, Ajay Garg  wrote:

> On Thu, Oct 8, 2015 at 9:47 AM, Ajay Garg  wrote:
> > Thanks Eric for the reply.
> >
> >
> > On Thu, Oct 8, 2015 at 1:44 AM, Eric Stevens  wrote:
> >> If you're at 1 node (N=1) and RF=1 now, and you want to go N=3 RF=3, you
> >> ought to be able to increase RF to 3 before bootstrapping your new nodes,
> >> with no downtime and no loss of data (even temporary).  Effective RF is
> >> min-bounded by N, so temporarily having RF > N ought to behave as RF = N.
> >>
> >> If you're starting at N > RF and you want to increase RF, things get
> >> hairier if you can't afford temporary consistency issues.
> >>
> >
> > We are ok with temporary consistency issues.
> >
> > Also, I was going through the following articles
> >
> https://10kloc.wordpress.com/2012/12/27/cassandra-chapter-5-data-replication-strategies/
> >
> > and following doubts came up in my mind ::
> >
> >
> > a)
> > Let's say at site-1, Application-Server (APP1) uses the two
> > Cassandra-instances (CAS11 and CAS12), and APP1 generally uses CAS11 for
> > all its needs (of course, whatever happens on CAS11, the same is
> > replicated to CAS12 at Cassandra-level).
> >
> > Now, if CAS11 goes down, will it be the responsibility of APP1 to
> > "detect" this and pick up CAS12 for its needs?
> > Or will some automatic Cassandra-magic happen?
> >
>
In this case, it will be the responsibility of APP1 to open a connection to
CAS12. On the other hand, if APP1 connects to Cassandra using the Java
driver, you can add multiple contact points (CAS11 and CAS12 here), so that
if CAS11 is down the driver will connect to CAS12 directly.
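
A minimal sketch of what that looks like with the Java driver (hostnames
here are placeholders):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class FailoverExample {
    public static void main(String[] args) {
        // List both nodes as contact points; if CAS11 is unreachable, the
        // driver reaches CAS12 through its load-balancing policy instead.
        Cluster cluster = Cluster.builder()
                .addContactPoints("cas11.example.com", "cas12.example.com")
                .build();
        Session session = cluster.connect();
        System.out.println(session.execute(
                "SELECT release_version FROM system.local").one().getString(0));
        cluster.close();
    }
}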

> > b)
> > In the same above scenario, let's say before CAS11 goes down, the amount
> > of data in both CAS11 and CAS12 was "x".
> >
> > After CAS11 goes down, the data is being put in CAS12 only.
> > After some time, CAS11 comes back up.
> >
> > Now, data in CAS11 is still "x", while data in CAS12 is "y" (obviously
> > "y" > "x").
> >
> > Now, will the additional ("y" - "x") data be automatically
> > put/replicated/whatever back in CAS11 through Cassandra?
> > Or it has to be done manually?
> >
>
In such a case, CAS12 will store hints for the data that belongs on CAS11
(the rows whose tokens lie within the token ranges CAS11 holds), and
whenever CAS11 is up again the hints will be replayed to it, so the two
nodes converge on the same data.


> >
> > If there are easy recommended solutions to the above, I am beginning to
> > think that a 2*2 setup (2 nodes at each of 2 data-centres) will be the
> > ideal one (allowing failure of an entire site, or of a few nodes on the
> > same site).
> >
> > I am sorry for asking such newbie questions, and I will be grateful if
> > these silly questions could be answered by the experts :)
> >
> >
> > Thanks and Regards,
> > Ajay
>
>
>
> --
> Regards,
> Ajay
>


compaction with LCS

2015-10-09 Thread Anishek Agarwal
hello,

on doing cfstats for the column family I see

SSTables in each level: [1, 10, 109/100, 1, 0, 0, 0, 0, 0]

I thought compaction would trigger, since the 3rd-level tables are more than
the expected number,

but on doing compactionstats it shows "n/a" -- any reason why it's not
triggering? Should I be worried?

We have a 5-node cluster running Cassandra version 2.0.15.

thanks
anishek


Re: compaction with LCS

2015-10-09 Thread Anishek Agarwal
Looks like some of the nodes have more sstables in L0, and compaction is
running there, so only a few nodes run compaction at a time, and preference
is given to the lower levels before moving on to the higher ones? So is
compaction cluster-aware, then?


On Fri, Oct 9, 2015 at 5:17 PM, Anishek Agarwal  wrote:

> hello,
>
> on doing cfstats for the column family I see
>
> SSTables in each level: [1, 10, 109/100, 1, 0, 0, 0, 0, 0]
>
> I thought compaction would trigger, since the 3rd-level tables are more
> than the expected number,
>
> but on doing compactionstats it shows "n/a" -- any reason why it's not
> triggering? Should I be worried?
>
> We have a 5-node cluster running Cassandra version 2.0.15.
>
> thanks
> anishek
>


Re: Node won't go away

2015-10-09 Thread Carlos Alonso
So if the idea is to completely remove it, just deleting the corresponding
entry from system.peers should do it.

Some versions of Cassandra have a bug that leaves the entry in the
system.peers table after decommissioning, and the fix is just to delete it.

Here is the link to the JIRA:
https://issues.apache.org/jira/browse/CASSANDRA-6053
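
For example, with the ghost node's address as a placeholder; note that the
system keyspace is node-local, so the delete has to be run against every
node that still lists the stale entry (cqlsh connected to each node works
just as well as this Java driver sketch):

import java.net.InetSocketAddress;
import java.util.Collections;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.RoundRobinPolicy;
import com.datastax.driver.core.policies.WhiteListPolicy;

public class DropStalePeer {
    public static void main(String[] args) {
        // system.peers uses LocalStrategy, so each node keeps its own copy;
        // the white-list policy pins all requests to the one node we target.
        for (String node : new String[]{"10.0.0.1", "10.0.0.2"}) { // placeholders
            Cluster cluster = Cluster.builder()
                    .addContactPoint(node)
                    .withLoadBalancingPolicy(new WhiteListPolicy(
                            new RoundRobinPolicy(),
                            Collections.singletonList(new InetSocketAddress(node, 9042))))
                    .build();
            try (Session session = cluster.connect()) {
                session.execute("DELETE FROM system.peers WHERE peer = '10.0.0.99'");
            }
            cluster.close();
        }
    }
}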

Carlos Alonso | Software Engineer | @calonso 

On 8 October 2015 at 19:24, sai krishnam raju potturi 
wrote:

> the below solution should work.
>
> For each node in the cluster :
>  a : Stop cassandra service on the node.
>  b : manually delete data under $data_directory/system/peers/  directory.
>  c : In cassandra-env.sh file, add the line JVM_OPTS="$JVM_OPTS
> -Dcassandra.load_ring_state=false".
>  d : Restart service on the node.
>  e : delete the added line in cassandra-env.sh  JVM_OPTS="$JVM_OPTS
> -Dcassandra.load_ring_state=false".
>
> thanks
> Sai Potturi
>
>
>
> On Thu, Oct 8, 2015 at 11:27 AM, Robert Wille  wrote:
>
>> We had some problems with a node, so we decided to rebootstrap it. My IT
>> guy screwed up, and when he added -Dcassandra.replace_address to
>> cassandra-env.sh, he forgot the closing quote. The node bootstrapped, and
>> then refused to join the cluster. We shut it down, and then noticed that
>> nodetool status no longer showed that node, and the “Owns” column had
>> increased from ~10% per node to ~11% (we originally had 10 nodes). I don’t
>> know why Cassandra decided to automatically remove the node from the
>> cluster, but it did. We figured it would be best to make sure the node was
>> completely forgotten, and then add it back into the cluster as a new node.
>> Problem is, it won’t completely go away.
>>
>> nodetool status doesn’t list it, but it’s still in system.peers, and
>> OpsCenter still shows it. When I run nodetool removenode, it says that it
>> can’t find the node.
>>
>> How do I completely get rid of it?
>>
>> Thanks in advance
>>
>> Robert
>>
>>
>


Spark and intermediate results

2015-10-09 Thread Marcelo Valle (BLOOMBERG/ LONDON)
Hello, 

I saw this nice link from an event:

http://www.datastax.com/dev/blog/zen-art-spark-maintenance?mkt_tok=3RkMMJWWfF9wsRogvqzIZKXonjHpfsX56%2B8uX6GylMI%2F0ER3fOvrPUfGjI4GTcdmI%2BSLDwEYGJlv6SgFSrXMMblswLgIXBY%3D

I would like to test using Spark to perform some operations on a column family, 
my objective is reading from CF A and writing the output of my M/R job to CF B. 

That said, I've read this from Spark's FAQ (http://spark.apache.org/faq.html):

"Do I need Hadoop to run Spark?
No, but if you run on a cluster, you will need some form of shared file system 
(for example, NFS mounted at the same path on each node). If you have this type 
of filesystem, you can just deploy Spark in standalone mode."

The question I ask is - if I don't want to have an HDFS installation just to run 
Spark on Cassandra, is my only option to have this NFS mounted over network? 
It doesn't seem smart to me to have something like NFS to store Spark files, as 
it would probably affect performance, and at the same time I wouldn't like to 
have an additional HDFS cluster just to run jobs on Cassandra. 
Is there a way of using Cassandra itself as this "some form of shared file 
system"?

-Marcelo


<< ideas don't deserve respect >>

RE: Why can't nodetool status include a hostname?

2015-10-09 Thread SEAN_R_DURITY
I ended up writing some of my own utilities and aliases to make output more 
useful for me (and reduce some typing, too). Resolving host names was a big one 
for me, too. Ip addresses are almost useless. Up time in seconds is useless.

The -r in nodetool is a nice addition, but I like the short host name instead.

hostname:/home/cassuser> cinfo
DSE Version: 4.7.0
Cassandra Ver  : 2.1.5.469
Gossip active  : true
Thrift active  : true
Native Transport active: true
Load   : 2.67 GB
Up since   : Sun Sep 13 00:36:50 EDT 2015
Heap Memory (MB)   : 3645.74 / 7987.25
Off Heap Memory (MB)   : 203.93
Heap Used %: 45.64
Thrift Conns   : 0
CQL Conns  : 12
Topology   : DC1 : RAC1

hostname:/home/cassuser> cstatus
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load     Tokens  Owns  Host ID                               Rack
UN  cplinpys  8.47 GB  256     ?     f498c0f9-0041-404c-979d-d1269c6a2287  RAC1
UN  cplinpyr  2.67 GB  256     ?     397546c2-e229-482e-aa50-de367ab6add8  RAC1
UN  cplinpyt  2.17 GB  256     ?     f61da10c-c2c6-4a5a-8fdc-d2693f2239bc  RAC1

Sean Durity – Lead Cassandra Admin

From: Gene [mailto:gh5...@gmail.com]
Sent: Thursday, October 08, 2015 12:43 PM
To: user@cassandra.apache.org
Subject: Re: Why can't nodetool status include a hostname?

Yeah, -r or --resolve-ip is what you're looking for.

Cassandra's nodetool command is kind of wonky.  Inconsistent across functions 
(e.g. sometimes 'keyspace.columnfamily' other times 'keyspace columnfamily', 
pay attention to the character between the items), doesn't resolve IPs by 
default (while standard linux commands require you to pass something like -n to 
not resolve names), so on and so forth.

When in doubt run nodetool without specifying a command and it'll list all of 
the available options (another example of wonkiness, the 'help' argument is not 
listed in this output)

-Gene

On Thu, Oct 8, 2015 at 7:01 AM, Paulo Motta <pauloricard...@gmail.com> wrote:
Have you tried using the -r or --resolve-ip option?

2015-10-07 19:59 GMT-07:00 Kevin Burton <bur...@spinn3r.com>:
I find it really frustrating that nodetool status doesn't include a hostname

Makes it harder to track down problems.

I realize it PRIMARILY uses the IP but perhaps cassandra.yaml can include an 
optional 'hostname' parameter that can be set by the user.  OR have the box 
itself include the hostname in gossip when it starts up.

I realize that hostname wouldn't be authoritative and that the IP must still be 
shown but we could add another column for the hostname.

--
We’re hiring if you know of any awesome Java Devops or Linux Operations 
Engineers!

Founder/CEO Spinn3r.com
Location: San Francisco, CA
blog: http://burtonator.wordpress.com
… or check out my Google+ profile







Re: Cassandra query degradation with high frequency updated tables.

2015-10-09 Thread Carlos Alonso
Yeah, I was about to suggest the compaction strategy too. Leveled
compaction sounds like a better fit when records are being updated.
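
Switching is a single schema change; a minimal sketch for the table from
this thread (run it from cqlsh, or via an open Java driver Session as
below):

    // Existing data gets re-levelled by background compactions afterwards.
    session.execute("ALTER TABLE myprofile WITH compaction = "
            + "{'class': 'LeveledCompactionStrategy'}");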

Carlos Alonso | Software Engineer | @calonso 

On 8 October 2015 at 22:35, Tyler Hobbs  wrote:

> Upgrade to 2.2.2.  Your sstables are probably not compacting due to
> CASSANDRA-10270 ,
> which was fixed in 2.2.2.
>
> Additionally, you may want to look into using leveled compaction (
> http://www.datastax.com/dev/blog/when-to-use-leveled-compaction).
>
> On Thu, Oct 8, 2015 at 4:27 PM, Nazario Parsacala 
> wrote:
>
>>
>> Hi,
>>
>> so we are developing a system that computes profile of things that it
>> observes. The observation comes in form of events. Each thing that it
>> observe has an id and each thing has a set of subthings in it which has
>> measurement of some kind. Roughly there are about 500 subthings within each
>> thing. We receive events containing measurements of these 500 subthings
>> every 10 seconds or so.
>>
>> So as we receive events, we  read the old profile value, calculate the
>> new profile based on the new value and save it back. We use the following
>> schema to hold the profile.
>>
>> CREATE TABLE myprofile (
>> id text,
>> month text,
>> day text,
>> hour text,
>> subthings text,
>> lastvalue double,
>> count int,
>> stddev double,
>>  PRIMARY KEY ((id, month, day, hour), subthings)
> >> ) WITH CLUSTERING ORDER BY (subthings ASC);
>>
>>
>> This profile will then be use for certain analytics that can use in the
>> context of the ‘thing’ or in the context of specific thing and subthing.
>>
>> A profile can be defined as monthly, daily, hourly. So in case of monthly
>> the month will be set to the current month (i.e. ‘Oct’) and the day and
>> hour will be set to empty ‘’ string.
>>
>>
> >> The problem that we have observed is that over time (actually in just a
> >> matter of hours) we see a huge degradation in query response time for the
> >> monthly profile. At the start it responds in 10-100 ms, and after a couple
> >> of hours it goes to 2000-3000 ms. If you leave it for a couple of days you
> >> will start experiencing read timeouts. The query is basically just:
>>
> >> select * from myprofile where id='1' and month='Oct' and day='' and
> >> hour='';
>>
>> This will have only about 500 rows or so.
>>
>>
> >> I believe that this is caused by the fact that there are multiple updates
> >> done to this specific partition. So what do we think can be done to
> >> resolve this?
>>
> >> BTW, I am using Cassandra 2.2.1. And since this is a test, this is just
> >> running on a single node.
>>
>>
>>
>>
>>
>
>
> --
> Tyler Hobbs
> DataStax 
>


Re: Re : Nodetool Cleanup on multiple nodes in parallel

2015-10-09 Thread sai krishnam raju potturi
thanks Jonathan. I see an advantage in doing it one AZ or rack at a time.

On Thu, Oct 8, 2015 at 6:41 PM, Jonathan Haddad  wrote:

> My hunch is the bigger your cluster the less impact it will have, as each
> node takes part in smaller and smaller % of total queries.  Considering
> that compaction is always happening, I'd wager if you've got a big cluster
> (as you say you do) you'll probably be ok running several cleanups at a
> time.
>
> I'd say start one, see how your perf is impacted (if at all) and go from
> there.
>
> If you're running a proper snitch you could probably do an entire rack /
> AZ at a time.
>
>
> On Thu, Oct 8, 2015 at 3:08 PM sai krishnam raju potturi <
> pskraj...@gmail.com> wrote:
>
>> We plan to do it during non-peak hours when customer traffic is less.
>> That sums up to 10 nodes a day, which is concerning as we have other data
>> centers to be expanded eventually.
>>
>> Since cleanup is similar to compaction, which is CPU intensive and will
>> affect reads if this data center were to serve traffic, is running cleanup
>> in parallel advisable?
>>
>> On Thu, Oct 8, 2015, 17:53 Jonathan Haddad  wrote:
>>
>>> Unless you're close to running out of disk space, what's the harm in it
>>> taking a while?  How big is your DC?  At 45 min per node, you can do 32
>>> nodes a day.  Diverting traffic away from a DC just to run cleanup feels
>>> like overkill to me.
>>>
>>>
>>>
>>> On Thu, Oct 8, 2015 at 2:39 PM sai krishnam raju potturi <
>>> pskraj...@gmail.com> wrote:
>>>
 hi;
our cassandra cluster currently uses DSE 4.6. The underlying
 cassandra version is 2.0.14.

 We are planning on adding multiple nodes to one of our datacenters.
 This requires "nodetool cleanup". The "nodetool cleanup" operation
 takes around 45 mins for each node.

 Datastax documentation recommends running "nodetool cleanup" for one
 node at a time. That would be really long, owing to the size of our
 datacenter.

 If we were to divert the read and write traffic away from a particular
 datacenter, could we run "cleanup" on multiple nodes in parallel for
 that datacenter??


 http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html


 thanks
 Sai

>>>


Re: Spark and intermediate results

2015-10-09 Thread Jonathan Haddad
You can run spark against your Cassandra data directly without using a
shared filesystem.

https://github.com/datastax/spark-cassandra-connector
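
As for intermediate results: Spark keeps shuffle and spill data on each
worker's local disk, so a shared filesystem is only needed if you want a
checkpoint or output store other than Cassandra. A minimal
read-transform-write sketch using the connector's Java API (keyspace, table
and bean names are placeholders):

import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapRowTo;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;

import java.io.Serializable;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class CfAToCfB {
    // Simple bean mapped to both column families (columns: id, value).
    public static class Record implements Serializable {
        private String id;
        private double value;
        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
        public double getValue() { return value; }
        public void setValue(double value) { this.value = value; }
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("cf-a-to-cf-b")
                .set("spark.cassandra.connection.host", "127.0.0.1");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Read CF A; intermediate RDDs live in executor memory and spill
        // to each worker's local disk, not to HDFS/NFS.
        JavaRDD<Record> out = javaFunctions(sc)
                .cassandraTable("ks", "cf_a", mapRowTo(Record.class))
                .map(r -> { r.setValue(r.getValue() * 2); return r; });

        // Write the transformed rows straight to CF B.
        javaFunctions(out).writerBuilder("ks", "cf_b", mapToRow(Record.class))
                .saveToCassandra();

        sc.stop();
    }
}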


On Fri, Oct 9, 2015 at 6:09 AM Marcelo Valle (BLOOMBERG/ LONDON) <
mvallemil...@bloomberg.net> wrote:

> Hello,
>
> I saw this nice link from an event:
>
>
> http://www.datastax.com/dev/blog/zen-art-spark-maintenance?mkt_tok=3RkMMJWWfF9wsRogvqzIZKXonjHpfsX56%2B8uX6GylMI%2F0ER3fOvrPUfGjI4GTcdmI%2BSLDwEYGJlv6SgFSrXMMblswLgIXBY%3D
>
> I would like to test using Spark to perform some operations on a column
> family, my objective is reading from CF A and writing the output of my M/R
> job to CF B.
>
> That said, I've read this from Spark's FAQ (
> http://spark.apache.org/faq.html):
>
> "Do I need Hadoop to run Spark?
> No, but if you run on a cluster, you will need some form of shared file
> system (for example, NFS mounted at the same path on each node). If you
> have this type of filesystem, you can just deploy Spark in standalone mode.
> "
>
> The question I ask is - if I don't want to have an HDFS installation just to
> run Spark on Cassandra, is my only option to have this NFS mounted over
> network?
> It doesn't seem smart to me to have something as NFS to store Spark files,
> as it would probably affect performance, and at the same time I wouldn't
> like to have an additional HDFS cluster just to run jobs on Cassandra.
> Is there a way of using Cassandra itself as this "some form of shared
> file system"?
>
> -Marcelo
>
>
> << ideas don't deserve respect >>
>


Re: Spark and intermediate results

2015-10-09 Thread Marcelo Valle (BLOOMBERG/ LONDON)
I know the connector, but having the connector only means it will take *input* 
data from Cassandra, right? What about intermediate results?
If it stores intermediate results in Cassandra, could you please clarify how 
data locality is handled? Will it store them in another keyspace? 
I could not find any doc about it...

From: user@cassandra.apache.org 
Subject: Re: Spark and intermediate results

You can run spark against your Cassandra data directly without using a shared 
filesystem. 

https://github.com/datastax/spark-cassandra-connector


On Fri, Oct 9, 2015 at 6:09 AM Marcelo Valle (BLOOMBERG/ LONDON) 
 wrote:

Hello, 

I saw this nice link from an event:

http://www.datastax.com/dev/blog/zen-art-spark-maintenance?mkt_tok=3RkMMJWWfF9wsRogvqzIZKXonjHpfsX56%2B8uX6GylMI%2F0ER3fOvrPUfGjI4GTcdmI%2BSLDwEYGJlv6SgFSrXMMblswLgIXBY%3D

I would like to test using Spark to perform some operations on a column family, 
my objective is reading from CF A and writing the output of my M/R job to CF B. 

That said, I've read this from Spark's FAQ (http://spark.apache.org/faq.html):

"Do I need Hadoop to run Spark?
No, but if you run on a cluster, you will need some form of shared file system 
(for example, NFS mounted at the same path on each node). If you have this type 
of filesystem, you can just deploy Spark in standalone mode."

The question I ask is - if I don't want to have an HDFS installation just to run 
Spark on Cassandra, is my only option to have this NFS mounted over network? 
It doesn't seem smart to me to have something as NFS to store Spark files, as 
it would probably affect performance, and at the same time I wouldn't like to 
have an additional HDFS cluster just to run jobs on Cassandra. 
Is there a way of using Cassandra itself as this "some form of shared file 
system"?

-Marcelo


<< ideas don't deserve respect >>


<< ideas don't deserve respect >>

SSTableWriter error: incorrect row data size

2015-10-09 Thread Eiti Kimura
Hello Guys,

We have a cluster with 6 nodes running Cassandra 1.2.
My keyspace and tables were created using the Thrift cassandra-cli.

Now I just created a new table using cqlsh as follows:

CREATE TABLE idx_conf (
  conf_id int,
  ref_id text,
  subs_key text,
  data text,
  enabled boolean,
  expiration timestamp,
  last_charge timestamp,
  last_charge_att timestamp,
  last_queued timestamp,
  master boolean,
  msg_balance int,
  origin_id int,
  seq_index int,
  status_id int,
  PRIMARY KEY (conf_id, subs_key)
) WITH
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.10 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

And I wrote a program to add data to this table using Datastax java-driver
version 2.1.5.

So I created my cluster with the right protocol version for Cassandra 1.2:
cluster = Cluster.builder()
        .addContactPoints(addresses)
        .withProtocolVersion(ProtocolVersion.V1)
        .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
        .withReconnectionPolicy(new ExponentialReconnectionPolicy(1000L, 3L))
        .withLoadBalancingPolicy(new DCAwareRoundRobinPolicy())
        .build();

The inserts, updates and queries are working fine, but I saw some weird
exceptions in Cassandra server's log (system.log):

ERROR [CompactionExecutor:6523] 2015-10-09 12:33:23,551 CassandraDaemon.java (line 191) Exception in thread Thread[CompactionExecutor:6523,1,main]
java.lang.AssertionError: incorrect row data size 568009715 written to /movile/cassandra-data/SBSPlatform/idx_config/SBSPlatform-idx_config-tmp-ic-715-Data.db; correct is 568010203
    at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:162)
    at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
    at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
    at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:208)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

Looks like the SSTableWriter was not able to flush data to disk when
writing the new sstable: SBSPlatform-idx_config-tmp-ic-715-Data.db

Can you help me with that?
What are the consequences of these errors?
Thanks

J.P. Eiti Kimura
Plataformas

+55 19 3518  5500
+ 55 19 98232 2792
skype: eitikimura




Re: Cassandra query degradation with high frequency updated tables.

2015-10-09 Thread Nazario Parsacala
So I upgraded to 2.2.2 and changed the compaction strategy from 
DateTieredCompactionStrategy to LeveledCompactionStrategy. But the problem 
still exists.
At the start we were getting responses of around 80 to a couple of hundred ms. 
But after 1.5 hours of running, it is now hitting 1447 ms. I think this will 
degrade some more as time progresses. I will let this run a couple of hours 
more and will also try to force compaction.

BTW, with 2.2.2 I am getting the following exceptions. Not sure if there is 
already a bug report on this.

Caused by: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
    at org.apache.cassandra.io.util.ByteBufferDataInput.seek(ByteBufferDataInput.java:47) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.util.AbstractDataInput.skipBytes(AbstractDataInput.java:33) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:405) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.RowIndexEntry$Serializer.skipPromotedIndex(RowIndexEntry.java:164) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.RowIndexEntry$Serializer.skip(RowIndexEntry.java:155) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:244) ~[apache-cassandra-2.2.2.jar:2.2.2]
    ... 17 common frames omitted
WARN  [SharedPool-Worker-42] 2015-10-09 12:54:57,221 AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-42,5,main]: {}
java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2187) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_60]
    at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.2.2.jar:2.2.2]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
    at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:250) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.sstable.format.SSTableReader.getPosition(SSTableReader.java:1558) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.sstable.format.big.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:75) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:246) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:270) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:2004) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1808) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:360) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1537) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2183) ~[apache-cassandra-2.2.2.jar:2.2.2]
    ... 4 common frames omitted
Caused by: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
    at org.apache.cassandra.io.util.ByteBufferDataInput.seek(ByteBufferDataInput.java:47) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.util.AbstractDataInput.skipBytes(AbstractDataInput.java:33) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.util.FileUtils.s

Re: Cassandra query degradation with high frequency updated tables.

2015-10-09 Thread Tyler Hobbs
That looks like CASSANDRA-10478
(https://issues.apache.org/jira/browse/CASSANDRA-10478), which will
probably result in 2.2.3 being released shortly.  I'm not sure how that
affects performance, but as mentioned in the ticket, you can add
"disk_access_mode: standard" to cassandra.yaml to avoid it.

If you still see performance problems after that, can you try tracing the
query with cqlsh?
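
In cqlsh that's just "TRACING ON;" before the select. Since your app uses
the Java driver, you can also pull the same trace programmatically; a sketch
assuming your existing Session:

import com.datastax.driver.core.QueryTrace;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

// Enable tracing on the slow query and dump the server-side events.
Statement stmt = new SimpleStatement(
        "SELECT * FROM myprofile WHERE id='1' AND month='Oct' "
        + "AND day='' AND hour=''").enableTracing();
ResultSet rs = session.execute(stmt);
QueryTrace trace = rs.getExecutionInfo().getQueryTrace();
System.out.println("total: " + trace.getDurationMicros() + " us");
for (QueryTrace.Event e : trace.getEvents()) {
    System.out.println(e.getSourceElapsedMicros() + " us  " + e.getDescription());
}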

On Fri, Oct 9, 2015 at 12:01 PM, Nazario Parsacala 
wrote:

> So I upgraded to 2.2.2 and changed the compaction strategy from
> DateTieredCompactionStrategy to LeveledCompactionStrategy. But the problem
> still exists.
> At the start we were getting responses around 80 to a couple of hundred of
> ms. But after 1.5 hours of running, it is now hitting 1447 ms. I think this
> will degrade some more as time progresses. I will let this run a couple of
> hours more  and will also try to force compaction.
>
> BTW, with 2.2.2 I am getting the following exceptions. Not sure if there
> is already a bug report on this.
>
> [stack trace snipped]

Re: Cassandra query degradation with high frequency updated tables.

2015-10-09 Thread Nazario Parsacala
Compaction did not help either.



> On Oct 9, 2015, at 1:01 PM, Nazario Parsacala  wrote:
> 
> So I upgraded to 2.2.2 and changed the compaction strategy from 
> DateTieredCompactionStrategy to LeveledCompactionStrategy. But the problem 
> still exists.
> At the start we were getting responses around 80 to a couple of hundred of 
> ms. But after 1.5 hours of running, it is now hitting 1447 ms. I think this 
> will degrade some more as time progresses. I will let this run a couple of 
> hours more  and will also try to force compaction.
> 
> BTW, with 2.2.2 I am getting the following exceptions. Not sure if there is 
> already a bug report on this.
> 
> [stack trace snipped]

Re: Spark and intermediate results

2015-10-09 Thread karthik prasad
Spark's core module uses this connector to read data from Cassandra and
create RDD's or DataFrames in its workspace (In memory/on disc, depending
on the spark configurations). Then transformations or queries are applied
on RDD's or DataFrames respectively. The end results are stored back into
Cassandra using the connector.

Note: If you just want to read/write from Cassandra using spark, you can
try Kundera's Spark-Cassandra Module
.
Kundera exposes the operations in a JPA way and helps in quick development.

-Karthik

On Fri, Oct 9, 2015 at 8:09 PM, Marcelo Valle (BLOOMBERG/ LONDON) <
mvallemil...@bloomberg.net> wrote:

> I know the connector, but having the connector only means it will take
> *input* data from Cassandra, right? What about intermediate results?
> If it stores intermediate results on Cassandra, could you please clarify
> how data locality is handled? Will it store in other keyspace?
> I could not find any doc about it...
>
> From: user@cassandra.apache.org
> Subject: Re: Spark and intermediate results
>
> You can run spark against your Cassandra data directly without using a
> shared filesystem.
>
> https://github.com/datastax/spark-cassandra-connector
>
>
> On Fri, Oct 9, 2015 at 6:09 AM Marcelo Valle (BLOOMBERG/ LONDON) <
> mvallemil...@bloomberg.net> wrote:
>
>> Hello,
>>
>> I saw this nice link from an event:
>>
>>
>> http://www.datastax.com/dev/blog/zen-art-spark-maintenance?mkt_tok=3RkMMJWWfF9wsRogvqzIZKXonjHpfsX56%2B8uX6GylMI%2F0ER3fOvrPUfGjI4GTcdmI%2BSLDwEYGJlv6SgFSrXMMblswLgIXBY%3D
>>
>> I would like to test using Spark to perform some operations on a column
>> family, my objective is reading from CF A and writing the output of my M/R
>> job to CF B.
>>
>> That said, I've read this from Spark's FAQ (
>> http://spark.apache.org/faq.html):
>>
>> "Do I need Hadoop to run Spark?
>> No, but if you run on a cluster, you will need some form of shared file
>> system (for example, NFS mounted at the same path on each node). If you
>> have this type of filesystem, you can just deploy Spark in standalone mode.
>> "
>>
>> The question I ask is - if I don't want to have an HDFS installation just
>> to run Spark on Cassandra, is my only option to have this NFS mounted over
>> network?
>> It doesn't seem smart to me to have something as NFS to store Spark
>> files, as it would probably affect performance, and at the same time I
>> wouldn't like to have an additional HDFS cluster just to run jobs on
>> Cassandra.
>> Is there a way of using Cassandra itself as this "some form of shared
>> file system"?
>>
>> -Marcelo
>>
>>
>> << ideas don't deserve respect >>
>>
>
>
>
> << ideas don't deserve respect >>
>


CLUSTERING ORDER BY importance with ssd's

2015-10-09 Thread Ricardo Sancho
If I have a table

CREATE TABLE status (
user text,
time timestamp,
status text,
PRIMARY KEY (user, time))
WITH CLUSTERING ORDER BY (time ASC);

adapted from http://www.datastax.com/dev/blog/row-caching-in-cassandra-2-1

This means the oldest timestamps appear at the top of the partition.
If I am selecting a range from the bottom of the partition, does it make
much of a difference (considering I only use SSDs) whether the clustering
order is ASC or DESC?
DESC would mean I would access the top of the partition most of the time,
since the newest dates would be at the top, while the current ASC means I
mostly access the bottom.

Thanks.


Re: Realtime data and (C)AP

2015-10-09 Thread Brice Dutheil
On Fri, Oct 9, 2015 at 2:27 AM, Steve Robenalt 
wrote:

> In general, if you write at QUORUM and read at ONE (or LOCAL variants
> thereof if you have multiple data centers), your apps will work well
> despite the theoretical consistency issues.

Nit-picky comment: if consistency is important, then reading at QUORUM is
important too. If reads are at ONE, a read *may* not see an important
update: with RF=3, a QUORUM write lands on 2 replicas and a read at ONE can
hit the third. QUORUM on both sides (W + R > RF) guarantees the read
overlaps the write. The safest option is QUORUM for both write and read;
from there the consistency can be tuned per business need or feature.
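
With the Java driver the level can be set per statement, so only the reads
that must see the latest write pay the QUORUM price; a sketch (table and
column names are made up, and an open Session is assumed):

import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

// A read that must observe the preceding QUORUM write.
Statement read = new SimpleStatement(
        "SELECT balance FROM accounts WHERE id = 42")
        .setConsistencyLevel(ConsistencyLevel.QUORUM);
session.execute(read);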

— Brice


Re: CLUSTERING ORDER BY importance with ssd's

2015-10-09 Thread Nate McCall
>
>
> If I am selecting a range from the bottom of the partition, does it make
> much of a difference (considering I only use ssd's) if the clustering order
> is ASC or DESC?
>

The only impact is that there is an extra seek to the bottom of the
partition.
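
If most queries want the newest rows, reversing the clustering order keeps
them at the head of the partition so that seek goes away; a sketch of the
reversed table and a newest-first read (the table name suffix and user value
are made up, and an open driver Session is assumed):

// Same table as in the question, but laid out newest-first on disk.
session.execute("CREATE TABLE status_desc ("
        + "  user text,"
        + "  time timestamp,"
        + "  status text,"
        + "  PRIMARY KEY (user, time))"
        + " WITH CLUSTERING ORDER BY (time DESC)");

// The latest 10 updates now come from the start of the partition,
// with no ORDER BY reversal needed.
session.execute("SELECT * FROM status_desc WHERE user = 'sancho' LIMIT 10");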


Re: Cassandra query degradation with high frequency updated tables.

2015-10-09 Thread Nazario Parsacala


So the trace is varying a lot, and does not seem to correlate with the
timings seen from the client. Maybe it is DataStax Java driver related? (Not
likely.) Just check out the results.


Below is the trace I took when, from the client's (Java application)
perspective, the query was returning data in about 1100 ms.



Tracing session: 566477c0-6ebc-11e5-9493-9131aba66d63

 activity                                                                    | timestamp                  | source        | source_elapsed
-----------------------------------------------------------------------------+----------------------------+---------------+----------------
 Execute CQL3 query                                                          | 2015-10-09 15:31:28.700000 | 172.31.17.129 |              0
 Parsing select * from processinfometric_profile where
   profilecontext='GENERIC' and id='1' and month='Oct' and day='' and
   hour='' and minute=''; [SharedPool-Worker-1]                              | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            101
 Preparing statement [SharedPool-Worker-1]                                   | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            334
 Executing single-partition query on processinfometric_profile
   [SharedPool-Worker-3]                                                     | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            692
 Acquiring sstable references [SharedPool-Worker-3]                          | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            713
 Merging memtable tombstones [SharedPool-Worker-3]                           | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            726
 Key cache hit for sstable 209 [SharedPool-Worker-3]                         | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3143
 Seeking to partition beginning in data file [SharedPool-Worker-3]           | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3169
 Key cache hit for sstable 208 [SharedPool-Worker-3]                         | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3691
 Seeking to partition beginning in data file [SharedPool-Worker-3]           | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3713
 Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones
   [SharedPool-Worker-3]                                                     | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3807
 Merging data from memtables and 2 sstables [SharedPool-Worker-3]            | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3818
 Read 462 live and 0 tombstone cells [SharedPool-Worker-3]                   | 2015-10-09 15:31:29.611000 | 172.31.17.129 |         910723
 Request complete                                                            | 2015-10-09 15:31:29.649251 | 172.31.17.129 |         949251



Below is the trace from when the query took around 1400 ms, yet the trace data itself looks faster?



Tracing session: 7c591550-6ebf-11e5-9493-9131aba66d63

 activity

Re: CLUSTERING ORDER BY importance with ssd's

2015-10-09 Thread Ricardo Sancho
This probably depends on the number of rows we have, but should one worry,
performance-wise, about this seek? And above roughly how many rows should we
start to worry about it?

On 9 October 2015 at 21:26, Nate McCall  wrote:

>
>> If I am selecting a range from the bottom of the partition, does it make
>> much of a difference (considering I only use ssd's) if the clustering order
>> is ASC or DESC?
>>
>
> The only impact is that there is an extra seek to the bottom of the
> partition.
>
>
>
>
>


OpsCenter issue with DCE 2.1.9

2015-10-09 Thread Kai Wang
Hi,

OpsCenter/Agent works sporadically for me. I am testing with DCE 2.1.9 on
Win7 x64. I seem to have narrowed it down to the following log messages.

When it works:
 INFO [Initialization] 2015-10-01 08:49:02,016 New JMX connection (
127.0.0.1:7199)
 ERROR [Initialization] 2015-10-01 08:49:02,344 Error connecting via JMX:
java.rmi.ConnectIOException: Exception creating connection to:
169.254.253.126; nested exception is:
java.net.SocketException: Network is unreachable: connect
  INFO [main] 2015-10-01 08:49:02,359 Reconnecting to a backup OpsCenter
instance

When it doesn't work:
  INFO [Initialization] 2015-10-09 16:57:43,008 New JMX connection (
127.0.0.1:7199)
 ERROR [Initialization] 2015-10-09 16:57:43,010 Error connecting via JMX:
java.rmi.ConnectIOException: Exception creating connection to:
169.254.253.126; nested exception is:
java.net.SocketException: Network is unreachable: connect
  INFO [Initialization] 2015-10-09 16:57:43,010 Sleeping for 20s before
trying to determine IP over JMX again

Where is this IP address 169.254.253.126 coming from? And what is a "backup
OpsCenter instance"?

Thanks.


Re: Cassandra query degradation with high frequency updated tables.

2015-10-09 Thread Tyler Hobbs
Hmm, it seems off to me that the merge step is taking 1 to 2 seconds,
especially when there are only ~500 cells from one sstable and the
memtables.  Can you open a ticket (
https://issues.apache.org/jira/browse/CASSANDRA) with your schema, details
on your data layout, and these traces?

On Fri, Oct 9, 2015 at 3:47 PM, Nazario Parsacala 
wrote:

>
>
> So the trace is varying a lot, and does not seem to correlate with the
> timings seen from the client. Maybe it is DataStax Java driver related?
> (Not likely.) Just check out the results.
>
>
> Below is the trace I took when, from the client's (Java application)
> perspective, the query was returning data in about 1100 ms.
>
>
>
> [trace output snipped]

Re: Cassandra query degradation with high frequency updated tables.

2015-10-09 Thread Jonathan Haddad
I'd be curious to see GC logs.

jstat -gccause <pid>

On Fri, Oct 9, 2015 at 2:16 PM Tyler Hobbs  wrote:

> Hmm, it seems off to me that the merge step is taking 1 to 2 seconds,
> especially when there are only ~500 cells from one sstable and the
> memtables.  Can you open a ticket (
> https://issues.apache.org/jira/browse/CASSANDRA) with your schema,
> details on your data layout, and these traces?
>
> On Fri, Oct 9, 2015 at 3:47 PM, Nazario Parsacala 
> wrote:
>
>>
>>
>> So the trace is varying a lot, and does not seem to correlate with the
>> timings seen from the client. Maybe it is DataStax Java driver related?
>> (Not likely.) Just check out the results.
>>
>>
>> Below is the trace I took when, from the client's (Java application)
>> perspective, the query was returning data in about 1100 ms.
>>
>>
>>
>> [trace output snipped]

Re: Cassandra query degradation with high frequency updated tables.

2015-10-09 Thread Nazario Parsacala
I will send the jstat output later.

I have created the ticket:

https://issues.apache.org/jira/browse/CASSANDRA-10502





> On Oct 9, 2015, at 5:20 PM, Jonathan Haddad wrote:
> 
> I'd be curious to see GC logs.
> 
> jstat -gccause <pid>
> 
> On Fri, Oct 9, 2015 at 2:16 PM Tyler Hobbs wrote:
> Hmm, it seems off to me that the merge step is taking 1 to 2 seconds,
> especially when there are only ~500 cells from one sstable and the
> memtables. Can you open a ticket
> (https://issues.apache.org/jira/browse/CASSANDRA) with your schema, details
> on your data layout, and these traces?
> 
> On Fri, Oct 9, 2015 at 3:47 PM, Nazario Parsacala wrote:
> 
> 
> So the trace is varying a lot, and does not seem to correlate with the
> timings seen from the client. Maybe it is DataStax Java driver related?
> (Not likely.) Just check out the results.
> 
> 
> Below is the trace I took when, from the client's (Java application)
> perspective, the query was returning data in about 1100 ms.
> 
> 
> 
> [trace output snipped]

Re: Realtime data and (C)AP

2015-10-09 Thread Steve Robenalt
Hi Brice,

I agree with your nit-picky comment, particularly with respect to the OP's
emphasis, but there are many cases where read at ONE is sufficient and
performance is "better enough" to justify the possibility of a wrong
result. As with anything Cassandra, it's highly dependent on the nature of
the workload.
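For the concrete mechanics of that split, the consistency level can be set per statement with the DataStax Java driver. A minimal sketch, assuming a connected Session; the table and column names are assumptions for illustration:

    import com.datastax.driver.core.*;

    // Sketch: write at QUORUM, read at ONE.
    void writeQuorumReadOne(Session session) {
        Statement write = new SimpleStatement(
                "INSERT INTO metrics (id, value) VALUES ('1', 42)")
                .setConsistencyLevel(ConsistencyLevel.QUORUM);
        session.execute(write);

        Statement read = new SimpleStatement(
                "SELECT value FROM metrics WHERE id = '1'")
                .setConsistencyLevel(ConsistencyLevel.ONE);
        // A read at ONE may miss a write that hasn't reached this
        // replica yet - the tradeoff discussed above.
        Row row = session.execute(read).one();
    }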

Steve


On Fri, Oct 9, 2015 at 12:36 PM, Brice Dutheil 
wrote:

> On Fri, Oct 9, 2015 at 2:27 AM, Steve Robenalt 
> wrote:
>
> In general, if you write at QUORUM and read at ONE (or LOCAL variants
>> thereof if you have multiple data centers), your apps will work well
>> despite the theoretical consistency issues.
>
> Nit-picky comment : if consistency is something important then reading at
> QUORUM is important. If read is ONE then the read operation *may* not see
> important update. The safest option is QUORUM for both write and read. Then
> depending on the business or feature the consistency may be tuned.
>
> — Brice
>
>



-- 
Steve Robenalt
Software Architect
sroben...@highwire.org 
(office/cell): 916-505-1785

HighWire Press, Inc.
425 Broadway St, Redwood City, CA 94063
www.highwire.org

Technology for Scholarly Communication


Post mortem of a large Cassandra datacenter migration.

2015-10-09 Thread Kevin Burton
We just finished up a pretty large migration of about 30 Cassandra boxes to
a new datacenter.

We'll be migrating to about 60 boxes here in the next month so scalability
(and being able to do so cleanly) is important.

We also completed an Elasticsearch migration at the same time.  The ES
migration worked fine. It had a few small problems, doing silly things like
relocating nodes too often, but all in all it was somewhat painless.

At one point we were doing 200 shard reallocations in parallel and pushing
about 2-4Gbit...

The Cassandra migration, however, was a LOT harder.

One quick thing I wanted to point out - we're hiring.  So if you're a
killer Java Devops guy, drop me an email.

Anyway.  Back to the story.

Obviously we did a bunch of research beforehand to make sure we had plenty
of bandwidth.  This was a migration from Washington DC to Germany.

Using iperf, we could consistently push about 2Gbit back and forth between DC
and Germany.  This included TCP, as we had switched to using large window sizes.

The big problem we had was that we could only bootstrap one node at a
time.  This ends up taking a LOT more time because you have to keep checking
on a node so that you can start the next one.

I imagine one could write a coordinator script, but we had so many problems
with Cassandra that it wouldn't have worked if we tried.

We had a handful of main problems.

1.  Sometimes streams would just stop and lock up, with no explanation why.
They would just lock up and not resume.  We'd wait 10-15 minutes with no
response.  This would require us to abort and retry.  Had we updated to
Cassandra 2.2 beforehand, I think the new resume support would have worked.

2.  Some of our keyspaces created by Thrift caused exceptions regarding
"too few resources" when trying to bootstrap. Dropping these keyspaces
fixed the problem.  They were just test keyspaces so it didn't matter.

3.  Because of #1, it's probably better to make sure you have 2x or more
disk space on the remote end before you do the migration.  This way you can
boot the same number of nodes you had before and just decommission the old
ones quickly (er, use nodetool removenode - see below).

4.  We're not sure why, but our OLDER machines kept locking up during this
process.  This kept requiring us to do a rolling restart on all the older
nodes.  We suspect this was GC, as we were seeing single cores pegged at
100%.  I didn't have time to attach a profiler, as we were all burned out at
this point and just wanted to get it over with.  This problem meant that #1
was exacerbated because our old boxes would either refuse to send streams or
refuse to accept them.  It seemed to get better when we upgraded the older
boxes to use Java 8.

5.  Don't use nodetool decommission if you have a large number of nodes.
Instead, use nodetool removenode.  It's MUCH faster and does M-N
replication between nodes directly.  The downside is that you go down to
N-1 replicas during this process. However, it was easily 20-30x faster.
This probably saved me about 5 hours of sleep!

In hindsight, I'm not sure what we would have done differently.  Maybe
bought more boxes.  Maybe upgraded to Cassandra 2.2, and probably Java 8 as
well.

Setting up a datacenter migration might have worked out better too.

Kevin

-- 

We’re hiring if you know of any awesome Java Devops or Linux Operations
Engineers!

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile



Re: Realtime data and (C)AP

2015-10-09 Thread Graham Sanderson
Most of our writes are not user-facing, so local_quorum is good... We also read 
at local_quorum because we prefer guaranteed consistency... But we very quickly 
fall back to local_one in the cases where getting some data fast is better than a 
failure. Currently we do that on a per-read basis, but we could, I suppose, detect 
a pattern or just look at the gossip to decide to go en masse into a degraded 
read mode.
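A minimal sketch of that per-read fallback with the DataStax Java driver; the exceptions caught and the method name are assumptions, not a fixed recipe:

    import com.datastax.driver.core.*;
    import com.datastax.driver.core.exceptions.ReadTimeoutException;
    import com.datastax.driver.core.exceptions.UnavailableException;

    // Try LOCAL_QUORUM first; if the coordinator can't satisfy it,
    // retry the same read at LOCAL_ONE (some data fast beats a failure).
    ResultSet readWithFallback(Session session, Statement stmt) {
        try {
            stmt.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
            return session.execute(stmt);
        } catch (UnavailableException | ReadTimeoutException e) {
            // Degrade this one read to LOCAL_ONE and accept the
            // possibility of stale data.
            stmt.setConsistencyLevel(ConsistencyLevel.LOCAL_ONE);
            return session.execute(stmt);
        }
    }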

Sent from my iPhone

> On Oct 9, 2015, at 5:39 PM, Steve Robenalt  wrote:
> 
> Hi Brice,
> 
> I agree with your nit-picky comment, particularly with respect to the OP's 
> emphasis, but there are many cases where read at ONE is sufficient and 
> performance is "better enough" to justify the possibility of a wrong result. 
> As with anything Cassandra, it's highly dependent on the nature of the 
> workload.
> 
> Steve
> 
> 
>> On Fri, Oct 9, 2015 at 12:36 PM, Brice Dutheil  
>> wrote:
>>> On Fri, Oct 9, 2015 at 2:27 AM, Steve Robenalt  
>>> wrote:
>>> 
>>> In general, if you write at QUORUM and read at ONE (or LOCAL variants 
>>> thereof if you have multiple data centers), your apps will work well 
>>> despite the theoretical consistency issues.
>> 
>> Nit-picky comment : if consistency is something important then reading at 
>> QUORUM is important. If read is ONE then the read operation may not see 
>> important update. The safest option is QUORUM for both write and read. Then 
>> depending on the business or feature the consistency may be tuned.
>> 
>> — Brice
>> 
> 
> 
> 
> -- 
> Steve Robenalt 
> Software Architect
> sroben...@highwire.org 
> (office/cell): 916-505-1785
> 
> HighWire Press, Inc.
> 425 Broadway St, Redwood City, CA 94063
> www.highwire.org
> 
> Technology for Scholarly Communication


Re: Realtime data and (C)AP

2015-10-09 Thread Graham Sanderson
Actually maybe I'll open a JIRA issue for a (local)quorum_or_one consistency 
level... It should be trivial to implement on the server side with existing 
timeouts... I'll need to check the CQL protocol to see if there is a good place 
to indicate you didn't reach quorum (in time).

Sent from my iPhone

> On Oct 9, 2015, at 8:02 PM, Graham Sanderson  wrote:
> 
> Most of our writes are not user facing so local_quorum is good... We also 
> read at local_quorum because we prefer guaranteed consistency... But we very 
> quickly fall back to local_one in the cases where some data fast is better 
> than a failure. Currently we do that on a per read basis but we could I 
> suppose detect a pattern or just look at the gossip to decide to go en masse 
> into a degraded read mode
> 
> Sent from my iPhone
> 
>> On Oct 9, 2015, at 5:39 PM, Steve Robenalt  wrote:
>> 
>> Hi Brice,
>> 
>> I agree with your nit-picky comment, particularly with respect to the OP's 
>> emphasis, but there are many cases where read at ONE is sufficient and 
>> performance is "better enough" to justify the possibility of a wrong result. 
>> As with anything Cassandra, it's highly dependent on the nature of the 
>> workload.
>> 
>> Steve
>> 
>> 
>>> On Fri, Oct 9, 2015 at 12:36 PM, Brice Dutheil  
>>> wrote:
>>>> On Fri, Oct 9, 2015 at 2:27 AM, Steve Robenalt 
>>>> wrote:
>>>> 
>>>> In general, if you write at QUORUM and read at ONE (or LOCAL variants 
>>>> thereof if you have multiple data centers), your apps will work well 
>>>> despite the theoretical consistency issues.
>>> 
>>> Nit-picky comment : if consistency is something important then reading at 
>>> QUORUM is important. If read is ONE then the read operation may not see 
>>> important update. The safest option is QUORUM for both write and read. Then 
>>> depending on the business or feature the consistency may be tuned.
>>> 
>>> — Brice
>>> 
>> 
>> 
>> 
>> -- 
>> Steve Robenalt 
>> Software Architect
>> sroben...@highwire.org 
>> (office/cell): 916-505-1785
>> 
>> HighWire Press, Inc.
>> 425 Broadway St, Redwood City, CA 94063
>> www.highwire.org
>> 
>> Technology for Scholarly Communication


Re: Realtime data and (C)AP

2015-10-09 Thread Steve Robenalt
Hi Graham,

I've used the Java driver's DowngradingConsistencyRetryPolicy for that in
cases where it makes sense.

Ref:
http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/DowngradingConsistencyRetryPolicy.html
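Wiring it in is one call on the Cluster builder; a minimal sketch (the contact point is illustrative):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.policies.DowngradingConsistencyRetryPolicy;

    // Install the downgrading policy cluster-wide: operations that can't
    // reach their requested consistency are retried at a lower level
    // instead of failing outright.
    Cluster cluster = Cluster.builder()
            .addContactPoint("127.0.0.1")
            .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
            .build();

Note that the downgrade then applies to every statement on that Cluster, so it's worth pairing with logging to see how often it actually kicks in.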

Steve



On Fri, Oct 9, 2015 at 6:06 PM, Graham Sanderson  wrote:

> Actually maybe I'll open a JIRA issue for a (local)quorum_or_one
> consistency level... It should be trivial to implement on server side with
> exist timeouts ... I'll need to check the CQL protocol to see if there is a
> good place to indicate you didn't reach quorum (in time)
>
> Sent from my iPhone
>
> On Oct 9, 2015, at 8:02 PM, Graham Sanderson  wrote:
>
> Most of our writes are not user facing so local_quorum is good... We also
> read at local_quorum because we prefer guaranteed consistency... But we
> very quickly fall back to local_one in the cases where some data fast is
> better than a failure. Currently we do that on a per read basis but we
> could I suppose detect a pattern or just look at the gossip to decide to go
> en masse into a degraded read mode
>
> Sent from my iPhone
>
> On Oct 9, 2015, at 5:39 PM, Steve Robenalt  wrote:
>
> Hi Brice,
>
> I agree with your nit-picky comment, particularly with respect to the OP's
> emphasis, but there are many cases where read at ONE is sufficient and
> performance is "better enough" to justify the possibility of a wrong
> result. As with anything Cassandra, it's highly dependent on the nature of
> the workload.
>
> Steve
>
>
> On Fri, Oct 9, 2015 at 12:36 PM, Brice Dutheil 
> wrote:
>
>> On Fri, Oct 9, 2015 at 2:27 AM, Steve Robenalt 
>> wrote:
>>
>> In general, if you write at QUORUM and read at ONE (or LOCAL variants
>>> thereof if you have multiple data centers), your apps will work well
>>> despite the theoretical consistency issues.
>>
>> Nit-picky comment : if consistency is something important then reading at
>> QUORUM is important. If read is ONE then the read operation *may* not
>> see important update. The safest option is QUORUM for both write and read.
>> Then depending on the business or feature the consistency may be tuned.
>>
>> — Brice
>>
>>
>
>
>
> --
> Steve Robenalt
> Software Architect
> sroben...@highwire.org 
> (office/cell): 916-505-1785
>
> HighWire Press, Inc.
> 425 Broadway St, Redwood City, CA 94063
> www.highwire.org
>
> Technology for Scholarly Communication
>
>


-- 
Steve Robenalt
Software Architect
sroben...@highwire.org 
(office/cell): 916-505-1785

HighWire Press, Inc.
425 Broadway St, Redwood City, CA 94063
www.highwire.org

Technology for Scholarly Communication