Strange behavior of timestamp column

2015-10-05 Thread Daniel Stucky
Hi all,

we have a very simple cassandra table that contains just a single row. We have 
a 3-machine cluster using Cassandra 2.1.8, cqlsh 5.0.1.

I do the following:

CREATE TABLE IF NOT EXISTS  scheduler_config (name text, suspended boolean, 
modified_ts timestamp, last_scheduled_start_ts timestamp, 
last_triggered_start_ts timestamp, PRIMARY KEY((name)));
INSERT INTO  scheduler_config (name, last_scheduled_start_ts) 
VALUES('scheduler', '2015-01-01T00:00:00.000Z');

Now I run some select * queries on the table and always receive the value that 
was inserted for last_scheduled_start_ts ('2015-01-01T00:00:00.000Z').
After 30-60 seconds the select * statement suddenly returns the current 
timestamp (e.g. '2015-10-05T09:00:00.000Z') instead of the inserted value.
Subsequent select * requests also return this value ('2015-10-05T09:00:00.000Z'), 
so the timestamp changes once, but not with every request.
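
One way to narrow something like this down is to check when the cell was last 
written; WRITETIME makes a competing writer easy to spot. A minimal sketch 
against the table above:

-- Sketch only: WRITETIME returns the write timestamp (microseconds since epoch)
-- of the cell's last write.
SELECT last_scheduled_start_ts,
       WRITETIME(last_scheduled_start_ts) AS wt_microseconds
FROM scheduler_config
WHERE name = 'scheduler';

If wt_microseconds keeps moving forward even though nothing should be writing, 
some other client is updating the row.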


-  If I insert the initial value again, the same happens.

-  If I insert another value (in the past), the same happens.

-  Dropping and recreating the table does not change anything.

-  Repairing the table also has no effect.


On the other hand

-  If I insert a value in the future (e.g. 2016) the inserted value is 
always returned.

-  If I set the second timestamp column, the inserted value for that 
column is always  returned.

-  If I add a second row, the inserted values for both rows are always 
returned.

This behavior is really strange and seems to come from Cassandra itself (no 
active client besides cqlsh).

As far as we can tell this behavior started on September 30th at 18:00:00 
+0200 (not sure if it was like this from the beginning; at least we did not 
notice the effect before then).
Shortly before that time there was a Cassandra restart, as we had increased 
server-side timeouts.


Anybody have an idea what is causing this problem?

Thanks,
Daniel



Re: Is HEAP_NEWSIZE configuration is no more useful from cassandra 2.1 ?

2015-10-05 Thread Daniel Chia
G1GC still has an Eden size; however, it's strongly recommended *NOT* to
set the new gen size with G1GC and to just let it figure it out based on your
target pause time.
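
As a rough sketch of what that looks like in a 2.1/2.2-era cassandra-env.sh
(the flag values here are illustrative, not a recommendation from this thread):

# Sketch only: swap the default CMS flags (-XX:+UseParNewGC / -XX:+UseConcMarkSweepGC)
# for G1 and give it a pause-time target. Do NOT pin the new-gen size
# (no HEAP_NEWSIZE / -Xmn equivalent); G1 sizes Eden itself to try to meet the target.
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=500"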

Thanks,
Daniel

On Sun, Oct 4, 2015 at 4:11 PM, Tushar Agrawal 
wrote:

> If you are using CMS garbage collector then you still have to set the
> HEAP_NEWSIZE. With G1GC (new recommended GC) there is no concept of New or
> Older generation.
>
> On Sun, Oct 4, 2015 at 5:30 PM, Kiran mk  wrote:
>
>> Is HEAP_NEWSIZE configuration is no more useful from cassandra 2.1 ?
>>
>> Best Regards,
>> Kiran.M.K.
>>
>
>


RE: Consistency Issues

2015-10-05 Thread Walsh, Stephen
It did, but I ran it again on one node – that node never recovered. ☹

From: Robert Coli [mailto:rc...@eventbrite.com]
Sent: 02 October 2015 21:20
To: user@cassandra.apache.org
Subject: Re: Consistency Issues

On Fri, Oct 2, 2015 at 1:32 AM, Walsh, Stephen <stephen.wa...@aspect.com> wrote:
Sorry for the late reply, I ran the nodetool resetlocalschema on all nodes but 
in the end it just removed all the schemas and crashed the applications.
I need to reset and try again. I’ll try get you the gc stats today ☺

FTR, running resetlocalschema on all nodes (especially simultaneously) seems 
likely to nuke all of your schema.

=Rob



Re: Changing schema on multiple nodes while they are isolated

2015-10-05 Thread Stephen Baynes
> Why don’t you simply let the node join the cluster? It will pull new
tables and the data automatically.


Because there is no guarantee the rest of the cluster is up, or even if
there is anything more than a cluster of one at this time. This is a plug
in and go environment where the user does not even know or care about the
details of Cassandra. It is not a managed datacenter.

On 2 October 2015 at 17:16, Jacques-Henri Berthemet <
jacques-henri.berthe...@genesys.com> wrote:

> Why don’t you simply let the node join the cluster? It will pull new
> tables and the data automatically.
>
>
>
> *--*
>
> *Jacques-Henri Berthemet*
>
>
>
> *From:* Stephen Baynes [mailto:stephen.bay...@smoothwall.net]
> *Sent:* Friday 2 October 2015 18:08
> *To:* user@cassandra.apache.org
> *Subject:* Re: Changing schema on multiple nodes while they are isolated
>
>
>
> Hi Jacques-Henri
>
>
>
> You are right - serious trouble. I managed some more testing and it does
> not repair or share any data. In the logs I see lots of:
>
>
>
> WARN  [MessagingService-Incoming-/10.50.16.214] 2015-10-02 16:52:36,810
> IncomingTcpConnection.java:100 - UnknownColumnFamilyException reading from
> socket; closing
>
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find
> cfId=e6828dd0-691a-11e5-8a27-b1780df21c7c
>
>  at
> org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:163)
> ~[apache-cassandra-2.2.1.jar:2.2.1]
>
>  at
> org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:96)
> ~[apache-cassandra-2.2.1.jar:2.2.1]
>
>
>
> and some:
>
>
>
> ERROR [AntiEntropyStage:1] 2015-10-02 16:48:16,546
> RepairMessageVerbHandler.java:164 - Got error, removing parent repair
> session
>
> ERROR [AntiEntropyStage:1] 2015-10-02 16:48:16,548
> CassandraDaemon.java:183 - Exception in thread
> Thread[AntiEntropyStage:1,5,main]
>
> java.lang.RuntimeException: java.lang.NullPointerException
>
>  at
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:167)
> ~[apache-cassandra-2.2.1.jar:2.2.1]
>
>  at
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66)
> ~[apache-cassandra-2.2.1.jar:2.2.1]
>
>
>
>
>
> Will need to do some thinking about this. I wonder about shipping a backup
> of a good system keyspace and restore it on each node before it starts for
> the first time - but will that end up with each node having the same
> internal id?
>
>
>
>
>
>
>
> On 2 October 2015 at 16:27, Jacques-Henri Berthemet <
> jacques-henri.berthe...@genesys.com> wrote:
>
> Hi Stephen,
>
>
>
> If you manage to create tables on each node while node A and B are
> separated, you’ll get into troubles when they will reconnect again. I had
> the case previously and Cassandra complained that tables with same names
> but different ids were present in the keyspace. I don’t know if there is a
> way to fix that with nodetool but I don’t think that it is a good practice.
>
>
>
> To solve this, we have a “schema creator” application node that is
> responsible to change the schema. If this node is down, schema updates are
> not possible. We can make any node ‘creator’, but only one can be enabled
> at any given time.
>
> *--*
>
> *Jacques-Henri Berthemet*
>
>
>
> *From:* Stephen Baynes [mailto:stephen.bay...@smoothwall.net]
> *Sent:* Friday 2 October 2015 16:46
> *To:* user@cassandra.apache.org
> *Subject:* Changing schema on multiple nodes while they are isolated
>
>
>
> Is it safe to make schema changes ( e.g. create keyspace and tables ) on
> multiple separate nodes of a cluster while they are out of communication
> with other nodes in the cluster? For example create on node A while node B
> is down, create on node B while A is down, then bring both up together.
>
>
>
> We are looking to embed Cassandra invisibly in another product and we have
> no control in what order users may start/stop the nodes up or add/remove
> them from clusters. And Cassandra must come up and be working with at least
> local access regardless. So this means always creating keyspaces and tables
> so they are always present. But this means nodes joining clusters which
> already have the same keyspace and table defined. Will it cause any issues?
> I have done some testing and saw some issues when I tried nodetool
> repair to bring things into sync. However at the time I was fighting with
> what I later discovered was CASSANDRA-9689 (keyspace does not show in
> describe list, if create query times out) and did not know
> what was what. I will give it another try sometime, but would appreciate
> knowing if this is going to run into trouble before we find it.
>
>
>
> We are basically using Cassandra to share fairly transient information. We
> can cope with data loss during environment changes and occasional losses at
> other times. But if the environment is stable then it should all just work,
> 

RE: Changing schema on multiple nodes while they are isolated

2015-10-05 Thread Jacques-Henri Berthemet
Then maybe Cassandra is not the right tool for that, or you need a different 
data structure. For example, you could keep a single table where what used to be 
your table name is now part of your partition key. That way any “offline” data 
will be merged when the nodes join again. If there are conflicts, they will be 
resolved on the basis of the row timestamp.
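
A minimal sketch of that kind of layout (all names are illustrative, not from
this thread):

-- One shared table replaces the per-purpose tables; the old table name becomes
-- part of the partition key, so rows written while a node was isolated simply
-- merge back in when it rejoins, with the newest write winning on conflict.
CREATE TABLE IF NOT EXISTS shared_data (
    source_table text,   -- what used to be the table name
    item_key text,
    payload text,
    PRIMARY KEY ((source_table, item_key))
);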

--
Jacques-Henri Berthemet

From: Stephen Baynes [mailto:stephen.bay...@smoothwall.net]
Sent: Monday 5 October 2015 11:00
To: user@cassandra.apache.org
Subject: Re: Changing schema on multiple nodes while they are isolated

> Why don’t you simply let the node join the cluster? It will pull new tables 
> and the data automatically.

Because there is no guarantee the rest of the cluster is up, or even if there 
is anything more than a cluster of one at this time. This is a plug in and go 
environment where the user does not even know or care about the details of 
Cassandra. It is not a managed datacenter.

On 2 October 2015 at 17:16, Jacques-Henri Berthemet 
<jacques-henri.berthe...@genesys.com> wrote:
Why don’t you simply let the node join the cluster? It will pull new tables and 
the data automatically.

--
Jacques-Henri Berthemet

From: Stephen Baynes 
[mailto:stephen.bay...@smoothwall.net]
Sent: Friday 2 October 2015 18:08
To: user@cassandra.apache.org
Subject: Re: Changing schema on multiple nodes while they are isolated

Hi Jacques-Henri

You are right - serious trouble. I managed some more testing and it does not 
repair or share any data. In the logs I see lots of:

WARN  [MessagingService-Incoming-/10.50.16.214] 2015-10-02 
16:52:36,810 IncomingTcpConnection.java:100 - UnknownColumnFamilyException 
reading from socket; closing
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find 
cfId=e6828dd0-691a-11e5-8a27-b1780df21c7c
 at 
org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:163)
 ~[apache-cassandra-2.2.1.jar:2.2.1]
 at 
org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:96)
 ~[apache-cassandra-2.2.1.jar:2.2.1]

and some:

ERROR [AntiEntropyStage:1] 2015-10-02 16:48:16,546 
RepairMessageVerbHandler.java:164 - Got error, removing parent repair session
ERROR [AntiEntropyStage:1] 2015-10-02 16:48:16,548 CassandraDaemon.java:183 - 
Exception in thread Thread[AntiEntropyStage:1,5,main]
java.lang.RuntimeException: java.lang.NullPointerException
 at 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:167)
 ~[apache-cassandra-2.2.1.jar:2.2.1]
 at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) 
~[apache-cassandra-2.2.1.jar:2.2.1]


Will need to do some thinking about this. I wonder about shipping a backup of a 
good system keyspace and restore it on each node before it starts for the first 
time - but will that end up with each node having the same internal id?
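
For reference, one way to check whether nodes have ended up with different
internal ids for the same table is to compare what each node's schema tables
report; a minimal sketch, assuming the pre-3.0 system schema tables
('my_keyspace' is a placeholder):

-- Run with cqlsh against each node in turn and compare the cf_id values.
SELECT keyspace_name, columnfamily_name, cf_id
FROM system.schema_columnfamilies
WHERE keyspace_name = 'my_keyspace';

nodetool describecluster also gives a quicker per-node schema-version summary.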



On 2 October 2015 at 16:27, Jacques-Henri Berthemet 
<jacques-henri.berthe...@genesys.com> wrote:
Hi Stephen,

If you manage to create tables on each node while node A and B are separated, 
you’ll get into troubles when they will reconnect again. I had the case 
previously and Cassandra complained that tables with same names but different 
ids were present in the keyspace. I don’t know if there is a way to fix that 
with nodetool but I don’t think that it is a good practice.

To solve this, we have a “schema creator” application node that is responsible 
to change the schema. If this node is down, schema updates are not possible. We 
can make any node ‘creator’, but only one can be enabled at any given time.
--
Jacques-Henri Berthemet

From: Stephen Baynes 
[mailto:stephen.bay...@smoothwall.net]
Sent: Friday 2 October 2015 16:46
To: user@cassandra.apache.org
Subject: Changing schema on multiple nodes while they are isolated

Is it safe to make schema changes ( e.g. create keyspace and tables ) on 
multiple separate nodes of a cluster while they are out of communication with 
other nodes in the cluster? For example create on node A while node B is down, 
create on node B while A is down, then bring both up together.

We are looking to embed Cassandra invisibly in another product and we have no 
control in what order users may start/stop the nodes up or add/remove them from 
clusters. And Cassandra must come up and be working with at least local access 
regardless. So this means always creating keyspaces and tables so they are 
always present. But this means nodes joining clusters which already have the 
same keyspace and table defined. Will it cause any issues? I have done some 
testing and saw some issues when I tried nodetool repair to bring 
things into sync. However at the time I was fighting with what I later 
discovered was CA

Re: Example of JavaBeanColumnMapper

2015-10-05 Thread Alexandre Dutra
Hi Ashish,

The components you mention are not part of the Java driver but belong to
the Spark Cassandra Connector project, so I guess you
will have more comprehensive answers by asking your question on the
project's dedicated mailing list.

That said, the Java driver *does* provide a Java POJO mapping feature: you
can read more about it in the driver's mapping documentation.

And finally, you might also be interested in Achilles, another
(third-party) mapping framework for Cassandra.

Hope that helps,

Alexandre

On Mon, Oct 5, 2015 at 12:21 AM Ashish Soni  wrote:

> Hi All ,
>
> Are there any Java examples of how to use JavaBeanColumnMapper or
> RowReader and RowWriter Factory.
>
> Any link to example code will be helpful.
>
> Ashish
>
-- 
Alexandre Dutra
Driver & Tools Engineer @ DataStax


Re: Changing schema on multiple nodes while they are isolated

2015-10-05 Thread Stephen Baynes
Clint Martin's idea of having each node create its own keyspace and then
decide which to use depending on the state of the cluster is really
interesting. I am going to explore that in more detail.

Thanks for the good idea.
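
A minimal sketch of the idempotent schema creation that approach leans on
(keyspace, table and replication settings are illustrative only; Clint's quoted
message below covers the private-vs-shared keyspace logic):

-- Safe to run on every startup; IF NOT EXISTS makes it a no-op when already present.
CREATE KEYSPACE IF NOT EXISTS app_shared
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

CREATE TABLE IF NOT EXISTS app_shared.items (
    id text PRIMARY KEY,
    value text
);

-- A node-private copy would live in its own keyspace (e.g. app_private_<nodeid>,
-- name purely illustrative) and be copied into app_shared.items once the rest
-- of the cluster is reachable.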

On 3 October 2015 at 00:03, Clint Martin <
clintlmar...@coolfiretechnologies.com> wrote:

> You could use a two key space method.  At startup, wait some time for the
> node to join the cluster.
>
> the first time the app starts, you can be in one of three states:
>
> The happiest state is that you succeed in joining a cluster.  in this case
> you will get replicated the cluster's keyspace and can start using it as
> normal.
>
> the other two cases are exception cases: either you are the only node ever
> to exist or you are a new node for a cluster that you cannot communicate
> with.  in either of these cases, you create a local private copy of your
> schema in its own/unique keyspace.
>
> your application will use the "private" schema going forward until it
> receives notification about another node joining the cluster. when this
> occurs, your app attempts to create the "REAL" schema/keyspace (making
> liberal use of "if not exists") and (if necessary) migrates the data from
> it's "private" schema into the "real" schema followed by deleting the
> "private" schema.
>
> There are edge cases and likely race conditions inherent to this method
> that you would have to deal with, but it should do what you are describing.
>
> Clint
> Hi Jacques-Henri
>
> You are right - serious trouble. I managed some more testing and it does
> not repair or share any data. In the logs I see lots of:
>
> WARN  [MessagingService-Incoming-/10.50.16.214] 2015-10-02 16:52:36,810
> IncomingTcpConnection.java:100 - UnknownColumnFamilyException reading from
> socket; closing
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find
> cfId=e6828dd0-691a-11e5-8a27-b1780df21c7c
> at
> org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:163)
> ~[apache-cassandra-2.2.1.jar:2.2.1]
> at
> org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:96)
> ~[apache-cassandra-2.2.1.jar:2.2.1]
>
> and some:
>
> ERROR [AntiEntropyStage:1] 2015-10-02 16:48:16,546
> RepairMessageVerbHandler.java:164 - Got error, removing parent repair
> session
> ERROR [AntiEntropyStage:1] 2015-10-02 16:48:16,548
> CassandraDaemon.java:183 - Exception in thread
> Thread[AntiEntropyStage:1,5,main]
> java.lang.RuntimeException: java.lang.NullPointerException
> at
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:167)
> ~[apache-cassandra-2.2.1.jar:2.2.1]
> at
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66)
> ~[apache-cassandra-2.2.1.jar:2.2.1]
>
>
> Will need to do some thinking about this. I wonder about shipping a backup
> of a good system keyspace and restore it on each node before it starts for
> the first time - but will that end up with each node having the same
> internal id?
>
>
>
> On 2 October 2015 at 16:27, Jacques-Henri Berthemet <
> jacques-henri.berthe...@genesys.com> wrote:
>
>> Hi Stephen,
>>
>>
>>
>> If you manage to create tables on each node while node A and B are
>> separated, you’ll get into troubles when they will reconnect again. I had
>> the case previously and Cassandra complained that tables with same names
>> but different ids were present in the keyspace. I don’t know if there is a
>> way to fix that with nodetool but I don’t think that it is a good practice.
>>
>>
>>
>> To solve this, we have a “schema creator” application node that is
>> responsible to change the schema. If this node is down, schema updates are
>> not possible. We can make any node ‘creator’, but only one can be enabled
>> at any given time.
>>
>> *--*
>>
>> *Jacques-Henri Berthemet*
>>
>>
>>
>> *From:* Stephen Baynes [mailto:stephen.bay...@smoothwall.net]
>> *Sent:* Friday 2 October 2015 16:46
>> *To:* user@cassandra.apache.org
>> *Subject:* Changing schema on multiple nodes while they are isolated
>>
>>
>>
>> Is it safe to make schema changes ( e.g. create keyspace and tables ) on
>> multiple separate nodes of a cluster while they are out of communication
>> with other nodes in the cluster? For example create on node A while node B
>> is down, create on node B while A is down, then bring both up together.
>>
>>
>>
>> We are looking to embed Cassandra invisibly in another product and we
>> have no control in what order users may start/stop the nodes up or
>> add/remove them from clusters. And Cassandra must come up and be working
>> with at least local access regardless. So this means always creating
>> keyspaces and tables so they are always present. But this means nodes
>> joining clusters which already have the same keyspace and table defined.
>> Will it cause any issues? I have done some testing and saw some issues
>> when I tried nodetool repair to bring things into sync. However at the
>> 

Re: Changing schema on multiple nodes while they are isolated

2015-10-05 Thread Clint Martin
No problem, glad to help. I'd love to see how it works out for you.

Clint
On Oct 5, 2015 8:12 AM, "Stephen Baynes" 
wrote:

> Clint Martin's idea of each node creating its own keyspace, but then
> deciding which to use depending on the state of the cluster is really
> interesting. I am going to explore that in more detail.
>
> Thanks for the good idea.
>
> On 3 October 2015 at 00:03, Clint Martin <
> clintlmar...@coolfiretechnologies.com> wrote:
>
>> You could use a two key space method.  At startup, wait some time for the
>> node to join the cluster.
>>
>> the first time the app starts, you can be in one of three states:
>>
>> The happiest state is that you succeed in joining a cluster.  in this
>> case you will get replicated the cluster's keyspace and can start using it
>> as normal.
>>
>> the other two cases are exception cases: either you are the only node
>> ever to exist or you are a new node for a cluster that you cannot
>> communicate with.  in either of these cases, you create a local private
>> copy of your schema in its own/unique keyspace.
>>
>> your application will use the "private" schema going forward until it
>> receives notification about another node joining the cluster. when this
>> occurs, your app attempts to create the "REAL" schema/keyspace (making
>> liberal use of "if not exists") and (if necessary) migrates the data from
>> it's "private" schema into the "real" schema followed by deleting the
>> "private" schema.
>>
>> There are edge cases and likely race conditions inherent to this method
>> that you would have to deal with, but it should do what you are describing.
>>
>> Clint
>> Hi Jacques-Henri
>>
>> You are right - serious trouble. I managed some more testing and it does
>> not repair or share any data. In the logs I see lots of:
>>
>> WARN  [MessagingService-Incoming-/10.50.16.214] 2015-10-02 16:52:36,810
>> IncomingTcpConnection.java:100 - UnknownColumnFamilyException reading from
>> socket; closing
>> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find
>> cfId=e6828dd0-691a-11e5-8a27-b1780df21c7c
>> at
>> org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:163)
>> ~[apache-cassandra-2.2.1.jar:2.2.1]
>> at
>> org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:96)
>> ~[apache-cassandra-2.2.1.jar:2.2.1]
>>
>> and some:
>>
>> ERROR [AntiEntropyStage:1] 2015-10-02 16:48:16,546
>> RepairMessageVerbHandler.java:164 - Got error, removing parent repair
>> session
>> ERROR [AntiEntropyStage:1] 2015-10-02 16:48:16,548
>> CassandraDaemon.java:183 - Exception in thread
>> Thread[AntiEntropyStage:1,5,main]
>> java.lang.RuntimeException: java.lang.NullPointerException
>> at
>> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:167)
>> ~[apache-cassandra-2.2.1.jar:2.2.1]
>> at
>> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66)
>> ~[apache-cassandra-2.2.1.jar:2.2.1]
>>
>>
>> Will need to do some thinking about this. I wonder about shipping a backup
>> of a good system keyspace and restore it on each node before it starts for
>> the first time - but will that end up with each node having the same
>> internal id?
>>
>>
>>
>> On 2 October 2015 at 16:27, Jacques-Henri Berthemet <
>> jacques-henri.berthe...@genesys.com> wrote:
>>
>>> Hi Stephen,
>>>
>>>
>>>
>>> If you manage to create tables on each node while node A and B are
>>> separated, you’ll get into troubles when they will reconnect again. I had
>>> the case previously and Cassandra complained that tables with same names
>>> but different ids were present in the keyspace. I don’t know if there is a
>>> way to fix that with nodetool but I don’t think that it is a good practice.
>>>
>>>
>>>
>>> To solve this, we have a “schema creator” application node that is
>>> responsible to change the schema. If this node is down, schema updates are
>>> not possible. We can make any node ‘creator’, but only one can be enabled
>>> at any given time.
>>>
>>> *--*
>>>
>>> *Jacques-Henri Berthemet*
>>>
>>>
>>>
>>> *From:* Stephen Baynes [mailto:stephen.bay...@smoothwall.net]
>>> *Sent:* Friday 2 October 2015 16:46
>>> *To:* user@cassandra.apache.org
>>> *Subject:* Changing schema on multiple nodes while they are isolated
>>>
>>>
>>>
>>> Is it safe to make schema changes ( e.g. create keyspace and tables ) on
>>> multiple separate nodes of a cluster while they are out of communication
>>> with other nodes in the cluster? For example create on node A while node B
>>> is down, create on node B while A is down, then bring both up together.
>>>
>>>
>>>
>>> We are looking to embed Cassandra invisibly in another product and we
>>> have no control in what order users may start/stop the nodes up or
>>> add/remove them from clusters. And Cassandra must come up and be working
>>> with at least local access regardless. So this means always creating
>>> keyspaces and tables so they are always present. But t

AW: Strange behavior of timestamp column

2015-10-05 Thread Daniel Stucky
Please forget about this email:

there was a long-forgotten client running somewhere in our data center that 
caused this problem.

From: Daniel Stucky [mailto:daniel.stu...@empolis.com]
Sent: Monday, 5 October 2015 09:02
To: user@cassandra.apache.org
Subject: Strange behavior of timestamp column

Hi all,

we have a very simple cassandra table that contains just a single row. We have 
a 3-machine cluster using Cassandra 2.1.8, cqlsh 5.0.1.

I do the following:

CREATE TABLE IF NOT EXISTS  scheduler_config (name text, suspended boolean, 
modified_ts timestamp, last_scheduled_start_ts timestamp, 
last_triggered_start_ts timestamp, PRIMARY KEY((name)));
INSERT INTO  scheduler_config (name, last_scheduled_start_ts) 
VALUES('scheduler', '2015-01-01T00:00:00.000Z');

Now I run some select * queries on the table and always receive the value that 
was inserted for last_scheduled_start_ts ('2015-01-01T00:00:00.000Z').
After 30-60 seconds the select * statement suddenly returns the current 
timestamp (e.g. '2015-10-05T09:00:00.000Z') instead of the inserted value.
Subsequent select * requests also return this value ('2015-10-05T09:00:00.000Z'), 
so the timestamp changes once, but not with every request.


-  If I insert the initial value again, the same happens.

-  If I insert another value (in the past), the same happens.

-  Dropping and recreating the table does not change anything.

-  Repairing the table also has no effect.


On the other hand

-  If I insert a value in the future (e.g. 2016) the inserted value is 
always returned.

-  If I set the second timestamp column, the inserted value for that 
column is always  returned.

-  If I add a second row, the inserted values for both rows are always 
returned.

This behavior is really strange and seems to come from Cassandra itself (no 
active client besides cqlsh).

As far as we can tell this behavior started on September 30th at 18:00:00 
+0200 (not sure if it was like this from the beginning; at least we did not 
notice the effect before then).
Shortly before that time there was a Cassandra restart, as we had increased 
server-side timeouts.


Anybody have an idea what is causing this problem?

Thanks,
Daniel



[RELEASE] Apache Cassandra 2.1.10 released

2015-10-05 Thread Jake Luciani
The Cassandra team is pleased to announce the release of Apache Cassandra
version 2.1.10.

Apache Cassandra is a fully distributed database. It is the right choice
when you need scalability and high availability without compromising
performance.

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 http://cassandra.apache.org/download/

This version is a bug fix release[1] on the 2.1 series. As always, please pay
attention to the release notes[2] and let us know[3] if you encounter
any problems.

Enjoy!

[1]: http://goo.gl/KE0tlf (CHANGES.txt)
[2]: http://goo.gl/0CW2iz (NEWS.txt)
[3]: https://issues.apache.org/jira/browse/CASSANDRA


[RELEASE] Apache Cassandra 2.2.2 released

2015-10-05 Thread Jake Luciani
The Cassandra team is pleased to announce the release of Apache Cassandra
version 2.2.2.

Apache Cassandra is a fully distributed database. It is the right choice
when you need scalability and high availability without compromising
performance.

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 http://cassandra.apache.org/download/

This version is a bug fix release[1] on the 2.2 series. As always, please pay
attention to the release notes[2] and let us know[3] if you encounter
any problems.

Enjoy!

[1]: http://goo.gl/d9xIEO (CHANGES.txt)
[2]: http://goo.gl/S64khA (NEWS.txt)
[3]: https://issues.apache.org/jira/browse/CASSANDRA


Re: Cassandra certification

2015-10-05 Thread Michael Shuler

On 10/02/2015 09:11 PM, Renato Perini wrote:

What credibility can a certification with a non-disclosure agreement have?


Many (most?) technical certifications require signing an NDA agreeing 
not to disclose the test material. Cisco and RedHat certification tests 
that I have taken both required signing an NDA. Heck, the SAT college 
entrance exam has a non-disclosure agreement, too.


--
Kind regards,
Michael



Re: Is HEAP_NEWSIZE configuration is no more useful from cassandra 2.1 ?

2015-10-05 Thread Mark Curtis
Just to add some weight to the advice of not setting this, I've also seen
information on Oracle's blogs too:

http://www.oracle.com/technetwork/articles/java/g1gc-1984535.html

Hope that helps

Mark

On 5 October 2015 at 08:59, Daniel Chia  wrote:

> G1GC still has an Eden size; however, it's strongly recommended *NOT* to
> set the new gen size with G1GC and to just let it figure it out based on your
> target pause time.
>
> Thanks,
> Daniel
>
> On Sun, Oct 4, 2015 at 4:11 PM, Tushar Agrawal 
> wrote:
>
>> If you are using CMS garbage collector then you still have to set the
>> HEAP_NEWSIZE. With G1GC (new recommended GC) there is no concept of New or
>> Older generation.
>>
>> On Sun, Oct 4, 2015 at 5:30 PM, Kiran mk  wrote:
>>
>>> Is HEAP_NEWSIZE configuration is no more useful from cassandra 2.1 ?
>>>
>>> Best Regards,
>>> Kiran.M.K.
>>>
>>
>>
>


Re: Cassandra Configuration VS Static IPs.

2015-10-05 Thread Eric Stevens
Basically your client just needs a route to talk to the IP being broadcast
by each node.  We do plenty in EC2 and we use the instance private IP in
the broadcast address.  If you are doing multi-datacenter in EC2 it gets a
little hairier, where you need to use the public IP (but not necessarily
elastic IPs).

> A client is a program connecting to a cassandra instance. All it needs to
know is an IP, a keyspace and a table to operate.

More correctly it's talking to a Cassandra cluster; typically you'll
configure your client with several seed nodes, and the client will use
those to discover the topology of the rest of the cluster.  Your client
should be able to talk to all the nodes in at least one DC (usually the
same physical as well as logical DC as the client).  This helps any good
client which will attempt to route queries directly to the replicas for
each piece of data to save the cluster the overhead and increased latency
of coordinating reads and writes.
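
As a minimal cassandra.yaml sketch of the above (all addresses are illustrative,
not taken from this thread):

# Single-DC EC2: gossip and clients both use the instance's private address.
listen_address: 10.0.1.12
rpc_address: 10.0.1.12
broadcast_rpc_address: 10.0.1.12     # the address handed back to drivers

# Multi-DC EC2: broadcast a publicly routable address so remote DCs and clients
# can reach the node (Ec2MultiRegionSnitch fills broadcast_address in for you).
# broadcast_address: <public IP of this instance>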

On Sun, Oct 4, 2015 at 6:41 PM Jonathan Haddad  wrote:

> You've been talking about configuring static, public IPs.  Public IPs are
> only needed if you want to connect to your Cassandra servers from a public
> network aka not from the same datacenter.
>
> AWS instances don't get a new IP on rebooting.  The instance doesn't
> shutdown when you tell a server to reboot, it just keeps running & keeps
> the same IP.
>
> You can connect to the internal address (192.168.x.x) or (10.x.x.x.) if
> you're on the same network.  That's not a public IP.  You don't have to
> hard code an address in your yaml, you can just use rpc_interface and set
> it to eth0 (or whatever AWS uses by default for server's NIC).
>
> Also, you know you can control IP addresses in a VPC, right?
>
> On Sun, Oct 4, 2015 at 8:31 PM Renato Perini 
> wrote:
>
>> Jonathan, I have some difficulties in understanding what you're talking
>> about. A client is a program connecting to a cassandra instance. All it
>> needs to know is an IP, a keyspace and a table to operate. My client is
>> nothing more than a simple textual version of a program like datastax
>> devcenter. No "same dc" concepts are involved for using it.
>> As for AWS, I'm not changing anything. The instances, as I said multiple
>> times, don't have an elastic ip, so the public IP is dynamic. This means it
>> changes automatically at every reboot.
>>
>>
>> On 05/10/2015 02:22, Jonathan Haddad wrote:
>>
>> If your client is in the same DC, then you shouldn't use *public* ip
>> addresses.  If you're using a recent version of Cassandra you can just set
>> the listen_interface and rpc_interface to whatever network interface you've
>> got.
>>
>> If you're really changing IPs when you reboot machines (I have no idea
>> why you'd do this, AWS definitely doesn't work this way) then I think
>> you're going to hit a whole set of other issues.
>>
>>
>> On Sun, Oct 4, 2015 at 7:10 PM Renato Perini 
>> wrote:
>>
>>> Yes, the client uses the same datacenter (us-west-2).
>>> Maybe I haven't explained the situation well. I'm not asking to connect
>>> to nodes *without* using a static IP address, but allowing Cassandra to
>>> determine the current public address at the time of connection.
>>> Spark, for example, uses shell scripts for configuration, so the public
>>> IP (in AWS) can be assigned using the command `curl
>>> http://169.254.169.254/latest/meta-data/public-ipv4`, whatever it is at
>>> the time of boot.
>>> Cassandra uses a yaml file for the main configuration, so this is
>>> impossible to achieve. Basically I would like to make the client connect
>>> correctly on all nodes using their public IPs without being required to
>>> know them (the client would discover them dynamically while connecting).
>>>
>>>
>>>
>>> On 05/10/2015 00:55, Jonathan Haddad wrote:
>>>
>>> So you're not running the client in the same DC as your Cassandra
>>> cluster.  In that case you'll need to be able to connect to the public
>>> address of all the nodes.  Technically you could have a whitelist and only
>>> connect to 1, I wouldn't recommend it.
>>>
>>> This is no different than any other database in that you would need a
>>> public address to be able to connect to the servers from a machine not in
>>> your datacenter.  How else would you connect to them if you don't provide
>>> access?
>>>
>>> On Sun, Oct 4, 2015 at 6:35 PM Renato Perini 
>>> wrote:
>>>
 Seems to be not the case when connecting to my (single) data center
 using the java connector with a small client I have developed for testing.
 For the broadcast_rpc_address I have configured the local IP of the
 nodes. The cluster works fine and nodes communicates fairly well using
 their local IPs. When I connect to a node (let's say node 1) from the
 outside using the java driver and the node's public IP, the cluster
 discovery uses internal IPs for contacting other nodes, leading to
 (obviously) errors.

 As for AWS, Elastic IPs are free as long as the