Re: TWCS sstables gets merged following node removal

2018-12-18 Thread Roy Burstein
Read repair is disabled on this table:

CREATE TABLE gil_test.my_test (
id int,
creation_time timestamp,
...
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 3600
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = 'NONE';

SSTable 1228 is the result of compacting 873 and 1196, so it makes sense
that they would have the same max timestamp.
1196 is an SSTable that was created during streaming (I added that part of
the log before); since it was compacted right away when the node removal
finished, I don't have its metadata.

When data is being streamed from other nodes during node removal, can a
single stream mix data from different time windows?
Also, even if streams are separated by window, what would stop old data from
being stored in the same memtable as data just being written by the application?


On Mon, Dec 17, 2018 at 7:18 PM Jeff Jirsa  wrote:

>
> The min timestamps vary (likely due to read repairing old values into the
> memtable and flushing into these sstables), but the max timestamps for both
> are in the same second (same microsecond, even, so probably the same write):
>
> Maximum timestamp: 1544903882074190
> Maximum timestamp: 1544903882074190
>
> jjirsa:~ jjirsa$ date -r 1544903882
> Sat Dec 15 11:58:02 PST 2018
>
> TWCS buckets based on max timestamp per file, so they belong together:
>
>
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L247
>
>
>
> On Sun, Dec 16, 2018 at 11:39 PM Roy Burstein 
> wrote:
>
>> Hey Jeff, attaching more information.
>> This is the situation before: 3 nodes in the cluster (3.11.3 in this
>> case, but I saw the same thing in 2.1 and 3.0). There is a script writing one
>> row every minute, and another script running nodetool flush every 10 minutes.
>> The window is defined as two hours, so after a few days this is how the
>> directory listing looks:
>>
>> drwxr-xr-x 2 cassandra cassandra 4096 Dec 11 10:38 backups
>> -rw-r--r-- 1 cassandra cassandra  646 Dec 12 05:25 mc-171-big-Index.db
>> -rw-r--r-- 1 cassandra cassandra  104 Dec 12 05:25 mc-171-big-Filter.db
>> -rw-r--r-- 1 cassandra cassandra   56 Dec 12 05:25 mc-171-big-Summary.db
>> -rw-r--r-- 1 cassandra cassandra 3561 Dec 12 05:25 mc-171-big-Data.db
>> -rw-r--r-- 1 cassandra cassandra   10 Dec 12 05:25 mc-171-big-Digest.crc32
>> -rw-r--r-- 1 cassandra cassandra   59 Dec 12 05:25
>> mc-171-big-CompressionInfo.db
>> -rw-r--r-- 1 cassandra cassandra 4893 Dec 12 05:25
>> mc-171-big-Statistics.db
>> -rw-r--r-- 1 cassandra cassandra   92 Dec 12 05:25 mc-171-big-TOC.txt
>> -rw-r--r-- 1 cassandra cassandra  565 Dec 12 05:25 mc-172-big-Index.db
>> -rw-r--r-- 1 cassandra cassandra   96 Dec 12 05:25 mc-172-big-Filter.db
>> -rw-r--r-- 1 cassandra cassandra   56 Dec 12 05:25 mc-172-big-Summary.db
>> -rw-r--r-- 1 cassandra cassandra 3475 Dec 12 05:25 mc-172-big-Data.db
>> -rw-r--r-- 1 cassandra cassandra   10 Dec 12 05:25 mc-172-big-Digest.crc32
>> -rw-r--r-- 1 cassandra cassandra   59 Dec 12 05:25
>> mc-172-big-CompressionInfo.db
>> -rw-r--r-- 1 cassandra cassandra 4865 Dec 12 05:25
>> mc-172-big-Statistics.db
>> -rw-r--r-- 1 cassandra cassandra   92 Dec 12 05:25 mc-172-big-TOC.txt
>> -rw-r--r-- 1 cassandra cassandra  637 Dec 12 05:25 mc-173-big-Index.db
>> -rw-r--r-- 1 cassandra cassandra  104 Dec 12 05:25 mc-173-big-Filter.db
>> -rw-r--r-- 1 cassandra cassandra   56 Dec 12 05:25 mc-173-big-Summary.db
>> -rw-r--r-- 1 cassandra cassandra 3678 Dec 12 05:25 mc-173-big-Data.db
>> -rw-r--r-- 1 cassandra cassandra   10 Dec 12 05:25 mc-173-big-Digest.crc32
>> -rw-r--r-- 1 cassandra cassandra   59 Dec 12 05:25
>> mc-173-big-CompressionInfo.db
>> -rw-r--r-- 1 cassandra cassandra   92 Dec 12 05:25 mc-173-big-TOC.txt
>> -rw-r--r-- 1 cassandra cassandra 4888 Dec 12 05:25
>> mc-173-big-Statistics.db
>> .
>> .
>> -rw-r--r-- 1 cassandra cassandra  340 Dec 15 20:10 mc-873-big-Index.db
>> -rw-r--r-- 1 cassandra cassandra   64 Dec 15 20:10 mc-873-big-Filter.db
>> -rw-r--r-- 1 cassandra cassandra   56 Dec 15 20:10 mc-873-big-Summary.db
>> -rw-r--r-- 1 cassandra cassandra 1910 Dec 15 20:10 mc-873-big-Data.db
>> -rw-r--r-- 1 cassandra cassandra   10 Dec 15 20:10 mc-873-big-Digest.crc32
>> -rw-r--r-- 1 cassandra cassandra   51 Dec 15 20:10
>> mc-873-big-CompressionInfo.db
>> -rw-r--r-- 1 cassandra cassandra 4793 Dec 15 20:10
>> mc-873-big-Statistics.db
>> -rw-r--r-- 1 cassandra cassandra   92 Dec 15 20:10 mc-873-big-TOC.txt
>> .
>> .
>> .
>> -rw-r--r-- 1 cassandra cassandra   24 Dec 17 06:50 mc-1150-big-Filter.db
>> -rw-r--r-- 1 cassandra cassandra   51 Dec 17 06:50 mc-1150-big-Index.db
>> -rw-r--r-- 1 cassandra cassandra   56 Dec 17 06:50 mc-1150-big-Summary.db
>> -rw-r--r-- 1 cassandra cassandra   10 Dec 17 06:50
>> mc-1150-big-Digest.crc32
>> -rw-r--r-- 1 cassandra cassandra  226 De

Re: Cassandra Integrated Auth for JMX

2018-12-18 Thread Cyril Scetbon
So it seems we can use JAAS with Jolokia as we can with JMX. Has anyone set
it up?

I tried adding authMode=jaas to the Jolokia agent's configuration in
jvm.options, and at the end I get the following set of options:

-javaagent:/usr/local/share/jolokia-agent.jar=host=0.0.0.0,executor=fixed,authMode=jaas
 
-Dcom.sun.management.jmxremote.authenticate=true, 
-Dcassandra.jmx.remote.login.config=CassandraLogin, 
-Djava.security.auth.login.config=/etc/cassandra/cassandra-jaas.config, 
-Dcassandra.jmx.authorizer=org.apache.cassandra.auth.jmx.AuthorizationProxy, 
-Dcom.sun.management.jmxremote, 
-Dcom.sun.management.jmxremote.ssl=false, 
-Dcom.sun.management.jmxremote.local.only=false, 
-Dcassandra.jmx.remote.port=7199, 
-Dcom.sun.management.jmxremote.rmi.port=7199, 
-Djava.rmi.server.hostname=2a1d064ce844, 

It seems I'm missing something, because I always get 401 HTTP return codes.
Maybe the realm configuration, or something else?
—
Cyril Scetbon
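
One thing worth checking, as a sketch rather than a verified fix: in Jolokia's JAAS mode the agent's realm option is used as the JAAS login context name, and it defaults to "jolokia". Cassandra's shipped cassandra-jaas.config defines a CassandraLogin context (the same name passed via -Dcassandra.jmx.remote.login.config), so the agent options would plausibly need to name it explicitly. The realm=CassandraLogin value below is an assumption based on that login context name:

```
-javaagent:/usr/local/share/jolokia-agent.jar=host=0.0.0.0,executor=fixed,authMode=jaas,realm=CassandraLogin
```

If the realm doesn't match a login context in the JAAS config file, every JAAS login attempt fails and the agent answers 401, which would match the symptom described above.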

> On Dec 16, 2018, at 2:07 PM, Cyril Scetbon  wrote:
> 
> Good catch Jonathan, I forgot that layer between me and JMX… So I need to add 
> the authentication at Jolokia’s level and not JMX. 
> 
> Thank you !
> —
> Cyril Scetbon
> 
>> On Dec 16, 2018, at 12:50 PM, Jonathan Haddad wrote:
>> 
>> Jolokia is running as an agent, which means it runs in process and has 
>> access to everything within the JVM.
>> 
>> JMX credentials are supplied to the JMX server, which Jolokia is bypassing.
>> 
>> You'll need to read up on Jolokia's security if you want to keep using it: 
>> https://jolokia.org/reference/html/security.html 
>> 
>> 
>> Jon
>> 
>> On Sun, Dec 16, 2018 at 7:26 AM Cyril Scetbon wrote:
>> Hey guys,
>> 
>> I've followed
>> https://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/secureJmxAuthentication.html
>> to set up JMX with Cassandra's internal auth, using Cassandra 3.11.3.
>> 
>> However, I can still connect to JMX without authenticating. You can see in
>> the following attempts that authentication is set up:
>> 
>> cassandra@2a1d064ce844 / $ cqlsh -u cassandra -p cassandra
>> Connected to MyCluster at 127.0.0.1:9042.
>> [cqlsh 5.0.1 | Cassandra 3.11.3 | CQL spec 3.4.4 | Native protocol v4]
>> Use HELP for help.
>> cassandra@cqlsh>
>> 
>> cassandra@2a1d064ce844 / $ cqlsh -u cassandra -p cassandra2
>> Connection error: ('Unable to connect to any servers', {'127.0.0.1': 
>> AuthenticationFailed('Failed to authenticate to 127.0.0.1: 
>> Error from server: code=0100 [Bad credentials] 
>> message="Provided username cassandra and/or password are incorrect"',)})
>> 
>> Here is my whole JVM configuration:
>> 
>> -Xloggc:/var/log/cassandra/gc.log, -XX:+UseThreadPriorities, 
>> -XX:ThreadPriorityPolicy=42, -XX:+HeapDumpOnOutOfMemoryError, -Xss256k, 
>> -XX:StringTableSize=103, -XX:+AlwaysPreTouch, -XX:-UseBiasedLocking, 
>> -XX:+UseTLAB, -XX:+ResizeTLAB, -Djava.net.preferIPv4Stack=true, -Xms128M, 
>> -Xmx128M, -XX:+UseG1GC, -XX:G1RSetUpdatingPauseTimePercent=5, 
>> -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintHeapAtGC, 
>> -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, 
>> -XX:+PrintPromotionFailure, 
>> -javaagent:/usr/local/share/jolokia-agent.jar=host=0.0.0.0,executor=fixed, 
>> -javaagent:/usr/local/share/prometheus-agent.jar=1234:/etc/cassandra/prometheus.yaml,
>>  -XX:+PrintCommandLineFlags, -Xloggc:/var/lib/cassandra/log/gc.log, 
>> -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=10, -XX:GCLogFileSize=10M, 
>> -Dcassandra.migration_task_wait_in_seconds=1, 
>> -Dcassandra.ring_delay_ms=3, 
>> -XX:CompileCommandFile=/etc/cassandra/hotspot_compiler, 
>> -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar, 
>> -Dcassandra.jmx.remote.port=7199, 
>> -Dcom.sun.management.jmxremote.rmi.port=7199, 
>> -Djava.library.path=/usr/share/cassandra/lib/sigar-bin, 
>> -Dcom.sun.management.jmxremote.authenticate=true, 
>> -Dcassandra.jmx.remote.login.config=CassandraLogin, 
>> -Djava.security.auth.login.config=/etc/cassandra/cassandra-jaas.config, 
>> -Dcassandra.jmx.authorizer=org.apache.cassandra.auth.jmx.AuthorizationProxy, 
>> -Dcom.sun.management.jmxremote, -Dcom.sun.management.jmxremote.ssl=false, 
>> -Dcom.sun.management.jmxremote.local.only=false, 
>> -Dcassandra.jmx.remote.port=7199, 
>> -Dcom.sun.management.jmxremote.rmi.port=7199, 
>> -Djava.rmi.server.hostname=2a1d064ce844, 
>> -Dcassandra.libjemalloc=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1, 
>> -XX:OnOutOfMemoryError=kill -9 %p, -Dlogback.configurationFile=logback.xml, 
>> -Dcassandra.logdir=/var/log/cassandra, 
>> -Dcassandra.storagedir=/var/lib/cassandra, -Dcassandra-foreground=yes
>> 
>> But I can still query JMX without authenticating:
>> 
>> echo '{"mbean": "org.apache.cassandra.db:typ

Re: TWCS sstables gets merged following node removal

2018-12-18 Thread Jeff Jirsa
The read repair you have disabled is the probabilistic background read repair; 
foreground read repair due to a digest mismatch still happens.

Streaming should respect windows. Streaming doesn't write to the memtable; only 
the write path puts data into the memtable.
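
To make the bucketing mentioned earlier concrete, here is a minimal sketch of how TWCS-style grouping by max timestamp works. This is a simplified illustration of the window arithmetic (floor the max timestamp to its window start), not the real TimeWindowCompactionStrategy implementation; names and the 2-hour window are taken from this thread.

```python
from collections import defaultdict

WINDOW_SECONDS = 2 * 60 * 60  # the two-hour window used in this thread

def window_start(max_timestamp_micros):
    """Floor an sstable's max timestamp (microseconds) to its window start (epoch seconds)."""
    seconds = max_timestamp_micros // 1_000_000
    return seconds - (seconds % WINDOW_SECONDS)

def bucket(sstables):
    """Group sstable names by the window their max timestamp falls into.

    sstables: dict of name -> max timestamp in microseconds.
    """
    buckets = defaultdict(list)
    for name, max_ts in sstables.items():
        buckets[window_start(max_ts)].append(name)
    return buckets

# 873 and 1196 reported the same max timestamp, so they fall into one
# window bucket and are eligible to be compacted together (into 1228).
same_window = bucket({"mc-873": 1544903882074190, "mc-1196": 1544903882074190})
```

Because the key is derived from the *max* timestamp only, a freshly streamed sstable whose max timestamp matches an old file's window is grouped with it, regardless of how old its minimum timestamp is.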


-- 
Jeff Jirsa


> On Dec 18, 2018, at 1:49 AM, Roy Burstein  wrote:
> 
> read repair is disabled in this table :
> 
> CREATE TABLE gil_test.my_test (
> id int,
> creation_time timestamp,
> ...
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.0
> AND default_time_to_live = 0
> AND gc_grace_seconds = 3600
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = 'NONE';
> 
> 1228 is the result of compaction from 873 and 1196, so it makes sense that 
> they would have the same max timestamp
> 1196 is an sstable that was created during streaming (added that part of the 
> log before), since it was compacted right away when the node removal was done 
> i don't have its metadata.
> 
> when data is being streamed from other nodes during node removal, do i have 
> data mixed from different time window in one stream? 
> also even if it is being separated, what would stop old data being stored in 
> the same memtable as data just being written from the application?
> 
> 
>> On Mon, Dec 17, 2018 at 7:18 PM Jeff Jirsa  wrote:
>> 
>> The min timestamps vary (likely due to read repairing old values into the 
>> memtable and flushing into these sstables), but the max timestamps for both 
>> are in the same second (same microsecond, even, so probably the same write):
>> 
>> Maximum timestamp: 1544903882074190
>> Maximum timestamp: 1544903882074190
>> 
>> jjirsa:~ jjirsa$ date -r 1544903882
>> Sat Dec 15 11:58:02 PST 2018
>> 
>> TWCS buckets based on max timestamp per file, so they belong together: 
>> 
>> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java#L247
>> 
>> 
>> 
>>> On Sun, Dec 16, 2018 at 11:39 PM Roy Burstein  
>>> wrote:
>>> hey jeff, attaching more information.
>>> so this the situation before - 3 nodes in the cluster (3.11.3 in this case 
>>> but i saw same thing in 2.1 and 3.0), there is a script writing one row 
>>> every minute and another script doing nodetool flush every 10 minute.
>>> window is defined as two hours, so after a few days this is how the 
>>> directory listing looks :
>>> 
>>> drwxr-xr-x 2 cassandra cassandra 4096 Dec 11 10:38 backups
>>> -rw-r--r-- 1 cassandra cassandra  646 Dec 12 05:25 mc-171-big-Index.db
>>> -rw-r--r-- 1 cassandra cassandra  104 Dec 12 05:25 mc-171-big-Filter.db
>>> -rw-r--r-- 1 cassandra cassandra   56 Dec 12 05:25 mc-171-big-Summary.db
>>> -rw-r--r-- 1 cassandra cassandra 3561 Dec 12 05:25 mc-171-big-Data.db
>>> -rw-r--r-- 1 cassandra cassandra   10 Dec 12 05:25 mc-171-big-Digest.crc32
>>> -rw-r--r-- 1 cassandra cassandra   59 Dec 12 05:25 
>>> mc-171-big-CompressionInfo.db
>>> -rw-r--r-- 1 cassandra cassandra 4893 Dec 12 05:25 mc-171-big-Statistics.db
>>> -rw-r--r-- 1 cassandra cassandra   92 Dec 12 05:25 mc-171-big-TOC.txt
>>> -rw-r--r-- 1 cassandra cassandra  565 Dec 12 05:25 mc-172-big-Index.db
>>> -rw-r--r-- 1 cassandra cassandra   96 Dec 12 05:25 mc-172-big-Filter.db
>>> -rw-r--r-- 1 cassandra cassandra   56 Dec 12 05:25 mc-172-big-Summary.db
>>> -rw-r--r-- 1 cassandra cassandra 3475 Dec 12 05:25 mc-172-big-Data.db
>>> -rw-r--r-- 1 cassandra cassandra   10 Dec 12 05:25 mc-172-big-Digest.crc32
>>> -rw-r--r-- 1 cassandra cassandra   59 Dec 12 05:25 
>>> mc-172-big-CompressionInfo.db
>>> -rw-r--r-- 1 cassandra cassandra 4865 Dec 12 05:25 mc-172-big-Statistics.db
>>> -rw-r--r-- 1 cassandra cassandra   92 Dec 12 05:25 mc-172-big-TOC.txt
>>> -rw-r--r-- 1 cassandra cassandra  637 Dec 12 05:25 mc-173-big-Index.db
>>> -rw-r--r-- 1 cassandra cassandra  104 Dec 12 05:25 mc-173-big-Filter.db
>>> -rw-r--r-- 1 cassandra cassandra   56 Dec 12 05:25 mc-173-big-Summary.db
>>> -rw-r--r-- 1 cassandra cassandra 3678 Dec 12 05:25 mc-173-big-Data.db
>>> -rw-r--r-- 1 cassandra cassandra   10 Dec 12 05:25 mc-173-big-Digest.crc32
>>> -rw-r--r-- 1 cassandra cassandra   59 Dec 12 05:25 
>>> mc-173-big-CompressionInfo.db
>>> -rw-r--r-- 1 cassandra cassandra   92 Dec 12 05:25 mc-173-big-TOC.txt
>>> -rw-r--r-- 1 cassandra cassandra 4888 Dec 12 05:25 mc-173-big-Statistics.db
>>> .
>>> .
>>> -rw-r--r-- 1 cassandra cassandra  340 Dec 15 20:10 mc-873-big-Index.db
>>> -rw-r--r-- 1 cassandra cassandra   64 Dec 15 20:10 mc-873-big-Filter.db
>>> -rw-r--r-- 1 cassandra cassandra   56 Dec 15 20:10 mc-873-big-Summary.db
>>> -rw-r--r-- 1 cassandra cassandra 1910 Dec 15 20:10 mc-873-big-Data.db
>>> -rw-r--r-- 1 cassandra cassandra   10 Dec 15 20:10 mc-873-big-Digest.crc32
>>> -rw-r--r-- 1 cassandra cassandra   51 Dec 15 20:10 
>>> mc-873-big-CompressionInfo.db
>>> -rw-r--r-- 1 cassandra cassandra 4793 Dec 15 20:10 mc-873-b

Re: Timestamp of Last Repair

2018-12-18 Thread Fred Habash
Much appreciated. 


-
Thank you. 

> On Dec 11, 2018, at 10:40 PM, Laxmikant Upadhyay  
> wrote:
> 
> The info below may be helpful for you:
> 
> 1. In system.log (grep for the patterns below):
> 
> RepairSession.java (line 244) [repair #2e7009b0-c03d-11e4-9012-99a64119c9d8] 
> new session:
> RepairSession.java (line 282) [repair #2e7009b0-c03d-11e4-9012-99a64119c9d8] 
> session completed successfully
> 
> 2. In tables, you can check the started_at and finished_at fields in 
> system_distributed.parent_repair_history
> 
> regards,
> Laxmikant
> 
>> On Wed, Dec 12, 2018 at 12:54 AM Fred Habash  wrote:
>> We are trying to detect a scenario where some of our smaller clusters go 
>> un-repaired for extended periods of time, mostly due to defects in 
>> deployment pipelines or human error.
>> 
>> We would like to automate a check for clusters with nodes that go 
>> un-repaired for more than 7 days, and shoot out an exception alert. 
>> 
>> The 'Repaired at' field emits a long integer. I'm not sure if this can be 
>> converted to a timestamp. If not, is there an internal dictionary table in 
>> C* that captures repair history? If not, again, can this be done at all?
>> 
>> 
>> 
>> Thank you
>> 
>> 
> 
> 
> -- 
> 
> regards,
> Laxmikant Upadhyay
>
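
For anyone finding this thread later: the 'Repaired at' long printed by sstablemetadata is, to the best of my knowledge, epoch milliseconds, with 0 meaning the sstable has never been marked repaired. A minimal sketch of the 7-day staleness check described above, under that assumption:

```python
from datetime import datetime, timedelta, timezone

SEVEN_DAYS = timedelta(days=7)

def repaired_at_to_datetime(repaired_at_ms):
    """Convert a 'Repaired at' long (assumed epoch milliseconds) to a UTC datetime."""
    return datetime.fromtimestamp(repaired_at_ms / 1000, tz=timezone.utc)

def is_stale(repaired_at_ms, now):
    """True if the sstable was last repaired more than 7 days before `now`, or never."""
    if repaired_at_ms == 0:
        return True  # never marked repaired
    return now - repaired_at_to_datetime(repaired_at_ms) > SEVEN_DAYS
```

For a cluster-wide check it is probably more robust to scan system_distributed.parent_repair_history (as suggested above) than to shell out to sstablemetadata on every node, since the table already stores started_at/finished_at as timestamps.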