Michael,

Are you able to connect to any c* node via OpenSSL?

openssl s_client -connect <ip address>:9042

cqlsh <ip address> --ssl
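
If that fails, it may help to first see which listeners accept TCP connections at all. A minimal sketch, assuming bash (for /dev/tcp) and the coreutils timeout command; the probe function is my own helper, and the ports are the Cassandra defaults (7000 storage, 7001 SSL storage, 9042 native transport), so adjust them if your cassandra.yaml overrides them:

```shell
#!/usr/bin/env bash
# probe HOST PORT: report whether a TCP connection succeeds within 3 seconds.
# "closed" here also covers filtered or otherwise unreachable ports.
# Relies on bash's built-in /dev/tcp pseudo-device (an assumption about the shell).
probe() {
  local host="$1" port="$2"
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "${host}:${port} open"
  else
    echo "${host}:${port} closed"
  fi
}

# Example, using the remote node address from the original post:
#   for port in 7000 7001 9042; do probe 10.0.2.7 "$port"; done
# If 7001 is open, look at the certificate chain the node presents:
#   openssl s_client -connect 10.0.2.7:7001 </dev/null
```

Since client_encryption_options is disabled in your config, I would not expect a TLS handshake to succeed on 9042; port 7001 is where the encrypted internode listener lives. With Ec2MultiRegionSnitch it is also worth checking that your security groups allow 7001 between the nodes' public IPs (a guess about your setup, not something I can see from the logs).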

Subroto 

> On Aug 26, 2019, at 2:47 PM, Marc Selwan <marc.sel...@datastax.com> wrote:
> 
> Which exact version of OpenJDK are you using? Is it possible you don't have 
> JCE on those nodes? (I believe more recent versions of Java 8 have this baked 
> in, so that might not be it.)
> 
> 
> Marc Selwan | DataStax | PM, Server Team | (925) 413-7079 | Twitter 
> 
>> On Mon, Aug 26, 2019 at 1:56 PM Michael Carlise 
>> <mcarl...@salesforce.com.invalid> wrote:
>> 
>> I originally opened this issue on stackoverflow 
>> (https://stackoverflow.com/questions/57516660/cassandra-node-to-node-encryption-throws-unable-to-gossip-with-peers-exception).
>>   
>> 
>> However, I haven't gotten any responses in over a week.  I'm going to post 
>> it here and maybe someone will have an idea on where I can look.
>> 
>> We currently run a multi-region Cassandra cluster in AWS: four regions, 12 
>> nodes per region. It runs without node-to-node encryption (and without 
>> client encryption). We are trying to enable inter-datacenter node-to-node 
>> encryption. However, when we switch encryption on, the node fails with an 
>> exception saying it is unable to gossip with any peers.
>> 
>> It is possible that we didn't build our JKS keystores/truststores 
>> correctly (more on how we built these files below). However, we also do not 
>> see intra-datacenter communication working, and that traffic should remain 
>> unencrypted under this setting. In addition, cqlsh cannot connect to the 
>> node, even though require_client_auth is left at its default of false.
>> 
>> ERROR [main] 2019-08-15 18:46:32,241 CassandraDaemon.java:749 - Exception encountered during startup
>> java.lang.RuntimeException: Unable to gossip with any peers
>>         at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1435) ~[apache-cassandra-3.11.4.jar:3.11.4]
>>         at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:566) ~[apache-cassandra-3.11.4.jar:3.11.4]
>>         at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:823) ~[apache-cassandra-3.11.4.jar:3.11.4]
>>         at org.apache.cassandra.service.StorageService.initServer(StorageService.java:683) ~[apache-cassandra-3.11.4.jar:3.11.4]
>>         at org.apache.cassandra.service.StorageService.initServer(StorageService.java:632) ~[apache-cassandra-3.11.4.jar:3.11.4]
>>         at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:388) [apache-cassandra-3.11.4.jar:3.11.4]
>>         at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:620) [apache-cassandra-3.11.4.jar:3.11.4]
>>         at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:732) [apache-cassandra-3.11.4.jar:3.11.4]
>> INFO  [main] 2019-08-15 18:47:07,384 YamlConfigurationLoader.java:89 - Configuration location: file:/etc/cassandra/cassandra.yaml
>> 
>> Note that this error only appears a few minutes after the node starts 
>> (i.e. there is a delay between startup and the exception being thrown).
>> 
>> Information about our cassandra setup
>> 
>> cassandra version: 3.11.4
>> JDK version: openjdk-8.
>> Linux: Ubuntu 18.04 (bionic).
>> 
>> cassandra.yaml
>> 
>> endpoint_snitch: Ec2MultiRegionSnitch
>> 
>> server_encryption_options:
>>   internode_encryption: dc
>>   keystore: <omitted>
>>   keystore_password: <omitted>
>>   truststore: <omitted>
>>   truststore_password: <omitted>
>> 
>> client_encryption_options:
>>   enabled: false
>> 
>> cassandra-rackdc.properties
>> 
>> prefer_local=true
>> 
>> No obvious errors with SSL output
>> 
>> When starting cassandra with JVM_OPTS="$JVM_OPTS -Djavax.net.debug=ssl" 
>> added to cassandra-env.sh we see SSL logs printed to stdout (Note: Subject 
>> and Issuer were omitted on purpose).
>> 
>> found key for : cassy-us-west-2
>> adding as trusted cert:
>>   Subject: ...
>>   Issuer:  ...
>>   Algorithm: RSA; Serial number: 0xdad28d843fc73325d4c1a75207d4e74
>>   Valid from Fri May 27 00:00:00 UTC 2016 until Tue May 26 23:59:59 UTC 2026
>> 
>> ...
>> 
>> trigger seeding of SecureRandom
>> done seeding SecureRandom
>> 
>> Looking at Java SE SSL/TLS connection debugging, this looks correct. But to 
>> note, we see this series of messages (along with the RSA key signature 
>> output) repeated several times in rapid fire. We never observe any messages 
>> about the trust store being added; however that might be something that 
>> occurs only on client initiation (?)
>> 
>> Additionally, we do see cassandra report that the Encrypted Messaging 
>> service has been started.
>> 
>> INFO  [main] 2019-08-15 18:45:31,022 MessagingService.java:704 - Starting 
>> Encrypted Messaging Service on SSL port 7001
>> 
>> Doesn't appear to be a cassandra.yaml configuration problem
>> 
>> We can bring the node back online by simply configuring 
>> internode_encryption: none. This action seems to rule out a 
>> broadcast_address or rpc_address configuration problem.
>> 
>> How we built our keystore/truststores
>> 
>> We followed the basic template in the DataStax docs for preparing SSL 
>> certificates. One minor difference was that our private keys and CSRs were 
>> generated using openssl, one per region (we plan to share the key and signed 
>> cert across the nodes in each region). They were created using a command 
>> template like:
>> 
>> openssl req -new -newkey rsa:2048 -out cassy-<region>.csr -keyout \
>>     cassy-<region>.key -config cassy-<region>.conf -subj "..." -nodes -sha256
>> 
>> The generated CSR was then signed by an internal root CA. Because we 
>> generated our files using openssl, we had to build our jks files by 
>> importing our certs into them.
>> 
>> Commands to generate truststore
>> 
>> We distribute this one file to all nodes.
>> 
>> keytool -importcert \
>>     -keystore generic-server-truststore.jks \
>>     -alias rootCa \
>>     -file rootCa.crt \
>>     -noprompt \
>>     -keypass omitted \
>>     -storepass omitted
>> 
>> Commands to generate keystore
>> 
>> This was done one per region; but essentially we created a keystore with 
>> keytool, then deleted the key entry and then imported our key entry using 
>> keytool from a pkcs12 file.
>> 
>> keytool -genkeypair -keyalg RSA -alias cassy-${region} -keystore \
>>     cassy-${region}.jks -storepass omitted -keypass omitted -validity 365 \
>>     -keysize 2048 -dname "..."
>> 
>> keytool -delete -alias cassy-${region} -keystore cassy-${region}.jks \
>>     -storepass omitted
>> 
>> openssl pkcs12 -export -in signed_certs/${region}.pem -inkey \
>>     keys/cassandra.${region}.key -name cassy-${region} -out ${region}.p12
>> 
>> keytool -importkeystore -deststorepass omitted -destkeystore \
>>     cassy-${region}.jks -srckeystore ${region}.p12 -srcstoretype PKCS12
>> 
>> keytool -importcert -keystore cassy-${region}.jks -alias rootCa -file ca.crt \
>>     -noprompt -keypass omitted -storepass omitted
>> 
>> Looking back at this, I don't remember why we used keytool to generate a 
>> keypair/keystore, then deleted and imported. I think it was because the 
>> keytool importkeystore command refused to run if the keystore didn't already 
>> exist.
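
On the genkeypair/delete step: as far as I know, keytool -importkeystore creates the destination keystore when it does not exist, so the placeholder key pair should not be needed. A sketch under that assumption, reusing the file layout from the commands above; the build_keystore function and the STOREPASS variable are hypothetical names, and I set the key password equal to the store password because cassandra.yaml in 3.11 has only the one keystore_password setting:

```shell
#!/usr/bin/env bash
# build_keystore REGION: turn the OpenSSL private key and signed chain for
# REGION into cassy-REGION.jks. STOREPASS is a hypothetical environment
# variable holding the keystore password.
build_keystore() {
  local region="$1"
  local pass="${STOREPASS:?set STOREPASS first}"

  # Bundle the private key and the certificate chain into a PKCS12 file.
  openssl pkcs12 -export \
    -in "signed_certs/${region}.pem" \
    -inkey "keys/cassandra.${region}.key" \
    -name "cassy-${region}" \
    -passout "pass:${pass}" \
    -out "${region}.p12" || return 1

  # Import the PKCS12 entry into a JKS keystore; keytool creates the
  # destination file if it is missing.
  keytool -importkeystore -noprompt \
    -srckeystore "${region}.p12" -srcstoretype PKCS12 -srcstorepass "${pass}" \
    -destkeystore "cassy-${region}.jks" -deststoretype JKS \
    -deststorepass "${pass}" -destkeypass "${pass}"
}

# Example:
#   STOREPASS=... build_keystore us-west-2
#   keytool -list -v -keystore cassy-us-west-2.jks   # inspect the resulting entry
```

In the keytool -list -v output, the cassy-<region> entry should be a PrivateKeyEntry, and its certificate chain length shows whether the intermediate made it into the keystore.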
>> 
>> ca.crt and pem file
>> 
>> The ca.crt file contains the root certificate and the intermediate 
>> certificate that was used to sign the CSR. The pem file contains the signed 
>> CSR returned to us, the intermediate cert, and the root CA (in that order).
>> 
>> openssl verify ca.crt and pem
>> 
>> openssl verify -CAfile ca.crt us-west-2.pem
>> signed_certs/us-west-2.pem: OK
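
One caveat on this check: openssl verify only verifies the first certificate in the file it is given, so the OK above covers the leaf cert but does not confirm that the intermediate and root are actually present in us-west-2.pem. A small sketch for inspecting a bundle; count_certs and show_chain are hypothetical helper names:

```shell
#!/usr/bin/env bash
# count_certs FILE: print the number of PEM certificates contained in FILE.
count_certs() {
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# show_chain FILE: print the subject and issuer of every certificate in a
# PEM bundle by wrapping the bundle in a PKCS#7 structure and listing it.
show_chain() {
  openssl crl2pkcs7 -nocrl -certfile "$1" | openssl pkcs7 -print_certs -noout
}

# Example:
#   count_certs signed_certs/us-west-2.pem
#   show_chain signed_certs/us-west-2.pem
```

For the files as described above, I would expect count_certs to report 3 for the pem file and 2 for ca.crt. Each certificate's issuer in the show_chain output should match the subject of the next one up the chain.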
>> 
>> Command output after enabling encryption
>> 
>> nodetool status (output truncated)
>> 
>> Datacenter: us-east
>> ===================
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address         Load        Tokens  Owns (effective)  Host ID  Rack
>> ?N  52.44.11.221    ?           256     25.4%             null     1c
>> ...
>> ?N  52.204.232.195  ?           256     23.2%             null     1d
>> Datacenter: us-west-2
>> =====================
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address         Load        Tokens  Owns (effective)  Host ID  Rack
>> ?N  34.209.2.144    ?           256     26.5%             null     2c
>> UN  52.40.32.177    105.99 GiB  256     23.7%             null     2c
>> ?N  34.210.109.203  ?           256     24.7%             null     2a
>> ...
>> 
>> The only node shown as up (UN) is the node with encryption enabled.
>> 
>> cqlsh to localhost
>> 
>> cassy-node6:~$ cqlsh
>> Connection error: ('Unable to connect to any servers', {'127.0.0.1': 
>> error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: 
>> Connection refused")})
>> 
>> cqlsh to remote node
>> 
>> The remote node is a node with encryption enabled.
>> 
>> cassy-node6:~$ cqlsh 10.0.2.7
>> Connection error: ('Unable to connect to any servers', {'10.0.2.7': 
>> error(111, "Tried connecting to [('10.0.2.7', 9042)]. Last error: Connection 
>> refused")})
