One more important piece of information: I checked, and the seeds seem to be different on each node. The command echo `curl http://127.0.0.1:8080/Priam/REST/v1/cassconfig/get_seeds` returns ip2 on the first node and ip1,ip1 on the second node. Any idea why? That is probably what is causing Cassandra to die, right?
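In case it helps anyone reproduce the check, here is a rough sketch of how the seed lists could be compared across nodes. It assumes the get_seeds endpoint returns a plain comma-separated list of addresses, as the output above suggests; the helper names (`parse_seeds`, `fetch_seeds`, `seeds_differ`) are made up for illustration and are not part of Priam.

```python
from urllib.request import urlopen

# Endpoint path taken from the curl command above; host and port vary per node.
SEEDS_PATH = "/Priam/REST/v1/cassconfig/get_seeds"


def parse_seeds(body):
    """Split a comma-separated seed list, dropping blanks and duplicates (order kept)."""
    seeds = []
    for item in body.split(","):
        item = item.strip()
        if item and item not in seeds:
            seeds.append(item)
    return seeds


def fetch_seeds(base_url, timeout=5.0):
    """Fetch and parse the seed list from one node's Priam web app.

    Assumption: the response body is plain text, e.g. "ip1,ip2".
    """
    with urlopen(base_url + SEEDS_PATH, timeout=timeout) as resp:
        return parse_seeds(resp.read().decode("utf-8"))


def seeds_differ(per_node):
    """Return True if the nodes do not all report the same set of seeds."""
    return len({frozenset(seeds) for seeds in per_node.values()}) > 1
```

Calling `fetch_seeds("http://127.0.0.1:8080")` on each node (or against each node's address) and passing the results to `seeds_differ` as a dict keyed by node would show whether the lists disagree. It only reports a difference; it does not say whether that difference is what makes Cassandra die.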

2013/2/27 Marcelo Elias Del Valle <mvall...@gmail.com>

> Hello Ben, Thanks for the willingness to help,
>
> 2013/2/27 Ben Bromhead <b...@instaclustr.com>
>>
>> Have you added the Priam Java agent to Cassandra's JVM arguments (e.g.
>> -javaagent:$CASS_HOME/lib/priam-cass-extensions-1.1.15.jar) and does
>> the web container running Priam have permissions to write to the Cassandra
>> config directory? Also, what do the Priam logs say?
>>
>
> I put the Priam log of the first node below. Yes, I have added
> priam-cass-extensions to the Java args and Priam IS actually writing to
> the Cassandra dir.
>
>
>> If you want to get up and running quickly with Cassandra, AWS and Priam,
>> check out www.instaclustr.com <http://www.instaclustr.com/?cid=cass-list>.
>> We deploy Cassandra under your AWS account and you have full root access
>> to the nodes if you want to explore and play around + there is a free tier
>> which is great for experimenting and trying Cassandra out.
>>
>
> That sounded really great. I am not sure if it would apply to our case
> (will consider it though), but some partners would have a great benefit
> from it, for sure! I will send your link to them.
>
> What priam says:
>
> 2013-02-27 14:14:58.0614 INFO pool-2-thread-1 com.netflix.priam.utils.SystemUtils Calling URL API: http://169.254.169.254/latest/meta-data/public-hostname returns: ec2-174-129-59-107.compute-1.amazonaws.com
> 2013-02-27 14:14:58.0615 INFO pool-2-thread-1 com.netflix.priam.utils.SystemUtils Calling URL API: http://169.254.169.254/latest/meta-data/public-ipv4 returns: 174.129.59.107
> 2013-02-27 14:14:58.0618 INFO pool-2-thread-1 com.netflix.priam.utils.SystemUtils Calling URL API: http://169.254.169.254/latest/meta-data/instance-id returns: i-88b32bfb
> 2013-02-27 14:14:58.0618 INFO pool-2-thread-1 com.netflix.priam.utils.SystemUtils Calling URL API: http://169.254.169.254/latest/meta-data/instance-type returns: c1.medium
> 2013-02-27 14:14:59.0614 INFO pool-2-thread-1 com.netflix.priam.defaultimpl.PriamConfiguration REGION set to us-east-1, ASG Name set to dmp_cluster-useast1b
> 2013-02-27 14:14:59.0746 INFO pool-2-thread-1 com.netflix.priam.defaultimpl.PriamConfiguration appid used to fetch properties is: dmp_cluster
> 2013-02-27 14:14:59.0843 INFO pool-2-thread-1 org.quartz.simpl.SimpleThreadPool Job execution threads will use class loader of thread: pool-2-thread-1
> 2013-02-27 14:14:59.0861 INFO pool-2-thread-1 org.quartz.core.SchedulerSignalerImpl Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
> 2013-02-27 14:14:59.0862 INFO pool-2-thread-1 org.quartz.core.QuartzScheduler Quartz Scheduler v.1.7.3 created.
> 2013-02-27 14:14:59.0864 INFO pool-2-thread-1 org.quartz.simpl.RAMJobStore RAMJobStore initialized.
> 2013-02-27 14:14:59.0864 INFO pool-2-thread-1 org.quartz.impl.StdSchedulerFactory Quartz scheduler 'DefaultQuartzScheduler' initialized from default resource file in Quartz package: 'quartz.properties'
> 2013-02-27 14:14:59.0864 INFO pool-2-thread-1 org.quartz.impl.StdSchedulerFactory Quartz scheduler version: 1.7.3
> 2013-02-27 14:14:59.0864 INFO pool-2-thread-1 org.quartz.core.QuartzScheduler JobFactory set to: com.netflix.priam.scheduler.GuiceJobFactory@1b6a1c4
> 2013-02-27 14:15:00.0239 INFO pool-2-thread-1 com.netflix.priam.aws.AWSMembership Querying Amazon returned following instance in the ASG: us-east-1b --> i-8eb32bfd,i-88b32bfb
> 2013-02-27 14:15:01.0470 INFO Timer-0 org.quartz.utils.UpdateChecker New update(s) found: 1.8.5 [http://www.terracotta.org/kit/reflector?kitID=default&pageID=QuartzChangeLog]
> 2013-02-27 14:15:10.0925 INFO pool-2-thread-1 com.netflix.priam.identity.InstanceIdentity Found dead instances: i-d49a0da7
> 2013-02-27 14:15:11.0397 ERROR pool-2-thread-1 com.netflix.priam.aws.SDBInstanceFactory Conditional check failed. Attribute (instanceId) value exists
> 2013-02-27 14:15:11.0398 ERROR pool-2-thread-1 com.netflix.priam.utils.RetryableCallable Retry #1 for: Status Code: 409, AWS Service: AmazonSimpleDB, AWS Request ID: 96ca7ae5-f352-b13a-febd-8801d46fee83, AWS Error Code: ConditionalCheckFailed, AWS Error Message: Conditional check failed. Attribute (instanceId) value exists
> 2013-02-27 14:15:11.0686 INFO pool-2-thread-1 com.netflix.priam.aws.AWSMembership Querying Amazon returned following instance in the ASG: us-east-1b --> i-8eb32bfd,i-88b32bfb
> 2013-02-27 14:15:25.0258 INFO pool-2-thread-1 com.netflix.priam.identity.InstanceIdentity Found dead instances: i-d89a0dab
> 2013-02-27 14:15:25.0588 INFO pool-2-thread-1 com.netflix.priam.identity.InstanceIdentity Trying to grab slot 1808575601 with availability zone us-east-1b
> 2013-02-27 14:15:25.0732 INFO pool-2-thread-1 com.netflix.priam.identity.InstanceIdentity My token: 56713727820156410577229101240436610842
> 2013-02-27 14:15:25.0732 INFO pool-2-thread-1 org.quartz.core.QuartzScheduler Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED started.
> 2013-02-27 14:15:25.0878 INFO pool-2-thread-1 org.apache.cassandra.db.HintedHandOffManager cluster_name: dmp_cluster
> initial_token: null
> hinted_handoff_enabled: true
> max_hint_window_in_ms: 8
> hinted_handoff_throttle_in_kb: 1024
> max_hints_delivery_threads: 2
> authenticator: org.apache.cassandra.auth.AllowAllAuthenticator
> authorizer: org.apache.cassandra.auth.AllowAllAuthorizer
> partitioner: org.apache.cassandra.dht.RandomPartitioner
> data_file_directories:
> - /var/lib/cassandra/data
> commitlog_directory: /var/lib/cassandra/commitlog
> disk_failure_policy: stop
> key_cache_size_in_mb: null
> key_cache_save_period: 14400
> row_cache_size_in_mb: 0
> row_cache_save_period: 0
> row_cache_provider: SerializingCacheProvider
> saved_caches_directory: /var/lib/cassandra/saved_caches
> commitlog_sync: periodic
> commitlog_sync_period_in_ms: 10000
> commitlog_segment_size_in_mb: 32
> seed_provider:
> - class_name: com.netflix.priam.cassandra.extensions.NFSeedProvider
>   parameters:
>   - seeds: 127.0.0.1
> flush_largest_memtables_at: 0.75
> reduce_cache_sizes_at: 0.85
> reduce_cache_capacity_to: 0.6
> concurrent_reads: 32
> concurrent_writes: 32
> memtable_flush_queue_size: 4
> trickle_fsync: false
> trickle_fsync_interval_in_kb: 10240
> storage_port: 7000
> ssl_storage_port: 7001
> listen_address: null
> start_native_transport: false
> native_transport_port: 9042
> start_rpc: true
> rpc_address: null
> rpc_port: 9160
> rpc_keepalive: true
> rpc_server_type: sync
> thrift_framed_transport_size_in_mb: 15
> thrift_max_message_length_in_mb: 16
> incremental_backups: true
> snapshot_before_compaction: false
> auto_snapshot: true
> column_index_size_in_kb: 64
> in_memory_compaction_limit_in_mb: 128
> multithreaded_compaction: false
> compaction_throughput_mb_per_sec: 8
> compaction_preheat_key_cache: true
> read_request_timeout_in_ms: 10000
> range_request_timeout_in_ms: 10000
> write_request_timeout_in_ms: 10000
> truncate_request_timeout_in_ms: 60000
> request_timeout_in_ms: 10000
> cross_node_timeout: false
> endpoint_snitch: org.apache.cassandra.locator.Ec2Snitch
> dynamic_snitch_update_interval_in_ms: 100
> dynamic_snitch_reset_interval_in_ms: 600000
> dynamic_snitch_badness_threshold: 0.1
> request_scheduler: org.apache.cassandra.scheduler.NoScheduler
> index_interval: 128
> server_encryption_options:
>   internode_encryption: none
>   keystore: conf/.keystore
>   keystore_password: cassandra
>   truststore: conf/.truststore
>   truststore_password: cassandra
> client_encryption_options:
>   enabled: false
>   keystore: conf/.keystore
>   keystore_password: cassandra
> internode_compression: all
> inter_dc_tcp_nodelay: true
> auto_bootstrap: true
> memtable_total_space_in_mb: 1024
> stream_throughput_outbound_megabits_per_sec: 400
> num_tokens: 1
>
> 2013-02-27 14:15:25.0884 INFO pool-2-thread-1 com.netflix.priam.utils.SystemUtils Starting cassandra server ....Join ring=true
> 2013-02-27 14:15:25.0915 INFO pool-2-thread-1 com.netflix.priam.utils.SystemUtils Starting cassandra server ....
> 2013-02-27 14:15:30.0013 INFO http-bio-8080-exec-1 com.netflix.priam.aws.AWSMembership Query on ASG returning 3 instances
> 2013-02-27 14:15:31.0726 INFO http-bio-8080-exec-2 com.netflix.priam.aws.AWSMembership Query on ASG returning 3 instances
> 2013-02-27 14:15:37.0360 INFO DefaultQuartzScheduler_Worker-5 com.netflix.priam.aws.S3FileSystem Uploading to backup/us-east-1/dmp_cluster/56713727820156410577229101240436610842/201302271415/SST/system/local/system-local-ib-1-CompressionInfo.db with chunk size 10485760
>
>
>
> Best regards,
> --
> Marcelo Elias Del Valle
> http://mvalle.com - @mvallebr
>

--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr