Off the top of my head I would check that the Autoscaling Group you created
is restricted to a single Availability Zone. Also, Priam sets the number of
EC2 instances it expects based on the maximum instance count you set on your
scaling group (it did this last time I checked a few months ago; its
behaviour may have changed since).
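For what it's worth, you can sanity check both of those from the command
line, e.g. with the AWS CLI (rough sketch only, using the ASG name and
region that show up in your log below):

  # show the min/max/desired capacity and which AZs the group spans
  aws autoscaling describe-auto-scaling-groups \
      --region us-east-1 \
      --auto-scaling-group-names dmp_cluster-useast1b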

So I would make sure the desired, min and max instance counts for your
scaling group are all the same, make sure your ASG is restricted to a
single availability zone (e.g. us-east-1b) and then (if you are able to and
there is no data in your cluster) delete all the SimpleDB entries Priam has
created and possibly also clear out the Cassandra data directory.
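Roughly something like this (untested sketch; it assumes the AWS CLI is
configured and that Priam is still using its default SimpleDB domain name,
which I think is InstanceIdentity - double check with list-domains first):

  # pin the group to one AZ and a fixed size (2 nodes in this example)
  aws autoscaling update-auto-scaling-group \
      --region us-east-1 \
      --auto-scaling-group-name dmp_cluster-useast1b \
      --availability-zones us-east-1b \
      --min-size 2 --max-size 2 --desired-capacity 2

  # see what Priam has stored
  aws sdb list-domains --region us-east-1
  aws sdb select --region us-east-1 \
      --select-expression "select * from InstanceIdentity"

  # then delete each stale entry, using the item names from the select output
  aws sdb delete-attributes --region us-east-1 \
      --domain-name InstanceIdentity --item-name <item-name-from-select>

Only do that with Cassandra and Priam stopped on both nodes, and only if you
really don't need anything in the cluster, since you'd be wiping
/var/lib/cassandra/data as well.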

Other than that, I see you've raised it as an issue on the Priam project
page, so see what they say ;)

Cheers

Ben

On Thu, Feb 28, 2013 at 3:40 AM, Marcelo Elias Del Valle <mvall...@gmail.com> wrote:

> One additional important piece of info: I checked here and the seeds seem
> really different on each node. The command
> echo `curl http://127.0.0.1:8080/Priam/REST/v1/cassconfig/get_seeds`
> returns ip2 on the first node and ip1,ip1 on the second node.
> Any idea why? It's probably what is causing Cassandra to die, right?
>
>
> 2013/2/27 Marcelo Elias Del Valle <mvall...@gmail.com>
>
>> Hello Ben, thanks for your willingness to help,
>>
>> 2013/2/27 Ben Bromhead <b...@instaclustr.com>
>>>
>>> Have you added the Priam java agent to Cassandra's JVM arguments (e.g.
>>> -javaagent:$CASS_HOME/lib/priam-cass-extensions-1.1.15.jar) and does
>>> the web container running Priam have permission to write to the Cassandra
>>> config directory? Also, what do the Priam logs say?
>>>
>>
>> I put the Priam log of the first node below. Yes, I have added
>> priam-cass-extensions to the java args, and Priam IS actually writing to
>> the Cassandra dir.
>>
>>
>>> If you want to get up and running quickly with Cassandra, AWS and Priam,
>>> check out www.instaclustr.com.
>>> We deploy Cassandra under your AWS account and you have full root access
>>> to the nodes if you want to explore and play around, plus there is a free
>>> tier which is great for experimenting and trying Cassandra out.
>>>
>>
>> That sounds really great. I am not sure whether it would apply to our case
>> (I will consider it, though), but some partners would benefit greatly from
>> it, for sure! I will send them your link.
>>
>> What priam says:
>>
>> 2013-02-27 14:14:58.0614 INFO pool-2-thread-1
>> com.netflix.priam.utils.SystemUtils Calling URL API:
>> http://169.254.169.254/latest/meta-data/public-hostname returns:
>> ec2-174-129-59-107.compute-1.amazonaws.com
>> 2013-02-27 14:14:58.0615 INFO pool-2-thread-1
>> com.netflix.priam.utils.SystemUtils Calling URL API:
>> http://169.254.169.254/latest/meta-data/public-ipv4 returns:
>> 174.129.59.107
>> 2013-02-27 14:14:58.0618 INFO pool-2-thread-1
>> com.netflix.priam.utils.SystemUtils Calling URL API:
>> http://169.254.169.254/latest/meta-data/instance-id returns: i-88b32bfb
>> 2013-02-27 14:14:58.0618 INFO pool-2-thread-1
>> com.netflix.priam.utils.SystemUtils Calling URL API:
>> http://169.254.169.254/latest/meta-data/instance-type returns: c1.medium
>> 2013-02-27 14:14:59.0614 INFO pool-2-thread-1
>> com.netflix.priam.defaultimpl.PriamConfiguration REGION set to us-east-1,
>> ASG Name set to dmp_cluster-useast1b
>> 2013-02-27 14:14:59.0746 INFO pool-2-thread-1
>> com.netflix.priam.defaultimpl.PriamConfiguration appid used to fetch
>> properties is: dmp_cluster
>> 2013-02-27 14:14:59.0843 INFO pool-2-thread-1
>> org.quartz.simpl.SimpleThreadPool Job execution threads will use class
>> loader of thread: pool-2-thread-1
>> 2013-02-27 14:14:59.0861 INFO pool-2-thread-1
>> org.quartz.core.SchedulerSignalerImpl Initialized Scheduler Signaller of
>> type: class org.quartz.core.SchedulerSignalerImpl
>> 2013-02-27 14:14:59.0862 INFO pool-2-thread-1
>> org.quartz.core.QuartzScheduler Quartz Scheduler v.1.7.3 created.
>> 2013-02-27 14:14:59.0864 INFO pool-2-thread-1
>> org.quartz.simpl.RAMJobStore RAMJobStore initialized.
>> 2013-02-27 14:14:59.0864 INFO pool-2-thread-1
>> org.quartz.impl.StdSchedulerFactory Quartz scheduler
>> 'DefaultQuartzScheduler' initialized from default resource file in Quartz
>> package: 'quartz.properties'
>> 2013-02-27 14:14:59.0864 INFO pool-2-thread-1
>> org.quartz.impl.StdSchedulerFactory Quartz scheduler version: 1.7.3
>> 2013-02-27 14:14:59.0864 INFO pool-2-thread-1
>> org.quartz.core.QuartzScheduler JobFactory set to:
>> com.netflix.priam.scheduler.GuiceJobFactory@1b6a1c4
>> 2013-02-27 14:15:00.0239 INFO pool-2-thread-1
>> com.netflix.priam.aws.AWSMembership Querying Amazon returned following
>> instance in the ASG: us-east-1b --> i-8eb32bfd,i-88b32bfb
>> 2013-02-27 14:15:01.0470 INFO Timer-0 org.quartz.utils.UpdateChecker New
>> update(s) found: 1.8.5 [
>> http://www.terracotta.org/kit/reflector?kitID=default&pageID=QuartzChangeLog
>> ]
>> 2013-02-27 14:15:10.0925 INFO pool-2-thread-1
>> com.netflix.priam.identity.InstanceIdentity Found dead instances: i-d49a0da7
>> 2013-02-27 14:15:11.0397 ERROR pool-2-thread-1
>> com.netflix.priam.aws.SDBInstanceFactory Conditional check failed.
>> Attribute (instanceId) value exists
>> 2013-02-27 14:15:11.0398 ERROR pool-2-thread-1
>> com.netflix.priam.utils.RetryableCallable Retry #1 for: Status Code: 409,
>> AWS Service: AmazonSimpleDB, AWS Request ID:
>> 96ca7ae5-f352-b13a-febd-8801d46fee83, AWS Error Code: ConditionalCheckFailed, AWS Error Message:
>> Conditional check failed. Attribute (instanceId) value exists
>> 2013-02-27 14:15:11.0686 INFO pool-2-thread-1
>> com.netflix.priam.aws.AWSMembership Querying Amazon returned following
>> instance in the ASG: us-east-1b --> i-8eb32bfd,i-88b32bfb
>> 2013-02-27 14:15:25.0258 INFO pool-2-thread-1
>> com.netflix.priam.identity.InstanceIdentity Found dead instances: i-d89a0dab
>> 2013-02-27 14:15:25.0588 INFO pool-2-thread-1
>> com.netflix.priam.identity.InstanceIdentity Trying to grab slot 1808575601
>> with availability zone us-east-1b
>> 2013-02-27 14:15:25.0732 INFO pool-2-thread-1
>> com.netflix.priam.identity.InstanceIdentity My token:
>> 56713727820156410577229101240436610842
>> 2013-02-27 14:15:25.0732 INFO pool-2-thread-1
>> org.quartz.core.QuartzScheduler Scheduler
>> DefaultQuartzScheduler_$_NON_CLUSTERED started.
>> 2013-02-27 14:15:25.0878 INFO pool-2-thread-1
>> org.apache.cassandra.db.HintedHandOffManager cluster_name: dmp_cluster
>> initial_token: null
>> hinted_handoff_enabled: true
>> max_hint_window_in_ms: 8
>> hinted_handoff_throttle_in_kb: 1024
>> max_hints_delivery_threads: 2
>> authenticator: org.apache.cassandra.auth.AllowAllAuthenticator
>> authorizer: org.apache.cassandra.auth.AllowAllAuthorizer
>> partitioner: org.apache.cassandra.dht.RandomPartitioner
>> data_file_directories:
>> - /var/lib/cassandra/data
>> commitlog_directory: /var/lib/cassandra/commitlog
>> disk_failure_policy: stop
>> key_cache_size_in_mb: null
>> key_cache_save_period: 14400
>> row_cache_size_in_mb: 0
>> row_cache_save_period: 0
>> row_cache_provider: SerializingCacheProvider
>> saved_caches_directory: /var/lib/cassandra/saved_caches
>> commitlog_sync: periodic
>> commitlog_sync_period_in_ms: 10000
>> commitlog_segment_size_in_mb: 32
>> seed_provider:
>> - class_name: com.netflix.priam.cassandra.extensions.NFSeedProvider
>>   parameters:
>>   - seeds: 127.0.0.1
>> flush_largest_memtables_at: 0.75
>> reduce_cache_sizes_at: 0.85
>> reduce_cache_capacity_to: 0.6
>> concurrent_reads: 32
>> concurrent_writes: 32
>> memtable_flush_queue_size: 4
>> trickle_fsync: false
>> trickle_fsync_interval_in_kb: 10240
>> storage_port: 7000
>> ssl_storage_port: 7001
>> listen_address: null
>> start_native_transport: false
>> native_transport_port: 9042
>> start_rpc: true
>> rpc_address: null
>> rpc_port: 9160
>> rpc_keepalive: true
>> rpc_server_type: sync
>> thrift_framed_transport_size_in_mb: 15
>> thrift_max_message_length_in_mb: 16
>> incremental_backups: true
>> snapshot_before_compaction: false
>> auto_snapshot: true
>> column_index_size_in_kb: 64
>> in_memory_compaction_limit_in_mb: 128
>> multithreaded_compaction: false
>> compaction_throughput_mb_per_sec: 8
>> compaction_preheat_key_cache: true
>> read_request_timeout_in_ms: 10000
>> range_request_timeout_in_ms: 10000
>> write_request_timeout_in_ms: 10000
>> truncate_request_timeout_in_ms: 60000
>> request_timeout_in_ms: 10000
>> cross_node_timeout: false
>> endpoint_snitch: org.apache.cassandra.locator.Ec2Snitch
>> dynamic_snitch_update_interval_in_ms: 100
>> dynamic_snitch_reset_interval_in_ms: 600000
>> dynamic_snitch_badness_threshold: 0.1
>> request_scheduler: org.apache.cassandra.scheduler.NoScheduler
>> index_interval: 128
>> server_encryption_options:
>>   internode_encryption: none
>>   keystore: conf/.keystore
>>   keystore_password: cassandra
>>   truststore: conf/.truststore
>>   truststore_password: cassandra
>> client_encryption_options:
>>   enabled: false
>>   keystore: conf/.keystore
>>   keystore_password: cassandra
>> internode_compression: all
>> inter_dc_tcp_nodelay: true
>> auto_bootstrap: true
>> memtable_total_space_in_mb: 1024
>> stream_throughput_outbound_megabits_per_sec: 400
>> num_tokens: 1
>>
>> 2013-02-27 14:15:25.0884 INFO pool-2-thread-1
>> com.netflix.priam.utils.SystemUtils Starting cassandra server ....Join
>> ring=true
>> 2013-02-27 14:15:25.0915 INFO pool-2-thread-1
>> com.netflix.priam.utils.SystemUtils Starting cassandra server ....
>> 2013-02-27 14:15:30.0013 INFO http-bio-8080-exec-1
>> com.netflix.priam.aws.AWSMembership Query on ASG returning 3 instances
>> 2013-02-27 14:15:31.0726 INFO http-bio-8080-exec-2
>> com.netflix.priam.aws.AWSMembership Query on ASG returning 3 instances
>> 2013-02-27 14:15:37.0360 INFO DefaultQuartzScheduler_Worker-5
>> com.netflix.priam.aws.S3FileSystem Uploading to
>> backup/us-east-1/dmp_cluster/56713727820156410577229101240436610842/201302271415/SST/system/local/system-local-ib-1-CompressionInfo.db
>> with chunk size 10485760
>>
>>
>>
>> Best regards,
>> --
>> Marcelo Elias Del Valle
>> http://mvalle.com - @mvallebr
>>
>
>
>
> --
> Marcelo Elias Del Valle
> http://mvalle.com - @mvallebr
>



-- 
Ben Bromhead

Co-founder
*relational.io* | @benbromhead <https://twitter.com/BenBromhead> | ph: +61 415 936 359
