Many thanks for the suggestion. I successfully ran the aurora.pex job create command after restarting mesos-slave with the provided rack, host, and ip values.
Nevertheless, the UI now shows the following error for this job:

'THROTTLED : Rescheduled, penalized for 60000 ms for flapping'

Here is the aurora-scheduler log for it:
https://gist.github.com/xasima/12de906475d70523316a#comment-1396134

I should mention that the thermos and executor pex files are built with the appropriate mesos eggs (see below), which are also provided to aurora-scheduler.

ls /opt/apache-aurora-0.7.0-incubating/third_party/
mesos-0.20.1-py2.7-linux-x86_64.egg
mesos.interface-0.20.1-py2.7.egg
mesos.native-0.20.1-py2.7-linux-x86_64.egg
mesos-0.21.0-py2.7-linux-x86_64.egg
mesos.interface-0.21.1-py2.7.egg
mesos.native-0.21.1-py2.7-linux-x86_64.egg

How can I overcome this?

On Wed, Feb 18, 2015 at 12:18 AM, Steve Niemitz <st...@tellapart.com> wrote:

> Is there a reason you set zk_in_proc=true? Setting it tells the scheduler
> to ignore the "real" ZK server and use an in-proc one instead.
>
> -zk_in_proc=false
>     Launches an embedded zookeeper server for local testing causing
>     -zk_endpoints to be ignored if specified.
>
> (com.twitter.common.zookeeper.guice.client.flagged.FlaggedClientConfig.zk_in_proc)
>
> On Tue, Feb 17, 2015 at 4:09 PM, Xasima <xas...@gmail.com> wrote:
>
> > Hello. I'm running into the following problems when trying to perform the
> > very first 'aurora.pex job create' command:
> > 1) 'Could not connect to scheduler: No schedulers detected in devcluster'
> > and
> > 2) 'Failed to connect to Zookeeper within 10 seconds.'
> >
> > I have tried to check everything in the configuration, but I can't find
> > the root of the problem so far. I have zookeeper, mesos-master,
> > mesos-slave, and aurora-scheduler running on the same server. The one
> > small difference from the default vagrant/example configuration is the
> > use of a non-default http_port for the aurora scheduler.
> >
> > Namely, I have the aurora scheduler running with the following /vars
> > property:
> >
> > *jvm_prop_sun_java_command* org.apache.aurora.scheduler.app.SchedulerMain
> > -thermos_executor_path=/opt/apache-aurora-0.7.0-incubating/dist/thermos_executor.pex
> > -gc_executor_path=/opt/apache-aurora-0.7.0-incubating/dist/gc_executor.pex
> > -http_port=8091 -zk_in_proc=true -zk_endpoints=localhost:2181
> > -zk_session_timeout=2secs -serverset_path=/aurora/scheduler
> > -mesos_master_address=zk://localhost:2181/mesos -cluster_name=devcluster
> > -native_log_quorum_size=1
> > -native_log_file_path=/usr/local/aurora-scheduler/db
> > -native_log_zk_group_path=/local/service/mesos-native-log
> > -backup_dir=/usr/local/aurora-scheduler/backups -logtostderr -vlog=INFO
> >
> > and here is the successful tail of the aurora-scheduler log:
> >
> > W0217 20:42:25.952 THREAD140 com.twitter.common.zookeeper.ServerSetImpl.join: Joining a ServerSet without a shard ID is deprecated and will soon break.
> > com.twitter.common.zookeeper.Group$ActiveMembership.join: Set group member ID to member_0000000001
> >
> > I0217 20:42:26.026 THREAD132 com.twitter.common.zookeeper.ServerSetImpl$ServerSetWatcher.logChange: server set /aurora/scheduler change: from 0 members to 1
> > joined:
> > ServiceInstance(serviceEndpoint:Endpoint(host:bymsq-bsu-hmetrics002, port:8091), additionalEndpoints:{http=Endpoint(host:bymsq-bsu-hmetrics002, port:8091)}, status:ALIVE)
> >
> > I0217 20:42:26.026 THREAD132 org.apache.aurora.scheduler.http.LeaderRedirect$SchedulerMonitor.onChange: Found leader scheduler at [ServiceInstance(serviceEndpoint:Endpoint(host:bymsq-bsu-hmetrics002, port:8091), additionalEndpoints:{http=Endpoint(host:bymsq-bsu-hmetrics002, port:8091)}, status:ALIVE)]
> >
> > Not sure if this is suspicious, but in zookeeper I see the
> > /local/service/mesos-native-log/0000000010 and /mesos/info_000000003
> > nodes, but there is no
> > /aurora/scheduler node.
> >
> > The configuration file /etc/aurora/clusters.json points to zk with the
> > proper scheduler_zk_path. All *.pex files are built with pants against
> > the appropriate built or downloaded AURORA_DIST/third_party/mesos_*.egg.
> > This gist contains all the details of my configuration:
> > https://gist.github.com/xasima/12de906475d70523316a
> >
> > Nevertheless, the very trivial hello_world service fails to run with
> > these errors:
> > WARN] Could not connect to scheduler: No schedulers detected in devcluster!
> > WARN] Could not connect to scheduler: Failed to connect to Zookeeper within 10 seconds.
> >
> > Could someone please help and examine the configuration above?
> >
> > --
> > Best regards,
> > ~ Xasima ~
> >
>

--
Best regards,
~ Xasima ~
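
For reference, the point made in the quoted reply (with -zk_in_proc=true the scheduler uses an embedded ZooKeeper and -zk_endpoints is ignored) can be checked mechanically against a scheduler command line such as the jvm_prop_sun_java_command value above. The following is only a minimal Python sketch: the helper names parse_flags and find_zk_conflict are hypothetical, the parsing is a simple assumption about the -name=value flag style, and none of it is part of Aurora itself.

```python
import shlex


def parse_flags(command_line):
    """Collect -name=value and bare -name flags from a command line into a dict."""
    flags = {}
    for token in shlex.split(command_line):
        if not token.startswith("-"):
            continue  # skip the main class name and other positional arguments
        name, sep, value = token.lstrip("-").partition("=")
        # Assumption: a bare flag such as -logtostderr is recorded as "true".
        flags[name] = value if sep else "true"
    return flags


def find_zk_conflict(command_line):
    """Return a warning if zk_in_proc=true is combined with zk_endpoints, else None."""
    flags = parse_flags(command_line)
    in_proc = flags.get("zk_in_proc", "false").lower() == "true"
    if in_proc and "zk_endpoints" in flags:
        return ("zk_in_proc=true: the embedded ZooKeeper is used and "
                "zk_endpoints=%s is ignored" % flags["zk_endpoints"])
    return None


if __name__ == "__main__":
    # Shortened version of the command line quoted in this thread.
    cmd = ("org.apache.aurora.scheduler.app.SchedulerMain -http_port=8091 "
           "-zk_in_proc=true -zk_endpoints=localhost:2181 "
           "-cluster_name=devcluster -logtostderr")
    print(find_zk_conflict(cmd))
```

Run against the flags from the original message, this reports the conflict; with -zk_in_proc=false (or the flag removed) it returns None.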