2020-04-17 18:40:10 UTC - anbutech17: @anbutech17 has joined the channel ---- 2020-04-17 19:54:36 UTC - matt_innerspace.io: Where do the results of `Context.recordMetric(name,value)` go when called within a function? I have basic Prometheus monitoring set up as per the documentation, but don't see my metrics available anywhere? ---- 2020-04-17 20:01:57 UTC - matt_innerspace.io: this is a Java function... ---- 2020-04-17 21:46:11 UTC - Sijie Guo: you will get metrics from the `functions stats` command. ---- 2020-04-17 21:47:10 UTC - Addison Higham: you can also get them from Prometheus ---- 2020-04-17 21:47:27 UTC - Addison Higham: if you hit the worker's metrics endpoint ---- 2020-04-17 21:47:32 UTC - Sijie Guo: yes ---- 2020-04-18 05:57:02 UTC - Addison Higham: uhhh... the helm charts for ZK set `-Dzookeeper.forceSync=no` as a system property... is that intended? that seems... less than ideal ---- 2020-04-18 06:05:31 UTC - Addison Higham: annnndd I still have those set, that might explain some of my ZK issues of things going super wonky and losing state on a node. ---- 2020-04-18 06:07:24 UTC - Addison Higham: although I do have forceSync=yes in my conf files... so I guess it depends which one takes precedence ---- 2020-04-18 06:11:35 UTC - Addison Higham: looks like the config file should have precedence, either way, probably something to change in the helm charts :confused: ---- 2020-04-18 06:34:14 UTC - Addison Higham: okay, one last question: in a k8s environment, the helm charts default to setting `useHostNameAsBookieID=true`. This seems to make sense with a statefulset that gives a stable hostname, *however* in most cloud k8s envs, pod IPs change for each new pod.
What I have had happen a number of times now:
- a bookie fails (mostly due to memory issues, still working that out)
- the brokers start failing to connect to the downed bookie and mark a member of the ensemble as bad
- the brokers don't know that the member is back up, and if another bookie fails, I get brokers that can't form an ensemble and never self-heal
- a restart of the broker fixes the problem as it rebuilds its ensemble
Seems like there are two ways to fix this: figure out what is going on in the BookKeeper managed ledger library and fix the issue OR do I use IPs? If bookieID and IP can be distinct (i.e. a named bookie with a different advertised address) and ensembles then look for changed IPs, maybe that is the right answer? Otherwise, having the bookie ID change all the time as IPs change seems like it would be a bad thing? ---- 2020-04-18 07:29:17 UTC - Julius S: @Dan Kitchen (TCEU) possibly relevant for our case the other day? ^^ ---- 2020-04-18 07:29:26 UTC - Dan Kitchen (TCEU): @Dan Kitchen (TCEU) has joined the channel ----
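For the `Context.recordMetric(name,value)` question earlier in the thread, the call pattern inside a Java function might look roughly like the sketch below. It is illustrative only: `MetricContext` and `InMemoryContext` are hypothetical stand-ins so the code compiles without the Pulsar functions API on the classpath, and the metric name `input-length` is made up. In a real function the second argument of `process` would be Pulsar's own `Context`, and the recorded values would be read back through the `functions stats` command or the worker's metrics endpoint, as described in the replies.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class RecordMetricSketch {
    // Hypothetical stand-in for the single Context method used here, so the
    // sketch runs without the Pulsar functions API dependency.
    interface MetricContext {
        void recordMetric(String metricName, double value);
    }

    // Shaped like a function's process(input, context): record the input
    // length as a user metric, then return the upper-cased input.
    static String process(String input, MetricContext context) {
        context.recordMetric("input-length", input.length());
        return input.toUpperCase(Locale.ROOT);
    }

    // In-memory context for the demo: keeps a running sum per metric name.
    static class InMemoryContext implements MetricContext {
        final Map<String, Double> sums = new HashMap<>();

        @Override
        public void recordMetric(String metricName, double value) {
            sums.merge(metricName, value, Double::sum);
        }
    }

    public static void main(String[] args) {
        InMemoryContext ctx = new InMemoryContext();
        System.out.println(process("hello", ctx));
        System.out.println(process("pulsar", ctx));
        // Sum of the two recorded input lengths (5 + 6).
        System.out.println(ctx.sums.get("input-length"));
    }
}
```

The stand-in only sums values per name. How the real runtime aggregates and exposes user metrics is not covered by this sketch.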