Thanks Konstantin, that makes sense. To give you some context, the reason we are gravitating towards Queryable State is Prometheus's architectural preference for scraping (a pull rather than a push model) and our intent to expose aggregations. That said, your idea makes sense. The worry I had is the IP resolution of TMs that QueryableStateClient does, and our wanting to avoid static IPs. If I understand correctly, you are proposing a proxy "external" to the job deployment, as in an external service that discovers the job and works off the ingress endpoint that exposes the queryable port of the TMs?
That creates a fragmented architecture that I wanted to avoid, if I understood your advice correctly.

Vishal

On Fri, Mar 29, 2019 at 5:42 AM Konstantin Knauf <konstan...@ververica.com> wrote:

> Hi Vishal,
>
> my approach would be a single Kubernetes service, which is backed by all
> Taskmanagers of the job. The Taskmanagers will proxy the request for a
> specific key to the correct Taskmanager. Yes, the Taskmanagers will cache
> the location of the key groups.
>
> In addition to this Kubernetes service, you can of course have a
> Jetty/Jersey REST based server that sends queries to this service.
>
> Please let me know if this works for you.
>
> Hope this helps and cheers,
>
> Konstantin
>
> On Thu, Mar 28, 2019 at 12:37 AM Vishal Santoshi <vishal.santo...@gmail.com> wrote:
>
>> I think I got a handle on this. For those who might want to do this,
>> here are the steps (I could share the Jetty/Jersey REST code too if
>> required).
>>
>> *1.* Create a sidecar container on each pod that has a TM. I wrote a
>> simple Jetty/Jersey REST based server that executes queries against the
>> local TM query server:
>>
>>     - name: queryable-state
>>       image: _IMAGE_
>>       args: ["queryable-state"]
>>       env:
>>         - name: POD_IP
>>           valueFrom:
>>             fieldRef:
>>               fieldPath: status.podIP
>>       ports:
>>         - containerPort: 9999
>>           name: qstate-client
>>       resources:
>>         requests:
>>           cpu: "0.25"
>>           memory: "256Mi"
>>
>> Note that POD_IP is the IP used by the REST based server to start the
>> QueryableStateClient, and the port is the default port of the TM query
>> server (9069 I think) of the colocated TM container.
>>
>> *2.* Expose the port (in this case 9999) at the k8s service layer.
>>
>> And that did it.
>>
>> I am worried, though, about a couple of things:
>>
>> *1.*
>> The TM query server will ask the JM for the key group, and hence the TM a
>> key belongs to, for every request, and then coordinate the communication
>> between the client and that TM. Does Flink do any optimization, such as caching
>> the key ranges, and thus the affinity to a TM, to reduce JM stress? I would
>> imagine that, with a well-known distribution function on a well-known
>> hash algorithm, an incident key could be pinned to a TM without visiting
>> the JM more than once.
>>
>> *2.*
>>
>> We do have use cases where we would want to iterate over all the keys in
>> a key group (and by extension on a TM) for a job. Is that a possibility?
>>
>> *3.*
>>
>> The overhead of having as many client containers as TMs.
>>
>> Any advice/ideas on the 3 worry points?
>>
>> Regards
>>
>> On Mon, Mar 25, 2019 at 8:57 PM Vishal Santoshi <vishal.santo...@gmail.com> wrote:
>>
>>> I have 2 options:
>>>
>>> 1. A REST based (in my case Jetty/REST based) QueryableStateClient in
>>> a sidecar container colocated with the JM (though it could be on all TMs,
>>> but that looks to be overkill).
>>>
>>> 2. A REST based (in my case Jetty/REST based) QueryableStateClient in
>>> a sidecar container colocated with the TMs. The query proxies are on the TMs,
>>> so in essence the communication would be within containers of the pod, and I
>>> could load balance (have to test).
>>>
>>> The second alternative seems doable, if a bit of an overkill, but I am not
>>> sure how to establish a TM on the standalone QueryableStateClient, given
>>> that the TM's pod IP is not known till the pod is launched.
>>>
>>> Has anyone had a successful QueryableState setup for Flink on k8s?
>>>
>>> Regards,
>>>
>
> --
>
> Konstantin Knauf | Solutions Architect
>
> +49 160 91394525
>
> <https://www.ververica.com/>
>
> Follow us @VervericaData
>
> --
>
> Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference
>
> Stream Processing | Event Driven | Real Time
>
> --
>
> Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>
> --
> Data Artisans GmbH
> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
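[Editor's note] Step *2* in the thread above ("expose the port at the k8s service layer") is not shown in the quoted config. A minimal sketch of what that Service could look like is below; the service name, labels, and selector are assumptions for illustration, not from the original thread — only the port 9999 and the `qstate-client` name come from the sidecar spec in step *1*:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: queryable-state        # hypothetical name
spec:
  selector:
    app: flink                 # assumed labels on the TM pods
    component: taskmanager
  ports:
    - name: qstate-client
      port: 9999               # the sidecar's REST port from step 1
      targetPort: 9999
```

Because the Service selects all TM pods, each request lands on some sidecar, whose local QueryableStateClient then routes to the TM actually holding the key.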
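[Editor's note] On worry point *1* above: Flink's key-to-TM mapping is indeed a fixed, well-known function, so a key's location can in principle be computed without repeatedly consulting the JM. The sketch below mirrors Flink's key-group-to-subtask formula (as in `KeyGroupRangeAssignment`); the key hash used here is a simplified stand-in, since Flink additionally applies a murmur hash on top of `hashCode()`, and the `maxParallelism` value of 128 is just a common default, not taken from the thread:

```java
// Sketch: pinning a key to an operator subtask (and hence a TM) deterministically.
// The operatorIndexForKeyGroup formula matches Flink's KeyGroupRangeAssignment;
// keyGroupFor is a simplified stand-in (Flink murmur-hashes hashCode() first).
public class KeyGroupMath {

    // Simplified stand-in: map a key to a key group in [0, maxParallelism).
    public static int keyGroupFor(Object key, int maxParallelism) {
        return Math.abs(key.hashCode() % maxParallelism);
    }

    // Matches Flink's formula: which operator subtask owns a given key group.
    public static int operatorIndexForKeyGroup(int maxParallelism, int parallelism, int keyGroupId) {
        return keyGroupId * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        int maxParallelism = 128; // assumed; Flink's common default
        int parallelism = 4;      // number of TM subtasks (assumed)
        int keyGroup = keyGroupFor("some-key", maxParallelism);
        System.out.println("key group " + keyGroup + " -> subtask "
                + operatorIndexForKeyGroup(maxParallelism, parallelism, keyGroup));
    }
}
```

Since both functions are pure, a client-side cache of key-group-to-TM locations never goes stale unless the job is rescaled, which is presumably why the TMs can cache key-group locations as Konstantin describes.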