I am a little confused by what you say. I can see how it has to rebuild the state when it is not available on restart, but I don't think it will process old messages from the input topics. It should start from the last committed offset, whatever that was before the crash. Could you confirm? I thought that if you can live with the extra delay at the beginning of the app restart, a stateful set is not required.
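
To make the two things under discussion concrete, here is a rough Java sketch of the settings involved, using only `java.util.Properties`. The keys `application.id`, `state.dir` and `auto.offset.reset` are standard Kafka Streams settings; the class name, the application id and the directory path are placeholders of mine, not values taken from the gist. As far as I understand it, `application.id` doubles as the consumer group id under which input offsets are committed, `state.dir` is where local state stores live (so in a container it would need to sit on a volume that survives restarts), and `auto.offset.reset` only applies when the group has no committed offset.

```java
import java.util.Properties;

public class StreamsRestartConfig {

    // Builds the subset of Streams settings that matter across restarts.
    // Both arguments are illustrative placeholders, not values from the thread.
    static Properties build(String applicationId, String stateDir) {
        Properties props = new Properties();
        // Also used as the consumer group id, so committed offsets are keyed by it.
        // It should stay the same from one deployment to the next.
        props.setProperty("application.id", applicationId);
        // Local state stores are written here; mount a persistent volume at this
        // path if the state should survive a container restart.
        props.setProperty("state.dir", stateDir);
        // Only consulted when no committed offset exists for the group.
        props.setProperty("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("sensor-window-app", "/var/lib/kafka-streams");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The resulting `Properties` would be merged with the broker and serde settings before being passed to the Streams app. This only shows which knobs are in play; it does not by itself explain the reprocessing.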
________________________________
From: Guozhang Wang <wangg...@gmail.com>
Sent: Thursday, June 6, 2019 10:13 AM
To: users@kafka.apache.org
Subject: Re: Streams reprocessing whole topic when deployed but not locally

If you deploy your streams app into a docker container, you'd need to make sure local state directories are preserved, since otherwise whenever you restart all the state would be lost and Streams then has to bootstrap from scratch. E.g. if you are using K8s for cluster management, you'd better use stateful sets to make sure local states are preserved across re-deployment.

Guozhang

On Wed, Jun 5, 2019 at 4:52 PM Alessandro Tagliapietra <tagliapietra.alessan...@gmail.com> wrote:

> Hi Guozhang,
>
> sorry, by "app" I mean the stream processor app, the one shown in pipeline.kt.
>
> The app reads a topic of data sent by a sensor each second and generates a 20 second window output to another topic.
> My "problem" is that when running locally with my local kafka setup, let's say I stop it and start it again, it continues processing the last window.
> When deploying the app into a docker container and using the confluent cloud as broker, every time I restart the app it starts processing again from the beginning of the input topic and generates again old windows it already processed.
>
> In the meantime I'm trying to upgrade to kafka 2.2.1 to see if I get any improvement.
>
> --
> Alessandro Tagliapietra
>
>
> On Wed, Jun 5, 2019 at 4:45 PM Guozhang Wang <wangg...@gmail.com> wrote:
>
> > Hello Alessandro,
> >
> > What did you do for `restarting the app online`? I'm not sure I follow the difference between "restart the streams app" and "restart the app online" from your description.
> >
> >
> > Guozhang
> >
> >
> > On Wed, Jun 5, 2019 at 10:42 AM Alessandro Tagliapietra <tagliapietra.alessan...@gmail.com> wrote:
> >
> > > Hello everyone,
> > >
> > > I've a small streams app, the configuration and part of the code I'm using can be found here
> > > https://gist.github.com/alex88/6b7b31c2b008817a24f63246557099bc
> > > There's also the log when the app is started locally and when the app is started on our servers connecting to the confluent cloud kafka broker.
> > >
> > > The problem is that locally everything is working properly, if I restart the streams app it just continues where it left, if I restart the app online it reprocesses the whole topic.
> > >
> > > That shouldn't happen right?
> > >
> > > Thanks in advance
> > >
> > > --
> > > Alessandro Tagliapietra
> > >
> >
> >
> > --
> > -- Guozhang
> >
>

--
-- Guozhang