Hey Peter,

Try clearing the local state. If you have stateful tasks, then by default Streams uses RocksDB to store records locally, in directories named after each task. This is presumably why you're seeing errors related to "the task for peek node missing in old nodes".
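In case it helps, clearing the local state for one application by hand could look roughly like the sketch below. It assumes the usual `<state.dir>/<application.id>/<task id>` layout on local disk and that the Streams instance on that host is fully stopped first. The class name `LocalStateWiper` and its `wipe` method are made up for illustration and are not part of Kafka Streams.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper, not a Kafka Streams API: removes the local state
// directory of one Streams application while that application is stopped.
final class LocalStateWiper {

    // Deletes <stateDir>/<applicationId> and everything below it
    // (the per-task directories and their RocksDB files).
    // Returns false if there was nothing to delete.
    static boolean wipe(Path stateDir, String applicationId) throws IOException {
        Path appDir = stateDir.resolve(applicationId);
        if (!Files.exists(appDir)) {
            return false;
        }
        Files.walkFileTree(appDir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file); // store files, checkpoints, lock files
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc; // surface traversal errors instead of hiding them
                }
                Files.delete(dir); // directory is empty by now
                return FileVisitResult.CONTINUE;
            }
        });
        return true;
    }
}
```

You would call it as something like `LocalStateWiper.wipe(Paths.get("/tmp/kafka-streams"), "my-app")`, where both the path and the application id are placeholders for whatever your `state.dir` and `application.id` configs are set to.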
You can delete the local state manually, which is generally recommended, or by running `KafkaStreams#cleanUp`. Just make sure to remove the #cleanUp call before you resume running normally, so that a restart for whatever reason doesn't wipe the state again (restoring wiped local state from the changelogs can take a while).

Of course, wiping out the local state alone will get you past any problems related to that, but you might have to fully reset the application (or might just have lost your current data/state) if e.g. the changelog or repartition topic names changed.

In general, upgrading topologies is hard to do in a compatible way, but there are a few tricks/features you can use to make it possible in some cases. For now we recommend naming all of your operators to reduce the chance of processor node names getting scrambled. There's also an experimental feature which will allow you to break up your topology into discrete "named topologies", which can be just a single task, and which then allows adding or removing these named topologies without disrupting the others. It's still technically not a public feature yet, however, so there's no "public API" (you can still use it, it's just not public in the sense that there's no contract for the API and it may change across versions).

Happy to expand on this more if you want, but hopefully the above is sufficient for now.

cheers,
sophie

On Thu, Aug 18, 2022 at 7:28 AM Peter Cipov <pci...@twilio.com.invalid> wrote:

> Hello,
>
> I am looking for help regarding topologies and tasks in KS. Mostly where
> this information is stored outside of KS app.
>
> My case is upgrading the KS topology from v1 to v2, the topology is
> different (f.e adding peek node in v2). I made a change in code, prepared a
> jar and deployed.
>
> The issue that I observed was when old and new versions of app are up and
> running (by mistake). The task for peek node is missing in old nodes hence
> deployment crashes.
>
> My next steps were shutting down all nodes and deployment without old
> nodes. But this did not help. New deployment crashed too and I was not able
> to figure out why. The topology just stopped ingestion from the source
> topic. No error in logs.
>
> My hypothesis is that the former deployment of (KS topology v1 and v2 at
> once) leaves some sort of ephemeral state in the kafka cluster itself.
>
> And here is my question. what kind of state except messages in repartition
> and changelog topics, consumer offsets are stored by KS in kafka cluster ?
>
> What can be the root cause from your point of view ?
>
> Thank you
> Peter
>