1. How would a production admin or another engineer work out which process is behind the random group id that is consuming a specific topic? Why is it designed this way?
2. I don't see the group id changing all the time; it repeats across restarts, and I can't tell when or how it changes. I know the job is trying to resume from the offset after the last one it consumed, but that fails because the offset has expired. What is the solution for this?

On Thu, Mar 19, 2020 at 10:53 AM lec ssmi <shicheng31...@gmail.com> wrote:

> 1. Maybe we can't use a customized group id in Structured Streaming.
> 2. When restarting after a failure or a kill, the group id changes, but the
> starting offset will be the last one you consumed last time.
>
> Srinivas V <srini....@gmail.com> wrote on Thu, Mar 19, 2020 at 12:36 PM:
>
>> Hello,
>> 1. My Kafka consumer name is randomly generated by Spark Structured
>> Streaming. Can I override this?
>> 2. When testing in development, if I stop my Kafka consumer streaming job
>> for a couple of days and then try to start it again, the job keeps
>> failing on missing offsets, because the offsets expire after 4 hours. I
>> read that when restarting to consume messages, SS always tries to get the
>> earliest offset rather than the latest offset. How do I handle this problem?
>>
>> Regards
>> Srini
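
For the two questions in this thread (a recognizable consumer group name, and restarting after the stored offsets have expired), here is a minimal sketch of the Kafka source options that seem relevant. It is not a confirmed fix for this job. The helper `kafka_source_options`, the broker address, the topic and the prefix `billing-job` are all hypothetical. `groupIdPrefix` is, to my knowledge, only available from Spark 3.0 onward, so please check the Kafka integration guide for your version. `failOnDataLoss=false` lets the query continue when offsets are out of range, at the cost of skipping the expired records.

```python
from typing import Dict, Optional


def kafka_source_options(
    bootstrap_servers: str,
    topic: str,
    group_id_prefix: Optional[str] = None,
    tolerate_expired_offsets: bool = False,
) -> Dict[str, str]:
    """Build option strings for Spark's Kafka source (hypothetical helper)."""
    options = {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Only consulted when the query starts without a checkpoint;
        # a restarted query resumes from the offsets in its checkpoint.
        "startingOffsets": "latest",
    }
    if group_id_prefix is not None:
        # Assumption: Spark 3.0+; replaces the default generated-name prefix.
        options["groupIdPrefix"] = group_id_prefix
    if tolerate_expired_offsets:
        # Do not fail the query when requested offsets are no longer in Kafka.
        options["failOnDataLoss"] = "false"
    return options


# Hypothetical values, for illustration only.
opts = kafka_source_options("broker:9092", "events", "billing-job", True)
# With PySpark available, these would be applied as:
#   spark.readStream.format("kafka").options(**opts).load()
```

Keeping the options in a plain dictionary makes it easy to enable `failOnDataLoss=false` in development only, where the job is often stopped for longer than the topic's retention, while leaving the stricter default in production.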