Ah, I forgot about localThreadsMetadata(). In that. case I agree, there's no reason to introduce a new method when we can get both the names and number of all running threads from this.
I assume that we would update localThreadsMetadata to only return currently alive threads as part of this KIP -- at a quick glance, it seems like we don't do any pruning of dead threads at the moment On Tue, Sep 8, 2020 at 1:58 PM Matthias J. Sax <mj...@apache.org> wrote: > I am not sure if we need a new method? There is already > `localThreadsMetadata()`. What do we gain by adding a new one? > > Returning the thread's name (as `Optional<String>`) for both add() and > remove() is fine with me. > > > -Matthias > > On 9/8/20 12:58 PM, Sophie Blee-Goldman wrote: > > Sorry Bruno, I think I missed the end of your message with the > > numberOfAliveStreamThreads() > > proposal. I agree, that would be better than the alternatives I listed. > > That said: > > > >> They rather suggest that the method returns a list of handles to the > > stream threads. > > > > I hadn't thought of that originally, but now that you mention it, this > > might be a good idea. > > I don't think we should return actual handles on the threads, but maybe a > > list of the thread > > names rather than a single number of currently alive threads. > > > > Since we seem to think it would be difficult if not impossible to keep > > track of the number > > of running stream threads, we should apply the same reasoning to the > names > > and not > > assume the user can/will keep track of every thread returned by > > addStreamThread() or > > removeStreamThread(). Users should generally take any required action > > immediately > > after adding/removing the thread -- eg deregistering the thread metrics > -- > > but it might > > still be useful to provide a convenience method listing all of the > current > > threads > > > > And of course you could still get the number of threads easily by > invoking > > size() on the > > returned list (or ordered set?). > > > > On Tue, Sep 8, 2020 at 12:16 PM Bruno Cadonna <br...@confluent.io> > wrote: > > > >> Thank you again for the feedback Sophie! > >> > >> As I tried to point out in my previous e-mail, removing a stream thread > >> from a Kafka Streams client that does not have alive stream threads is > >> nothing exceptional for the client per se. However, it can become > >> exceptional within the context of the user. For example, if users want > >> to remove a stream thread from a client without alive stream threads > >> because one if their metrics say so, then this is exceptional in the > >> context of that user metric not in the context of the Kafka Streams > >> client. In that case, users should throw an exception and handle it. > >> > >> Regarding returning null, I do not like to return null because from a > >> development point of view there is no distinction between returning null > >> because we have a bug in the code or returning null because there are no > >> alive stream threads. Additionally, Optional<String> makes it more > >> explicit that the result could also be empty. > >> > >> Thank you for the alternative method names! However, with the names you > >> propose it is not immediately clear that the method returns an amount of > >> stream threads. They rather suggest that the method returns a list of > >> handles to the stream threads. I chose to use "aliveStreamThreads" to be > >> consistent with the client-level metric "alive-stream-threads" which > >> reports the same number of stream threads that > >> numberOfAliveStreamThreads() should report. If others also think that > >> the proposed name in the KIP is too clumsy, I am open to rename it, > though. > >> > >> Best, > >> Bruno > >> > >> > >> On 08.09.20 20:12, Sophie Blee-Goldman wrote: > >>>> it's never a good sign when the discussion moves into the vote thread > >>> > >>> Hah, sorry, the gmail consolidation of [VOTE] and [DISCUSS] threads > >> strikes > >>> again. > >>> Thanks for redirecting me Bruno > >>> > >>> I suppose it's unfair to expect the callers to keep perfect track of > the > >>> current > >>> number of stream threads, but it also seems like you shouldn't be > >> calling > >>> removeStreamThread() when there are no threads left. Either you're just > >>> haphazardly removing threads and could unintentionally slip into a > state > >> of > >>> no > >>> running threads without realizing it, or more realistically, you're > >>> carefully > >>> removing threads based on some metric(s) that convey whether the system > >> is > >>> over or under-provisioned. If your metrics say you're over-provisioned > >> but > >>> there's > >>> not one thread running, well, that certainly sounds exceptional to me. > Or > >>> you might > >>> be right in that the cluster is over-provisioned but have just been > >>> directing the > >>> removeStreamThread() and addStreamThread() calls to instances at > random, > >> and > >>> end up with one massive instance and one with no threads at all. Again, > >>> this > >>> probably merits some human intervention (or system redesign) > >>> > >>> That said, I don't think there's any real harm to just returning null > in > >>> this case, but I hope > >>> that users would pay attention to this since it seems likely to > indicate > >>> something has gone > >>> seriously wrong. I suppose Optional<String> would be a reasonable > >>> compromise. > >>> > >>> As for the method name, what about activeStreamThreads() or > >>> liveStreamThreads() ? > >>> > >>> On Mon, Sep 7, 2020 at 1:45 AM Bruno Cadonna <br...@confluent.io> > wrote: > >>> > >>>> Hi John, > >>>> > >>>> I agree with you except for checking null. I would rather prefer to > use > >>>> Optional<String> as the return type to both methods. > >>>> > >>>> I changed the subject from [VOTE] to [DISCUSS] so that we can follow > up > >>>> in the discussion thread. > >>>> > >>>> Best, > >>>> Bruno > >>>> > >>>> On 04.09.20 23:12, John Roesler wrote: > >>>>> Hi Sophie, > >>>>> > >>>>> Uh, oh, it's never a good sign when the discussion moves > >>>>> into the vote thread :) > >>>>> > >>>>> I agree with you, it seems like a good touch for > >>>>> removeStreamThread() to return the name of the thread that > >>>>> got removed, rather than a boolean flag. Maybe the return > >>>>> value would be `null` if there is no thread to remove. > >>>>> > >>>>> If we go that way, I'd suggest that addStreamThread() also > >>>>> return the name of the newly created thread, or null if no > >>>>> thread can be created right now. > >>>>> > >>>>> I'm not completely sure if I think that callers of this > >>>>> method would know exactly how many threads there are. Sure, > >>>>> if a human being is sitting there looking at the metrics or > >>>>> logs and decides to call the method, it would work out, but > >>>>> I'd expect this kind of method to find its way into > >>>>> automated tooling that reacts to things like current system > >>>>> load or resource saturation. Those kinds of toolchains often > >>>>> are part of a distributed system, and it's probably not that > >>>>> easy to guarantee that the thread count they observe is > >>>>> fully consistent with the number of threads that are > >>>>> actually running. Therefore, an in-situ `int > >>>>> numStreamThreads()` method might not be a bad idea. Then > >>>>> again, it seems sort of optional. A caller can catch an > >>>>> exception or react to a `null` return value just the same > >>>>> either way. Having both add/remove methods behave similarly > >>>>> is probably more valuable. > >>>>> > >>>>> Thanks, > >>>>> -John > >>>>> > >>>>> > >>>>> On Thu, 2020-09-03 at 12:15 -0700, Sophie Blee-Goldman > >>>>> wrote: > >>>>>> Hey, sorry for the late reply, I just have one minor suggestion. > Since > >>>> we > >>>>>> don't > >>>>>> make any guarantees about which thread gets removed or allow the > user > >> to > >>>>>> specify, I think we should return either the index or full name of > the > >>>>>> thread > >>>>>> that does get removed by removeThread(). > >>>>>> > >>>>>> I know you just updated the KIP to return true/false if there > >>>> are/aren't any > >>>>>> threads to be removed, but I think this would be more appropriate as > >> an > >>>>>> exception than as a return type. I think it's reasonable to expect > >>>> users to > >>>>>> have some sense to how many threads are remaining, and not try to > >> remove > >>>>>> a thread when there is none left. To me, that indicates something > >> wrong > >>>>>> with the user application code and should be treated as an > exceptional > >>>> case. > >>>>>> I don't think the same code clarify argument applies here as to the > >>>>>> addStreamThread() case, as there's no reason for an application to > be > >>>>>> looping and retrying removeStreamThread() since if that fails, it's > >>>> because > >>>>>> there are no threads left and thus it will continue to always fail. > >> And > >>>> if > >>>>>> the > >>>>>> user actually wants to shut down all threads, they should just close > >> the > >>>>>> whole application rather than call removeStreamThread() in a loop. > >>>>>> > >>>>>> While I generally think it should be straightforward for users to > >> track > >>>> how > >>>>>> many stream threads they have running, maybe it would be nice to add > >>>>>> a small utility method that does this for them. Something like > >>>>>> > >>>>>> // Returns the number of currently alive threads > >>>>>> boolean runningStreamThreads(); > >>>>>> > >>>>>> On Thu, Sep 3, 2020 at 7:41 AM Matthias J. Sax <mj...@apache.org> > >>>> wrote: > >>>>>> > >>>>>>> +1 (binding) > >>>>>>> > >>>>>>> On 9/3/20 6:16 AM, Bruno Cadonna wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> I would like to start the voting on KIP-663 that proposes to add > >>>> methods > >>>>>>>> to the Kafka Streams client to add and remove stream threads > during > >>>>>>>> execution. > >>>>>>>> > >>>>>>>> > >>>>>>> > >>>> > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-663%3A+API+to+Start+and+Shut+Down+Stream+Threads > >>>>>>>> > >>>>>>>> Best, > >>>>>>>> Bruno > >>>>> > >>>> > >>> > >> > > > >