[ 
https://issues.apache.org/jira/browse/KAFKA-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502327#comment-14502327
 ] 

Ewen Cheslack-Postava commented on KAFKA-2121:
----------------------------------------------

Odd, I don't see those, but I do see some like this now:

kafka.api.ProducerFailureHandlingTest > testSendAfterClosed FAILED
    org.apache.kafka.common.KafkaException: Failed to close kafka producer
        at 
org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:555)
        at 
org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:526)
        at 
kafka.api.ProducerFailureHandlingTest.tearDown(ProducerFailureHandlingTest.scala:73)

        Caused by:
        java.nio.channels.ClosedSelectorException
            at sun.nio.ch.SelectorImpl.keys(SelectorImpl.java:51)
            at org.apache.kafka.common.network.Selector.close(Selector.java:175)
            at 
org.apache.kafka.clients.NetworkClient.close(NetworkClient.java:319)
            at 
org.apache.kafka.clients.ClientUtils.closeQuietly(ClientUtils.java:57)
            at 
org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:549)
            ... 2 more

Note that this is in core:tests. A bunch of client behavior isn't thoroughly 
tested until integration tests with the server.

The problem is that NetworkClient doesn't seem to handle close() being called 
twice. Since this is already available public interface, we'd have to consider 
that a regression -- we can't really change that behavior at this point just by 
cleaning up the tests. I think this is related to the change you made to pull 
that call out of Sender because it used to be called from Sender.run() and that 
wouldn't have been invoked twice. Another solution might be to null out the 
fields of KafkaProducer and KafkaConsumer as you close() them, but this 
probably means we need to think through that change or revert it.

> prevent potential resource leak in KafkaProducer and KafkaConsumer
> ------------------------------------------------------------------
>
>                 Key: KAFKA-2121
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2121
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer 
>    Affects Versions: 0.8.2.0
>            Reporter: Steven Zhen Wu
>            Assignee: Jun Rao
>         Attachments: KAFKA-2121.patch, KAFKA-2121_2015-04-16_09:55:14.patch, 
> KAFKA-2121_2015-04-16_10:43:55.patch, KAFKA-2121_2015-04-18_20:09:20.patch, 
> KAFKA-2121_2015-04-19_20:08:45.patch, KAFKA-2121_2015-04-19_20:30:18.patch
>
>
> On Mon, Apr 13, 2015 at 7:17 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> It is a valid problem and we should correct it as soon as possible, I'm
> with Ewen regarding the solution.
> On Mon, Apr 13, 2015 at 5:05 PM, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
> > Steven,
> >
> > Looks like there is even more that could potentially be leaked -- since key
> > and value serializers are created and configured at the end, even the IO
> > thread allocated by the producer could leak. Given that, I think 1 isn't a
> > great option since, as you said, it doesn't really address the underlying
> > issue.
> >
> > 3 strikes me as bad from a user experience perspective. It's true we might
> > want to introduce additional constructors to make testing easier, but the
> > more components I need to allocate myself and inject into the producer's
> > constructor, the worse the default experience is. And since you would have
> > to inject the dependencies to get correct, non-leaking behavior, it will
> > always be more code than previously (and a backwards incompatible change).
> > Additionally, the code creating a the producer would have be more
> > complicated since it would have to deal with the cleanup carefully whereas
> > it previously just had to deal with the exception. Besides, for testing
> > specifically, you can avoid exposing more constructors just for testing by
> > using something like PowerMock that let you mock private methods. That
> > requires a bit of code reorganization, but doesn't affect the public
> > interface at all.
> >
> > So my take is that a variant of 2 is probably best. I'd probably do two
> > things. First, make close() safe to call even if some fields haven't been
> > initialized, which presumably just means checking for null fields. (You
> > might also want to figure out if all the methods close() calls are
> > idempotent and decide whether some fields should be marked non-final and
> > cleared to null when close() is called). Second, add the try/catch as you
> > suggested, but just use close().
> >
> > -Ewen
> >
> >
> > On Mon, Apr 13, 2015 at 3:53 PM, Steven Wu <stevenz...@gmail.com> wrote:
> >
> > > Here is the resource leak problem that we have encountered when 0.8.2
> > java
> > > KafkaProducer failed in constructor. here is the code snippet of
> > > KafkaProducer to illustrate the problem.
> > >
> > > -------------------------------
> > > public KafkaProducer(ProducerConfig config, Serializer<K> keySerializer,
> > > Serializer<V> valueSerializer) {
> > >
> > >     // create metrcis reporter via reflection
> > >     List<MetricsReporter> reporters =
> > >
> > >
> > config.getConfiguredInstances(ProducerConfig.METRIC_REPORTER_CLASSES_CONFIG,
> > > MetricsReporter.class);
> > >
> > >     // validate bootstrap servers
> > >     List<InetSocketAddress> addresses =
> > >
> > >
> > ClientUtils.parseAndValidateAddresses(config.getList(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG));
> > >
> > > }
> > > -------------------------------
> > >
> > > let's say MyMetricsReporter creates a thread in constructor. if hostname
> > > validation threw an exception, constructor won't call the close method of
> > > MyMetricsReporter to clean up the resource. as a result, we created
> > thread
> > > leak issue. this becomes worse when we try to auto recovery (i.e. keep
> > > creating KafkaProducer again -> failing again -> more thread leaks).
> > >
> > > there are multiple options of fixing this.
> > >
> > > 1) just move the hostname validation to the beginning. but this is only
> > fix
> > > one symtom. it didn't fix the fundamental problem. what if some other
> > lines
> > > throw an exception.
> > >
> > > 2) use try-catch. in the catch section, try to call close methods for any
> > > non-null objects constructed so far.
> > >
> > > 3) explicitly declare the dependency in the constructor. this way, when
> > > KafkaProducer threw an exception, I can call close method of metrics
> > > reporters for releasing resources.
> > >     KafkaProducer(..., List<MetricsReporter> reporters)
> > > we don't have to dependency injection framework. but generally hiding
> > > dependency is a bad coding practice. it is also hard to plug in mocks for
> > > dependencies. this is probably the most intrusive change.
> > >
> > > I am willing to submit a patch. but like to hear your opinions on how we
> > > should fix the issue.
> > >
> > > Thanks,
> > > Steven
> > >
> >
> >
> >
> > --
> > Thanks,
> > Ewen
> >
> --
> -- Guozhang



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to