[ https://issues.apache.org/jira/browse/KAFKA-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ewen Cheslack-Postava updated KAFKA-2121: ----------------------------------------- Fix Version/s: 0.8.3 > prevent potential resource leak in KafkaProducer and KafkaConsumer > ------------------------------------------------------------------ > > Key: KAFKA-2121 > URL: https://issues.apache.org/jira/browse/KAFKA-2121 > Project: Kafka > Issue Type: Bug > Components: producer > Affects Versions: 0.8.2.0 > Reporter: Steven Zhen Wu > Assignee: Steven Zhen Wu > Fix For: 0.8.3 > > Attachments: KAFKA-2121.patch, KAFKA-2121_2015-04-16_09:55:14.patch, > KAFKA-2121_2015-04-16_10:43:55.patch, KAFKA-2121_2015-04-18_20:09:20.patch, > KAFKA-2121_2015-04-19_20:08:45.patch, KAFKA-2121_2015-04-19_20:30:18.patch, > KAFKA-2121_2015-04-20_09:06:09.patch, KAFKA-2121_2015-04-20_09:51:51.patch, > KAFKA-2121_2015-04-20_09:52:46.patch, KAFKA-2121_2015-04-20_09:57:49.patch, > KAFKA-2121_2015-04-20_22:48:31.patch > > > On Mon, Apr 13, 2015 at 7:17 PM, Guozhang Wang <wangg...@gmail.com> wrote: > It is a valid problem and we should correct it as soon as possible, I'm > with Ewen regarding the solution. > On Mon, Apr 13, 2015 at 5:05 PM, Ewen Cheslack-Postava <e...@confluent.io> > wrote: > > Steven, > > > > Looks like there is even more that could potentially be leaked -- since key > > and value serializers are created and configured at the end, even the IO > > thread allocated by the producer could leak. Given that, I think 1 isn't a > > great option since, as you said, it doesn't really address the underlying > > issue. > > > > 3 strikes me as bad from a user experience perspective. It's true we might > > want to introduce additional constructors to make testing easier, but the > > more components I need to allocate myself and inject into the producer's > > constructor, the worse the default experience is. And since you would have > > to inject the dependencies to get correct, non-leaking behavior, it will > > always be more code than previously (and a backwards incompatible change). > > Additionally, the code creating a the producer would have be more > > complicated since it would have to deal with the cleanup carefully whereas > > it previously just had to deal with the exception. Besides, for testing > > specifically, you can avoid exposing more constructors just for testing by > > using something like PowerMock that let you mock private methods. That > > requires a bit of code reorganization, but doesn't affect the public > > interface at all. > > > > So my take is that a variant of 2 is probably best. I'd probably do two > > things. First, make close() safe to call even if some fields haven't been > > initialized, which presumably just means checking for null fields. (You > > might also want to figure out if all the methods close() calls are > > idempotent and decide whether some fields should be marked non-final and > > cleared to null when close() is called). Second, add the try/catch as you > > suggested, but just use close(). > > > > -Ewen > > > > > > On Mon, Apr 13, 2015 at 3:53 PM, Steven Wu <stevenz...@gmail.com> wrote: > > > > > Here is the resource leak problem that we have encountered when 0.8.2 > > java > > > KafkaProducer failed in constructor. here is the code snippet of > > > KafkaProducer to illustrate the problem. > > > > > > ------------------------------- > > > public KafkaProducer(ProducerConfig config, Serializer<K> keySerializer, > > > Serializer<V> valueSerializer) { > > > > > > // create metrcis reporter via reflection > > > List<MetricsReporter> reporters = > > > > > > > > config.getConfiguredInstances(ProducerConfig.METRIC_REPORTER_CLASSES_CONFIG, > > > MetricsReporter.class); > > > > > > // validate bootstrap servers > > > List<InetSocketAddress> addresses = > > > > > > > > ClientUtils.parseAndValidateAddresses(config.getList(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG)); > > > > > > } > > > ------------------------------- > > > > > > let's say MyMetricsReporter creates a thread in constructor. if hostname > > > validation threw an exception, constructor won't call the close method of > > > MyMetricsReporter to clean up the resource. as a result, we created > > thread > > > leak issue. this becomes worse when we try to auto recovery (i.e. keep > > > creating KafkaProducer again -> failing again -> more thread leaks). > > > > > > there are multiple options of fixing this. > > > > > > 1) just move the hostname validation to the beginning. but this is only > > fix > > > one symtom. it didn't fix the fundamental problem. what if some other > > lines > > > throw an exception. > > > > > > 2) use try-catch. in the catch section, try to call close methods for any > > > non-null objects constructed so far. > > > > > > 3) explicitly declare the dependency in the constructor. this way, when > > > KafkaProducer threw an exception, I can call close method of metrics > > > reporters for releasing resources. > > > KafkaProducer(..., List<MetricsReporter> reporters) > > > we don't have to dependency injection framework. but generally hiding > > > dependency is a bad coding practice. it is also hard to plug in mocks for > > > dependencies. this is probably the most intrusive change. > > > > > > I am willing to submit a patch. but like to hear your opinions on how we > > > should fix the issue. > > > > > > Thanks, > > > Steven > > > > > > > > > > > -- > > Thanks, > > Ewen > > > -- > -- Guozhang -- This message was sent by Atlassian JIRA (v6.3.4#6332)