You do not need to delete the data folder. I think the "file handles" here are mostly due to socket leaks, i.e. network socket file handles rather than disk file handles. Just restarting the broker should do the job.
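If you want to check this before restarting, here is a rough sketch that counts a process's open file descriptors by type. It assumes a Linux host where /proc/<pid>/fd is readable by your user; the function names `classify_fd` and `count_open_fds` are made up for illustration and are not part of Kafka.

```python
import os
from collections import Counter


def classify_fd(target: str) -> str:
    """Classify the symlink target of an entry in /proc/<pid>/fd (Linux only)."""
    if target.startswith("socket:"):
        return "socket"        # network or unix socket, e.g. "socket:[12345]"
    if target.startswith("pipe:"):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"    # epoll, eventfd and similar
    if target.startswith("/"):
        return "file"          # regular file, directory or device on disk
    return "other"


def count_open_fds(pid: int) -> Counter:
    """Count the open file descriptors of `pid`, grouped by kind."""
    fd_dir = f"/proc/{pid}/fd"
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listdir and readlink
        counts[classify_fd(target)] += 1
    return counts
```

Calling `count_open_fds(pid)` with the broker's process id should show whether "socket" dominates the count. If it does, that would be consistent with a socket leak rather than too many log segment files.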
Guozhang

On Mon, Nov 10, 2014 at 7:47 AM, Marco <zentrop...@yahoo.co.uk> wrote:

> We're using kafka 0.8.1.1.
>
> About network partition, it is an option.
> now i'm just wondering if deleting the data folder on the second node will
> at least have it come up again.
>
> i think another guy tried a kafka-reassign-partitions just before it all
> blew up.
>
>
> On Monday, 10 November 2014 16:36, Guozhang Wang <wangg...@gmail.com> wrote:
> Hi Marco,
>
> The fetch error comes from "UnresolvedAddressException", could you try to
> check if you have a network partition issue during that time?
>
> As for the "Too many file handlers", I think this is due to not properly
> handling such exceptions that it does not close the socket in time, which
> version of Kafka are you using?
>
> Guozhang
>
>
> On Mon, Nov 10, 2014 at 6:08 AM, Marco <zentrop...@yahoo.co.uk> wrote:
>
> > Hi,
> > i've got a 2-machine kafka cluster. For some reasons after a restart the
> > second node won't start.
> > i get tons of "Error in fetch Name" until I get a final "Too many open
> > files".
> >
> > How do i start dealing with this?
> >
> > thanks
> >
> > this is the error
> >
> > [2014-11-10 14:48:01,169] INFO [Kafka Server 2], started (kafka.server.KafkaServer)
> > [2014-11-10 14:48:01,378] INFO [ReplicaFetcherManager on broker 2] Removed fetcher for partitions [news,3],[test,0],[test,2],[news,1],[test3,1],[test3,3] (kafka.server.ReplicaFetcherManager)
> > [2014-11-10 14:48:01,459] INFO Truncating log news-3 to offset 249. (kafka.log.Log)
> > [2014-11-10 14:48:01,462] INFO Truncating log test-0 to offset 0. (kafka.log.Log)
> > [2014-11-10 14:48:01,462] INFO Truncating log test-2 to offset 0. (kafka.log.Log)
> > [2014-11-10 14:48:01,463] INFO Truncating log news-1 to offset 268. (kafka.log.Log)
> > [2014-11-10 14:48:01,464] INFO Truncating log test3-1 to offset 0. (kafka.log.Log)
> > [2014-11-10 14:48:01,464] INFO Truncating log test3-3 to offset 0. (kafka.log.Log)
> > [2014-11-10 14:48:01,530] INFO [ReplicaFetcherThread-0-1], Starting (kafka.server.ReplicaFetcherThread)
> > [2014-11-10 14:48:01,535] INFO [ReplicaFetcherManager on broker 2] Added fetcher for partitions ArrayBuffer([[news,3], initOffset 249 to broker id:1,host:machine1,port:9092] , [[news,1], initOffset 268 to broker id:1,host:machine1,port:9092] ) (kafka.server.ReplicaFetcherManager)
> > [2014-11-10 14:48:01,551] ERROR [ReplicaFetcherThread-0-1], Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 0; ClientId: ReplicaFetcherThread-0-1; ReplicaId: 2; MaxWait: 500 ms; MinBytes: 1 bytes; RequestInfo: [news,3] -> PartitionFetchInfo(249,1048576),[news,1] -> PartitionFetchInfo(268,1048576) (kafka.server.ReplicaFetcherThread)
> > java.nio.channels.UnresolvedAddressException
> >     at sun.nio.ch.Net.checkAddress(Net.java:127)
> >     ...
>
> --
> -- Guozhang

--
-- Guozhang