Did you recently add topics / partitions? Each partition takes a memory
buffer for replication, so adding partitions without resizing memory can
lead to an OOME.

You basically need the Java heap size to be larger than (# partitions on
the broker) x replica.fetch.max.bytes
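
For a back-of-the-envelope sense of the numbers (the partition count here
is hypothetical; 1 MiB is the replica.fetch.max.bytes default):

    public class FetchBufferEstimate {
        public static void main(String[] args) {
            long partitions = 5_000;         // hypothetical replica count on one broker
            long fetchMaxBytes = 1_048_576;  // replica.fetch.max.bytes default (1 MiB)
            double gib = partitions * fetchMaxBytes / (1024.0 * 1024 * 1024);
            // => ~4.9 GiB of heap for replication fetch buffers alone
            System.out.printf("Worst-case fetch buffers: ~%.1f GiB%n", gib);
        }
    }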

Gwen

On Wed, Dec 14, 2016 at 12:03 PM, Zakee <kzak...@netzero.net> wrote:

> Recently, we have seen our brokers crash with the errors below. Any idea
> what might be wrong here? The brokers have been running for a long time on
> the same hosts/configs without this issue. Is this something to do with the
> new version 0.10.0.1 (which we upgraded to recently), or could it be a h/w
> issue? Ten hosts are dedicated, one broker per host. Each host has 128 GB
> RAM and 20 TB of storage mounts. Any pointers will help...
>
>
> [2016-12-12 02:49:58,134] FATAL [app=broker] [ReplicaFetcherThread-15-15] [ReplicaFetcherThread-15-15], Disk error while replicating data for mytopic-19 (kafka.server.ReplicaFetcherThread)
> kafka.common.KafkaStorageException: I/O exception in append to log 'mytopic-19'
>         at kafka.log.Log.append(Log.scala:349)
>         at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:130)
>         at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:42)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:159)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:141)
>         at scala.Option.foreach(Option.scala:257)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:141)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:138)
>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:138)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:138)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:138)
>         at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
>         at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:136)
>         at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> Caused by: java.io.IOException: Map failed
>         at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:907)
>         at kafka.log.AbstractIndex$$anonfun$resize$1.apply(AbstractIndex.scala:116)
>         at kafka.log.AbstractIndex$$anonfun$resize$1.apply(AbstractIndex.scala:106)
>         at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
>         at kafka.log.AbstractIndex.resize(AbstractIndex.scala:106)
>         at kafka.log.AbstractIndex$$anonfun$trimToValidSize$1.apply$mcV$sp(AbstractIndex.scala:160)
>         at kafka.log.AbstractIndex$$anonfun$trimToValidSize$1.apply(AbstractIndex.scala:160)
>         at kafka.log.AbstractIndex$$anonfun$trimToValidSize$1.apply(AbstractIndex.scala:160)
>         at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:234)
>         at kafka.log.AbstractIndex.trimToValidSize(AbstractIndex.scala:159)
>         at kafka.log.Log.roll(Log.scala:772)
>         at kafka.log.Log.maybeRoll(Log.scala:742)
>         at kafka.log.Log.append(Log.scala:405)
>         ... 16 more
> Caused by: java.lang.OutOfMemoryError: Map failed
>         at sun.nio.ch.FileChannelImpl.map0(Native Method)
>         at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:904)
>         ... 28 more
>
>
> Thanks
> -Zakee
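
For context on the root cause at the bottom of that trace: Kafka
memory-maps its segment index files via FileChannel.map() (the
AbstractIndex.resize frame above), and the JVM reports a failed mmap as
java.lang.OutOfMemoryError: Map failed. A minimal sketch of that call
pattern, with a hypothetical file path and size:

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class IndexMapSketch {
        public static void main(String[] args) throws Exception {
            try (RandomAccessFile raf = new RandomAccessFile("/tmp/test.index", "rw")) {
                raf.setLength(10 * 1024 * 1024); // pre-size the file, as Kafka does for an index
                // This is the call that throws "Map failed" when the process
                // cannot create another memory mapping.
                MappedByteBuffer buf = raf.getChannel()
                        .map(FileChannel.MapMode.READ_WRITE, 0, raf.length());
                buf.putInt(0, 42); // mapped writes go through the page cache, not the heap
            }
        }
    }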




-- 
*Gwen Shapira*
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
<http://www.confluent.io/blog>
