date:20150513

Re: Hitting integer limit when setting log segment.bytes

2015-05-13 Thread Mike Axiak

Jay Kreps has commented on this before: https://issues.apache.org/jira/browse/KAFKA-1670?focusedCommentId=14161185&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14161185 Basically, you can always have more segment files. Having too large of segment files will signif

Re: kafka.message.InvalidMessageException: Message is corrupt crc

2015-05-13 Thread Roger Hoover

Are you using snappy compression? I ran into an issue with message corruption with the new producer, snappy compression, and broker restart. On Mon, May 4, 2015 at 12:55 AM, scguo wrote: > Hi > > > > Here is my questions. > > > > kafka.message.InvalidMessageException: Message is corrupt (store

Re: Experiences testing new producer performance across multiple threads/producer counts

2015-05-13 Thread Jay Kreps

Hey Garry, Super interesting. We honestly never did a ton of performance tuning on the producer. I checked the profiles early on in development and we fixed a few issues that popped up in deployment, but I don't think anyone has done a really scientific look. If you (or anyone else) want to dive i

Re: Producer garbage collection problem

2015-05-13 Thread pengfei li

Hi, I ran into the same problem. The Scala bug https://github.com/scala/scala/pull/3450 was fixed in version 2.11, but when I tried kafka_2.11-0.8.2.1.tgz, which is compiled with Scala 2.11, the problem was still there. Did you find a solution? Thanks 2014-02-05 0:47 GMT+08:00 Florian O

Re: Hitting integer limit when setting log segment.bytes

2015-05-13 Thread Mayuresh Gharat

I suppose it is the way log management works in Kafka. I am not sure of the exact reason for this. Also, the index files that are constructed map the offset relative to the base offset of the log file to the real offset. The key value in index file is of the form . Thanks, Mayuresh On Wed, May 1

Re: Hitting integer limit when setting log segment.bytes

2015-05-13 Thread Lance Laursen

Hey folks, Any update on this? On Thu, Apr 30, 2015 at 5:34 PM, Lance Laursen wrote: > Hey all, > > I am attempting to create a topic which uses 8GB log segment sizes, like > so: > ./kafka-topics.sh --zookeeper localhost:2181 --create --topic perftest6p2r > --partitions 6 --replication-factor 2
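For readers following the thread subject, here is a minimal sketch of the arithmetic behind the limit. It assumes the segment size setting is held in a signed 32-bit integer, which is inferred from the subject line and not confirmed in the preview above, and that "8GB" means 8 GiB. The helper name fits_in_signed_int32 is hypothetical.

```python
# Assumption: the segment size setting is parsed as a signed 32-bit integer.
INT32_MAX = 2**31 - 1  # 2147483647

def fits_in_signed_int32(value: int) -> bool:
    """Return True if value is representable as a signed 32-bit integer."""
    return -2**31 <= value <= INT32_MAX

# Assumption: "8GB" is read as 8 GiB.
eight_gib = 8 * 1024**3  # 8589934592 bytes

print(fits_in_signed_int32(eight_gib))  # False
print(fits_in_signed_int32(INT32_MAX))  # True
```

Under these assumptions the requested size is about four times the largest representable value, which matches the advice elsewhere in the thread to use more segment files instead of larger ones.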

Kafka log flush time outlier

2015-05-13 Thread Rajiv Kurian

I have a single broker in a cluster of 9 brokers that has a log-flush-time-99th of 260 ms or more. Other brokers have a log-flush-time-99th of less than 30 ms. The misbehaving broker is running on the same kind of machine (c3.4x on EC2) that the other ones are running on. Its bytes-in, bytes-out,

Re: OutOfMemory error on broker when rolling logs

2015-05-13 Thread Jay Kreps

I think "java.lang.OutOfMemoryError: Map failed" has usually been "out of address space for mmap" if memory serves. If you sum the length of all .index files while the service is running (not after stopped), do they sum to something really close to 2GB? If so it is likely either that the OS/arch i
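The check suggested above (summing the length of all .index files while the broker is running) can be sketched as follows. This is only an illustration: the function names, the 5% tolerance, and the directory argument are assumptions, not part of Kafka or of the original message.

```python
import os

TWO_GIB = 2 * 1024**3

def total_index_bytes(log_dir: str) -> int:
    """Sum the file lengths of every *.index file found under log_dir."""
    total = 0
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            if name.endswith(".index"):
                total += os.path.getsize(os.path.join(root, name))
    return total

def near_two_gib(total: int, tolerance: float = 0.05) -> bool:
    """True when total is within `tolerance` (a fraction) of 2 GiB."""
    return abs(total - TWO_GIB) <= tolerance * TWO_GIB
```

It could be run as near_two_gib(total_index_bytes(log_dir)), where log_dir is the broker's data directory on your machine. os.path.getsize reports the file length, not the blocks actually allocated on disk, which is the quantity the message asks about.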

OutOfMemory error on broker when rolling logs

2015-05-13 Thread Jeff Field

Hello, We are doing a Kafka POC on our CDH cluster. We are running 3 brokers with 24TB (48TB Raw) of available RAID10 storage (XFS filesystem mounted with nobarrier/largeio) (HP Smart Array P420i for the controller, latest firmware) and 48GB of RAM. The broker is running with "-Xmx4G -Xms4G -ser

Re: Compression and batching

2015-05-13 Thread Jiangjie Qin

Yes, in old producer we don't control the compressed message size. In new producer, we estimate the compressed size heuristically and decide whether to close the batch or not. It is not perfect but at least better than the old one. Jiangjie (Becket) Qin On 5/13/15, 4:00 PM, "Jamie X" wrote: >Ji

Re: Compression and batching

2015-05-13 Thread Jamie X

Jiangjie, I changed my code to group by partition, then for each partition to group messages into up to 900kb of uncompressed data, and then sent those batches out. That worked fine and didn't cause any MessageTooLarge errors. So it looks like the issue is that the producer batches all the messages
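The workaround described above (group by partition, then cap each batch at roughly 900kb of uncompressed data) can be sketched like this. It is not the poster's code: the function name, the (partition, payload) input shape, and reading "900kb" as 900 KiB are all assumptions, and the actual send call is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumption: "900kb" is read as 900 KiB of uncompressed payload bytes.
MAX_BATCH_BYTES = 900 * 1024

def batch_by_partition(
    messages: Iterable[Tuple[int, bytes]],
    max_bytes: int = MAX_BATCH_BYTES,
) -> Dict[int, List[List[bytes]]]:
    """Group (partition, payload) pairs by partition, then split each
    partition's payloads into batches of at most max_bytes each.
    A single payload larger than max_bytes ends up alone in its own batch."""
    batches: Dict[int, List[List[bytes]]] = defaultdict(list)
    sizes: Dict[int, int] = {}  # byte count of the open batch per partition
    for partition, payload in messages:
        current = batches[partition]
        # Open a new batch if none exists or this payload would overflow it.
        if not current or sizes[partition] + len(payload) > max_bytes:
            current.append([])
            sizes[partition] = 0
        current[-1].append(payload)
        sizes[partition] += len(payload)
    return dict(batches)
```

Each inner list would then be handed to one send call. The cap is on uncompressed bytes, so it only bounds the compressed size if compression does not make the batch larger.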

Re: Experiences testing new producer performance across multiple threads/producer counts

2015-05-13 Thread Jiangjie Qin

Thanks for sharing this, Garry. I actually did similar tests before but unfortunately lost the test data because my laptop rebooted and I forgot to save the data... Anyway, several things to verify: 1. Remember KafkaProducer holds lock per partition. So if you have only one partition in the target

Re: Auto-rebalance not triggering in 2.10-0.8.1.1

2015-05-13 Thread Jiangjie Qin

Automatic preferred leader election hasn't been turned on in 0.8.1.1. It's been turned on in latest trunk though. The config name is "auto.leader.rebalance.enable". Jiangjie (Becket) Qin On 5/13/15, 10:50 AM, "Stephen Armstrong" wrote: >Does anyone have any insight into this? Am I correct that

Re: Compression and batching

2015-05-13 Thread Jiangjie Qin

If you are sending in sync mode, producer will just group by partition the list of messages you provided as argument of send() and send them out. You don't need to worry about batch.num.messages. There is a potential that compressed message is even bigger than uncompressed message, though. I'm not

Re: New Producer Async - Metadata Fetch Timeout

2015-05-13 Thread Jiangjie Qin

Isn’t the producer part of the application? The metadata is stored in memory. If the application rebooted (process restarted), all the metadata will be gone. Jiangjie (Becket) Qin On 5/13/15, 9:54 AM, "Mohit Gupta" wrote: >I meant the producer. ( i.e. application using the producer api to push

Re: Getting NotLeaderForPartitionException in kafka broker

2015-05-13 Thread Jiangjie Qin

Does this topic exist in Zookeeper? On 5/12/15, 11:35 PM, "tao xiao" wrote: >Hi, > >Any updates on this issue? I keep seeing this issue happening over and >over >again > >On Thu, May 7, 2015 at 7:28 PM, tao xiao wrote: > >> Hi team, >> >> I have a 12 nodes cluster that has 800 topics and each o

Experiences testing new producer performance across multiple threads/producer counts

2015-05-13 Thread Garry Turkington

Hi, I talked with Gwen at Strata last week and promised to share some of my experiences benchmarking an app reliant on the new producer. I'm using relatively meaty boxes running my producer code (24 core/64GB RAM) but I wasn't pushing them until I got them on the same 10GB fabric as the Kafka

Re: New Producer API Design

2015-05-13 Thread Guozhang Wang

Hello Mohit, When we originally designed the new producer API we removed the serializer / deserializer from the old producer and made it generic as accepting only message, but we later concluded it would still be more beneficial to add the serde back into the producer API. And as you observed one co

Re: New Producer API Design

2015-05-13 Thread Ewen Cheslack-Postava

You can of course use KafkaProducer to get a producer interface that can accept a variety of types. For example, if you have an Avro serializer that accepts both primitive types (e.g. String, integer types) and complex types (e.g. records, arrays, maps), Object is the only type you can use to cover

Re: Auto-rebalance not triggering in 2.10-0.8.1.1

2015-05-13 Thread Stephen Armstrong

Does anyone have any insight into this? Am I correct that 0.8.1.1 should be running the leader election automatically? If this is a known issue, is there any reason not to have a cron script that runs the leader election regularly? Thanks Steve On Thu, May 7, 2015 at 2:47 PM, Stephen Armstrong <

New Producer API Design

2015-05-13 Thread Mohit Gupta

Hello, I have a question regarding the design of the new Producer API. As per the design (KafkaProducer), it seems that a separate producer is required for every combination of key and value type. Whereas, in documentation (and elsewhere) it's recommended to create a single producer instance per

Re: Compression and batching

2015-05-13 Thread Jamie X

(sorry if this messes up the mailing list, I didn't seem to get replies in my inbox) Jiangjie, I am indeed using the old producer, and on sync mode. > Notice that the old producer uses number of messages as batch limitation instead of number of bytes. Can you clarify this? I see a setting batch.

Re: New Producer Async - Metadata Fetch Timeout

2015-05-13 Thread Mohit Gupta

I meant the producer. ( i.e. application using the producer api to push messages into kafka ) . On Wed, May 13, 2015 at 10:20 PM, Mayuresh Gharat < gharatmayures...@gmail.com> wrote: > By application rebooting, do you mean you bounce the brokers? > > Thanks, > > Mayuresh > > On Wed, May 13, 2015

Re: New Producer Async - Metadata Fetch Timeout

2015-05-13 Thread Mayuresh Gharat

By application rebooting, do you mean you bounce the brokers? Thanks, Mayuresh On Wed, May 13, 2015 at 4:06 AM, Mohit Gupta wrote: > Thanks Jiangjie. This is helpful. > > Adding to what you have mentioned, I can think of one more scenario which > may not be very rare. > Say, the application is

the performance of producer[async] degraded seriously after full gc

2015-05-13 Thread pengfei li

Hi, Recently we started using Kafka for message transport, and I found something strange in the producer. We use the producer in async mode, and if I trigger a full GC with jmap -histo:live, the producer's performance degrades seriously. I then found there were a lot of scala.collection.immutable.Str

Re: Kafka log compression change in 0.8.2.1?

2015-05-13 Thread Guozhang Wang

Roger found another possible issue with snappy compression that during broker bouncing the snappy compressed messages could get corrupted while re-sending. I am not sure if it is related but would be good to verify after the upgrade. Guozhang On Tue, May 12, 2015 at 3:55 PM, Jun Rao wrote: > Hi

Re: Doubt regarding new Producer and old Producer API

2015-05-13 Thread Guozhang Wang

Hi Madhukar, 1. the java producer API can also be used for sync call; you can do it with producer.send().get(). 2. the partitioner class has been removed from the new producer API, instead now the message could have a specific partition id. You could calculate the partition id in your customized p

Re: Pulling Snapshots from Kafka, Log compaction last compact offset

2015-05-13 Thread Jonathan Hodges

Very good points, Gwen. I hadn't thought of Oracle Streams case of dependencies. I wonder if GoldenGate handles this better? The tradeoff of these approaches is that each RDBMS will be proprietary on how to get this CDC information. I guess GoldenGate can be a standard interface on RDBMs, but t

Re: New Producer Async - Metadata Fetch Timeout

2015-05-13 Thread Mohit Gupta

Thanks Jiangjie. This is helpful. Adding to what you have mentioned, I can think of one more scenario which may not be very rare. Say, the application is rebooted and the Kafka brokers registered in the producer are not reachable ( could be due to network issues or those brokers are actually down

Doubt regarding new Producer and old Producer API

2015-05-13 Thread Madhukar Bharti

Hi all, What are the possible use cases for using the new producer API? - Does it only provide async calls with a callback feature? - Has the partitioner class been removed from the new Producer API? If not, how do I implement it if I want to use only the client APIs? Regards, Madhukar

30 matches
