Jay Kreps has commented on this before:
https://issues.apache.org/jira/browse/KAFKA-1670?focusedCommentId=14161185&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14161185
Basically, you can always have more segment files. Having too large of
segment files will signif
Are you using snappy compression? I ran into an issue with message
corruption with the new producer, snappy compression, and broker restart.
On Mon, May 4, 2015 at 12:55 AM, scguo wrote:
> Hi
>
>
>
> Here are my questions.
>
>
>
> kafka.message.InvalidMessageException: Message is corrupt (store
Hey Garry,
Super interesting. We honestly never did a ton of performance tuning on the
producer. I checked the profiles early on in development and we fixed a few
issues that popped up in deployment, but I don't think anyone has done a
really scientific look. If you (or anyone else) want to dive i
Hi,
I ran into the same problem. The Scala bug
https://github.com/scala/scala/pull/3450 was fixed in version 2.11, and I
tried
kafka_2.11-0.8.2.1.tgz, which is compiled with Scala 2.11, but the
same problem is still there.
Did you find a solution?
Thanks
2014-02-05 0:47 GMT+08:00 Florian O
I suppose it is the way log management works in Kafka.
I am not sure of the exact reason for this. Also, the index files that are
constructed hold a mapping from the relative offset (relative to the base
offset of the log file) to the real offset. The key value in the index file is of the form
.
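To make the relative-offset idea concrete, here is a minimal sketch. It assumes a sparse index held in memory as sorted (relative offset, file position) pairs and a segment whose base offset is known; the function names and sample numbers are made up for illustration and are not Kafka's own code.

```python
import bisect


def absolute_offset(base_offset, relative_offset):
    """Convert an index-relative offset back to the real message offset."""
    return base_offset + relative_offset


def lookup_position(base_offset, entries, target_offset):
    """Return the file position of the last index entry at or before target_offset.

    entries is a sorted list of (relative_offset, file_position) pairs
    (an assumed in-memory form of a sparse index).
    """
    relative = target_offset - base_offset
    keys = [rel for rel, _pos in entries]
    i = bisect.bisect_right(keys, relative) - 1
    if i < 0:
        # Target is before the first entry: start scanning at the file start.
        return 0
    return entries[i][1]


# Hypothetical segment with base offset 1000 and three index entries.
sample_entries = [(0, 0), (50, 4096), (120, 8192)]
position = lookup_position(1000, sample_entries, 1075)
```

A reader would then scan the log file forward from the returned position until it reaches the exact offset it wants.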
Thanks,
Mayuresh
On Wed, May 1
Hey folks,
Any update on this?
On Thu, Apr 30, 2015 at 5:34 PM, Lance Laursen
wrote:
> Hey all,
>
> I am attempting to create a topic which uses 8GB log segment sizes, like
> so:
> ./kafka-topics.sh --zookeeper localhost:2181 --create --topic perftest6p2r
> --partitions 6 --replication-factor 2
I have a single broker in a cluster of 9 brokers that has a
log-flush-time-99th of 260 ms or more. Other brokers have
a log-flush-time-99th of less than 30 ms. The misbehaving broker is running
on the same kind of machine (c3.4x on Ec2) that the other ones are running
on. Its bytes-in, bytes-out,
I think "java.lang.OutOfMemoryError: Map failed" has usually been "out of
address space for mmap" if memory serves.
If you sum the length of all .index files while the service is running (not
after stopped), do they sum to something really close to 2GB? If so it is
likely either that the OS/arch i
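One way to do the sum suggested above is a small script like the following. It is only a sketch: the log directory is whatever your broker's log directory is, and the function name is made up for illustration.

```python
import os

TWO_GIB = 2 * 1024 ** 3


def total_index_bytes(log_dir):
    """Sum the sizes of all .index files under log_dir, including subdirectories."""
    total = 0
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            if name.endswith(".index"):
                total += os.path.getsize(os.path.join(root, name))
    return total


# Example (the path is a placeholder, run it while the broker is up):
# used = total_index_bytes("/path/to/kafka-logs")
# print(used, "bytes,", round(100.0 * used / TWO_GIB, 1), "% of 2 GiB")
```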
Hello,
We are doing a Kafka POC on our CDH cluster. We are running 3 brokers with 24TB
(48TB Raw) of available RAID10 storage (XFS filesystem mounted with
nobarrier/largeio) (HP Smart Array P420i for the controller, latest firmware)
and 48GB of RAM. The broker is running with "-Xmx4G -Xms4G -ser
Yes, in the old producer we don't control the compressed message size. In the new
producer, we estimate the compressed size heuristically and decide whether
to close the batch or not. It is not perfect but at least better than the
old one.
Jiangjie (Becket) Qin
On 5/13/15, 4:00 PM, "Jamie X" wrote:
>Ji
Jiangjie, I changed my code to group by partition, then for each partition
to group messages into up to 900 KB of uncompressed data, and then sent those
batches out. That worked fine and didn't cause any MessageTooLarge errors.
So it looks like the issue is that the producer batches all the messages
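For anyone wanting to try the same workaround, the grouping described above might look roughly like this. It is a sketch, not the original code: messages are assumed to be (partition, payload bytes) pairs, and the function name and the 900 KB default are illustrative.

```python
from collections import defaultdict

MAX_BATCH_BYTES = 900 * 1024  # uncompressed size limit per batch


def batch_by_partition(messages, max_batch_bytes=MAX_BATCH_BYTES):
    """Group (partition, payload) pairs into per-partition batches.

    Each batch holds at most max_batch_bytes of uncompressed payload. A single
    payload larger than the limit still gets a batch of its own.
    """
    batches = defaultdict(list)
    sizes = {}
    for partition, payload in messages:
        current = batches[partition]
        if not current or sizes[partition] + len(payload) > max_batch_bytes:
            current.append([])  # start a new batch for this partition
            sizes[partition] = 0
        current[-1].append(payload)
        sizes[partition] += len(payload)
    return dict(batches)
```

Each inner list would then go out as one send call, so no single request mixes partitions or exceeds the chosen uncompressed size.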
Thanks for sharing this, Garry. I actually did similar tests before but
unfortunately lost the test data because my laptop rebooted and I forgot
to save the data…
Anyway, several things to verify:
1. Remember KafkaProducer holds a lock per partition. So if you have only
one partition in the target
Automatic preferred leader election hasn't been turned on in 0.8.1.1. It's
been turned on in the latest trunk though.
The config name is "auto.leader.rebalance.enable".
Jiangjie (Becket) Qin
On 5/13/15, 10:50 AM, "Stephen Armstrong"
wrote:
>Does anyone have any insight into this? Am I correct that
If you are sending in sync mode, the producer will just group by partition the
list of messages you provided as argument of send() and send them out. You
don't need to worry about batch.num.messages.
There is a potential that a compressed message is even bigger than the
uncompressed message, though. I'm not
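The point about compressed data sometimes being larger is easy to see with any general-purpose codec. The sketch below uses Python's zlib purely as a stand-in (it is not the codec discussed in this thread), and the helper name is made up.

```python
import zlib


def compression_overhead(payload):
    """Return compressed size minus original size (positive means it grew)."""
    return len(zlib.compress(payload)) - len(payload)


# A tiny payload grows because of the codec's fixed header and checksum.
tiny_overhead = compression_overhead(b"a")

# A long repetitive payload shrinks a lot.
repetitive_overhead = compression_overhead(b"a" * 10000)
```

So a size check done on uncompressed bytes is not a strict upper bound for what ends up on the wire when payloads are small or already dense.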
Isn’t the producer part of the application? The metadata is stored in
memory. If the application rebooted (process restarted), all the metadata
will be gone.
Jiangjie (Becket) Qin
On 5/13/15, 9:54 AM, "Mohit Gupta" wrote:
>I meant the producer. ( i.e. application using the producer api to push
Does this topic exist in Zookeeper?
On 5/12/15, 11:35 PM, "tao xiao" wrote:
>Hi,
>
>Any updates on this issue? I keep seeing this issue happening over and
>over
>again
>
>On Thu, May 7, 2015 at 7:28 PM, tao xiao wrote:
>
>> Hi team,
>>
>> I have a 12 nodes cluster that has 800 topics and each o
Hi,
I talked with Gwen at Strata last week and promised to share some of my
experiences benchmarking an app reliant on the new producer. I'm using
relatively meaty boxes running my producer code (24 core/64GB RAM) but I wasn't
pushing them until I got them on the same 10GB fabric as the Kafka
Hello Mohit,
When we originally designed the new producer API we removed the serializer /
deserializer from the old producer and made it generic as accepting only
message, but we later concluded it would still be more
beneficial to add the serde back into the producer API. And as you observed
one co
You can of course use KafkaProducer to get a producer
interface that can accept a variety of types. For example, if you have an
Avro serializer that accepts both primitive types (e.g. String, integer
types) and complex types (e.g. records, arrays, maps), Object is the only
type you can use to cover
Does anyone have any insight into this? Am I correct that 0.8.1.1 should be
running the leader election automatically? If this is a known issue, is
there any reason not to have a cron script that runs the leader election
regularly?
Thanks
Steve
On Thu, May 7, 2015 at 2:47 PM, Stephen Armstrong <
Hello,
I have a question regarding the design of the new Producer API.
As per the design (KafkaProducer), it seems that a separate producer
is required for every combination of key and value type. Whereas in the
documentation (and elsewhere) it's recommended to create a single
producer instance per
(sorry if this messes up the mailing list, I didn't seem to get replies in
my inbox)
Jiangjie, I am indeed using the old producer, and in sync mode.
> Notice that the old producer uses number of messages as batch limitation
instead of number of bytes.
Can you clarify this? I see a setting batch.
I meant the producer (i.e. the application using the producer API to push
messages into Kafka).
On Wed, May 13, 2015 at 10:20 PM, Mayuresh Gharat <
gharatmayures...@gmail.com> wrote:
> By application rebooting, do you mean you bounce the brokers?
>
> Thanks,
>
> Mayuresh
>
> On Wed, May 13, 2015
By application rebooting, do you mean you bounce the brokers?
Thanks,
Mayuresh
On Wed, May 13, 2015 at 4:06 AM, Mohit Gupta
wrote:
> Thanks Jiangjie. This is helpful.
>
> Adding to what you have mentioned, I can think of one more scenario which
> may not be very rare.
> Say, the application is
Hi,
Recently we started using Kafka for message transport, and I found something
strange in the producer.
We use the producer in async mode, and if I trigger a full GC with jmap
-histo:live, the performance of the producer degrades seriously.
Then I found there were a lot of scala.collection.immutable.Str
Roger found another possible issue with snappy compression: during a
broker bounce, snappy-compressed messages could get corrupted while being
re-sent. I am not sure if it is related, but it would be good to verify
after the upgrade.
Guozhang
On Tue, May 12, 2015 at 3:55 PM, Jun Rao wrote:
> Hi
Hi Madhukar,
1. The Java producer API can also be used for sync calls; you can do it with
producer.send().get().
2. The partitioner class has been removed from the new producer API;
instead, the message can now carry a specific partition id. You could
calculate the partition id in your customized p
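As an illustration of point 2, computing a partition id yourself could be as simple as the sketch below. The function name is hypothetical, and CRC32 is just one stable hash chosen for the example; it is an assumption, not the hash the Kafka client itself uses.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key (bytes) to a partition id in the range [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # CRC32 is stable across processes and runs, unlike Python's built-in hash().
    return zlib.crc32(key) % num_partitions
```

The record would then be sent with that explicit partition id, so the same key always lands on the same partition as long as the partition count does not change.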
Very good points, Gwen. I hadn't thought of the Oracle Streams case of
dependencies. I wonder if GoldenGate handles this better?
The tradeoff of these approaches is that each RDBMS has its own proprietary
way of exposing this CDC information. I guess GoldenGate can be a standard
interface on RDBMSs, but t
Thanks Jiangjie. This is helpful.
Adding to what you have mentioned, I can think of one more scenario which
may not be very rare.
Say, the application is rebooted and the Kafka brokers registered in the
producer are not reachable ( could be due to network issues or those
brokers are actually down
Hi all,
What are the possible use cases for the new producer API?
- Does it only provide async calls with a callback feature?
- Has the partitioner class been removed from the new Producer API? If not,
how do I implement it if I want to use only the client APIs?
Regards,
Madhukar