LinkedIn's Gobblin compaction tool uses Hive to perform the compaction. Does
that mean Lumos is replaced?
Confused…
On Mar 17, 2015, at 10:00 PM, Xiao wrote:
> Hi, all,
>
> Do you know whether LinkedIn plans to open source Lumos in the near future?
>
> I found the answer from Qiao Lin’s po
Hi, all,
Do you know whether LinkedIn plans to open source Lumos in the near future?
I found the answer from Qiao Lin’s post about replication from Oracle/MySQL to
Hadoop.
- https://engineering.linkedin.com/data-ingestion/gobblin-big-data-ease
On the source side, it can be DataBus-ba
Yes, storing the info in ZooKeeper would work. In the new consumer, since the
coordinator will reside on the server side, this would be easily
detected. I’m not sure if it is still worth making this change on the old
consumer, though, especially since this is a backward-incompatible change in
the sense that all the
AFAIK, LinkedIn uses Databus to do the same. Aesop is built on top of
Databus, extending its beautiful capabilities to MySQL and HBase.
On Mar 18, 2015 7:37 AM, "Xiao" wrote:
> Hi, all,
>
> Do you know how the LinkedIn team publishes changed rows in Oracle to Kafka? I
> believe they already know the w
Hi, all,
Do you know how the LinkedIn team publishes changed rows in Oracle to Kafka? I
believe they already know the whole problem very well.
Using triggers? Directly parsing the log? Or using any Oracle GoldenGate
interfaces?
Any lessons or any standard message format? Could the LinkedIn peo
The intention of this test is to check how Kafka would behave if two
different assignment strategies are set in the same consumer group. In
reality this could happen, as we never know what configurations downstream
consumers would use.
What about storing the assignment strategy in ZK and sending out
I think this is a usability issue. It might need an extra admin tool to verify
that all configuration settings are correct, even if the broker can return an
error message to the consumers.
Thanks,
Xiao Li
On Mar 17, 2015, at 5:18 PM, Jiangjie Qin wrote:
> The problem is the consumers are ind
The problem is the consumers are independent of each other. We purely
depend on the same algorithm running on different consumers to achieve
agreement on partition assignment. Breaking this assumption violates the
design in the first place.
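To illustrate that assumption, here is a minimal sketch of the kind of deterministic rule this agreement depends on: every consumer sorts the same member and partition lists and applies the same function, so each one independently computes the same mapping. The class and the range-style rule below are illustrative only, not the actual Kafka assignment code.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DeterministicAssignment {
    // Same inputs plus the same algorithm on every consumer => same output everywhere.
    static Map<String, List<Integer>> assign(List<String> consumerIds, int numPartitions) {
        List<String> sorted = new ArrayList<>(consumerIds);
        Collections.sort(sorted); // identical ordering on every consumer
        Map<String, List<Integer>> owners = new LinkedHashMap<>();
        int base = numPartitions / sorted.size();
        int extra = numPartitions % sorted.size();
        int next = 0;
        for (int i = 0; i < sorted.size(); i++) {
            int count = base + (i < extra ? 1 : 0); // first 'extra' consumers get one more
            List<Integer> owned = new ArrayList<>();
            for (int j = 0; j < count; j++) {
                owned.add(next++);
            }
            owners.put(sorted.get(i), owned);
        }
        return owners;
    }

    public static void main(String[] args) {
        // Two consumers can each run this locally and agree without talking to each other.
        System.out.println(assign(Arrays.asList("consumer-2", "consumer-1"), 3));
        // Prints: {consumer-1=[0, 1], consumer-2=[2]}
    }
}

If one member of the group runs roundrobin while another runs range, the locally computed mappings differ and two consumers can claim the same partition, which is exactly the conflict described above.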
On 3/17/15, 4:13 PM, "Mayuresh Gharat" wrote:
>Probably
Probably we should return an error response if you already have a partition
assignment strategy in place for a group and you try to use another strategy.
Thanks,
Mayuresh
On Tue, Mar 17, 2015 at 2:10 PM, Jiangjie Qin
wrote:
> Yeah, using different partition assignment algorithms in the same consu
We are trying to see what might have caused it.
We had some questions:
1) Is this reproducible? That way we can dig deep.
This looks like an interesting problem to solve, and you might have caught a
bug, but we need to verify the root cause before filing a ticket.
Thanks,
Mayuresh
On Tue, Mar 17, 201
Mathias,
SPM for Kafka will give you Consumer Offsets by Host, Consumer Id, Topic,
and Partition, and you can alert (thresholds and/or anomalies) on any
combination of these, and of course on any of the other 100+ Kafka metrics
there.
See http://blog.sematext.com/2015/02/10/kafka-0-8-2-monitoring/
Any suggestions on how to measure the throughput of the Mirror Maker?
Naidu Saladi
- Forwarded Message -
From: Saladi Naidu
To: "users@kafka.apache.org"
Sent: Monday, March 16, 2015 10:31 PM
Subject: How to measure performance of Mirror Maker
We have three Kafka clusters deploy
When I sent the email last time the message formatting was messed up. Hopefully
that does not happen this time.
We have a very unique problem.
We have an application deployed on a WebLogic cluster that is spread across 2
datacenters (active-active), DC1 and DC2 (different LANs but the same WAN). This
pr
> What version are you running?
Version 0.8.2.0
> Your case is 2). But the only weird thing is your replica (broker 3) is
> requesting an offset which is greater than the leader's log end offset.
So what could be the cause?
Thanks
Zakee
> On Mar 17, 2015, at 11:45 AM, Mayuresh Gharat
> w
Yeah, using different partition assignment algorithms in the same consumer
group won't work. Is there a particular reason you want to do this?
On 3/17/15, 8:32 AM, "tao xiao" wrote:
>This is the corrected zk result
>
>Here is the result from zk
>[zk: localhost:2181(CONNECTED) 0] get
>/consumers/
Hi Mathias,
We call bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker via NRPE,
and alert through Nagios.
-Robin
On Tue, Mar 17, 2015 at 2:46 AM, Kasper Mackenhauer Jacobsen <
kas...@falconsocial.com> wrote:
> Hi Mathias,
>
> We're currently using a custom solution that queries kafka and
This is a great set of projects!
We should put this list of projects on a site somewhere so people can more
easily see and refer to it. These aren't Kafka-specific, but most seem to be
"MySQL CDC." Does anyone have a place where they can host a page? Preferably a
wiki, so we can keep it up to d
Clint,
Your code looks fine and the output doesn't actually have any errors, but
you're also not waiting for the messages to be published. Try changing
producer.send(data);
to
producer.send(data).get();
to block until the message has been acked. If it runs and exits
cleanly, then you shou
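For anyone following along, here is a minimal self-contained sketch of the suggested change, assuming the 0.8.2 Java producer; the topic name and broker address are placeholders.

import java.util.Properties;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class BlockingSendExample {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");

        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);
        ProducerRecord<byte[], byte[]> data = new ProducerRecord<>("test", "hello".getBytes());

        // send() is asynchronous and only enqueues the record; calling get() on the
        // returned Future blocks until the broker has acked the message.
        RecordMetadata metadata = producer.send(data).get();
        System.out.println("acked: partition=" + metadata.partition() + " offset=" + metadata.offset());

        producer.close(); // flush outstanding messages before the JVM exits
    }
}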
What version are you running?
The code for the latest version says that:
1) If the log end offset of the replica is greater than the leader's log end
offset, the replica's offset will be reset to the logEndOffset of the leader.
2) Else, if the log end offset of the replica is smaller than the leader's
log
Thanks, Jon. That helps.
On Tue, Mar 17, 2015 at 11:34 AM, Jon Bringhurst <
jbringhu...@linkedin.com.invalid> wrote:
> At LinkedIn, we're running on 1.8.0u5. YRMV depending on hardware and
> load, but this is what we typically run with:
>
> -server
> -Xms4g
> -Xmx4g
> -XX:PermSize=96m
> -XX:MaxP
At LinkedIn, we're running on 1.8.0u5. YRMV depending on hardware and load, but
this is what we typically run with:
-server
-Xms4g
-Xmx4g
-XX:PermSize=96m
-XX:MaxPermSize=96m
-XX:+UseG1GC
-XX:MaxGCPauseMillis=20
-XX:InitiatingHeapOccupancyPercent=35
-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintGCTim
cool.
On Tue, Mar 17, 2015 at 10:15 AM, Zakee wrote:
> Hi Mayuresh,
>
> The logs are already attached and are in reverse order, starting from
> [2015-03-14 07:46:52,517] and going back to the time when the brokers were
> started.
>
> Thanks
> Zakee
>
>
>
> > On Mar 17, 2015, at 12:07 AM, Mayuresh Gharat <
>
Resurrecting an old thread. Are people running Kafka on Java 8 now?
On Sun, Aug 10, 2014 at 11:44 PM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:
> Just curious if you saw any issues with Java 1.8 or if everything went
> smoothly?
>
> Otis
> --
> Performance Monitoring * Log Analytics
Hi Mayuresh,
The logs are already attached and are in reverse order, starting from
[2015-03-14 07:46:52,517] and going back to the time when the brokers were
started.
Thanks
Zakee
> On Mar 17, 2015, at 12:07 AM, Mayuresh Gharat
> wrote:
>
> Hi Zakee,
>
> Thanks for the logs. Can you paste earlier l
We set the partitions the Python consumers need manually for now. I'm
looking into a solution using ZooKeeper (possibly) to balance them out
automatically, though.
On Tue, Mar 17, 2015 at 2:51 PM, Todd Palino wrote:
> Yeah, this is exactly correct. The python client does not implement the
> Zook
This is the corrected zk result
Here is the result from zk
[zk: localhost:2181(CONNECTED) 0] get
/consumers/test/owners/mm-benchmark-test/0
Node does not exist: /consumers/test/owners/mm-benchmark-test/0
[zk: localhost:2181(CONNECTED) 1] get
/consumers/test/owners/mm-benchmark-test1/0
test-loca
Hi team,
I have two consumer instances with the same group id connecting to two
different topics, with 1 partition created for each. One consumer uses
partition.assignment.strategy=roundrobin and the other one uses the default
assignment strategy. Both consumers have 1 thread spawned internally and
con
Pretty much a hijack / plug as well (=
https://github.com/mardambey/mypipe
"MySQL binary log consumer with the ability to act on changed rows and
publish changes to different systems with emphasis on Apache Kafka."
Mypipe currently encodes events using Avro before pushing them into Kafka
and is
Hello Dima,
The current consumer does not have an explicit memory control mechanism, but
you can try to indirectly bound the memory usage via the following configs:
fetch.message.max.bytes and queued.max.message.chunks. Details can be found
at http://kafka.apache.org/documentation.html#consumerconfig
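To make that concrete, here is a rough sketch assuming the 0.8.x high-level Java consumer. As a rule of thumb, each fetcher queue buffers up to queued.max.message.chunks chunks of at most fetch.message.max.bytes each, so the two settings together roughly cap the buffered memory; the values below are placeholders, not recommendations.

import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.javaapi.consumer.ConsumerConnector;

public class BoundedMemoryConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181");
        props.put("group.id", "my_consumer_group");
        // Upper bound on the size of a single fetched chunk, in bytes.
        props.put("fetch.message.max.bytes", "1048576");
        // Upper bound on how many chunks are buffered per queue.
        props.put("queued.max.message.chunks", "2");

        ConsumerConnector consumer =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        // ... createMessageStreams(...) and consume as usual ...
        consumer.shutdown();
    }
}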
Hi
I can't get the Kafka/Avro serializer producer example to work.
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConf
Hi
I'm facing an OOME while trying to consume from one topic with 10 partitions
(100,000 messages per partition) and 5 consumers (consumer groups),
consumer.timeout=10ms. The OOME occurred 1-2 minutes after start.
Java heap: Xms=1024M
LAN: about 10 Gbit
This is a standalone application.
Kafka version 0.8.2
Yeah, this is exactly correct. The python client does not implement the
Zookeeper logic that would be needed to do a balanced consumer. While it's
certainly possible to do it (for example, Joe implemented it in Go), the
logic is non-trivial and nobody has bothered to this point. I don't think
anyon
Thanks
I just came across this https://github.com/mumrah/kafka-python/issues/112
It says:
That contract of one message per consumer group only works for the
coordinated consumers which are implemented for the JVM only (i.e., Scala and
Java clients).
-Original Message-
From: Ste
It's possible that I just haven't used it, but I am reasonably sure that the
python API doesn't have a way to store offsets in ZK. You would need to
implement something more or less compatible with what the Scala/Java API does,
presumably.
On the plus side the python API -- possibly just becaus
Hi,
I wrote a small python script to consume messages from kafka.
The consumer is defined as follows:
kafka = KafkaConsumer('my-replicated-topic',
                      metadata_broker_list=['localhost:9092'],
                      group_id='my_consumer_group',
                      auto_commi
Hi Mathias,
We're currently using a custom solution that queries kafka and zookeeper (2
different processes) for topic size and consumer offset and submits the
information to a collectd/statsd instance that ships it on to graphite, so
we can track it in grafana.
There's no alerting built in, but
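For anyone wanting to roll something similar, here is a minimal sketch of the ZooKeeper half of such a query, assuming the 0.8.x high-level consumer's offset layout of /consumers/<group>/offsets/<topic>/<partition>; the group and topic names are placeholders.

import org.apache.zookeeper.ZooKeeper;

public class ConsumerOffsetFromZk {
    public static void main(String[] args) throws Exception {
        // Null watcher: this is a one-shot synchronous read.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 10000, null);
        String path = "/consumers/my_consumer_group/offsets/my-replicated-topic/0";
        // The committed offset is stored as a decimal string in the znode data.
        long committed = Long.parseLong(new String(zk.getData(path, false, null)));
        System.out.println("committed offset for partition 0: " + committed);
        zk.close();
    }
}

Lag is then the broker's log end offset minus this committed offset; the broker side needs a separate offset request, which is what ConsumerOffsetChecker does internally.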
Hi Lance,
I tried Kafka Offset Monitor a while back, but it didn't play especially
nice with a lot of topics / partitions (we currently have around 1400
topics and 4000 partitions in total). Might be possible to make it work a
bit better, but not sure it would be the best way to do alerting.
Than
Hi Zakee,
Thanks for the logs. Can you paste earlier logs from broker-3 up to:
[2015-03-14 07:46:52,517] ERROR [ReplicaFetcherThread-2-4], Current
offset 1754769769 for partition [Topic22kv,5] out of range; reset
offset to 1400864851 (kafka.server.ReplicaFetcherThread)
That would help us figure