Anything with a table structure is probably not going to handle schemaless
data (e.g. JSON) very well without some extra help -- tables expect a schema
and arbitrary JSON doesn't carry one. As it stands today, the JDBC sink
connector will probably not handle your use case.
To send schemaless data
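(For reference, a sketch of the schema-and-payload envelope the stock
JsonConverter produces when schemas are enabled, which is the form the JDBC
sink can consume; the field names here are purely illustrative:)

value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true

and a corresponding message value:

{"schema": {"type": "struct", "name": "record", "optional": false,
            "fields": [{"field": "id", "type": "int32", "optional": false}]},
 "payload": {"id": 42}}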
FYI, DNS caching is still not fully fixed in 0.10. The ZooKeeper DNS cache has
been fixed on the ZooKeeper server side, where quorum members now refresh the
IP addresses of their peers, but the client still doesn't refresh IP addresses.
On Mon, Jan 9, 2017 at 9:56 PM, Jack Lund wrote:
> On Thu, Jan 5, 2017 at 4
Hello,
Please find my question below:
Producer (Network Zone A) -> Reverse Proxy -> Broker (Network Zone B)
The reverse proxy sits between Zone A and Zone B and binds RPIP:1234 to
broker_ip:123, where 123 is the port the broker listens on.
There is no reverse proxy set up for the ZooKeeper port, so Network
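(For anyone hitting this: after the initial bootstrap connection, clients
reconnect to whatever address the broker advertises in its metadata, so the
proxy address generally has to be the advertised one. A minimal sketch of the
relevant broker settings, assuming plaintext and reusing the placeholders
above:)

# server.properties on the broker in Zone B
listeners=PLAINTEXT://0.0.0.0:123
advertised.listeners=PLAINTEXT://RPIP:1234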
Big files are quite common in HDFS. Does a Connect task process the same file
in parallel, the way MapReduce deals with file splits? I do not think so. In
that case a Kafka Connect implementation has no advantage for reading a single
big file unless you also use MapReduce.
Sent from my iPhone
On Jan
That's great, thank you, I have it working.
One other thing I noticed: if I send a batch of data and then wait, compaction
never happens. If I send a few more messages later, then the first batch gets
compacted. I guess it needs a constant flow to trigger compaction of completed
segments. So it shows t
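(That matches how the cleaner works: the active segment is never compacted, so
nothing happens until new messages roll it. A sketch of topic settings that
force segments to roll on time, with purely illustrative values and placeholder
topic/ZooKeeper addresses:)

kafka-topics --zookeeper localhost:2181 --alter --topic my-topic \
  --config cleanup.policy=compact \
  --config segment.ms=60000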
Hello!
Is there a log4j.appender.connectAppender?
I noticed there is a log4j.appender.kafkaAppender.
I was hoping to set up the connect-log4j.properties like Kafka's.
log4j.appender.connectAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.connectAppender.DatePattern='.'yyyy-MM-dd
I'm starting to look at upgrading to 0.10.1.1, but it looks like the docs have
not been updated since 0.10.1.0.
Are there any plans to update the docs to explicitly discuss how to upgrade
from 0.10.1.0 -> 0.10.1.1, and 0.10.0.X -> 0.10.1.1?
Single app with single consumer. Pulling ~30 records / min.
When I enter 'kafka-consumer-groups ... --new-consumer --group
--describe' it always tells me "Consumer group is rebalancing".
If I enter 'kafka-consumer-offset-checker ... --topic --group' it responds
with the appropriate consumer position(s), but
FWIW - (for some distant observer):
I think my topic / consumer was too slow for the default commit interval. I
added these lines to the above config and it seems to be working ok:
// These are likely the default but I'm adding them anyway...
consumerProperties.put("enable.auto.commit", "true");
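(A sketch of the neighboring settings alluded to above, continuing the same
snippet; the values are illustrative assumptions, not recommendations:)

consumerProperties.put("auto.commit.interval.ms", "1000");
// A consumer that takes too long between poll() calls can also keep the group
// in a "rebalancing" state, so capping the per-poll batch size can help.
consumerProperties.put("max.poll.records", "50");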
Hi Jeff,
0.10.1.1 is a bugfix release on top of 0.10.1.0; we do not include any
protocol / API / functionality changes in bugfix releases, and that is why
we do not have an upgrade section for it.
We usually only have an upgrade section from 0.10.a.X to 0.10.b.X, but
probably we should re-number 0.10
Hello,
I am writing a Kafka sink connector for my product, which is a distributed
table (underneath, a distributed K-V store where a cluster of nodes holds
different partitions/buckets of a table, hash-partitioned on keys).
When I write a SinkTask, I get the SinkRecord that conta
Will,
The HDFS connector we ship today is for Kafka -> HDFS, so it isn't
reading/processing data in HDFS.
I was discussing both directions because the question was unclear. However,
there's no reason you couldn't create a connector that processes files in
splits to parallelize an HDFS -> Kafka pa
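(A sketch of what that could look like in a custom source connector: hand each
task one byte range of a large file from taskConfigs(). The config keys
"file", "split.start" and "split.end" are made up here for illustration:)

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SplitPlanner {
    // Divide one large file into roughly equal byte ranges, one per task.
    public static List<Map<String, String>> splitConfigs(String path, long fileLength, int maxTasks) {
        List<Map<String, String>> configs = new ArrayList<>();
        long chunk = (fileLength + maxTasks - 1) / maxTasks;
        for (long start = 0; start < fileLength; start += chunk) {
            Map<String, String> cfg = new HashMap<>();
            cfg.put("file", path);
            cfg.put("split.start", Long.toString(start));
            cfg.put("split.end", Long.toString(Math.min(start + chunk, fileLength)));
            configs.add(cfg);
        }
        return configs;
    }
}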
This may be a bit use-case dependent, but I think simply using the key
from the Kafka record as the KV key is a good start.
Another option is to use topic-partition-offset as the key. This has
the benefit of removing duplicates, but it also means that keys are no
longer meaningful for applications
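(A sketch of those two keying options as a helper that a SinkTask's put()
could call; the boolean switch is just for illustration:)

import org.apache.kafka.connect.sink.SinkRecord;

public class KeyChoice {
    public static Object storeKey(SinkRecord record, boolean useRecordKey) {
        if (useRecordKey) {
            // Option 1: reuse the Kafka record key, so applications can look
            // values up in the K-V store by the same key they produced with.
            return record.key();
        }
        // Option 2: topic-partition-offset, which makes redelivered records
        // overwrite themselves (deduplication) but is opaque to applications.
        return record.topic() + "-" + record.kafkaPartition() + "-" + record.kafkaOffset();
    }
}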
Ewen: I think he was looking for exactly what you guessed he wasn't: "My goal
is to pipe that json document in a postgres table
that has two columns: id and json."
Postgres has some nice built-in functions that make this actually
useful and not as nuts as it may appear.
As Ewen mentioned,
Hi.
I am developing a simple log counting application using Kafka Streams 0.10.1.1.
Its implementation is almost the same as the WordCountProcessor in the
Confluent documentation
[http://docs.confluent.io/3.1.1/streams/developer-guide.html#processor-api].
I am using an in-memory state store;
its key is
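(For context, a sketch of the in-memory store setup being described, roughly
following the WordCountProcessor example in the linked doc; the store name
"Counts" is just a placeholder:)

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.processor.StateStoreSupplier;
import org.apache.kafka.streams.state.Stores;

public class CountStoreConfig {
    // Built once and added to the topology with builder.addStateStore(...);
    // the processor then fetches it in init() via context.getStateStore("Counts").
    public static final StateStoreSupplier COUNT_STORE = Stores.create("Counts")
            .withKeys(Serdes.String())
            .withValues(Serdes.Long())
            .inMemory()
            .build();
}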
Is your goal to simply log Connect to a file rather than to the console?
In that case your configuration is almost right. Just change the first
line in connect-log4j.properties to:
log4j.rootLogger=INFO, stdout, connectAppender
and then add the lines you have in your email.
Or you can get rid of s
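(Pulling the pieces together, a sketch of a connect-log4j.properties that logs
to a daily rolling file; the log file path is a placeholder:)

log4j.rootLogger=INFO, stdout, connectAppender

log4j.appender.connectAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.connectAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.connectAppender.File=/var/log/kafka/connect.log
log4j.appender.connectAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.connectAppender.layout.ConversionPattern=[%d] %p %m (%c)%n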
btw. It bugs me a bit that Connect logs to console and not to file by
default. I think tools should log to console, but Connect is more of a
service / daemon and should log to file like the brokers do. So when
you get your log4j config to work, perhaps submit a PR to Apache Kafka
so we'll all enjoy