Since these tools are so useful, I wonder what it would take (from both
Airbnb and Kafka) to merge this into the Kafka project. I think there are
a couple of JIRAs regarding improved tool usability that this would resolve.
On Mon, Sep 15, 2014 at 11:45 AM, Alexis Midon
wrote:
> distribution will be even based
For Fluffka, I created a wrapping function:
IterStatus timedHasNext() {
    try {
        long startTime = System.currentTimeMillis();
        it.hasNext();
        long endTime = System.currentTimeMillis();
        return new IterStatus(true, endTime - startTime);
    } catch (ConsumerTimeoutException e) {
        // hasNext() timed out (consumer.timeout.ms elapsed with no new message)
        return new IterStatus(false, -1);
    }
}
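IterStatus itself isn't shown in the snippet; a minimal sketch of such a holder class (field names here are illustrative, not from the original mail):

class IterStatus {
    final boolean hasData;     // true if hasNext() returned, false if it timed out
    final long millisWaited;   // how long the call blocked, in ms (-1 on timeout)
    IterStatus(boolean hasData, long millisWaited) {
        this.hasData = hasData;
        this.millisWaited = millisWaited;
    }
}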
Just to update (better late than never!):
The Kafka source & sink for Flume were updated to the latest Kafka version
and improved a bit (offsets are now committed after data is written to
the Flume channel).
If you build Flume from trunk, you'll get these.
Gwen
On Sun, Aug 3, 2014 at 10:31 AM, Andrew Ehr
Using the high-level consumer, and assuming you already created an iterator:
while (msgCount < maxMessages && it.hasNext()) {
    bytes = it.next().message();
    eventList.add(bytes);
}
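If you don't want hasNext() to block forever while waiting for new messages, you can set consumer.timeout.ms so it throws ConsumerTimeoutException instead. A minimal sketch of the consumer setup, assuming the 0.8 high-level consumer API (group name and timeout value are illustrative):

// uses kafka.consumer.Consumer, kafka.consumer.ConsumerConfig,
// kafka.javaapi.consumer.ConsumerConnector
Properties props = new Properties();
props.put("zookeeper.connect", "localhost:2181");
props.put("group.id", "my-group");
props.put("consumer.timeout.ms", "5000"); // hasNext() throws ConsumerTimeoutException after 5s with no data
ConsumerConnector connector =
    Consumer.createJavaConsumerConnector(new ConsumerConfig(props));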
(See a complete example here:
https://github.com/apache/flume/blob/trunk/flume-ng-sources/flume-kafka-source/src/main/jav
Take a look at ConsumerOffsetChecker. It does just that: it prints the
offset and lag for each consumer and partition.
You can either use that class directly, or use it as a guideline for
your own implementation.
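ConsumerOffsetChecker is a Scala object with a main method, so you can also invoke it programmatically; a rough sketch (the ZooKeeper address and group name are placeholders):

// requires kafka and its dependencies on the classpath
kafka.tools.ConsumerOffsetChecker.main(new String[] {
    "--zookeeper", "localhost:2181",
    "--group", "my-group"
});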
On Wed, Oct 1, 2014 at 2:10 AM, Shlomi Hazan wrote:
> Hi,
> How can I programmatically get t
Do we have a JIRA to support removal of dead brokers without having to
start a new broker with the same id?
I think it's something we'll want to allow.
On Thu, Oct 2, 2014 at 7:45 AM, Jun Rao wrote:
> The reassign partition process only completes after the new replicas are
> fully caught up and t
I'm using Hue's ZooKeeper app: http://gethue.com/new-zookeeper-browser-app/
This UI looks very cute, but I haven't tried it yet:
https://github.com/claudemamo/kafka-web-console
Gwen
On Tue, Oct 7, 2014 at 5:08 PM, Shafaq wrote:
> We are going to deploy Kafka in Production and also monitor it via c
Can you check that you can connect on port 9092 from the producer to the
broker? (Check with telnet or something similar.)
Ping may succeed even when a port is blocked.
On Wed, Oct 8, 2014 at 9:40 AM, ravi singh wrote:
> Even though I am able to ping to the broker machine from my producer
> machine , the produ
If you use the high level consumer implementation, and register all
consumers as part of the same group - they will load-balance
automatically.
When you add a consumer to the group, if there are enough partitions
in the topic, some of the partitions will be assigned to the new
consumer.
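A minimal sketch of what "registering consumers as part of the same group" looks like with the 0.8 high-level consumer (all names are illustrative):

// uses kafka.consumer.Consumer, kafka.consumer.ConsumerConfig,
// kafka.javaapi.consumer.ConsumerConnector
Properties props = new Properties();
props.put("zookeeper.connect", "localhost:2181");
props.put("group.id", "shared-group"); // same group.id => consumers share the topic's partitions
ConsumerConnector c1 = Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
ConsumerConnector c2 = Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
// starting c2 triggers a rebalance: some partitions move from c1 to c2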
When a con
harninder
>
>
> On Wed, Oct 8, 2014 at 11:35 PM, Gwen Shapira wrote:
>
>> If you use the high level consumer implementation, and register all
>> consumers as part of the same group - they will load-balance
>> automatically.
>>
>> When you add a consumer to
Isn't Kafka the best thing ever? :)
Gwen
On Wed, Oct 8, 2014 at 11:23 AM, Gwen Shapira wrote:
> yep. exactly.
>
> On Wed, Oct 8, 2014 at 11:07 AM, Sharninder wrote:
>> Thanks Gwen.
>>
>> When you're saying that I can add consumers to the same g
The problem with Kafka is that we never know when a consumer is
"truly" inactive.
But - if you decide to define inactive as a consumer whose last offset
is lower than anything available on the log (or perhaps lagging by
over X messages?), it's fairly easy to write a script to detect and
clean them di
Out of curiosity: did you choose Redis because ZooKeeper is not well
supported in Clojure? Or were there other reasons?
On Mon, Oct 13, 2014 at 2:04 PM, Gerrit Jansen van Vuuren
wrote:
> Hi Steven,
>
> Redis:
>
> I've had a discussion on redis today, and one architecture that does come
> up is
ack = 2 *will* throw an exception when there's only one node in ISR.
The problem with ack=2 is that if you have 3 replicas and you got acks
from 2 of them, the one replica that did not get the message can
still be in ISR and get elected as leader, leading to loss of the
message. If you specify
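For reference, a minimal sketch of how the acks setting is passed to the 0.8 producer (broker list and serializer are placeholder values); -1 means "wait for all replicas currently in ISR":

// uses kafka.javaapi.producer.Producer and kafka.producer.ProducerConfig
Properties props = new Properties();
props.put("metadata.broker.list", "broker1:9092,broker2:9092");
props.put("serializer.class", "kafka.serializer.StringEncoder");
props.put("request.required.acks", "-1"); // safer than 2: waits for every in-sync replica
Producer<String, String> producer =
    new Producer<String, String>(new ProducerConfig(props));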
Just note that this is not a universal solution. Many use-cases care
about which partition you end up writing to since partitions are used
to... well, partition logical entities such as customers and users.
On Wed, Oct 15, 2014 at 9:03 PM, Jun Rao wrote:
> Kyle,
>
> What you wanted is not supp
I assume the messages themselves contain the timestamp?
If you use Flume, you can configure a Kafka source to pull data from
Kafka, use an interceptor to pull the date out of your message and
place it in the event header and then the HDFS sink can write to a
partition based on the timestamp.
Gwen
If you have "auto.create.topics.enable" set to "true" (the default),
producing to a topic creates it.
It's a bit tricky because the "send" that creates the topic can fail
with "leader not found" or a similar issue. Retrying a few times will
eventually succeed as the topic gets created and the leader gets
e
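A sketch of such a retry loop around the 0.8 producer (the retry count and sleep are arbitrary; FailedToSendMessageException is what the producer throws once its own internal retries are exhausted):

// uses kafka.javaapi.producer.Producer, kafka.producer.KeyedMessage,
// kafka.common.FailedToSendMessageException
static void sendWithRetry(Producer<String, String> producer,
                          KeyedMessage<String, String> msg) throws InterruptedException {
    int attempts = 0;
    while (true) {
        try {
            producer.send(msg); // the first send may fail while the topic/leader is being created
            return;
        } catch (FailedToSendMessageException e) {
            if (++attempts >= 5) throw e;
            Thread.sleep(1000); // give the broker time to create the topic and elect a leader
        }
    }
}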
The 0.8.1.1 producer is sync by default, and you can set producer.type to
async if needed.
On Fri, Oct 17, 2014 at 2:57 PM, Mohit Anchlia wrote:
> Thanks! How can I tell if I am using async producer? I thought all the
> sends are async in nature
> On Fri, Oct 17, 2014 at 11:44 AM, Gwe
BTW, I have a blog post where I show how I work around the blocking
hasNext() issue.
It may be helpful:
http://ingest.tips/2014/10/12/kafka-high-level-consumer-frequently-missing-pieces/
On Thu, Oct 16, 2014 at 12:52 PM, Neha Narkhede wrote:
> Josh,
>
> The consumer's API doesn't allow you to specify
>> > request.required.acks=0,
>> > I thought this sets the producer to be async?
>> > On Fri, Oct 17, 2014 at 11:59 AM, Gwen Shapira
>> > wrote:
>> >> 0.8.1.1 producer is Sync by default, and you can set producer.type to
>> >> async if need
ives the message. And async means it just dispatches the message
> without any gurantees that message is delivered. Did I get that part right?
> On Fri, Oct 17, 2014 at 1:28 PM, Gwen Shapira wrote:
>
>> Sorry if I'm confusing you :)
>>
>> Kafka 0.8.1.1 has two
s but is there a
> place that lists some important performance specific parameters?
> On Fri, Oct 17, 2014 at 2:43 PM, Gwen Shapira wrote:
>
>> If I understand correctly (and I'll be happy if someone who knows more
>> will jump in and correct me):
>>
>> The Sync/
3-replica topic in a
> 12-node Kafka cluster, there's a relatively high probability that losing 2
> nodes from this cluster will result in an inability to write to the cluster.
>
> On Tue, Oct 14, 2014 at 4:50 PM, Gwen Shapira wrote:
>
>> ack = 2 *will* throw an excepti
Consumers always read from the leader replica, which is always in sync
by definition. So you are good there.
The concern would be if the leader crashes during this period.
On Tue, Oct 21, 2014 at 2:56 PM, Neil Harkins wrote:
> Hi. I've got a 5 node cluster running Kafka 0.8.1,
> with 4697 parti
Anything missing in the output of:
kafka-topics.sh --describe --zookeeper localhost:2181
?
On Tue, Oct 21, 2014 at 4:29 PM, Jonathan Creasy
wrote:
> I'd like to be able to see a little more detail for a topic.
>
> What is the best way to get this information?
>
> Topic Partition Replica B
RAID-10?
Interesting choice for a system where the data is already replicated
between nodes. Is it to avoid the cost of large replication over the
network? How large are these disks?
On Wed, Oct 22, 2014 at 10:00 AM, Todd Palino wrote:
> In fact there are many more than 4000 open files. Many of o
n
>
>
> On Oct 22, 2014, at 11:01 AM, Gwen Shapira wrote:
>
>> RAID-10?
>> Interesting choice for a system where the data is already replicated
>> between nodes. Is it to avoid the cost of large replication over the
>> network? how large are these disks?
>>
While I agree with Mark that testing the end-to-end pipeline is
critical, note that in terms of performance - whatever you write to
hook-up Teradata to Kafka is unlikely to be as fast as Teradata
connector for Sqoop (especially the newer one). Quite a lot of
optimization by Teradata engineers went
Todd,
Did you load-test using SSDs?
Got numbers to share?
On Fri, Oct 24, 2014 at 10:40 AM, Todd Palino wrote:
> Hmm, I haven't read the design doc lately, but I'm surprised that there's
> even a discussion of sequential disk access. I suppose for small subsets of
> the writes you can write larg
Note that --zookeeper is the location of the ZooKeeper server, not the Kafka broker.
Are you running ZooKeeper on both 192.168.100.91 and 192.168.100.92?
ZooKeeper is based on simple majority, so you can't run it with
2 nodes (well, you can, but it will freeze if you lose one node); you
need ei
High level consumer commits before shutting down.
If you'll look at ZookeeperConsumerConnector.scala (currently the only
implementation of ConsumerConnector) you'll see shutdown() includes
the following:
if (config.autoCommitEnable)
commitOffsets()
Gwen
On Tue, Oct 28, 201
The producer configuration should list the Kafka brokers, not the ZooKeeper
quorum.
See here: http://kafka.apache.org/documentation.html#producerconfigs
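A minimal sketch of the point, assuming the 0.8 producer (hosts are placeholders) - note it's the broker list, not zookeeper.connect:

// uses kafka.producer.ProducerConfig
Properties props = new Properties();
props.put("metadata.broker.list", "broker1:9092,broker2:9092"); // Kafka brokers, not ZooKeeper
props.put("serializer.class", "kafka.serializer.StringEncoder");
ProducerConfig config = new ProducerConfig(props);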
(and send my regards to Alex Gorbachev ;)
Gwen
On Fri, Oct 31, 2014 at 8:05 PM, Tomas Nunez wrote:
> Hi
>
> I'm trying to upgrade a 0.7 kaf
This is part of Scala, so it should be in the scala-library-...jar
On Fri, Oct 31, 2014 at 8:26 PM, Tomas Nunez wrote:
> Well... I used strace and I found it was looking for some classes in a
> wrong path. I fixed most of them, but there's one that isn't anywhere,
> neither the new nor the old
I'm following
> https://cwiki.apache.org/confluence/display/KAFKA/Migrating+from+0.7+to+0.8
> and I can't see there anything about downloading classes, and I don't find
> much people with the same problem, which leads me to think that I'm doing
> something wrong...
>
Not sure about the throughput, but:
"I mean that the words counted in spark should grow up" - the Spark
word-count example doesn't accumulate.
It gets an RDD every n seconds and counts the words in that RDD, so we
don't expect the count to go up.
On Mon, Nov 3, 2014 at 6:57 AM, Eduardo Costa Al
+1
That's what we use to generate broker ids in automatic deployments.
This method makes troubleshooting easier (you know where each broker is
running) and doesn't require keeping extra files around.
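One illustrative way to derive broker.id from the host IP at deployment time (this is just a sketch of the idea, not a standard recipe):

static int brokerIdFromIp() throws java.net.UnknownHostException {
    String ip = java.net.InetAddress.getLocalHost().getHostAddress(); // e.g. "10.0.3.42"
    String[] octets = ip.split("\\.");
    // last two octets keep ids unique within a /16 and let you map an id back to a host
    return Integer.parseInt(octets[2]) * 256 + Integer.parseInt(octets[3]);
}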
On Mon, Nov 3, 2014 at 2:17 PM, Joe Stein wrote:
> Most folks strip the IP and use that as the br
lientCnxn$AuthData.class
> org/apache/zookeeper/ClientCnxn$EndOfStreamException.class
> org/apache/zookeeper/ClientCnxn$EventThread.class
> org/apache/zookeeper/ClientCnxn$Packet.class
> org/apache/zookeeper/ClientCnxn$SendThread.class
> org/apache/zookeeper/ClientCnxn$SessionExpiredEx
Regarding more information:
Maybe ltrace?
If I were you, I'd go to the MigrationTool code and start adding LOG lines,
because there aren't enough of those to troubleshoot.
On Wed, Nov 5, 2014 at 6:13 PM, Gwen Shapira wrote:
> org.apache.zookeeper.ClientCnxn is throwing the exceptio
Also, can you post your configs? Especially the "zookeeper.connect" one?
On Wed, Nov 5, 2014 at 6:15 PM, Gwen Shapira wrote:
> Regarding more information:
> Maybe ltrace?
>
> If I were you, I'd go to MigrationTool code and start adding LOG lines.
> because t
What's the window size? If the window is around 10 seconds and you are
sending data at very stable rate, this is expected.
On Thu, Nov 6, 2014 at 9:32 AM, Eduardo Costa Alfaia wrote:
> Hi Guys,
>
> I am doing some tests with Spark Streaming and Kafka, but I have seen
> something strange, I hav
+1 for dropping Java 6
On Thu, Nov 6, 2014 at 9:31 AM, Steven Schlansker wrote:
> Java 6 has been End of Life since Feb 2013.
> Java 7 (and 8, but unfortunately that's too new still) has very compelling
> features which can make development a lot easier.
>
> The sooner more projects drop Java 6
Java 6 is supported on CDH4 but not CDH5.
On Thu, Nov 6, 2014 at 9:54 AM, Koert Kuipers wrote:
> when is java 6 dropped by the hadoop distros?
>
> i am still aware of many clusters that are java 6 only at the moment.
>
>
>
> On Thu, Nov 6, 2014 at 12:44 PM, Gwen Shapira
Done!
Thank you for using Kafka and letting us know :)
On Sat, Nov 8, 2014 at 2:15 AM, vipul jhawar wrote:
> Exponential @exponentialinc is using kafka in production to power the
> events ingestion pipeline for real time analytics and log feed consumption.
>
> Please post on powered by kafka wi
Updated. Thanks!
On Sat, Nov 8, 2014 at 12:16 PM, Jimmy John wrote:
> Livefyre (http://web.livefyre.com/) uses kafka for the real time
> notifications, analytics pipeline and as the primary mechanism for general
> pub/sub.
>
> thx...
> jim
>
> On Sat, Nov 8, 2014
The producer code here looks fine. It may be an issue with the consumer, or
with how the consumer is used.
If you are running the producer before starting a consumer, make sure you
get all messages by setting auto.offset.reset=smallest (in the console
consumer you can use --from-beginning)
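For the high-level consumer, the equivalent of --from-beginning is setting auto.offset.reset before the group has committed any offsets; a sketch (group name and addresses are illustrative):

// uses kafka.consumer.Consumer, kafka.consumer.ConsumerConfig,
// kafka.javaapi.consumer.ConsumerConnector
Properties props = new Properties();
props.put("zookeeper.connect", "localhost:2181");
props.put("group.id", "fresh-group");        // a group with no committed offsets yet
props.put("auto.offset.reset", "smallest");  // start from the earliest available message
ConsumerConnector connector =
    Consumer.createJavaConsumerConnector(new ConsumerConfig(props));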
Also, you c
I'm not Jay, but fixed it anyways ;)
Gwen
On Sun, Nov 9, 2014 at 10:34 AM, vipul jhawar
wrote:
> Hi Jay
>
> Thanks for posting the update.
>
> However, i checked the page history and the hyperlink is pointing to the
> wrong domain.
> Exponential refers to www.exponential.com. I sent the twitter
In Sqoop we do the following:
Maven runs a shell script, passing the version as a parameter.
The shell script generates a small Java class, which is then built with a
Maven plugin.
Our code references this generated class when we expose "getVersion()".
It's complex and ugly, so I'm kind of hoping
T 2011
> version=10.0.1
> groupId=com.google.guava
> artifactId=guava
>
> Thanks,
>
> Bhavesh
>
> On Tue, Nov 11, 2014 at 10:34 AM, Gwen Shapira
> wrote:
>
> > In Sqoop we do the following:
> >
> > Maven runs a shell script, passing the version
Perhaps relevant:
Hadoop is moving toward dropping Java6 in next release.
https://issues.apache.org/jira/browse/HADOOP-10530
On Thu, Nov 6, 2014 at 11:03 AM, Jay Kreps wrote:
> Yeah it is a little bit silly that people are still using Java 6.
>
> I guess this is a tradeoff--being more conserva
Nope.
Here's the JIRA where we are still actively working on security, targeting
0.9:
https://issues.apache.org/jira/browse/KAFKA-1682
Gwen
On Tue, Nov 11, 2014 at 7:37 PM, Kashyap Mhaisekar
wrote:
> Hi,
> Is there a way to secure the topics created in Kafka 0.8.2 beta? The need
> is to ensure
, Nov 12, 2014 at 9:09 AM, Mark Roberts wrote:
> Just to be clear: this is going to be exposed via some Api the clients can
> call at startup?
>
>
> > On Nov 12, 2014, at 08:59, Guozhang Wang wrote:
> >
> > Sounds great, +1 on this.
> >
> >> On T
Actually, Jun suggested exposing this via JMX.
On Wed, Nov 12, 2014 at 9:31 AM, Gwen Shapira wrote:
> Good question.
>
> The server will need to expose this in the protocol, so Kafka clients will
> know what they are talking to.
>
> We may also want to expose this in the pro
every server in the cluster? Is there a reason
>> not to include this in the API itself?
>>
>> -Mark
>>
>> On Wed, Nov 12, 2014 at 9:50 AM, Joel Koshy wrote:
>>
>> > +1 on the JMX + gradle properties. Is there any (seamless) way of
>> > includ
I think the issue is that you are:
" running the above snippet for every broker ... I am assuming that
item.partitionsMetadata() only returns PartitionMetadata for the partitions
this broker is responsible for "
This is inaccurate. Each broker will check ZooKeeper for PartitionMetadata
and return
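In other words, a TopicMetadataRequest sent to any single broker should return metadata for all partitions of the topic. A sketch using the 0.8 SimpleConsumer (host, port and topic are placeholders):

// uses kafka.javaapi.consumer.SimpleConsumer and the kafka.javaapi metadata classes
SimpleConsumer consumer =
    new SimpleConsumer("broker1", 9092, 100000, 64 * 1024, "metadata-lookup");
TopicMetadataRequest request =
    new TopicMetadataRequest(Collections.singletonList("my-topic"));
TopicMetadataResponse response = consumer.send(request);
for (TopicMetadata topic : response.topicsMetadata()) {
    for (PartitionMetadata partition : topic.partitionsMetadata()) {
        System.out.println("partition " + partition.partitionId()
            + " leader " + (partition.leader() == null ? "none" : partition.leader().host()));
    }
}
consumer.close();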
Hi Casey,
1. There's some limit based on the size of ZooKeeper nodes; not sure exactly
where it is, though. We've seen 30-node clusters running in production.
2. For your scenario to work, the new broker will need to have the same
broker id as the old one - or you'll need to manually re-assign partiti
I don't see any advantage to more than one broker per server. In my
experience a single broker is capable of saturating the network link,
so I can't see how a second or third broker would give any
benefit.
Gwen
On Fri, Nov 28, 2014 at 9:24 AM, Sa Li wrote:
> Dear all
>
> I am provisio
Can you elaborate a bit on what an object API wrapper will look like?
Since the serialization API already exists today, it's very easy to
know how I'll use the new producer with serialization - exactly the
same way I use the existing one.
If we are proposing a change that will require significant c
If you write to a non-leader partition, I'd expect you'd get
NotLeaderForPartitionException (thrown by
Partition.appendMessagesToLeader).
This will get sent to the producer as error code 6.
I don't see anything special on the producer side to handle this
specifically (although I'd expect a forced meta
I think that A will not be able to become a follower until B becomes a leader.
On Sun, Dec 7, 2014 at 11:07 AM, Xiaoyu Wang wrote:
> On preferred replica election, controller sends LeaderAndIsr requests to
> brokers. Broker will handle the LeaderAndIsr request by either become a
> leader or becom
It looks like none of your replicas are in-sync. Did you enable unclean
leader election?
This will allow one of the un-synced replicas to become leader, leading to
data loss but maintaining availability of the topic.
Gwen
On Tue, Dec 9, 2014 at 8:43 AM, Neil Harkins wrote:
> Hi. We've suffered
There is a parameter called replica.fetch.max.bytes that controls the
size of the message buffer a broker will attempt to consume at once.
It defaults to 1MB, and has to be at least message.max.bytes (so at
least one message can be sent).
If you try to support really large messages and increase t
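The two settings need to move together; an illustrative server.properties fragment (10MB is an arbitrary example value):

# server.properties
# largest message the broker will accept (~10MB):
message.max.bytes=10485760
# must be >= message.max.bytes so replicas can still fetch large messages:
replica.fetch.max.bytes=10485760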
to 10MB to
> allow larger messages, so perhaps that's related. But should that really be
> big enough to cause OOMs on an 8GB heap? Are there other broker settings we
> can tune to avoid this issue?
>
> On Wed, Dec 10, 2014 at 11:05 AM, Gwen Shapira
> wrote:
>
>
Ah, found where we actually size the request as partitions * fetch size.
Thanks for the correction, Jay and sorry for the mix-up, Solon.
On Wed, Dec 10, 2014 at 10:41 AM, Jay Kreps wrote:
> Hey Solon,
>
> The 10MB size is per-partition. The rationale for this is that the fetch
> size per-partiti
didn't realize we needed to
> take the (partitions * fetch size) calculation into account when choosing
> partition counts for our topics, so this is a bit of a rude surprise.
>
> On Wed, Dec 10, 2014 at 3:50 PM, Gwen Shapira wrote:
>
>> Ah, found where we actually size the re
When you send messages to Kafka you send a <key, value> pair. The key
can include the user id.
Here's how (the constructor is KeyedMessage(topic, key, message)):
KeyedMessage<String, String> data =
    new KeyedMessage<String, String>(topic, user_id, event);
producer.send(data);
Hope this helps,
Gwen
On Mon, Dec 15, 2014 at 10:29 AM, Harold Nguyen wrote:
> Hello Kafka Experts!
>
>
many different keys can Kafka
> support ?
>
> Harold
>
> On Mon, Dec 15, 2014 at 10:46 AM, Gwen Shapira
> wrote:
>>
>> When you send messages to Kafka you send a pair. The key
>> can include the user id.
>>
>> Here's how:
>
Currently you can find the number of consumer groups through ZooKeeper:
connect to ZK and run
ls /consumers
and count the number of results
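If you want this from code rather than the ZK shell, a small sketch using ZkClient (the same client library Kafka ships with; the address is a placeholder):

// uses org.I0Itec.zkclient.ZkClient
ZkClient zkClient = new ZkClient("localhost:2181");
java.util.List<String> groups = zkClient.getChildren("/consumers");
System.out.println("consumer groups: " + groups.size());
zkClient.close();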
On Mon, Dec 15, 2014 at 11:34 AM, nitin sharma
wrote:
> Hi Team,
>
> Is it possible to know how many Consumer Group connected to kafka broker Ids
> and as
connect to a zookeeper..?
>
> Regards,
> Nitin Kumar Sharma.
>
>
> On Mon, Dec 15, 2014 at 6:36 PM, Neha Narkhede wrote:
>>
>> In addition to Gwen's suggestion, we actually don't have jmx metrics that
>> give you a list of actively consuming processes.
>
" If all
the consumers stop listening how long will Kafka continue to store messages
for that group?"
Kafka retains data for a set amount of time, regardless of whether
anyone is listening or not. This amount of time is configurable.
Because Kafka performance is generally constant with the amount of
2:33 PM, Greg Lloyd wrote:
> Thanks for the reply,
>
> So if I wanted to add a new group of consumers 6 months into the lifespan
> of my implementation and I didn't want that new group to process all the
> last six months is there a method to manage this?
>
>
>
>
Looks like you can't connect to: 10.100.98.100:9092
I'd validate that this is the issue using telnet and then check the
firewall / ipfilters settings.
On Thu, Dec 18, 2014 at 2:21 PM, Sa Li wrote:
> Dear all
>
> We just build a kafka production cluster, I can create topics in kafka
> production
8:9092.
>
> Just in case, is it possibly caused by other types of issues?
>
> thanks
>
> Alec
>
> On Thu, Dec 18, 2014 at 2:33 PM, Gwen Shapira wrote:
>>
>> Looks like you can't connect to: 10.100.98.100:9092
>>
>> I'd validate that this i
Hi,
LogManager.nextLogDir() has the logic for choosing which directory to use.
The documentation of the method says:
/**
 * Choose the next directory in which to create a log. Currently this is done
 * by calculating the number of partitions in each directory and then choosing
 * the data directory with the fewest partitions.
 */
IMO:
KAFKA-1790 - can be pushed out (or even marked as "won't fix")
KAFKA-1782 - can be pushed out (not really a blocker)
The rest look like actual blockers to me.
Gwen
On Tue, Dec 23, 2014 at 1:32 PM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:
> Hi,
>
> I see 16 open issues for 0.8.
Actually, KAFKA-1785 <https://issues.apache.org/jira/browse/KAFKA-1785> can
also wait - since it is likely to be part of a larger patch.
On Thu, Dec 25, 2014 at 10:39 AM, Gwen Shapira
wrote:
> IMO:
> KAFKA-1790 - can be pushed out (or even marked as "won't fix")
>
OffsetCommitRequest has two constructors now:
For version 0:
OffsetCommitRequest(String groupId, Map offsetData)
And version 1:
OffsetCommitRequest(String groupId, int generationId, String consumerId, Map offsetData)
None of them seem to require timestamps... so I'm not sure where you
see that
Ah, I see :)
The readFrom function basically tries to read two extra fields if you
are on version 1:
if (versionId == 1) {
  groupGenerationId = buffer.getInt
  consumerId = readShortString(buffer)
}
The rest looks identical in version 0 and 1, and still no timestamp in sight...
Gwe
>
>
> Dana Powers
> Rdio, Inc.
> dana.pow...@rd.io
> rdio.com/people/dpkp/
>
> On Mon, Jan 5, 2015 at 9:49 AM, Gwen Shapira wrote:
>
>> Ah, I see :)
>>
>> The readFrom function basically tries to read two extra fields if you
>> are on version 1:
>
The Apache process is that you vote on an RC, and if the vote passes
(i.e. three +1 votes from the PMC and no -1) the same artifacts will be
released (without the RC suffix).
If issues are discovered, there may be another RC.
Note that the RC is published in Jun's directory, not an official
Kafka repository.
You can
At the moment, the best way would be:
* Wait about two weeks
* Upgrade to 0.8.2
* Use kafka-topics.sh --delete
:)
2015-01-14 9:26 GMT-08:00 Armando Martinez Briones :
> Hi.
>
> What is the best way to delete a topic into production environment?
>
> --
> [image: Tralix][image: 1]José Armando Martí
From: Armando Martinez Briones
> To: users@kafka.apache.org
> Sent: Wednesday, January 14, 2015 11:33 AM
> Subject: Re: Delete topic
>
> thanks Gwen Shapira ;)
>
> El 14 de enero de 2015, 11:31, Gwen Shapira
> escribió:
>
>> At the moment, the best way would be
You may find this article useful for troubleshooting and modifying TIME_WAIT:
http://www.linuxbrigade.com/reduce-time_wait-socket-connections/
The line you have for increasing the file limit is fine, but you may also
need to increase the limit system-wide:
insert "fs.file-max = 10" in /etc/sysctl.
It would make sense to enable it after we have the authorization feature and
admins can control who can delete what.
On Thu, Jan 15, 2015 at 6:32 PM, Jun Rao wrote:
> Yes, I agree it's probably better not to enable "delete.topic.enable" by
> default.
>
> Thanks,
>
> Jun
>
> On Thu, Jan 15, 2015 at 6:29
Those errors are expected - if broker 10.0.0.11 went down, it will
reset the connection and the other broker will close the socket.
However, it looks like 10.0.0.11 crashes every two minutes?
Do you have the logs from 10.0.0.11?
On Thu, Jan 15, 2015 at 9:51 PM, Tousif wrote:
> i'm using kafka 2.
Two things:
1. The OOM happened on the consumer, right? So the memory that matters
is the RAM on the consumer machine, not on the Kafka cluster nodes.
2. If the consumers belong to the same consumer group, each will
consume a subset of the partitions and will only need to allocate
memory for those
Hi,
As a former DBA, I hear you on backups :)
Technically, you could copy all log.dir files somewhere safe
occasionally. I'm pretty sure we don't guarantee the consistency or
safety of this copy. You could find yourself with a corrupt "backup"
by copying files that are either in the middle of get
ld one use ZFS or BTRFS snapshot functionality for this?
>
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Tue, Jan 20, 2015 at 1:39 AM, Gwen Shapira wrote:
&g
the console-consumer every once in a while. Although note that the "full"
> is constrained by the retention period of the data (controlled at the
> queue/cluster level).
> From: Gwen Shapira
> To: "users@kafka.apache.org"
> Sent: Tuesday, January 20, 20
It sounds like you have two zookeepers, one for HDP and one for Kafka.
Did you move Kafka from one zookeeper to another?
Perhaps Kafka finds the topics (logs) on disk, but they do not exist
in ZK because you are using a different zookeeper now.
Gwen
On Thu, Jan 22, 2015 at 6:38 PM, Jun Rao wrot
Also, do you have delete.topic.enable=true on all brokers?
The automatic topic creation can fail if the default number of
replicas is greater than the number of available brokers. Check the
default.replication.factor parameter.
Gwen
On Tue, Jan 27, 2015 at 12:29 AM, Joel Koshy wrote:
> Which versio
It sounds like you are describing Flume, with a SpoolingDirectory source
(or an exec source running tail) and a Kafka channel.
On Wed, Jan 28, 2015 at 10:39 AM, Fernando O. wrote:
> Hi all,
> I'm evaluating using Kafka.
>
> I liked this thing of Facebook scribe that you log to your own machine and
>
IIRC, the directory is only created after you send data to the topic.
Do you get errors when your producer sends data?
Another common issue is that you specify replication-factor 3 when you
have fewer than 3 brokers.
Gwen
On Mon, Feb 2, 2015 at 2:34 AM, Xinyi Su wrote:
> Hi,
>
> I am using Kaf
If you want to emulate the old sync producer behavior, you need to set
the batch size to 1 (in the producer config) and wait on the future you
get from send() (i.e. future.get()).
I can't think of good reasons to do so, though.
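A sketch of that with the new (org.apache.kafka.clients.producer) producer - serializers, hosts and topic are placeholders, and this assumes a method that lets the checked exceptions from get() propagate. Blocking on the future is what makes the send effectively synchronous:

Properties props = new Properties();
props.put("bootstrap.servers", "broker1:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("batch.size", "1"); // effectively disables batching, per the suggestion above
KafkaProducer<String, String> producer = new KafkaProducer<String, String>(props);
Future<RecordMetadata> future =
    producer.send(new ProducerRecord<String, String>("my-topic", "key", "value"));
future.get(); // block until the broker acknowledges the write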
Gwen
On Mon, Feb 2, 2015 at 11:08 AM, Otis Gospodnetic
wrote:
> Hi,
>
> Is
gress into a single
>> request--giving a kind of "group commit" effect.
>>
>> The hope is that this will be both simpler to understand (a single api that
>> always works the same) and more powerful (you always get a response with
>> error and offset informatio
he Producer was
> using SYNC mode?" is YES, in which case the connection from X to Y would be
> open for just as long as with a SYNC producer running in Y?
>
> Thanks,
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elast
whole batch. This significantly complicates recovery logic
> where we need to commit a batch as opposed 1 record at a time.
>
> Do you guys have any plans to add better semantics around batches?
>
> On Mon, Feb 2, 2015 at 1:34 PM, Gwen Shapira wrote:
>
>> If I understood the
When's the party?
:)
On Mon, Feb 2, 2015 at 8:13 PM, Jay Kreps wrote:
> Yay!
>
> -Jay
>
> On Mon, Feb 2, 2015 at 2:23 PM, Neha Narkhede wrote:
>>
>> Great! Thanks Jun for helping with the release and everyone involved for
>> your contributions.
>>
>> On Mon, Feb 2, 2015 at 1:32 PM, Joe Stein wr
ription you are saying you actually
> > > care
> > > > how many physical requests are issued. I think it is more like it is
> > just
> > > > syntactically annoying to send a batch of data now because it needs a
> > for
> > > > loop.
> >
Thanks Jon. I updated the FAQ with your procedure:
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdowemigratetocommittingoffsetstoKafka(ratherthanZookeeper)in0.8.2
?
On Thu, Feb 5, 2015 at 9:16 AM, Jon Bringhurst <
jbringhu...@linkedin.com.invalid> wrote:
> There should probably be
The Kafka documentation has several good diagrams. Did you check it out?
http://kafka.apache.org/documentation.html
On Thu, Feb 5, 2015 at 6:31 AM, Ankur Jain wrote:
> Hi Team,
>
> I am looking out high and low level architecture diagram of Kafka with
> Zookeeper, but haven't got any good one ,