Re: [jira] [Commented] (KAFKA-717) scala 2.10 build support

2013-07-10 Thread Cosmin Lehene
We're also interested in getting this for 0.8.

I'm curious whether someone maintains an updated branch on GitHub or is willing
to do so.

Thanks,
Cosmin

On 7/6/13 6:15 AM, "Matei Zaharia (JIRA)"  wrote:

>
>[ 
>https://issues.apache.org/jira/browse/KAFKA-717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701239#comment-13701239
> ] 
>
>Matei Zaharia commented on KAFKA-717:
>-
>
>I'm curious, is this intended to be included in 0.8? I use Kafka in my
>project (www.spark-project.org, which recently entered the Apache Incubator), and
>we've had to ship our own JAR to get Scala 2.9 support, and now we might
>have to do the same for 2.10 when we update to 2.10.
>
>> scala 2.10 build support
>> 
>>
>> Key: KAFKA-717
>> URL: https://issues.apache.org/jira/browse/KAFKA-717
>> Project: Kafka
>>  Issue Type: Improvement
>>  Components: packaging
>>Affects Versions: 0.8
>>Reporter: Viktor Taranenko
>>  Labels: build
>> Attachments: 0001-common-changes-for-2.10.patch,
>>0001-common-changes-for-2.10.patch,
>>0001-KAFKA-717-Convert-to-scala-2.10.patch,
>>0002-java-conversions-changes.patch,
>>0002-java-conversions-changes.patch, 0003-add-2.9.3.patch,
>>0003-add-2.9.3.patch,
>>0004-Fix-cross-compile-of-tests-update-to-2.10.2-and-set-.patch,
>>KAFKA-717-complex.patch, KAFKA-717-simple.patch, kafka_scala_2.10.tar.gz
>>
>>
>
>
>--
>This message is automatically generated by JIRA.
>If you think it was sent incorrectly, please contact your JIRA
>administrators
>For more information on JIRA, see: http://www.atlassian.com/software/jira



Make documentation part of new features acceptance criteria?

2013-07-10 Thread Cosmin Lehene
I'm not sure if there's already a guideline like this, but wouldn't it make 
sense to have one in order to keep documentation in sync with the code?
Also, having this type of documentation as part of the codebase to allow proper 
versioning might be a good idea as well.

Cosmin


[jira] [Commented] (KAFKA-966) Allow high level consumer to 'nak' a message and force Kafka to close the KafkaStream without losing that message

2013-07-10 Thread Chris Curtin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13704434#comment-13704434
 ] 

Chris Curtin commented on KAFKA-966:


Not to be dense, but wouldn't managing the offsets that way remove the ability 
to easily multi-thread the consumer? The commitOffsets method is on the 
ConsumerConnector not the KafkaStream, so to do a multi-threaded client I'd 
need to write logic to checkpoint all the threads to make sure they are all 
okay before committing back to Kafka. 

commitOffsets would also require that all the messages on all partitions 
succeed or be rolled back together, so a failure on one message could stop 
everything. In a multi-partition model where the partitions end up in different 
shards, databases, etc., that makes the consumer a lot more complicated.
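
To make the coordination cost concrete, here is a minimal, illustrative sketch
of the kind of cross-thread checkpointing described above, using the 0.8
high-level consumer (ConsumerConnector/KafkaStream). The topic name, thread
count and the barrier-per-message scheme are assumptions for illustration, not
a recommended design:

import java.util.*;
import java.util.concurrent.*;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class CheckpointedConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181");  // assumed local ZooKeeper
        props.put("group.id", "checkpointed-group");
        props.put("auto.commit.enable", "false");          // commit manually below

        final ConsumerConnector connector =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        final int threads = 4;                              // illustrative thread count
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put("test-topic", threads);
        List<KafkaStream<byte[], byte[]>> streams =
            connector.createMessageStreams(topicCountMap).get("test-topic");

        // Every thread must finish its current message before any offsets are
        // committed back; this is the cross-thread checkpoint described above.
        final CyclicBarrier checkpoint = new CyclicBarrier(threads, new Runnable() {
            public void run() {
                connector.commitOffsets();
            }
        });

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (final KafkaStream<byte[], byte[]> stream : streams) {
            pool.submit(new Runnable() {
                public void run() {
                    try {
                        for (MessageAndMetadata<byte[], byte[]> msg : stream) {
                            process(msg.message());  // business logic; may fail
                            checkpoint.await();      // wait for all threads, then commit
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            });
        }
    }

    private static void process(byte[] message) {
        // placeholder for the real write to MongoDB/Cassandra/etc.
    }
}

Even in this toy form, a single slow or failed thread stalls the barrier and
therefore every other partition, which is exactly the complication being
pointed out.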

> Allow high level consumer to 'nak' a message and force Kafka to close the 
> KafkaStream without losing that message
> -
>
> Key: KAFKA-966
> URL: https://issues.apache.org/jira/browse/KAFKA-966
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Affects Versions: 0.8
>Reporter: Chris Curtin
>Assignee: Neha Narkhede
>Priority: Minor
>
> Enhancement request.
> The high level consumer is very close to handling a lot of situations a 
> 'typical' client would need, except for when the message received from Kafka 
> is valid but the business logic that wants to consume it has a problem.
> For example, I may want to write the value to a MongoDB or Cassandra database 
> and the database is not available. I won't know until I go to do the write 
> that the database isn't available, but by then it is too late to NOT read the 
> message from Kafka. Thus if I call shutdown() to stop reading, that message 
> is lost since the offset Kafka writes to ZooKeeper is the next offset.
> Ideally I'd like to be able to tell Kafka: close the KafkaStream but set the 
> next offset to read for this partition to this message when I start up again. 
> And if there are any messages in the BlockingQueue for other partitions, find 
> the lowest # and use it for that partition's offset since I haven't consumed 
> them yet.
> Thus I can cleanly shut down my processing, resolve whatever the issue is and 
> restart the process.
> Another idea might be to allow a 'peek' into the next message and if I 
> succeed in writing to the database call 'next' to remove it from the queue. 
> I understand this won't deal with a 'kill -9' or hard failure of the JVM 
> leading to the latest offsets not being written to ZooKeeper but it addresses 
> a likely common scenario for consumers. Nor will it add true transactional 
> support since the ZK update could fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Make documentation part of new features acceptance criteria?

2013-07-10 Thread Jun Rao
Cosmin,

That's a good idea. In the past, for major new features, we have tended to create
a wiki page to outline the design. The wiki pages could be better organized.
Is this what you are looking for?

Thanks,

Jun


On Wed, Jul 10, 2013 at 1:17 AM, Cosmin Lehene  wrote:

> I'm not sure if there's already a guideline like this, but wouldn't it
> make sense to have one in order to keep documentation in sync with the code?
> Also, having this type of documentation as part of the codebase to allow
> proper versioning might be a good idea as well.
>
> Cosmin
>


Re: Make documentation part of new features acceptance criteria?

2013-07-10 Thread Jay Kreps
I like the idea of improving our documentation. Help is very much
appreciated in this area (but of course the problem is that the people who
experience the holes almost by definition can't fill them in). So even just
pointing out areas that aren't covered is really helpful.

We are in a sort of awkward stage this week because we have a 0.8 beta
release but no detailed docs on its internals.

WRT your specific proposals: I don't think we should do the documentation
with each feature, because I think that tends to lead to a bunch of little
documents, one for each change. I think we effectively get this out of
JIRA+wiki today. This usually serves as a fairly complete design doc plus
commentary by others. It is pretty hard to get information out of this
format for a new user, though.

We do version control documentation but we can't physically version control
it with the code because the code is in git and Apache only allows SVN as a
mechanism for publishing to xxx.apache.org. :-(

Instead, what about this: we add a new release criterion for documentation
completeness. It would be good to formalize the release criteria anyway.
Informally, they are something like:
1. Developers think it is feature complete
2. Unit tests pass
3. Integration/stress tests pass
4. Some production usage
It would be good to add (5) documentation is up-to-date to this list and not
do a release without it.

It is debatable whether this should apply to beta releases, but probably it
should. We can certainly apply it to the final 0.8 release if people are on
board.

-Jay



On Wed, Jul 10, 2013 at 1:17 AM, Cosmin Lehene  wrote:

> I'm not sure if there's already a guideline like this, but wouldn't it
> make sense to have one in order to keep documentation in sync with the code?
> Also, having this type of documentation as part of the codebase to allow
> proper versioning might be a good idea as well.
>
> Cosmin
>


[jira] [Commented] (KAFKA-717) scala 2.10 build support

2013-07-10 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13704719#comment-13704719
 ] 

Jun Rao commented on KAFKA-717:
---

We should take this patch in trunk, instead of 0.8.

> scala 2.10 build support
> 
>
> Key: KAFKA-717
> URL: https://issues.apache.org/jira/browse/KAFKA-717
> Project: Kafka
>  Issue Type: Improvement
>  Components: packaging
>Affects Versions: 0.8
>Reporter: Viktor Taranenko
>  Labels: build
> Attachments: 0001-common-changes-for-2.10.patch, 
> 0001-common-changes-for-2.10.patch, 
> 0001-KAFKA-717-Convert-to-scala-2.10.patch, 
> 0002-java-conversions-changes.patch, 0002-java-conversions-changes.patch, 
> 0003-add-2.9.3.patch, 0003-add-2.9.3.patch, 
> 0004-Fix-cross-compile-of-tests-update-to-2.10.2-and-set-.patch, 
> KAFKA-717-complex.patch, KAFKA-717-simple.patch, kafka_scala_2.10.tar.gz
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: having problem with 0.8 gzip compression

2013-07-10 Thread Scott Wang
Jun,

I did a test this morning and got a very interesting result with your
command.  I started by wiping all the log files and cleaning up all the
ZooKeeper data files.

Once I restarted the server, producer and consumer and then executed your
command, what I got is an empty log, as follows:

Dumping /Users/scott/Temp/kafka/test-topic-0/.log
Starting offset: 0

One observation: the .index file was getting huge but
there was nothing in the .log file.

Thanks,
Scott




On Tue, Jul 9, 2013 at 8:40 PM, Jun Rao  wrote:

> Could you run the following command on one of the log files of your topic
> and attach the output?
>
> bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files
> /tmp/kafka-logs/testtopic-0/.log
>
> Thanks,
>
> Jun
>
>
> On Tue, Jul 9, 2013 at 3:23 PM, Scott Wang <
> scott.w...@rumbleentertainment.com> wrote:
>
> > Another piece of information, the snappy compression also does not work.
> >
> > Thanks,
> > Scott
> >
> >
> > On Tue, Jul 9, 2013 at 11:07 AM, Scott Wang <
> > scott.w...@rumbleentertainment.com> wrote:
> >
> > > I just tried it and it is still not showing up; thanks for looking into
> this.
> > >
> > > Thanks,
> > > Scott
> > >
> > >
> > > On Tue, Jul 9, 2013 at 8:06 AM, Jun Rao  wrote:
> > >
> > >> Could you try starting the consumer first (and enable gzip in the
> > >> producer)?
> > >>
> > >> Thanks,
> > >>
> > >> Jun
> > >>
> > >>
> > >> On Mon, Jul 8, 2013 at 9:37 PM, Scott Wang <
> > >> scott.w...@rumbleentertainment.com> wrote:
> > >>
> > >> > No, I did not start the consumer before the producer.  I actually
> > >> started
> > >> > the producer first and nothing showed up in the consumer unless I
> > >> commented
> > >> > out this line -- props.put("compression.codec", "gzip").If I
> > >> commented
> > >> > out the compression codec, everything just works.
> > >> >
> > >> >
> > >> > On Mon, Jul 8, 2013 at 9:07 PM, Jun Rao  wrote:
> > >> >
> > >> > > Did you start the consumer before the producer? By default, the
> > >> > > consumer gets only the new data.
> > >> > >
> > >> > > Thanks,
> > >> > >
> > >> > > Jun
> > >> > >
> > >> > >
> > >> > > On Mon, Jul 8, 2013 at 2:53 PM, Scott Wang <
> > >> > > scott.w...@rumbleentertainment.com> wrote:
> > >> > >
> > >> > > > I am testing with Kafka 0.8 beta and having a problem receiving messages
> > >> > > > in the consumer.  There is no error, so does anyone have any insights?
> > >> > > > When I commented out the "compression.codec" line, everything works fine.
> > >> > > >
> > >> > > > My producer:
> > >> > > > public class TestKafka08Prod {
> > >> > > >
> > >> > > > public static void main(String [] args) {
> > >> > > >
> > >> > > > Producer producer = null;
> > >> > > > try {
> > >> > > > Properties props = new Properties();
> > >> > > > props.put("metadata.broker.list", "localhost:9092");
> > >> > > > props.put("serializer.class",
> > >> > > > "kafka.serializer.StringEncoder");
> > >> > > > props.put("producer.type", "sync");
> > >> > > > props.put("request.required.acks","1");
> > >> > > > props.put("compression.codec", "gzip");
> > >> > > > ProducerConfig config = new ProducerConfig(props);
> > >> > > > producer = new Producer(config);
> > >> > > > int j=0;
> > >> > > > for(int i=0; i<10; i++) {
> > >> > > > KeyedMessage data = new
> > >> > > > KeyedMessage("test-topic", "test-message: "+i+"
> > >> > > > "+System.currentTimeMillis());
> > >> > > > producer.send(data);
> > >> > > >
> > >> > > > }
> > >> > > >
> > >> > > > } catch (Exception e) {
> > >> > > > System.out.println("Error happened: ");
> > >> > > > e.printStackTrace();
> > >> > > > } finally {
> > >> > > > if (producer != null) {
> > >> > > > producer.close();
> > >> > > > }
> > >> > > >
> > >> > > > System.out.println("Ened of Sending");
> > >> > > > }
> > >> > > >
> > >> > > > System.exit(0);
> > >> > > > }
> > >> > > > }
> > >> > > >
> > >> > > >
> > >> > > > My consumer:
> > >> > > >
> > >> > > > public class TestKafka08Consumer {
> > >> > > > public static void main(String [] args) throws
> > >> > UnknownHostException,
> > >> > > > SocketException {
> > >> > > >
> > >> > > > Properties props = new Properties();
> > >> > > > props.put("zookeeper.connect",
> > "localhost:2181/kafka_0_8");
> > >> > > > props.put("group.id", "test08ConsumerId");
> > >> > > > props.put("zk.sessiontimeout.ms", "4000");
> > >> > > > props.put("zk.synctime.ms", "2000");
> > >> > > > props.put("autocommit.interval.ms", "1000");
> > >> > > >
> > >> > > > ConsumerConfig consumerConfig = new
> ConsumerConfig(props);
> > >> > > >
> > >> > > > ConsumerC

[jira] [Created] (KAFKA-968) Typographical Errors in Output

2013-07-10 Thread Rebecca Sealfon (JIRA)
Rebecca Sealfon created KAFKA-968:
-

 Summary: Typographical Errors in Output
 Key: KAFKA-968
 URL: https://issues.apache.org/jira/browse/KAFKA-968
 Project: Kafka
  Issue Type: Bug
  Components: core, replication
Affects Versions: 0.8
 Environment: Kafka was run on GNU/Linux x86_64 but this is relevant to 
all environments
Reporter: Rebecca Sealfon
Assignee: Neha Narkhede
Priority: Trivial
 Fix For: 0.8


The word "partition" is referred to as "partion" in 
system_test/replication_testsuite/testcase_0106/testcase_0106_properties.json 
line 2 and core/src/main/scala/kafka/server/AbstractFetcherManager.scala line 
49.  This typo may interfere with text-based searching of output.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-959) DefaultEventHandler can send more produce requests than necessary

2013-07-10 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-959:


Assignee: Guozhang Wang

> DefaultEventHandler can send more produce requests than necessary
> 
>
> Key: KAFKA-959
> URL: https://issues.apache.org/jira/browse/KAFKA-959
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8
>Reporter: Jun Rao
>Assignee: Guozhang Wang
>
> In DefaultEventHandler, for a batch of messages, it picks a random partition 
> per message (when there is no key specified). This means that it can send up 
> to P produce requests where P is the number of partitions in a topic. A 
> better way is probably to pick a single random partition for the whole batch 
> of messages. This will reduce the number of produce requests.
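
Purely as an illustration of the proposed change (the actual logic lives in the
Scala DefaultEventHandler; the types and names below are hypothetical
stand-ins), the difference between the two strategies looks like this:

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class BatchPartitioningSketch {
    private static final Random rand = new Random();

    // Current behaviour: each unkeyed message gets its own random partition, so a
    // batch of N messages can fan out into up to min(N, P) produce requests.
    static Map<Integer, List<String>> perMessage(List<String> batch, int partitionCount) {
        Map<Integer, List<String>> byPartition = new HashMap<Integer, List<String>>();
        for (String message : batch) {
            int partition = rand.nextInt(partitionCount);
            if (!byPartition.containsKey(partition)) {
                byPartition.put(partition, new ArrayList<String>());
            }
            byPartition.get(partition).add(message);
        }
        return byPartition;  // potentially many partitions, hence many requests
    }

    // Proposed behaviour: pick one random partition for the whole batch, so the
    // batch collapses into a single partition and far fewer produce requests.
    static Map<Integer, List<String>> perBatch(List<String> batch, int partitionCount) {
        int partition = rand.nextInt(partitionCount);
        return Collections.singletonMap(partition, batch);
    }
}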

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (KAFKA-969) Need to prevent failure of rebalance when there are no brokers available when consumer comes up

2013-07-10 Thread Sriram Subramanian (JIRA)
Sriram Subramanian created KAFKA-969:


 Summary: Need to prevent failure of rebalance when there are no 
brokers available when consumer comes up
 Key: KAFKA-969
 URL: https://issues.apache.org/jira/browse/KAFKA-969
 Project: Kafka
  Issue Type: Bug
Reporter: Sriram Subramanian
Assignee: Sriram Subramanian
 Attachments: emptybrokeronrebalance-1.patch

There are some rare instances when a consumer would be up before bringing up 
the Kafka brokers. This would usually happen in a test scenario. In such 
conditions, instead of failing the rebalance, we just log the 
error and subscribe to broker changes. When the broker comes back up, we 
trigger the rebalance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-969) Need to prevent failure of rebalance when there are no brokers available when consumer comes up

2013-07-10 Thread Sriram Subramanian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriram Subramanian updated KAFKA-969:
-

Attachment: emptybrokeronrebalance-1.patch

> Need to prevent failure of rebalance when there are no brokers available when 
> consumer comes up
> ---
>
> Key: KAFKA-969
> URL: https://issues.apache.org/jira/browse/KAFKA-969
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sriram Subramanian
>Assignee: Sriram Subramanian
> Attachments: emptybrokeronrebalance-1.patch
>
>
> There are some rare instances when a consumer would be up before bringing up 
> the Kafka brokers. This would usually happen in a test scenario. In such 
> conditions, during rebalance instead of failing the rebalance we just log the 
> error and subscribe to broker changes. When the broker comes back up, we 
> trigger the rebalance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-969) Need to prevent failure of rebalance when there are no brokers available when consumer comes up

2013-07-10 Thread Sriram Subramanian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriram Subramanian updated KAFKA-969:
-

Status: Patch Available  (was: Open)

> Need to prevent failure of rebalance when there are no brokers available when 
> consumer comes up
> ---
>
> Key: KAFKA-969
> URL: https://issues.apache.org/jira/browse/KAFKA-969
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sriram Subramanian
>Assignee: Sriram Subramanian
> Attachments: emptybrokeronrebalance-1.patch
>
>
> There are some rare instances when a consumer would be up before bringing up 
> the Kafka brokers. This would usually happen in a test scenario. In such 
> conditions, during rebalance instead of failing the rebalance we just log the 
> error and subscribe to broker changes. When the broker comes back up, we 
> trigger the rebalance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


formatting if statements

2013-07-10 Thread Jay Kreps
Guys,

I am seeing this:

if (condition) {
  // something
}
else {
  // something else
}

This is not the style we are using. Please don't do this or accept code
that looks like this. It should be

if (condition) {
  // something
} else {
  //something else
}

Thanks!

-Jay


[jira] [Commented] (KAFKA-965) merge c39d37e9dd97bf2462ffbd1a96c0b2cb05034bae from 0.8 to trunk

2013-07-10 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705248#comment-13705248
 ] 

Jay Kreps commented on KAFKA-965:
-

- core/src/main/scala/kafka/admin/AdminUtils.scala: You changed the name of 
replicaIndex back to getWrappedIndex? Is this intentional?
- core/src/main/scala/kafka/server/HighwaterMarkCheckpoint.scala: you are 
re-adding this file, but it has been replaced by OffsetIndex.
- This is not due to the merge, but the manual synchronization done for 
updatemetadata is really bad code. If we want to do ad hoc synchronization, that 
should have been wrapped in a MetadataCache class. As it is, verifying the 
correctness of that involves examining all of KafkaApis.

> merge c39d37e9dd97bf2462ffbd1a96c0b2cb05034bae from 0.8 to trunk
> 
>
> Key: KAFKA-965
> URL: https://issues.apache.org/jira/browse/KAFKA-965
> Project: Kafka
>  Issue Type: Task
>  Components: core
>Affects Versions: 0.8.1
>Reporter: Jun Rao
>Assignee: Jun Rao
> Attachments: kafka-965.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-966) Allow high level consumer to 'nak' a message and force Kafka to close the KafkaStream without losing that message

2013-07-10 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705272#comment-13705272
 ] 

Joel Koshy commented on KAFKA-966:
--

Yes, if you need to implement support for transactions across partitions that 
are potentially owned by different consumer instances, then this approach 
wouldn't work. I'm not sure whether it is feasible in your case, but if there is 
a group of messages that need to be committed together, then you could send them 
with a key and partition those messages into the same partition. That way, 
exactly one consumer thread will be responsible for those messages.
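
A minimal sketch of that suggestion using the 0.8 Java producer, assuming the
default partitioner hashes the message key to pick a partition; the topic, key
and payload values are illustrative:

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class KeyedGroupProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "localhost:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("key.serializer.class", "kafka.serializer.StringEncoder");
        props.put("request.required.acks", "1");

        Producer<String, String> producer =
            new Producer<String, String>(new ProducerConfig(props));
        try {
            // Messages that must be committed together share one key, so they hash
            // to one partition and end up owned by exactly one consumer thread.
            String groupKey = "order-42";  // illustrative key for one group
            for (int i = 0; i < 5; i++) {
                producer.send(new KeyedMessage<String, String>(
                        "test-topic", groupKey, "step-" + i));
            }
        } finally {
            producer.close();
        }
    }
}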


> Allow high level consumer to 'nak' a message and force Kafka to close the 
> KafkaStream without losing that message
> -
>
> Key: KAFKA-966
> URL: https://issues.apache.org/jira/browse/KAFKA-966
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Affects Versions: 0.8
>Reporter: Chris Curtin
>Assignee: Neha Narkhede
>Priority: Minor
>
> Enhancement request.
> The high level consumer is very close to handling a lot of situations a 
> 'typical' client would need, except for when the message received from Kafka 
> is valid but the business logic that wants to consume it has a problem.
> For example, I may want to write the value to a MongoDB or Cassandra database 
> and the database is not available. I won't know until I go to do the write 
> that the database isn't available, but by then it is too late to NOT read the 
> message from Kafka. Thus if I call shutdown() to stop reading, that message 
> is lost since the offset Kafka writes to ZooKeeper is the next offset.
> Ideally I'd like to be able to tell Kafka: close the KafkaStream but set the 
> next offset to read for this partition to this message when I start up again. 
> And if there are any messages in the BlockingQueue for other partitions, find 
> the lowest # and use it for that partition's offset since I haven't consumed 
> them yet.
> Thus I can cleanly shut down my processing, resolve whatever the issue is and 
> restart the process.
> Another idea might be to allow a 'peek' into the next message and if I 
> succeed in writing to the database call 'next' to remove it from the queue. 
> I understand this won't deal with a 'kill -9' or hard failure of the JVM 
> leading to the latest offsets not being written to ZooKeeper but it addresses 
> a likely common scenario for consumers. Nor will it add true transactional 
> support since the ZK update could fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (KAFKA-970) ./sbt +package rebuilds the Hadoop consumer jar N times with the same output file

2013-07-10 Thread Jay Kreps (JIRA)
Jay Kreps created KAFKA-970:
---

 Summary: ./sbt +package rebuilds the Hadoop consumer jar N times 
with the same output file
 Key: KAFKA-970
 URL: https://issues.apache.org/jira/browse/KAFKA-970
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Jay Kreps
 Fix For: 0.8


Running ./sbt +package now builds all jars for all Scala versions. 
Unfortunately, for the Hadoop producer and consumer, since the build uses the 
same file name every time, this just means it is overwriting the same file over 
and over, and the final file is whatever the last Scala version built. This 
should be a trivial fix.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: having problem with 0.8 gzip compression

2013-07-10 Thread Joel Koshy
Weird - I tried your exact code and it worked for me (although I was
using 0.8 head and not the beta). Can you re-run with trace logs
enabled in your producer and paste that output? Broker logs also if
you can?
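
For reference, a minimal log4j.properties fragment that should enable trace
logging for the Kafka client classes, assuming the producer application logs
through log4j; the appender setup here is an illustrative assumption, adjust it
to your environment:

# illustrative log4j.properties fragment
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
# TRACE for the Kafka client packages only; the rest of the app stays at INFO
log4j.logger.kafka=TRACE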

Thanks,

Joel

On Wed, Jul 10, 2013 at 10:23 AM, Scott Wang
 wrote:
> Jun,
>
> I did a test this morning and got a very interesting result with your
> command.  I started by wiping all the log files and cleaning up all the
> ZooKeeper data files.
>
> Once I restarted the server, producer and consumer and then executed your
> command, what I got is an empty log, as follows:
>
> Dumping /Users/scott/Temp/kafka/test-topic-0/.log
> Starting offset: 0
>
> One observation: the .index file was getting huge but
> there was nothing in the .log file.
>
> Thanks,
> Scott
>
>
>
>
> On Tue, Jul 9, 2013 at 8:40 PM, Jun Rao  wrote:
>
>> Could you run the following command on one of the log files of your topic
>> and attach the output?
>>
>> bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files
>> /tmp/kafka-logs/testtopic-0/.log
>>
>> Thanks,
>>
>> Jun
>>
>>
>> On Tue, Jul 9, 2013 at 3:23 PM, Scott Wang <
>> scott.w...@rumbleentertainment.com> wrote:
>>
>> > Another piece of information, the snappy compression also does not work.
>> >
>> > Thanks,
>> > Scott
>> >
>> >
>> > On Tue, Jul 9, 2013 at 11:07 AM, Scott Wang <
>> > scott.w...@rumbleentertainment.com> wrote:
>> >
>> > > I just tried it and it is still not showing up; thanks for looking into
>> this.
>> > >
>> > > Thanks,
>> > > Scott
>> > >
>> > >
>> > > On Tue, Jul 9, 2013 at 8:06 AM, Jun Rao  wrote:
>> > >
>> > >> Could you try starting the consumer first (and enable gzip in the
>> > >> producer)?
>> > >>
>> > >> Thanks,
>> > >>
>> > >> Jun
>> > >>
>> > >>
>> > >> On Mon, Jul 8, 2013 at 9:37 PM, Scott Wang <
>> > >> scott.w...@rumbleentertainment.com> wrote:
>> > >>
>> > >> > No, I did not start the consumer before the producer.  I actually
>> > >> started
>> > >> > the producer first and nothing showed up in the consumer unless I
>> > >> commented
>> > >> > out this line -- props.put("compression.codec", "gzip").If I
>> > >> commented
>> > >> > out the compression codec, everything just works.
>> > >> >
>> > >> >
>> > >> > On Mon, Jul 8, 2013 at 9:07 PM, Jun Rao  wrote:
>> > >> >
>> > >> > > Did you start the consumer before the producer? By default, the
>> > >> > > consumer gets only the new data.
>> > >> > >
>> > >> > > Thanks,
>> > >> > >
>> > >> > > Jun
>> > >> > >
>> > >> > >
>> > >> > > On Mon, Jul 8, 2013 at 2:53 PM, Scott Wang <
>> > >> > > scott.w...@rumbleentertainment.com> wrote:
>> > >> > >
>> > >> > > > I am testing with Kafka 0.8 beta and having a problem receiving messages
>> > >> > > > in the consumer.  There is no error, so does anyone have any insights?
>> > >> > > > When I commented out the "compression.codec" line, everything works fine.
>> > >> > > >
>> > >> > > > My producer:
>> > >> > > > public class TestKafka08Prod {
>> > >> > > >
>> > >> > > > public static void main(String [] args) {
>> > >> > > >
>> > >> > > > Producer producer = null;
>> > >> > > > try {
>> > >> > > > Properties props = new Properties();
>> > >> > > > props.put("metadata.broker.list", "localhost:9092");
>> > >> > > > props.put("serializer.class",
>> > >> > > > "kafka.serializer.StringEncoder");
>> > >> > > > props.put("producer.type", "sync");
>> > >> > > > props.put("request.required.acks","1");
>> > >> > > > props.put("compression.codec", "gzip");
>> > >> > > > ProducerConfig config = new ProducerConfig(props);
>> > >> > > > producer = new Producer(config);
>> > >> > > > int j=0;
>> > >> > > > for(int i=0; i<10; i++) {
>> > >> > > > KeyedMessage data = new
>> > >> > > > KeyedMessage("test-topic", "test-message: "+i+"
>> > >> > > > "+System.currentTimeMillis());
>> > >> > > > producer.send(data);
>> > >> > > >
>> > >> > > > }
>> > >> > > >
>> > >> > > > } catch (Exception e) {
>> > >> > > > System.out.println("Error happened: ");
>> > >> > > > e.printStackTrace();
>> > >> > > > } finally {
>> > >> > > > if (producer != null) {
>> > >> > > > producer.close();
>> > >> > > > }
>> > >> > > >
>> > >> > > > System.out.println("Ened of Sending");
>> > >> > > > }
>> > >> > > >
>> > >> > > > System.exit(0);
>> > >> > > > }
>> > >> > > > }
>> > >> > > >
>> > >> > > >
>> > >> > > > My consumer:
>> > >> > > >
>> > >> > > > public class TestKafka08Consumer {
>> > >> > > > public static void main(String [] args) throws
>> > >> > UnknownHostException,
>> > >> > > > SocketException {
>> > >> > > >
>> > >> > > > Properties props = new Properties();
>> > >> > > > props.put("zook

[jira] [Commented] (KAFKA-969) Need to prevent failure of rebalance when there are no brokers available when consumer comes up

2013-07-10 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705340#comment-13705340
 ] 

Joel Koshy commented on KAFKA-969:
--

This seems reasonable, but I'm not fully convinced about it. E.g., a test
framework should ensure external dependencies are up before attempting to
make service calls to those dependencies. That said, it is perhaps also
reasonable from a consumer's perspective to expect that returned streams be
empty at first, and whenever brokers and topics show up, then events should
just show up.

I'm +1 on this patch except for the if-else formatting issue.  Also, I think
this patch alone would be insufficient to meet the above.  There are two
other issues:

- We should register a watcher under the topics path (currently done only if
  a wildcard is specified)
- KAFKA-956 is also related. I need to give that one some thought.




> Need to prevent failure of rebalance when there are no brokers available when 
> consumer comes up
> ---
>
> Key: KAFKA-969
> URL: https://issues.apache.org/jira/browse/KAFKA-969
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sriram Subramanian
>Assignee: Sriram Subramanian
> Attachments: emptybrokeronrebalance-1.patch
>
>
> There are some rare instances when a consumer would be up before bringing up 
> the Kafka brokers. This would usually happen in a test scenario. In such 
> conditions, during rebalance instead of failing the rebalance we just log the 
> error and subscribe to broker changes. When the broker comes back up, we 
> trigger the rebalance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-969) Need to prevent failure of rebalance when there are no brokers available when consumer comes up

2013-07-10 Thread Sriram Subramanian (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705461#comment-13705461
 ] 

Sriram Subramanian commented on KAFKA-969:
--

The other issues you mention are separate from this. You should file JIRAs for 
those. 

> Need to prevent failure of rebalance when there are no brokers available when 
> consumer comes up
> ---
>
> Key: KAFKA-969
> URL: https://issues.apache.org/jira/browse/KAFKA-969
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sriram Subramanian
>Assignee: Sriram Subramanian
> Attachments: emptybrokeronrebalance-1.patch
>
>
> There are some rare instances when a consumer would be up before bringing up 
> the Kafka brokers. This would usually happen in a test scenario. In such 
> conditions, during rebalance instead of failing the rebalance we just log the 
> error and subscribe to broker changes. When the broker comes back up, we 
> trigger the rebalance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-969) Need to prevent failure of rebalance when there are no brokers available when consumer comes up

2013-07-10 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705484#comment-13705484
 ] 

Guozhang Wang commented on KAFKA-969:
-

I think KAFKA-956 is orthogonal to this JIRA. It is about broker metadata not 
having propagated to the brokers yet.

> Need to prevent failure of rebalance when there are no brokers available when 
> consumer comes up
> ---
>
> Key: KAFKA-969
> URL: https://issues.apache.org/jira/browse/KAFKA-969
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sriram Subramanian
>Assignee: Sriram Subramanian
> Attachments: emptybrokeronrebalance-1.patch
>
>
> There are some rare instances when a consumer would be up before bringing up 
> the Kafka brokers. This would usually happen in a test scenario. In such 
> conditions, during rebalance instead of failing the rebalance we just log the 
> error and subscribe to broker changes. When the broker comes back up, we 
> trigger the rebalance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-969) Need to prevent failure of rebalance when there are no brokers available when consumer comes up

2013-07-10 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-969:
--

   Resolution: Fixed
Fix Version/s: 0.8
   Status: Resolved  (was: Patch Available)

Thanks for the patch. Committed to 0.8 with the following minor change.

* change ERROR logging to WARN.

> Need to prevent failure of rebalance when there are no brokers available when 
> consumer comes up
> ---
>
> Key: KAFKA-969
> URL: https://issues.apache.org/jira/browse/KAFKA-969
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sriram Subramanian
>Assignee: Sriram Subramanian
> Fix For: 0.8
>
> Attachments: emptybrokeronrebalance-1.patch
>
>
> There are some rare instances when a consumer would be up before bringing up 
> the Kafka brokers. This would usually happen in a test scenario. In such 
> conditions, during rebalance instead of failing the rebalance we just log the 
> error and subscribe to broker changes. When the broker comes back up, we 
> trigger the rebalance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-969) Need to prevent failure of rebalance when there are no brokers available when consumer comes up

2013-07-10 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705490#comment-13705490
 ] 

Joel Koshy commented on KAFKA-969:
--

As I already said, I'm +1 on this patch for what it intends to address. Those 
two issues I mentioned are orthogonal. By "above" in my comment I was referring 
to the possible expectation from consumers: ".. from a consumer's perspective 
to expect that returned streams be empty at first, and whenever brokers and 
topics show up, then events should just show up." - not the "failed to 
rebalance" issue.


> Need to prevent failure of rebalance when there are no brokers available when 
> consumer comes up
> ---
>
> Key: KAFKA-969
> URL: https://issues.apache.org/jira/browse/KAFKA-969
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sriram Subramanian
>Assignee: Sriram Subramanian
> Fix For: 0.8
>
> Attachments: emptybrokeronrebalance-1.patch
>
>
> There are some rare instances when a consumer would be up before bringing up 
> the Kafka brokers. This would usually happen in a test scenario. In such 
> conditions, during rebalance instead of failing the rebalance we just log the 
> error and subscribe to broker changes. When the broker comes back up, we 
> trigger the rebalance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-969) Need to prevent failure of rebalance when there are no brokers available when consumer comes up

2013-07-10 Thread Swapnil Ghike (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705517#comment-13705517
 ] 

Swapnil Ghike commented on KAFKA-969:
-

I agree with Joel. Registering a consumer's subscription in ZooKeeper and being 
open to brokers and topics showing up after the consumer has started is 
reasonable. It's similar to a consumer waiting to consume data from the tail of 
a topic when no new data is coming in and the partition count could change.

> Need to prevent failure of rebalance when there are no brokers available when 
> consumer comes up
> ---
>
> Key: KAFKA-969
> URL: https://issues.apache.org/jira/browse/KAFKA-969
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sriram Subramanian
>Assignee: Sriram Subramanian
> Fix For: 0.8
>
> Attachments: emptybrokeronrebalance-1.patch
>
>
> There are some rare instances when a consumer would be up before bringing up 
> the Kafka brokers. This would usually happen in a test scenario. In such 
> conditions, during rebalance instead of failing the rebalance we just log the 
> error and subscribe to broker changes. When the broker comes back up, we 
> trigger the rebalance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (KAFKA-971) Handle synchronization in updatemetadata in KafkaApi better

2013-07-10 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-971:
-

 Summary: Handle synchronization in updatemetadata in KafkaApi 
better
 Key: KAFKA-971
 URL: https://issues.apache.org/jira/browse/KAFKA-971
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 0.8.1
Reporter: Jun Rao


It's better to wrap all synchronization of the metadata cache in a MetadataCache class.
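
Purely as an illustration of the encapsulation being suggested (the server code
is Scala, and the names and fields below are hypothetical, not Kafka's actual
API), the idea is to keep the lock and the cached state behind one class:

import java.util.HashMap;
import java.util.Map;

class MetadataCache {
    private final Object lock = new Object();
    private final Map<String, Object> topicMetadata = new HashMap<String, Object>();

    // All reads and writes go through these methods, so correctness can be
    // verified here instead of by inspecting every call site in KafkaApis.
    void update(String topic, Object metadata) {
        synchronized (lock) {
            topicMetadata.put(topic, metadata);
        }
    }

    Object get(String topic) {
        synchronized (lock) {
            return topicMetadata.get(topic);
        }
    }
}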

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-965) merge c39d37e9dd97bf2462ffbd1a96c0b2cb05034bae from 0.8 to trunk

2013-07-10 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-965:
--

   Resolution: Fixed
Fix Version/s: 0.8.1
   Status: Resolved  (was: Patch Available)

Thanks for the review. Committed to trunk after addressing the comments.

1. Reverted the change.
2. deleted the file.
3. Filed kafka-971 to track this.

> merge c39d37e9dd97bf2462ffbd1a96c0b2cb05034bae from 0.8 to trunk
> 
>
> Key: KAFKA-965
> URL: https://issues.apache.org/jira/browse/KAFKA-965
> Project: Kafka
>  Issue Type: Task
>  Components: core
>Affects Versions: 0.8.1
>Reporter: Jun Rao
>Assignee: Jun Rao
> Fix For: 0.8.1
>
> Attachments: kafka-965.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (KAFKA-838) Update design document to match Kafka 0.8 design

2013-07-10 Thread Sriram Subramanian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriram Subramanian reassigned KAFKA-838:


Assignee: Sriram Subramanian

> Update design document to match Kafka 0.8 design
> 
>
> Key: KAFKA-838
> URL: https://issues.apache.org/jira/browse/KAFKA-838
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Neha Narkhede
>Assignee: Sriram Subramanian
>
> Kafka 0.8 design is significantly different as compared to Kafka 0.7

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (KAFKA-781) Add option to the controlled shutdown tool to timeout after n secs

2013-07-10 Thread Sriram Subramanian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriram Subramanian resolved KAFKA-781.
--

Resolution: Fixed

We have moved this logic into the broker.

> Add option to the controlled shutdown tool to timeout after n secs
> --
>
> Key: KAFKA-781
> URL: https://issues.apache.org/jira/browse/KAFKA-781
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 0.8
>Reporter: Neha Narkhede
>Priority: Critical
>  Labels: replication-tools
>
> Right now, the controlled shutdown tool has a number-of-retries option. This 
> is required since it might take multiple retries to move leaders off a 
> broker. However, it would be convenient to also have an option that allows the 
> tool to time out after n secs and retry until then.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira