[jira] [Commented] (KAFKA-1130) "log.dirs" is a confusing property name

2013-11-12 Thread David Arthur (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820086#comment-13820086
 ] 

David Arthur commented on KAFKA-1130:
-

I think people primarily think of Kafka as a queueing system despite the 
one-liner on the project page. In that context, "log" is pretty ambiguous.

I opened this JIRA on behalf of someone on IRC who is a system admin type. From 
their perspective, they look at the config file and see "log.dirs" and think 
application logs.

I like [~joestein]'s idea for "log.data.dirs". The description for that 
property could be clarified as well.
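
For illustration, a server.properties fragment showing the current name next to the proposed one (note: `log.data.dirs` is only a suggestion from this thread, not an existing Kafka property; paths are made up):

```properties
# Current 0.8 name -- easily mistaken for application-log output
log.dirs=/var/kafka/data-1,/var/kafka/data-2

# Proposed alternative from this thread (hypothetical, not in any release)
log.data.dirs=/var/kafka/data-1,/var/kafka/data-2
```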

> "log.dirs" is a confusing property name
> ---
>
> Key: KAFKA-1130
> URL: https://issues.apache.org/jira/browse/KAFKA-1130
> Project: Kafka
>  Issue Type: Bug
>  Components: config
>Affects Versions: 0.8
>Reporter: David Arthur
>Priority: Minor
>
> "log.dirs" is a somewhat misleading config name. The term "log" comes from an 
> internal Kafka class name, and shouldn't leak out into the public API (in 
> this case, the config).
> Something like "data.dirs" would be less confusing.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (KAFKA-1130) "log.dirs" is a confusing property name

2013-11-12 Thread David Arthur (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Arthur updated KAFKA-1130:


Attachment: KAFKA-1130.diff

Straw man proposal

> "log.dirs" is a confusing property name
> ---
>
> Key: KAFKA-1130
> URL: https://issues.apache.org/jira/browse/KAFKA-1130
> Project: Kafka
>  Issue Type: Bug
>  Components: config
>Affects Versions: 0.8
>Reporter: David Arthur
>Priority: Minor
> Attachments: KAFKA-1130.diff
>
>
> "log.dirs" is a somewhat misleading config name. The term "log" comes from an 
> internal Kafka class name, and shouldn't leak out into the public API (in 
> this case, the config).
> Something like "data.dirs" would be less confusing.





[jira] [Commented] (KAFKA-1130) "log.dirs" is a confusing property name

2013-11-12 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820222#comment-13820222
 ] 

Jay Kreps commented on KAFKA-1130:
--

Fair enough. If we are going to make the change let's go with data.dirs as that 
will be the least ambiguous.

> "log.dirs" is a confusing property name
> ---
>
> Key: KAFKA-1130
> URL: https://issues.apache.org/jira/browse/KAFKA-1130
> Project: Kafka
>  Issue Type: Bug
>  Components: config
>Affects Versions: 0.8
>Reporter: David Arthur
>Priority: Minor
> Attachments: KAFKA-1130.diff
>
>
> "log.dirs" is a somewhat misleading config name. The term "log" comes from an 
> internal Kafka class name, and shouldn't leak out into the public API (in 
> this case, the config).
> Something like "data.dirs" would be less confusing.





Re: Review Request 15201: address more review comments

2013-11-12 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/
---

(Updated Nov. 12, 2013, 4:34 p.m.)


Review request for kafka.


Summary (updated)
-

address more review comments


Bugs: KAFKA-1117
https://issues.apache.org/jira/browse/KAFKA-1117


Repository: kafka


Description (updated)
---

kafka-1117; fix 3


kafka-1117; fix 2


kafka-1117; fix 1


kafka-1117


Diffs (updated)
-

  config/tools-log4j.properties 79240490149835656e2a013a9702c5aa41c104f1 
  core/src/main/scala/kafka/api/OffsetResponse.scala 
08dc3cd3d166efba6b2b43f6e148f636b175affe 
  core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 

Diff: https://reviews.apache.org/r/15201/diff/


Testing
---


Thanks,

Jun Rao



[jira] [Commented] (KAFKA-1117) tool for checking the consistency among replicas

2013-11-12 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820229#comment-13820229
 ] 

Jun Rao commented on KAFKA-1117:


Updated reviewboard https://reviews.apache.org/r/15201/
 against branch origin/trunk

> tool for checking the consistency among replicas
> 
>
> Key: KAFKA-1117
> URL: https://issues.apache.org/jira/browse/KAFKA-1117
> Project: Kafka
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 0.8.1
>Reporter: Jun Rao
>Assignee: Jun Rao
> Fix For: 0.8.1
>
> Attachments: KAFKA-1117.patch, KAFKA-1117_2013-11-11_08:44:25.patch, 
> KAFKA-1117_2013-11-12_08:34:53.patch
>
>






[jira] [Updated] (KAFKA-1117) tool for checking the consistency among replicas

2013-11-12 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-1117:
---

Attachment: KAFKA-1117_2013-11-12_08:34:53.patch

> tool for checking the consistency among replicas
> 
>
> Key: KAFKA-1117
> URL: https://issues.apache.org/jira/browse/KAFKA-1117
> Project: Kafka
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 0.8.1
>Reporter: Jun Rao
>Assignee: Jun Rao
> Fix For: 0.8.1
>
> Attachments: KAFKA-1117.patch, KAFKA-1117_2013-11-11_08:44:25.patch, 
> KAFKA-1117_2013-11-12_08:34:53.patch
>
>






Re: Review Request 15201: address more review comments

2013-11-12 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28744
---


Looks good overall - this is definitely a useful tool to have. I'm posting 
comments so far but I will come back to it tomorrow to finish up.

So what would be the recommended procedure when the tool reports an 
inconsistency? Would operations have to go in, stop the brokers and truncate 
the log segments to the last segments that are verified to be consistent? If 
so, it would be convenient if the tool could issue another OffsetRequest to 
report the offset boundaries of the last consistent segments so you don't have 
to work too hard to determine the segment file.

Also, unless I'm reading it incorrectly, additional caveats are that the tool can 
only keep working as long as partitions are not being added (or reassigned) and 
logs are not being compacted.



config/tools-log4j.properties


Is this needed?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala


This is becoming my most frequent review comment :)
Can you use CommandLineUtils.checkRequiredArgs?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala


Any reason to not use kafka.consumer.Whitelist and the isTopicAllowed API 
that it provides?


- Joel Koshy


On Nov. 12, 2013, 4:34 p.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> ---
> 
> (Updated Nov. 12, 2013, 4:34 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
> https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> kafka-1117; fix 3
> 
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -
> 
>   config/tools-log4j.properties 79240490149835656e2a013a9702c5aa41c104f1 
>   core/src/main/scala/kafka/api/OffsetResponse.scala 
> 08dc3cd3d166efba6b2b43f6e148f636b175affe 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



[jira] [Commented] (KAFKA-270) sync producer / consumer test producing lot of kafka server exceptions & not getting the throughput mentioned here http://incubator.apache.org/kafka/performance.html

2013-11-12 Thread wangxu(alvin) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820938#comment-13820938
 ] 

wangxu(alvin) commented on KAFKA-270:
-

You can also try closing the producer. I would get the error below in the broker 
console without closing the producer. If I close the producer, the error never 
appears.
[2013-11-05 14:07:34,097] ERROR Closing socket for /192.168.30.114 because of 
error (kafka.network.Processor)
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:198)
at sun.nio.ch.IOUtil.read(IOUtil.java:171)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
at kafka.utils.Utils$.read(Utils.scala:538)
at 
kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
at kafka.network.Processor.read(SocketServer.scala:311)
at kafka.network.Processor.run(SocketServer.scala:214)
at java.lang.Thread.run(Thread.java:662)

>  sync producer / consumer test producing lot of kafka server exceptions & not 
> getting the throughput mentioned here 
> http://incubator.apache.org/kafka/performance.html
> --
>
> Key: KAFKA-270
> URL: https://issues.apache.org/jira/browse/KAFKA-270
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, core
>Affects Versions: 0.7
> Environment: Linux 2.6.18-238.1.1.el5 , x86_64 x86_64 x86_64 
> GNU/Linux 
> ext3 file system with raid10
>Reporter: Praveen Ramachandra
>  Labels: clients, core, newdev, performance
>
> I am getting ridiculously low producer and consumer throughput.
> I am using default config values for producer, consumer and broker
> which are very good starting points, as they should yield sufficient
> throughput.
> Appreciate if you point what settings/changes-in-code needs to be done
> to get higher throughput.
> I changed num.partitions in the server.config to 10.
> Please look below for exception and error messages from the server
> BTW: I am running server, zookeeper, producer, consumer on the same host.
> Consumer Code=
>long startTime = System.currentTimeMillis();
>long endTime = startTime + runDuration*1000l;
>Properties props = new Properties();
>props.put("zk.connect", "localhost:2181");
>props.put("groupid", subscriptionName); // to support multiple
> subscribers
>props.put("zk.sessiontimeout.ms", "400");
>props.put("zk.synctime.ms", "200");
>props.put("autocommit.interval.ms", "1000");
>consConfig =  new ConsumerConfig(props);
>consumer =
> kafka.consumer.Consumer.createJavaConsumerConnector(consConfig);
>Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
>topicCountMap.put(topicName, new Integer(1)); // has the topic
> to which to subscribe to
>Map<String, List<KafkaMessageStream<Message>>> consumerMap =
> consumer.createMessageStreams(topicCountMap);
>KafkaMessageStream<Message> stream =
> consumerMap.get(topicName).get(0);
>ConsumerIterator<Message> it = stream.iterator();
>while(System.currentTimeMillis() <= endTime )
>{
>it.next(); // discard data
>consumeMsgCount.incrementAndGet();
>}
> End consumer CODE
> =Producer CODE
>props.put("serializer.class", "kafka.serializer.StringEncoder");
>props.put("zk.connect", "localhost:2181");
>// Use random partitioner. Don't need the key type. Just
> set it to Integer.
>// The message is of type String.
>producer = new kafka.javaapi.producer.Producer<Integer, String>(new ProducerConfig(props));
>long endTime = startTime + runDuration*1000l; // run duration
> is in seconds
>while(System.currentTimeMillis() <= endTime )
>{
>String msg =
> org.apache.commons.lang.RandomStringUtils.random(msgSizes.get(0));
>producer.send(new ProducerData(topicName, msg));
>pc.incrementAndGet();
>}
>java.util.Date date = new java.util.Date(System.currentTimeMillis());
>System.out.println(date+" :: stopped producer for topic"+topicName);
> =END Producer CODE
> I see a bunch of exceptions like this
> [2012-02-11 02:44:11,945] ERROR Closing socket for /188.125.88.145 because of 
> error (kafka.network.Processor)
> java.io.IOException: Connection reset by peer
>   at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>   at 
> sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelIm

[jira] [Updated] (KAFKA-1112) broker can not start itself after kafka is killed with -9

2013-11-12 Thread Jay Kreps (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Kreps updated KAFKA-1112:
-

Attachment: KAFKA-1112-v1.patch

The way the check was supposed to work was this: if the last offset in the file 
is recoveryPoint - 1, skip the recovery (because the whole file is flushed). This 
was implemented by using the last entry in the index to find the final message.

Overall I feel this is a bit of a hack, but we wanted to separate out the 
"fsync is async" feature from a full incremental recovery implementation that 
only recovers unflushed data.

The immediate problem was that we broke the short circuit by adding code to try 
to handle a corner case: what if the log is truncated after a flush, so that the 
end of the log is < the recovery point. This was just totally broken, and we were 
short-circuiting out of the check in virtually all cases, including a corrupt 
index.

This issue wasn't caught because there was a bug in the log corruption unit 
test that gave a false pass on all index corruptions. :-(

The fix is the following:
1. Fix the logical bug
2. Add LogSegment.needsRecovery() which is a more paranoid version of what we 
were doing before that attempts to be safe regardless of any index or log 
corruption that may have occurred. Having this method here is a little hacky 
but probably okay until we get a full incremental recovery impl.
3. Fix the unit test that covers this.
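
The short-circuit described above might be sketched roughly as follows (a hypothetical simplification in Java; the names `lastOffset` and `recoveryPoint` are illustrative, and the real Scala code in the log layer also has to guard against a corrupt index):

```java
// Hedged sketch of the recovery short-circuit, not Kafka's actual API.
class RecoveryCheck {
    /**
     * Recovery can be skipped only when the last offset actually present in
     * the segment is exactly recoveryPoint - 1, i.e. every message in the
     * file is known to have been flushed.
     */
    static boolean needsRecovery(long lastOffset, long recoveryPoint) {
        return lastOffset != recoveryPoint - 1;
    }

    public static void main(String[] args) {
        // Fully flushed segment: last offset 41, recovery point 42 -> skip recovery
        System.out.println(needsRecovery(41, 42)); // prints false
        // Log truncated below the recovery point -> must recover, no short circuit
        System.out.println(needsRecovery(30, 42)); // prints true
    }
}
```

The buggy version short-circuited (returned "no recovery needed") even when the end of the log had fallen below the recovery point, which is exactly the truncation case the sketch above flags as needing recovery.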


> broker can not start itself after kafka is killed with -9
> -
>
> Key: KAFKA-1112
> URL: https://issues.apache.org/jira/browse/KAFKA-1112
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.8, 0.8.1
>Reporter: Kane Kim
>Assignee: Jay Kreps
>Priority: Critical
> Attachments: KAFKA-1112-v1.patch, KAFKA-1112.out
>
>
> When I kill kafka with -9, broker cannot start itself because of corrupted 
> index logs. I think kafka should try to delete/rebuild indexes itself without 
> manual intervention. 





Re: Review Request 15201: address more review comments

2013-11-12 Thread Swapnil Ghike

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28781
---



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala


fetchsize -> fetch-size for consistency with other options?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala


We can probably change this to ConsumerConfig.FetchSize. Anytime we change 
the max message size on the broker, we will probably change default fetch size 
on consumer, so that can serve as the source of truth for this tool as well.



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala


Should we just do .replaceAll("""["']""", "") instead?
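
For reference, the suggested pattern simply strips both quote characters in one pass; the same regex rendered with Java string escaping instead of a Scala triple-quoted literal (sample input is made up):

```java
// The character class ["'] matches either a double quote or a single quote,
// so replaceAll removes both kinds in a single call.
class StripQuotes {
    public static void main(String[] args) {
        String raw = "\"topic-a\",'topic-b'";
        String cleaned = raw.replaceAll("[\"']", "");
        System.out.println(cleaned); // prints topic-a,topic-b
    }
}
```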



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala


val fetcherThreads = for ((element, i) <- 
topicAndPartitionsPerBroker.view.zipWithIndex) yield new ReplicaFetcher(..., 
doVerification = (i == 0)) 

to avoid variable = true and then variable = false? 



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala


Minor comment: Can we rename this to initialOffsetMap (we use the offsetMap 
name in ReplicaFetchThread)? I got confused at first glance.



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala


I thought this loop was supposed to go through all the messages returned by the 
messageIterator, but it looks like it will check only the first message obtained 
via each messageIterator.



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala


Wondering why we don't exit when checksums don't match?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala


typo



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala


typo


Another caveat seems to be that the tool cannot handle (1) partition leadership 
changes or (2) topic configuration changes (number of partitions).

- Swapnil Ghike


On Nov. 12, 2013, 4:34 p.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> ---
> 
> (Updated Nov. 12, 2013, 4:34 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
> https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> kafka-1117; fix 3
> 
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -
> 
>   config/tools-log4j.properties 79240490149835656e2a013a9702c5aa41c104f1 
>   core/src/main/scala/kafka/api/OffsetResponse.scala 
> 08dc3cd3d166efba6b2b43f6e148f636b175affe 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>