[ 
https://issues.apache.org/jira/browse/KAFKA-903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-903:
--------------------------

    Attachment: kafka-903_v3.patch

Attached patch v3.

To address Jay's concern, instead of adding a generic renameTo util, the patch 
only falls back to the non-atomic renameTo when checkpointing the high watermark 
file. Since both files are in the same dir and we control the naming, the other 
causes you listed that can make renameTo fail won't happen. I didn't do the 
OS-level check since I am not sure if that works well for environments like 
cygwin. We could guard this under a broker config parameter, but I am not sure 
if it's worth it.
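
For readers without the patch at hand, the fallback described above can be sketched roughly as follows. This is not the code in kafka-903_v3.patch; the class name, method names, and the file names hw-checkpoint and hw-checkpoint.tmp are hypothetical and chosen only for illustration. The sketch assumes the temp file and the target live in the same directory, as stated above.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical illustration; not the class or method names used in Kafka.
final class CheckpointSwap {

    /**
     * Replaces target with tmp. A plain rename is tried first. If it fails
     * (as it can on Windows when the target already exists), the target is
     * deleted and the rename is retried. The fallback path is not atomic.
     */
    static void swap(File tmp, File target) throws IOException {
        if (tmp.renameTo(target)) {
            return; // the rename replaced the target in one step
        }
        // Fallback: a crash between the delete and the second rename
        // leaves only the tmp file on disk.
        if (target.exists() && !target.delete()) {
            throw new IOException("Could not delete " + target);
        }
        if (!tmp.renameTo(target)) {
            throw new IOException("Could not rename " + tmp + " to " + target);
        }
    }

    /** Writes contents to a temp file in dir, then swaps it into place. */
    static void writeCheckpoint(File dir, String contents) throws IOException {
        File tmp = new File(dir, "hw-checkpoint.tmp"); // hypothetical name
        File target = new File(dir, "hw-checkpoint");  // hypothetical name
        Files.write(tmp.toPath(), contents.getBytes(StandardCharsets.UTF_8));
        swap(tmp, target);
    }
}
```

Because the fallback only runs after the first renameTo has failed, platforms where the rename already replaces an existing file keep their current behavior.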

On Sriram's concern: this seems to be a problem for at least some versions of 
Java on Windows, since other projects such as Hadoop 
(https://issues.apache.org/jira/browse/HADOOP-959) have seen it as well.
  
                
> [0.8.0 - windows]  FATAL - [highwatermark-checkpoint-thread1] 
> (Logging.scala:109) - Attempt to swap the new high watermark file with the 
> old one failed
> -------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-903
>                 URL: https://issues.apache.org/jira/browse/KAFKA-903
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>         Environment: Windows 7 with SP 1; jdk 7_0_17; scala-library-2.8.2, 
> probably copied on 4/30. kafka-0.8, built current on 4/30.
> -rwx------+ 1 reefedjib None   41123 Mar 19  2009 commons-cli-1.2.jar
> -rwx------+ 1 reefedjib None   58160 Jan 11 13:45 commons-codec-1.4.jar
> -rwx------+ 1 reefedjib None  575389 Apr 18 13:41 
> commons-collections-3.2.1.jar
> -rwx------+ 1 reefedjib None  143847 May 21  2009 commons-compress-1.0.jar
> -rwx------+ 1 reefedjib None   52543 Jan 11 13:45 commons-exec-1.1.jar
> -rwx------+ 1 reefedjib None   57779 Jan 11 13:45 commons-fileupload-1.2.1.jar
> -rwx------+ 1 reefedjib None  109043 Jan 20  2008 commons-io-1.4.jar
> -rwx------+ 1 reefedjib None  279193 Jan 11 13:45 commons-lang-2.5.jar
> -rwx------+ 1 reefedjib None   60686 Jan 11 13:45 commons-logging-1.1.1.jar
> -rwx------+ 1 reefedjib None 1891110 Apr 18 13:41 guava-13.0.1.jar
> -rwx------+ 1 reefedjib None  206866 Apr  7 21:24 jackson-core-2.1.4.jar
> -rwx------+ 1 reefedjib None  232245 Apr  7 21:24 jackson-core-asl-1.9.12.jar
> -rwx------+ 1 reefedjib None   69314 Apr  7 21:24 
> jackson-dataformat-smile-2.1.4.jar
> -rwx------+ 1 reefedjib None  780385 Apr  7 21:24 
> jackson-mapper-asl-1.9.12.jar
> -rwx------+ 1 reefedjib None   47913 May  9 23:39 jopt-simple-3.0-rc2.jar
> -rwx------+ 1 reefedjib None 2365575 Apr 30 13:06 
> kafka_2.8.0-0.8.0-SNAPSHOT.jar
> -rwx------+ 1 reefedjib None  481535 Jan 11 13:46 log4j-1.2.16.jar
> -rwx------+ 1 reefedjib None   20647 Apr 18 13:41 log4j-over-slf4j-1.6.6.jar
> -rwx------+ 1 reefedjib None  251784 Apr 18 13:41 logback-classic-1.0.6.jar
> -rwx------+ 1 reefedjib None  349706 Apr 18 13:41 logback-core-1.0.6.jar
> -rwx------+ 1 reefedjib None   82123 Nov 26 13:11 metrics-core-2.2.0.jar
> -rwx------+ 1 reefedjib None 1540457 Jul 12  2012 ojdbc14.jar
> -rwx------+ 1 reefedjib None 6418368 Apr 30 08:23 scala-library-2.8.2.jar
> -rwx------+ 1 reefedjib None 3114958 Apr  2 07:47 scalatest_2.10-1.9.1.jar
> -rwx------+ 1 reefedjib None   25962 Apr 18 13:41 slf4j-api-1.6.5.jar
> -rwx------+ 1 reefedjib None   62269 Nov 29 03:26 zkclient-0.2.jar
> -rwx------+ 1 reefedjib None  601677 Apr 18 13:41 zookeeper-3.3.3.jar
>            Reporter: Rob Withers
>            Priority: Blocker
>         Attachments: kafka_2.8.0-0.8.0-SNAPSHOT.jar, kafka-903.patch, 
> kafka-903_v2.patch, kafka-903_v3.patch
>
>
> This FATAL shuts down both brokers on windows, 
> {2013-05-10 18:23:57,636} DEBUG [local-vat] (Logging.scala:51) - Sending 1 
> messages with no compression to [robert_v_2x0,0]
> {2013-05-10 18:23:57,637} DEBUG [local-vat] (Logging.scala:51) - Producer 
> sending messages with correlation id 178 for topics [robert_v_2x0,0] to 
> broker 1 on 192.168.1.100:9093
> {2013-05-10 18:23:57,689} FATAL [highwatermark-checkpoint-thread1] 
> (Logging.scala:109) - Attempt to swap the new high watermark file with the 
> old one failed
> {2013-05-10 18:23:57,739}  INFO [Thread-4] (Logging.scala:67) - [Kafka 
> Server 0], shutting down
> Furthermore, attempts to restart them fail, with the following log:
> {2013-05-10 19:14:52,156}  INFO [Thread-1] (Logging.scala:67) - [Kafka Server 
> 0], started
> {2013-05-10 19:14:52,157}  INFO [ZkClient-EventThread-32-localhost:2181] 
> (Logging.scala:67) - New leader is 0
> {2013-05-10 19:14:52,193} DEBUG [ZkClient-EventThread-32-localhost:2181] 
> (ZkEventThread.java:79) - Delivering event #1 done
> {2013-05-10 19:14:52,193} DEBUG [ZkClient-EventThread-32-localhost:2181] 
> (ZkEventThread.java:69) - Delivering event #4 ZkEvent[Data of 
> /controller_epoch changed sent to 
> kafka.controller.ControllerEpochListener@5cb88f42]
> {2013-05-10 19:14:52,210} DEBUG [SyncThread:0] 
> (FinalRequestProcessor.java:78) - Processing request:: 
> sessionid:0x13e9127882e0001 type:exists cxid:0x1d zxid:0xfffffffffffffffe 
> txntype:unknown reqpath:/controller_epoch
> {2013-05-10 19:14:52,210} DEBUG [SyncThread:0] 
> (FinalRequestProcessor.java:160) - sessionid:0x13e9127882e0001 type:exists 
> cxid:0x1d zxid:0xfffffffffffffffe txntype:unknown reqpath:/controller_epoch
> {2013-05-10 19:14:52,213} DEBUG [Thread-1-SendThread(localhost:2181)] 
> (ClientCnxn.java:838) - Reading reply sessionid:0x13e9127882e0001, packet:: 
> clientPath:null serverPath:null finished:false header:: 29,3  replyHeader:: 
> 29,37,0  request:: '/controller_epoch,T  response:: 
> s{16,36,1368231712816,1368234889961,1,0,0,0,1,0,16} 
> {2013-05-10 19:14:52,219}  INFO [Thread-5] (Logging.scala:67) - [Kafka Server 
> 0], shutting down

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
