Re: log file flush?

2013-02-19 Thread Jason Huang
Very detailed and clear explanation. Thanks a lot! Jason On Tue, Feb 19, 2013 at 11:28 PM, Jay Kreps wrote: > Yes, exactly. Here is the full story: > > When you restart Kafka it checks if a clean shutdown was executed on > the log (which would have left a marker file). If the shutdown was > cle

Re: connecting to kafkaserver and zk from inside VM

2013-02-19 Thread Jun Rao
Is this related to item #2 in http://kafka.apache.org/faq.html ? Thanks, Jun On Tue, Feb 19, 2013 at 5:32 PM, Andre Z wrote: > Hi, > > I have the following situation. I have set up an HTTP POST client sending > data to my VM. The VM has a POST resource that passes on the data to a > kafkaProdu

Re: log file flush?

2013-02-19 Thread Jay Kreps
Yes, exactly. Here is the full story: When you restart Kafka, it checks whether a clean shutdown was executed on the log (which would have left a marker file). If the shutdown was clean, it assumes the log was fully flushed and uses it as is. If not (as in the case of a hard kill or machine crash) it exe
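
To make the startup check Jay describes concrete, here is a minimal Java sketch of the idea; the marker-file name and the recoverLog helper are illustrative, not Kafka's actual internals.

    import java.io.File;

    // Sketch of the restart decision: if a clean-shutdown marker exists, trust
    // the log as-is; otherwise run recovery on it. Names here are illustrative.
    public class LogStartupCheck {

        static void openLogDir(File logDir) {
            File marker = new File(logDir, ".kafka_cleanshutdown"); // hypothetical marker name
            if (marker.exists()) {
                // Clean shutdown: everything written was flushed, so the segments
                // can be used directly.
                marker.delete();
            } else {
                // Hard kill or machine crash: the tail of the log may be incomplete,
                // so validate the segments and truncate anything unreadable.
                recoverLog(logDir);
            }
        }

        static void recoverLog(File logDir) {
            // Placeholder for scanning the last segment, checking message checksums,
            // and truncating to the last valid offset.
        }
    }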

Re: log file flush?

2013-02-19 Thread Jason Huang
This starts to make sense to me. So a log segment file (0.log) may have some messages that are on the local filesystem's hard drive and some messages that are still in the pagecache? Say a 000.log file has 150 messages, the first 100 have been flushed to the local hard drive, and the last 50 are still in the

connecting to kafkaserver and zk from inside VM

2013-02-19 Thread Andre Z
Hi, I have the following situation. I have set up an HTTP POST client sending data to my VM. The VM has a POST resource that passes on the data to a kafkaProducer. Everything works fine with the kafkaConsumer inside the VM. Now I want the consumer outside and I get a timeout. I only change
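
One common cause of this symptom is addressing: a consumer outside the VM reads the broker's registered hostname from ZooKeeper, so both the ZooKeeper address the consumer is given and the hostname the broker registers must be reachable from the consumer's host. Below is a minimal Java sketch of the consumer-side settings involved, assuming 0.8-style property names (0.7 used zk.connect and groupid); the hosts and group name are placeholders.

    import java.util.Properties;

    // Sketch of the consumer-side settings that matter when the consumer runs
    // outside the VM. Property names follow the 0.8-style high-level consumer;
    // host/port values are placeholders for the actual setup.
    public class ExternalConsumerProps {
        public static Properties build() {
            Properties props = new Properties();
            // Must be a ZooKeeper address reachable from the consumer's host,
            // not 127.0.0.1 inside the VM.
            props.put("zookeeper.connect", "vm-public-hostname:2181");
            props.put("group.id", "example-group");
            // The broker must also register a hostname in ZooKeeper that resolves
            // from outside the VM (broker-side host.name setting), otherwise the
            // consumer times out when it tries to fetch.
            return props;
        }
    }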

Re: log file flush?

2013-02-19 Thread Jay Kreps
To be clear: to lose data in the filesystem, you need to hard-kill the machine. A hard kill of the process will not cause that. -Jay On Tue, Feb 19, 2013 at 8:25 AM, Jun Rao wrote: > Jason, > > Although messages are always written to the log segment file, they > initially are only in the file sys

Re: log file flush?

2013-02-19 Thread Jun Rao
Jason, Although messages are always written to the log segment file, they initially are only in the file system's pagecache. As Swapnil mentioned earlier, messages are flushed to disk periodically. If you do a clean shutdown (kill -15), we close all log files, which should flush all dirty data to d
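
The distinction Jun and Jay are drawing maps onto ordinary file I/O: a write lands in the OS pagecache, and only a later fsync makes it durable against a machine crash. A small stand-alone Java sketch of the two steps (not Kafka code; the file name is a placeholder):

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    // Illustrates the pagecache vs. disk distinction, not Kafka's actual code.
    public class FlushDemo {
        public static void main(String[] args) throws IOException {
            try (FileChannel log = FileChannel.open(Paths.get("00000000.log"),
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {

                // Step 1: the write completes into the OS pagecache. Reading the
                // file back now shows the message, and it survives a kill -9 of
                // this process, because the OS still holds the dirty pages.
                log.write(ByteBuffer.wrap("message payload\n".getBytes()));

                // Step 2: the flush (fsync). Only after this does the data survive
                // a power loss or machine crash. Kafka does this periodically,
                // driven by its flush settings, rather than on every write.
                log.force(true);
            }
        }
    }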

Re: log file flush?

2013-02-19 Thread Jason Huang
Thanks for the response. My confusion is this: once I see the message content in the .log file, doesn't that mean the message has already been flushed to the hard drive? Why would those messages still get lost if someone manually kills the process (or if the server crashes unexpectedly)? Jason On Tu

Re: log file flush?

2013-02-19 Thread Swapnil Ghike
Correction - The flush happens based on *number of messages* and time limits, whichever is hit first. On 2/19/13 3:50 AM, "Swapnil Ghike" wrote: >The flush happens based on size and time limits, >whichever is hit first.
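
For reference, the limits Swapnil mentions are broker-side flush settings. Here is a sketch of the relevant properties, assuming the 0.8-style names log.flush.interval.messages and log.flush.interval.ms (0.7 spells them log.flush.interval and log.default.flush.interval.ms); the values are placeholders, and these lines would normally live in config/server.properties.

    import java.util.Properties;

    // The flush limits described above, expressed as broker properties.
    // Shown as a Java Properties sketch; in practice these are lines in
    // config/server.properties. Names assume 0.8-style configuration.
    public class FlushSettings {
        public static Properties build() {
            Properties props = new Properties();
            // Flush a log after this many messages have accumulated ...
            props.put("log.flush.interval.messages", "10000");
            // ... or after this much time has passed, whichever is hit first.
            props.put("log.flush.interval.ms", "1000");
            return props;
        }
    }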

Re: log file flush?

2013-02-19 Thread Swapnil Ghike
The messages for a topic are kept in the kafka broker's memory before they are flushed to the disk. The flush happens based on size and time limits, whichever is hit first. If you kill the kafka server process before any message has been flushed to the disk, those messages will be lost. The config

log file flush?

2013-02-19 Thread Jason Huang
Hello, I am confused about "log file flush". In my naive understanding, once a message is produced and sent to the Kafka server, it is written to the log file on the hard drive. Since it is already on the hard drive, what exactly do you mean by "log file flush"? I asked because we found that