Re: Updates lost

Jiang Chen Wed, 31 Aug 2011 07:47:37 -0700

Cheers. That would be another solution.


On Wed, Aug 31, 2011 at 10:42 AM, Jim Ancona <j...@anconafamily.com> wrote:
> You could also look at Hector's approach in:
> https://github.com/rantav/hector/blob/master/core/src/main/java/me/prettyprint/cassandra/service/clock/MicrosecondsSyncClockResolution.java
>
> It works well and I believe there was some performance testing done on
> it as well.
>
> Jim
>
> On Tue, Aug 30, 2011 at 3:43 PM, Jeremy Hanna
> <jeremy.hanna1...@gmail.com> wrote:
>> Sorry - misread your earlier email. I would log in to IRC and ask in
>> #cassandra. I would think given the nature of nanotime you'll run into
>> harder-to-track-down problems, but it may be fine.
>>
>> On Aug 30, 2011, at 2:06 PM, Jiang Chen wrote:
>>
>>> Do you see any problem with my approach to derive the current time in
>>> nanoseconds though?
>>>
>>> On Tue, Aug 30, 2011 at 2:39 PM, Jeremy Hanna
>>> <jeremy.hanna1...@gmail.com> wrote:
>>>> Yes - the reason why internally Cassandra uses milliseconds * 1000 is 
>>>> because System.nanoTime javadoc says "This method can only be used to 
>>>> measure elapsed time and is not related to any other notion of system or 
>>>> wall-clock time."
>>>>
>>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
>>>>
>>>> On Aug 30, 2011, at 1:31 PM, Jiang Chen wrote:
>>>>
>>>>> Indeed it's microseconds. We are talking about how to achieve the
>>>>> precision of microseconds. One way is System.currentTimeMillis() *
>>>>> 1000. It's only precise to milliseconds. If there is more than one
>>>>> update in the same millisecond, the second one may be lost. That's my
>>>>> original problem.
>>>>>
>>>>> The other way is to derive from System.nanoTime(). This function
>>>>> doesn't directly return the time since epoch. I used the following:
>>>>>
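>>>>>       // Offset between the nanoTime() origin and the epoch, captured once
>>>>>       // at class load; only as accurate as currentTimeMillis() at that moment.
>>>>>       private static final long nanotimeOffset = System.nanoTime()
>>>>>                       - System.currentTimeMillis() * 1000000L;
>>>>>
>>>>>       // Approximate nanoseconds since the epoch.
>>>>>       private static long currentTimeNanos() {
>>>>>               return System.nanoTime() - nanotimeOffset;
>>>>>       }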
>>>>>
>>>>> The timestamp to use is then currentTimeNanos() / 1000.
>>>>>
>>>>> Does anyone see a problem with this approach?
>>>>>
>>>>> On Tue, Aug 30, 2011 at 2:20 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 30, 2011 at 1:41 PM, Jeremy Hanna <jeremy.hanna1...@gmail.com> wrote:
>>>>>>>
>>>>>>> I would not use nano time with Cassandra.  Internally and throughout the
>>>>>>> clients, milliseconds is pretty much a standard.  You can get into trouble
>>>>>>> because when comparing nanoseconds with milliseconds as long numbers,
>>>>>>> nanoseconds will always win.  That bit us a while back when we deleted
>>>>>>> something and it couldn't come back because we deleted it with nanoseconds
>>>>>>> as the timestamp value.
>>>>>>>
>>>>>>> See the caveats for System.nanoTime() for why milliseconds is a standard:
>>>>>>>
>>>>>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
>>>>>>>
>>>>>>> On Aug 30, 2011, at 12:31 PM, Jiang Chen wrote:
>>>>>>>
>>>>>>>> Looks like the theory is correct for the Java case at least.
>>>>>>>>
>>>>>>>> The default timestamp precision of Pelops is millisecond. Hence the
>>>>>>>> problem as explained by Peter. Once I supplied timestamps precise to
>>>>>>>> microsecond (using System.nanoTime()), the problem went away.
>>>>>>>>
>>>>>>>> I previously stated that sleeping for a few milliseconds didn't help.
>>>>>>>> It was actually because of the precision of Java Thread.sleep().
>>>>>>>> Sleeping for less than 15ms often doesn't sleep at all.
>>>>>>>>
>>>>>>>> Haven't checked the Python side to see if it's a similar situation.
>>>>>>>>
>>>>>>>> Cheers.
>>>>>>>>
>>>>>>>> Jiang
>>>>>>>>
>>>>>>>> On Tue, Aug 30, 2011 at 9:57 AM, Jiang Chen <jia...@gmail.com> wrote:
>>>>>>>>> It's a single node. Thanks for the theory. I suspect part of it may
>>>>>>>>> still be right. Will dig more.
>>>>>>>>>
>>>>>>>>> On Tue, Aug 30, 2011 at 9:50 AM, Peter Schuller
>>>>>>>>> <peter.schul...@infidyne.com> wrote:
>>>>>>>>>>> The problem still happens with very high probability even when it
>>>>>>>>>>> pauses for 5 milliseconds at every loop. If Pycassa uses microseconds
>>>>>>>>>>> it can't be the cause. Also I have the same problem with a Java client
>>>>>>>>>>> using Pelops.
>>>>>>>>>>
>>>>>>>>>> You connect to localhost, but is that a single node or part of a
>>>>>>>>>> cluster with RF > 1? If the latter, you need to use QUORUM consistency
>>>>>>>>>> level to ensure that a read sees your write.
>>>>>>>>>>
>>>>>>>>>> If it's a single node and not a pycassa / client issue, I don't know
>>>>>>>>>> off hand.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> / Peter Schuller (@scode on twitter)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Isn't the standard microseconds? (System.currentTimeMillis()*1000L)
>>>>>> http://wiki.apache.org/cassandra/DataModel
>>>>>> The CLI uses microseconds. If your code and the CLI are doing different
>>>>>> things with time BadThingsWillHappen TM
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>
>
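
For reference, here is a minimal sketch of one way to hand out microsecond
timestamps that never repeat within a single JVM, which is the failure mode
discussed above (two updates in the same millisecond getting the same
timestamp). It is an illustration only and is not copied from Hector,
Pelops, or Cassandra; the class name MonotonicMicrosClock and the method
next() are hypothetical, and it assumes Java 8 or later for
AtomicLong.updateAndGet.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper; not part of Cassandra, Pelops, or Hector. */
final class MonotonicMicrosClock {
    // Last timestamp handed out, in microseconds since the epoch.
    private static final AtomicLong LAST = new AtomicLong();

    private MonotonicMicrosClock() {}

    /**
     * Returns a microsecond-scale timestamp that is strictly greater than
     * every value previously returned in this JVM.
     */
    static long next() {
        // Wall-clock time scaled to microseconds (still millisecond precision).
        final long wallMicros = System.currentTimeMillis() * 1000L;
        // Use the wall clock if it has moved past the last value;
        // otherwise bump the last value by one microsecond.
        return LAST.updateAndGet(prev -> Math.max(wallMicros, prev + 1));
    }
}
```

Because the values stay on the same scale as System.currentTimeMillis() *
1000, they remain comparable with timestamps written by the CLI or other
clients. The guarantee is per JVM only; two separate client processes can
still produce the same timestamp, and a sustained rate above 1000 calls per
millisecond would push the values ahead of the wall clock.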
