My point is about the difficulty in having perfect clocks in a distributed system. If nanosecond precision isn’t happening at Google scale, it’s unlikely to be happening anywhere. The fact that Dapper was written in the context of tracing is irrelevant.
I agree with you: yes, precise time at the nanosecond scale is hard. However, while the context of *tracing* is indeed irrelevant, the notion of *measuring* time is not. This isn’t the same problem at all: the paper is about measuring things that span different software and hardware, while the problem here is the order of writes (as mentioned in the original question).

Anyway, I wouldn’t even trust nanoTime to generate timestamps at the *nanoscale*. Let’s look at java.lang.System.nanoTime(): the javadoc <http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/lang/System.java> says this call provides nanosecond precision but not necessarily nanosecond resolution; the only guarantee is that the resolution is at least as good as that of currentTimeMillis(). Depending on the OS or hardware, the *accuracy* may differ. On Linux, for example, the code may be using an internal counter <http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/87ee5ee27509/src/os/linux/vm/os_linux.cpp#l1453>, and even if it doesn’t, I’m not sure that Linux::clock_gettime(CLOCK_MONOTONIC, &tp); will be *consistent* across different hardware threads (since each core may run at a different speed for several reasons, like physical differences, power management, etc.).

This implies that Java threads may issue writes that cannot be ordered at nanosecond precision. Multicore processors are already a distributed system, so yes, working with accurate nanosecond precision in a distributed system with network latencies is incredibly hard, if not impossible.

There’s still the possibility of using a single-threaded writer (but there are still other issues).

-- Brice

On Thu, Oct 29, 2015 at 8:02 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> My point is about the difficulty in having perfect clocks in a distributed system. If nanosecond precision isn't happening at Google scale, it's unlikely to be happening anywhere. The fact that Dapper was written in the context of tracing is irrelevant.
>
> On Thu, Oct 29, 2015 at 7:27 PM Brice Dutheil <brice.duth...@gmail.com> wrote:
>
>> Additionally, if the time UUID is generated client side, make sure the boxes that will perform the write have a correct NTP/PTP configuration.
>>
>> @John Haddad
>>
>> Keep in mind that in a distributed environment you probably have so much variance that nanosecond precision is pointless. Even Google notes that in the paper, Dapper, a Large-Scale Distributed Systems Tracing Infrastructure [http://research.google.com/pubs/pub36356.html]
>>
>> I agree with your statement about variance. Though I'd just like to mention that Dapper is about *tracing* query/code; more generally, it’s about the execution overhead of tracing, which is a bit different than just *timestamping*.
>>
>> -- Brice
>>
>> On Thu, Oct 29, 2015 at 2:45 PM, Clint Martin <clintlmar...@coolfiretechnologies.com> wrote:
>>
>>> Generating the time UUID on the server side via the now() function also makes the operation non-idempotent. This may not be a huge problem for your application, but it is something to keep in mind.
>>>
>>> Clint
>>>
>>> On Oct 29, 2015 9:01 AM, "Kai Wang" <dep...@gmail.com> wrote:
>>>
>>>> If you want the timestamp to be generated on the C* side, you need to sync clocks among nodes to nanosecond precision first. That alone might be hard or impossible already. I think the safe bet is to generate the timestamp on the client side. But depending on your data volume, if data comes from multiple clients you still need to sync clocks among them.
>>>>
>>>> On Thu, Oct 29, 2015 at 7:57 AM, <chandrasekar....@wipro.com> wrote:
>>>>
>>>>> Hi Doan,
>>>>>
>>>>> Is the timeBased() method available in the Java driver similar to the now() function in cqlsh? Do both provide identical results?
>>>>>
>>>>> Also, the preference is to generate values during record insertion from the database side, rather than the client side.
>>>>> Something similar to SYSTIMESTAMP in Oracle.
>>>>>
>>>>> Regards, Chandra Sekar KR
>>>>>
>>>>> *From:* DuyHai Doan [mailto:doanduy...@gmail.com]
>>>>> *Sent:* 29/10/2015 5:13 PM
>>>>> *To:* user@cassandra.apache.org
>>>>> *Subject:* Re: Oracle TIMESTAMP(9) equivalent in Cassandra
>>>>>
>>>>> You can use the TimeUUID data type and provide the value yourself from the client side.
>>>>>
>>>>> The Java driver offers a utility class com.datastax.driver.core.utils.UUIDs and the method timeBased() to generate the TimeUUID.
>>>>>
>>>>> The precision is only guaranteed up to 100 nanoseconds. So you can have possibly 10k distinct values for 1 millisecond. For your requirement of 20k per sec, it should be enough.
>>>>>
>>>>> On Thu, Oct 29, 2015 at 12:10 PM, <chandrasekar....@wipro.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Oracle Timestamp data type supports fractional seconds (up to 9 digits, 6 is default). What is the Cassandra equivalent data type for Oracle TimeStamp nanosecond precision?
>>>>>
>>>>> This is required for determining the order of insertion of records where the number of records inserted per sec is close to 20K. Is TIMEUUID an alternate functionality which can determine the order of record insertion in Cassandra?
>>>>>
>>>>> Regards, Chandra Sekar KR
>>>>>
>>>>> The information contained in this electronic message and any attachments to this message are intended for the exclusive use of the addressee(s) and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately and destroy all copies of this message and any attachments. WARNING: Computer viruses can be transmitted via email.
>>>>> The recipient should check this email and any attachments for the presence of viruses. The company accepts no liability for any damage caused by any virus transmitted by this email. www.wipro.com
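
For anyone who wants to see the 100-nanosecond arithmetic from the TimeUUID answer in concrete form, here is a rough sketch using only java.util.UUID. The class name TimeUuidTicks, the helper timeUuid(...) and the clockSeqAndNode parameter are made up for illustration; this is not how the driver's UUIDs.timeBased() is implemented, just a hand-built version 1 UUID under the assumption that a single writer hands out the sub-millisecond tick itself.

```java
import java.util.UUID;

final class TimeUuidTicks {
    // 100 ns intervals between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
    static final long UUID_EPOCH_OFFSET_TICKS = 0x01B21DD213814000L;
    // One millisecond holds 10,000 ticks of 100 ns each.
    static final long TICKS_PER_MILLI = 10_000L;

    /**
     * Hypothetical helper: builds a version 1 UUID from a Unix time in
     * milliseconds plus a sub-millisecond tick in the range 0..9999.
     */
    static UUID timeUuid(long unixMillis, int tick, long clockSeqAndNode) {
        if (tick < 0 || tick >= TICKS_PER_MILLI) {
            throw new IllegalArgumentException("tick must be in 0..9999");
        }
        long ts = unixMillis * TICKS_PER_MILLI + tick + UUID_EPOCH_OFFSET_TICKS;
        long timeLow = ts & 0xFFFFFFFFL;
        long timeMid = (ts >>> 32) & 0xFFFFL;
        long timeHi = (ts >>> 48) & 0x0FFFL;
        long msb = (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHi; // 0x1000 = version 1
        long lsb = 0x8000000000000000L | (clockSeqAndNode & 0x3FFFFFFFFFFFFFFFL); // variant bits "10"
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        long millis = 1446148920000L; // an arbitrary instant
        UUID first = timeUuid(millis, 0, 0L);
        UUID second = timeUuid(millis, 1, 0L);
        // Order on timestamp(); UUID.compareTo compares the raw bits, not the time.
        System.out.println(first + " ticks=" + first.timestamp());
        System.out.println(second + " ticks=" + second.timestamp());
        System.out.println("ticks apart: " + (second.timestamp() - first.timestamp()));
    }
}
```

At 20k writes per second a single writer needs about 20 distinct values per millisecond, well under the 10,000 ticks available, which matches the estimate quoted above. It does not help with ordering across several clients whose clocks disagree.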
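
On the nanoTime() resolution point from Brice's reply, a quick way to check what a given machine actually delivers is to record the smallest step seen between successive calls. This is only a sketch: NanoTimeGranularity and smallestObservedStep are hypothetical names, the result depends on the OS, hardware and JVM, and it says nothing about consistency across cores or hosts.

```java
final class NanoTimeGranularity {
    /**
     * Returns the smallest positive difference seen between successive
     * System.nanoTime() readings over the given number of observed changes.
     */
    static long smallestObservedStep(int samples) {
        long smallest = Long.MAX_VALUE;
        long previous = System.nanoTime();
        int seen = 0;
        while (seen < samples) {
            long current = System.nanoTime();
            long delta = current - previous;
            if (delta > 0) { // only count readings where the value advanced
                smallest = Math.min(smallest, delta);
                previous = current;
                seen++;
            }
        }
        return smallest;
    }

    public static void main(String[] args) {
        // The printed value varies from machine to machine and run to run.
        System.out.println("smallest observed step (ns): " + smallestObservedStep(100_000));
    }
}
```

Whatever number comes out is an upper bound on the clock's granularity as seen from one thread, which is why it cannot settle the ordering of writes issued from several threads or machines.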