And by default in CQL 3 the timestamp is generated server side. There is an option to provide it client side, however.
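
To make the effect of client-supplied timestamps concrete, here is a minimal sketch of last-write-wins resolution with one lagging client clock. It is a toy model, not Cassandra or Astyanax code: the `Column` class, the `first_visible_update` helper, the one-second retry interval and the 30-second skew are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    value: int
    timestamp: int  # microseconds, supplied by the writing client


class Column:
    """Toy column that keeps only the cell with the highest timestamp."""

    def __init__(self):
        self.cell = None

    def write(self, value, timestamp):
        # Last write wins: a write with an older (or equal) timestamp is
        # silently dropped. Tie-breaking is ignored in this sketch.
        if self.cell is None or timestamp > self.cell.timestamp:
            self.cell = Cell(value, timestamp)

    def read(self):
        return None if self.cell is None else self.cell.value


def to_micros(seconds):
    return int(seconds * 1_000_000)


def first_visible_update(skew_s, retry_s=1.0):
    """Seconds of true time until a status=2 write from a lagging client sticks.

    Client A (correct clock) writes status=1 at true time 0. Client B, whose
    clock runs skew_s seconds behind, retries writing status=2 every retry_s.
    """
    status = Column()
    status.write(1, to_micros(0.0))
    t = 0.0
    while status.read() != 2:
        t += retry_s
        # B stamps the write with its own (lagging) clock.
        status.write(2, to_micros(t - skew_s))
    return t


if __name__ == "__main__":
    print(first_visible_update(0.0))   # clocks in sync
    print(first_visible_update(30.0))  # B is 30 seconds behind
```

In this model the update only becomes visible once the lagging clock has passed the timestamp of the earlier write, which is consistent with a delay that disappears after running ntpdate. In CQL 3 the client-side option mentioned above is the `USING TIMESTAMP` clause.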
Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 10/01/2013, at 3:32 AM, Vegard Berget <p...@fantasista.no> wrote:

> Hi,
>
> The timestamp is generated on the client side, so actually if you have two
> clients that set the timestamp from the system time, you will experience
> trouble. I don't know how Astyanax does it, and I am not sure if it would
> cause trouble when getting data? Could it be that the Process server
> actually saw the information, but tried to update with a lower timestamp -
> which then again means that failed - until 40 seconds had passed.
> From http://wiki.apache.org/cassandra/DataModel:
> "All values are supplied by the client, including the 'timestamp'. This means
> that clocks on the clients should be synchronized (in the Cassandra server
> environment is useful also), as these timestamps are used for conflict
> resolution. In many cases the 'timestamp' is not used in client applications,
> and it becomes convenient to think of a column as a name/value pair. For the
> remainder of this document, 'timestamps' will be elided for readability. It is
> also worth noting the name and value are binary values, although in many
> applications they are UTF8 serialized strings."
>
> .vegard,
>
> ----- Original Message -----
> From: user@cassandra.apache.org
> To: <user@cassandra.apache.org>
> Cc:
> Sent: Wed, 9 Jan 2013 15:56:08 +0200
> Subject: Re: How long does it take for a write to actually happen?
>
> Aaron, thanks a lot for your response! It gave us many ideas for future
> re-factorings.
>
> Meanwhile, while trying to monitor Cassandra response times on all 3 servers
> (online, offline and cassandra itself), I have noticed that the system time
> was different on all 3. After I ran ntpdate on all of them, the problem was
> gone! The changes saved in Cassandra on offline are immediately visible to
> online.
>
> Unfortunately, I cannot explain why system time on the client machine
> matters, but I really hope that I have found the root cause of the problem,
> and it is not just a coincidence that performance has improved after I
> synced the times.
>
> Best,
> Vitaly Sourikov
>
> On Wed, Jan 9, 2013 at 4:24 AM, aaron morton <aa...@thelastpickle.com> wrote:
> > EC2 m1.large node
> You will have a much happier time if you use an m1.xlarge.
>
> > We set MAX_HEAP_SIZE="6G" and HEAP_NEWSIZE="400M"
> That's a pretty low new heap size.
>
> > checks for new entries (in "Entries" CF, with indexed column status=1),
> > processes them, and sets the status to 2, when done
> This is not the best data model.
> You may be better off having one CF for the unprocessed entries and one for
> the processed ones.
> Or, if you really need a queue, use something like Kafka.
>
> > I will appreciate any advice on how to speed the writes up,
> Writes are instantly available for reading.
> The first thing I would do is see where the delay is. Use nodetool cfstats
> to see the local write latency, or track the write latency from the client
> perspective.
>
> If you are looking for near real time / continuous computation style
> processing take a look at http://storm-project.net/ and register for this
> talk from Brian O'Neill, one of my fellow DataStax MVPs:
> http://learn.datastax.com/WebinarCEPDistributedProcessingonCassandrawithStorm_Registration.html
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 9/01/2013, at 5:48 AM, Vitaly Sourikov <vitaly.souri...@gmail.com> wrote:
>
> Hi,
> we are currently at an early stage of our project and have only one Cassandra
> 1.1.7 node hosted on an EC2 m1.large node, where the data is written to the
> ephemeral disk, and /var/lib/cassandra/data is just a soft link to it. Commit
> logs and caches are still on /var/lib/cassandra/.
> We set MAX_HEAP_SIZE="6G" and HEAP_NEWSIZE="400M"
>
> On the client-side, we use Astyanax 1.56.18 to access the data. We have a
> processing server that writes to Cassandra, and an online server that reads
> from it. The former wakes up every 0.5-5 sec., checks for new entries (in
> "Entries" CF, with indexed column status=1), processes them, and sets the
> status to 2 when done. The online server checks once a second if an entry
> that should be processed got the status 2 and sends it to its client side for
> display. Processing takes 5-10 seconds and updates various columns in the
> "Entries" CF a few times on the way. One of these columns may contain ~12KB of
> textual data, others are just short strings or numbers.
>
> Now, our problem is that it takes 20-40 seconds before the online server
> actually sees the change, which is way too long; this process is supposed
> to be nearly real-time. Moreover, in cqlsh, if I perform a similar update, it
> is immediately seen in the following select results, but the updates from the
> back-end server also do not appear for 20-40 seconds.
>
> I tried switching the row caches for that table and in yaml on and off. I
> tried commitlog_sync: batch with commitlog_sync_batch_window_in_ms: 50.
> Nothing helped.
>
> I will appreciate any advice on how to speed the writes up, or at least an
> explanation why this happens.
>
> thanks,
> Vitaly