Hi All,
I'm trying to migrate my time series data, which is a GPS trace, from MySQL to C*.
I want a wide row to hold one day's data. I designed the data model as below.
Please help to see if there is any problem. Any suggestion is appreciated.
Table Model:
CREATE TABLE cargts.eventdata (
deviceid
Hi Simon,
Why is position text and not float? Text takes much more space.
Also, speed and heading can be calculated based on the latest positions, so you
don't need to store them. If you really need them in the database you can save them as
floats, or compose a single float value like speed.heading: 41.173 (or
Welp, that's good but wasn't apparent in the codebase :S.
Kurt Greaves
k...@instaclustr.com
www.instaclustr.com
On 20 October 2016 at 05:02, Alexander Dejanovski
wrote:
> Hi Kurt,
>
> we're not actually.
> Reaper performs full repair by subrange but does incremental repair on all
> ranges at on
Ah, didn't pick up on that, but it looks like he's storing JSON within position.
Is there any strong reason for this, or, as Vladimir mentioned, can you store
the fields under "position" in separate columns?
Kurt Greaves
k...@instaclustr.com
www.instaclustr.com
On 20 October 2016 at 08:17, Vladimir Yudov
If event_time is a timestamp since the Unix epoch you 1. may want to use the
built-in timestamp type, and 2. order by event_time DESC. 2 applies if you
want to do queries such as "select * from eventdata where ... and
event_time > x" (i.e. get the latest events).
Other than that your model seems workable,
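For illustration, a minimal sketch of that advice (this is not Simon's actual schema; the column list is assumed from this thread):

  CREATE TABLE cargts.eventdata (
      deviceid int,
      date int,                 -- day bucket, e.g. 20161020
      event_time timestamp,     -- built-in timestamp type
      position text,
      PRIMARY KEY ((deviceid, date), event_time)
  ) WITH CLUSTERING ORDER BY (event_time DESC);

  -- rows come back newest-first, so a "latest events" query needs no ORDER BY
  SELECT * FROM cargts.eventdata
  WHERE deviceid = 123 AND date = 20161020
    AND event_time > '2016-10-20 08:00:00+0000';

Both partition key columns are restricted by equality, and the range restriction sits on the clustering column event_time, which is what makes the "get latest events" query efficient.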
probably because I was looking at the wrong version of the codebase :p
Thank you Kurt, I thought the one column which was identified by the composite
key (deviceId+date+event_time) could hold only one value, so I packed all the info
into one JSON. Maybe I'm wrong. I have rewritten the table as below.
CREATE TABLE cargts.eventdata (
deviceid int,
date int,
event_time
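For reference, one possible full shape of that rewritten table (the column names and types below, beyond the three shown above, are guesses for illustration only):

  CREATE TABLE cargts.eventdata (
      deviceid int,
      date int,                 -- day bucket
      event_time bigint,        -- epoch millis; the built-in timestamp type works the same way as a clustering key
      latitude double,
      longitude double,
      speed double,
      heading double,
      PRIMARY KEY ((deviceid, date), event_time)
  ) WITH CLUSTERING ORDER BY (event_time DESC);

  -- each event is a single row; non-key columns are set independently, no JSON packing needed
  INSERT INTO cargts.eventdata (deviceid, date, event_time, latitude, longitude, speed, heading)
  VALUES (123, 20161020, 1476950400000, 31.2304, 121.4737, 41.0, 173.0);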
Hi Kurt,
I do need to align the time windows to a day bucket to prevent one row from becoming too
big, and event_time is a timestamp since the Unix epoch. If I use bigint as the type of
event_time, can I do the queries you mentioned?
-Simon Wu
From: kurt Greaves
Date: 2016-10-20 16:18
To: user
Subject: Re: time s
Hi Edward,
Thanks a lot for your help. It helped us narrow down the problem.
Regards
On Mon, Oct 17, 2016 at 9:33 PM, Edward Capriolo
wrote:
> https://issues.apache.org/jira/browse/CASSANDRA-11198
>
> Which has problems "maybe" fixed by:
>
> https://issues.apache.org/jira/browse/CASSANDRA-1147
Hi. After I ran a token-range repair from the node at 12.5.13.125 with
nodetool repair -full -st ${start_tokens[i]} -et ${end_tokens[i]}
on every token range, I got this node load:
-- Address Load Tokens Owns Rack
UN 12.5.13.141 23.94 GB 256 32.3% rack1
DN 12.5.13.125
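For anyone following along, the loop implied by that command might look like this (a sketch; it assumes the start_tokens/end_tokens arrays were already filled, e.g. by parsing the output of nodetool ring for this node):

  #!/bin/bash
  # Run a full repair separately for each token range owned by this node.
  for i in "${!start_tokens[@]}"; do
      nodetool repair -full -st "${start_tokens[$i]}" -et "${end_tokens[$i]}"
  done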
Hi,
I would like to know how you guys handle leap seconds with Cassandra.
I am not bothered about the livelock issue as we are using appropriate versions
of Linux and Java. I am more interested in finding an optimum answer for the
following question:
How do you handle wrong ordering of multiple
http://www.datastax.com/dev/blog/preparing-for-the-leap-second gives a
pretty good overview
If you are using a timestamp as part of your primary key, this is the
situation where you could end up overwriting data. I would suggest using
timeuuid instead, which will ensure that you get different primary keys.
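A hedged sketch of that suggestion (table and column names are made up): a timeuuid carries a timestamp plus extra uniqueness bits, so two writes in the same instant still produce distinct primary keys instead of overwriting each other.

  CREATE TABLE sensor_events (
      sensor_id int,
      event_id timeuuid,        -- time-based UUID instead of a raw timestamp
      payload text,
      PRIMARY KEY (sensor_id, event_id)
  ) WITH CLUSTERING ORDER BY (event_id DESC);

  -- now() generates a fresh timeuuid server-side for each insert;
  -- dateOf(event_id) recovers the embedded timestamp on read
  INSERT INTO sensor_events (sensor_id, event_id, payload)
  VALUES (1, now(), '{"reading": 42}');

  SELECT sensor_id, dateOf(event_id) AS ts, payload
  FROM sensor_events WHERE sensor_id = 1;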
Hi,
Normally people would like to store smaller values in Cassandra. Is there
anyone using it to store larger values (e.g. 500KB or more) and, if so,
what are the issues you are facing? I would also like to know the tweaks
you are considering.
Thanks,
Vikas
We use Cassandra to store images. Any data above 2 MB we chunk and store. It
works perfectly.
Sent from my iPhone
> On Oct 20, 2016, at 12:09 PM, Vikas Jaiman wrote:
>
> Hi,
>
> Normally people would like to store smaller values in Cassandra. Is there
> anyone using it to store for larger
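A sketch of what the chunking approach described above could look like on the schema side (this is not the poster's actual schema, just an illustration of the idea):

  -- one partition per image, one row per chunk
  CREATE TABLE image_chunks (
      image_id uuid,
      chunk_no int,
      chunk blob,               -- keep each chunk well under the ~2 MB mark
      PRIMARY KEY (image_id, chunk_no)
  );

  -- the application splits the file, writes the chunks in order,
  -- then reassembles them by reading the whole partition back:
  SELECT chunk FROM image_chunks
  WHERE image_id = 5132b130-96c8-11e6-ae22-56b6b6499611;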
You can, but it is not really very efficient or cost-effective. You may
encounter issues with streaming, repairs and compaction if you have very
large blobs (100MB+), so try to keep them under 10MB if possible.
I'd suggest storing blobs in something like Amazon S3 and keeping just the
bucket name
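And a minimal sketch of the pointer approach suggested here (names are made up for illustration): keep the blob itself in S3 and only its location plus metadata in Cassandra.

  CREATE TABLE blob_index (
      blob_id uuid PRIMARY KEY,
      s3_bucket text,
      s3_key text,
      size_bytes bigint,
      created_at timestamp
  );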
Hello All,
I have a single datacenter with 3 C* nodes and we are trying to expand the
cluster to another region/DC. I am seeing the below error while doing a
"nodetool rebuild -- name_of_existing_data_center".
[user@machine ~]$ nodetool rebuild DC1
nodetool: Unable to find sufficient sources for s
we faced a similar issue earlier, but that was more related to firewall
rules. The newly added datacenter was not able to communicate with the
existing datacenters on port 7000 (inter-node communication). Yours
might be a different issue, but just saying.
On Thu, Oct 20, 2016 at 4:12 PM, Jai
thanks,
This always works on versions 2.1.13 and 2.1.16 but not on 3.0.8. It's definitely
not a firewall issue.
On Thu, Oct 20, 2016 at 1:16 PM, sai krishnam raju potturi <
pskraj...@gmail.com> wrote:
> we faced a similar issue earlier, but that was more related to firewall
> rules. The newly added dat
Howdy folks. I asked a bit about this in IRC yesterday, but we're looking
to hopefully confirm a couple of things for our sanity.
Yesterday, I was performing an operation on a 21-node cluster (vnodes,
replication factor 3, NetworkTopologyStrategy, and the nodes are balanced
across 3 AZs on AWS EC2
I have seen this on other releases, on 2.2.x. The workaround is exactly
like yours, some other system keyspaces also need similar changes.
I would say this is a benign bug.
Yabin
On Thu, Oct 20, 2016 at 4:41 PM, Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:
> thanks,
>
> This alway
Most likely the issue is caused by the fact that when you move the data,
you move the system keyspace data away as well. Meanwhile, because the data
was copied into a different location than what C* is expecting,
when C* starts, it cannot find the system metadata info and therefore
tries
Thank you Yabin, is there an existing JIRA that I can refer to?
On Thu, Oct 20, 2016 at 2:05 PM, Yabin Meng wrote:
> I have seen this on other releases, on 2.2.x. The workaround is exactly
> like yours, some other system keyspaces also need similar changes.
>
> I would say this is a benign bug.
Thanks for the response, Yabin. However, if there's an answer to my
question here, I'm apparently too dense to see it ;)
I understand that, since the system keyspace data was not there, it started
bootstrapping. What's not clear is whether they took over the token ranges of
the previous nodes or got
This is awesome. I have sent out the patches which we back-ported into 2.1
on the dev list.
On Wed, Oct 19, 2016 at 4:33 PM, kurt Greaves wrote:
>
> On 19 October 2016 at 21:07, sfesc...@gmail.com
> wrote:
>
>> Wow, thank you for doing this. This sentiment regarding stability seems
>> to be wid
Thanks Sankalp, we are also reviewing our internal 2.1 list against what
you published (though we are trying to upgrade everyone to later versions
e.g. 2.2). It's great to compare notes.
On Thu, 20 Oct 2016 at 16:19 sankalp kohli wrote:
> This is awesome. I have sent out the patches which we bac
I will also publish 3.0 back ports once we are running 3.0
On Thu, Oct 20, 2016 at 4:23 PM, Ben Bromhead wrote:
> Thanks Sankalp, we are also reviewing our internal 2.1 list against what
> you published (though we are trying to upgrade everyone to later versions
> e.g. 2.2). It's great to compar
Sorry, I'm not aware of it
On Thu, Oct 20, 2016 at 6:00 PM, Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:
> Thank you Yabin, is there an existing JIRA that I can refer to?
>
> On Thu, Oct 20, 2016 at 2:05 PM, Yabin Meng wrote:
>
>> I have seen this on other releases, on 2.2.x. The wo
I believe you're using VNodes (because a token range change doesn't make
sense for a single-token setup unless you change it explicitly). If you
bootstrap a new node with VNodes, I think the way that the token ranges are
assigned to the node is random (I'm not 100% sure here, but it should be so
logically
Thanks Ben,
I tried to run a rebuild and repair after the failed node rejoined the
cluster as a "new" node with -Dcassandra.replace_address_first_boot.
The failed node could rejoin and I could read all rows successfully.
(Sometimes a repair failed because the node could not access another node. If
A couple of questions:
1) At what stage did you have (or expect to have) 1000 rows (and have the
mismatch between actual and expected) - at the end of operation (2) or
after operation (3)?
2) What replication factor and replication strategy is used by the test
keyspace? What consistency level is u
I guess I'm either not understanding how that answers the question
and/or I've just done a terrible job of asking it. I'll sleep on it and
maybe I'll think of a better way to describe it tomorrow ;)
On Thu, Oct 20, 2016 at 8:45 PM, Yabin Meng wrote:
> I believe you're using VNodes (because to
The easiest way to figure out what happened is to examine the system log. It
will tell you what happened. But I’m pretty sure your nodes got new tokens
during that time.
If you want to get back the data inserted during the 2 hours you could use
sstableloader to send all the data from the /var
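For reference, a hedged example of that sstableloader invocation (addresses and paths are placeholders; the last two path components must be the keyspace and table directory, the usual layout under /var/lib/cassandra/data):

  # stream the SSTables of one table into the live cluster;
  # -d takes one or more contact points of the target cluster
  sstableloader -d 10.0.0.1,10.0.0.2 /var/lib/cassandra/data/my_keyspace/my_table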
On 20 October 2016 at 20:58, Branton Davis
wrote:
> Would they have taken on the token ranges of the original nodes or acted
> like new nodes and got new token ranges? If the latter, is it possible
> that any data moved from the healthy nodes to the "new" nodes or
> would restarting them with th
Hello,
I was going through this presentation and Slide 55 caught my attention,
i.e. "Throttled down compactions during high load period, throttled up during
low load period".
Can we throttle down compactions without a restart?
If this can be done, what are all the parameters (JMX
You can throttle compactions using nodetool setcompactionthroughput <x>,
where x is in MB/s. If you're using 2.2 or later this applies immediately
to all running compactions; otherwise it applies to any "new" compactions.
You will want to be careful of allowing compactions to utilise too much
disk bandwidth.
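For example (the values are illustrative):

  # throttle compactions down to 16 MB/s during the high-load period
  nodetool setcompactionthroughput 16

  # check the current setting
  nodetool getcompactionthroughput

  # 0 removes the throttle entirely (use with care)
  nodetool setcompactionthroughput 0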
You can also set concurrent compactors through JMX – in the CompactionManager
mbean, you have CoreCompactionThreads and MaxCompactionThreads – you can adjust
them at runtime, but do it in an order such that Max is always higher than Core
From: kurt Greaves
Reply-To: "user@cassandra.apa
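A hedged sketch of adjusting those compactor thread counts with a command-line JMX client such as jmxterm (the jar name is a placeholder, and the exact attribute names can differ between versions, so list them with info first):

  # open an interactive JMX session against the local node (JMX port 7199 by default)
  java -jar jmxterm.jar -l localhost:7199

  # inside the jmxterm session:
  bean org.apache.cassandra.db:type=CompactionManager
  info                              # lists the available attributes
  set MaximumCompactorThreads 8     # raise the maximum first...
  set CoreCompactorThreads 8        # ...then the core count, so max stays >= core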
thanks Ben,
> 1) At what stage did you have (or expect to have) 1000 rows (and have the
mismatch between actual and expected) - at the end of operation (2) or
after operation (3)?
after operation 3), at operation 4) which reads all rows by cqlsh with
CL.SERIAL
> 2) What replication factor and r
OK. Are you certain your tests don’t generate any overlapping inserts (by
PK)? Cassandra basically treats any inserts with the same primary key as
updates (so 1000 insert operations may not necessarily result in 1000 rows
in the DB).
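To illustrate the point with a toy table (names made up):

  CREATE TABLE test.kv (id int PRIMARY KEY, val text);

  -- two "inserts" with the same primary key...
  INSERT INTO test.kv (id, val) VALUES (1, 'first');
  INSERT INTO test.kv (id, val) VALUES (1, 'second');

  -- ...leave exactly one row behind (the later write wins)
  SELECT * FROM test.kv WHERE id = 1;   -- id = 1, val = 'second'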
On Fri, 21 Oct 2016 at 16:30 Yuji Ito wrote:
> thanks Ben,
>
>
> Are you certain your tests don’t generate any overlapping inserts (by PK)?
Yes. The operation 2) also checks the number of rows just after all
insertions.
On Fri, Oct 21, 2016 at 2:51 PM, Ben Slater
wrote:
> OK. Are you certain your tests don’t generate any overlapping inserts (by
> PK)? Cas
Just to confirm, are you saying:
a) after operation 2, you select all and get 1000 rows
b) after operation 3 (which only does updates and read) you select and only
get 953 rows?
If so, that would be very unexpected. If you run your tests without killing
nodes do you get the expected (1,000) rows?