Re: How to organize a timeseries by device?

2015-11-10 Thread Guillaume Charhon
d use 5 minutes as your bucket key since you > only have 1 event every 5 minutes. 5-minute bucket seems too small. The > bucket key we mentioned is for you to break the (device_id, timestamp) > partitions into ones with size between ~1MB and ~10MB. > > On Mon, Nov 9, 2015 at 1
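The sizing argument in this reply can be checked with rough arithmetic. The sketch below is only an estimate under stated assumptions: the 300-second event interval comes from the thread, while `bytes_per_event` (50 bytes) is a hypothetical figure that should be replaced with a measured value for the real table.

```python
# Rough size estimate for one (device_id, bucket) partition.
# ASSUMPTION: bytes_per_event=50 is a hypothetical per-row size, not a
# figure from the thread; measure the real value on your own cluster.

def estimate_partition_bytes(bucket_seconds, event_interval_seconds=300,
                             bytes_per_event=50):
    """Return the estimated size in bytes of one device's bucket."""
    events = bucket_seconds // event_interval_seconds
    return events * bytes_per_event

DAY = 86_400  # seconds in one day

if __name__ == "__main__":
    candidates = [
        ("5 minutes", 300),
        ("1 day", DAY),
        ("30 days", 30 * DAY),
        ("365 days", 365 * DAY),
    ]
    for label, seconds in candidates:
        size = estimate_partition_bytes(seconds)
        print(f"{label:>10}: {size:>10,} bytes (~{size / 1_000_000:.3f} MB)")
```

Under these assumptions a 5-minute bucket holds a single event, which is far below the ~1MB to ~10MB target mentioned above, so a much coarser bucket is needed at this event rate.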

Re: How to organize a timeseries by device?

2015-11-09 Thread Guillaume Charhon
hreshold is > defined by compaction_large_partition_warning_threshold_mb in > cassandra.yaml. The default is 100MB. > > You can also use nodetool cfstats to check partition size. > > On Mon, Nov 9, 2015 at 10:53 AM, Guillaume Charhon < > guilla...@databerries.com> wro

Re: How to organize a timeseries by device?

2015-11-09 Thread Guillaume Charhon
o Cassandra's strengths. > > If you bucket the time-based table, do a separate query for each time > bucket. > > -- Jack Krupansky > > On Mon, Nov 9, 2015 at 10:16 AM, Guillaume Charhon < > guilla...@databerries.com> wrote: > >> Kai, Jack, >> >
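The advice to run one query per time bucket can be sketched as follows. This is a minimal illustration, not the schema from the thread: it assumes daily buckets keyed as `YYYY-MM-DD` strings, UTC timestamps, and a hypothetical table `events_by_time` with columns `bucket` and `event_time`. It only builds the statement text and parameters; executing them with a driver is left out.

```python
from datetime import datetime, timedelta, timezone

# HYPOTHETICAL table and column names, used for illustration only.
QUERY = (
    "SELECT * FROM events_by_time "
    "WHERE bucket = %s AND event_time >= %s AND event_time <= %s"
)

def day_buckets(start, end):
    """Return the daily bucket keys covering [start, end], inclusive.

    Both datetimes are assumed to be in the same time zone (UTC here).
    """
    buckets = []
    day = start.date()
    while day <= end.date():
        buckets.append(day.isoformat())
        day += timedelta(days=1)
    return buckets

def queries_for_range(start, end):
    """Return one (statement, parameters) pair per bucket in the range."""
    return [(QUERY, (bucket, start, end)) for bucket in day_buckets(start, end)]

if __name__ == "__main__":
    start = datetime(2015, 11, 9, 22, 0, tzinfo=timezone.utc)
    end = datetime(2015, 11, 10, 1, 0, tzinfo=timezone.utc)
    for statement, params in queries_for_range(start, end):
        print(params[0], "->", statement)
```

Each pair targets exactly one partition, so the per-bucket queries can be issued independently and their results merged on the client.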

Re: How to organize a timeseries by device?

2015-11-09 Thread Guillaume Charhon
utes, or whatever time >> interval makes sense to give 1 to 10 megabytes per partition) and time and >> device as the clustering keys. >> >> Or, consider DSE Search and then you can do whatever ad hoc queries you >> want using Solr. Or Stratio or TupleJump Star

How to organize a timeseries by device?

2015-11-09 Thread Guillaume Charhon
Hello, We are currently storing geolocation events (about 1 per 5 minutes) for each device we track. We currently have 2 TB of data. I would like to store the device_id, the timestamp of the event, latitude and longitude. I thought about using the device_id as the partition key and timestamp as the