Data modelling questions

2015-02-20 Thread AM
Hi All. I am currently looking at using Riak as a data store for time series data. Currently we get about 1.5T of data in JSON format that I intend to persist in Riak. I am having some difficulty figuring out how to model it such that I can fulfill the use cases I have been handed. The data

Re: Data modelling questions

2015-02-22 Thread AM
S3 and we get notified (usually on an hourly basis, some logs on a 10-min basis), so I can massage it further, but I am concerned that every place where I buffer is another opportunity for losing data, and I would like to avoid reprocessing as much as possible. Messages will already have the

Re: Data modelling questions

2015-02-23 Thread AM
erts off of the data whose granularity is most likely going to be of the order of 10 mins. These are just counters on a single time dimension, so I am assuming that if I get the model right this will be easy. Yes, we can do this via EMR, but it also requires additional moving parts that we
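
For readers following the thread, here is a minimal sketch of what "counters on a single time dimension" at 10-minute granularity could look like. It is not taken from the thread and does not use any Riak client API: the key format, the `bucket_key` and `count_events` names, and the `logins` metric are all hypothetical, chosen only to illustrate rounding timestamps down to a bucket boundary and counting per bucket.

```python
from collections import Counter
from datetime import datetime, timezone

BUCKET_SECONDS = 600  # assumed 10-minute granularity, as mentioned in the thread


def bucket_key(metric, ts, bucket_seconds=BUCKET_SECONDS):
    """Build a hypothetical key such as 'logins:20150223T0520'.

    `ts` should be a timezone-aware datetime; the timestamp is rounded
    down to the start of its bucket and rendered in UTC.
    """
    epoch = int(ts.timestamp())
    start = epoch - (epoch % bucket_seconds)
    start_dt = datetime.fromtimestamp(start, tz=timezone.utc)
    return f"{metric}:{start_dt:%Y%m%dT%H%M}"


def count_events(events):
    """Count (metric, timestamp) pairs per bucket key."""
    counts = Counter()
    for metric, ts in events:
        counts[bucket_key(metric, ts)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        ("logins", datetime(2015, 2, 23, 5, 24, 17, tzinfo=timezone.utc)),
        ("logins", datetime(2015, 2, 23, 5, 29, 59, tzinfo=timezone.utc)),
        ("logins", datetime(2015, 2, 23, 5, 30, 0, tzinfo=timezone.utc)),
    ]
    for key, value in sorted(count_events(sample).items()):
        print(key, value)
```

Because every event in the same 10-minute window maps to the same key, an alerting job only needs to read one value per metric per window; whether those values live in Riak or elsewhere is a separate choice this sketch does not address.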

Re: Data modelling questions

2015-02-24 Thread AM
any other questions. Thanks so much for all the help. I think I have a pretty good idea as to how to move forward. Thanks again. AM Jason On 24 Feb 2015, at 05:24, AM wrote: On 2/22/15 6:16 PM, Jason Campbell wrote: Coming at this from another angle, if you already have a permanent data st