Why not store session as integer? And timestamp as timesamp(z?) ?
If you know the types of events, also store them as integer , and save a map of them in the app or on another table ? And save the parameters as a json column, so you have more data-types? Hstore only has strings. Be carefull with the mongodb hipster on the stackoverflow post. Elasticsearch is often used for log collection. So, what really is the problem ? On Tue, May 13, 2014 at 4:11 AM, Peeyush Agarwal < peeyushagarwal1...@gmail.com> wrote: > Hi, > > I have log data of the following format: > > > Session Timestamp Event Parameters1 1 > Started Session 1 2 Logged In > Username:"user1"2 3 Started Session1 3 > Started Challenge title:"Challenge 1", level:"2"2 4 > Logged In Username:"user2" > > Now, a person wants to carry out analytics on this log data (And would > like to receive it as a JSON blob after appropriate transformations). For > example, he may want to receive a JSON blob where the Log Data is grouped > by Session and TimeFromSessionStart and CountOfEvents are added before > the data is sent so that he can carry out meaningful analysis. Here I > should return: > > > [ > { > > "session":1,"CountOfEvents":3,"Actions":[{"TimeFromSessionStart":0,"Event":"Session > Started"}, {"TimeFromSessionStart":1, "Event":"Logged In", > "Username":"user1"}, {"TimeFromSessionStart":2, "Event":"Startd Challenge", > "title":"Challenge 1", "level":"2" }] > }, > { > "session":2, > "CountOfEvents":2,"Actions":[{"TimeFromSessionStart":0,"Event":"Session > Started"}, {"TimeFromSessionStart":2, "Event":"Logged In", > "Username":"user2"}] > }] > > > Here, TimeFromSessionStart, CountOfEvents etc. [Let's call it synthetic > additional data] will not be hard coded and I will make a web interface to > allow the person to decide what kind of synthetic data he requires in the > JSON blob. I would like to provide a good amount of flexibility to the > person to decide what kind of synthetic data he wants in the JSON blob. > > If I use PostgreSQL, I can store the data in the following manner: Session > and Event can be string, Timestamp can be date and Parameters can be > hstore(key value pairs available in PostgreSQL). After that, I can use > SQL queries to compute the synthetic (or additional) data, store it > temporarily in variables in a Rails Application (which will interact with > PostgreSQL database and act as interface for the person who wants the JSON > blob) and create JSON blob from it. > > However I am not sure if PostgreSQL is the best choice for this use case. > I have put the detailed question on SO at > http://stackoverflow.com/questions/23544604/log-data-analytics > > Looking for some help from the community. > > Peeyush Agarwal >