Example Data Modelling

2015-07-06 Thread Srinivasa T N
Hi, I have a basic doubt: I have an RDBMS with the following two tables: Emp (EmpID, FN, LN, Phone, Address) and Sal (Month, EmpID, Basic, Flexible Allowance). My use case is to print the salary slip at the end of each month, and the slip contains the employee's name and other details. Now, if

Re: Example Data Modelling

2015-07-07 Thread Srinivasa T N
RY KEY(EmpID, month) > > ) > > > > That way the salaries will be partitioned by EmpID and clustered by month, > which I guess is the natural sorting you want. > > > > Hope it helps, > > Cheers! > > > Carlos Alonso | Software Engineer | @calonso > &

Storing large files for later processing through hadoop

2015-01-02 Thread Srinivasa T N
Hi All, The problem I am trying to address is: store the raw files (XML files of around 700 MB each) in Cassandra, later fetch them and process them in a Hadoop cluster, and write the processed data back to Cassandra. Regarding this, I wanted a few clarifications: 1) The FAQ ( ht

Re: Storing large files for later processing through hadoop

2015-01-02 Thread Srinivasa T N
On Fri, Jan 2, 2015 at 5:54 PM, mck wrote: > > You could manually chunk them down to 64Mb pieces. > > Can this split and combine be done automatically by cassandra when inserting/fetching the file without application being bothered about it? > > > 2) Can I replace HDFS with Cassandra so that I
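A minimal sketch of the manual chunking mck suggests, done on the application side rather than by Cassandra. The function names are hypothetical, and how each piece would be stored (for example, one row per piece keyed by a file id and the piece index) is an assumption that the thread does not spell out.

```python
import io

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the piece size suggested in the thread


def split_into_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield (index, bytes) pairs read sequentially from a binary stream."""
    index = 0
    while True:
        piece = stream.read(chunk_size)
        if not piece:  # empty read means end of stream
            break
        yield index, piece
        index += 1


def reassemble(chunks):
    """Join (index, bytes) pairs back into the original payload, ordered by index."""
    return b"".join(piece for _, piece in sorted(chunks, key=lambda c: c[0]))
```

Because every piece carries its index, the pieces can be fetched in any order and still be recombined correctly before handing the file to the processing job.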

Re: Storing large files for later processing through hadoop

2015-01-02 Thread Srinivasa T N
es of vastly exceeding recommended limits. > > On Fri, Jan 2, 2015 at 9:53 AM, Srinivasa T N wrote: > >> >> >> On Fri, Jan 2, 2015 at 5:54 PM, mck wrote: >> >>> >>> You could manually chunk them down to 64Mb pieces. >>> >>

Re: Storing large files for later processing through hadoop

2015-01-02 Thread Srinivasa T N
Hi Wilm, The reason is that for some auditing purpose, I want to store the original files also. Regards, Seenu. On Fri, Jan 2, 2015 at 11:09 PM, Wilm Schumacher wrote: > Hi, > > perhaps I totally misunderstood your problem, but why "bother" with > cassandra for storing in the first place? >

Re:

2015-01-05 Thread Srinivasa T N
Just an arrow in the dark: the document "CQL for Cassandra 2.x Documentation" says that Cassandra allows querying on a column when it is indexed. Regards, Seenu. On Mon, Jan 5, 2015 at 5:14 PM, Nagesh wrote: > Hi All, > > I have designed a column family > > prodgroup text, prodid int, status int

Queries required before data modeling?

2015-01-06 Thread Srinivasa T N
Hi All, I was just googling around and reading the various articles on data modeling in Cassandra. All of them talk about working backwards, i.e., first know what type of queries you are going to make, and then select a data model that can support those queries efficiently. But one thing I can

Re: Queries required before data modeling?

2015-01-06 Thread Srinivasa T N
r. However, both of these approaches have performance implications > (they fan out and scan lots of data) and if you need Cassandra's speed and > scalability then you're going to need to model in a scalable way. > > > > > On Tue, Jan 6, 2015 at 11:47 AM, Srinivasa T N wro

How to store weather station Details along with monitoring data efficiently?

2015-01-23 Thread Srinivasa T N
Hi All, I was following the TimeSeries data modelling in PlanetCassandra by Patrick McFadin. Regarding that, I had one query: If I need to store the weather station name also, should it be in the same table, say: create table test (wea_id int, wea_name text, wea_add text, eventday timeuuid, e

Re: How to store weather station Details along with monitoring data efficiently?

2015-01-23 Thread Srinivasa T N
I forgot, my task at hand is to generate a report of all the weather stations along with the sum of temperatures measured each day. Regards, Seenu. On Fri, Jan 23, 2015 at 2:14 PM, Srinivasa T N wrote: > Hi All, >I was following the TimeSeries data modelling in PlanetC

Re: How to store weather station Details along with monitoring data efficiently?

2015-01-24 Thread Srinivasa T N
a : Primary & clustered keys. > > Errata: You could add Longitude & Latitude too to the model to add a > level of detail especially since it's widely prevalent for weather station > data. > > hope this helps. > > jan/ > > > On Friday, January 23, 2015 3:1

Re: Suggestion Date as a Partition key

2015-02-04 Thread Srinivasa T N
I would not suggest using only the date as the partition key. That puts all the records for a single day into a single partition, which loads that one partition while the other partitions sit idle. Try adding some other field to the partition key as well, so that the load is distributed. Check thi
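The effect described above can be illustrated with a small sketch that counts how many records land under each partition key value. The `source` field and the sample records are hypothetical, chosen only for illustration; any field that varies within a day would serve as the extra key component.

```python
from collections import Counter


def partition_counts(records, key_fields):
    """Count how many records fall under each partition key value."""
    return Counter(tuple(r[f] for f in key_fields) for r in records)


# 100 hypothetical records, all written on the same day by four sources.
records = [
    {"day": "2015-02-04", "source": f"s{i % 4}", "value": i}
    for i in range(100)
]

by_day = partition_counts(records, ["day"])  # date-only key
by_day_and_source = partition_counts(records, ["day", "source"])  # composite key
```

With the date-only key all records share one partition, while the composite key splits the same day across one partition per source.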

Re: Timeseries analysis using Cassandra and partition by date period

2015-04-06 Thread Srinivasa T N
Comparison to OpenTSDB HBase For one we do not use id’s for strings. The string data (metric names and tags) are written to row keys and the appropriate indexes. Because Cassandra has much wider rows there are far fewer keys written to the database. The space saved by using id’s is minor and by n

Support for ad-hoc query

2015-06-08 Thread Srinivasa T N
Hi All, I have a web application running with my backend data stored in Cassandra. Now I want to do some analysis on the stored data, which requires running some ad-hoc queries against Cassandra. How can I do that? Regards, Seenu.

Re: Support for ad-hoc query

2015-06-09 Thread Srinivasa T N
andra wasn't designed for it. One thing we've done for our own > project is to combine solr with our own fuzzy index to make ad-hoc queries > against a single table more friendly. > > > > On Tue, Jun 9, 2015 at 2:38 AM, Srinivasa T N wrote: > >> Hi All, >>