How much data do you think you will need ad hoc query ability for?

On Fri, Jan 20, 2012 at 11:28 AM, Brian O'Neill <b...@alumni.brown.edu> wrote:
>
> I can't remember if I asked this question before, but....
>
> We're using Cassandra as our transactional system, and building up quite a
> library of map/reduce jobs that perform data quality analysis, statistics,
> etc.
> (> 100 jobs now)
>
> But... we are still struggling to provide an "ad-hoc" query mechanism for
> our users.
>
> To fill that gap, I believe we still need to materialize our data in an
> RDBMS.
>
> Anyone have any ideas? Better ways to support ad-hoc queries?
>
> Effectively, our users want to be able to select count(distinct Y) from X
> group by Z.
> Where Y and Z are arbitrary columns of rows in X.
>
> We believe we can create column families with different key structures
> (using Y and Z as row keys), but some column names we don't know / can't
> predict ahead of time.
>
> Are people doing bulk exports?
> Anyone trying to keep an RDBMS in synch in real-time?
>
> -brian
>
> --
> Brian ONeill
> Lead Architect, Health Market Science (http://healthmarketscience.com)
> mobile:215.588.6024
> blog: http://weblogs.java.net/blog/boneill42/
> blog: http://brianoneill.blogspot.com/
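
For reference, here is the query shape from the quoted message (select count(distinct Y) from X group by Z, with Y and Z chosen at query time) as a minimal Python sketch over in-memory rows. The row layout, the sample column names ("state", "provider") and the function name count_distinct_by are all assumptions made for illustration; nothing here comes from Brian's actual schema or jobs. It also simplifies SQL semantics by skipping any row that lacks either column.

```python
from collections import defaultdict


def count_distinct_by(rows, y, z):
    """Rough equivalent of: select Z, count(distinct Y) from X group by Z.

    rows is an iterable of dicts (one dict per row of X); y and z are
    column names supplied at query time. Hypothetical helper, not an API
    of Cassandra or any other library.
    """
    seen = defaultdict(set)
    for row in rows:
        # Rows are sparse: skip any row missing either column.
        if y in row and z in row:
            seen[row[z]].add(row[y])
    return {group: len(values) for group, values in seen.items()}


# Made-up sample data standing in for rows of X.
rows = [
    {"state": "PA", "provider": "a"},
    {"state": "PA", "provider": "b"},
    {"state": "PA", "provider": "a"},
    {"state": "NJ", "provider": "c"},
    {"state": "NJ"},
]

result = count_distinct_by(rows, "provider", "state")
print(result)
```

The set-per-group approach only works while the distinct values for each group fit in memory, which is one reason the data volume matters here.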