Re: Schema Design

Nick Santini Wed, 26 Jan 2011 13:32:15 -0800

One thing you can do is create one CF and use the application name +
timestamp as the row key. With that you can do your range query using
OPP, then store whatever you want in the row.


The problem would be if one app generates far more logs than the others,
since under OPP that app's rows would pile up on a few nodes.
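
A rough sketch of the key layout, in plain Python rather than against a real Cassandra client. It only simulates the sorted key order that an order-preserving partitioner would give you. The helper names (make_row_key, range_query), the ":" separator, and the 10-digit zero padding are assumptions for illustration, not anything Cassandra prescribes.

```python
import bisect


def make_row_key(app_name, posix_ts):
    """Build a row key of the form '<app>:<zero-padded POSIX seconds>'.

    Zero padding keeps lexicographic order equal to numeric order,
    which is what a range scan over ordered keys relies on.
    The ':' separator and width of 10 are illustrative assumptions.
    """
    return "{}:{:010d}".format(app_name, posix_ts)


def range_query(sorted_keys, app_name, start_ts, end_ts):
    """Return the keys for one app with start_ts <= timestamp <= end_ts.

    sorted_keys stands in for the key order an order-preserving
    partitioner would maintain; this is not a Cassandra API call.
    """
    lo = bisect.bisect_left(sorted_keys, make_row_key(app_name, start_ts))
    hi = bisect.bisect_right(sorted_keys, make_row_key(app_name, end_ts))
    return sorted_keys[lo:hi]


if __name__ == "__main__":
    # Hypothetical log events: (application name, POSIX timestamp).
    events = [("auth", 100), ("billing", 120), ("auth", 150), ("auth", 250)]
    keys = sorted(make_row_key(app, ts) for app, ts in events)
    for key in range_query(keys, "auth", 100, 200):
        print(key)
```

Because the application name leads the key, all rows for one app sit next to each other in key order, which is also why a single very chatty app would concentrate its load.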

Nicolas Santini


On Thu, Jan 27, 2011 at 10:26 AM, Bill Speirs <bill.spe...@gmail.com> wrote:

> I have a basic understanding of OPP... if most of my messages come
> within a single hour then a few nodes could be storing all of my
> values, right?
>
> You totally lost me on, "whether to shard data as per system..." Is my
> schema (one column family per system, and row keys as TimeUUIDType)
> sharding by system? I thought -- probably incorrectly -- that the row
> keys are used in the sharding process, not column families.
>
> Thanks...
>
> Bill-
>
> On Wed, Jan 26, 2011 at 4:17 PM, buddhasystem <potek...@bnl.gov> wrote:
> >
> > Having separate columns for Year, Month etc seems redundant. It's tons
> > more efficient to keep say UTC time in POSIX format (basically integer).
> > It's easy to convert back and forth.
> >
> > If you want to get a range of dates, in that case you might use Order
> > Preserving Partitioner, and sort out which systems logged later in
> > client. Read up on consequences of using OPP.
> >
> > Whether to shard data as per system depends on how many you have. If more
> > than a few, don't do that, there are memory considerations.
> >
> > Cheers
> >
> > Maxim
> >
> > --
> > View this message in context:
> > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Schema-Design-tp5964167p5964227.html
> > Sent from the cassandra-u...@incubator.apache.org mailing list archive
> > at Nabble.com.
> >
>
