Thanks Tyler!
On Thu, Feb 3, 2011 at 12:06 PM, Tyler Hobbs wrote:
> On Wed, Feb 2, 2011 at 3:27 PM, Aditya Narayan wrote:
>>
>> Can I have some more feedback about my schema, perhaps somewhat more
>> critical/harsh?
>
> It sounds reasonable to me.
>
> Since you're writing/reading all of the
On Wed, Feb 2, 2011 at 3:27 PM, Aditya Narayan wrote:
> Can I have some more feedback about my schema, perhaps somewhat more
> critical/harsh?
>
It sounds reasonable to me.
Since you're writing/reading all of the subcolumns at the same time, I would
opt for a standard column with the tags se
Can I have some more feedback about my schema, perhaps somewhat more
critical/harsh?
Thanks again,
Aditya Narayan
On Wed, Feb 2, 2011 at 10:27 PM, Aditya Narayan wrote:
> @Bill
> Thank you Bill!
>
> @Cassandra users
> Can others also leave their suggestions and comments about my schema, plea
@Bill
Thank you Bill!
@Cassandra users
Can others also leave their suggestions and comments about my schema, please.
Also my question about whether to use a superColumn or alternatively,
just store the data (that would otherwise be stored in subcolumns) as
serialized into a single column in standa
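
A minimal sketch of the "serialize into a single column" option raised
above, written in plain Python rather than against any Cassandra client.
The names pack_tags, unpack_tags and the sample row are hypothetical, and
JSON is only one possible encoding; the thread does not say which format
would be used.

```python
import json


def pack_tags(tags):
    """Serialize a reminder's tags into one string usable as a column value."""
    # Sorting gives a stable encoding regardless of input order.
    return json.dumps(sorted(tags), separators=(",", ":"))


def unpack_tags(value):
    """Decode a column value produced by pack_tags back into a list."""
    return json.loads(value)


# Hypothetical row: column name = reminder id, column value = packed tags.
row = {"reminder-001": pack_tags(["work", "urgent"])}
tags = unpack_tags(row["reminder-001"])
```

The trade-off is that the tags can only be read or rewritten as a whole,
which fits the case where all of them are written and read together.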
I did not understand before... sorry.
Again, depending upon how many reminders you have for a single user, this could
be a long/wide row. Again, it really comes down to how many reminders are we
talking about and how often will they be read/written. While a single row can
contain millions (may
Perhaps you misunderstood me.
I am already splitting the row on a per-user basis, of course; otherwise
the schema won't make sense for my usage. The row contains only
*reminders of a single user* sorted in chronological order. The
reminder IDs are stored as supercolumn names and the subcolumns contain tags
for t
Any time I see/hear "a single row containing all ..." I get nervous. That single
row is going to reside on a single node. That is potentially a lot of load
(don't know the system) for that single node. Why wouldn't you split it by at
least user? If it won't be a lot of load, then why are you usi
I think you got exactly what I wanted to convey, except for a few
things I want to clarify:
I was thinking of a single row containing all reminders (& not split
by day). The history of the reminders needs to be maintained for some time.
After a certain time (say 3 or 6 months) they may be deleted by ttl
To reiterate, so I know we're both on the same page, your schema would be
something like this:
- A column family (as you describe) to store the details of a reminder. One
reminder per row. The row key would be a TimeUUID.
- A super column family to store the reminders for each user, for each
Actually, I am trying to use Cassandra to display to the users of my
application the list of all Reminders they have set for
themselves in the application.
I need to store rows containing the timeline of daily Reminders put by
the users, for themselves, on application. The reminders need to be
p
r a host in a single row is not a good choice. 2
>> reason:
>> 1, too few keys, so your data will not be distributed well.
>> 2, data under a key will always increase, so Cassandra has to do more
>> SSTable compaction.
>>
>> -Original Message-
>> From: Wil
>
> -Original Message-
> From: William R Speirs [mailto:bill.spe...@gmail.com]
> Sent: January 27, 2011 9:15
> To: user@cassandra.apache.org
> Subject: Re: Schema Design
>
> It makes sense that the single row for a system (with a growing number of
> columns) will reside on a single mac
.@gmail.com]
Sent: January 27, 2011 9:15
To: user@cassandra.apache.org
Subject: Re: Schema Design
It makes sense that the single row for a system (with a growing number of
columns) will reside on a single machine.
With that in mind, here is my updated schema:
- A single column family for all the mess
Ah, sweet... thanks for the link!
Bill-
On 01/26/2011 08:20 PM, buddhasystem wrote:
Bill, it's all explained here:
http://wiki.apache.org/cassandra/MemtableThresholds#JVM_Heap_Size,the
Watch the number of CFs and the memtable sizes.
In my experience, this all matters.
Bill, it's all explained here:
http://wiki.apache.org/cassandra/MemtableThresholds#JVM_Heap_Size,the
Watch the number of CFs and the memtable sizes.
In my experience, this all matters.
--
View this message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Schema-Des
ich means you'll definitely not be able to
distribute your load very well.
From: Bill Speirs [bill.spe...@gmail.com]
Sent: Wednesday, January 26, 2011 1:23 PM
To: user@cassandra.apache.org
Subject: Re: Schema Design
I like this approach, but I have 2 q
very well.
From: Bill Speirs [bill.spe...@gmail.com]
Sent: Wednesday, January 26, 2011 1:23 PM
To: user@cassandra.apache.org
Subject: Re: Schema Design
I like this approach, but I have 2 questions:
1) what are the implications of continually adding columns t
I used the term "sharding" a bit frivolously. Sorry. It's just that splitting
semantically homogeneous data among CFs doesn't scale too well, as each CF
is allocated a piece of memory on the server.
One thing you can do is create one CF, then use the application name +
timestamp as the row key; with that you can do your range query using
OPP. Then store whatever you want in the row.
The problem would be if one app generates far more logs than the others.
Nicolas Santini
On Thu, Jan 27, 2011 at 1
My CLI knowledge sucks so far, so I'll leave that to others. I'm doing
most of my reading/writing through a Thrift client (Hector/Java based).
As for the implications, as of the latest version of Cassandra there is no
theoretical limit to the number of columns that a particular row can hold.
O
I have a basic understanding of OPP... if most of my messages come
within a single hour then a few nodes could be storing all of my
values, right?
You totally lost me on, "whether to shard data as per system..." Is my
schema (one column family per system, and row keys as TimeUUIDType)
sharding by
I like this approach, but I have 2 questions:
1) what are the implications of continually adding columns to a single
row? I'm unsure how Cassandra is able to grow. I realize you can have
a virtually infinite number of columns, but what are the implications
of growing the number of columns over time
Having separate columns for Year, Month etc. seems redundant. It's far more
efficient to keep, say, UTC time in POSIX format (basically an integer). It's
easy to convert back and forth.
If you want to get a range of dates, in that case you might use Order
Preserving Partitioner, and sort out which sys
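
To illustrate the point about POSIX time being easy to convert back and
forth, here is a small sketch using only the Python standard library. The
helper names to_posix and from_posix are made up for this example, and
whole-second resolution is an assumption; nothing here is specific to
Cassandra.

```python
from datetime import datetime, timezone


def to_posix(dt):
    """Convert a timezone-aware datetime to whole seconds since the epoch."""
    if dt.tzinfo is None:
        # Naive datetimes would be read as local time, so reject them.
        raise ValueError("expected a timezone-aware datetime")
    return int(dt.timestamp())


def from_posix(seconds):
    """Convert seconds since the epoch back to a UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


original = datetime(2011, 1, 27, 9, 15, tzinfo=timezone.utc)
stamp = to_posix(original)
restored = from_posix(stamp)
```

A single integer like this sorts chronologically, so it can stand in for
separate Year, Month and Day columns.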
I would say in that case you might want to try a single column family
where the key to the column is the system name.
Then, you could name your columns as the timestamp. Then when retrieving
information from the data store you can, in your slice request, specify
your start column as X and
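
The slice idea above (one row per system, columns named by timestamp, read
back by giving a start and an end column) can be sketched with a toy
in-memory model. This is not the Cassandra or Thrift API: the WideRow class
and its insert/slice methods are hypothetical, and inclusive bounds on both
ends are an assumption made for the example.

```python
import bisect


class WideRow:
    """Toy stand-in for one row whose column names are kept in sorted order."""

    def __init__(self):
        self._names = []    # sorted column names (timestamps here)
        self._values = {}   # column name -> column value

    def insert(self, name, value):
        # Only add the name to the sorted list the first time it is seen.
        if name not in self._values:
            bisect.insort(self._names, name)
        self._values[name] = value

    def slice(self, start, finish):
        """Return (name, value) pairs with start <= name <= finish, in order."""
        lo = bisect.bisect_left(self._names, start)
        hi = bisect.bisect_right(self._names, finish)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


row = WideRow()
for ts, msg in [(1296119700, "disk full"), (1296119400, "boot"), (1296120000, "reboot")]:
    row.insert(ts, msg)

window = row.slice(1296119400, 1296119700)
```

Because the column names stay sorted, a range read only has to locate the
two ends of the window rather than scan every column in the row.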