Considering:
- row size: large or not
- update-heavy or not (in Cassandra an update is effectively an insert)
- read-heavy or not
- overall read performance
If the row size is large, you may consider a separate table, user_detail, and
add an id column to all tables.
On the application side, merge/join by id.
But you pay a read price: a second query to user_detail.
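That vertical split can be sketched as follows (a sketch only; table and column names are illustrative, not from the thread):

```cql
-- Hot, frequently-read columns stay in the main table.
CREATE TABLE user (
    id    uuid PRIMARY KEY,
    name  text,
    email text
);

-- Large, rarely-read columns move to a detail table sharing the same id.
CREATE TABLE user_detail (
    id      uuid PRIMARY KEY,
    profile text,   -- the big blob that made rows large
    prefs   text
);

-- The application reads user first, and issues a second query to
-- user_detail only when the large columns are actually needed.
```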
How many rows on average per partition? Around 10K. We are just analyzing
output logs.
Let me get this straight: you are bifurcating your partitions on either email
or username, essentially potentially doubling the data, because you don't have
a way to manage a central system of record of users?
I would do this: (my opinion)
Migrate to a
Thanks Jeff for suggestions.
On Mon, Jul 10, 2017 at 9:50 PM Jeff Jirsa wrote:
> On 2017-07-10 07:13 (-0700), Siddharth Prakash Singh wrote:
> > I am planning to build a user activity timeline. Users on our system
> > generate different kinds of activity. For example - searching some product,
> > calling our sales team, marking favourite, etc.
> > Now I would like to generate a timeline
From: Ali Akhtar [mailto:ali.rac...@gmail.com]
Sent: Sunday, April 26, 2015 10:31 PM
To: user@cassandra.apache.org
Subject: Re: Data model suggestions

Thanks Peer. I like the approach you're suggesting.
Why do you recommend truncating the last active table rather than…
…truncating doesn't create automatic snapshots; see the auto_snapshot setting:
/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__auto_snapshot

From: Narendra Sharma [mailto:narendra.sha...@gmail.com]
Sent: Friday, April 24, 2015 6:53 AM
To: user@cassandra.apache.org
Subject: Re: Data model suggestions
I think one table, say record, should be good. The primary key is the record
id. This will ensure good distribution.
Just update the active attribute to true or false.
For range queries on active vs archive records, maintain 2 indexes or try a
secondary index.
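A minimal sketch of that single-table model (illustrative names; note the low-cardinality caveat for a boolean index discussed later in this digest):

```cql
-- One row per record; a uuid key gives good distribution.
CREATE TABLE record (
    record_id uuid PRIMARY KEY,
    active    boolean,
    payload   text
);

-- One option for querying by status. Beware: 'active' has only two
-- values, so this index partitions the data very coarsely.
CREATE INDEX record_active_idx ON record (active);
```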
On Apr 23, 2015 1:32 PM, "Ali Akhtar" wrote:
Good point about the range selects. I think they can be made to work with
limits, though. Or, since the active records will usually never be > 500k,
the ids may just be cached in memory.
Most of the time, during reads, the queries will just consist of select *
where primaryKey = someValue. One row…
Hi,
If your external API returns active records, that means I am guessing you
need to do a select * on the active table to figure out which records in
the table are no longer active.
You might be aware that range selects based on partition key will time out
in Cassandra. They can, however, be made to work with limits…
That's returned by the external API we're querying. We query them for
active records, if a previous active record isn't included in the results,
that means its time to archive that record.
On Thu, Apr 23, 2015 at 9:20 PM, Manoj Khangaonkar wrote:
Hi,
How do you determine if the record is no longer active? Is it a periodic
process that goes through every record and checks when the last update
happened?
regards
On Thu, Apr 23, 2015 at 8:09 AM, Ali Akhtar wrote:
> Hey all,
>
> We are working on moving a MySQL based application to Cassandra.
"…load balancing... Sequential writes can cause hot spots... Uneven load
balancing for multiple tables"
-- Jack Krupansky
From: Kevin Burton
Sent: Saturday, June 7, 2014 1:27 PM
To: user@cassandra.apache.org
Subject: Re: Data model for streaming a large table in real time.
I just checked the source and in 2.1.0 it's not deprecated.
So it *might* be *being* deprecated, but I haven't seen anything stating that.
You do not Need RAID0 for data. Let C* do striping over data disks.
And maybe CL ANY/ONE might be sufficient for your writes.
On 08.06.2014 at 06:15, Kevin Burton wrote:
We're using containers for other reasons, not just Cassandra.
Tightly constraining resources means we don't have to worry about Cassandra,
the JVM, or Linux doing something silly, using too many resources, and
taking down the whole box.
On Sat, Jun 7, 2014 at 8:25 PM, Colin Clark wrote:
>
You won't need containers - running one instance of Cassandra in that
configuration will hum along quite nicely and will make use of the cores
and memory.
I'd forget the raid anyway and just mount the disks separately (jbod)
--
Colin
320-221-9531
On Jun 7, 2014, at 10:02 PM, Kevin Burton wrote
Write Consistency Level + Read Consistency Level > Replication Factor
ensures your reads will be consistent, and having 3 nodes lets you
achieve redundancy in the event of node failure.
So writing with a CL of LOCAL_QUORUM and reading with a CL of LOCAL_QUORUM
(2+2 > 3) with a replication factor of 3 ensures consistent reads.
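In cqlsh terms, the quorum-on-both-sides setup described above looks like this (keyspace, datacenter, and table names are illustrative assumptions, not from the thread):

```cql
-- Assumed keyspace with RF=3 in a single datacenter.
CREATE KEYSPACE IF NOT EXISTS app
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};

CREATE TABLE IF NOT EXISTS app.events (
    id   uuid PRIMARY KEY,
    body text
);

-- In cqlsh: both writes and reads at LOCAL_QUORUM (2 of 3 replicas),
-- so 2 + 2 > 3 and every read overlaps the latest successful write.
CONSISTENCY LOCAL_QUORUM;
INSERT INTO app.events (id, body) VALUES (uuid(), 'hello');
SELECT * FROM app.events;
```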
This is a basic question, but having heard that advice before, I'm curious
about why the minimum recommended replication factor is three? Certainly
additional redundancy, and, I believe, a minimum threshold for paxos. Are there
other reasons?
On Jun 7, 2014 10:52 PM, Colin wrote:
Right now I'm just putting everything together as a proof of concept… so
just two cheap replicas for now. And it's at 1/1th of the load.
If we lose data it's ok :)
I think our config will be 2-3x 400GB SSDs in RAID0 , 3 replicas, 16 cores,
probably 48-64GB of RAM each box.
Just one datacenter.
To have any redundancy in the system, start with at least 3 nodes and a
replication factor of 3.
Try to have at least 8 cores, 32 gig ram, and separate disks for log and data.
Will you be replicating data across data centers?
--
Colin
320-221-9531
> On Jun 7, 2014, at 9:40 PM, Kevin Burton w
Oh.. To start with we're going to use from 2-10 nodes..
I think we're going to take the original strategy and just to use 100
buckets .. 0-99… then the timestamp under that.. I think it should be fine
and won't require an ordered partitioner. :)
Thanks!
On Sat, Jun 7, 2014 at 7:38 PM, Colin Clark wrote:
With 100 nodes, that ingestion rate is actually quite low and I don't think
you'd need another column in the partition key.
You seem to be set in your current direction. Let us know how it works out.
--
Colin
320-221-9531
On Jun 7, 2014, at 9:18 PM, Kevin Burton wrote:
What's 'source'? You mean like the URL?
If source is too random it's going to yield too many buckets.
Ingestion rates are fairly high but not insane. About 4M inserts per
hour, from 5-10GB…
On Sat, Jun 7, 2014 at 7:13 PM, Colin Clark wrote:
Not if you add another column to the partition key; source for example.
I would really try to stay away from the ordered partitioner if at all
possible.
What ingestion rates are you expecting, in size and speed?
--
Colin
320-221-9531
On Jun 7, 2014, at 9:05 PM, Kevin Burton wrote:
Thanks for the feedback on this btw.. .it's helpful. My notes below.
On Sat, Jun 7, 2014 at 5:14 PM, Colin Clark wrote:
> No, you're not-the partition key will get distributed across the cluster
> if you're using random or murmur.
>
Yes… I'm aware. But in practice this is how it will work…
I
No, you're not-the partition key will get distributed across the cluster if
you're using random or murmur. You could also ensure that by adding
another column, like source to ensure distribution. (Add the seconds to the
partition key, not the clustering columns)
I can almost guarantee that if you
well you could add milliseconds, at best you're still bottlenecking most of
your writes one one box.. maybe 2-3 if there are ones that are lagging.
Anyway.. I think using 100 buckets is probably fine..
Kevin
On Sat, Jun 7, 2014 at 2:45 PM, Colin wrote:
Then add seconds to the bucket. Also, the data will get cached - it's not going
to hit disk on every read.
Look at the key cache settings on the table. Also, in 2.1 you have even more
control over caching.
--
Colin
320-221-9531
> On Jun 7, 2014, at 4:30 PM, Kevin Burton wrote:
On Sat, Jun 7, 2014 at 1:34 PM, Colin wrote:
> Maybe it makes sense to describe what you're trying to accomplish in more
> detail.
Essentially, I'm appending writes of recent data from our crawler and
sending that data to our customers.
They need to sync up-to-date writes… we need to get th…
Maybe it makes sense to describe what you're trying to accomplish in more
detail.
A common bucketing approach is along the lines of year, month, day, hour,
minute, etc., and then use a timeuuid as a clustering column.
Depending upon the semantics of the transport protocol you plan on utilizing…
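A minimal sketch of that bucketing approach (table and column names are illustrative, and the day+hour granularity is one assumed choice among the ones listed above):

```cql
-- Time-bucketed events: the partition key is a coarse time bucket,
-- and a timeuuid clustering column orders rows within the bucket.
CREATE TABLE events_by_bucket (
    day     text,      -- e.g. '2014-06-07'
    hour    int,       -- 0-23
    ts      timeuuid,
    payload text,
    PRIMARY KEY ((day, hour), ts)
) WITH CLUSTERING ORDER BY (ts DESC);

-- Read the most recent events in one bucket:
-- SELECT * FROM events_by_bucket WHERE day = '2014-06-07' AND hour = 13 LIMIT 100;
```

Coarser buckets mean fewer partitions to query per time range; finer buckets spread write load across more nodes at a time.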
Another way around this is to have a separate table storing the number of
buckets.
This way if you have too few buckets, you can just increase them in the
future.
Of course, the older data will still have too few buckets :-(
On Sat, Jun 7, 2014 at 11:09 AM, Kevin Burton wrote:
On Sat, Jun 7, 2014 at 10:41 AM, Colin Clark wrote:
> It's an anti-pattern and there are better ways to do this.
>
>
Entirely possible :)
It would be nice to have a document with a bunch of common cassandra design
patterns.
I've been trying to track down a pattern for this and a lot of this is
It's an anti-pattern and there are better ways to do this.
I have implemented the paging algorithm you've described using wide rows
and bucketing. This approach is a more efficient utilization of
Cassandra's built in wholesome goodness.
Also, I wouldn't let any number of clients (huge) connect directly…
I just checked the source and in 2.1.0 it's not deprecated.
So it *might* be *being* deprecated but I haven't seen anything stating
that.
On Sat, Jun 7, 2014 at 8:03 AM, Colin wrote:
> I believe ByteOrderedPartitioner is being deprecated, and for good reason.
> I would look at what you could achieve by using wide rows and Murmur3Partitioner.
"One node would take all the load, followed by the next node" --> with
this design, you are not exploiting all the power of the cluster. If only
one node takes all the load at a time, what is the point having 20 or 10
nodes ?
You'd better off using limited wide row with bucketing to achieve this
I believe ByteOrderedPartitioner is being deprecated, and for good reason. I
would look at what you could achieve by using wide rows and Murmur3Partitioner.
--
Colin
320-221-9531
> On Jun 6, 2014, at 5:27 PM, Kevin Burton wrote:
>
> We have the requirement to have clients read from our tabl
Hi Duy:
The compound partition key seems perfect, but you say that pagination isn't
possible with it: why is that?
Regards,
James
On Sat, Mar 22, 2014 at 10:40 AM, DuyHai Doan wrote:
> Ben
>
>
> > When you say beware of the cardinality, do you think that the
> cardinality is too low in this
Ben
> When you say beware of the cardinality, do you think that the cardinality
is too low in this instance?
Secondary indexes in C* are distributed across all the nodes containing the
actual data, which somewhat helps avoid hot spots. However, since there
are only 2 values for your boolean flag…
Hey Duy Hai,
On Fri, Mar 21, 2014 at 7:34 PM, DuyHai Doan wrote:
> Your previous "select * from x where flag = true;" translate into:
>
> SELECT * FROM x WHERE id=... AND flag = true
>
> Of course, you'll need to provide the id in any case.
This is an interesting option, though this app needs
On Sat, Mar 22, 2014 at 3:32 AM, Ben Hood <0x6e6...@gmail.com> wrote:
> Also a very good point. The main query paths the app needs to support are:
>
> select * from x where flag=true and id = ? and timestamp >= ? and timestamp
> <= ?
> select * from x where flag=false and id = ? and timestamp >= ?
On Sat, Mar 22, 2014 at 1:31 AM, Laing, Michael
wrote:
> Whoops now there are only 2 partition keys! Not good if you have any
> reasonable number of rows...
Yes, this column family will have a large number of rows.
> I monitor partition sizes and shard enough to keep them reasonable in this
> so
Of course what you really want is this:

create table x (
    id text,
    timestamp timeuuid,
    flag boolean,
    // other fields
    primary key (flag, id, timestamp)
);
Whoops now there are only 2 partition keys! Not good if you have any
reasonable number of rows...
Faced with a situation like this (alt
Hello Ben
Try the following alternative with a composite partition key to encode the
dual states of the boolean:

create table x (
    id text,
    flag boolean,
    timestamp timeuuid,
    // other fields
    primary key ((id, flag), timestamp)
);

Your previous "select * from x where flag = true;" translates into:
SELECT * FROM x WHERE id = ... AND flag = true
Of course, you'll need to provide the id in any case.
I think there is not an extremely simple solution to your problem. You
will probably need to use multiple tables to get the view you need. One
keyed just by file UUID, which tracks some basic metadata about the file
including the last modified time. Another as a materialized view of the
most rece
-Original Message-
From: y2k...@gmail.com on behalf of Jimmy Lin
Sent: Thu 11-Jul-13 13:09
To: user@cassandra.apache.org
Subject: Re: data model question : finding out the n most recent changes items
what I mean is, I really just want the last modified date instead of series
of timestamp and still
what I mean is, I really just want the last modified date instead of series
of timestamp and still able to sort or order by it.
(maybe I should rephrase my question as how to sort or order by last
modified column in a row)
What you described sounds like the most appropriate:

CREATE TABLE user_file (
    user_id uuid,
    modified_date timestamp,
    file_id timeuuid,
    PRIMARY KEY (user_id, modified_date)
);

If you normally need more information about the file, then either store that as
additional…
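With that table, a "last n modified files" read is a single-partition slice, e.g. (a sketch; the uuid literal is illustrative):

```cql
-- Rows within a user's partition are clustered by modified_date, so the
-- n most recently modified entries come back in one ordered slice.
SELECT file_id, modified_date
FROM user_file
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
ORDER BY modified_date DESC
LIMIT 10;
```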
You can refer to the Data Modelling guide here:
http://clojurecassandra.info/articles/data_modelling.html
It includes several things you've mentioned (namely, range queries and
dynamic tables).
Also, it seems that it'd be useful for you to use indexes, and performing
filtering (for things related
We have built a similar system; you can read about our data model in CQL3
here:
http://www.slideshare.net/carlyeks/nyc-big-tech-day-2013
We are going to be presenting a similar talk next week at the cassandra
summit.
On Fri, Jun 7, 2013 at 12:34 PM, Davide Anastasia <
davide.anasta...@qualityc
…for you though.
Dean

From: aaron morton <aa...@thelastpickle.com>
Reply-To: user@cassandra.apache.org
Date: Friday, April 5, 2013 10:59 AM
To: user@cassandra.apache.org
> What's the recommendation on querying a data model like StartDate > "X" and
> counter > "Y"?
It's not possible.
If you are using secondary indexes, you have to have an equality clause in the
statement.
Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
On Wed, Mar 13, 2013 at 4:23 AM, Mohan L wrote:
> On Fri, Mar 8, 2013 at 9:42 PM, aaron morton wrote:
>> > 1). create a column family 'cfrawlog' which stores the raw log as received.
>> > The row key could be 'ddmmhh' (a new row is added each hour or less); each
>> > 'column name' is a uuid with 'value' being the raw log data. Since we are
>> > also going to use this log for forensics purposes, it will help us to hav…
A row key based on the hour will create hot spots for writes - for an entire
hour, all the writes will be going to the same node, i.e., the node where the
row resides. You need to come up with a row key that distributes writes evenly
across all your C* nodes, e.g., time concatenated with a sequence counter…
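One sketch of that idea in CQL3 terms (table name, shard count, and columns are illustrative assumptions):

```cql
-- Spread each hour across N shards so writes within an hour land on
-- several nodes instead of one.
CREATE TABLE rawlog (
    hour  text,     -- e.g. '2013030819' (the ddmmhh-style bucket)
    shard int,      -- 0..15, chosen round-robin or by hash at write time
    id    timeuuid, -- preserves arrival order within a shard
    log   text,
    PRIMARY KEY ((hour, shard), id)
);

-- Reading an hour means querying all shards for that hour and merging.
```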
There are many different patterns in noSQL with 90% being different than
an RDBMS. Check out this page for some things to get you thinking
http://buffalosw.com/wiki/Patterns-Page/
If you ever consider PlayOrm and you can figure out how to partition your
data (perhaps by month), you can do queries…
One possibility would be to use dynamic columns, with each column name being a
composite made from a timestamp, and the value of each containing serialized
json of the details. The host could be the key. Then you could slice the data
by column name.
Ken
Date: Tuesday, February 26, 2013 12:27 AM
To: user@cassandra.apache.org
Subject: Re: Data Model - Additional Column Families or one CF?
Greetings!
Thank you very much for sharing your insight and experience.
I am trying to migrate a normalized Schema -- 1 TB database. The data
is hierarchical...
child entities carry foreign keys to the parent entities. There are
several instances like
ShapeTable, Circle, Square, Rectangle etc...
Aaron,
Would 50 CFs be pushing it? According to
http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management,
"This has been tested to work across hundreds or even thousands of
ColumnFamilies."
What is the bottleneck, IO?
Thanks,
Javier
On Sun, Feb 24,
Thanks Aaron, this was a big help!
—
Sent from Mailbox for iPhone
On Thu, Feb 21, 2013 at 9:27 AM, aaron morton wrote:
If you have a limited / known number (say < 30) of types, I would create a CF
for each of them.
If the number of types is unknown or very large I would have one CF with the
row key you described.
Generally I avoid data models that require new CF's as the data grows.
Additionally having diffe
In the case without CQL3, where I would use composite columns, I see how
this sort of lines up with what CQL3 is doing.
I don't have the ability to use CQL3 as I am using pycassa for my client,
so that leaves me with CompositeColumns
Under composite columns, I would have 1 row, which would be sto
> I have heard it best to try and avoid the use of super columns for now.
Yup.
Your model makes sense. If you are creating the CF using the cassandra-cli you
will probably want to reverse order the column names see
http://thelastpickle.com/2011/10/03/Reverse-Comparators/
If you want to use CQ
Thinking a little more on your issue, you can also do that in playroom as
OneToMany is represented with a few columns in the owning table/entity
unlike JPA and RDBMS.
I.e.:
Student.java {
    List<Course> - these course primary keys are saved one per column in the
    student's row
}
Course.java {
    List<Student> - these…
}
Yes, this scenario can occur(even with quorum writes/reads as you are dealing
with different rows) as one write may be complete and the other not while
someone else is reading from the cluster. Generally though, you can do read
repair when you read it in ;). Ie. See if things are inconsistent
Date: Friday, September 14, 2012 3:00 AM
To: user@cassandra.apache.org
Subject: Re: Data Model
> Consider a course_students col family which gives a list of students for a
> course
I would use two CF's:
Course CF:
* Each row is one course
* Columns are the properties and values of the course
CourseEnrolements CF
* Each row is one course
* Column name is th
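In CQL3 terms, the two-CF layout described above might look like this (a sketch; column names are illustrative, and "CourseEnrolements" keeps the spelling used in the thread):

```cql
-- Static course properties: one row per course.
CREATE TABLE course (
    course_id text PRIMARY KEY,
    name      text,
    teacher   text
);

-- Enrolements as a wide partition: one partition per course,
-- one clustering row per enrolled student.
CREATE TABLE course_enrolements (
    course_id  text,
    student_id text,
    PRIMARY KEY (course_id, student_id)
);
```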
I'm fairly new to Cassandra myself, but had to solve a similar problem. If
ordering of the student number values is not important to you, you can
store them as UTF8 values (ASCII would work too, maybe a better choice?),
and the resulting columns would be sorted by the lexical ordering of the
numeric strings…
I just started learning Cassandra - any suggestions on where to start?
Thanks
Soumya
On Thu, Sep 13, 2012 at 10:54 AM, Roshni Rajagopal <
roshni_rajago...@hotmail.com> wrote:
> I want to learn how we can model a mix of static and dynamic columns in
> a family.
>
> Consider a course_students c
> Isn't Kafka too young for production use?
The best way to advance the project is to use it and contribute your experience
and time.
btw, checking out kafka is a great idea. There are people around having Fun
Times with Kafka in production
Cheers
-
Aaron Morton
Freelance Cassandra Consultant
Isn't Kafka too young for production use?
Clearly it would fit my needs much better, but I can't afford an early-stage
project that's not ready for production. Is it?
On 30 Apr 2012 at 14:28, samal wrote:
Hi Samal,
Thanks for the TTL feature - I wasn't aware of its existence.
A day's partitioning will be less wide than a month's partitioning (about 30
times less, give or take ;-) ).
Per day it should have something like 100,000 messages stored; most would be
retrieved, and so deleted, before the TTL f…
On Mon, Apr 30, 2012 at 4:25 PM, Morgan Segalis wrote:
Hi Aaron,
Thank you for your answer, I was beginning to think that my question would
never be answered ;-)
Actually, this is what I was going for, except one thing: instead of
partitioning rows per month, I thought about partitioning per day, so that
every day I launch the cleaning tool, and it…
Message Queue is often not a great use case for Cassandra. For information on
how to handle high delete workloads see
http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra
It hard to create a model without some idea of the data load, but I would
suggest you start with:
CF: Us
Thanks!
Better than mine, as it considered later additions of services!
Will update my code,
Thanks
Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com
Tel: +972 2 6409736
Mob: +972 54 8356490
Fax: +972 2 5612956
On Mon, Mar 12, 2012 at 11:
Alternate would be to add another row to your user CF specific for Facebook
ids. Column ID would be the Facebook identifier and value would be your
internal uuid.
Consider when you want to add another service like twitter. Will you then
add another CF per service or just another row specific now
In this case, where you know the query upfront, I add a custom secondary index
using another CF to support the query. It's a little easier here because the
data wont change.
UserLookupCF (using composite types for the key value)
row_key: e.g. "facebook:12345" or "twitter:12345"
col_name : e.g
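A CQL3-era sketch of that lookup-table idea (illustrative names; the original describes it as a column family with composite keys):

```cql
-- Maps an external identity (service + external id) to the internal uuid.
CREATE TABLE user_lookup (
    service     text,   -- e.g. 'facebook', 'twitter'
    external_id text,   -- e.g. '12345'
    user_id     uuid,
    PRIMARY KEY ((service, external_id))
);

-- Lookup:
-- SELECT user_id FROM user_lookup
-- WHERE service = 'facebook' AND external_id = '12345';
```

Adding a new service later means nothing more than writing rows with a new service value - no new table per service.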
Hi!
Thanks for the response.
From what I read, secondary indices are good only for columns with few
possible values. Is this a good fit for my case? I have a unique Facebook id
for every user.
Thanks
Either you do that or you could think about using a secondary index on the
fb user name in your primary cf.
See http://www.datastax.com/docs/1.0/ddl/indexes
Cheers
On 11.03.2012 at 09:51, Tamar Fraenkel wrote:
Hi!
I need some advise:
I have user CF, which has a UUID key which is my internal u
On Fri, Feb 24, 2012 at 10:46 AM, David Leimbach wrote:
On Thu, Feb 23, 2012 at 7:54 PM, Martin Arrowsmith <
arrowsmith.mar...@gmail.com> wrote:
> Hi Franc,
>
> Or, you can consider using composite columns. It is not recommended to use
> Super Columns anymore.
>
Yes, but why? Is it because composite columns effectively replace and
simplify similar mo
On Fri, Feb 24, 2012 at 2:54 PM, Martin Arrowsmith wrote:
> Or, you can consider using composite columns. It is not recommended to use
> Super Columns anymore.
On first read it would seem that there is a fair bit of overhead with compo…
On Fri, Feb 24, 2012 at 2:54 PM, Martin Arrowsmith wrote:
> Or, you can consider using composite columns. It is not recommended to use
> Super Columns anymore.
Thanks,
I'll look into composite columns.
Cheers
Hi Franc,
Or, you can consider using composite columns. It is not recommended to use
Super Columns anymore.
Best wishes,
Martin
On Thu, Feb 23, 2012 at 7:51 PM, Indranath Ghosh wrote:
> How about using a composite row key like the following:
>
> Entity.Day1.TypeA: {col1:val1, col2:val2, . . .
How about using a composite row key like the following:
Entity.Day1.TypeA: {col1:val1, col2:val2, . . . }
Entity.Day1.TypeB: {col1:val1, col2:val2, . . . }
.
.
Entity.DayN.TypeA: {col1:val1, col2:val2, . . . }
Entity.DayN.TypeB: {col1:val1, col2:val2, . . . }
It is better to avoid super columns.
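In CQL3 that composite-row-key pattern becomes a compound partition key (a sketch; table and column names are illustrative):

```cql
-- One partition per (entity, day, type) instead of a super column family:
-- Entity.Day1.TypeA, Entity.Day1.TypeB, ... each become a partition.
CREATE TABLE entity_data (
    entity text,
    day    text,  -- e.g. 'Day1'
    type   text,  -- e.g. 'TypeA'
    col    text,
    value  text,
    PRIMARY KEY ((entity, day, type), col)
);
```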
this is what i thought. thanks for clarifying.
On 2/2/2012 10:44 PM, aaron morton wrote:
Short answer is no. The slightly longer answer is nope.
All column names in a CF are compared using the same comparator. You
will need to create a new CF.
Cheers.
-
Aaron Morton
Freelance Cassandra Consultant