One problem: subcolumns inside a supercolumn are currently not indexed,
so reading any subcolumn means deserializing the whole supercolumn.
That will hurt whenever a small number (10-20) of subcolumns needs to be
retrieved from a supercolumn holding a large list of them, such as
MyPostsIdKeysList.
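
To make the cost concrete, here is a rough toy model in plain Python. It
is not Cassandra code and uses no Cassandra API; the names (read_unindexed,
read_indexed, the post ids) are made up for illustration. It only counts
how many entries get touched when 15 subcolumns are wanted out of 100,000,
with and without an index over the subcolumn names.

```python
import bisect

# Hypothetical data: 100,000 post ids, kept sorted by name.
names = ["post%06d" % i for i in range(100000)]
values = ["body-%d" % i for i in range(100000)]
wanted = ["post%06d" % i for i in range(500, 515)]  # 15 subcolumns


def read_unindexed(names, values, wanted):
    """Toy model of an unindexed read: every entry is walked
    (i.e. 'deserialized') before the wanted ones can be picked out."""
    wanted_set = set(wanted)
    found, touched = {}, 0
    for name, value in zip(names, values):
        touched += 1
        if name in wanted_set:
            found[name] = value
    return found, touched


def read_indexed(names, values, wanted):
    """Toy model of an indexed read: binary search on the sorted
    names, so only the matching entries are materialized."""
    found, touched = {}, 0
    for name in wanted:
        i = bisect.bisect_left(names, name)
        if i < len(names) and names[i] == name:
            found[name] = values[i]
            touched += 1
    return found, touched


found_a, touched_a = read_unindexed(names, values, wanted)
found_b, touched_b = read_indexed(names, values, wanted)
print(touched_a, touched_b)
```

Both reads return the same 15 entries; the difference is only in how much
data has to be gone through to get them, which is the part that grows with
the size of the list.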

On Fri, Jan 7, 2011 at 9:58 PM, Raj <rajkumar....@gmail.com> wrote:
> My question is in context of a social network schema design
>
> I am considering the following schema for storing the user data that
> is required when he logs in and is led to his homepage.
> (I aimed for a schema in which a single row read retrieves all the
> data needed to render that user's homepage.)
>
> UserSuperColumnFamily: {    // Column Family
>
> UserIDKey:
> {columns:            MyName, MyEmail, MyCity,...etc
>  supercolumns:    MyFollowersList, MyFollowiesList, MyPostsIdKeysList,
> MyInterestsList, MyAlbumsIdKeysList, MyVideoIdKeysList,
> RecentNotificationsForUserList,  MessagesReceivedList,
> MessagesSentList, AccountSettingsList, RecentSelfActivityList,
> UpdatesFromFollowiesList
> }
> }
>
> Thus the user's news feed would be generated from the supercolumn
> UpdatesFromFollowiesList. UpdatesFromFollowiesList would, of course,
> contain only the IDs of the posts and not the entire post data.
>
> Questions:
>
> 1.) What could be the problems with this design? Any improvements?
>
> 2.) Would frequent, heavy overwrites/row mutations (for example,
> when propagating post updates for the news feed from some user to
> all his followies), which leave a row spread across several
> SSTables, degrade read performance? Is it suitable to use the row
> cache (the row is very big, but all of its data is required for as
> long as the user is logged in)? If I do not use the cache, it may be
> very expensive to pull the row each time data is required for the
> given user, since the row would be in several SSTables. How can I
> improve read performance here?
>
> The actual data of the posts from the network would be retrieved by
> PostIdKey through subsequent read queries against the column family
> PostsSuperColumnFamily, which would look as follows:
>
> PostsSuperColumnFamily:{
>
> PostIdKey:
> {
> columns:            PostOwnerId, PostBody
> supercolumns:   TagsForPost {list of columns of all tags for the
> post}, PeopleWhoLikedThisPost {list of columns of UserIdKey of all the
> likers}
> }
> }
>
> Is this the best design to go with, or are there any issues to
> consider here? Thanks in anticipation of your valuable comments!
>
