Thanks Zach, nice idea! And what about looking at, maybe, some custom caching solutions, leaving aside Cassandra caching?
On Sun, Oct 30, 2011 at 2:00 AM, Zach Richardson <j.zach.richard...@gmail.com> wrote:
> Aditya,
>
> Depending on how often you have to write to the database, you could
> perform dual writes to two different column families: one that has
> summary + details in it, and one that only has the summary.
>
> This way you can get everything with one query, or the summary with
> one query; this should also help optimize your caching.
>
> The question here would of course be whether you have a read- or
> write-heavy workload. Since you seem to be concerned about the
> caching, it sounds like you have more of a read-heavy workload and
> wouldn't pay too heavily for the dual writes.
>
> Zach
>
> On Sat, Oct 29, 2011 at 2:21 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
> > On Sat, Oct 29, 2011 at 11:23 AM, Aditya Narayan <ady...@gmail.com> wrote:
> >> @Mohit:
> >> I have stated the example scenarios in my first post under this heading.
> >> Also, I have stated above why I want to split that data into two rows and,
> >> like Ikeda stated below, I too am trying to prevent the frequently accessed
> >> rows from being bloated with large data, and I want to prevent that data
> >> from entering the cache as well.
> >
> > I think you are missing the point. You don't get any benefit
> > (performance, access); you are already breaking it into 2 rows.
> >
> > Also, I don't know of any way you can selectively keep rows
> > or keys in the cache. Other than having some background job that keeps
> > the cache hot with those keys/rows, you only have one option: keeping
> > it in a different CF, since you are already breaking a row into 2 rows.
> >
> >>> Okay, so as most know, this practice is called a wide row - we use them
> >>> quite a lot. However, as your schema shows, it will cache (while being
> >>> active) all of the row in memory. One way we got around this issue was to
> >>> basically create some materialized views of any more common data, so we can
> >>> easily get to the minimum amount of information required without blowing too
> >>> much memory with the larger representations.
> >>
> >> Yes, exactly, this is the problem I am facing, but I want to keep both
> >> types of data (common + large/detailed) in a single CF so that it could
> >> serve 'two materialized views'.
> >>
> >>> My perspective is that indexing some of the higher levels of data would be
> >>> the way to go - Solr or Elasticsearch for distributed, or if you know you
> >>> only need it local, just use a caching solution like Ehcache.
> >>
> >> What do you mean exactly by "indexing some of the higher levels of data"?
> >>
> >> Thank you guys!
> >>
> >>> Anthony
> >>>
> >>> On 28/10/2011, at 21:42 PM, Aditya Narayan wrote:
> >>>
> >>> > I need to keep the data of some entities in a single CF, but split into
> >>> > two rows for each entity. One row contains overview information for the
> >>> > entity and another row contains detailed information about the entity. I
> >>> > want to keep both rows in a single CF so they may be retrieved in a
> >>> > single query when required together.
> >>> >
> >>> > Now the problem I am facing is that I want to cache only the first type
> >>> > of rows (i.e., the overview-containing rows) and keep the second type of
> >>> > rows (those containing large data) from getting into the cache.
> >>> >
> >>> > Is there a way I can manipulate such filtering of cache-entering rows
> >>> > from a single CF?
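For anyone following the thread, Zach's dual-write suggestion can be sketched in plain Python. This is only an illustration of the data-modeling idea, not real Cassandra client code: the two dicts stand in for the two column families, and every name (`write_entity`, `summary_cf`, etc.) is made up for the example.

```python
# Sketch of the dual-write pattern: every write goes to two "column
# families" so a summary can be read (and cached) without ever pulling
# the large detail payload. The dicts stand in for Cassandra CFs.

summary_cf = {}              # row key -> small summary columns only
summary_plus_detail_cf = {}  # row key -> summary + large detail columns

def write_entity(key, summary, detail):
    """Dual write: the summary CF stays small and cache-friendly, while
    the combined CF serves the 'everything in one query' case."""
    summary_cf[key] = dict(summary)
    summary_plus_detail_cf[key] = {**summary, **detail}

def read_summary(key):
    # Cheap path: never touches the large detail columns.
    return summary_cf[key]

def read_full(key):
    # Single read for summary + details together.
    return summary_plus_detail_cf[key]

write_entity("user:42",
             summary={"name": "Aditya", "title": "Overview"},
             detail={"body": "x" * 10_000})  # stand-in for a large blob

print(read_summary("user:42"))  # -> {'name': 'Aditya', 'title': 'Overview'}
print(len(read_full("user:42")["body"]))  # -> 10000
```

The trade-off is as Zach describes: every update costs two writes, but a read-heavy workload gets a summary row that never drags the detail data into the row cache.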