Hi Dave,

Glad to hear others are using it in this fashion!
Are you using Tyler's suggested strategy for user-profile data: one CF that stores the "timeline", with user ids as row keys and a TimeUUID column for each data-collection time, followed by some post-processing with Hadoop over each user's timeline to build a "Profile"?

Are you on 0.7 or 0.6.x?

Dave Viner

On Tue, Mar 1, 2011 at 1:31 AM, Dave Gardner <dave.gard...@visualdna.com> wrote:

> Dave
>
> Tyler's answer already covers CFs etc.
>
> We are using Cassandra to store user profile data for exactly the sort of
> use case you describe. We don't yet store _all_ the data in Cassandra;
> currently we are focusing on the stuff we need available for real-time
> access. We use Hadoop to analyse the profiles from within Cassandra.
>
> Dave
>
>
> On 23 February 2011 23:21, Dave Viner <davevi...@gmail.com> wrote:
>
>> Hi all,
>>
>> I'm wondering if anyone has used cassandra as a datastore for a
>> user-profile service. I'm thinking of applications like behavioral
>> targeting, where there are lots & lots of users (10s to 100s of millions),
>> and lots & lots of data about them intermixed in, say, weblogs (probably TBs
>> worth). The idea would be to use Cassandra as a datastore for distributed
>> parallel processing of the TBs of files (say on hadoop). Then the resulting
>> user-profiles would be query-able quickly.
>>
>> Anyone know of that sort of application of Cassandra? I'm trying to
>> puzzle out just what the column family might look like. Seems like a mix of
>> time-oriented information (user x visits site y at time z), location
>> information (user x appeared from ip x.y.z.a which is geo-location 31.20309,
>> 120.10923), and derived information (because user x visited site y 15 times
>> within a 10 day window, user x must be interested in buying a car).
>>
>> I don't have specifics as yet... just some general thoughts. But this
>> feels like a Cassandra type problem. (User profile can have lots of columns
>> per user, but the exact columns might differ from user to user... very
>> scalable, etc)
>>
>> Thanks
>> Dave Viner
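
For reference, the "timeline" layout asked about at the top of this message (one row per user id, one time-based UUID column per collected event, then a post-processing pass that builds a profile) could be sketched roughly as below. This is only an in-memory illustration in Python and makes no Cassandra or Hadoop calls; the names `timeline`, `record_event`, `events_in_time_order`, and `build_profile`, and the `site` field on each event, are hypothetical and not taken from the thread.

```python
import uuid
from collections import Counter, defaultdict

# Hypothetical in-memory stand-in for a single column family:
# row key = user id, column name = version-1 (time-based) UUID, value = event.
timeline = defaultdict(dict)


def record_event(user_id, event):
    """Store one event in the user's row under a fresh time-based UUID."""
    column = uuid.uuid1()
    timeline[user_id][column] = event
    return column


def events_in_time_order(user_id):
    """Return the user's events sorted by the timestamp embedded in each UUID."""
    row = timeline[user_id]
    return [row[col] for col in sorted(row, key=lambda c: c.time)]


def build_profile(user_id):
    """Toy post-processing step: count visits per site for one user."""
    visits = Counter(
        event["site"]
        for event in events_in_time_order(user_id)
        if "site" in event
    )
    return {"user_id": user_id, "site_visits": dict(visits)}
```

In a real deployment the per-user pass in `build_profile` would presumably be the part handed to a Hadoop job that iterates over rows, with the resulting profile written back for real-time reads.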