Hi all,

I'm wondering if anyone has used Cassandra as a datastore for a user-profile
service.  I'm thinking of applications like behavioral targeting, where
there are lots & lots of users (10s to 100s of millions), and lots & lots of
data about them intermixed in, say, weblogs (probably TBs worth).  The idea
would be to use Cassandra as a datastore for distributed parallel processing
of the TBs of files (say on Hadoop).  The resulting user profiles would then
be quickly queryable.
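
To make the processing step concrete, here is a rough sketch in plain Python
(not actual Hadoop or Cassandra code).  The tab-separated log format and the
names map_line and reduce_profiles are assumptions made up for illustration.

```python
from collections import Counter, defaultdict

def map_line(line):
    # Assumed (hypothetical) log format: user_id <TAB> site <TAB> ISO timestamp
    user_id, site, _timestamp = line.rstrip("\n").split("\t")
    return user_id, site

def reduce_profiles(lines):
    # Group by user and count visits per site; in a real job this grouping
    # would be the shuffle/reduce phase and the result would be written out
    # as one row per user.
    profiles = defaultdict(Counter)
    for line in lines:
        user_id, site = map_line(line)
        profiles[user_id][site] += 1
    return profiles
```

Each user's counter would end up as one row in the datastore, keyed by user
id, so a lookup by user stays a single-row read.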

Anyone know of that sort of application of Cassandra?  I'm trying to puzzle
out just what the column family might look like.  Seems like a mix of
time-oriented information (user x visits site y at time z), location
information (user x appeared from IP x.y.z.a, which is geo-location 31.20309,
120.10923), and derived information (because user x visited site y 15 times
within a 10 day window, user x must be interested in buying a car).
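
As a strawman for the layout, here is a minimal sketch that models one wide
row per user with plain Python dicts.  It is not Cassandra client code, and
the column naming (tuples such as ("visit", time, site)) and the function
names are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical stand-in for a column family: row key -> {column name: value}.
# Each user's row holds only the columns that user actually has.
profiles = defaultdict(dict)

def record_visit(user_id, site, when):
    # Time-oriented column: the timestamp is part of the column name.
    profiles[user_id][("visit", when, site)] = ""

def record_location(user_id, ip, lat, lon):
    # Location column: IP address mapped to its geo-location.
    profiles[user_id][("geo", ip)] = (lat, lon)

def derive_interest(user_id, site, label, min_visits=15, window_days=10):
    # Derived column: set it if min_visits visits to site fall within
    # any window of window_days.
    times = sorted(col[1] for col in profiles[user_id]
                   if col[0] == "visit" and col[2] == site)
    window = timedelta(days=window_days)
    for i in range(len(times) - min_visits + 1):
        if times[i + min_visits - 1] - times[i] <= window:
            profiles[user_id][("derived", label)] = True
            return True
    return False
```

For example, after 15 calls to record_visit("x", "y", ...) spread over a week,
derive_interest("x", "y", "car-buyer") would add a ("derived", "car-buyer")
column to user x's row only.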

I don't have specifics as yet... just some general thoughts.  But this feels
like a Cassandra-type problem.  (A user profile can have lots of columns per
user, but the exact columns might differ from user to user... very scalable,
etc)

Thanks
Dave Viner
