Thanks Aaron & Shaun,

******************************
I think my question may have been unclear, so let me restate the problem (and the solution I have in mind) for the sake of clarity:

I have 2 rows. The 1st row contains 60-70 columns and the 2nd row contains hundreds of thousands of columns. The columns in both rows are all valueless. I just need to find the **common column names** in the two rows. **Both rows are known to me.** So my plan is to pick up all the **column names** of the 1st row (60-70 columns) and ask for them in the 2nd row; whatever column names I get back are my result. Would there be any problem with this solution? This is how I expect to get the common column names.

Please do not treat this as a JOIN case, as that leads to unnecessary confusion. I just need the common column names from the valueless columns in the two rows.
********************************

Aaron, the intersection data is very much context-based. If there are 10 million rows in CF A and 1 million in CF B, the intersection data would contain 10 million * 1 million rows. That would involve a huge and unaffordable amount of denormalization. And finding the common columns in the client would mean pulling unnecessary data, such as pulling 100,000 columns from a row when only 60-70 are required.

Shaun, I hope the clarification above makes things clearer. Yes, the rows whose common columns I need to find are known to me.

Thank you all,
Asil

On Mon, Feb 7, 2011 at 3:53 AM, Shaun Cutts <sh...@cuttshome.net> wrote:
> In theory, you should be able to do joins by creating an extra column in one
> column family, holding the "foreign key" of the matching row in the other
> family.
>
> This assumes that the info you are joining on is available in both CFs (is
> not some sort of functional transformation).
>
> I have just found that the implementation for secondary indexes is not yet
> very close to optimal for more complex "joins" involving multiple indexes,
> I'm not sure if that affects you as you didn't say what you are joining on.
>
> -- Shaun
>
>
> On Feb 6, 2011, at 4:22 PM, Aaron Morton wrote:
>
>> Is it possible for you to dernormalise and write all the intersection
>> values? Will depend on how many I guess.
>>
>> The other alternative is to pull back more data that you need and the
>> intersection in code in the client.
>>
>>
>> Hope that helps.
>> Aaron
>> On 7/02/2011, at 7:11 AM, Aklin_81 <asdk...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> @buddhasystem : yes that's well known solution. But obviously when
>>> mysql couldnt satisfy my needs, I am here. My question is in context
>>> of Cassandra, if it possible to achieve intersection result set of
>>> columns in two rows, by the way I spoke about.
>>>
>>> @Edward: yes that I know but how does that fit here for obtaining the
>>> common columns among two rows.
>>>
>>> Thanks for your comments..
>>>
>>> -Asil
>>>
>>>
>>> On Sun, Feb 6, 2011 at 9:55 PM, Edward Capriolo <edlinuxg...@gmail.com>
>>> wrote:
>>>> On Sun, Feb 6, 2011 at 10:15 AM, buddhasystem <potek...@bnl.gov> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> If the amount of data is _that_ small, you'll have a much easier life with
>>>>> MySQL, which supports the "join" procedure -- because that's exactly what
>>>>> you want to achieve.
>>>>>
>>>>>
>>>>> asil klin wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I want to procure the intersection of columns set of two rows (from 2
>>>>>> different column families).
>>>>>>
>>>>>> To achieve the intersection results, Can I, first retrieve all
>>>>>> columns(around 300) from first row and just query by those column
>>>>>> names in the second row(which contains maximum 100 000 columns) ?
>>>>>>
>>>>>> I am using the results during the write time & not before presentation
>>>>>> to the user, so latency wont be much concern while writing.
>>>>>>
>>>>>> Is it the proper way to procure intersection results of two rows ?
>>>>>>
>>>>>> Would love to hear your comments..
>>>>>>
>>>>>>
>>>>>> ---------
>>>>>>
>>>>>> Regards,
>>>>>> Asil
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> View this message in context:
>>>>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Finding-the-intersection-results-of-column-sets-of-two-rows-tp5997248p5997743.html
>>>>> Sent from the cassandra-u...@incubator.apache.org mailing list archive at
>>>>> Nabble.com.
>>>>>
>>>>
>>>> You can use multi-get when fetching lists of already know keys
>>>> optimize your round rip time.
>>>>
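
P.S. For anyone following the thread, the plan described at the top of this message (take the column names of the small row and ask the wide row for exactly those names) can be sketched roughly as below. This is only an illustration under my own assumptions, not real Cassandra client code: plain Python dicts stand in for the two rows, and fetch_by_names, common_column_names and batch_size are hypothetical names standing in for a by-name column read against the wide row.

```python
# Illustrative sketch only: plain dicts stand in for the two rows.
# fetch_by_names is a hypothetical stand-in for a read that asks one row
# for an explicit list of column names and returns only those that exist.

def fetch_by_names(row, names):
    """Return the requested column names that are present in the row."""
    return [name for name in names if name in row]


def common_column_names(small_row, wide_row, batch_size=100):
    """Ask the wide row for the small row's column names, batch by batch."""
    names = list(small_row)
    common = []
    for start in range(0, len(names), batch_size):
        batch = names[start:start + batch_size]
        common.extend(fetch_by_names(wide_row, batch))
    return common


# Valueless columns: only the names matter, so the values are empty strings.
row_a = {"col%d" % i: "" for i in range(0, 70)}        # 70 columns
row_b = {"col%d" % i: "" for i in range(50, 100050)}   # 100,000 columns

result = common_column_names(row_a, row_b)
print(len(result))  # 20 (col50 .. col69)
```

The point of the sketch is that the amount of data requested depends on the size of the small row (60-70 names), not on the size of the wide row. With a real client the by-name read would be a network call, so the batch size would need tuning.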