Thanks Aaron & Shaun,

******************************
I think my question may have been unclear to some of you, so let me
restate my problem (and the solution I have in mind) for the sake
of clarity:

Consider two rows. The 1st row contains 60-70 columns and the 2nd row
contains hundreds of thousands of columns. The columns in both rows
are all valueless. I just need to find the **common column names**
in the two rows. **These two rows are known to me**. So my plan is to
pick up all the **column names** of the 1st row (60-70 columns) and
ask for exactly those names in the 2nd row; whatever column names I get
back are my result.
Would there be any problem with this solution? This is how I am
expecting to get the common column names.

Please do not treat this as a JOIN case, as that leads to unnecessary
confusion. I just need the common column names from the valueless
columns in the two rows.
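
To make the plan concrete, here is a minimal in-memory sketch in Python. It
is only an illustration under stated assumptions: FakeColumnFamily and its
methods get_all_names / get_by_names are hypothetical stand-ins for a real
client, not a Cassandra API. The counter shows how many columns the 2nd row
would have to return.

```python
# Hypothetical in-memory stand-in for a column family; not a real client API.
class FakeColumnFamily:
    def __init__(self, rows):
        # rows: dict mapping row key -> set of (valueless) column names
        self.rows = rows
        self.columns_returned = 0  # counts columns sent back to the caller

    def get_all_names(self, key):
        """Return every column name in the row (fine for the small row)."""
        names = sorted(self.rows[key])
        self.columns_returned += len(names)
        return names

    def get_by_names(self, key, names):
        """Simulate a by-name read: only requested names that exist come back."""
        row = self.rows[key]
        found = [n for n in names if n in row]
        self.columns_returned += len(found)
        return found


def common_column_names(cf_a, key_a, cf_b, key_b):
    # Step 1: read all column names of the small row.
    small_row_names = cf_a.get_all_names(key_a)
    # Step 2: ask the large row for exactly those names.
    return set(cf_b.get_by_names(key_b, small_row_names))


# Small row: col0..col69; large row: col50..col100049 (100,000 columns).
cf_a = FakeColumnFamily({"row1": {f"col{i}" for i in range(0, 70)}})
cf_b = FakeColumnFamily({"row2": {f"col{i}" for i in range(50, 100_050)}})

common = common_column_names(cf_a, "row1", cf_b, "row2")
print(len(common))            # 20 common names: col50..col69
print(cf_b.columns_returned)  # 20, not 100,000
```

In this sketch the large row only ever returns the matching names, which is
the saving I am hoping for compared with pulling the whole row to the client.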

********************************

Aaron, the intersection data is very much context-based. If there are
10 million rows in CF A and 1 million in CF B, the intersection data
would contain 10 million * 1 million rows. That would involve a huge
and unaffordable amount of denormalization.
And finding the common columns in the client would require pulling
unnecessary columns, e.g. pulling 100,000 columns from a row of which
only 60-70 are required.

Shaun, I hope the clarification above makes things clearer. Yes,
the rows whose common columns I need to find are known to me.


Thank you all,
Asil


On Mon, Feb 7, 2011 at 3:53 AM, Shaun Cutts <sh...@cuttshome.net> wrote:
> In theory, you should be able to do joins by creating an extra column in one 
> column family, holding the "foreign key" of the matching row in the other 
> family.
>
> This assumes that the info you are joining on is available in both CFs (is 
> not some sort of functional transformation).
>
> I have just found that the implementation for secondary indexes is not yet 
> very close to optimal for more complex "joins" involving multiple indexes, 
> I'm not sure if that affects you as you didn't say what you are joining on.
>
> -- Shaun
>
>
> On Feb 6, 2011, at 4:22 PM, Aaron Morton wrote:
>
>> Is it possible for you to denormalise and write all the intersection 
>> values? Will depend on how many I guess.
>>
>> The other alternative is to pull back more data than you need and do the 
>> intersection in code in the client.
>>
>>
>> Hope that helps.
>> Aaron
>> On 7/02/2011, at 7:11 AM, Aklin_81 <asdk...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> @buddhasystem : yes, that's a well-known solution. But obviously I am
>>> here because MySQL couldn't satisfy my needs. My question is in the
>>> context of Cassandra: is it possible to obtain the intersection of the
>>> columns in two rows in the way I described?
>>>
>>> @Edward: yes, I know that, but how does it fit here for obtaining the
>>> common columns of two rows?
>>>
>>> Thanks for your comments..
>>>
>>> -Asil
>>>
>>>
>>> On Sun, Feb 6, 2011 at 9:55 PM, Edward Capriolo <edlinuxg...@gmail.com> 
>>> wrote:
>>>> On Sun, Feb 6, 2011 at 10:15 AM, buddhasystem <potek...@bnl.gov> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> If the amount of data is _that_ small, you'll have a much easier life with
>>>>> MySQL, which supports the "join" procedure -- because that's exactly what
>>>>> you want to achieve.
>>>>>
>>>>>
>>>>> asil klin wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I want to obtain the intersection of the column sets of two rows (from 2
>>>>>> different column families).
>>>>>>
>>>>>> To get the intersection, can I first retrieve all the
>>>>>> columns (around 300) from the first row and then query for those column
>>>>>> names in the second row (which contains at most 100,000 columns)?
>>>>>>
>>>>>> I am using the results at write time, not before presentation
>>>>>> to the user, so latency won't be much of a concern while writing.
>>>>>>
>>>>>> Is this the proper way to obtain the intersection of two rows?
>>>>>>
>>>>>> Would love to hear your comments..
>>>>>>
>>>>>>
>>>>>> ---------
>>>>>>
>>>>>> Regards,
>>>>>> Asil
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> View this message in context: 
>>>>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Finding-the-intersection-results-of-column-sets-of-two-rows-tp5997248p5997743.html
>>>>> Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
>>>>> Nabble.com.
>>>>>
>>>>
>>>> You can use multi-get when fetching lists of already known keys to
>>>> optimize your round trip time.
>>>>
>
>
