It depends on your use case.
If the columns in table1 and table2 are large, model 2 might be faster, since 
each query reads less data and the two queries can run in parallel. On the 
other hand, if you always need the rows from both table1 and table2, running 
two queries adds some overhead in memory, CPU, and so on.
The answer really depends on how much data you write to each table, how 
frequently you write it (will partitions be spread over SSTables differently 
for the two tables?), and how you want to retrieve it.
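
As a rough illustration of that trade-off, the sketch below estimates the 
payload per partition for both models. It is not a measurement: the column 
sizes, key size and row count are made-up placeholders, col45 in Table2 is 
read as col5, and Cassandra's own per-cell and per-partition overhead, 
compression and compaction are all ignored.

```python
# Hypothetical average value sizes in bytes; replace with figures from your data.
COLUMN_BYTES = {"col1": 200, "col2": 50, "col3": 100, "col4": 400, "col5": 1000}
KEY_BYTES = 16  # assumed size of the clustering key stored with each row

# Column layout of each model (col45 in the question is assumed to mean col5).
MODEL1 = {"combined": ["col1", "col2", "col3", "col4", "col5"]}
MODEL2 = {"table1": ["col1", "col2", "col3"], "table2": ["col3", "col4", "col5"]}


def row_bytes(columns):
    """Rough payload of one row: clustering key plus the listed column values."""
    return KEY_BYTES + sum(COLUMN_BYTES[c] for c in columns)


def partition_bytes(model, rows_per_partition):
    """Rough payload of one partition, per table of the given model."""
    return {
        table: row_bytes(columns) * rows_per_partition
        for table, columns in model.items()
    }


if __name__ == "__main__":
    rows = 1000  # hypothetical number of rows per partition
    m1 = partition_bytes(MODEL1, rows)
    m2 = partition_bytes(MODEL2, rows)
    print("Model 1:", m1)
    print("Model 2:", m2, "total:", sum(m2.values()))
```

With these placeholder numbers each table in model 2 holds a smaller partition 
than the combined table, but the two together store more, because the keys and 
col3 are written twice.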

The only way to know for sure is to run benchmark tests with data that is 
representative of your use case.
NoSQLBench <https://docs.nosqlbench.io/> might be worth looking into. If you 
are not familiar with it, it may take some time to work out how to set up 
representative tests and get meaningful results.
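
For a quick first look before setting up NoSQLBench, a small timing harness 
can compare one combined read with two parallel reads. This is only a sketch 
and not a substitute for a real benchmark: read_combined, read_table1 and 
read_table2 are hypothetical placeholders that return canned rows, so the 
timings mean nothing until they are replaced with real driver calls against 
representative data.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def read_combined(pk, ck):
    """Placeholder for one query against the model 1 table."""
    return {"col1": 1, "col2": 2, "col3": 3, "col4": 4, "col5": 5}


def read_table1(pk, ck):
    """Placeholder for the query against table1 in model 2."""
    return {"col1": 1, "col2": 2, "col3": 3}


def read_table2(pk, ck):
    """Placeholder for the query against table2 in model 2."""
    return {"col3": 3, "col4": 4, "col5": 5}


def read_both_parallel(pool, pk, ck):
    """Issue both model 2 queries at once and merge the two rows."""
    f1 = pool.submit(read_table1, pk, ck)
    f2 = pool.submit(read_table2, pk, ck)
    return {**f1.result(), **f2.result()}


def median_latency(fn, runs=100):
    """Median wall-clock time in seconds of calling fn() `runs` times."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return median(samples)


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        one = median_latency(lambda: read_combined("pk", "ck"))
        two = median_latency(lambda: read_both_parallel(pool, "pk", "ck"))
    print(f"Model 1, single query:         {one:.6f} s")
    print(f"Model 2, two parallel queries: {two:.6f} s")
```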

Kind regards,

Michiel Saelen | Principal Solution Architect
Email michiel.sae...@skyline.be

Skyline Communications
39 Hong Kong Street #02-01 | Singapore 059678
www.skyline.be | +65 6920 1145


From: MyWorld <timeplus.1...@gmail.com>
Sent: Thursday, June 23, 2022 09:38
To: user@cassandra.apache.org
Subject: Query around Data Modelling

Hi all,

Just a small query about data modelling.
Suppose we have to design the data model for two different use cases that 
query the data on the same set of keys (partition + clustering key). Should we 
maintain a separate table for each, or a single table?

Model1 - Combined table
Table(Pk,CK, col1,col2, col3, col4,col5)

Model2 - Separate tables
Table1(Pk,CK,col1,col2,col3)
Table2(Pk,CK,col3,col4,col45)

Here the partition and clustering keys are the same. Also note that column 
col3 is required in both use cases.

My thinking is that in Model2 the partition size would be smaller. There would 
be fewer SSTables, and with leveled compaction they could be maintained 
easily, so read performance should be better.

Please help me understand the drawbacks and advantages of each data model. We 
have a mixed workload (read/write).
