Re: Query around Data Modelling -2

2022-07-01 Thread Bowen Song via user
*From:* Bowen Song *Sent:* Friday, July 1, 2022 08:48 *To:* user@cassandra.apache.org *Subject:* Re: Query around Data Modelling -2

Re: Query around Data Modelling -2

2022-07-01 Thread MyWorld

RE: Query around Data Modelling -2

2022-06-30 Thread Michiel Saelen
From: Bowen Song Sent: Friday, July 1, 2022 08:48 To: user@cassandra.apache.org Subject: Re: Query around Data Modelling -2

Re: Query around Data Modelling -2

2022-06-30 Thread Bowen Song
And why do you do that? On 30/06/2022 16:35, MyWorld wrote: We run major compaction once a week. On Thu, Jun 30, 2022, 8:14 PM Bowen Song wrote: I have noticed this "running a weekly repair and compaction job". What do you mean by weekly compaction job? Have you disabled the auto-

Re: Query around Data Modelling -2

2022-06-30 Thread MyWorld
We run major compaction once a week. On Thu, Jun 30, 2022, 8:14 PM Bowen Song wrote: > I have noticed this "running a weekly repair and compaction job". > > What do you mean by weekly compaction job? Have you disabled the > auto-compaction on the table and are relying on weekly scheduled > compact

Re: Query around Data Modelling -2

2022-06-30 Thread Bowen Song
I have noticed this "running a weekly repair and compaction job". What do you mean by weekly compaction job? Have you disabled the auto-compaction on the table and are relying on weekly scheduled compactions? Or are you running weekly major compactions? Neither of these sounds right. On 30/06/2022 15:03

Re: Query around Data Modelling -2

2022-06-30 Thread MyWorld
Hi Jeff, We are running repair with the -pr option. You are right, it would have no or very minimal impact on reads (considering that data now has to be read from 2 levels instead of 3). But my guess is that there is no negative impact from model 2. On Thu, Jun 30, 2022, 7:41 PM Jeff Jirsa wrote: > Ho

Re: Query around Data Modelling -2

2022-06-30 Thread Jeff Jirsa
How are you running repair? -pr? Or -st/-et? 4.0 gives you real incremental repair which helps. Splitting the table won’t make reads faster. It will increase the potential parallelization of compaction. > On Jun 30, 2022, at 7:04 AM, MyWorld wrote: > >  > Hi all, > > Another query around d

Re: Query around Data Modelling

2022-06-22 Thread MyWorld
Thanks a lot Jeff, Michiel and Manish for your replies. Really helpful. On Thu, Jun 23, 2022, 9:50 AM Jeff Jirsa wrote: > This is assuming each row is like … I dunno 10-1000 bytes. If you’re > storing like a huge 1mb blob use two tables for sure. > > On Jun 22, 2022, at 9:06 PM, Jeff Jirsa wrot

Re: Query around Data Modelling

2022-06-22 Thread Jeff Jirsa
This is assuming each row is like … I dunno 10-1000 bytes. If you’re storing like a huge 1mb blob use two tables for sure. > On Jun 22, 2022, at 9:06 PM, Jeff Jirsa wrote: > >  > > Ok so here’s how I would think about this > > The writes don’t matter. (There’s a tiny tiny bit of nuance in

Re: Query around Data Modelling

2022-06-22 Thread Jeff Jirsa
Ok so here’s how I would think about this The writes don’t matter. (There’s a tiny tiny bit of nuance in one table where you can contend adding to the memtable but the best cassandra engineers on earth probably won’t notice that unless you have really super hot partitions, so ignore the write

Re: Query around Data Modelling

2022-06-22 Thread MyWorld
Hi Jeff, Let me know how the number of rows has an impact here. Maybe today I have 80-100 rows per partition. But what if I start storing 2-4k rows per partition? However, the total partition size is still under 100 MB. On Thu, Jun 23, 2022, 7:18 AM Jeff Jirsa wrote: > How many rows per partition in each

RE: Query around Data Modelling

2022-06-22 Thread Michiel Saelen
I guess it will depend on your use case. If your columns for table1 and table2 are significant in size it might be the case that model 2 is faster and you could perform queries in parallel, but … If you always need to retrieve both the row from table1 and table2, then both queries together might

Re: Query around Data Modelling

2022-06-22 Thread manish khandelwal
Table1 should be fine. If some column values are not entered, then Cassandra will not create an entry for them, so the partition will be almost the same in both cases. On Thu, Jun 23, 2022, 07:08 MyWorld wrote: > Hi all, > > Just a small query around data Modelling. > Suppose we have to design the data model f

Re: Query around Data Modelling

2022-06-22 Thread Jeff Jirsa
How many rows per partition in each model? > On Jun 22, 2022, at 6:38 PM, MyWorld wrote: > > > Hi all, > > Just a small query around data Modelling. > Suppose we have to design the data model for 2 different use cases which will > query the data on the same set of (partition+clustering key). So s