1. I bet you'd get better performance using RANGE partitioning. 2. Twenty million rows per userid is a *LOT*. No subdivisions (like date range)?
On Mon, Dec 23, 2024 at 1:23 PM Divyansh Gupta JNsThMAudy < ag1567...@gmail.com> wrote: > Adrian, Please check this out; > > PARTITION BY HASH (userid); CREATE TABLE dbo.googledocs_tbl_clone_part_0 > PARTITION OF dbo.googledocs_tbl_clone FOR VALUES WITH (modulus 84, > remainder 0); ... CREATE TABLE dbo.googledocs_tbl_clone_part_83 PARTITION > OF dbo.googledocs_tbl_clone FOR VALUES WITH (modulus 84, remainder 83); > > On Mon, Dec 23, 2024 at 11:48 PM Divyansh Gupta JNsThMAudy < > ag1567...@gmail.com> wrote: > >> Adrian, the partition is on userid using hash partition with 84 partitions >> >> Ron, there could be more than 20 Million records possible for a single >> userid in that case if I create index on userid only not on other column >> the query is taking more than 30 seconds to return the results. >> >> On Mon, 23 Dec 2024, 11:40 pm Ron Johnson, <ronljohnso...@gmail.com> >> wrote: >> >>> If your queries all reference userid, then you only need indices on gdid >>> and userid. >>> >>> On Mon, Dec 23, 2024 at 12:49 PM Divyansh Gupta JNsThMAudy < >>> ag1567...@gmail.com> wrote: >>> >>>> I have one confusion with this design if I opt to create 50 columns I >>>> need to create 50 index which will work with userid index in Bitmap on the >>>> other hand if I create a JSONB column I need to create a single index ? >>>> >>>> On Mon, 23 Dec 2024, 11:10 pm Ron Johnson, <ronljohnso...@gmail.com> >>>> wrote: >>>> >>>>> Given what you just wrote, I'd stick with 50 separate t* columns. >>>>> Simplifies queries, simplifies updates, and eliminates JSONB conversions. >>>>> >>>>> On Mon, Dec 23, 2024 at 12:29 PM Divyansh Gupta JNsThMAudy < >>>>> ag1567...@gmail.com> wrote: >>>>> >>>>>> Values can be updated based on customer actions >>>>>> >>>>>> All rows won't have all 50 key value pairs always if I make those >>>>>> keys into columns the rows might have null value on the other hand if it >>>>>> is >>>>>> JSONB then the key value pair will not be there >>>>>> >>>>>> Yes in UI customers can search for the key value pairs >>>>>> >>>>>> During data population the key value pair will be empty array in case >>>>>> of JSONB column or NULL in case of table columns, later when customer >>>>>> performs some actions that time the key value pairs will populate and >>>>>> update, based on what action customer performs. >>>>>> >>>>>> On Mon, 23 Dec 2024, 10:51 pm Divyansh Gupta JNsThMAudy, < >>>>>> ag1567...@gmail.com> wrote: >>>>>> >>>>>>> Let's make it more understandable, here is the table schema with 50 >>>>>>> columns in it >>>>>>> >>>>>>> CREATE TABLE dbo.googledocs_tbl ( >>>>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE >>>>>>> 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL, >>>>>>> userid int8 NOT NULL, >>>>>>> t1 int4 NULL, >>>>>>> t2 int4 NULL, >>>>>>> t3 int4 NULL, >>>>>>> t4 int4 NULL, >>>>>>> t5 int4 NULL, >>>>>>> t6 int4 NULL, >>>>>>> t7 int4 NULL, >>>>>>> t8 int4 NULL, >>>>>>> t9 int4 NULL, >>>>>>> t10 int4 NULL, >>>>>>> t11 int4 NULL, >>>>>>> t12 int4 NULL, >>>>>>> t13 int4 NULL, >>>>>>> t14 int4 NULL, >>>>>>> t15 int4 NULL, >>>>>>> t16 int4 NULL, >>>>>>> t17 int4 NULL, >>>>>>> t18 int4 NULL, >>>>>>> t19 int4 NULL, >>>>>>> t20 int4 NULL, >>>>>>> t21 int4 NULL, >>>>>>> t22 int4 NULL, >>>>>>> t23 int4 NULL, >>>>>>> t24 int4 NULL, >>>>>>> t25 int4 NULL, >>>>>>> t26 int4 NULL, >>>>>>> t27 int4 NULL, >>>>>>> t28 int4 NULL, >>>>>>> t29 int4 NULL, >>>>>>> t30 int4 NULL, >>>>>>> t31 int4 NULL, >>>>>>> t32 int4 NULL, >>>>>>> t33 int4 NULL, >>>>>>> t34 int4 NULL, >>>>>>> t35 int4 NULL, >>>>>>> t36 int4 NULL, >>>>>>> t37 int4 NULL, >>>>>>> t38 int4 NULL, >>>>>>> t39 int4 NULL, >>>>>>> t40 int4 NULL, >>>>>>> t41 int4 NULL, >>>>>>> t42 int4 NULL, >>>>>>> t43 int4 NULL, >>>>>>> t44 int4 NULL, >>>>>>> t45 int4 NULL, >>>>>>> t46 int4 NULL, >>>>>>> t47 int4 NULL, >>>>>>> t48 int4 NULL, >>>>>>> t49 int4 NULL, >>>>>>> t50 int4 NULL, >>>>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid), >>>>>>> ); >>>>>>> >>>>>>> Every time when i query I will query it along with userid >>>>>>> Ex : where userid = 12345678 and t1 in (1,2,3) and t2 in (0,1,2) >>>>>>> more key filters if customer applies >>>>>>> >>>>>>> On the other hand if I create a single jsonb column the schema will >>>>>>> look like : >>>>>>> >>>>>>> CREATE TABLE dbo.googledocs_tbl ( >>>>>>> gdid int8 GENERATED BY DEFAULT AS IDENTITY( INCREMENT BY 1 MINVALUE >>>>>>> 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL, >>>>>>> userid int8 NOT NULL, >>>>>>> addons_json jsonb default '{}'::jsonb >>>>>>> CONSTRAINT googledocs_tbl_pkey PRIMARY KEY (gdid), >>>>>>> ); >>>>>>> >>>>>>> and the query would be like >>>>>>> where userid = 12345678 and ((addons_json @> {t1:1}) or >>>>>>> (addons_json @> {t1:2}) or (addons_json @> {t1:3}) >>>>>>> more key filters if customer applies >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Mon, Dec 23, 2024 at 10:38 PM David G. Johnston < >>>>>>> david.g.johns...@gmail.com> wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Dec 23, 2024, 10:01 Divyansh Gupta JNsThMAudy < >>>>>>>> ag1567...@gmail.com> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> So here my question is considering one JSONB column is perfect or >>>>>>>>> considering 50 columns will be more optimised. >>>>>>>>> >>>>>>>> The relational database engine is designed around the column-based >>>>>>>> approach. Especially if the columns are generally unchanging, combined >>>>>>>> with using fixed-width data types. >>>>>>>> >>>>>>>> David J. >>>>>>>> >>>>>>>> >>>>> >>>>> -- >>>>> Death to <Redacted>, and butter sauce. >>>>> Don't boil me, I'm still alive. >>>>> <Redacted> lobster! >>>>> >>>> >>> >>> -- >>> Death to <Redacted>, and butter sauce. >>> Don't boil me, I'm still alive. >>> <Redacted> lobster! >>> >> -- Death to <Redacted>, and butter sauce. Don't boil me, I'm still alive. <Redacted> lobster!