"But the problem is I can't use secondary indexing "where int25=5", while with normal columns I can."
On Thu, Sep 15, 2016 at 9:53 PM, Dorian Hoxha <dorian.ho...@gmail.com> wrote:

> Yes, that makes more sense. But the problem is I can't use secondary
> indexing "where int25=5", while with normal columns I can.
>
> On Thu, Sep 15, 2016 at 8:23 PM, sfesc...@gmail.com <sfesc...@gmail.com>
> wrote:
>
>> I agree a single blob would also work (I do that in some cases). The
>> reason for the map is if you need more flexible updating. I think your
>> solution of a map per data type works well.
>>
>> On Thu, Sep 15, 2016 at 11:10 AM DuyHai Doan <doanduy...@gmail.com>
>> wrote:
>>
>>> "But I need rows together to work with them (indexing etc)"
>>>
>>> What do you mean by rows together? You mean that you want to fetch a
>>> single row instead of 1 row per property, right?
>>>
>>> In this case, the map might be the solution:
>>>
>>> CREATE TABLE generic_with_maps(
>>>     object_id uuid,
>>>     boolean_map map<text, boolean>,
>>>     text_map map<text, text>,
>>>     long_map map<text, bigint>,
>>>     timestamp_map map<text, timestamp>,
>>>     -- ... one map column per value type you need
>>>     PRIMARY KEY(object_id)
>>> );
>>>
>>> The trick here is to store all the fields of the object in different
>>> maps, depending on the type of the field.
>>>
>>> The map key is always text and it contains the name of the field.
>>>
>>> Example:
>>>
>>> {
>>>     "id": xxxx,
>>>     "name": "John DOE",
>>>     "age": 32,
>>>     "last_visited_date": "2016-09-10 12:01:03"
>>> }
>>>
>>> INSERT INTO generic_with_maps(object_id, text_map, long_map, timestamp_map)
>>> VALUES(xxx, {'name': 'John DOE'}, {'age': 32},
>>>        {'last_visited_date': '2016-09-10 12:01:03'});
>>>
>>> When you do a select, you'll get a SINGLE row returned. But then you
>>> need to extract all the properties from the different maps, not a big
>>> deal.
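To make the read and update side of that example concrete, a minimal sketch against the generic_with_maps table above (xxx stands in for the real object_id, as in the INSERT; how you pull values out of the returned maps depends on your driver):

SELECT text_map, long_map, timestamp_map
FROM generic_with_maps
WHERE object_id = xxx;

-- One row comes back; each property is read from the map matching its type,
-- e.g. the name from text_map under the key 'name', the age from long_map
-- under the key 'age'.

-- Individual fields can also be updated without rewriting the whole map,
-- which is the "more flexible updating" point made earlier in the thread:
UPDATE generic_with_maps SET long_map['age'] = 33 WHERE object_id = xxx;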
>>> On Thu, Sep 15, 2016 at 7:54 PM, Dorian Hoxha <dorian.ho...@gmail.com>
>>> wrote:
>>>
>>>> @DuyHai
>>>> Yes, that's another case, the "entity" model used in RDBMSs. But I
>>>> need rows together to work with them (indexing etc).
>>>>
>>>> @sfespace
>>>> The map is needed when you have a dynamic schema. I don't have a
>>>> dynamic schema (I may later, and will use the map if I do). I just
>>>> have thousands of schemas. One user needs 10 integers, while another
>>>> user needs 20 booleans, and another needs 30 integers, or a
>>>> combination of them all.
>>>>
>>>> On Thu, Sep 15, 2016 at 7:46 PM, DuyHai Doan <doanduy...@gmail.com>
>>>> wrote:
>>>>
>>>>> "Another possible alternative is to use a single map column"
>>>>>
>>>>> --> how do you manage the different types then? Because maps in
>>>>> Cassandra are strongly typed.
>>>>>
>>>>> Unless you set the type of the map value to blob, in which case you
>>>>> might as well store the whole object as a single blob column.
>>>>>
>>>>> On Thu, Sep 15, 2016 at 6:13 PM, sfesc...@gmail.com
>>>>> <sfesc...@gmail.com> wrote:
>>>>>
>>>>>> Another possible alternative is to use a single map column.
>>>>>>
>>>>>> On Thu, Sep 15, 2016 at 7:19 AM Dorian Hoxha <dorian.ho...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Since I will only have 1 table with that many columns, and the
>>>>>>> other tables will be "normal" tables with at most 30 columns, and
>>>>>>> the memory overhead of 2K columns won't be that big, I'm going to
>>>>>>> guess I'll be fine.
>>>>>>>
>>>>>>> The data model is too dynamic; the alternative would be to create
>>>>>>> a table for each user, which would have even more overhead since
>>>>>>> the number of users is in the several thousands/millions.
>>>>>>>
>>>>>>> On Thu, Sep 15, 2016 at 3:04 PM, DuyHai Doan <doanduy...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> There is no real limit on the number of columns in a table. I
>>>>>>>> would say that the impact of having a lot of columns is the
>>>>>>>> amount of metadata C* needs to keep in memory for
>>>>>>>> encoding/decoding each row.
>>>>>>>>
>>>>>>>> Now, if you have a table with 1000+ columns, the problem is
>>>>>>>> probably your data model...
>>>>>>>>
>>>>>>>> On Thu, Sep 15, 2016 at 2:59 PM, Dorian Hoxha
>>>>>>>> <dorian.ho...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Is there a lot of overhead with having a big number of columns
>>>>>>>>> in a table? Not unbounded, but say, would 2000 be a problem
>>>>>>>>> (I think that's the maximum I'll need)?
>>>>>>>>>
>>>>>>>>> Thank You
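A closing note on the "where int25=5" question against the map model: none of the participants proposed it, but plain (non-SASI) secondary indexes have been able to cover map entries since Cassandra 2.2, with the usual secondary-index performance caveats at scale. A sketch against the generic_with_maps table from the thread, using a hypothetical index name:

CREATE INDEX long_map_entries_idx ON generic_with_maps (ENTRIES(long_map));

-- With the ENTRIES index in place, equality on a specific map key works:
SELECT * FROM generic_with_maps WHERE long_map['int25'] = 5;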