Re: [DISCUSS] CEP-20: Dynamic Data Masking

Aaron Ploetz Tue, 23 Aug 2022 10:13:13 -0700

Some thoughts on this one:

In a prior job, we'd give app teams access to a single keyspace, and two
roles: a read-write role and a read-only role.  In some cases, a
"privileged" application role was also requested.  Depending on the
requirements, I could see the UNMASK permission being applied to the RW or
privileged roles.  But if there's a problem with the table and the operators
go in to investigate, they will likely use a SUPERUSER account, and they'll
see that data.


How hard would it be for SUPERUSERs to *not* automatically get the UNMASK
permission?
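
To make the question concrete, here is a toy Python model of the permission
check. It is not Cassandra code: the Role class, the role names, the
mask_all_but_last4 function, and the superuser_implies_unmask switch are all
hypothetical, chosen only to show the two behaviours being compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """Hypothetical role: a name, explicit permissions, and a superuser flag."""
    name: str
    permissions: frozenset = frozenset()
    superuser: bool = False


def can_unmask(role, superuser_implies_unmask=True):
    """Return True if the role would see clear (unmasked) data."""
    # Assumed current behaviour: superusers implicitly hold every permission.
    if role.superuser and superuser_implies_unmask:
        return True
    return "UNMASK" in role.permissions


def mask_all_but_last4(value):
    """Illustrative mask: hide everything except the last four characters."""
    return "*" * max(len(value) - 4, 0) + value[-4:]


def read_column(role, value, mask, superuser_implies_unmask=True):
    """Return the clear value or the masked value, depending on the role."""
    if can_unmask(role, superuser_implies_unmask):
        return value
    return mask(value)


read_only = Role("app_ro", frozenset({"SELECT"}))
privileged = Role("app_privileged", frozenset({"SELECT", "UNMASK"}))
operator = Role("ops_admin", superuser=True)
```

With the switch left on, read_column(operator, ...) returns the clear value;
with it off, the operator sees the masked value unless UNMASK is granted
explicitly.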

I'll also echo the concerns around masking primary key components.  It's
highly likely that certain personal data properties would be used as a
partition or clustering key (e.g., a range query for people born within a
certain timeframe).  In addition to the "breaks existing" concern, I'm
curious about the challenges around getting that to work with the current
primary key implementation.
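
One way to picture the difficulty is the toy Python sketch below. It is only
an illustration under assumptions of my own, not how Cassandra or the CEP
handles keys: the rows, the mask_date function, and range_query are
hypothetical. The filter runs on the clear clustering value and the mask is
applied only to what is returned.

```python
from datetime import date

# Hypothetical table contents; birth_date plays the role of a clustering key.
rows = [
    {"name": "alice", "birth_date": date(1985, 3, 14)},
    {"name": "bob", "birth_date": date(1990, 7, 2)},
    {"name": "carol", "birth_date": date(1999, 12, 31)},
]


def mask_date(_value):
    """Illustrative mask: replace any date with a fixed default."""
    return date(1970, 1, 1)


def range_query(rows, start, end, mask=None):
    """Filter on the clear key, then mask the returned value if asked to."""
    result = []
    for row in rows:
        if start <= row["birth_date"] <= end:
            value = mask(row["birth_date"]) if mask else row["birth_date"]
            result.append({"name": row["name"], "birth_date": value})
    return result


start, end = date(1980, 1, 1), date(1995, 1, 1)
clear = range_query(rows, start, end)
masked = range_query(rows, start, end, mask=mask_date)
```

In this sketch the masked result still contains alice and bob, so the caller
learns that both birth dates fall inside the queried range even though the
returned values do not. Narrowing the range repeatedly would recover the
clear value, which is one reason masking key columns seems harder than
masking payload columns.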

Does this first implementation only apply to payload (non-key) columns?
The examples in the CEP currently do not show primary key components being
masked.

Thanks,

Aaron


On Tue, Aug 23, 2022 at 6:44 AM Henrik Ingo <henrik.i...@datastax.com>
wrote:

> On Tue, Aug 23, 2022 at 1:10 PM Andrés de la Peña <adelap...@apache.org>
> wrote:
>
>>> One thought: The way the CEP is currently written, it is only possible to
>>> mask a column one way. You can only define one masking function for a
>>> column, and since you use the original column name, you could only return
>>> one version of it in the result set, even if you had a way to define
>>> several functions.
>>>
>>
>> Right, it's one single type of mapping per column, declared on
>> CREATE/ALTER TABLE statements. Also, users can manually specify their own
>> masking function in SELECT statements if they have permissions for seeing
>> the clear data.
>>
>> For those cases where the data is automatically masked for an
>> unprivileged user, I don't see the use of including different types of
>> masking for the same column in the same result set. Instead, we might be
>> interested in having different types of masking associated with different
>> roles. We could do so with dedicated CREATE/DROP/LIST MASK statements,
>> instead of using the CREATE/ALTER/DESCRIBE TABLE statements. That CREATE
>> MASK statement would associate a masking function with a column and role.
>> However, I'm not sure we need that type of granularity instead of the
>> simplicity of attaching the masking to the column declaration. wdyt?
>>
>>
>>
> My gut feeling likewise is that this adds complexity but little value.
>
>
> --
>
> Henrik Ingo
>
> +358 40 569 7354 <358405697354>
>
