Sounds interesting. I would like to understand a couple of things here. If the 
column names are the same for masked and unmasked data, it would impact 
existing applications. I am curious what the transition plan looks like for 
applications that expect unmasked data?

For example, let’s say you store SSNs and birth dates. Upon enabling this 
feature, let’s say the app user is not given the UNMASK permission. Now the app 
is receiving masked values for these columns. This is fine for most read-only 
applications. However, these columns are often used as primary keys, or as part 
of primary keys, in other tables. An app that reads a masked value and then uses 
it to look up rows in those tables would no longer find them, which would break 
existing applications.
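
To make the concern concrete, here is a rough Python sketch. It does not use 
any Cassandra or driver API. The two dicts stand in for tables, and the names 
mask_default, read_user and find_account are hypothetical. The only thing taken 
from the proposal is the idea of a mask that replaces the value with "XXXX".

```python
# Hypothetical in-memory stand-ins for two tables; this is not Cassandra code.
users = {"alice": {"ssn": "123-45-6789", "birth_date": "1990-01-01"}}
accounts_by_ssn = {"123-45-6789": {"balance": 100}}  # keyed by the SSN


def mask_default(value):
    """Stand-in for a masking function that replaces any value with 'XXXX'."""
    return "XXXX"


def read_user(name, has_unmask):
    """Return the user's row, masked unless the caller has UNMASK."""
    row = users[name]
    if has_unmask:
        return dict(row)
    return {col: mask_default(val) for col, val in row.items()}


def find_account(name, has_unmask):
    """Read the SSN first, then use it as the key into the second table."""
    ssn = read_user(name, has_unmask)["ssn"]
    return accounts_by_ssn.get(ssn)


print(find_account("alice", has_unmask=True))   # key matches, account is found
print(find_account("alice", has_unmask=False))  # key is "XXXX", lookup misses
```

The second lookup does not fail loudly. It just returns nothing, which is the 
kind of silent breakage I am worried about.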

How would this work in mixed mode, when a few nodes in the cluster are masking 
data and others aren’t? How would it impact the driver?

How would the application learn that the column values are masked? This is 
important in case a user has the UNMASK permission and it is later taken away. 
Again, this would break a lot of applications.

Dinesh

> On Aug 19, 2022, at 4:50 AM, Andrés de la Peña <adelap...@apache.org> wrote:
> 
> 
> Hi everyone,
> 
> I'd like to start a discussion about this proposal for dynamic data masking: 
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-20%3A+Dynamic+Data+Masking
> 
> Dynamic data masking allows to obscure sensitive information without changing 
> the stored data. It would be based on a set of native CQL functions providing 
> different types of masking, such as replacing the column value by "XXXX". 
> These functions could be used as regular functions or attached to table 
> columns with CREATE/ALTER table. There would be a new UNMASK permission, so 
> only the users with this permissions would be able to see the unmasked column 
> values. It would be possible to customize masking by using UDFs as masking 
> functions.
> 
> Thanks,
