Re: [DISCUSS] Best way to re-encrypt existing data (TDE cache key rotation).

Nikolay Izhikov Mon, 25 May 2020 02:00:49 -0700

> This will take us to the re-encryption using full rebalancing

Rebalancing will require twice the effort for re-encryption:

1. Read and send data from the supplier node.
2. Re-encrypt and write data on the demander node.

Instead of:

1. Read, re-encrypt and write data on the "demander" node.
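
To illustrate the single-pass variant, here is a minimal Java sketch of a local read, re-encrypt, write step for one page. It is not Ignite code: the class LocalReencryptSketch, its method names, and the choice of AES/GCM with a random 12-byte IV stored in front of the ciphertext are all assumptions made for the example. Only the standard javax.crypto API is used.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

/** Hypothetical sketch of local page re-encryption; not part of Ignite. */
final class LocalReencryptSketch {
    private static final int IV_LEN = 12;    // bytes; assumed layout: IV || ciphertext
    private static final int TAG_BITS = 128; // GCM authentication tag length
    private static final SecureRandom RND = new SecureRandom();

    /** Generates a fresh 128-bit AES key (stands in for a cache group key). */
    static SecretKey newKey() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            return gen.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts a page and returns IV followed by ciphertext. */
    static byte[] encrypt(byte[] plain, SecretKey key) {
        try {
            byte[] iv = new byte[IV_LEN];
            RND.nextBytes(iv);
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = c.doFinal(plain);
            byte[] out = Arrays.copyOf(iv, IV_LEN + ct.length);
            System.arraycopy(ct, 0, out, IV_LEN, ct.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Decrypts a stored page; fails if the key does not match. */
    static byte[] decrypt(byte[] stored, SecretKey key) {
        try {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, stored, 0, IV_LEN));
            return c.doFinal(stored, IV_LEN, stored.length - IV_LEN);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Single local pass: read with the old key, write with the new key. */
    static byte[] reencrypt(byte[] stored, SecretKey oldKey, SecretKey newKey) {
        return encrypt(decrypt(stored, oldKey), newKey);
    }
}
```

With rebalancing, the decrypt half would run on the supplier and the encrypt half on the demander, with a network transfer in between. In the sketch both halves run in one place.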


> On May 25, 2020, at 11:46, Alexei Scherbakov <alexey.scherbak...@gmail.com> wrote:
> 
> For me, the one big disadvantage of offline re-encryption is the
> possibility of running out of WAL history.
> If re-encryption takes a long time, we will get full rebalancing with
> partition eviction.
> This will take us to the re-encryption using full rebalancing, which I
> proposed earlier.
> 
> 
> 
> On Mon, May 25, 2020 at 11:27, Nikolay Izhikov <nizhi...@apache.org> wrote:
> 
>>> And definitely this approach is much simpler to implement
>> 
>> I agree.
>> 
>> If we allow taking nodes offline for re-encryption, then we can implement a
>> fully offline procedure:
>> 
>> 1. Stop the node.
>> 2. Execute a control.sh command that re-encrypts all data without
>> starting the node.
>> 3. Start the node.
>> 
>> Pavel, could you please state once more what the disadvantages of the
>> offline procedure are?
>> 
>>> On May 25, 2020, at 11:20, Alexei Scherbakov <alexey.scherbak...@gmail.com> wrote:
>>> 
>>> And definitely this approach is much simpler to implement because all
>>> corner cases are handled by rebalancing code.
>>> 
>>> On Mon, May 25, 2020 at 11:16, Alexei Scherbakov <alexey.scherbak...@gmail.com> wrote:
>>> 
>>>> I mean: serving supply requests.
>>>> 
>>>> On Mon, May 25, 2020 at 11:15, Alexei Scherbakov <alexey.scherbak...@gmail.com> wrote:
>>>> 
>>>>> Nikolay,
>>>>> 
>>>>> Can you explain why such a restriction is necessary?
>>>>> Most likely, having a currently re-encrypting node serve only demand
>>>>> requests will have the least performance impact on the grid.
>>>>> 
>>>>> On Mon, May 25, 2020 at 11:08, Nikolay Izhikov <nizhi...@apache.org> wrote:
>>>>> 
>>>>>> Hello, Alexei.
>>>>>> 
>>>>>> I think we want to implement this feature without node restarts.
>>>>>> In the ideal scenario, all nodes will stay alive and respond to user
>>>>>> requests.
>>>>>> 
>>>>>>> On May 24, 2020, at 15:24, Alexei Scherbakov <alexey.scherbak...@gmail.com> wrote:
>>>>>>> 
>>>>>>> Pavel Pereslegin,
>>>>>>> 
>>>>>>> I see another opportunity.
>>>>>>> We can use rebalancing to re-encrypt node data with a new key.
>>>>>>> It's a trivial procedure for me: stop a node, clear the database, change
>>>>>>> the key, start the node and wait for rebalancing to complete.
>>>>>>> Data will be re-encrypted during rebalancing.
>>>>>>> 
>>>>>>> Did I miss something?
>>>>>>> 
>>>>>>> On Fri, May 22, 2020 at 16:14, Ivan Rakov <ivan.glu...@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Folks,
>>>>>>>> 
>>>>>>>> Just keeping you informed: my colleagues and I are highly interested
>>>>>>>> in TDE in general and key rotation specifically, but we haven't had
>>>>>>>> enough time so far.
>>>>>>>> We'll dive into this feature and participate in reviews next month.
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Best Regards,
>>>>>>>> Ivan Rakov
>>>>>>>> 
>>>>>>>> On Sun, May 17, 2020 at 10:51 PM Pavel Pereslegin <xxt...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hello, Alexey.
>>>>>>>>> 
>>>>>>>>>> is the encryption key for the data the same on all nodes in the
>>>>>>>>>> cluster?
>>>>>>>>> Yes, each encrypted cache group has its own encryption key, and the
>>>>>>>>> key is the same on all nodes.
>>>>>>>>> 
>>>>>>>>>> Clearly, during the re-encryption there will exist pages
>>>>>>>>>> encrypted with both new and old keys at the same time.
>>>>>>>>> Yes, there will be pages encrypted with different keys at the same
>>>>>>>>> time.
>>>>>>>>> Currently, we only store one key per cache group. To rotate a key,
>>>>>>>>> at a certain point in time it is necessary to support several keys
>>>>>>>>> (at least for reading the WAL).
>>>>>>>>> For the "in place" strategy, we'll store the encryption key
>>>>>>>>> identifier on each encrypted page (we currently have some unused
>>>>>>>>> space on an encrypted page, so I don't expect any memory overhead
>>>>>>>>> here). Thus, we will have several keys for reading and one key for
>>>>>>>>> writing. I assume that the old key will be automatically deleted
>>>>>>>>> when a specific WAL segment is deleted (and re-encryption is
>>>>>>>>> finished).
>>>>>>>>> 
>>>>>>>>>> Will a node continue to re-encrypt the data after it restarts?
>>>>>>>>> Yes.
>>>>>>>>> 
>>>>>>>>>> If a node goes down during the re-encryption, but the rest of the
>>>>>>>>>> cluster finishes re-encryption, will we consider the procedure
>>>>>>>>>> complete?
>>>>>>>>> I'm not sure, but it looks like the key rotation is complete when we
>>>>>>>>> set the new key on all nodes so that the updates will be encrypted
>>>>>>>>> with the new key (as required by PCI DSS).
>>>>>>>>> The status of re-encryption can be obtained separately (locally or
>>>>>>>>> cluster-wide).
>>>>>>>>> 
>>>>>>>>> I forgot to mention that with “in place” re-encryption it will be
>>>>>>>>> impossible to quickly cancel re-encryption, because by canceling we
>>>>>>>>> mean re-encryption with the old key.
>>>>>>>>> 
>>>>>>>>>> How do you see the whole key rotation procedure will work?
>>>>>>>>> The initial design for re-encryption with "partition copying" is
>>>>>>>>> described here [1]. I'll prepare a detailed design for "in place"
>>>>>>>>> re-encryption if we go this way. In short: send the new encryption
>>>>>>>>> key cluster-wide, then each node adds the new key and starts
>>>>>>>>> background re-encryption.
>>>>>>>>> 
>>>>>>>>> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652384#TDE.Phase-3.Cachekeyrotation.-Copywithre-encryptiondesign
>>>>>>>>> 
>>>>>>>>> On Sun, May 17, 2020 at 18:35, Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Pavel, Anton,
>>>>>>>>>> 
>>>>>>>>>> How do you see the whole key rotation procedure will work? Clearly,
>>>>>>>>>> during the re-encryption there will exist pages encrypted with both
>>>>>>>>>> new and old keys at the same time. Will a node continue to
>>>>>>>>>> re-encrypt the data after it restarts? If a node goes down during
>>>>>>>>>> the re-encryption, but the rest of the cluster finishes
>>>>>>>>>> re-encryption, will we consider the procedure complete? By the way,
>>>>>>>>>> is the encryption key for the data the same on all nodes in the
>>>>>>>>>> cluster?
>>>>>>>>>> 
>>>>>>>>>> On Thu, May 14, 2020 at 11:30, Anton Vinogradov <a...@apache.org> wrote:
>>>>>>>>>> 
>>>>>>>>>>> +1 to "In place re-encryption".
>>>>>>>>>>> 
>>>>>>>>>>> - It has a simple design.
>>>>>>>>>>> - Clusters under load may need nothing more than that load to
>>>>>>>>>>> re-encrypt the data (load-friendly).
>>>>>>>>>>> - Easy to throttle.
>>>>>>>>>>> - Easy to continue.
>>>>>>>>>>> - Design compatible with the multi-key architecture.
>>>>>>>>>>> - It can be optimized to use its own WAL buffer and to re-encrypt
>>>>>>>>>>> pages without restoring them on-heap.
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, May 14, 2020 at 1:54 AM Pavel Pereslegin <xxt...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hello Igniters.
>>>>>>>>>>>> 
>>>>>>>>>>>> Recently, master key rotation for Apache Ignite Transparent Data
>>>>>>>>>>>> Encryption was implemented [1], but some security standards (PCI
>>>>>>>>>>>> DSS at least) require rotation of all encryption keys [2].
>>>>>>>>>>>> Currently, encryption occurs when reading/writing pages to disk,
>>>>>>>>>>>> and cache encryption keys are stored in the metastore.
>>>>>>>>>>>> 
>>>>>>>>>>>> I'm going to contribute cache encryption key rotation and want to
>>>>>>>>>>>> ask what the best way to re-encrypt existing data is. I see two
>>>>>>>>>>>> different strategies.
>>>>>>>>>>>> 
>>>>>>>>>>>> 1. In place re-encryption:
>>>>>>>>>>>> Using the old key, sequentially read all the pages from the
>>>>>>>>>>>> datastore, mark them as dirty and log them into the WAL. After a
>>>>>>>>>>>> checkpoint, pages will be stored to disk encrypted with the new
>>>>>>>>>>>> key (as usual, along with updates). This strategy requires storing
>>>>>>>>>>>> the identifier (number) of the encryption key in the encrypted
>>>>>>>>>>>> page.
>>>>>>>>>>>> pros:
>>>>>>>>>>>> - can work in the background with minimal performance impact
>>>>>>>>>>>> (this impact can be managed).
>>>>>>>>>>>> cons:
>>>>>>>>>>>> - page duplication in the WAL may affect performance and
>>>>>>>>>>>> historical rebalance.
>>>>>>>>>>>> 
>>>>>>>>>>>> 2. Copy partition with re-encryption.
>>>>>>>>>>>> This strategy is similar to partition snapshotting [3]: create a
>>>>>>>>>>>> partition copy encrypted with the new key and then replace the
>>>>>>>>>>>> original partition file with the new one (see details [4]).
>>>>>>>>>>>> pros:
>>>>>>>>>>>> - should work faster than "in place" re-encryption.
>>>>>>>>>>>> cons:
>>>>>>>>>>>> - re-encryption in an active cluster (and on an unstable topology)
>>>>>>>>>>>> can be difficult to implement.
>>>>>>>>>>>> 
>>>>>>>>>>>> (See more detailed comparison [5])
>>>>>>>>>>>> 
>>>>>>>>>>>> Re-encryption of existing data is a long and rare procedure (it is
>>>>>>>>>>>> recommended to change the key every 6 months, but at least once
>>>>>>>>>>>> every 2 years). Thus, re-encryption can be implemented for
>>>>>>>>>>>> maintenance mode (for example, on a stable topology in a read-only
>>>>>>>>>>>> cluster), and in such a case the approach with partition copying
>>>>>>>>>>>> seems simpler and faster.
>>>>>>>>>>>> 
>>>>>>>>>>>> So, what do you think: do we need "online" re-encryption, and
>>>>>>>>>>>> which of the proposed options is best suited for this?
>>>>>>>>>>>> 
>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-12186
>>>>>>>>>>>> [2] https://www.pcisecuritystandards.org/documents/PCI_DSS_v3-2-1.pdf
>>>>>>>>>>>> [3] https://cwiki.apache.org/confluence/display/IGNITE/IEP-43%3A+Cluster+snapshots#IEP-43:Clustersnapshots-Partitionscopystrategy
>>>>>>>>>>>> [4] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652384#TDE.Phase-3.Cachekeyrotation.-Copywithre-encryptiondesign
>>>>>>>>>>>> [5] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652384#TDE.Phase-3.Cachekeyrotation.-Comparison
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> Alexei Scherbakov
>>>>>> 
>>>>>> 
>>>>> 
>>>>> --
>>>>> 
>>>>> Best regards,
>>>>> Alexei Scherbakov
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> 
>>>> Best regards,
>>>> Alexei Scherbakov
>>>> 
>>> 
>>> 
>>> --
>>> 
>>> Best regards,
>>> Alexei Scherbakov
>> 
>> 
> 
> -- 
> 
> Best regards,
> Alexei Scherbakov
