[email protected] has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20447 )

Change subject: [WIP] IMPALA-13039: AES Encryption/ Decryption Support in Impala
......................................................................


Patch Set 10:

(71 comments)

> Patch Set 10:
>
> (3 comments)
Hi guys, some comments related to formatting, variable name enhancement, 
paraphrasing, commit message and tests are still pending. Will keep updating 
them soon along in upcoming patch sets. Need some clarification regarding some 
comments so will reach out for the same and address them. ECB support for 
encryption has been removed and an error is added in such a case. I hope this 
design is in congruence with the latest discussions on the same.

http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1812
PS9, Line 1812:
              : // Description of the modes s
> I find it a bit misleading, shouldn't we return an error in this case? Now
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1846
PS9, Line 1846:     return StringVal::null();
> We should check if 'key' or 'iv' (in case of GCM) is null. Shouldn't that b
key is getting checked at 1851.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1857
PS9, Line 1857:   AES_CIPHER_MODE mode_;
> Should we care about the IV when using e.g. ECB which doesn't use an IV?
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1863
PS9, Line 1863:   }
> Why use new for a local object? Just instantiate it
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1863
PS9, Line 1863:
> Could use a more descriptive variable name, e.g. 'encryption' ('key' is alr
Do you mean instead of 't'?


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1868
PS9, Line 1868:
> If there is an ELSE branch, always use braces ('{}') around the blocks.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1872
PS9, Line 1872:
> IV could be NULL.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1876
PS9, Line 1876:
> This could be extracted to a variable before L1869 so it can be used there
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1886
PS9, Line 1886:   }
> Variable-length arrays are not part of the C++ standard (though GCC support
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1889
PS9, Line 1889:   t.InitializeFields(key.ptr, key.len, iv.ptr, iv.len);
> No need to extract this into a variable.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1891
PS9, Line 1891: utput leng
> No need for the cast as uint8_t and char are the same type (though technica
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions.h
File be/src/exprs/string-functions.h:

http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions.h@28
PS9, Line 28: using namespace impala_udf;
> Unneeded import.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions.h@30
PS9, Line 30: namespace impala {
> Unneeded import.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions.h@31
PS9, Line 31:
> Unneeded import.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions.h@33
PS9, Line 33: using impala_udf::AnyVal;
> Unneeded import.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions.h@52
PS9, Line 52:   enum TrimPosition {
> Unused forward declaration.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util-test.cc
File be/src/util/openssl-util-test.cc:

http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util-test.cc@68
PS9, Line 68:
> We're in the impala namespace, no need to specify it. Applies to the other
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util-test.cc@68
PS9, Line 68:     AES_CIPHER_MODE modes[] = {
> line too long (145 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h
File be/src/util/openssl-util.h:

http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@25
PS9, Line 25: #include "common/status.h"
> Unneeded include.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@72
PS9, Line 72:
> Are all of these supported? The comment of class EncryptionKey only lists G
Yes, all of these are supported.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@73
PS9, Line 73:   AES_256_CFB,
> Why are we removing AES_192_ECB?
It's put under future steps with all other modes


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@80
PS9, Line 80:
> DEFAULT or AES_DEFAULT would be better. On the other hand I don't think we
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@127
PS9, Line 127: ngth ciphertexts. The IV serves as input
> ECB is not a stream cipher mode, this is misleading now.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@138
PS9, Line 138:
> As far as I know the AAD is not confidential, and the previous text didn't
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@153
PS9, Line 153:  is
> Why change "This key" to "The key"? "key" here refers to the EncryptionKey
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@153
PS9, Line 153: case of ECB mode. In other modes, 'out' buff
> This is no longer true for ECB because an extra block can be added.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@154
PS9, Line 154:   /// buffer. This key must be initialized before calling this 
function. If 'in' == 'out',
> line too long (94 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@157
PS9, Line 157:
> Isn't this the output length? It should be named accordingly and also menti
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@157
PS9, Line 157:   /// data_read refers to the output length.
> line too long (111 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@157
PS9, Line 157:
> Use nullptr instead.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@160
PS9, Line 160:
> See L153.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@161
PS9, Line 161:  da
> See L153 (this vs. the).
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@164
PS9, Line 164:
> See L157.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@164
PS9, Line 164: 
> Use nullptr instead.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@164
PS9, Line 164:   /// same length as 'data'buffer.
> line too long (111 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@166
PS9, Line 166: -place; otherwise, the buffers must not overla
> Is it still true after this patch?
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@174
PS9, Line 174: ironme
> Start this on a new line. We only skip the newline after the opening "{" if
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@174
PS9, Line 174:  speci
> We're inside the impala namespace, so no need to specify it. Also applies t
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@178
PS9, Line 178: runtim
> Add a newline.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@181
PS9, Line 181: mode_ == AES_CIP
> Add function comment.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@197
PS9, Line 197: sted by the user specifically.
> See L153.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@201
PS9, Line 201: deToStrin
> See L157.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@224
PS9, Line 224:  E
> Should end with an underscore: 'key_length_'.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@224
PS9, Line 224:   /// Returns a EVP_CIPHER according to cipher mode at runtime
> This should be right after L214.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@225
PS9, Line 225:   const EVP_CIPHER* GetCipher() const;
> This should be right after L217.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@225
PS9, Line 225: PH
> Should end with an underscore: 'iv_length_'.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc
File be/src/util/openssl-util.cc:

http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@201
PS9, Line 201:   initialized_ = true;
> line too long (99 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@205
PS9, Line 205:     int64_t* data_read) {
> line too long (99 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@215
PS9, Line 215: ool encrypt, const uint8_t* data, int64_t len, uint8_t* out, 
int64_t* data_read) {
             :   DCHECK(initialized_);
> Instead of this comment, you could have a variable
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@221
PS9, Line 221:  otherwise it will be disabled.
> Isn't ECB padded to a full block?
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@226
PS9, Line 226:  arbitrary lengt
> Should use nullptr instead of NULL, also on the next line.
In the first initialization, we initialize only evpCipher and in the second 
initialization, we set key and iv vector. We do this because, we take variable 
length iv vector, hence, we have to initialize iv length in gcm mode first 
before initializing iv vector.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@233
PS9, Line 233: , nu
> Now that we've touched the line, it should also be changed to nullptr.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@239
PS9, Line 239:     // Set iv_vector for GCM mode.
> See L226 about initialising it twice.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@255
PS9, Line 255:     int in_len = static_cast<int>(min<int64_t>(len - offset, 
numeric_limits<int>::max()));
> Why did you remove DCHECK_EQ(in_len, out_len)? If it is no longer the case
ECB mode has different in_len and out_len. It doesn't matter to use different 
offsets as we traverse congruently in both data and out buffer.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@257
PS9, Line 257: t ?
> Use nullptr instead.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@274
PS9, Line 274:
> Use nullptr instead.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@277
PS9, Line 277:
> Use nullptr instead.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@280
PS9, Line 280:                           &final_out_len) :
> Is it sure that we can't handle it without gcm_tag? It's not good if we can
Yes, its sure. Whenever we'll support gcm_tag in further patches, we can 
uncomment this and add sufficient error messages.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@289
PS9, Line 289: ) {
> Use nullptr instead.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@311
PS9, Line 311: PHE
> Why are we changing the default mode?
We're not changing the default mode here, we're just returning the cipher for 
our supported modes which could be default mode as well. Just to make it 
uniform, added a nullptr return statement and added a DCHECK for the same in 
EncryptInternal. setCipherMode method makes sure that the modes we initialize 
are the ones we support.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@316
PS9, Line 316:
> As I commented at openssl-util.h:80, I don't think having a 'default' membe
Done.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@336
PS9, Line 336:     memcpy(iv_, iv, iv_length_);
> Use braces around the IF and ELSE blocks.
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@339
PS9, Line 339:
> Why do we need to null out the 'iv_' array if 'iv_len' is zero while when 0
We only use iv_length and thus removed the else block. Done.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@365
PS9, Line 365:       return (MaxSupportedTlsVersion() >= TLS1_2_VERSION && 
EVP_aes_256_ctr);
> I don't think it's necessary to insert return statements after each case, w
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@371
PS9, Line 371:
> What happened to AES_192_ECB? Did we remove support for it?
It's not supported in this patch as mentioned in the future steps in the commit 
message.


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@374
PS9, Line 374:
> Doesn't AES_128_GCM have the same problem as AES_256_GCM, described on L345
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@383
PS9, Line 383: MODE::
> No need for "impala::".
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@384
PS9, Line 384:   }
> If the whole IF statement doesn't fit on one line, use braces around the IF
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@392
PS9, Line 392: ;
> Shouldn't we include that it is the 256 bit version of GCM? We also have AE
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@403
PS9, Line 403:     r
> Const doesn't make much sense when returning a value type (as opposed to po
Done


http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@404
PS9, Line 404:   }
> When we have ELSE [IF] branches, we always use braces around the statement
Done



--
To view, visit http://gerrit.cloudera.org:8080/20447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3902f2b1d95da4d06995cbd687e79c48e16190c9
Gerrit-Change-Number: 20447
Gerrit-PatchSet: 10
Gerrit-Owner: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Comment-Date: Wed, 10 Jul 2024 19:46:31 +0000
Gerrit-HasComments: Yes

Reply via email to