[email protected] has posted comments on this change. ( http://gerrit.cloudera.org:8080/20447 )
Change subject: [WIP] IMPALA-13039: AES Encryption/ Decryption Support in Impala ...................................................................... Patch Set 10: (71 comments) > Patch Set 10: > > (3 comments) Hi guys, some comments related to formatting, variable name enhancement, paraphrasing, commit message and tests are still pending. Will keep updating them soon along in upcoming patch sets. Need some clarification regarding some comments so will reach out for the same and address them. ECB support for encryption has been removed and an error is added in such a case. I hope this design is in congruence with the latest discussions on the same. http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1812 PS9, Line 1812: : // Description of the modes s > I find it a bit misleading, shouldn't we return an error in this case? Now Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1846 PS9, Line 1846: return StringVal::null(); > We should check if 'key' or 'iv' (in case of GCM) is null. Shouldn't that b key is getting checked at 1851. http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1857 PS9, Line 1857: AES_CIPHER_MODE mode_; > Should we care about the IV when using e.g. ECB which doesn't use an IV? Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1863 PS9, Line 1863: } > Why use new for a local object? Just instantiate it Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1863 PS9, Line 1863: > Could use a more descriptive variable name, e.g. 'encryption' ('key' is alr Do you mean instead of 't'? http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1868 PS9, Line 1868: > If there is an ELSE branch, always use braces ('{}') around the blocks. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1872 PS9, Line 1872: > IV could be NULL. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1876 PS9, Line 1876: > This could be extracted to a variable before L1869 so it can be used there Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1886 PS9, Line 1886: } > Variable-length arrays are not part of the C++ standard (though GCC support Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1889 PS9, Line 1889: t.InitializeFields(key.ptr, key.len, iv.ptr, iv.len); > No need to extract this into a variable. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions-ir.cc@1891 PS9, Line 1891: utput leng > No need for the cast as uint8_t and char are the same type (though technica Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions.h File be/src/exprs/string-functions.h: http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions.h@28 PS9, Line 28: using namespace impala_udf; > Unneeded import. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions.h@30 PS9, Line 30: namespace impala { > Unneeded import. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions.h@31 PS9, Line 31: > Unneeded import. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions.h@33 PS9, Line 33: using impala_udf::AnyVal; > Unneeded import. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/exprs/string-functions.h@52 PS9, Line 52: enum TrimPosition { > Unused forward declaration. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util-test.cc File be/src/util/openssl-util-test.cc: http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util-test.cc@68 PS9, Line 68: > We're in the impala namespace, no need to specify it. Applies to the other Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util-test.cc@68 PS9, Line 68: AES_CIPHER_MODE modes[] = { > line too long (145 > 90) Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h File be/src/util/openssl-util.h: http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@25 PS9, Line 25: #include "common/status.h" > Unneeded include. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@72 PS9, Line 72: > Are all of these supported? The comment of class EncryptionKey only lists G Yes, all of these are supported. http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@73 PS9, Line 73: AES_256_CFB, > Why are we removing AES_192_ECB? It's put under future steps with all other modes http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@80 PS9, Line 80: > DEFAULT or AES_DEFAULT would be better. On the other hand I don't think we Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@127 PS9, Line 127: ngth ciphertexts. The IV serves as input > ECB is not a stream cipher mode, this is misleading now. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@138 PS9, Line 138: > As far as I know the AAD is not confidential, and the previous text didn't Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@153 PS9, Line 153: is > Why change "This key" to "The key"? "key" here refers to the EncryptionKey Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@153 PS9, Line 153: case of ECB mode. In other modes, 'out' buff > This is no longer true for ECB because an extra block can be added. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@154 PS9, Line 154: /// buffer. This key must be initialized before calling this function. If 'in' == 'out', > line too long (94 > 90) Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@157 PS9, Line 157: > Isn't this the output length? It should be named accordingly and also menti Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@157 PS9, Line 157: /// data_read refers to the output length. > line too long (111 > 90) Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@157 PS9, Line 157: > Use nullptr instead. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@160 PS9, Line 160: > See L153. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@161 PS9, Line 161: da > See L153 (this vs. the). Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@164 PS9, Line 164: > See L157. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@164 PS9, Line 164: > Use nullptr instead. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@164 PS9, Line 164: /// same length as 'data'buffer. > line too long (111 > 90) Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@166 PS9, Line 166: -place; otherwise, the buffers must not overla > Is it still true after this patch? Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@174 PS9, Line 174: ironme > Start this on a new line. We only skip the newline after the opening "{" if Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@174 PS9, Line 174: speci > We're inside the impala namespace, so no need to specify it. Also applies t Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@178 PS9, Line 178: runtim > Add a newline. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@181 PS9, Line 181: mode_ == AES_CIP > Add function comment. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@197 PS9, Line 197: sted by the user specifically. > See L153. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@201 PS9, Line 201: deToStrin > See L157. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@224 PS9, Line 224: E > Should end with an underscore: 'key_length_'. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@224 PS9, Line 224: /// Returns a EVP_CIPHER according to cipher mode at runtime > This should be right after L214. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@225 PS9, Line 225: const EVP_CIPHER* GetCipher() const; > This should be right after L217. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@225 PS9, Line 225: PH > Should end with an underscore: 'iv_length_'. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc File be/src/util/openssl-util.cc: http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@201 PS9, Line 201: initialized_ = true; > line too long (99 > 90) Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@205 PS9, Line 205: int64_t* data_read) { > line too long (99 > 90) Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@215 PS9, Line 215: ool encrypt, const uint8_t* data, int64_t len, uint8_t* out, int64_t* data_read) { : DCHECK(initialized_); > Instead of this comment, you could have a variable Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@221 PS9, Line 221: otherwise it will be disabled. > Isn't ECB padded to a full block? Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@226 PS9, Line 226: arbitrary lengt > Should use nullptr instead of NULL, also on the next line. In the first initialization, we initialize only evpCipher and in the second initialization, we set key and iv vector. We do this because, we take variable length iv vector, hence, we have to initialize iv length in gcm mode first before initializing iv vector. http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@233 PS9, Line 233: , nu > Now that we've touched the line, it should also be changed to nullptr. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@239 PS9, Line 239: // Set iv_vector for GCM mode. > See L226 about initialising it twice. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@255 PS9, Line 255: int in_len = static_cast<int>(min<int64_t>(len - offset, numeric_limits<int>::max())); > Why did you remove DCHECK_EQ(in_len, out_len)? If it is no longer the case ECB mode has different in_len and out_len. It doesn't matter to use different offsets as we traverse congruently in both data and out buffer. http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@257 PS9, Line 257: t ? > Use nullptr instead. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@274 PS9, Line 274: > Use nullptr instead. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@277 PS9, Line 277: > Use nullptr instead. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@280 PS9, Line 280: &final_out_len) : > Is it sure that we can't handle it without gcm_tag? It's not good if we can Yes, its sure. Whenever we'll support gcm_tag in further patches, we can uncomment this and add sufficient error messages. http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@289 PS9, Line 289: ) { > Use nullptr instead. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@311 PS9, Line 311: PHE > Why are we changing the default mode? We're not changing the default mode here, we're just returning the cipher for our supported modes which could be default mode as well. Just to make it uniform, added a nullptr return statement and added a DCHECK for the same in EncryptInternal. setCipherMode method makes sure that the modes we initialize are the ones we support. http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@316 PS9, Line 316: > As I commented at openssl-util.h:80, I don't think having a 'default' membe Done. http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@336 PS9, Line 336: memcpy(iv_, iv, iv_length_); > Use braces around the IF and ELSE blocks. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@339 PS9, Line 339: > Why do we need to null out the 'iv_' array if 'iv_len' is zero while when 0 We only use iv_length and thus removed the else block. Done. http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@365 PS9, Line 365: return (MaxSupportedTlsVersion() >= TLS1_2_VERSION && EVP_aes_256_ctr); > I don't think it's necessary to insert return statements after each case, w Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@371 PS9, Line 371: > What happened to AES_192_ECB? Did we remove support for it? It's not supported in this patch as mentioned in the future steps in the commit message. http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@374 PS9, Line 374: > Doesn't AES_128_GCM have the same problem as AES_256_GCM, described on L345 Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@383 PS9, Line 383: MODE:: > No need for "impala::". Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@384 PS9, Line 384: } > If the whole IF statement doesn't fit on one line, use braces around the IF Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@392 PS9, Line 392: ; > Shouldn't we include that it is the 256 bit version of GCM? We also have AE Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@403 PS9, Line 403: r > Const doesn't make much sense when returning a value type (as opposed to po Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@404 PS9, Line 404: } > When we have ELSE [IF] branches, we always use braces around the statement Done -- To view, visit http://gerrit.cloudera.org:8080/20447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3902f2b1d95da4d06995cbd687e79c48e16190c9 Gerrit-Change-Number: 20447 Gerrit-PatchSet: 10 Gerrit-Owner: Anonymous Coward <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Comment-Date: Wed, 10 Jul 2024 19:46:31 +0000 Gerrit-HasComments: Yes
