Pranav Lodha has posted comments on this change. ( http://gerrit.cloudera.org:8080/20447 )
Change subject: IMPALA-13039: AES Encryption/ Decryption Support in Impala ...................................................................... Patch Set 15: (64 comments) > Uploaded patch set 15. http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@159 PS13, Line 159: if (len.val > StringVal::MAX_LENGTH) { > nit for here and below: if you want to update indentation in non-related co It'd get fixed in the final patch when I'll rebase the branch. http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1346 PS13, Line 1346: int3 > nit: the indent seems to be off It'd get fixed in the final patch when I'll rebase the branch. http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1810 PS13, Line 1810: _thresho > "other modes" or "CTR and CFB mode." Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1845 PS13, Line 1845: if (s2len == 0) return IntVal(s1len); > Any need to signal an error condition via 'ctx' in this case? Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1857 PS13, Line 1857: DCHECK(!ctx->impl()->state()->GetQueryStatus().ok()); > I think it could be useful to log this fact, and also the encryption mode u Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1858 PS13, Line 1858: return IntVal::null(); > Add { and } around the block. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1865 PS13, Line 1865: } > style nit: missing space before parenthesis Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1866 PS13, Line 1866: inters in th > "Invalid AES mode" would be more informative. Also, including 'mode' would Done, we fallback to other modes in case mode is null, added logs for it as well. http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1872 PS13, Line 1872: [0] + s2len + 1, 0); > nit: "ECB mode is not supported for encryption." Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1877 PS13, Line 1877: > You could include the 'cipher_mode' in the error message. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1881 PS13, Line 1881: stitution > This part can be deleted. It is unnecessarily confusing, since ECB is not s Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1893 PS13, Line 1893: reinte > Nit: "space". Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1903 PS13, Line 1903: > nit: 'status'. Descriptive variable names help readers understand the code. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1906 PS13, Line 1906: > nit: "encryption" Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1912 PS13, Line 1912: > No need for a cast. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1924 PS13, Line 1924: > Mention the other supported modes as well in the comment. See my other comm Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1930 PS13, Line 1930: StringVal StringFunctions::PrettyPrintMemory(FunctionContext* context, > The key being NULL should be an error. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1944 PS13, Line 1944: AES-GCM provides both confidentiality and integrity, making it suitable for secure : // communication and storage. Due to its security features and efficiency, : // > See my other comment on L1857. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1947 PS13, Line 1947: // > No newline necessary, "else" should come after the closing "}". Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1960 PS13, Line 1960: > You could include the 'cipher_mode' in the error message. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1978 PS13, Line 1978: > No need for the cast. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1982 PS13, Line 1982: ES only suppor > When do we need to add this additional block size? Is it needed for all mod Done, only needed for ECB. http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1982 PS13, Line 1982: ctx->SetError("AES only supports 128 and 256 bits key length"); > See comment about var-len arrays on the stack at PS9 L1886. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1986 PS13, Line 1986: te > nit: 'status' Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/exprs/string-functions-ir.cc@1988 PS13, Line 1988: r_mode; > nit: "decryption" Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h File be/src/util/openssl-util.h: http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@79 PS13, Line 79: AES_128 > It's a bit funny: the comment says it's "enum of all the AES modes that are Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@152 PS13, Line 152: > must contain? Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@153 PS13, Line 153: ata' > Nit: the 'out' buffer. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@153 PS13, Line 153: ffer ' > Nit: the 'data' buffer. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@153 PS13, Line 153: h > "Should be" or "needs to be". Also, it can actually be bigger, so it needs Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@157 PS13, Line 157: If 'in' == 'out', the opera > This would be better: "'data_read' (if not NULL) will be set to the output Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@157 PS13, Line 157: function. > Add single quotes: 'data_read'. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@162 PS13, Line 162: > See comments about the 'out' buffer above the Encrypt() function. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@163 PS13, Line 163: ut dat > nit: in case Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@165 PS13, Line 165: t' b > What is 'in'? Is that actually 'data'? Yes, its the expr data that we input. http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@168 PS13, Line 168: operation > See L157. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@181 PS13, Line 181: > Yes, continuation lines should be indented +4 relative to the line they con Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@187 PS13, Line 187: > ditto Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@190 PS13, Line 190: > Should actually be 'key_' and 'iv_', with underscores. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@193 PS13, Line 193: > Follow the naming convention: GetGcmTag(). Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@194 PS13, Line 194: GetGcmTag(u > Follow the naming convention: SetGcmTag(). Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@211 PS13, Line 211: t buffer > See comment about output buffer length above the Encrypt() function. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.h@216 PS13, Line 216: otherwise the buffers must not overlap > See L157. Unit is bytes. http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h File be/src/util/openssl-util.h: http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@138 PS9, Line 138: GCM protects. > Not done. Also, why do you reword this comment? There is no need for it. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.h@224 PS9, Line 224: /// uninitialized keys. > Not done, 'key_length_' should come right after 'key_'. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc File be/src/util/openssl-util.cc: http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@77 PS9, Line 77: #if OPENSSL_VERSION_NUMBER < 0x10100000L > No need to do that, changing this is unnecessary. I think its okay to remove we/ us. It looked proper to me. Irrespective, I've changed it. http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@305 PS9, Line 305: > We're in the impala namespace, no need to add "impala::" here. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@344 PS9, Line 344: out, g > No need to write "impala::". Applies to the other cases too. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@345 PS9, Line 345: } > Please revert it back to the original wording, no need to reword it. Done http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@361 PS9, Line 361: > No need to rewrite this paragraph. Had removed 'we', but irrespective have removed it. http://gerrit.cloudera.org:8080/#/c/20447/9/be/src/util/openssl-util.cc@392 PS9, Line 392: ing En > No need for "impala::". Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc File be/src/util/openssl-util.cc: http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@223 PS13, Line 223: int padding_flag = padding_enabled ? 1 : 0; > No need for this variable, bools are implicitly converted to 1 or 0. Please refer to Michael's comment on L233 on PS10. http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@227 PS13, Line 227: Moreover, numerous modes (e.g., CTR) are well-optimized, leveraging : // hardware ac > This is also not true because ECB necessitates a multiple of 16 bytes. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@228 PS13, Line 228: c > Nit: add full stop ("."). Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@271 PS13, Line 271: } > If the whole IF statement can fit on one line, we don't put it into {}, but Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@297 PS13, Line 297: != 1) { > See L271. Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@318 PS13, Line 318: > This line should come after AES_256_CFB, so that 256 bit and 128 bit modes Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@339 PS13, Line 339: void EncryptionKey::SetGcmTag(uint8_t* expr) { > nit: do you want to add DCHECK() for iv? It's checked in the respective udfs. http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@344 PS13, Line 344: ag_, AES_B > This is a very weird notation for a function of such signature: why not to Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@348 PS13, Line 348: (m) { > This is a very weird notation for a function of such signature if looking a Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@382 PS13, Line 382: AES_CIPHER_MODE EncryptionKey::GetSupportedDefaultMode() { : if (IsModeSupport > What is this for? Why not to handle it with 'default' case below? Default is basically for fallback modes while INVALID refers to a wrong mode entered by the user. http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@407 PS13, Line 407: return AES_CIPHER_MODE::AES_256_GCM; > nit: this can be dropped since any invalid mode is already handled at line Done http://gerrit.cloudera.org:8080/#/c/20447/13/be/src/util/openssl-util.cc@415 PS13, Line 415: return AES_CIPHER_MODE::AES_128_GCM; : } else > Yes, the "else if" should come on the same line as the closing "}". Done http://gerrit.cloudera.org:8080/#/c/20447/13/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/20447/13/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3315 PS13, Line 3315: vert double to string > Do we know why we have additional data compared to previous patches (like P Because of gcm_tag. -- To view, visit http://gerrit.cloudera.org:8080/20447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3902f2b1d95da4d06995cbd687e79c48e16190c9 Gerrit-Change-Number: 20447 Gerrit-PatchSet: 15 Gerrit-Owner: Pranav Lodha <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Pranav Lodha <[email protected]> Gerrit-Comment-Date: Sat, 10 Aug 2024 11:52:36 +0000 Gerrit-HasComments: Yes
