lincoln-lil commented on code in PR #25176: URL: https://github.com/apache/flink/pull/25176#discussion_r1717019309


##########
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/functions/MathFunctionsITCase.java:
##########
@@ -141,4 +178,57 @@ Stream<TestSetSpec> getTestSetSpecs() {
                                 new BigDecimal("123.45"),
                                 DataTypes.DECIMAL(6, 2).notNull()));
     }
+
+    private Stream<TestSetSpec> unhexTestCases() {
+        return Stream.of(
+                TestSetSpec.forFunction(BuiltInFunctionDefinitions.UNHEX)
+                        .onFieldsWithData((String) null)
+                        .andDataTypes(DataTypes.STRING())
+                        // null input
+                        .testResult($("f0").unhex(), "UNHEX(f0)", null, DataTypes.BYTES())
+                        // empty string
+                        .testResult(lit("").unhex(), "UNHEX('')", new byte[0], DataTypes.BYTES())
+                        // invalid hex string
+                        .testResult(
+                                lit("1").unhex(), "UNHEX('1')", new byte[] {0}, DataTypes.BYTES())
+                        .testResult(
+                                lit("146").unhex(),
+                                "UNHEX('146')",
+                                new byte[] {0, 0x46},
+                                DataTypes.BYTES())
+                        .testResult(lit("z").unhex(), "UNHEX('z')", null, DataTypes.BYTES())
+                        .testResult(lit("1-").unhex(), "UNHEX('1-')", null, DataTypes.BYTES())
+                        // normal cases
+                        .testResult(
+                                lit("466C696E6B").unhex(),
+                                "UNHEX('466C696E6B')",
+                                new byte[] {0x46, 0x6c, 0x69, 0x6E, 0x6B},
+                                DataTypes.BYTES())
+                        .testResult(
+                                lit("4D7953514C").unhex(),
+                                "UNHEX('4D7953514C')",
+                                new byte[] {0x4D, 0x79, 0x53, 0x51, 0x4C},
+                                DataTypes.BYTES())
+                        .testResult(
+                                lit("\uD83D\uDE00").unhex(),
+                                "UNHEX('\uD83D\uDE00')",
+                                null,
+                                DataTypes.BYTES())
+                        .testResult(
+                                lit("\uD83D\uDE00").hex().unhex(),
+                                "UNHEX(HEX('\uD83D\uDE00'))",
+                                new byte[] {-16, -97, -104, -128},

Review Comment:
Cool!
Verified that this yields the expected 😀.


##########
flink-table/flink-table-common/src/main/java/org/apache/flink/table/utils/EncodingUtils.java:
##########
@@ -187,6 +187,33 @@ public static String hex(byte[] bytes) {
         return new String(hexChars);
     }
+    /**
+     * Inspired from {@link EncodingUtils#decodeHex(String)}, but returns null instead of throwing

Review Comment:
It would be good to spell out the specific processing differences, including how the leading byte is handled for odd-length inputs.

Another thing is the method name. I'm thinking about using a name consistent with the existing one, e.g., `decodeHexLax` or `lenientHexDecode`. WDYT?


##########
flink-table/flink-table-common/src/main/java/org/apache/flink/table/utils/EncodingUtils.java:
##########
@@ -187,6 +187,33 @@ public static String hex(byte[] bytes) {
         return new String(hexChars);
     }
+    /**
+     * Inspired from {@link EncodingUtils#decodeHex(String)}, but returns null instead of throwing

Review Comment:
Perhaps we can phrase it like this:
```
     * Converts an array of characters representing hexadecimal values into an array of bytes of
     * those same values. The returned array will be half the length of the passed array, as it
     * takes two characters to represent any given byte. If the input array has an odd length,
     * the first byte is handled separately and set to 0.
     *
     * <p>Unlike {@link #decodeHex(String)}, this method does not throw an exception for odd-length
     * inputs or invalid characters. Instead, it returns null if invalid characters are encountered.
     *
     * @param bytes An array of characters containing hexadecimal digits.
     * @return A byte array containing the binary data decoded from the supplied char array,
     *     or null if the input contains invalid hexadecimal characters.
```
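
For illustration only, the lenient decoding behavior described in the suggested Javadoc might be sketched as follows. This is a hypothetical sketch, not the implementation in the PR: the class name `LenientHexSketch` and the methods `decodeHexLenient` and `hexValue` are made up here, and the input is taken as a `String` rather than a byte or char array. The choice to validate the leading character of an odd-length input but discard its value is an assumption inferred from the test expectations in the hunk above (`UNHEX('1')` gives `{0}`, `UNHEX('146')` gives `{0, 0x46}`, `UNHEX('z')` gives null).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; names are assumptions, not Flink API.
final class LenientHexSketch {

    /** Returns the value of an ASCII hex digit, or -1 for any other character. */
    static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    /**
     * Decodes a hex string without throwing: returns null for null input or
     * when any character is not a hex digit. For odd-length input the first
     * output byte is 0 and the leading character is only validated
     * (assumption taken from the test cases quoted in this review).
     */
    static byte[] decodeHexLenient(String hex) {
        if (hex == null) {
            return null;
        }
        int len = hex.length();
        int odd = len & 1; // 1 if the input has an odd number of characters
        byte[] out = new byte[(len + 1) / 2];

        // Odd length: out[0] stays 0, but the leading character must still be valid.
        if (odd == 1 && hexValue(hex.charAt(0)) < 0) {
            return null;
        }
        // Decode the remaining characters pairwise.
        for (int i = odd, j = odd; i < len; i += 2, j++) {
            int hi = hexValue(hex.charAt(i));
            int lo = hexValue(hex.charAt(i + 1));
            if (hi < 0 || lo < 0) {
                return null; // invalid character: signal with null instead of throwing
            }
            out[j] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(decodeHexLenient("146"))); // [0, 70]
        System.out.println(
                new String(decodeHexLenient("466C696E6B"), StandardCharsets.UTF_8)); // Flink
        System.out.println(decodeHexLenient("1-") == null); // true
    }
}
```

Using a hand-written `hexValue` instead of `Character.digit(c, 16)` keeps the check to ASCII digits only, since `Character.digit` also accepts non-ASCII Unicode digits. Whether that stricter behavior is what the PR wants is a separate question.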