[ 
https://issues.apache.org/jira/browse/HDDS-10744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

István Fajth updated HDDS-10744:
--------------------------------
    Description: 
The LiveFileMetaData class in RocksDB has three methods that return a byte[], 
which we convert to a String after every call.
These methods are:
- columnFamilyName()
- smallestKey()
- largestKey()

We use three different conversions to String for the returned byte arrays.
For largestKey and smallestKey we use FixedLengthStringCodec.bytes2String and 
new String(byte[], UTF_8).
For columnFamilyName we use org.apache.hadoop.hdds.StringUtils.bytes2String, 
new String(byte[], UTF_8), and org.bouncycastle.util.Strings.bytes2String.

Of these methods, FixedLengthStringCodec throws an exception if the 
conversion cannot be done, and it uses ISO_8859_1 as the charset. The rest 
use the UTF_8 charset and, instead of throwing, replace any input that cannot 
be decoded as UTF-8.
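
To make the difference concrete, here is a minimal sketch that uses only JDK 
classes. It does not reproduce FixedLengthStringCodec or any Ozone code; the 
class name CharsetDemo and its helper methods are hypothetical and exist only 
for illustration. It contrasts lenient UTF-8 decoding, strict UTF-8 decoding, 
and ISO_8859_1 decoding on a key that contains a byte that is not valid UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
  // Lenient: the String constructor replaces malformed input with U+FFFD.
  static String lenientUtf8(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  // Strict: a decoder configured with REPORT throws on malformed input.
  static String strictUtf8(byte[] bytes) throws CharacterCodingException {
    return StandardCharsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT)
        .decode(ByteBuffer.wrap(bytes))
        .toString();
  }

  // ISO_8859_1 maps every byte value to exactly one character.
  static String latin1(byte[] bytes) {
    return new String(bytes, StandardCharsets.ISO_8859_1);
  }

  public static void main(String[] args) {
    // 0xFF can never appear in well-formed UTF-8.
    byte[] key = {'k', 'e', 'y', (byte) 0xFF};

    System.out.println(lenientUtf8(key).length());
    System.out.println(latin1(key).length());
    try {
      strictUtf8(key);
      System.out.println("decoded");
    } catch (CharacterCodingException e) {
      System.out.println("strict decoding failed: " + e);
    }
  }
}
```

The lenient variant never fails, which matters for smallestKey and largestKey 
if they can hold arbitrary binary data: the resulting String is then only 
suitable for display, not for converting back to the original bytes.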

Based on how and where we use these values, it seems safe to settle on UTF-8 
as the target charset and to use StringUtils.bytes2String from our own 
utilities, which currently uses the String constructor.
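
A minimal sketch of what routing all three getters through one conversion 
point could look like. This is not the actual patch: the FileMeta interface is 
a hypothetical stand-in for LiveFileMetaData so that the example compiles 
without RocksDB on the classpath, and the local bytes2String only assumes the 
behaviour described above (String constructor with UTF_8).

```java
import java.nio.charset.StandardCharsets;

public class LiveFileNames {
  /** Hypothetical stand-in for the three byte[] getters listed above. */
  interface FileMeta {
    byte[] columnFamilyName();
    byte[] smallestKey();
    byte[] largestKey();
  }

  /** Single conversion point; assumed to mirror StringUtils.bytes2String. */
  static String bytes2String(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  /** Every getter goes through the same helper, so the charset is uniform. */
  static String describe(FileMeta meta) {
    return bytes2String(meta.columnFamilyName())
        + " [" + bytes2String(meta.smallestKey())
        + ", " + bytes2String(meta.largestKey()) + "]";
  }

  public static void main(String[] args) {
    FileMeta meta = new FileMeta() {
      @Override public byte[] columnFamilyName() {
        return "keyTable".getBytes(StandardCharsets.UTF_8);
      }
      @Override public byte[] smallestKey() {
        return "/vol/bucket/a".getBytes(StandardCharsets.UTF_8);
      }
      @Override public byte[] largestKey() {
        return "/vol/bucket/z".getBytes(StandardCharsets.UTF_8);
      }
    };
    System.out.println(describe(meta));
  }
}
```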

Removing the org.bouncycastle.util.Strings usage is also beneficial for 
development related to crypto compliance.

  was: Remove org.bouncycastle.util.Strings usage from RocksDBStoreMetrics; 
there are more standard ways to convert a byte array to a string.


> Standardize byte array conversion to String for LiveFileMetaData in RocksDB
> ---------------------------------------------------------------------------
>
>                 Key: HDDS-10744
>                 URL: https://issues.apache.org/jira/browse/HDDS-10744
>             Project: Apache Ozone
>          Issue Type: Improvement
>          Components: db, Security
>    Affects Versions: 1.4.0
>            Reporter: István Fajth
>            Assignee: István Fajth
>            Priority: Major
>              Labels: pull-request-available


