As an initial step, could we introduce some sort of log warning,
metric or other indicator for operators to determine if they're
running with a non-UTF-8 encoding?
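
A minimal sketch of what such a startup check might look like. The class and
method names here are hypothetical, and it uses java.util.logging only to stay
self-contained; the real thing would presumably go through the project's own
logging and metrics.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.logging.Logger;

// Hypothetical helper, not an existing Cassandra class.
final class DefaultCharsetCheck {
    private static final Logger LOGGER =
            Logger.getLogger(DefaultCharsetCheck.class.getName());

    private DefaultCharsetCheck() {}

    // Returns a warning message when the charset is not UTF-8, otherwise empty.
    static Optional<String> warningFor(Charset charset) {
        if (StandardCharsets.UTF_8.equals(charset)) {
            return Optional.empty();
        }
        return Optional.of("JVM default charset is " + charset.name()
                + ", not UTF-8; code that relies on the default encoding"
                + " may read or write files differently");
    }

    // Meant to be called once at startup; checks the JVM default charset,
    // which reflects -Dfile.encoding or the environment's locale settings.
    static void warnIfNotUtf8() {
        warningFor(Charset.defaultCharset())
                .ifPresent(msg -> LOGGER.warning(msg));
    }

    public static void main(String[] args) {
        warnIfNotUtf8();
    }
}
```

Keeping the decision in warningFor(Charset) separate from the logging call
makes it easy to test without changing the JVM's default charset, and the same
result could feed a metric instead of (or in addition to) a log line.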

On Mon, Nov 28, 2022 at 1:21 PM David Capwell <dcapw...@apple.com> wrote:
>
> > It probably has to be done on a case-by-case basis
>
>
> Yeah, this is what I feel as well…
>
> > Does the linter provide more detail than just the list?
>
>
> Not really, it shows how to fix but can’t really say if the fix will cause 
> issues… If you are not running with UTF-8 we do the right thing most of the 
> time, but some files “may” break… this would also be true if you 
> backup/restore these files on a different environment...
>
>
> On Nov 10, 2022, at 12:44 PM, Derek Chen-Becker <de...@chen-becker.org> wrote:
>
> This seems fraught with peril. I think that it should be fixed, but I
> also wonder what the testing requirements would be to validate no
> regression. It probably has to be done on a case-by-case basis. Is it
> as simple as auditing places where we're calling getBytes or
> PrintReader/PrintWriter without an explicit encoding? Some of them,
> like 
> https://github.com/apache/cassandra/blob/30ad754d7e95501ffa916bf986e4cfda1aa5e441/src/java/org/apache/cassandra/tools/HashPassword.java#L128,
> look like that would be easy to address, but others seem like they
> could be complicated.
>
> Does the linter provide more detail than just the list?
>
> Cheers,
>
> Derek
>
> On Fri, Nov 4, 2022 at 2:09 PM David Capwell <dcapw...@apple.com> wrote:
>
>
> While testing a linter to see whether it can solve a case for Simulator, I
> see we have 25 cases where we don’t specify the encoding and rely on the
> default, which is based on the system…
>
> If we attempt to fix these cases, I am wondering whether that would be a
> regression… it “might” be the case that someone set -Dfile.encoding=ascii or
> changed the LANG env variable to something non-UTF based…
>
> Here is the list reported
>
> org.apache.cassandra.cql3.functions.JavaBasedUDFunction since first historized release
> org.apache.cassandra.db.ColumnFamilyStore since first historized release
> org.apache.cassandra.db.compaction.CompactionLogger$CompactionLogSerializer since first historized release
> org.apache.cassandra.db.filter.RowFilter$CustomExpression since first historized release
> org.apache.cassandra.db.lifecycle.LogTransaction since first historized release
> org.apache.cassandra.gms.FailureDetector since first historized release
> org.apache.cassandra.index.sasi.analyzer.StandardTokenizerImpl since first historized release
> org.apache.cassandra.io.sstable.SSTable since first historized release
> org.apache.cassandra.io.util.FileReader since first historized release
> org.apache.cassandra.io.util.FileReader since first historized release
> org.apache.cassandra.io.util.FileWriter since first historized release
> org.apache.cassandra.io.util.FileWriter since first historized release
> org.apache.cassandra.metrics.SamplingManager since first historized release
> org.apache.cassandra.metrics.SamplingManager since first historized release
> org.apache.cassandra.schema.IndexMetadata since first historized release
> org.apache.cassandra.security.PEMBasedSslContextFactory since first historized release
> org.apache.cassandra.tools.HashPassword since first historized release
> org.apache.cassandra.tools.JMXTool$Dump$Format$3 since first historized release
> org.apache.cassandra.tools.NodeTool$NodeToolCmd since first historized release
> org.apache.cassandra.tools.SSTableMetadataViewer since first historized release
> org.apache.cassandra.transport.Client since first historized release
> org.apache.cassandra.utils.ByteArrayUtil since first historized release
> org.apache.cassandra.utils.FBUtilities since first historized release
> org.apache.cassandra.utils.GuidGenerator since first historized release
> org.apache.cassandra.utils.HeapUtils since first historized release
>
>
>
> --
> +---------------------------------------------------------------+
> | Derek Chen-Becker                                             |
> | GPG Key available at https://keybase.io/dchenbecker and       |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---------------------------------------------------------------+
>
>


-- 
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+
