Dries Samyn created FLINK-39190:
-----------------------------------

             Summary: flink-s3-fs-hadoop bundles unshaded Jackson 2.17.2 (via 
AWS SDK v1)
                 Key: FLINK-39190
                 URL: https://issues.apache.org/jira/browse/FLINK-39190
             Project: Flink
          Issue Type: Bug
          Components: FileSystems
    Affects Versions: 2.1.1
            Reporter: Dries Samyn


`flink-s3-fs-hadoop` pulls in `aws-java-sdk-core:1.12.779` transitively through 
`flink-s3-fs-base`. The AWS SDK declares `jackson-databind:2.17.2` and 
`jackson-dataformat-cbor:2.17.2` as compile-scoped dependencies.

The maven-shade-plugin configuration in flink-s3-fs-hadoop only relocates two 
Flink-internal packages:
 
* `org.apache.flink.runtime.fs.hdfs` → `org.apache.flink.fs.s3hadoop.common`
* `org.apache.flink.runtime.util` → `org.apache.flink.fs.s3hadoop.common`

`com.fasterxml.jackson.*` is not relocated.

As a result, `flink-s3-fs-hadoop-2.1.1.jar` contains 
`com.fasterxml.jackson.databind.ObjectReader` (and related classes) at version 
2.17.2 under their original package names.
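
A quick way to confirm this is to list the JAR's entries under the original Jackson package path. The following is a minimal sketch, not part of Flink or its build; the helper name `unrelocated_classes` and the default prefix are assumptions chosen for illustration.

```python
import zipfile

# Entry-path prefix that would no longer appear if Jackson were relocated.
JACKSON_PREFIX = "com/fasterxml/jackson/"


def unrelocated_classes(jar, prefix=JACKSON_PREFIX):
    """Return the sorted .class entries in `jar` that live under `prefix`.

    `jar` may be a filesystem path or a binary file-like object.
    (Hypothetical helper, for illustration only.)
    """
    with zipfile.ZipFile(jar) as zf:
        return sorted(
            name
            for name in zf.namelist()
            if name.startswith(prefix) and name.endswith(".class")
        )
```

Calling `unrelocated_classes("flink-s3-fs-hadoop-2.1.1.jar")` on a local copy of the artifact should return a non-empty list if the classes are bundled as described, and an empty list once relocation is in place.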

When this JAR is on the Flink job's runtime classpath (e.g. placed in usrlib/ 
via Jib's `packaged` containerizing mode, or added as `runtimeOnly` in a 
Gradle build), its older Jackson classes can shadow the application's own 
Jackson classes.
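
To see which classes are affected, the JARs on the classpath can be compared for duplicate entries. The sketch below assumes a simplified model: a flat classpath searched in order, where the first JAR that contains a class wins. Flink's real class loading (parent-first vs. child-first resolution, plugin class loaders) is more involved, so treat the "winner" as an approximation. The function name `duplicated_classes` and the JAR labels are hypothetical.

```python
import zipfile


def duplicated_classes(jars, prefix="com/fasterxml/jackson/"):
    """Find .class entries under `prefix` that more than one JAR provides.

    `jars` is an ordered list of (label, path_or_file) pairs. The result maps
    each duplicated entry to the labels of the JARs containing it, in
    classpath order. Under the flat, first-match assumption, the first label
    is the JAR whose copy would be loaded.
    """
    providers = {}
    for label, jar in jars:
        with zipfile.ZipFile(jar) as zf:
            for entry in zf.namelist():
                if entry.startswith(prefix) and entry.endswith(".class"):
                    labels = providers.setdefault(entry, [])
                    if label not in labels:  # ignore repeated ZIP entries
                        labels.append(label)
    # Keep only the classes that appear in two or more JARs.
    return {entry: labels for entry, labels in providers.items() if len(labels) > 1}
```

Passing the S3 filesystem JAR ahead of the application JAR would list every Jackson class that both provide, which are the candidates for the kind of mismatch shown in the stack trace.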

For example, a job that uses `jackson-dataformat-csv` 2.18 or later then fails 
as soon as a `CsvParser` is closed:

```
java.lang.NoSuchFieldError: CLEAR_CURRENT_TOKEN_ON_CLOSE
      at com.fasterxml.jackson.dataformat.csv.CsvParser.close(CsvParser.java:548)
      at com.fasterxml.jackson.databind.ObjectReader._bindAndClose(ObjectReader.java:2132) ~[flink-s3-fs-hadoop-2.1.1.jar:2.1.1]
```

`StreamReadFeature.CLEAR_CURRENT_TOKEN_ON_CLOSE` was added in Jackson 2.18.0.

Note that `flink-shaded-jackson` correctly relocates Jackson to 
`org.apache.flink.shaded.jackson2.*`, but `flink-s3-fs-hadoop` does not apply 
the same treatment, creating an inconsistency across Flink's own artifacts.

Steps to reproduce:

 # Create a Flink 2.1.1 job that uses `jackson-dataformat-csv` 2.18 or later.
 # Package flink-s3-fs-hadoop-2.1.1.jar on the runtime classpath (e.g. via 
`runtimeOnly` in Gradle / Jib packaged mode).
 # Deploy the job to a Flink cluster.
 # Run the job. Closing a `CsvParser` throws 
`NoSuchFieldError: CLEAR_CURRENT_TOKEN_ON_CLOSE`.

Expected behaviour: flink-s3-fs-hadoop should shade/relocate its bundled 
Jackson to avoid classpath conflicts, consistent with how flink-shaded-jackson 
handles the core runtime's Jackson dependency.

  ---
  Proposed patch: [https://github.com/apache/flink/pull/27700]

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
