Hi all,

I am resending this message in case you missed it. (Bcc'ing
contributors with recent activity on SparkRunner and SparkReceiverIO.)

In particular, if you are a user or contributor of *SparkRunner* or
*SparkReceiverIO*, please take a look and feel free to share your ideas or
concerns.

If no objections are received, we will proceed with the upgrade *by the end
of this week*.

Thanks!

Shunping

On Wed, Feb 5, 2025 at 10:34 AM Shunping Huang <mark.sphu...@gmail.com>
wrote:

> Hi all,
>
> (Sorry Kenn, I didn't mean to interrupt the flow of your previous
> conversation. I was just finishing this email when yours came through...)
>
> I need some opinions from you all regarding Spark and the SLF4J 2.x upgrade.
>
> Here is some background.
>
>    - Spark used to have a compile dependency on SLF4J 1.x (e.g.
>    https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.12/3.2.0),
>    but since 3.4.0 (
>    https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.12/3.4.0),
>    it has switched to SLF4J 2.x.
>    - The internal logging module of Spark < 3.4.0 references a class,
>    `StaticLoggerBinder` (
>    https://github.com/apache/spark/blob/v3.2.1/core/src/main/scala/org/apache/spark/internal/Logging.scala#L222),
>    which exists only in SLF4J 1.x binding artifacts (e.g.
>    org.slf4j:slf4j-simple, org.slf4j:slf4j-reload4j, etc.). When we upgrade
>    SLF4J and its related artifacts to 2.x, that class no longer exists,
>    which causes an error like "java.lang.NoClassDefFoundError:
>    org/slf4j/impl/StaticLoggerBinder" (see the sketch after this list).
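>
> To make the failure mode concrete, here is a minimal Java sketch of
> roughly what Spark's Logging.scala does at startup. The class and methods
> (org.slf4j.impl.StaticLoggerBinder, getSingleton, getLoggerFactoryClassStr)
> are the real SLF4J 1.x binding API; the surrounding class is hypothetical
> and only for illustration.
>
>     // Compiles against an SLF4J 1.x binding jar, which provides
>     // org.slf4j.impl.StaticLoggerBinder. SLF4J 2.x bindings dropped this
>     // class in favor of ServiceLoader-based providers, so running this
>     // with only 2.x artifacts on the classpath throws
>     // java.lang.NoClassDefFoundError: org/slf4j/impl/StaticLoggerBinder.
>     import org.slf4j.impl.StaticLoggerBinder;
>
>     public class BinderProbe {
>       public static void main(String[] args) {
>         String factory =
>             StaticLoggerBinder.getSingleton().getLoggerFactoryClassStr();
>         System.out.println("SLF4J 1.x binding in use: " + factory);
>       }
>     }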
>
>
> During the upgrade, I have seen test failures in two sub-projects in Beam.
>
>    1. *SparkReceiverIO* (
>    https://github.com/apache/beam/tree/master/sdks/java/io/sparkreceiver/2
>    ).
>       - This IO was built on top of Spark 2.x, so some of its tests fail
>       when SLF4J is upgraded to 2.x.
>       - I am wondering *if there is any objection* to upgrading it to use
>       Spark 3.x. (In fact, I tested out this idea, and the tests related
>       to SparkReceiverIO run fine on Spark 3.x.)
>    2. *SparkRunner* (
>    https://github.com/apache/beam/tree/master/runners/spark/3)
>       - As I mentioned above, some versions of Spark 3.x do not work
>       well with SLF4J 2.x. Specifically, the version check tests (e.g.
>       runners:spark:3:sparkVersionsTest) failed on 3.2.x.
>       - In theory, any Spark < 3.4.0 might be impacted, but due to
>       certain transitive dependencies (and also some luck?), only the
>       3.2.x tests failed. See my comment at
>       https://github.com/apache/beam/pull/33574/files#diff-78a108ab469ee9be0d8fae0f18c0c143e04fc24d44f9f78f65b97434fc234890
>       for more details.
>       - Do we want to continue supporting Spark < 3.4.0, or do we want
>       to drop some of those versions because of this SLF4J upgrade?
>
>
> Last thing: there is a workaround (more like a hack) to support Spark <
> 3.4.0 under SLF4J 2.x, namely putting an SLF4J 1.x binding that is not
> under the group `org.slf4j` in the dependencies (
> https://github.com/apache/beam/pull/33574). An example is
> `org.apache.logging.log4j:log4j-slf4j-impl`. In my opinion, the mixed use
> of SLF4J 1.x and 2.x could be a problem, but this seems to be a viable way
> if we want to continue supporting older versions of Spark (see the sketch
> below). If you have any other ideas, please feel free to share them here.
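>
> For anyone who wants to sanity-check which binding ends up active under
> this workaround, below is a minimal, hypothetical Java sketch (the class
> name and setup are mine, not from the PR). It assumes
> `org.apache.logging.log4j:log4j-slf4j-impl` is on the classpath alongside
> the SLF4J API:
>
>     import org.slf4j.LoggerFactory;
>
>     public class BindingCheck {
>       public static void main(String[] args) {
>         // With log4j-slf4j-impl bound, this prints
>         // org.apache.logging.slf4j.Log4jLoggerFactory. Because the
>         // artifact's group is org.apache.logging.log4j rather than
>         // org.slf4j, a blanket SLF4J upgrade to 2.x does not force it to
>         // a 2.x version, so Spark < 3.4.0 still finds its 1.x-style
>         // StaticLoggerBinder.
>         System.out.println(
>             LoggerFactory.getILoggerFactory().getClass().getName());
>       }
>     }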
>
> Thanks,
>
> Shunping
>
>
>
>
>
> On Tue, Feb 4, 2025 at 3:13 PM Shunping Huang <mark.sphu...@gmail.com>
> wrote:
>
>> Hi everyone,
>>
>> I put together a short doc to summarize the existing logging
>> infrastructure (dependencies) in Beam Java and outline a plan to improve
>> it. Basically, we are on the path towards SLF4J 2.x.
>>
>>
>> https://docs.google.com/document/d/1IkbiM4m8D-aB3NYI1aErFZHt6M7BQ-8eCULh284Davs/edit?usp=sharing
>>
>> If you are interested in this topic, please take a look and share any
>> feedback.
>>
>> Regards,
>>
>> Shunping
>>
>
