[Spark SQL]: Crash when attempting to select PostgreSQL bpchar without length specifier in Spark 3.5.0

2024-01-29 Thread Lily Hahn
Hi, I’m currently migrating an ETL project to Spark 3.5.0 from 3.2.1 and ran into an issue with some of our queries that read from PostgreSQL databases. Any attempt to run a Spark SQL query that selects a bpchar without a length specifier from the source DB seems to crash: py4j.protocol.Py4JJav
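
One possible workaround, sketched here as an assumption and not as a confirmed fix, is to cast the bpchar columns to text on the PostgreSQL side so that Spark never has to map a bpchar without a length. The helper below is hypothetical (its name, parameters, and the example table are not from the thread); it only builds a subquery string that could be passed to the JDBC reader's `dbtable` option. Whether this avoids the crash in 3.5.0 has not been verified here.

```python
def bpchar_safe_subquery(table, columns, bpchar_columns, alias="src"):
    """Build a subquery string that casts the given bpchar columns to text.

    Hypothetical helper: `table`, `columns`, and `bpchar_columns` are supplied
    by the caller; nothing is looked up in the database.
    """
    bpchar = set(bpchar_columns)
    select_list = ", ".join(
        # Cast only the columns flagged as bpchar; pass the rest through.
        f'CAST("{c}" AS text) AS "{c}"' if c in bpchar else f'"{c}"'
        for c in columns
    )
    # The JDBC `dbtable` option accepts a parenthesized subquery with an alias.
    return f"(SELECT {select_list} FROM {table}) AS {alias}"


# Example with a made-up table and column names:
subquery = bpchar_safe_subquery("public.accounts", ["id", "code"], ["code"])
# It could then be used roughly as:
#   spark.read.format("jdbc").option("url", jdbc_url).option("dbtable", subquery).load()
```

Doing the cast in the pushed-down query keeps the change local to the read and leaves the source schema untouched.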

Re: startTimestamp doesn't work when using rate-micro-batch format

2024-01-29 Thread Mich Talebzadeh
As I stated earlier, there are alternatives you might explore, such as socket sources, for testing purposes. from pyspark.sql import SparkSession from pyspark.sql.functions import expr, when from pyspark.sql.types import StructType, StructField, LongType spark = SparkSession.builder \ .master(

Re: startTimestamp doesn't work when using rate-micro-batch format

2024-01-29 Thread Perfect Stranger
Yes, there's definitely an issue. Can someone fix it? I'm not familiar with Apache Jira; do I need to file a bug report or what? On Mon, Jan 29, 2024 at 2:57 AM Mich Talebzadeh wrote: > OK > > This is the equivalent Python code > > from pyspark.sql import SparkSession > from pyspark.sql.functio