Hi Sanket,
Driver and executor logs are written to stdout by default; this can be
configured via the SPARK_HOME/conf/log4j.properties file. That file, along with
the entire SPARK_HOME/conf directory, is automatically propagated to all driver
and executor containers and mounted as a volume.
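For reference, a minimal sketch of what SPARK_HOME/conf/log4j.properties
typically contains for console logging (Spark 3.3+ uses log4j2.properties
instead; adjust levels and the pattern to taste):

log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

Since the pods log to stdout/stderr, kubectl logs <driver-pod> shows them, and a
node-level collector such as fluent-bit can pick them up from the container log
files on the nodes.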
Thanks
> On 9 Oct 2023, at 07:03, Mich Talebzadeh
> wrote:
>
> Hi,
>
> Please see my responses below:
>
> 1) In Spark Structured Streaming does commit mean streaming data has been
> delivered to the sink like Snowflake?
>
> No. A commit does not refer to data being delivered to a sink like Snowflake or BigQuery.
Your mileage may vary. Often there is already a flavour of cloud data warehouse
(CDW) in place, such as BigQuery, Redshift, or Snowflake. They can all do a good
job to varying degrees.
- Use efficient data types. Choose data types that are efficient for
Spark to process. For example, use integer types rather than strings where they fit.
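As a rough illustration (the column names are made up), casting string columns
to narrower native types before writing to the warehouse usually helps both
Spark and the CDW:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Toy data: everything arrives as strings
df = spark.createDataFrame(
    [("2023-10-09 12:00:00", "42", "19.99")],
    ["event_ts", "user_id", "amount"])

# Cast to narrower, typed representations before writing to the sink
df_typed = (df
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .withColumn("user_id", F.col("user_id").cast("long"))
    .withColumn("amount", F.col("amount").cast("decimal(18,2)")))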
Thank you for your feedback, Mich.
In general, how can one optimise the cloud data warehouses (the sink part) to
handle streaming Spark data efficiently, avoiding the bottlenecks discussed?
AK

On Monday, 9 October 2023 at 11:04:41 BST, Mich Talebzadeh wrote:
Hi,
Please see my responses
In a nutshell, is this what you are trying to do?
1. Read the Delta table into a Spark DataFrame.
2. Explode the string column into a struct column.
3. Convert the hexadecimal field to an integer.
4. Write the DataFrame back to the Delta table in merge mode with a
unique key.
Is t
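A rough PySpark sketch of those four steps might look like the following; the
table path, column names, and the JSON payload schema are my assumptions, not
your actual schema:

from delta.tables import DeltaTable
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType

# Delta Lake must be on the classpath and enabled for the session
spark = (SparkSession.builder
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate())

# Let the merge add the new columns produced below (optional)
spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")

# 1. Read the Delta table into a DataFrame (path is hypothetical)
df = spark.read.format("delta").load("/data/my_delta_table")

# 2. Parse the string column into a struct column (schema is assumed)
payload_schema = StructType([
    StructField("id", StringType()),
    StructField("hex_value", StringType())])
parsed = df.withColumn("payload", F.from_json("raw_string", payload_schema))

# 3. Convert the hexadecimal field to an integer
converted = parsed.withColumn(
    "int_value", F.conv(F.col("payload.hex_value"), 16, 10).cast("long"))

# 4. Merge back into the Delta table on a unique key ("id" here)
target = DeltaTable.forPath(spark, "/data/my_delta_table")
(target.alias("t")
    .merge(converted.alias("s"), "t.id = s.id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())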
Hi All,
We are trying to ship the Spark logs using fluent-bit. We validated that
fluent-bit is able to move the logs of all other pods except the driver/executor
pods.
It would be great if someone could point us to where to look for the Spark logs
in Spark on Kubernetes with client/cluster mode deployment.
Hi,
Please see my responses below:
1) In Spark Structured Streaming does commit mean streaming data has been
delivered to the sink like Snowflake?
No. A commit does not refer to data being delivered to a sink like
Snowflake or BigQuery. The term commit refers to Spark Structured Streaming
(SS) internal bookkeeping: it marks a micro-batch as fully processed in the
commit log under the checkpoint location.
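A minimal sketch (the source, sink, and paths are made up) showing where those
commit files end up once a micro-batch finishes:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Toy streaming source; in practice this would be Kafka, files, etc.
stream = spark.readStream.format("rate").load()

# After each successful micro-batch, Spark writes a file under
# /tmp/ckpt/demo/commits/<batchId>; that file is the "commit"
query = (stream.writeStream
    .format("parquet")
    .option("path", "/tmp/out/demo")
    .option("checkpointLocation", "/tmp/ckpt/demo")
    .outputMode("append")
    .start())

query.awaitTermination(30)  # run briefly, then inspect /tmp/ckpt/demo/commits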