Pyspark Write Batch Streaming Data to Snowflake Fails with more columns

2024-02-09 Thread Varun Shah
Hi Team, we have implemented a PySpark Structured Streaming application on Databricks, where we read data from S3 and write to a Snowflake table using the Snowflake connector jars (net.snowflake:snowflake-jdbc v3.14.5 and net.snowflake:spark-snowflake v2.12:2.14.0-spark_3.3). Currently facing a…
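For context, the spark-snowflake connector is a batch sink, so streaming pipelines typically route each micro-batch through foreachBatch. Below is a minimal sketch under that assumption; the connection values, S3 paths, table name, and schema are placeholders, not taken from the original post.

    from pyspark.sql import SparkSession
    from pyspark.sql.types import StructType, StructField, StringType

    spark = SparkSession.builder.appName("s3-to-snowflake").getOrCreate()

    # Placeholder connection options; replace with real account values.
    sf_options = {
        "sfURL": "<account>.snowflakecomputing.com",
        "sfUser": "<user>",
        "sfPassword": "<password>",
        "sfDatabase": "<database>",
        "sfSchema": "<schema>",
        "sfWarehouse": "<warehouse>",
    }

    # Hypothetical input schema; streaming file sources require one up front.
    input_schema = StructType([
        StructField("id", StringType()),
        StructField("payload", StringType()),
    ])

    def write_to_snowflake(batch_df, batch_id):
        # Each micro-batch is written as an ordinary batch job.
        (batch_df.write
            .format("net.snowflake.spark.snowflake")
            .options(**sf_options)
            .option("dbtable", "TARGET_TABLE")
            .mode("append")
            .save())

    query = (spark.readStream
        .format("json")
        .schema(input_schema)
        .load("s3://my-bucket/input/")          # placeholder S3 path
        .writeStream
        .foreachBatch(write_to_snowflake)
        .option("checkpointLocation", "s3://my-bucket/checkpoints/")
        .start())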

Building an Event-Driven Real-Time Data Processor with Spark Structured Streaming and API Integration

2024-02-09 Thread Mich Talebzadeh
Appreciate your thoughts on this. Personally, I think Spark Structured Streaming can be used effectively in an event-driven architecture as well as for continuous streaming. From the link here…
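As a rough illustration of the distinction (my own sketch, not code from the post): the trigger is what moves a Structured Streaming query between continuous operation and event-driven, run-on-demand operation.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("event-driven-sss").getOrCreate()

    # Toy source standing in for Kafka, S3 file arrivals, etc.
    stream_df = spark.readStream.format("rate").load()

    query = (stream_df.writeStream
        .format("console")
        .option("checkpointLocation", "/tmp/checkpoints/demo")   # placeholder
        # Continuous streaming: a micro-batch every 10 seconds until stopped.
        .trigger(processingTime="10 seconds")
        # Event-driven alternative (Spark 3.3+): drain all available data and
        # stop; an external event or scheduler relaunches the job as needed.
        # .trigger(availableNow=True)
        .start())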

Re: Pyspark Write Batch Streaming Data to Snowflake Fails with more columns

2024-02-09 Thread Mich Talebzadeh
Hi Varun, I am no expert on Snowflake; however, the issue you are facing, particularly if it involves data trimming in a COPY statement and a potential data mismatch, is likely related to how Snowflake handles data ingestion rather than being directly tied to PySpark. The COPY command in Snowflake is…
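One connector-level knob worth checking when the DataFrame has more (or reordered) columns than the target table is the spark-snowflake column_mapping option, which maps columns by name rather than by position. The sketch below reuses sf_options from the earlier example and is illustrative, not a confirmed fix; verify the option against the connector version you run.

    # df: the batch DataFrame to write (e.g. inside foreachBatch).
    (df.write
        .format("net.snowflake.spark.snowflake")
        .options(**sf_options)                 # connection options as above
        .option("dbtable", "TARGET_TABLE")
        .option("column_mapping", "name")      # match by name; default is "order"
        .mode("append")
        .save())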

Re: Enhanced Console Sink for Structured Streaming

2024-02-09 Thread Neil Ramaswamy
Thanks for the comments, Anish and Jerry. To summarize so far, we are in agreement that: 1. The enhanced console sink is a good tool for new users to understand Structured Streaming semantics. 2. It should be opt-in via an option (unlike my original proposal). 3. Out of the 2 modes of verbosity I proposed…
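For readers new to the thread, this is the console sink as it exists today; the verbosity modes under discussion are proposed additions, so only the current options appear in this sketch.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("console-sink-demo").getOrCreate()
    events_df = spark.readStream.format("rate").load()

    # Prints each micro-batch to stdout; handy for learning the semantics.
    query = (events_df.writeStream
        .format("console")
        .outputMode("append")
        .option("numRows", 20)        # rows printed per micro-batch
        .option("truncate", False)    # print full column values
        .start())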

Re: Building an Event-Driven Real-Time Data Processor with Spark Structured Streaming and API Integration

2024-02-09 Thread Mich Talebzadeh
The full code is available from the link below: https://github.com/michTalebzadeh/Event_Driven_Real_Time_data_processor_with_SSS_and_API_integration