Re: LLM based data pre-processing

2025-01-03 Thread Gurunandan
Hi Mayur, Please evaluate LangChain's Spark DataFrame Agent for your use case. Documentation: 1) https://python.langchain.com/v0.1/docs/integrations/toolkits/spark/ 2) https://python.langchain.com/docs/integrations/tools/spark_sql/ regards, Guru On Fri, Jan 3, 2025 at 6:38 PM Mayur Dattatray Bho

Re: Which shuffle operations trigger AQE and which don't?

2024-11-14 Thread Gurunandan
AQE can dynamically prune shuffle partitions based on filter conditions, which reduces the amount of data processed. AQE re-optimizes the logical plan after every stage, i.e. at shuffle boundaries. The logical plan created with the optimizations applied by AQE will coalesce small shuffle partitions
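
To make the partition coalescing mentioned in this reply concrete, here is a minimal toy model in plain Python. It is not Spark's implementation: it only assumes that adjacent shuffle partitions are merged greedily until a target size would be exceeded. The function name `coalesce_partitions` and the sample sizes are hypothetical. In Spark itself this behaviour is governed by settings such as `spark.sql.adaptive.enabled`, `spark.sql.adaptive.coalescePartitions.enabled`, and `spark.sql.adaptive.advisoryPartitionSizeInBytes`.

```python
from typing import List


def coalesce_partitions(sizes_bytes: List[int], target_bytes: int) -> List[List[int]]:
    """Greedily group adjacent partition indices so each group stays near the target size.

    Toy model only (an assumption for illustration, not Spark's code).
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for index, size in enumerate(sizes_bytes):
        # Close the current group if adding this partition would exceed the target.
        if current and current_size + size > target_bytes:
            groups.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    mb = 1024 * 1024
    # Hypothetical post-shuffle partition sizes, as runtime statistics might report them.
    sizes = [10 * mb, 5 * mb, 70 * mb, 3 * mb, 2 * mb, 1 * mb]
    print(coalesce_partitions(sizes, target_bytes=64 * mb))
```

In this model a partition that is already larger than the target stays in a group of its own, while runs of small neighbours are merged, so six shuffle partitions are read by three tasks.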

Re: [Spark SQL] [DISK_ONLY Persistence] getting "this.inMemSorter" is null exception

2024-11-12 Thread Gurunandan
even if the blocks are not available after the un-persists, RDD should >> recompute and not throw NullPointerException. >> >> Looking forward to some guidance on how I should proceed further. >> >> Regards, >> Ashwani >> >> >> >> On Mon, N

Re: [Spark SQL] [DISK_ONLY Persistence] getting "this.inMemSorter" is null exception

2024-11-11 Thread Gurunandan
Hi Ashwani, Please verify the input data by ensuring that it is valid and free of null values or unexpected data types. If the data undergoes complex transformations before sorting, review those transformations and verify that they don't introduce inconsistencies or nul

Re: ClassCastException in Spark3.5 when submit job to an old yarn cluster using netty 4.1.17

2024-11-06 Thread Gurunandan
Hi Smokeriu, Please check for conflicting netty jars by adding the configurations below at the cluster level; they print additional logs that identify the jar each class is loaded from. spark.executor.extraJavaOptions=-verbose:class spark.driver.extraJavaOptions=-verbose:class regards, Guru
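
Once `-verbose:class` is enabled, the driver and executor logs can be scanned for netty classes loaded from more than one jar. The sketch below is one possible way to do that. It assumes the log lines follow either the Java 8 form `[Loaded <class> from <source>]` or the Java 9+ unified-logging form `[class,load] <class> source: <source>`. The helper name `netty_sources` and the jar paths in the usage example are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set

# Assumed Java 8 format: [Loaded io.netty.buffer.ByteBuf from file:/path/to/some.jar]
_JAVA8 = re.compile(r"\[Loaded (?P<cls>\S+) from (?P<src>[^\]]+)\]")
# Assumed Java 9+ format: [0.123s][info][class,load] io.netty.buffer.ByteBuf source: file:/path/to/some.jar
_JAVA9 = re.compile(r"\[class,load\s*\]\s+(?P<cls>\S+) source: (?P<src>.+)$")


def netty_sources(lines: Iterable[str], prefix: str = "io.netty.") -> Dict[str, Set[str]]:
    """Map each source (jar) to the set of classes under `prefix` loaded from it."""
    sources: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        line = line.strip()
        match = _JAVA8.search(line) or _JAVA9.search(line)
        if match and match.group("cls").startswith(prefix):
            sources[match.group("src").strip()].add(match.group("cls"))
    return dict(sources)


if __name__ == "__main__":
    # Hypothetical log excerpt; real paths and versions will differ.
    sample = [
        "[Loaded io.netty.buffer.ByteBuf from file:/opt/b/netty-all-x.y.z.jar]",
        "[Loaded io.netty.channel.Channel from file:/opt/a/netty-all-4.1.17.Final.jar]",
        "[Loaded java.lang.Object from /usr/lib/jvm/jre/lib/rt.jar]",
    ]
    for jar, classes in netty_sources(sample).items():
        print(jar, sorted(classes))
```

If the result contains more than one source for `io.netty.` classes, two netty jars are on the classpath at the same time, which is the kind of conflict this reply suggests looking for.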

Re: [SPARK CORE] Incompatible configuration used between Spark and HBaseTestingUtility

2024-11-06 Thread Gurunandan
wrote: > > Yes, we use org.apache.hbase.connectors.spark:hbase-spark:1.0.0.7.2.16.0-287 > > On Wed, Oct 30, 2024 at 15:30, Gurunandan wrote: >> >> Hi Evelina, >> Do you use Spark HBase Connector ( hbase-spark ) as part of the unit-test >> setup? >> >

Re: [SPARK CORE] Incompatible configuration used between Spark and HBaseTestingUtility

2024-10-30 Thread Gurunandan
Hi Evelina, Do you use Spark HBase Connector ( hbase-spark ) as part of the unit-test setup? regards, Guru On Wed, Oct 30, 2024 at 5:35 PM Evelina Dumitrescu wrote: > > Hello, > > TLDR; The question is asked also here: > https://stackoverflow.com/questions/79139516/incompatible-configuration-use

Re:

2024-10-29 Thread Gurunandan
Hi, The REPLACE clause is only supported for Delta Lake tables. Please refer to the documentation at https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-create-table-using.html#parameters for further information. regards, Guru On Tue, Oct 29, 2024 at 5:32 PM Suphakit Wongsarawit wrote: