Hi Dongjoon Hyun,
Any input on the issue below would be helpful. Please let us know if we're
missing anything.
Thanks and Regards,
Abhishek
From: Patidar, Mohanlal (Nokia - IN/Bangalore)
Sent: Thursday, January 20, 2022 11:58 AM
To: user@spark.apache.org
Subject: Suspected SPAM - RE: Regardin
You can cast the columns as well. But are the columns strings to begin with?
They could also actually be doubles.
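For instance, a minimal casting sketch (the column names come from the question below; that they start out as strings is an assumption):

from pyspark.sql.functions import col

# Cast the numeric-looking string columns to doubles in place.
df = df.withColumn("salary", col("salary").cast("double")) \
       .withColumn("rate", col("rate").cast("double"))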
On Wed, Jan 26, 2022 at 8:49 PM wrote:
> When creating a dataframe from a list, how can I specify the column type?
>
> such as:
>
> >>> df = spark.createDataFrame(list, ["name","title","salary","rate","insurance"])
from pyspark.sql.types import *
list = [("buck trends", "ceo", 20.00, 0.25, "100")]
schema = StructType([StructField("name", StringType(), True),
                     StructField("title", StringType(), True),
                     StructField("salary", DoubleType(), True),
                     StructField("rate", DoubleType(), True),
                     StructField("insurance", StringType(), True)])
df = spark.createDataFrame(list, schema)
When creating a dataframe from a list, how can I specify the column type?
such as:
df = spark.createDataFrame(list, ["name","title","salary","rate","insurance"])
df.show()
+-----------+-----+------+----+---------+
|       name|title|salary|rate|insurance|
+-----------+-----+------+----+---------+
Hi Aurélien!
Please run
mvn dependency:tree
and check it for Jackson dependencies.
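If the full tree is noisy, the same plugin can filter for the Jackson artifacts
directly (the groupId below covers the core Jackson modules; adjust the pattern
for other Jackson groupIds if needed):

mvn dependency:tree -Dincludes=com.fasterxml.jackson.core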
Feel free to respond with the output if you have any questions about it.
Cheers,
Steve C
> On 22 Jan 2022, at 10:49 am, Aurélien Mazoyer wrote:
>
> Hello,
>
> I migrated my code to Spark 3.2 and I am
unsubscribe
Really depends on what your UDF is doing. You could read 2GB of XML into
much more than that as a DOM representation in memory.
Remember 15GB of executor memory is shared across tasks.
You need to get a handle on what memory your code is using to begin with to
start to reason about whether that's enough.
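To make that concrete (all figures here are illustrative, not from this thread): 15GB of executor heap shared by four concurrent tasks leaves each task well under 4GB once overhead is counted. A minimal sketch of trading concurrency for per-task headroom via configuration:

from pyspark.sql import SparkSession

# Illustrative values: the same 15g heap shared by only 2 concurrent
# tasks leaves far more room for each XML parse.
spark = (SparkSession.builder
         .config("spark.executor.memory", "15g")
         .config("spark.executor.cores", "2")
         .getOrCreate())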
Thanks for your quick response.
For some reason I can't use spark-xml (a schema-related issue).
I've tried reducing the number of tasks per executor by increasing the number
of executors, but it still throws the same error.
I can't understand why even 15GB of executor memory is not sufficient
to parse the XML.
Executor memory used shows data that is cached, not the VM usage. You're
running out of memory somewhere, likely in your UDF, which probably parses
massive XML docs as a DOM first or something. Use more memory, fewer tasks
per executor, or consider using spark-xml if you are really just parsing
pie
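For reference, spark-xml parses records straight into a dataframe, which avoids
hand-building a DOM per file. A minimal sketch (the package coordinates, rowTag
value, and path are placeholders, not from this thread):

# Launch with e.g. --packages com.databricks:spark-xml_2.12:<version>
df = (spark.read.format("xml")
      .option("rowTag", "record")  # element that delimits one row; assumed name
      .load("/path/to/xml/files"))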
I'm doing some complex operations inside a Spark UDF (parsing huge XML).
Dataframe:
| value                 |
| Content of XML File 1 |
| Content of XML File 2 |
| Content of XML File N |
val df = dataframe.select(UDF_to_parse_xml(col("value")))
UDF looks something like:
val XMLelements: Array[MyClass1] = getXMLelemen
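If the UDF materializes each document as a full DOM before extracting elements,
a streaming parse keeps peak memory closer to one element than to the whole
tree. A minimal sketch of that idea in PySpark (the item tag, the count logic,
and the column name are illustrative, not the original UDF):

import xml.etree.ElementTree as ET
from io import StringIO
from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType

@udf(returnType=IntegerType())
def count_items(xml_string):
    # Walk the document incrementally instead of building a full DOM.
    count = 0
    for _, elem in ET.iterparse(StringIO(xml_string), events=("end",)):
        if elem.tag == "item":
            count += 1
        elem.clear()  # release parsed children so memory stays bounded
    return count

df = dataframe.select(count_items("value"))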