Need help understanding tuning docs

2024-08-14 Thread Sreyan Chakravarty
victed." This seems contradictory; what is going on? -- Regards, Sreyan Chakravarty

Is one Spark partition mapped to one and only one Spark Task?

2024-03-24 Thread Sreyan Chakravarty
. -- Regards, Sreyan Chakravarty

Re: pyspark - Where are Dataframes created from Python objects stored?

2024-03-18 Thread Sreyan Chakravarty
be faked. I want data to actually reside on the storage or executors. Maybe this will be better tackled in a separate thread here: https://lists.apache.org/thread/w6f7rq7m8fj6hzwpyhvvx3c42wbmkwdq -- Regards, Sreyan Chakravarty

pyspark - Use Spark to generate a large dataset on the fly

2024-03-18 Thread Sreyan Chakravarty
tition the data from the Kafka topic? Basically, my problem means calls from sending each piece of data as I receive it to the worker node. Can that be done somehow? -- Regards, Sreyan Chakravarty

pyspark - Use Spark to generate a large dataset on the fly

2024-03-18 Thread Sreyan Chakravarty
tition the data from the Kafka topic? *Basically, my problem means calls from sending each piece of data as I receive it to the worker node. Can that be done somehow?* -- Regards, Sreyan Chakravarty

Re: pyspark - Where are Dataframes created from Python objects stored?

2024-03-18 Thread Sreyan Chakravarty
So, just to be clear, the transformations are always executed on the worker node, but it is just transferred until an action on the dataframe is triggered. Am I correct? If so, how do I generate a large dataset? I may need something like that as synthetic data for testing. Is there any way to do that? -- Regards, Sreyan Chakravarty

pyspark - Where are Dataframes created from Python objects stored?

2024-03-14 Thread Sreyan Chakravarty
- That does not make sense when the dataframe grows large. -- Regards, Sreyan Chakravarty