Since df1 and df2 are different DataFrames, you will need to use a join. For
example: df1.join(df2.selectExpr("Name", "NumReads as ctrl_2"), on=["Name"])
> On Dec 17, 2021, at 16:25, Andrew Davidson wrote:
>
>
> Hi I am a newbie
>
> I have 16,000 data files, all files have the same number o
Hi I am a newbie
I have 16,000 data files, all files have the same number of rows and columns.
The row ids are identical and are in the same order. I want to create a new
data frame that contains the 3rd column from each data file
I wrote a test program that uses a for loop and Join. It works w
Hi Abhinav,
Whether you use ReadStream or Read does not matter here.
The following error
java.lang.NoSuchMethodError:
org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$.checkFieldNames(
indicates that you are using a different version of Spark somewhere in your
project or you are using an o
May I ask why you don’t use spark.read and spark.write instead of
readStream and writeStream? Thanks.
On 2021-12-17 15:09, Abhinav Gundapaneni wrote:
Hello Spark community,
I’m using Apache Spark (version 3.2) to read a CSV file into a
dataframe using ReadStream, process the dataframe and write
Hello Spark community,
I’m using Apache Spark (version 3.2) to read a CSV file into a dataframe using
ReadStream, process the dataframe and write the dataframe to Delta file using
WriteStream. I’m getting a failure during the WriteStream process. I’m trying
to run the script locally in my Windows